mirror of https://github.com/rasbt/LLMs-from-scratch.git
synced 2026-01-07 12:41:53 +00:00

update notes

parent 0b47dfc381
commit 21140b98d4
@@ -204,9 +204,7 @@
 "id": "ff856c58-8382-44c7-827f-798040e6e697",
 "metadata": {},
 "source": [
-"- By convention, the unnormalized attention weights are referred to as **\"attention scores\"** whereas the normalized attention scores, which sum to 1, are referred to as **\"attention weights\"**\n",
-"\n",
-"- The attention weights and context vector calculation are summarized in the figure below:"
+"- By convention, the unnormalized attention weights are referred to as **\"attention scores\"** whereas the normalized attention scores, which sum to 1, are referred to as **\"attention weights\"**\n"
 ]
 },
 {
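The distinction between attention scores and attention weights in the notebook cell above can be illustrated with a small, dependency-free sketch. It assumes softmax as the normalization step and dot products as the scores; the excerpt states neither explicitly. The input values are hypothetical.

```python
import math

def softmax(scores):
    """Normalize raw attention scores into attention weights that sum to 1."""
    # Subtracting the maximum improves numerical stability without changing the result.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-dimensional token embeddings (illustrative values only).
inputs = [
    [0.4, 0.1, 0.8],
    [0.5, 0.8, 0.6],
    [0.7, 0.2, 0.3],
]
query = inputs[1]  # use the second token as the query

# Attention scores: unnormalized dot products between the query and each input.
attn_scores = [sum(q * x for q, x in zip(query, row)) for row in inputs]

# Attention weights: the scores after normalization (assumed here to be softmax).
attn_weights = softmax(attn_scores)

# Context vector: the attention-weighted sum of the input vectors.
context = [
    sum(w * row[i] for w, row in zip(attn_weights, inputs))
    for i in range(len(query))
]

print("scores: ", attn_scores)
print("weights:", attn_weights)
print("context:", context)
```

The scores can take any real value, while the weights are positive and sum to 1, which is what makes them usable as mixing coefficients for the context vector.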
@@ -2002,7 +2000,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.10.12"
+"version": "3.10.6"
 }
 },
 "nbformat": 4,

@@ -7,7 +7,7 @@ As the Project Gutenberg website states, "the vast majority of Project Gutenberg
 Please read the [Project Gutenberg Permissions, Licensing and other Common Requests](https://www.gutenberg.org/policy/permission.html) page for more information about using the resources provided by Project Gutenberg.


-## How to use this code
+## How to Use This Code

@@ -17,7 +17,11 @@ In this section, we download books from Project Gutenberg using code from the [`

 As of this writing, this will require approximately 50 GB of disk space, but it may be more depending on how much Project Gutenberg has grown since then.

-Follow these steps to download the dataset:
+
+#### Download instructions for Linux and macOS users
+
+
+Linux and macOS users can follow these steps to download the dataset (if you are a Windows user, please see the note below):


 1. `git clone https://github.com/pgcorpus/gutenberg.git`
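Since the hunk above quotes roughly 50 GB of required disk space, it may be worth checking the free space before cloning and downloading. The following is a minimal sketch using only the Python standard library. The function name `has_enough_space`, the default path, and the use of decimal gigabytes are assumptions made for illustration.

```python
import shutil

REQUIRED_GB = 50  # approximate figure quoted above; the corpus may have grown since

def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem containing `path` has at least `required_gb` GB free.

    Hypothetical helper; 1 GB is taken as 10**9 bytes here.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9

if __name__ == "__main__":
    free_gb = shutil.disk_usage(".").free / 10**9
    verdict = "enough" if has_enough_space() else "NOT enough"
    print(f"{free_gb:.1f} GB free: {verdict} for a download of about {REQUIRED_GB} GB")
```

Running the script from the directory where the dataset will be stored checks the filesystem that actually receives the download.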
@@ -31,8 +35,26 @@ Follow these steps to download the dataset:
 5. `cd ..`


+#### Special instructions for Windows users
+
+The [`pgcorpus/gutenberg`](https://github.com/pgcorpus/gutenberg) code is compatible with both Linux and macOS. However, Windows users would have to make small adjustments, such as adding `shell=True` to the `subprocess` calls and replacing `rsync`.
+
+Alternatively, an easier way to run this code on Windows is to use the "Windows Subsystem for Linux" (WSL) feature, which allows users to run a Linux environment using Ubuntu in Windows. For more information, please read [Microsoft's official installation instructions](https://learn.microsoft.com/en-us/windows/wsl/install) and [tutorial](https://learn.microsoft.com/en-us/training/modules/wsl-introduction/).
+
+When using WSL, please make sure you have Python 3 installed (check via `python3 --version`, or install it, for instance, with `sudo apt-get install -y python3.10` for Python 3.10) and install the following packages there:
+
+```bash
+sudo apt-get update && \
+sudo apt-get upgrade -y && \
+sudo apt-get install -y python3-pip && \
+sudo apt-get install -y python-is-python3 && \
+sudo apt-get install -y rsync
+```
+
 > [!NOTE]
-> The [`pgcorpus/gutenberg`](https://github.com/pgcorpus/gutenberg) code is compatible with both Linux and macOS. However, Windows users would have to make small adjustments, such as adding `shell=True` to the `subprocess` calls and replacing `rsync`. Alternatively, an easier way to run this code on Windows is by using the "Windows Subsystem for Linux" feature, which allows users to run a Linux environment in Windows. For more information, please read [Microsoft's official installation instruction](https://learn.microsoft.com/en-us/windows/wsl/install) and [tutorial](https://learn.microsoft.com/en-us/training/modules/wsl-introduction/).
+> Instructions about how to set up Python and install packages can be found in [Appendix A: Optional Python Setup Preferences](../../appendix-A/01_optional-python-setup-preferences/README.md) and [Appendix A: Installing Python Libraries](../../appendix-A/02_installing-python-libraries/README.md).
+>
+> Optionally, a Docker image running Ubuntu is provided with this repository. After cloning the [`pgcorpus/gutenberg`](https://github.com/pgcorpus/gutenberg) GitHub repository, copy the *.devcontainer* folder of this `LLMs-from-scratch` repository into the locally cloned `gutenberg` repository. Instructions about how to run a container with the provided Docker image can be found in [Appendix A: Optional Docker Environment](../../appendix-A/04_optional-docker-environment/README.md).


 ### 2) Prepare the dataset
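The Windows adjustment mentioned in the hunk above (adding `shell=True` to the `subprocess` calls) could look roughly like the sketch below. The helper `run_command` is hypothetical and is not taken from the `pgcorpus/gutenberg` code. It only shows one way to enable the flag on Windows while leaving Linux and macOS behavior unchanged. It does not address the `rsync` replacement.

```python
import platform
import subprocess
import sys

def run_command(args):
    """Run a command, passing shell=True only on Windows.

    Hypothetical helper for illustration; not part of pgcorpus/gutenberg.
    """
    use_shell = platform.system() == "Windows"
    return subprocess.run(
        args,
        shell=use_shell,      # assumption: only Windows needs the shell
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=False,          # let the caller inspect the return code
    )

# Example: ask the current Python interpreter for its version.
result = run_command([sys.executable, "--version"])
print(result.stdout.strip() or result.stderr.strip())
```

Keeping the flag behind a platform check avoids running commands through a shell on systems where it is not needed.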