Organized setup instructions (#115)

* Organized setup instructions

* update tests

* link checker action

* raise error upon broken link

* fix links

* fix links

* delete duplicated paragraph
Sebastian Raschka 2024-04-10 22:09:46 -04:00 committed by GitHub
parent 0b866c133f
commit 790d0808b2
39 changed files with 75 additions and 49 deletions

@@ -34,7 +34,7 @@ jobs:
      run: |
        pytest ch04/01_main-chapter-code/tests.py
        pytest ch05/01_main-chapter-code/tests.py
-        pytest appendix-A/02_installing-python-libraries/tests.py
+        pytest setup/02_installing-python-libraries/tests.py
    - name: Validate Selected Jupyter Notebooks
      run: |

.github/workflows/check-links.yml (new file)

@@ -0,0 +1,24 @@
name: Check Markdown Links

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  check-links:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Install Markdown Link Checker
        run: npm install -g markdown-link-check

      - name: Find Markdown Files and Check Links
        run: |
          find . -name '*.md' -exec markdown-link-check {} \;
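As a side note, contributors who want to catch broken links before pushing could approximate the last workflow step locally. The following Python sketch assumes `markdown-link-check` is installed globally via npm and relies on the tool's convention of returning a nonzero exit code when a dead link is found:

```python
# Hypothetical local pre-push check, mirroring the workflow step above.
# Assumes: npm install -g markdown-link-check
import pathlib
import subprocess

failures = []
for md_file in sorted(pathlib.Path(".").rglob("*.md")):
    # markdown-link-check conventionally exits nonzero if a link is dead
    result = subprocess.run(["markdown-link-check", str(md_file)])
    if result.returncode != 0:
        failures.append(str(md_file))

if failures:
    raise SystemExit("Broken links found in: " + ", ".join(failures))
```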

@@ -24,7 +24,7 @@ The method described in this book for training and developing your own small-but
 # Table of Contents
-Please note that the `Readme.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.
+Please note that this `README.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.
 Alternatively, you can view this and other files on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).
@@ -36,17 +36,23 @@ Alternatively, you can view this and other files on GitHub at [https://github.co
 <br>
+> [!TIP]
+> If you're seeking guidance on installing Python and Python packages and setting up your code environment, I suggest reading the [README.md](setup/README.md) file located in the [setup](setup) directory.
 <br>
| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
| Ch 1: Understanding Large Language Models | No code | - |
| Ch 2: Working with Text Data | - [ch02.ipynb](ch02/01_main-chapter-code/ch02.ipynb)<br/>- [dataloader.ipynb](ch02/01_main-chapter-code/dataloader.ipynb) (summary)<br/>- [exercise-solutions.ipynb](ch02/01_main-chapter-code/exercise-solutions.ipynb) | [./ch02](./ch02) |
| Ch 3: Coding Attention Mechanisms | - [ch03.ipynb](ch03/01_main-chapter-code/ch03.ipynb)<br/>- [multihead-attention.ipynb](ch03/01_main-chapter-code/multihead-attention.ipynb) (summary) <br/>- [exercise-solutions.ipynb](ch03/01_main-chapter-code/exercise-solutions.ipynb)| [./ch03](./ch03) |
| Ch 4: Implementing a GPT Model from Scratch | - [ch04.ipynb](ch04/01_main-chapter-code/ch04.ipynb)<br/>- [gpt.py](ch04/01_main-chapter-code/gpt.py) (summary)<br/>- [exercise-solutions.ipynb](ch04/01_main-chapter-code/exercise-solutions.ipynb) | [./ch04](./ch04) |
-| Ch 5: Pretraining on Unlabeled Data | - [ch05.ipynb](ch05/01_main-chapter-code/ch05.ipynb)<br/>- [train.py](ch05/01_main-chapter-code/train.py) (summary) <br/>- [generate.py](ch05/01_main-chapter-code/generate.py) (summary) <br/>- [exercise-solutions.ipynb](ch05/01_main-chapter-code/exercise-solutions.ipynb) | [./ch05](./ch05) |
+| Ch 5: Pretraining on Unlabeled Data | - [ch05.ipynb](ch05/01_main-chapter-code/ch05.ipynb)<br/>- [gpt_train.py](ch05/01_main-chapter-code/gpt_train.py) (summary) <br/>- [gpt_generate.py](ch05/01_main-chapter-code/gpt_generate.py) (summary) <br/>- [exercise-solutions.ipynb](ch05/01_main-chapter-code/exercise-solutions.ipynb) | [./ch05](./ch05) |
| Ch 6: Finetuning for Text Classification | Q2 2024 | ... |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
| Ch 8: Using Large Language Models in Practice | Q2/3 2024 | ... |
-| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/03_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/03_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/03_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/03_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
+| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/01_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/01_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/01_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/01_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
| Appendix B: References and Further Reading | No code | - |
| Appendix C: Exercises | No code | - |
| Appendix D: Adding Bells and Whistles to the Training Loop | - [appendix-D.ipynb](appendix-D/01_main-chapter-code/appendix-D.ipynb) | [./appendix-D](./appendix-D) |
@@ -54,11 +60,6 @@ Alternatively, you can view this and other files on GitHub at [https://github.co
-> [!TIP]
-> Please see [this](appendix-A/01_optional-python-setup-preferences) and [this](appendix-A/02_installing-python-libraries) folder if you need more guidance on installing Python and Python packages.
 <br>
 <br>
@@ -74,10 +75,10 @@ Shown below is a mental model summarizing the contents covered in this book.
Several folders contain optional materials as a bonus for interested readers:
-- **Appendix A:**
-  - [Python Setup Tips](appendix-A/01_optional-python-setup-preferences)
-  - [Installing Libraries Used In This Book](appendix-A/02_installing-python-libraries)
-  - [Docker Environment Setup Guide](appendix-A/04_optional-docker-environment)
+- **Setup**
+  - [Python Setup Tips](setup/01_optional-python-setup-preferences)
+  - [Installing Libraries Used In This Book](setup/02_installing-python-libraries)
+  - [Docker Environment Setup Guide](setup/03_optional-docker-environment)
 - **Chapter 2:**
   - [Comparing Various Byte Pair Encoding (BPE) Implementations](ch02/02_bonus_bytepair-encoder)

@@ -88,9 +89,9 @@ Several folders contain optional materials as a bonus for interested readers:
 - **Chapter 5:**
   - [Alternative Weight Loading from Hugging Face Model Hub using Transformers](ch05/02_alternative_weight_loading/weight-loading-hf-transformers.ipynb)
-  - [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)
+  - [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)
   - [Adding Bells and Whistles to the Training Loop](ch05/04_learning_rate_schedulers)
-  - [Optimizing Hyperparameters for Pretraining](05_bonus_hparam_tuning)
+  - [Optimizing Hyperparameters for Pretraining](ch05/05_bonus_hparam_tuning)
<br>
<br>

(deleted file)

@@ -1,3 +0,0 @@
-# Optional Docker Environment
-
-This is an optional Docker environment for those users who prefer Docker. For more instructions, see the *Docker Environment Setup Guide* in [appendix-A/04_optional-docker-environment](../).

@@ -2,6 +2,6 @@
 - [ch05.ipynb](ch05.ipynb) contains all the code as it appears in the chapter
 - [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module from the previous chapter, which we import in [ch05.ipynb](ch05.ipynb) to pretrain the GPT model
-- [train.py](train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
-- [generate.py](generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI
+- [gpt_train.py](gpt_train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
+- [gpt_generate.py](gpt_generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI

@@ -1383,7 +1383,7 @@
   "id": "de713235-1561-467f-bf63-bf11ade383f0",
   "metadata": {},
   "source": [
-    "**If you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/03_main-chapter-code)**"
+    "**If you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/01_main-chapter-code)**"
   ]
  },
  {

@@ -2438,7 +2438,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.12.2"
  }
 },
 "nbformat": 4,

@@ -42,7 +42,7 @@ cd gutenberg
```bash
pip install -r requirements.txt
```
5. Download the data:
```bash
python get_data.py

@@ -71,9 +71,9 @@ sudo apt-get install -y rsync && \
```
 > [!NOTE]
-> Instructions about how to set up Python and installing packages can be found in [Appendix A: Optional Python Setup Preferences](../../appendix-A/01_optional-python-setup-preferences/README.md) and [Appendix A: Installing Python Libraries](../../appendix-A/02_installing-python-libraries/README.md).
+> Instructions on how to set up Python and install packages can be found in [Optional Python Setup Preferences](../../setup/01_optional-python-setup-preferences/README.md) and [Installing Python Libraries](../../setup/02_installing-python-libraries/README.md).
 >
-> Optionally, a Docker image running Ubuntu is provided with this repository. Instructions about how to run a container with the provided Docker image can be found in [Appendix A: Optional Docker Environment](../../appendix-A/04_optional-docker-environment/README.md).
+> Optionally, a Docker image running Ubuntu is provided with this repository. Instructions on how to run a container with the provided Docker image can be found in [Optional Docker Environment](../../setup/03_optional-docker-environment/README.md).

&nbsp;
### 2) Prepare the dataset

@@ -161,7 +161,7 @@ Note that this code focuses on keeping things simple and minimal for educational
 3. Update the `train_model_simple` script by adding the features introduced in [Appendix D: Adding Bells and Whistles to the Training Loop](../../appendix-D/01_main-chapter-code/appendix-D.ipynb), namely, cosine decay, linear warmup, and gradient clipping.
 4. Update the pretraining script to save the optimizer state (see section *5.4 Loading and saving weights in PyTorch* in chapter 5; [ch05.ipynb](../../ch05/01_main-chapter-code/ch05.ipynb)) and add the option to load an existing model and optimizer checkpoint and continue training if the training run was interrupted.
 5. Add a more advanced logger (for example, Weights and Biases) to view the loss and validation curves live.
-6. Add distributed data parallelism (DDP) and train the model on multiple GPUs (see section *A.9.3 Training with multiple GPUs* in appendix A; [DDP-script.py](../../appendix-A/03_main-chapter-code/DDP-script.py)).
+6. Add distributed data parallelism (DDP) and train the model on multiple GPUs (see section *A.9.3 Training with multiple GPUs* in appendix A; [DDP-script.py](../../appendix-A/01_main-chapter-code/DDP-script.py)).
 7. Swap the from-scratch `MultiHeadAttention` class in the `previous_chapters.py` script with the efficient `MHAPyTorchScaledDotProduct` class implemented in the [Efficient Multi-Head Attention Implementations](../../ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb) bonus section, which uses Flash Attention via PyTorch's `nn.functional.scaled_dot_product_attention` function.
 8. Speed up the training by optimizing the model via [torch.compile](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html) (`model = torch.compile(model)`) or [thunder](https://github.com/Lightning-AI/lightning-thunder) (`model = thunder.jit(model)`).
 9. Implement Gradient Low-Rank Projection (GaLore) to further speed up the pretraining process. This can be achieved by replacing the `AdamW` optimizer with the `GaLoreAdamW` optimizer provided in the [GaLore Python library](https://github.com/jiaweizzhao/GaLore); a sketch follows this list.
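The following is a minimal, untested sketch of items 3 and 7-9 above. It assumes PyTorch >= 2.0; the GaLore parameter-group keys follow that library's README, and names such as `model`, `regular_params`, `galore_params`, and the hyperparameter values are placeholders rather than recommended settings:

```python
import torch
import torch.nn.functional as F

# Item 7: Flash Attention via PyTorch's built-in SDPA
# (q, k, v each have shape [batch, num_heads, seq_len, head_dim])
def flash_attention(q, k, v):
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Item 8: compile the model for faster training (PyTorch >= 2.0)
# model = torch.compile(model)

# Item 9: swap AdamW for GaLoreAdamW (assumes `pip install galore-torch`;
# the parameter-group keys below follow the GaLore README and may change)
# from galore_torch import GaLoreAdamW
# param_groups = [
#     {"params": regular_params},
#     {"params": galore_params, "rank": 128, "update_proj_gap": 200,
#      "scale": 0.25, "proj_type": "std"},
# ]
# optimizer = GaLoreAdamW(param_groups, lr=0.01)

# Item 3 (gradient clipping, shown inside a simple training step)
def training_step(model, optimizer, input_batch, target_batch):
    optimizer.zero_grad()
    logits = model(input_batch)  # [batch, seq_len, vocab_size]
    loss = F.cross_entropy(logits.flatten(0, 1), target_batch.flatten())
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```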

@@ -1,10 +1,6 @@
 # Optimizing Hyperparameters for Pretraining
-The [hparam_search.py](hparam_search.py) is script based on the extended training function in [
-Appendix D: Adding Bells and Whistles to the Training Loop](../appendix-D/01_main-chapter-code/appendix-D.ipynb) to find optimal hyperparameters via grid search
-The [hparam_search.py](hparam_search.py) script, based on the extended training function in [
-Appendix D: Adding Bells and Whistles to the Training Loop](../appendix-D/01_main-chapter-code/appendix-D.ipynb), is designed to find optimal hyperparameters via grid search.
+The [hparam_search.py](hparam_search.py) script, based on the extended training function in [Appendix D: Adding Bells and Whistles to the Training Loop](../../appendix-D/01_main-chapter-code/appendix-D.ipynb), is designed to find optimal hyperparameters via grid search.
 > [!NOTE]
 > This script will take a long time to run. You may want to reduce the number of hyperparameter configurations explored in the `HPARAM_GRID` dictionary at the top (see the sketch below).
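For illustration, a pared-down grid might look like the following sketch; the dictionary keys and values shown here are hypothetical and may not match the actual `HPARAM_GRID` definition in `hparam_search.py`:

```python
# Hypothetical, pared-down grid; check hparam_search.py for the real keys.
HPARAM_GRID = {
    "batch_size": [4],         # e.g., instead of [2, 4, 8, 16]
    "peak_lr": [1e-4, 5e-4],   # fewer learning rates to sweep
    "weight_decay": [0.1],     # fixed instead of swept
    "n_epochs": [5],           # shorter runs for a first pass
}

# Count the resulting number of configurations (2 in this example)
num_configs = 1
for values in HPARAM_GRID.values():
    num_configs *= len(values)
print(f"Exploring {num_configs} configurations")
```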

@@ -4,4 +4,4 @@
 - [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative places in case the model weights become unavailable from OpenAI
 - [03_bonus_pretraining_on_gutenberg](03_bonus_pretraining_on_gutenberg) contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
 - [04_learning_rate_schedulers](04_learning_rate_schedulers) contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
-- [05_hparam_tuning](05_hparam_tuning) contains an optional hyperparameter tuning script
+- [05_bonus_hparam_tuning](05_bonus_hparam_tuning) contains an optional hyperparameter tuning script

(9 image files moved without content changes)

(new file)

@@ -0,0 +1,3 @@
# Optional Docker Environment

This is an optional Docker environment for those users who prefer Docker. If you are interested in using this Docker DevContainer, please see the *Using Docker DevContainers* section in [../../README.md](../../README.md) for more information.

@@ -27,10 +27,8 @@ git clone https://github.com/rasbt/LLMs-from-scratch.git
cd LLMs-from-scratch
```
2. Move the `.devcontainer` file to the main `LLMs-from-scratch` project directory.
```bash
-mv appendix-A/04_optional-docker-environment/.devcontainer ./
+mv setup/03_optional-docker-environment/.devcontainer ./
```
3. In Docker Desktop, make sure that the ***desktop-linux* builder** is running and will be used to build the Docker container (see *Docker Desktop* -> *Change settings* -> *Builders* -> *desktop-linux* -> *...* -> *Use*).

@@ -19,11 +19,30 @@ pip install -r requirements.txt
 If you don't have Python set up on your machine yet, I have written about my personal Python setup preferences in the following directories:
-- [../appendix-A/01_optional-python-setup-preferences](../appendix-A/01_optional-python-setup-preferences)
-- [../02_installing-python-libraries](../appendix-A/02_installing-python-libraries)
+- [01_optional-python-setup-preferences](./01_optional-python-setup-preferences)
+- [02_installing-python-libraries](./02_installing-python-libraries)
+
+The *Using Docker DevContainers* section below outlines an alternative approach for installing project dependencies on your machine.

 &nbsp;
+## Using Docker DevContainers
+
+As an alternative to the *Setting up Python* section above, if you prefer a development setup that isolates a project's dependencies and configurations, using Docker is a highly effective solution. This approach eliminates the need to manually install software packages and libraries and ensures a consistent development environment. You can find more instructions for setting up Docker and using a DevContainer in:
+
+- [03_optional-docker-environment](03_optional-docker-environment)
+
+&nbsp;
+## Visual Studio Code Editor
+
+There are many good options for code editors. My preferred choice is the popular open-source [Visual Studio Code (VSCode)](https://code.visualstudio.com) editor, which can be easily enhanced with many useful plugins and extensions (see the *VSCode Extensions* section below for more information). Download instructions for macOS, Linux, and Windows can be found on the [main VSCode website](https://code.visualstudio.com).
+
+&nbsp;
+## VSCode Extensions
+
+If you are using Visual Studio Code (VSCode) as your primary code editor, you can find recommended extensions in the `.vscode` subfolder. To install these, open the `extensions.json` file in VSCode and click the "Install" button in the pop-up menu on the lower right.

 &nbsp;

@@ -44,18 +63,6 @@ You can optionally run the code on a GPU by changing the *Runtime* as illustrate
 <img src="./figures/3.webp" alt="3" width="700">
 &nbsp;
-## Using DevContainers
-
-Alternatively, If you prefer a development setup that isolates a project's dependencies and configurations, using Docker is a highly effective solution. This approach eliminates the need to manually install software packages and libraries and ensures a consistent development environment. You can find more instructions for setting up Docker and using a DevContainer here in [../appendix-A/04_optional-docker-environment](../appendix-A/04_optional-docker-environment).
-
-&nbsp;
-## VSCode extensions
-
-If you are using Visual Studio Code (VSCode) as your primary code editor, you can find recommended extensions in the `.vscode` subfolder. To install these, open the `extensions.json` file in VSCode and click the "Install" button in the pop-up menu on the lower right.
 &nbsp;
 ## Questions?

(3 image files moved without content changes)