Organized setup instructions (#115)
* Organized setup instructions
* update tests
* link checker action
* raise error upon broken link
* fix links
* fix links
* delete duplicated paragraph
.github/workflows/basic-tests.yml (2 changes)

@@ -34,7 +34,7 @@ jobs:
         run: |
           pytest ch04/01_main-chapter-code/tests.py
           pytest ch05/01_main-chapter-code/tests.py
-          pytest appendix-A/02_installing-python-libraries/tests.py
+          pytest setup/02_installing-python-libraries/tests.py

       - name: Validate Selected Jupyter Notebooks
         run: |
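For local verification, the same tests the workflow runs can be invoked directly; a minimal sketch, assuming the commands are run from the repository root with the project's `requirements.txt` installed:

```bash
# Mirrors the CI step above, including the relocated setup tests
pytest ch04/01_main-chapter-code/tests.py
pytest ch05/01_main-chapter-code/tests.py
pytest setup/02_installing-python-libraries/tests.py
```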
.github/workflows/check-links.yml (new file, 24 lines)

@@ -0,0 +1,24 @@
+name: Check Markdown Links
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  check-links:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout Repository
+        uses: actions/checkout@v3
+
+      - name: Install Markdown Link Checker
+        run: npm install -g markdown-link-check
+
+      - name: Find Markdown Files and Check Links
+        run: |
+          find . -name '*.md' -exec markdown-link-check {} \;
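One note on the "raise error upon broken link" goal: `find ... -exec ... \;` does not propagate `markdown-link-check`'s exit status to the workflow step, so the job can pass even if a file contains dead links. A possible variant of the last `run` command, assuming `markdown-link-check` exits with a non-zero status when it finds broken links:

```bash
# xargs exits non-zero if any markdown-link-check invocation fails,
# which makes the workflow step (and therefore the job) fail on broken links.
find . -name '*.md' -print0 | xargs -0 -n1 markdown-link-check
```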
README.md (29 changes)

@@ -24,7 +24,7 @@ The method described in this book for training and developing your own small-but

 # Table of Contents

-Please note that the `Readme.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.
+Please note that this `README.md` file is a Markdown (`.md`) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, [MarkText](https://www.marktext.cc) is a good free option.

 Alternatively, you can view this and other files on GitHub at [https://github.com/rasbt/LLMs-from-scratch](https://github.com/rasbt/LLMs-from-scratch).

@@ -36,17 +36,23 @@ Alternatively, you can view this and other files on GitHub at [https://github.co

 <br>

+
+> [!TIP]
+> If you're seeking guidance on installing Python and Python packages and setting up your code environment, I suggest reading the [README.md](setup/README.md) file located in the [setup](setup) directory.
+
+<br>
+
 | Chapter Title | Main Code (for quick access) | All Code + Supplementary |
 |------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
 | Ch 1: Understanding Large Language Models | No code | - |
 | Ch 2: Working with Text Data | - [ch02.ipynb](ch02/01_main-chapter-code/ch02.ipynb)<br/>- [dataloader.ipynb](ch02/01_main-chapter-code/dataloader.ipynb) (summary)<br/>- [exercise-solutions.ipynb](ch02/01_main-chapter-code/exercise-solutions.ipynb) | [./ch02](./ch02) |
 | Ch 3: Coding Attention Mechanisms | - [ch03.ipynb](ch03/01_main-chapter-code/ch03.ipynb)<br/>- [multihead-attention.ipynb](ch03/01_main-chapter-code/multihead-attention.ipynb) (summary) <br/>- [exercise-solutions.ipynb](ch03/01_main-chapter-code/exercise-solutions.ipynb)| [./ch03](./ch03) |
 | Ch 4: Implementing a GPT Model from Scratch | - [ch04.ipynb](ch04/01_main-chapter-code/ch04.ipynb)<br/>- [gpt.py](ch04/01_main-chapter-code/gpt.py) (summary)<br/>- [exercise-solutions.ipynb](ch04/01_main-chapter-code/exercise-solutions.ipynb) | [./ch04](./ch04) |
-| Ch 5: Pretraining on Unlabeled Data | - [ch05.ipynb](ch05/01_main-chapter-code/ch05.ipynb)<br/>- [train.py](ch05/01_main-chapter-code/train.py) (summary) <br/>- [generate.py](ch05/01_main-chapter-code/generate.py) (summary) <br/>- [exercise-solutions.ipynb](ch05/01_main-chapter-code/exercise-solutions.ipynb) | [./ch05](./ch05) |
+| Ch 5: Pretraining on Unlabeled Data | - [ch05.ipynb](ch05/01_main-chapter-code/ch05.ipynb)<br/>- [gpt_train.py](ch05/01_main-chapter-code/gpt_train.py) (summary) <br/>- [gpt_generate.py](ch05/01_main-chapter-code/gpt_generate.py) (summary) <br/>- [exercise-solutions.ipynb](ch05/01_main-chapter-code/exercise-solutions.ipynb) | [./ch05](./ch05) |
 | Ch 6: Finetuning for Text Classification | Q2 2024 | ... |
 | Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
 | Ch 8: Using Large Language Models in Practice | Q2/3 2024 | ... |
-| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/03_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/03_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/03_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/03_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
+| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/01_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/01_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/01_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/01_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
 | Appendix B: References and Further Reading | No code | - |
 | Appendix C: Exercises | No code | - |
 | Appendix D: Adding Bells and Whistles to the Training Loop | - [appendix-D.ipynb](appendix-D/01_main-chapter-code/appendix-D.ipynb) | [./appendix-D](./appendix-D) |
@@ -54,11 +60,6 @@ Alternatively, you can view this and other files on GitHub at [https://github.co


-
-
-> [!TIP]
-> Please see [this](appendix-A/01_optional-python-setup-preferences) and [this](appendix-A/02_installing-python-libraries) folder if you need more guidance on installing Python and Python packages.
-
 <br>
 <br>

@@ -74,10 +75,10 @@ Shown below is a mental model summarizing the contents covered in this book.

 Several folders contain optional materials as a bonus for interested readers:

-- **Appendix A:**
-  - [Python Setup Tips](appendix-A/01_optional-python-setup-preferences)
-  - [Installing Libraries Used In This Book](appendix-A/02_installing-python-libraries)
-  - [Docker Environment Setup Guide](appendix-A/04_optional-docker-environment)
+- **Setup**
+  - [Python Setup Tips](setup/01_optional-python-setup-preferences)
+  - [Installing Libraries Used In This Book](setup/02_installing-python-libraries)
+  - [Docker Environment Setup Guide](setup/03_optional-docker-environment)

 - **Chapter 2:**
   - [Comparing Various Byte Pair Encoding (BPE) Implementations](ch02/02_bonus_bytepair-encoder)
@@ -88,9 +89,9 @@ Several folders contain optional materials as a bonus for interested readers:

 - **Chapter 5:**
   - [Alternative Weight Loading from Hugging Face Model Hub using Transformers](ch05/02_alternative_weight_loading/weight-loading-hf-transformers.ipynb)
   - [Pretraining GPT on the Project Gutenberg Dataset](ch05/03_bonus_pretraining_on_gutenberg)
   - [Adding Bells and Whistles to the Training Loop](ch05/04_learning_rate_schedulers)
-  - [Optimizing Hyperparameters for Pretraining](05_bonus_hparam_tuning)
+  - [Optimizing Hyperparameters for Pretraining](ch05/05_bonus_hparam_tuning)

 <br>
 <br>

@@ -1,3 +0,0 @@
-# Optional Docker Environment
-
-This is an optional Docker environment for those users who prefer Docker. For more instructions, see the *Docker Environment Setup Guide* in [appendix-A/04_optional-docker-environment](../).
@@ -2,6 +2,6 @@

 - [ch05.ipynb](ch05.ipynb) contains all the code as it appears in the chapter
 - [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module from the previous chapter, which we import in [ch05.ipynb](ch05.ipynb) to pretrain the GPT model
-- [train.py](train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
-- [generate.py](generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI
+- [gpt_train.py](gpt_train.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to train the GPT model
+- [gpt_generate.py](gpt_generate.py) is a standalone Python script file with the code that we implemented in [ch05.ipynb](ch05.ipynb) to load and use the pretrained model weights from OpenAI

@@ -1383,7 +1383,7 @@
    "id": "de713235-1561-467f-bf63-bf11ade383f0",
    "metadata": {},
    "source": [
-    "**If you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/03_main-chapter-code)**"
+    "**If you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/01_main-chapter-code)**"
    ]
   },
   {

@@ -2438,7 +2438,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.12.2"
   }
  },
 "nbformat": 4,

@@ -42,7 +42,7 @@ cd gutenberg

 ```bash
 pip install -r requirements.txt
 ```

 5. Download the data:

 ```bash
 python get_data.py
@@ -71,9 +71,9 @@ sudo apt-get install -y rsync && \
 ```

 > [!NOTE]
-> Instructions about how to set up Python and installing packages can be found in [Appendix A: Optional Python Setup Preferences](../../appendix-A/01_optional-python-setup-preferences/README.md) and [Appendix A: Installing Python Libraries](../../appendix-A/02_installing-python-libraries/README.md).
+> Instructions about how to set up Python and installing packages can be found in [Optional Python Setup Preferences](../../setup/01_optional-python-setup-preferences/README.md) and [Installing Python Libraries](../../setup/02_installing-python-libraries/README.md).
 >
-> Optionally, a Docker image running Ubuntu is provided with this repository. Instructions about how to run a container with the provided Docker image can be found in [Appendix A: Optional Docker Environment](../../appendix-A/04_optional-docker-environment/README.md).
+> Optionally, a Docker image running Ubuntu is provided with this repository. Instructions about how to run a container with the provided Docker image can be found in [Optional Docker Environment](../../setup/03_optional-docker-environment/README.md).


 ### 2) Prepare the dataset
@@ -161,7 +161,7 @@ Note that this code focuses on keeping things simple and minimal for educational
 3. Update the `train_model_simple` script by adding the features introduced in [Appendix D: Adding Bells and Whistles to the Training Loop](../../appendix-D/01_main-chapter-code/appendix-D.ipynb), namely, cosine decay, linear warmup, and gradient clipping.
 4. Update the pretraining script to save the optimizer state (see section *5.4 Loading and saving weights in PyTorch* in chapter 5; [ch05.ipynb](../../ch05/01_main-chapter-code/ch05.ipynb)) and add the option to load an existing model and optimizer checkpoint and continue training if the training run was interrupted.
 5. Add a more advanced logger (for example, Weights and Biases) to view the loss and validation curves live.
-6. Add distributed data parallelism (DDP) and train the model on multiple GPUs (see section *A.9.3 Training with multiple GPUs* in appendix A; [DDP-script.py](../../appendix-A/03_main-chapter-code/DDP-script.py)).
+6. Add distributed data parallelism (DDP) and train the model on multiple GPUs (see section *A.9.3 Training with multiple GPUs* in appendix A; [DDP-script.py](../../appendix-A/01_main-chapter-code/DDP-script.py)).
 7. Swap the from-scratch `MultiheadAttention` class in the `previous_chapter.py` script with the efficient `MHAPyTorchScaledDotProduct` class implemented in the [Efficient Multi-Head Attention Implementations](../../ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb) bonus section, which uses Flash Attention via PyTorch's `nn.functional.scaled_dot_product_attention` function.
 8. Speed up the training by optimizing the model via [torch.compile](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.html) (`model = torch.compile(model)`) or [thunder](https://github.com/Lightning-AI/lightning-thunder) (`model = thunder.jit(model)`).
 9. Implement Gradient Low-Rank Projection (GaLore) to further speed up the pretraining process. This can be achieved by replacing the `AdamW` optimizer with the `GaLoreAdamW` optimizer provided in the [GaLore Python library](https://github.com/jiaweizzhao/GaLore).
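To make item 3 above more concrete, here is a minimal, hedged sketch of linear warmup, cosine decay, and gradient clipping around a single training step; the function names, hyperparameter values, and loss computation are illustrative placeholders, not the book's actual `train_model_simple` code:

```python
import math
import torch

def lr_at_step(step, max_steps, peak_lr=5e-4, min_lr=1e-5, warmup_steps=100):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

def train_step(model, optimizer, input_batch, target_batch, step, max_steps):
    # Set the scheduled learning rate for this step
    for param_group in optimizer.param_groups:
        param_group["lr"] = lr_at_step(step, max_steps)
    optimizer.zero_grad()
    logits = model(input_batch)  # expected shape: (batch, seq_len, vocab_size)
    loss = torch.nn.functional.cross_entropy(
        logits.flatten(0, 1), target_batch.flatten()
    )
    loss.backward()
    # Clip gradients to a maximum norm of 1.0 to stabilize training
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```

Applying the schedule per step (rather than per epoch) is the usual choice for pretraining, where a single pass over the data can span many thousands of steps.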
@@ -1,10 +1,6 @@
 # Optimizing Hyperparameters for Pretraining

-The [hparam_search.py](hparam_search.py) is script based on the extended training function in [
-Appendix D: Adding Bells and Whistles to the Training Loop](../appendix-D/01_main-chapter-code/appendix-D.ipynb) to find optimal hyperparameters via grid search
-
-The [hparam_search.py](hparam_search.py) script, based on the extended training function in [
-Appendix D: Adding Bells and Whistles to the Training Loop](../appendix-D/01_main-chapter-code/appendix-D.ipynb), is designed to find optimal hyperparameters via grid search.
+The [hparam_search.py](hparam_search.py) script, based on the extended training function in [Appendix D: Adding Bells and Whistles to the Training Loop](../../appendix-D/01_main-chapter-code/appendix-D.ipynb), is designed to find optimal hyperparameters via grid search.

 >[!NOTE]
 This script will take a long time to run. You may want to reduce the number of hyperparameter configurations explored in the `HPARAM_GRID` dictionary at the top.
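For readers who want a picture of what a grid search over such a dictionary looks like in principle, here is a minimal sketch of the general pattern; the grid values and the scoring function are placeholders rather than the actual contents of `hparam_search.py`:

```python
import itertools

# Placeholder grid; the real script defines its own HPARAM_GRID at the top.
HPARAM_GRID = {
    "learning_rate": [1e-4, 5e-4, 1e-3],
    "batch_size": [4, 8],
    "weight_decay": [0.0, 0.1],
}

def evaluate(config):
    # Placeholder stand-in for "train briefly and return the validation loss".
    return config["learning_rate"] * config["batch_size"] + config["weight_decay"]

best_config, best_loss = None, float("inf")
for values in itertools.product(*HPARAM_GRID.values()):
    config = dict(zip(HPARAM_GRID.keys(), values))
    loss = evaluate(config)
    if loss < best_loss:
        best_config, best_loss = config, loss

print(f"Best configuration: {best_config} (loss {best_loss:.4f})")
```

The number of configurations grows multiplicatively with each grid entry, which is why the note above warns about long run times.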
@@ -4,4 +4,4 @@
 - [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative places in case the model weights become unavailable from OpenAI
 - [03_bonus_pretraining_on_gutenberg](03_bonus_pretraining_on_gutenberg) contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
 - [04_learning_rate_schedulers](04_learning_rate_schedulers) contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
-- [05_hparam_tuning](05_hparam_tuning) contains an optional hyperparameter tuning script
+- [05_bonus_hparam_tuning](05_bonus_hparam_tuning) contains an optional hyperparameter tuning script
[12 image files relocated; dimensions and file sizes unchanged]
@@ -0,0 +1,3 @@
+# Optional Docker Environment
+
+This is an optional Docker environment for those users who prefer Docker. In case you are interested in using this Docker DevContainer, please see the *Using Docker DevContainers* section in the [../../README.md](../../README.md) for more information.
@@ -27,10 +27,8 @@ git clone https://github.com/rasbt/LLMs-from-scratch.git
 cd LLMs-from-scratch
 ```

 2. Move the `.devcontainer` file to the main `LLMs-from-scratch` project directory.

 ```bash
-mv appendix-A/04_optional-docker-environment/.devcontainer ./
+mv setup/03_optional-docker-environment/.devcontainer ./
 ```

 3. In Docker Desktop, make sure that ***desktop-linux* builder** is running and will be used to build the Docker container (see *Docker Desktop* -> *Change settings* -> *Builders* -> *desktop-linux* -> *...* -> *Use*)
@@ -19,11 +19,30 @@ pip install -r requirements.txt

 If you don't have Python set up on your machine yet, I have written about my personal Python setup preferences in the following directories:

-- [../appendix-A/01_optional-python-setup-preferences](../appendix-A/01_optional-python-setup-preferences)
-- [../02_installing-python-libraries](../appendix-A/02_installing-python-libraries)
+- [01_optional-python-setup-preferences](./01_optional-python-setup-preferences)
+- [02_installing-python-libraries](./02_installing-python-libraries)

+The *Using DevContainers* section below outlines an alternative approach for installing project dependencies on your machine.
+
+
+## Using Docker DevContainers
+
+As an alternative to the *Setting up Python* section above, if you prefer a development setup that isolates a project's dependencies and configurations, using Docker is a highly effective solution. This approach eliminates the need to manually install software packages and libraries and ensures a consistent development environment. You can find more instructions for setting up Docker and using a DevContainer:
+
+- [03_optional-docker-environment](03_optional-docker-environment)
+
+
+## Visual Studio Code Editor
+
+There are many good options for code editors. My preferred choice is the popular open-source [Visual Studio Code (VSCode)](https://code.visualstudio.com) editor, which can be easily enhanced with many useful plugins and extensions (see the *VSCode Extensions* section below for more information). Download instructions for macOS, Linux, and Windows can be found on the [main VSCode website](https://code.visualstudio.com).
+
+
+## VSCode Extensions
+
+If you are using Visual Studio Code (VSCode) as your primary code editor, you can find recommended extensions in the `.vscode` subfolder. To install these, open the `extensions.json` file in VSCode and click the "Install" button in the pop-up menu on the lower right.

@@ -44,18 +63,6 @@ You can optionally run the code on a GPU by changing the *Runtime* as illustrate

 <img src="./figures/3.webp" alt="3" width="700">


-
-## Using DevContainers
-
-Alternatively, If you prefer a development setup that isolates a project's dependencies and configurations, using Docker is a highly effective solution. This approach eliminates the need to manually install software packages and libraries and ensures a consistent development environment. You can find more instructions for setting up Docker and using a DevContainer here in [../appendix-A/04_optional-docker-environment](../appendix-A/04_optional-docker-environment).
-
-
-## VSCode extensions
-
-If you are using Visual Studio Code (VSCode) as your primary code editor, you can find recommended extensions in the `.vscode` subfolder. To install these, open the `extensions.json` file in VSCode and click the "Install" button in the pop-up menu on the lower right.


 ## Questions?
[3 image files relocated; dimensions and file sizes unchanged]