diff --git a/ch02/README.md b/ch02/README.md
index bd98860..8c3fd59 100644
--- a/ch02/README.md
+++ b/ch02/README.md
@@ -1,7 +1,12 @@
 # Chapter 2: Working with Text Data
+
+## Main Chapter Code
+
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions
-
+
+## Bonus Materials
+
 - [02_bonus_bytepair-encoder](02_bonus_bytepair-encoder) contains optional code to benchmark different byte pair encoder implementations
-
+
 - [03_bonus_embedding-vs-matmul](03_bonus_embedding-vs-matmul) contains optional (bonus) code to explain that embedding layers and fully connected layers applied to one-hot encoded vectors are equivalent.
 
diff --git a/ch03/README.md b/ch03/README.md
index b781e2e..2998662 100644
--- a/ch03/README.md
+++ b/ch03/README.md
@@ -1,4 +1,9 @@
 # Chapter 3: Coding Attention Mechanisms
 
+## Main Chapter Code
+
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.
+
+## Bonus Materials
+
 - [02_bonus_efficient-multihead-attention](02_bonus_efficient-multihead-attention) implements and compares different implementation variants of multihead-attention
\ No newline at end of file
diff --git a/ch04/README.md b/ch04/README.md
index d96ecd6..4dd3a74 100644
--- a/ch04/README.md
+++ b/ch04/README.md
@@ -1,4 +1,9 @@
 # Chapter 4: Implementing a GPT Model from Scratch to Generate Text
 
+## Main Chapter Code
+
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.
+
+## Bonus Materials
+
 - [02_performance-analysis](02_performance-analysis) contains optional code analyzing the performance of the GPT model(s) implemented in the main chapter.
\ No newline at end of file
diff --git a/ch05/README.md b/ch05/README.md
index 1a5a8a2..fec8237 100644
--- a/ch05/README.md
+++ b/ch05/README.md
@@ -1,6 +1,11 @@
 # Chapter 5: Pretraining on Unlabeled Data
 
+## Main Chapter Code
+
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
+
+## Bonus Materials
+
 - [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative places in case the model weights become unavailable from OpenAI
 - [03_bonus_pretraining_on_gutenberg](03_bonus_pretraining_on_gutenberg) contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
 - [04_learning_rate_schedulers](04_learning_rate_schedulers) contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
diff --git a/ch06/README.md b/ch06/README.md
index 6c852c2..5d9582f 100644
--- a/ch06/README.md
+++ b/ch06/README.md
@@ -1,5 +1,11 @@
 # Chapter 6: Finetuning for Classification
+
+## Main Chapter Code
+
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
+
+## Bonus Materials
+
 - [02_bonus_additional-experiments](02_bonus_additional-experiments) includes additional experiments (e.g., training the last vs first token, extending the input length, etc.)
 
 - [03_bonus_imdb-classification](03_bonus_imdb-classification) compares the LLM from chapter 6 with other models on a 50k IMDB movie review sentiment classification dataset
\ No newline at end of file
diff --git a/ch07/README.md b/ch07/README.md
index 330a658..2e3796c 100644
--- a/ch07/README.md
+++ b/ch07/README.md
@@ -1,3 +1,11 @@
 # Chapter 7: Finetuning to Follow Instructions
 
-In progress ...
\ No newline at end of file
+## Main Chapter Code
+
+- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions
+
+## Bonus Materials
+
+- [02_dataset-utilities](02_dataset-utilities) contains utility code that can be used for preparing an instruction dataset.
+
+- [03_model-evaluation](03_model-evaluation) contains utility code for evaluating instruction responses using a local Llama 3 model and the GPT-4 API.