Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-08-28 10:30:36 +00:00)
distinguish better between main chapter code and bonus materials
This commit is contained in: parent 79210eb393, commit b2ff989174
@@ -1,7 +1,12 @@

# Chapter 2: Working with Text Data

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions

## Bonus Materials

- [02_bonus_bytepair-encoder](02_bonus_bytepair-encoder) contains optional code to benchmark different byte pair encoder implementations
- [03_bonus_embedding-vs-matmul](03_bonus_embedding-vs-matmul) contains optional code illustrating that embedding layers and fully connected layers applied to one-hot encoded vectors are equivalent (a minimal sketch follows below)
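The equivalence mentioned in the last bullet can be demonstrated in a few lines of PyTorch. The following is a minimal sketch, not the notebook in `03_bonus_embedding-vs-matmul`; the vocabulary size, embedding dimension, and token IDs are made up for illustration:

```python
import torch

torch.manual_seed(123)
vocab_size, embed_dim = 6, 3
token_ids = torch.tensor([2, 3, 1])

# Embedding lookup
embedding = torch.nn.Embedding(vocab_size, embed_dim)
out_embedding = embedding(token_ids)

# Equivalent fully connected layer applied to one-hot vectors,
# reusing the embedding weights (Linear stores its weight transposed)
linear = torch.nn.Linear(vocab_size, embed_dim, bias=False)
linear.weight = torch.nn.Parameter(embedding.weight.T.detach())
onehot = torch.nn.functional.one_hot(token_ids, num_classes=vocab_size).float()
out_linear = linear(onehot)

print(torch.allclose(out_embedding, out_linear))  # True
```

Because multiplying a one-hot vector by the weight matrix simply selects one of its rows, the embedding layer performs the same computation as the matrix multiplication, just as a more efficient lookup.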
@@ -1,4 +1,9 @@

# Chapter 3: Coding Attention Mechanisms

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

## Bonus Materials

- [02_bonus_efficient-multihead-attention](02_bonus_efficient-multihead-attention) implements and compares different variants of multi-head attention (a minimal sketch of the core operation follows below)
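As a rough illustration of what such a comparison involves, here is a minimal sketch (assuming PyTorch 2.0 or newer; not taken from the bonus folder) that checks a hand-written scaled dot-product attention against the fused `torch.nn.functional.scaled_dot_product_attention`:

```python
import torch

torch.manual_seed(123)
b, num_heads, seq_len, head_dim = 2, 4, 8, 16
q = torch.randn(b, num_heads, seq_len, head_dim)
k = torch.randn(b, num_heads, seq_len, head_dim)
v = torch.randn(b, num_heads, seq_len, head_dim)

# Manual implementation with an explicit causal mask
scores = q @ k.transpose(-2, -1) / head_dim**0.5
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))
manual_out = torch.softmax(scores, dim=-1) @ v

# Fused implementation (may dispatch to FlashAttention kernels on GPU)
fused_out = torch.nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)

print(torch.allclose(manual_out, fused_out, atol=1e-5))  # True
```

The bonus notebooks benchmark full multi-head modules rather than this single operation, but the comparison pattern (same inputs, numerically matching outputs, different implementations) is the same.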
@@ -1,4 +1,9 @@

# Chapter 4: Implementing a GPT Model from Scratch to Generate Text

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

## Bonus Materials

- [02_performance-analysis](02_performance-analysis) contains optional code analyzing the performance of the GPT model(s) implemented in the main chapter (a minimal sketch of one such measurement follows below)
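One basic measurement such an analysis typically starts from is the parameter count and memory footprint. A minimal sketch with a hypothetical toy model, not the chapter's `GPTModel` class:

```python
import torch

# Placeholder model standing in for a GPT-style architecture
model = torch.nn.Sequential(
    torch.nn.Embedding(50257, 768),
    torch.nn.Linear(768, 768),
)

total_params = sum(p.numel() for p in model.parameters())
size_mb = total_params * 4 / 1024**2  # 4 bytes per float32 parameter
print(f"{total_params:,} parameters, ~{size_mb:.1f} MB in float32")
```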
@@ -1,6 +1,11 @@

# Chapter 5: Pretraining on Unlabeled Data

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

## Bonus Materials

- [02_alternative_weight_loading](02_alternative_weight_loading) contains code to load the GPT model weights from alternative sources in case the model weights become unavailable from OpenAI
- [03_bonus_pretraining_on_gutenberg](03_bonus_pretraining_on_gutenberg) contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
- [04_learning_rate_schedulers](04_learning_rate_schedulers) contains code implementing a more sophisticated training function, including learning rate schedulers and gradient clipping (a minimal sketch follows below)
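For orientation, here is a minimal sketch of what a training step with a learning rate scheduler and gradient clipping can look like. The model, data, scheduler choice, and hyperparameters are placeholders, not the ones used in the bonus code:

```python
import torch

model = torch.nn.Linear(10, 10)                              # placeholder model
dataloader = [(torch.randn(4, 10), torch.randn(4, 10))] * 5  # placeholder data

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=len(dataloader))
loss_fn = torch.nn.MSELoss()

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    # Clip gradients to stabilize training before the optimizer step
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()  # update the learning rate once per step
```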
@@ -1,5 +1,11 @@

# Chapter 6: Finetuning for Classification

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code

## Bonus Materials

- [02_bonus_additional-experiments](02_bonus_additional-experiments) includes additional experiments (e.g., training on the last vs. the first token, extending the input length, etc.); a minimal sketch of the last-vs.-first-token choice follows below
- [03_bonus_imdb-classification](03_bonus_imdb-classification) compares the LLM from chapter 6 with other models on a 50k IMDB movie review sentiment classification dataset
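The last-vs.-first-token distinction boils down to which position's hidden state is fed into the classification head. A minimal sketch with made-up shapes, not the chapter's model:

```python
import torch

batch_size, seq_len, hidden_dim = 2, 6, 768
hidden_states = torch.randn(batch_size, seq_len, hidden_dim)
classifier = torch.nn.Linear(hidden_dim, 2)  # e.g., spam vs. not spam

logits_last = classifier(hidden_states[:, -1, :])  # last token, has attended to the full sequence
logits_first = classifier(hidden_states[:, 0, :])  # first token, sees only itself under a causal mask
print(logits_last.shape, logits_first.shape)       # torch.Size([2, 2]) torch.Size([2, 2])
```

In a causal (decoder-style) model, only the last token's representation has attended to the entire input, which is why the choice of position matters for classification accuracy.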
@@ -1,3 +1,11 @@

# Chapter 7: Finetuning to Follow Instructions

In progress ...

## Main Chapter Code

- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions

## Bonus Materials

- [02_dataset-utilities](02_dataset-utilities) contains utility code that can be used for preparing an instruction dataset (a minimal prompt-formatting sketch follows below)
- [03_model-evaluation](03_model-evaluation) contains utility code for evaluating instruction responses using a local Llama 3 model and the GPT-4 API
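As one example of the kind of preparation such utilities perform, here is a minimal sketch of an Alpaca-style prompt template; the exact template and field names used by the utilities may differ:

```python
def format_instruction_entry(entry):
    # Assemble a single training prompt from an instruction-dataset entry
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    if entry.get("input"):  # the optional input field may be empty
        prompt += f"\n\n### Input:\n{entry['input']}"
    prompt += f"\n\n### Response:\n{entry['output']}"
    return prompt


example = {
    "instruction": "Rewrite the sentence in passive voice.",
    "input": "The chef cooked the meal.",
    "output": "The meal was cooked by the chef.",
}
print(format_instruction_entry(example))
```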