# Chapter 4: Implementing a GPT Model from Scratch to Generate Text
 
## Main Chapter Code
- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.
 
## Bonus Materials
- [02_performance-analysis](02_performance-analysis) contains optional code analyzing the performance of the GPT model(s) implemented in the main chapter.
- [ch05/07_gpt_to_llama](../ch05/07_gpt_to_llama) contains a step-by-step guide for converting a GPT architecture implementation to Llama 3.2 and loading pretrained weights from Meta AI. It might be interesting to look at alternative architectures after completing chapter 4, but you can also save that for after reading chapter 5.