add readme files

rasbt 2023-10-25 18:46:40 -05:00
parent f26aa70ebd
commit e827b42e1e
5 changed files with 24 additions and 1 deletions

@@ -1,2 +1,3 @@
Details will follow ...
# Chapter 1: Understanding Large Language Models
There is no code in this chapter.

@@ -0,0 +1,5 @@
# Chapter 2: Working with Text Data
- [ch02.ipynb](ch02.ipynb) has all the code as it appears in the chapter
- [dataloader.ipynb](dataloader.ipynb) is a minimal notebook with the main data loading pipeline implemented in this chapter

@@ -0,0 +1,7 @@
# Chapter 2: Working with Text Data
- [compare-bpe-tiktoken.ipynb](compare-bpe-tiktoken.ipynb) benchmarks various byte pair encoding implementations
- [bpe_openai_gpt2.py](bpe_openai_gpt2.py) is the original byte pair encoder code used by OpenAI

@@ -0,0 +1,3 @@
# Chapter 2: Working with Text Data
- [embeddings-and-linear-layers.ipynb](embeddings-and-linear-layers.ipynb) contains optional bonus code showing that an embedding layer is equivalent to a fully connected layer applied to one-hot encoded vectors
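
The equivalence mentioned above can be illustrated without any deep learning library. The following is a minimal plain-Python sketch, not the notebook's code: the helper names (`one_hot`, `linear_no_bias`, `embedding_lookup`) and the weight values are made up for illustration, and the fully connected layer is assumed to have no bias term. Multiplying a one-hot vector by a weight matrix selects exactly one row, which is what an embedding lookup returns directly.

```python
def one_hot(index, num_classes):
    """Return a one-hot list with 1.0 at position `index` (hypothetical helper)."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def linear_no_bias(x, weight):
    """Compute x @ weight for a vector x and a (vocab_size x dim) matrix.

    Assumption: the fully connected layer has no bias term.
    """
    dim = len(weight[0])
    return [
        sum(x[row] * weight[row][col] for row in range(len(weight)))
        for col in range(dim)
    ]


def embedding_lookup(index, weight):
    """Return row `index` of the weight matrix, as an embedding layer would."""
    return list(weight[index])


# Illustrative weights: vocabulary of 4 tokens, embedding dimension 3.
weight = [
    [0.1, -0.2, 0.3],
    [0.4, 0.5, -0.6],
    [-0.7, 0.8, 0.9],
    [1.0, -1.1, 1.2],
]

for token_id in [2, 0, 3]:
    via_linear = linear_no_bias(one_hot(token_id, len(weight)), weight)
    via_lookup = embedding_lookup(token_id, weight)
    # Both paths select the same row of the weight matrix.
    assert via_linear == via_lookup
```

The lookup avoids building the one-hot vector and performing the multiplication, which is why embedding layers are the more efficient choice for large vocabularies.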

ch02/README.md Normal file

@@ -0,0 +1,7 @@
# Chapter 2: Working with Text Data
- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
- [02_bonus_bytepair-encoder](02_bonus_bytepair-encoder) contains optional code to benchmark different byte pair encoder implementations
- [03_bonus_embedding-vs-matmul](03_bonus_embedding-vs-matmul) contains optional bonus code showing that an embedding layer is equivalent to a fully connected layer applied to one-hot encoded vectors