add and update readme files

rasbt 2024-02-05 06:51:58 -06:00
parent 2b38b63a7a
commit 3a5fc79b38
7 changed files with 22 additions and 9 deletions

ch02/01_main-chapter-code/README.md

@@ -1,5 +1,5 @@
 # Chapter 2: Working with Text Data
 
-- [ch02.ipynb](ch02.ipynb) has all the code as it appears in the chapter
+- [ch02.ipynb](ch02.ipynb) contains all the code as it appears in the chapter
 - [dataloader.ipynb](dataloader.ipynb) is a minimal notebook with the main data loading pipeline implemented in this chapter

ch02/README.md

@@ -1,6 +1,6 @@
 # Chapter 2: Working with Text Data
 
-- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code
+- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code and exercise solutions
 - [02_bonus_bytepair-encoder](02_bonus_bytepair-encoder) contains optional code to benchmark different byte pair encoder implementations

ch03/01_main-chapter-code/README.md

@@ -1,5 +1,5 @@
-# Chapter 3: Understanding Attention Mechanisms
+# Chapter 3: Coding Attention Mechanisms
 
-- [ch03.ipynb](ch03.ipynb) has all the code as it appears in the chapter
+- [ch03.ipynb](ch03.ipynb) contains all the code as it appears in the chapter
 - [multihead-attention.ipynb](multihead-attention.ipynb) is a minimal notebook with the main data loading pipeline implemented in this chapter

ch03/README.md

@@ -1,3 +1,3 @@
-# Chapter 3: Understanding Attention Mechanisms
+# Chapter 3: Coding Attention Mechanisms
 
 - [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.

ch04/01_main-chapter-code/README.md (new file)

@@ -0,0 +1,6 @@
+# Chapter 4: Implementing a GPT model from Scratch To Generate Text
+
+- [ch04.ipynb](ch04.ipynb) contains all the code as it appears in the chapter
+- [previous_chapters.py](previous_chapters.py) is a Python module that contains the `MultiHeadAttention` module from the previous chapter, which we import in [ch04.ipynb](ch04.ipynb) to create the GPT model
+- [gpt.py](gpt.py) is a standalone Python script file with the code that we implemented thus far, including the GPT model we coded in this chapter
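As context for the new README above, here is a minimal sketch of the import pattern it describes. It assumes `previous_chapters.py` sits in the working directory; the `MultiHeadAttention` constructor arguments shown are illustrative assumptions (GPT-2 "small"-like sizes), not the module's confirmed signature.

```python
import torch
from previous_chapters import MultiHeadAttention  # module named in the README above

# Assumed constructor arguments; the actual signature lives in previous_chapters.py
mha = MultiHeadAttention(
    d_in=768, d_out=768,      # embedding width in and out
    context_length=1024,      # maximum sequence length
    dropout=0.1, num_heads=12
)

x = torch.randn(2, 6, 768)    # (batch, num_tokens, emb_dim)
print(mha(x).shape)           # expected: torch.Size([2, 6, 768])
```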

ch04/01_main-chapter-code/ch04.ipynb

@@ -134,7 +134,9 @@
 " \n",
 "        # Use a placeholder for LayerNorm\n",
 "        self.final_norm = DummyLayerNorm(cfg[\"emb_dim\"])\n",
-"        self.out_head = nn.Linear(cfg[\"emb_dim\"], cfg[\"vocab_size\"], bias=False)\n",
+"        self.out_head = nn.Linear(\n",
+"            cfg[\"emb_dim\"], cfg[\"vocab_size\"], bias=False\n",
+"        )\n",
 "\n",
 "    def forward(self, in_idx):\n",
 "        batch_size, seq_len = in_idx.shape\n",
@@ -208,7 +210,7 @@
 "batch.append(torch.tensor(tokenizer.encode(txt1)))\n",
 "batch.append(torch.tensor(tokenizer.encode(txt2)))\n",
 "batch = torch.stack(batch, dim=0)\n",
-"batch"
+"print(batch)"
 ]
 },
 {
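A runnable sketch of the cell this hunk changes; the tokenizer and the two sample strings are assumptions (tiktoken's GPT-2 encoding, and two strings chosen to encode to the same number of tokens, which `torch.stack` requires).

```python
import torch
import tiktoken

# Assumed setup: tiktoken's GPT-2 encoding and two sample strings; the
# notebook's actual txt1/txt2 values are not shown in this diff.
tokenizer = tiktoken.get_encoding("gpt2")
txt1 = "Every effort moves you"
txt2 = "Every day holds a"

batch = []
batch.append(torch.tensor(tokenizer.encode(txt1)))
batch.append(torch.tensor(tokenizer.encode(txt2)))
# torch.stack requires both sequences to have the same token count
batch = torch.stack(batch, dim=0)
print(batch)  # a (2, num_tokens) tensor of token IDs
```

Changing the cell's last line from a bare `batch` to `print(batch)` also makes the cell's output visible when the code is run as a plain script, where bare expressions print nothing.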
@@ -772,7 +774,7 @@
 "torch.manual_seed(123)\n",
 "ex_short = ExampleWithShortcut()\n",
 "inputs = torch.tensor([[-1., 1., 2.]])\n",
-"ex_short(inputs)"
+"print(ex_short(inputs))"
 ]
 },
 {
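`ExampleWithShortcut` is not defined anywhere in this diff; the following is a hypothetical reconstruction of a module with shortcut (residual) connections, where each layer's input is added back to its output.

```python
import torch
import torch.nn as nn

# Hypothetical reconstruction of ExampleWithShortcut (not shown in this diff):
# a small stack of layers with a residual/shortcut connection around each one.
class ExampleWithShortcut(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Sequential(nn.Linear(3, 3), nn.GELU()) for _ in range(3)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = x + layer(x)  # shortcut: add the input to the layer output
        return x

torch.manual_seed(123)
ex_short = ExampleWithShortcut()
inputs = torch.tensor([[-1., 1., 2.]])
print(ex_short(inputs))
```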
@@ -947,7 +949,9 @@
 " \n",
 "        # Use a placeholder for LayerNorm\n",
 "        self.final_norm = LayerNorm(cfg[\"emb_dim\"])\n",
-"        self.out_head = nn.Linear(cfg[\"emb_dim\"], cfg[\"vocab_size\"], bias=False)\n",
+"        self.out_head = nn.Linear(\n",
+"            cfg[\"emb_dim\"], cfg[\"vocab_size\"], bias=False\n",
+"        )\n",
 "\n",
 "    def forward(self, in_idx):\n",
 "        batch_size, seq_len = in_idx.shape\n",

ch04/README.md (new file)

@@ -0,0 +1,3 @@
+# Chapter 4: Implementing a GPT model from Scratch To Generate Text
+
+- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.