Mirror of https://github.com/rasbt/LLMs-from-scratch.git, synced 2025-10-26 23:39:53 +00:00
fix 1024 characters to 1024 tokens (#152)
This commit is contained in: commit 7b34833ee1 (parent c94f24e759)
@@ -161,7 +161,7 @@
 "source": [
 "- We use dropout of 0.1 above, but it's relatively common to train LLMs without dropout nowadays\n",
 "- Modern LLMs also don't use bias vectors in the `nn.Linear` layers for the query, key, and value matrices (unlike earlier GPT models), which is achieved by setting `\"qkv_bias\": False`\n",
-"- We reduce the context length (`context_length`) of only 256 tokens to reduce the computational resource requirements for training the model, whereas the original 124 million parameter GPT-2 model used 1024 characters\n",
+"- We reduce the context length (`context_length`) of only 256 tokens to reduce the computational resource requirements for training the model, whereas the original 124 million parameter GPT-2 model used 1024 tokens\n",
 " - This is so that more readers will be able to follow and execute the code examples on their laptop computer\n",
 " - However, please feel free to increase the `context_length` to 1024 tokens (this would not require any code changes)\n",
 " - We will also load a model with a 1024 `context_length` later from pretrained weights"
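The settings discussed in the changed notebook text (dropout of 0.1, no QKV bias, a context length of 256 tokens instead of GPT-2's original 1024 tokens) are typically collected in a single configuration dictionary. Below is a minimal sketch of what such a dictionary could look like; the dictionary name and any values not mentioned in the diff (vocabulary size, embedding dimension, head and layer counts) are assumptions for illustration, not taken from this commit.

# Hypothetical configuration dict illustrating the settings discussed above.
# Only "drop_rate", "qkv_bias", and "context_length" are mentioned in the diff;
# the remaining keys and values are assumptions based on the 124M-parameter GPT-2 setup.
GPT_CONFIG_124M = {
    "vocab_size": 50257,     # assumed GPT-2 BPE vocabulary size
    "context_length": 256,   # reduced from the original 1024 tokens to lower compute requirements
    "emb_dim": 768,          # assumed embedding size for the 124M-parameter model
    "n_heads": 12,           # assumed number of attention heads
    "n_layers": 12,          # assumed number of transformer blocks
    "drop_rate": 0.1,        # dropout of 0.1, as mentioned in the notebook text
    "qkv_bias": False        # no bias vectors in the query/key/value nn.Linear layers
}

# Increasing the context length back to 1024 tokens would need no other code changes:
# GPT_CONFIG_124M["context_length"] = 1024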