rasbt 2024-05-10 07:02:14 -05:00
parent d8de9377de
commit 774974de97


@@ -1440,6 +1440,15 @@
     "print(\"Outputs dimensions:\", outputs.shape) # shape: (batch_size, num_tokens, num_classes)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "75430a01-ef9c-426a-aca0-664689c4f461",
+   "metadata": {},
+   "source": [
+    "- As discussed in previous chapters, for each input token, there's one output vector\n",
+    "- Since we fed the model a text sample with 4 input tokens, the output consists of 4 2-dimensional output vectors above"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "7df9144f-6817-4be4-8d4b-5d4dadfe4a9b",
@@ -1453,11 +1462,9 @@
    "id": "e3bb8616-c791-4f5c-bac0-5302f663e46a",
    "metadata": {},
    "source": [
-    "- As discussed in previous chapters, for each input token, there's one output vector\n",
-    "- Since we fed the model a text sample with 6 input tokens, the output consists of 6 2-dimensional output vectors above\n",
     "- In chapter 3, we discussed the attention mechanism, which connects each input token to each other input token\n",
     "- In chapter 3, we then also introduced the causal attention mask that is used in GPT-like models; this causal mask lets a current token only attend to the current and previous token positions\n",
-    "- Based on this causal attention mechanism, the 6th (last) token above contains the most information among all tokens because it's the only token that includes information about all other tokens\n",
+    "- Based on this causal attention mechanism, the 4th (last) token contains the most information among all tokens because it's the only token that includes information about all other tokens\n",
     "- Hence, we are particularly interested in this last token, which we will finetune for the spam classification task"
    ]
   },
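To make the last-token argument concrete, here is a short, hedged sketch of selecting that token's output for classification, reusing the illustrative `outputs` tensor from the sketch above (the notebook's own code may differ in naming):

```python
# Under the causal mask, only the final token has attended to all other
# tokens, so its output vector is the one used for spam classification.
last_token_output = outputs[:, -1, :]  # shape: (batch_size, num_classes)
print("Last output token:", last_token_output)
```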
@@ -2265,7 +2272,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.6"
+   "version": "3.10.12"
   }
  },
  "nbformat": 4,