diff --git a/ch05/01_main-chapter-code/ch05.ipynb b/ch05/01_main-chapter-code/ch05.ipynb
index 394777f..c1f866e 100644
--- a/ch05/01_main-chapter-code/ch05.ipynb
+++ b/ch05/01_main-chapter-code/ch05.ipynb
@@ -1561,6 +1561,15 @@
     "print(inverse_vocab[next_token_id])"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "c63d0a27-830b-42b5-9986-6d1a7de04dd9",
+   "metadata": {},
+   "source": [
+    "- Instead of determining the most likely token via `torch.argmax`, we use `torch.multinomial(probas, num_samples=1)` to determine the most likely token by sampling from the softmax distribution\n",
+    "- For illustration purposes, let's see what happens when we sample the next token 1,000 times using the original softmax probabilities:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 32,
@@ -1593,15 +1602,6 @@
     "print_sampled_tokens(probas)"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "c63d0a27-830b-42b5-9986-6d1a7de04dd9",
-   "metadata": {},
-   "source": [
-    "- Instead of determining the most likely token via `torch.argmax`, we use `torch.multinomial(probas, num_samples=1)` to determine the most likely token by sampling from the softmax distribution\n",
-    "- For illustration purposes, let's see what happens when we sample the next token 1,000 times using the original softmax probabilities:"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "32e7d9cf-a26d-4d9a-8664-4af1efa73832",
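
For readers reviewing this patch out of context: the moved cell explains the code around it, which samples the next token via `torch.multinomial` rather than picking it greedily with `torch.argmax`. Below is a minimal, self-contained sketch of that idea. Only `probas`, `inverse_vocab`, and `print_sampled_tokens` are named in the diff; the toy vocabulary and logit values are illustrative assumptions, not necessarily the notebook's exact ones.

```python
import torch

# Toy vocabulary and next-token logits; the specific words and values are
# illustrative stand-ins, not necessarily the notebook's exact ones.
vocab = {"closer": 0, "every": 1, "effort": 2, "forward": 3, "inches": 4,
         "moves": 5, "pizza": 6, "toward": 7, "you": 8}
inverse_vocab = {v: k for k, v in vocab.items()}

next_token_logits = torch.tensor(
    [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]
)
probas = torch.softmax(next_token_logits, dim=0)

# Greedy decoding always returns the single highest-probability token ...
print(inverse_vocab[torch.argmax(probas).item()])

# ... whereas multinomial sampling draws a token in proportion to its probability.
torch.manual_seed(123)
next_token_id = torch.multinomial(probas, num_samples=1).item()
print(inverse_vocab[next_token_id])

# Sampling 1,000 times, as the moved cell suggests, shows how often each
# token gets drawn under the original softmax probabilities.
def print_sampled_tokens(probas):
    torch.manual_seed(123)
    samples = [torch.multinomial(probas, num_samples=1).item() for _ in range(1_000)]
    for token_id, freq in enumerate(torch.bincount(torch.tensor(samples))):
        print(f"{freq} x {inverse_vocab[token_id]}")

print_sampled_tokens(probas)
```

The contrast the cell is drawing: `torch.argmax` yields the same token on every call, while `torch.multinomial` occasionally returns lower-probability tokens, which is what the 1,000-sample frequency count makes visible.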