Mirror of https://github.com/rasbt/LLMs-from-scratch.git

Add figures for ch06 (#141)

commit d3201f5aad (parent b8324061d0)
@@ -25,7 +25,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "id": "5b7e01c2-1c84-4f2a-bb51-2e0b74abda90",
 "metadata": {
 "colab": {
@@ -62,6 +62,14 @@
 " print(f\"{p} version: {version(p)}\")"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "a445828a-ff10-4efa-9f60-a2e2aed4c87d",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/chapter-overview.webp\" width=500px>"
+]
+},
 {
 "cell_type": "markdown",
 "id": "3a84cf35-b37f-4c15-8972-dfafc9fadc1c",
@@ -82,6 +90,42 @@
 "- No code in this section"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "ac45579d-d485-47dc-829e-43be7f4db57b",
+"metadata": {},
+"source": [
+"- The most common ways to finetune language models are instruction-finetuning and classification finetuning\n",
+"- Instruction-finetuning, depicted below, is the topic of the next chapter"
+]
+},
+{
+"cell_type": "markdown",
+"id": "6c29ef42-46d9-43d4-8bb4-94974e1665e4",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/instructions.webp\" width=500px>"
+]
+},
+{
+"cell_type": "markdown",
+"id": "a7f60321-95b8-46a9-97bf-1d07fda2c3dd",
+"metadata": {},
+"source": [
+"- Classification finetuning, the topic of this chapter, is a procedure you may already be familiar with if you have a background in machine learning -- it's similar to training a convolutional network to classify handwritten digits, for example\n",
+"- In classification finetuning, we have a specific number of class labels (for example, \"spam\" and \"not spam\") that the model can output\n",
+"- A classification-finetuned model can only predict classes it has seen during training (for example, \"spam\" or \"not spam\"), whereas an instruction-finetuned model can usually perform many tasks\n",
+"- We can think of a classification-finetuned model as a very specialized model; in practice, it is much easier to create a specialized model than a generalist model that performs well on many different tasks"
+]
+},
+{
+"cell_type": "markdown",
+"id": "0b37a0c4-0bb1-4061-b1fe-eaa4416d52c3",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/spam-non-spam.webp\" width=500px>"
+]
+},
 {
 "cell_type": "markdown",
 "id": "8c7017a2-32aa-4002-a2f3-12aac293ccdf",
@@ -92,6 +136,14 @@
 "## 6.2 Preparing the dataset"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "5f628975-d2e8-4f7f-ab38-92bb868b7067",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-1.webp\" width=500px>"
+]
+},
 {
 "cell_type": "markdown",
 "id": "9fbd459f-63fa-4d8c-8499-e23103156c7d",
@@ -106,7 +158,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": 2,
 "id": "def7c09b-af9c-4216-90ce-5e67aed1065c",
 "metadata": {
 "colab": {
@@ -169,7 +221,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 3,
 "id": "da0ed4da-ac31-4e4d-8bdd-2153be4656a4",
 "metadata": {
 "colab": {
@@ -283,7 +335,7 @@
 "[5572 rows x 2 columns]"
 ]
 },
-"execution_count": 4,
+"execution_count": 3,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -307,7 +359,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": 4,
 "id": "495a5280-9d7c-41d4-9719-64ab99056d4c",
 "metadata": {
 "colab": {
@@ -345,7 +397,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": 5,
 "id": "7be4a0a2-9704-4a96-b38f-240339818688",
 "metadata": {
 "colab": {
@@ -396,7 +448,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": 6,
 "id": "c1b10c3d-5d57-42d0-8de8-cf80a06f5ffd",
 "metadata": {
 "id": "c1b10c3d-5d57-42d0-8de8-cf80a06f5ffd"
@@ -418,7 +470,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 8,
+"execution_count": 7,
 "id": "uQl0Psdmx15D",
 "metadata": {
 "id": "uQl0Psdmx15D"
@@ -448,6 +500,14 @@
 "test_df.to_csv(\"test.csv\", index=None)"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "a8d7a0c5-1d5f-458a-b685-3f49520b0094",
+"metadata": {},
+"source": [
+"## 6.3 Creating data loaders"
+]
+},
 {
 "cell_type": "markdown",
 "id": "7126108a-75e7-4862-b0fb-cbf59a18bb6c",
@@ -465,7 +525,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": 8,
 "id": "74c3c463-8763-4cc0-9320-41c7eaad8ab7",
 "metadata": {
 "colab": {
@@ -490,6 +550,27 @@
 "print(tokenizer.encode(\"<|endoftext|>\", allowed_special={\"<|endoftext|>\"}))"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": 9,
+"id": "0ff0f6b2-376b-4740-8858-55b60784be73",
+"metadata": {},
+"outputs": [
+{
+"data": {
+"text/plain": [
+"[42, 13, 314, 481, 1908, 340, 757]"
+]
+},
+"execution_count": 9,
+"metadata": {},
+"output_type": "execute_result"
+}
+],
+"source": [
+"tokenizer.encode(\"K. I will sent it again\")"
+]
+},
 {
 "cell_type": "markdown",
 "id": "04f582ff-68bf-450e-bd87-5fb61afe431c",
@@ -500,6 +581,14 @@
 "- The `SpamDataset` class below identifies the longest sequence in the training dataset and adds the padding token to the others to match that sequence length"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "0829f33f-1428-4f22-9886-7fee633b3666",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/pad-input-sequences.webp\" width=500px>"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 10,
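A minimal sketch of the padding scheme the `SpamDataset` cell in the hunk above describes: every encoded message is extended with the `<|endoftext|>` pad token until it matches the longest training sequence. The sample texts and variable names here are made up for illustration; 50256 is GPT-2's token id for `<|endoftext|>`.

```python
import tiktoken

tokenizer = tiktoken.get_encoding("gpt2")
pad_token_id = 50256  # id of "<|endoftext|>", used here as the padding token

texts = ["Do you have time", "Hi"]  # made-up stand-ins for the SMS messages
encoded = [tokenizer.encode(text) for text in texts]

# Pad every sequence with the pad token until it matches the longest one
max_length = max(len(row) for row in encoded)
padded = [row + [pad_token_id] * (max_length - len(row)) for row in encoded]
print(padded)  # every row now contains max_length token ids
```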
@@ -611,6 +700,14 @@
 "- Next, we use the dataset to instantiate the data loaders, which is similar to creating the data loaders in previous chapters:"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "64bcc349-205f-48f8-9655-95ff21f5e72f",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/batch.webp\" width=500px>"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 13,
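The data loader instantiation mentioned in the hunk above can be sketched as follows. The dummy tensors stand in for a `SpamDataset` instance, and the batch size of 8 is an assumption for illustration, not necessarily the notebook's setting.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-in for a SpamDataset: 100 padded sequences of length 12,
# each with a binary spam/ham label
inputs = torch.randint(0, 50257, (100, 12))
labels = torch.randint(0, 2, (100,))

train_loader = DataLoader(
    TensorDataset(inputs, labels),
    batch_size=8,     # assumed value; the notebook's setting may differ
    shuffle=True,
    drop_last=True,   # drop the last incomplete batch
)

input_batch, label_batch = next(iter(train_loader))
print(input_batch.shape, label_batch.shape)  # torch.Size([8, 12]) torch.Size([8])
```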
@@ -730,7 +827,7 @@
 "id": "d1c4f61a-5f5d-4b3b-97cf-151b617d1d6c"
 },
 "source": [
-"## 6.3 Initializing a model with pretrained weights"
+"## 6.4 Initializing a model with pretrained weights"
 ]
 },
 {
@@ -738,7 +835,9 @@
 "id": "97e1af8b-8bd1-4b44-8b8b-dc031496e208",
 "metadata": {},
 "source": [
-"- In this section, we initialize the pretrained model we worked with in the previous chapter"
+"- In this section, we initialize the pretrained model we worked with in the previous chapter\n",
+"\n",
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-2.webp\" width=500px>"
 ]
 },
 {
@@ -819,43 +918,86 @@
 {
 "cell_type": "code",
 "execution_count": 18,
-"id": "fe4af171-5dce-4f6e-9b63-1e4e16e8b94c",
-"metadata": {
-"colab": {
-"base_uri": "https://localhost:8080/"
-},
-"id": "fe4af171-5dce-4f6e-9b63-1e4e16e8b94c",
-"outputId": "8ff3ec54-1dc3-4930-9be6-8eeaf560f8d4"
-},
+"id": "d8ac25ff-74b1-4149-8dc5-4c429d464330",
+"metadata": {},
 "outputs": [
 {
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Output text: Every effort moves you forward.\n",
+"Every effort moves you forward.\n",
 "\n",
 "The first step is to understand the importance of your work\n"
 ]
 }
 ],
 "source": [
-"from previous_chapters import generate_text_simple\n",
+"from previous_chapters import (\n",
+" generate_text_simple,\n",
+" text_to_token_ids,\n",
+" token_ids_to_text\n",
+")\n",
 "\n",
-"start_context = \"Every effort moves you\"\n",
-"\n",
-"tokenizer = tiktoken.get_encoding(\"gpt2\")\n",
-"encoded = tokenizer.encode(start_context)\n",
-"encoded_tensor = torch.tensor(encoded).unsqueeze(0)\n",
+"text_1 = \"Every effort moves you\"\n",
 "\n",
-"out = generate_text_simple(\n",
+"token_ids = generate_text_simple(\n",
 " model=model,\n",
-" idx=encoded_tensor,\n",
+" idx=text_to_token_ids(text_1, tokenizer),\n",
 " max_new_tokens=15,\n",
 " context_size=BASE_CONFIG[\"context_length\"]\n",
 ")\n",
-"decoded_text = tokenizer.decode(out.squeeze(0).tolist())\n",
 "\n",
-"print(\"Output text:\", decoded_text)"
+"print(token_ids_to_text(token_ids, tokenizer))"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "69162550-6a02-4ece-8db1-06c71d61946f",
+"metadata": {},
+"source": [
+"- Before we finetune the model as a classifier, let's see if the model can perhaps already classify spam messages via prompting"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 19,
+"id": "94224aa9-c95a-4f8a-a420-76d01e3a800c",
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Is the following text 'spam'? Answer with 'yes' or 'no': 'You are a winner you have been specially selected to receive $1000 cash or a $2000 award.' Answer with 'yes' or 'no'. Answer with 'yes' or 'no'. Answer with 'yes' or 'no'. Answer with 'yes'\n"
+]
+}
+],
+"source": [
+"text_2 = (\n",
+" \"Is the following text 'spam'? Answer with 'yes' or 'no':\"\n",
+" \" 'You are a winner you have been specially\"\n",
+" \" selected to receive $1000 cash or a $2000 award.'\"\n",
+" \" Answer with 'yes' or 'no'.\"\n",
+")\n",
+"\n",
+"token_ids = generate_text_simple(\n",
+" model=model,\n",
+" idx=text_to_token_ids(text_2, tokenizer),\n",
+" max_new_tokens=23,\n",
+" context_size=BASE_CONFIG[\"context_length\"]\n",
+")\n",
+"\n",
+"print(token_ids_to_text(token_ids, tokenizer))"
+]
+},
+{
+"cell_type": "markdown",
+"id": "1ce39ed0-2c77-410d-8392-dd15d4b22016",
+"metadata": {},
+"source": [
+"- As we can see, the model is not very good at following instructions\n",
+"- This is expected, since it has only been pretrained and not instruction-finetuned (instruction finetuning will be covered in the next chapter)"
+]
+},
 {
@@ -865,7 +1007,15 @@
 "id": "4c9ae440-32f9-412f-96cf-fd52cc3e2522"
 },
 "source": [
-"## 6.4 Adding a classification head"
+"## 6.5 Adding a classification head"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "d6e9d66f-76b2-40fc-9ec5-3f972a8db9c0",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/lm-head.webp\" width=500px>"
+]
+},
 {
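The idea behind the classification head added in section 6.5 can be sketched with standalone linear layers. The 768-dimensional hidden size and 50,257-token vocabulary match the GPT-2 124M configuration used in this chapter; the input tensor below is random stand-in data rather than real model activations.

```python
import torch

torch.manual_seed(123)

# The pretrained output head maps 768-dim hidden states to the 50,257-token vocabulary
lm_head = torch.nn.Linear(in_features=768, out_features=50257)

# For spam classification, it is swapped for a small two-class head
num_classes = 2
classification_head = torch.nn.Linear(in_features=768, out_features=num_classes)

hidden_states = torch.randn(1, 4, 768)   # stand-in for (batch_size, num_tokens, emb_dim)
logits = classification_head(hidden_states)
print(logits.shape)                      # torch.Size([1, 4, 2])
```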
@@ -879,7 +1029,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 19,
+"execution_count": 20,
 "id": "b23aff91-6bd0-48da-88f6-353657e6c981",
 "metadata": {
 "colab": {
@@ -1149,7 +1299,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 20,
+"execution_count": 21,
 "id": "fkMWFl-0etea",
 "metadata": {
 "id": "fkMWFl-0etea"
@@ -1171,7 +1321,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 21,
+"execution_count": 22,
 "id": "7e759fa0-0f69-41be-b576-17e5f20e04cb",
 "metadata": {},
 "outputs": [],
@@ -1192,9 +1342,17 @@
 "- So, we are also making the last transformer block and the final `LayerNorm` module connecting the last transformer block to the output layer trainable"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "0be7c1eb-c46c-4065-8525-eea1b8c66d10",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/trainable.webp\" width=500px>"
+]
+},
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": 23,
 "id": "2aedc120-5ee3-48f6-92f2-ad9304ebcdc7",
 "metadata": {
 "id": "2aedc120-5ee3-48f6-92f2-ad9304ebcdc7"
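The freezing scheme described in the hunk above, where only the last transformer block, the final `LayerNorm`, and the new output head stay trainable, can be sketched as follows. This assumes the chapter's `GPTModel` exposes `trf_blocks`, `final_norm`, and `out_head` attributes; treat the attribute names as assumptions if you adapt this to another model.

```python
# Freeze all weights first, then selectively unfreeze the top of the model.
# "model" is assumed to be the notebook's GPTModel instance; attribute names
# are assumed from this repository's model definition.
for param in model.parameters():
    param.requires_grad = False

for param in model.trf_blocks[-1].parameters():  # last transformer block
    param.requires_grad = True

for param in model.final_norm.parameters():      # final LayerNorm
    param.requires_grad = True

for param in model.out_head.parameters():        # new classification head
    param.requires_grad = True
```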
@@ -1219,7 +1377,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 23,
+"execution_count": 24,
 "id": "f645c06a-7df6-451c-ad3f-eafb18224ebc",
 "metadata": {
 "colab": {
@@ -1233,13 +1391,13 @@
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"Inputs: tensor([[ 40, 1107, 8288, 428, 3807, 13]])\n",
-"Inputs dimensions: torch.Size([1, 6])\n"
+"Inputs: tensor([[5211, 345, 423, 640]])\n",
+"Inputs dimensions: torch.Size([1, 4])\n"
 ]
 }
 ],
 "source": [
-"inputs = tokenizer.encode(\"I really liked this movie.\")\n",
+"inputs = tokenizer.encode(\"Do you have time\")\n",
 "inputs = torch.tensor(inputs).unsqueeze(0)\n",
 "print(\"Inputs:\", inputs)\n",
 "print(\"Inputs dimensions:\", inputs.shape) # shape: (batch_size, num_tokens)"
@@ -1255,7 +1413,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 24,
+"execution_count": 25,
 "id": "48dc84f1-85cc-4609-9cee-94ff539f00f4",
 "metadata": {
 "colab": {
@@ -1270,13 +1428,11 @@
 "output_type": "stream",
 "text": [
 "Outputs:\n",
-" tensor([[[-1.9044, 1.5321],\n",
-" [-4.9851, 8.5136],\n",
-" [-1.6985, 4.6314],\n",
-" [-2.3820, 5.7547],\n",
-" [-3.8736, 4.4867],\n",
-" [-5.7543, 5.3615]]])\n",
-"Outputs dimensions: torch.Size([1, 6, 2])\n"
+" tensor([[[-1.5854, 0.9904],\n",
+" [-3.7235, 7.4548],\n",
+" [-2.2661, 6.6049],\n",
+" [-3.5983, 3.9902]]])\n",
+"Outputs dimensions: torch.Size([1, 4, 2])\n"
 ]
 }
 ],
@@ -1288,6 +1444,14 @@
 "print(\"Outputs dimensions:\", outputs.shape) # shape: (batch_size, num_tokens, num_classes)"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "7df9144f-6817-4be4-8d4b-5d4dadfe4a9b",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/input-and-output.webp\" width=500px>"
+]
+},
 {
 "cell_type": "markdown",
 "id": "e3bb8616-c791-4f5c-bac0-5302f663e46a",
@@ -1325,12 +1489,28 @@
 "print(\"Last output token:\", outputs[:, -1, :])"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "8df08ae0-e664-4670-b7c5-8a2280d9b41b",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/attention-mask.webp\" width=200px>"
+]
+},
 {
 "cell_type": "markdown",
 "id": "32aa4aef-e1e9-491b-9adf-5aa973e59b8c",
 "metadata": {},
 "source": [
-"## 6.5 Calculating the classification loss and accuracy"
+"## 6.6 Calculating the classification loss and accuracy"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "669e1fd1-ace8-44b4-b438-185ed0ba8b33",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-3.webp\" width=500px>"
+]
+},
 {
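Section 6.6 bases the classification loss on the last token position only, since with causal attention that position is the only one that attends to the entire sequence (as the `outputs[:, -1, :]` cell above illustrates). A self-contained sketch with random stand-in logits:

```python
import torch

logits = torch.randn(8, 12, 2)       # stand-in outputs: (batch, num_tokens, num_classes)
labels = torch.randint(0, 2, (8,))   # one spam/ham label per sequence

# Only the last token's logits enter the cross-entropy loss
last_token_logits = logits[:, -1, :]                                # (batch, num_classes)
loss = torch.nn.functional.cross_entropy(last_token_logits, labels)
print(loss)
```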
@@ -1545,7 +1725,7 @@
 "id": "456ae0fd-6261-42b4-ab6a-d24289953083"
 },
 "source": [
-"## 6.6 Finetuning the model on supervised data"
+"## 6.7 Finetuning the model on supervised data"
 ]
 },
 {
@@ -1560,6 +1740,14 @@
 " 2. calculate the accuracy after each epoch instead of printing a sample text after each epoch"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "979b6222-1dc2-4530-9d01-b6b04fe3de12",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/training-loop.webp\" width=500px>"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 31,
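The accuracy calculation mentioned in point 2 above reduces to an argmax over the last token's logits followed by a comparison with the labels; a sketch with the same stand-in shapes as before:

```python
import torch

logits = torch.randn(8, 12, 2)       # stand-in model outputs
labels = torch.randint(0, 2, (8,))

predictions = torch.argmax(logits[:, -1, :], dim=-1)  # predicted class per sequence
accuracy = (predictions == labels).float().mean()
print(f"batch accuracy: {accuracy.item():.2%}")
```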
@@ -1868,7 +2056,15 @@
 "id": "a74d9ad7-3ec1-450e-8c9f-4fc46d3d5bb0",
 "metadata": {},
 "source": [
-"## 6.7 Using the LLM as a SPAM classifier"
+"## 6.8 Using the LLM as a SPAM classifier"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "72ebcfa2-479e-408b-9cf0-7421f6144855",
+"metadata": {},
+"source": [
+"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch06_compressed/overview-4.webp\" width=500px>"
+]
+},
 {
@@ -2069,7 +2265,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.10.6"
+"version": "3.11.4"
 }
 },
 "nbformat": 4,