mirror of https://github.com/rasbt/LLMs-from-scratch.git
synced 2025-08-15 20:21:22 +00:00
make spam spelling consistent
This commit is contained in:
parent 9682b0e22d
commit 24e9110fa8
@@ -1415,7 +1415,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.4"
   }
  },
  "nbformat": 4,
@@ -152,7 +152,7 @@
    },
    "source": [
     "- This section prepares the dataset we use for classification finetuning\n",
-    "- We use a dataset consisting of SPAM and non-SPAM text messages to finetune the LLM to classify them\n",
+    "- We use a dataset consisting of spam and non-spam text messages to finetune the LLM to classify them\n",
     "- First, we download and unzip the dataset"
    ]
   },
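
For context, the download-and-unzip step this cell refers to looks roughly like the sketch below; the URL, archive name, and target paths are assumptions based on the UCI SMS Spam Collection dataset the chapter uses, not part of this diff:

import urllib.request
import zipfile
from pathlib import Path

# Assumed source: the UCI SMS Spam Collection archive
url = "https://archive.ics.uci.edu/static/public/228/sms+spam+collection.zip"
zip_path = Path("sms_spam_collection.zip")
extracted_path = Path("sms_spam_collection")

# Download the zip archive (skip if it already exists)
if not zip_path.exists():
    urllib.request.urlretrieve(url, zip_path)

# Unzip into the target directory
with zipfile.ZipFile(zip_path, "r") as zip_ref:
    zip_ref.extractall(extracted_path)
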
@@ -354,7 +354,7 @@
     "id": "e7b6e631-4f0b-4aab-82b9-8898e6663109"
    },
    "source": [
-    "- When we check the class distribution, we see that the data contains \"ham\" (i.e., not-SPAM) much more frequently than \"spam\""
+    "- When we check the class distribution, we see that the data contains \"ham\" (i.e., \"not spam\") much more frequently than \"spam\""
    ]
   },
   {
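
The class-distribution check this cell describes can be reproduced with a short sketch like the following; the file name is an assumption, while the "Label" column matches the code shown later in this diff:

import pandas as pd

# Assumes the extracted archive contains the tab-separated SMSSpamCollection file
df = pd.read_csv(
    "sms_spam_collection/SMSSpamCollection", sep="\t",
    header=None, names=["Label", "Text"]
)

# "ham" occurs far more often than "spam" in this dataset
print(df["Label"].value_counts())
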
@@ -424,7 +424,7 @@
     "    # Count the instances of \"spam\"\n",
     "    num_spam = df[df[\"Label\"] == \"spam\"].shape[0]\n",
     "    \n",
-    "    # Randomly sample \"ham' instances to match the number of 'spam' instances\n",
+    "    # Randomly sample \"ham\" instances to match the number of \"spam\" instances\n",
     "    ham_subset = df[df[\"Label\"] == \"ham\"].sample(num_spam, random_state=123)\n",
     "    \n",
     "    # Combine ham \"subset\" with \"spam\"\n",
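
Pieced together, the undersampling lines in this hunk plausibly sit inside a helper along these lines; the function name and the final pd.concat call are assumptions, since only the middle lines appear in the diff:

import pandas as pd

def create_balanced_dataset(df):
    # Count the instances of "spam"
    num_spam = df[df["Label"] == "spam"].shape[0]

    # Randomly sample "ham" instances to match the number of "spam" instances
    ham_subset = df[df["Label"] == "ham"].sample(num_spam, random_state=123)

    # Combine the "ham" subset with all "spam" instances
    balanced_df = pd.concat([ham_subset, df[df["Label"] == "spam"]])
    return balanced_df

balanced_df = create_balanced_dataset(df)
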
@@ -443,7 +443,7 @@
     "id": "d3fd2f5a-06d8-4d30-a2e3-230b86c559d6"
    },
    "source": [
-    "- Next, we change the \"string\" class labels \"ham\" and \"spam\" into integer class labels 0 and 1:"
+    "- Next, we change the string class labels \"ham\" and \"spam\" into integer class labels 0 and 1:"
    ]
   },
   {
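
The string-to-integer conversion described here amounts to a simple mapping, for example:

# Map the string labels to integers: "ham" -> 0, "spam" -> 1
balanced_df["Label"] = balanced_df["Label"].map({"ham": 0, "spam": 1})
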
@@ -1330,7 +1330,7 @@
    "metadata": {},
    "source": [
     "- Then, we replace the output layer (`model.out_head`), which originally maps the layer inputs to 50,257 dimensions (the size of the vocabulary)\n",
-    "- Since we finetune the model for binary classification (predicting 2 classes, \"spam\" and \"ham\"), we can replace the output layer as shown below, which will be trainable by default\n",
+    "- Since we finetune the model for binary classification (predicting 2 classes, \"spam\" and \"not spam\"), we can replace the output layer as shown below, which will be trainable by default\n",
     "- Note that we use `BASE_CONFIG[\"emb_dim\"]` (which is equal to 768 in the `\"gpt2-small (124M)\"` model) to keep the code below more general"
    ]
   },
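
Replacing the output head as described boils down to swapping in a new linear layer with two output units; a minimal sketch, assuming the `model` and `BASE_CONFIG` objects from the chapter:

import torch

num_classes = 2  # binary classification: "spam" vs. "not spam"

# Replace the 50,257-dimensional vocabulary projection with a 2-class head;
# a newly constructed layer is trainable (requires_grad=True) by default
model.out_head = torch.nn.Linear(
    in_features=BASE_CONFIG["emb_dim"],  # 768 for "gpt2-small (124M)"
    out_features=num_classes
)
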
@@ -1538,7 +1538,7 @@
     "- Hence, instead, we minimize the cross entropy loss as a proxy for maximizing the classification accuracy (you can learn more about this topic in lecture 8 of my freely available [Introduction to Deep Learning](https://sebastianraschka.com/blog/2021/dl-course.html#l08-multinomial-logistic-regression--softmax-regression) class.\n",
     "\n",
     "- Note that in chapter 5, we calculated the cross entropy loss for the next predicted token over the 50,257 token IDs in the vocabulary\n",
-    "- Here, we calculate the cross entropy in a similar fashion; the only difference is that instead of 50,257 token IDs, we now have only two choices: spam (label 1) or ham (label 0).\n",
+    "- Here, we calculate the cross entropy in a similar fashion; the only difference is that instead of 50,257 token IDs, we now have only two choices: \"spam\" (label 1) or \"not spam\" (label 0).\n",
     "- In other words, the loss calculation training code is practically identical to the one in chapter 5, but we now only have two labels instead of 50,257 labels (token IDs).\n",
     "\n",
     "\n",
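
The two-class cross entropy this cell describes can be sketched as follows; `model`, `input_batch`, and `target_labels` are stand-ins for the chapter's variables, not names taken from this diff:

import torch

# Forward pass; keep only the logits of the last token in each sequence
logits = model(input_batch)[:, -1, :]  # shape: (batch_size, 2)

# Same loss function as in chapter 5, but over 2 classes instead of
# 50,257 token IDs; target_labels holds 0 ("not spam") or 1 ("spam")
loss = torch.nn.functional.cross_entropy(logits, target_labels)
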
@@ -2071,7 +2071,7 @@
    "id": "a74d9ad7-3ec1-450e-8c9f-4fc46d3d5bb0",
    "metadata": {},
    "source": [
-    "## 6.8 Using the LLM as a SPAM classifier"
+    "## 6.8 Using the LLM as a spam classifier"
    ]
   },
   {
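
As a rough illustration of what section 6.8 covers, classifying a new text with the finetuned model might look like this sketch; the function name is hypothetical, and `tokenizer`, `model`, `device`, and `max_length` are assumed from the chapter (details such as padding may differ):

import torch

def classify_text(text, model, tokenizer, device, max_length):
    model.eval()

    # Tokenize and truncate the input to the supported context length
    input_ids = tokenizer.encode(text)[:max_length]
    input_tensor = torch.tensor(input_ids, device=device).unsqueeze(0)

    # Predict from the last token's logits; no gradients needed at inference
    with torch.no_grad():
        logits = model(input_tensor)[:, -1, :]
    predicted_label = torch.argmax(logits, dim=-1).item()

    return "spam" if predicted_label == 1 else "not spam"
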
@@ -2284,7 +2284,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.4"
   }
  },
  "nbformat": 4,