Repository: https://github.com/rasbt/LLMs-from-scratch.git (mirror)
Commit 029efee920 ("reorg first section"), parent b9ed5811c3
@@ -28,17 +28,9 @@
"# Chapter 7: Finetuning To Follow Instructions"
]
},
{
"cell_type": "markdown",
"id": "a984b9ef-af93-415a-9ec7-97385f28af7b",
"metadata": {},
"source": [
"- Comments & notes in progress ..."
]
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "4e19327b-6c02-4881-ad02-9b6d3ec0b1b4",
"metadata": {
"colab": {
@@ -52,11 +44,11 @@
"name": "stdout",
"output_type": "stream",
"text": [
"matplotlib version: 3.8.4\n",
"tiktoken version: 0.6.0\n",
"matplotlib version: 3.8.2\n",
"tiktoken version: 0.5.1\n",
"torch version: 2.2.2\n",
"tqdm version: 4.66.2\n",
"tensorflow version: 2.16.1\n"
"tqdm version: 4.66.1\n",
"tensorflow version: 2.15.0\n"
]
}
],
@@ -82,6 +74,36 @@
"## 7.1 Introduction to instruction finetuning"
]
},
{
"cell_type": "markdown",
"id": "53dba24a-6805-496c-9a7f-c75e2d3527ab",
"metadata": {},
"source": [
"- In chapter 5, we saw that pretraining an LLM involves a training procedure where it learns to generate one word at a time\n",
"- Hence, a pretrained LLM is good at text completion, but it is not good at following instructions\n",
"- In this chapter, we teach the LLM to better follow instructions"
]
},
{
"cell_type": "markdown",
"id": "18dc0535-0904-44ed-beaf-9b678292ef35",
"metadata": {},
"source": [
"[insert figure]"
]
},
{
"cell_type": "markdown",
"id": "b4698b23-12e0-4bd7-a140-ccb3dd71d4e8",
"metadata": {},
"source": [
"- An optional step after instruction finetuning is preference tuning, which refines the response style of an LLM; readers interested in preference tuning can find example code in the bonus materials: [../04_preference-tuning-with-dpo](../04_preference-tuning-with-dpo)\n",
"\n",
"- The topics covered in this chapter are summarized in the figure below\n",
"\n",
"[insert figure]"
]
},
{
"cell_type": "markdown",
"id": "5384f0cf-ef3c-4436-a5fa-59bd25649f86",
@@ -90,9 +112,17 @@
"## 7.2 Preparing a dataset for supervised instruction finetuning"
]
},
{
"cell_type": "markdown",
"id": "f8b34ff8-619f-4e89-bd03-ce513269760d",
"metadata": {},
"source": [
"- We will work with an instruction dataset I prepared for this chapter"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "0G3axLw6kY1N",
"metadata": {
"colab": {
@@ -106,7 +136,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1100\n"
"Number of entries: 1100\n"
]
}
],
@@ -137,12 +167,20 @@
"url = \"https://raw.githubusercontent.com/rasbt/LLMs-from-scratch/main/ch07/01_main-chapter-code/instruction-data.json\"\n",
"\n",
"data = download_and_load_file(file_path, url)\n",
"print(len(data))"
"print(\"Number of entries:\", len(data))"
]
},
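The body of `download_and_load_file` lies outside this hunk. As a rough sketch only, a helper with that signature might download the JSON file once, cache it under `file_path`, and parse it into a list of dictionaries (the caching behavior is an assumption here, not confirmed by the diff):

```python
import json
import os
import urllib.request

def download_and_load_file(file_path, url):
    # Download the file only if it is not already present locally
    if not os.path.exists(file_path):
        with urllib.request.urlopen(url) as response:
            text_data = response.read().decode("utf-8")
        with open(file_path, "w", encoding="utf-8") as file:
            file.write(text_data)
    # Load the cached JSON file into a Python list of dictionaries
    with open(file_path, "r", encoding="utf-8") as file:
        data = json.load(file)
    return data
```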
{
"cell_type": "markdown",
"id": "d7af8176-4255-4e92-8c7d-998771733eb8",
"metadata": {},
"source": [
"- Each item in the `data` list we loaded from the JSON file above is a dictionary in the following form:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "-LiuBMsHkzQV",
"metadata": {
"colab": {
@@ -156,17 +194,27 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'instruction': 'Evaluate the following phrase by transforming it into the spelling given.', 'input': 'freind --> friend', 'output': 'The spelling of the given phrase \"freind\" is incorrect, the correct spelling is \"friend\".'}\n"
"Example entry:\n",
"\n",
" {'instruction': 'Identify the correct spelling of the following word.', 'input': 'Ocassion', 'output': \"The correct spelling is 'Occasion.'\"}\n"
]
}
],
"source": [
"print(data[0])"
"print(\"Example entry:\\n\\n\", data[50])"
]
},
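Since every entry is expected to carry the same three keys, a quick sanity check over the whole list can catch malformed records before formatting them into prompts. This snippet is an illustrative addition, not part of the chapter code:

```python
# Illustrative check (not part of the chapter code): verify that every
# entry in `data` is a dictionary with exactly the expected keys
expected_keys = {"instruction", "input", "output"}

for i, entry in enumerate(data):
    assert isinstance(entry, dict), f"Entry {i} is not a dictionary"
    assert set(entry.keys()) == expected_keys, f"Entry {i} has keys {set(entry.keys())}"

print("All", len(data), "entries have the keys:", expected_keys)
```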
{
"cell_type": "markdown",
"id": "c5a32b34-485a-4816-a77a-da14f9fe6e46",
"metadata": {},
"source": [
"- Note that the `'input'` field can be empty:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "uFInFxDDk2Je",
"metadata": {
"colab": {
@@ -180,17 +228,137 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'instruction': \"Change the sentence 'You should have called me.' into a question.\", 'input': '', 'output': 'Should you have called me?'}\n"
"Another example entry:\n",
"\n",
" {'instruction': \"What is an antonym of 'complicated'?\", 'input': '', 'output': \"An antonym of 'complicated' is 'simple'.\"}\n"
]
}
],
"source": [
"print(data[-1])"
"print(\"Another example entry:\\n\\n\", data[999])"
]
},
{
"cell_type": "markdown",
"id": "f034799a-6575-45fd-98c9-9d1012d0fd58",
"metadata": {},
"source": [
"- Instruction finetuning is often referred to as \"supervised instruction finetuning\" because it involves training a model on a dataset where the input-output pairs are explicitly provided\n",
"- There are different ways to format the entries as inputs to the LLM; the figure below illustrates two example formats that were used for training the Alpaca (https://crfm.stanford.edu/2023/03/13/alpaca.html) and Phi-3 (https://arxiv.org/abs/2404.14219) LLMs, respectively"
]
},
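To make the contrast concrete, here is a minimal sketch of what the two template styles look like when applied to one dataset entry. The exact strings below are simplified illustrations, not the official Alpaca or Phi-3 templates:

```python
# Rough sketch of the two prompt styles mentioned above; the exact
# strings are simplified illustrations, not the official templates.
entry = {"instruction": "Identify the correct spelling of the following word.",
         "input": "Ocassion"}

# Alpaca-style: plain-text section headers for instruction, input, and response
alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
    f"\n\n### Instruction:\n{entry['instruction']}"
    f"\n\n### Input:\n{entry['input']}"
    "\n\n### Response:\n"
)

# Phi-3-style: lightweight chat markup with user/assistant role tokens
phi3_prompt = f"<|user|>\n{entry['instruction']} {entry['input']}\n<|assistant|>\n"

print(alpaca_prompt)
print(phi3_prompt)
```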
{
"cell_type": "markdown",
"id": "dffa4f70-44d4-4be4-89a9-2159f4885b10",
"metadata": {},
"source": [
"[insert figure]"
]
},
{
"cell_type": "markdown",
"id": "dd79a74e-befb-491c-be49-f777a6a5b6a6",
"metadata": {},
"source": [
"- In this chapter, we use Alpaca-style prompt formatting, which was the original prompt template for instruction finetuning\n",
"- Below we format the input that we will pass as input to the LLM"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "Jhk37nnJnkBh",
"metadata": {
"id": "Jhk37nnJnkBh"
},
"outputs": [],
"source": [
"def format_input(entry):\n",
"    instruction_text = (\n",
"        f\"Below is an instruction that describes a task. \"\n",
"        f\"Write a response that appropriately completes the request.\"\n",
"        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
"    )\n",
"\n",
"    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
"\n",
"    return instruction_text + input_text"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "F9UQRfjzo4Js",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "F9UQRfjzo4Js",
"outputId": "b56e6c03-f603-4e9d-c1b6-b4a70403caf9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"Identify the correct spelling of the following word.\n",
"\n",
"### Input:\n",
"Ocassion\n",
"\n",
"### Response:\n",
"The correct spelling is 'Occasion.'\n"
]
}
],
"source": [
"model_input = format_input(data[50])\n",
"desired_response = f\"\\n\\n### Response:\\n{data[50]['output']}\"\n",
"\n",
"print(model_input + desired_response)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "a3891fa9-f738-41cd-946c-80ef9a99c346",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"What is an antonym of 'complicated'?\n",
"\n",
"### Response:\n",
"An antonym of 'complicated' is 'simple'.\n"
]
}
],
"source": [
"model_input = format_input(data[999])\n",
"desired_response = f\"\\n\\n### Response:\\n{data[999]['output']}\"\n",
"\n",
"print(model_input + desired_response)"
]
},
{
"cell_type": "markdown",
"id": "4aa8afd5-2a21-49a5-90c3-6a03865a4771",
"metadata": {},
"source": [
"- Lastly, before we prepare the PyTorch data loaders in the next section, we divide the dataset into a training, validation, and test set"
]
},
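The cell that performs the split is cut off by the hunk boundary below, but the printed lengths (935 / 55 / 110 out of 1100 entries) correspond to an 85% / 5% / 10% partition. A sketch that reproduces those numbers, not necessarily the chapter's exact code, could look like this:

```python
# Sketch of an 85% / 10% / 5% split that matches the printed lengths
# (935 train / 110 test / 55 validation out of 1100 entries);
# the chapter's exact code may differ.
train_portion = int(len(data) * 0.85)   # 935 entries
test_portion = int(len(data) * 0.10)    # 110 entries

train_data = data[:train_portion]
test_data = data[train_portion:train_portion + test_portion]
val_data = data[train_portion + test_portion:]   # remaining 55 entries

print("Training set length:", len(train_data))
print("Validation set length:", len(val_data))
print("Test set length:", len(test_data))
```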
{
"cell_type": "code",
"execution_count": 8,
"id": "aFZVopbIlNfx",
"metadata": {
"id": "aFZVopbIlNfx"
@@ -208,7 +376,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 9,
"id": "-zf6oht6bIUQ",
"metadata": {
"colab": {
@@ -222,68 +390,16 @@
"name": "stdout",
"output_type": "stream",
"text": [
"935\n",
"55\n",
"110\n"
"Training set length: 935\n",
"Validation set length: 55\n",
"Test set length: 110\n"
]
}
],
"source": [
"print(len(train_data))\n",
"print(len(val_data))\n",
"print(len(test_data))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "Jhk37nnJnkBh",
"metadata": {
"id": "Jhk37nnJnkBh"
},
"outputs": [],
"source": [
"def format_input(entry):\n",
"    instruction_text = (\n",
"        f\"Below is an instruction that describes a task. \"\n",
"        f\"Write a response that appropriately completes the request.\"\n",
"        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
"    )\n",
"\n",
"    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
"    instruction_text + input_text\n",
"\n",
"    return instruction_text + input_text"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "F9UQRfjzo4Js",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "F9UQRfjzo4Js",
"outputId": "b56e6c03-f603-4e9d-c1b6-b4a70403caf9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"\n",
"### Instruction:\n",
"Evaluate the following phrase by transforming it into the spelling given.\n",
"\n",
"### Input:\n",
"freind --> friend\n"
]
}
],
"source": [
"print(format_input(train_data[0]))"
"print(\"Training set length:\", len(train_data))\n",
"print(\"Validation set length:\", len(val_data))\n",
"print(\"Test set length:\", len(test_data))"
]
},
{
@@ -1155,7 +1271,7 @@
"id": "87b79a47-13f9-4d1f-87b1-3339bafaf2a3",
"metadata": {},
"source": [
"## 7.6 Saving the results"
"## 7.6 Extracting and saving responses"
]
},
{
@@ -1659,7 +1775,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
"version": "3.10.12"
}
},
"nbformat": 4,