{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "136a4efe-fb99-4311-8679-e0a5b6282755",
   "metadata": {},
   "source": [
    "<table style=\"width:100%\">\n",
    "<tr>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<font size=\"2\">\n",
    "Supplementary code for the <a href=\"http://mng.bz/orYv\">Build a Large Language Model From Scratch</a> book by <a href=\"https://sebastianraschka.com\">Sebastian Raschka</a><br>\n",
    "<br>Code repository: <a href=\"https://github.com/rasbt/LLMs-from-scratch\">https://github.com/rasbt/LLMs-from-scratch</a>\n",
    "</font>\n",
    "</td>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<a href=\"http://mng.bz/orYv\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/cover-small.webp\" width=\"100px\"></a>\n",
    "</td>\n",
    "</tr>\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b1910a06-e8a3-40ac-8201-ff70615b1ba4",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Generating A Preference Dataset With Llama 3.1 70B And Ollama"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a128651b-f326-4232-a994-42f38b7ed520",
   "metadata": {},
   "source": [
    "- Preference finetuning is a process to align an instruction-finetuned LLM with human preferences\n",
    "- There are multiple ways to create a dataset for preference finetuning an LLM:\n",
    "  1. We use the instruction-finetuned LLM to generate multiple responses and have humans rank them based on their preference and/or given preference criteria\n",
    "  2. We use the instruction-finetuned LLM to generate multiple responses and have LLMs rank them based on given preference criteria\n",
    "  3. We use an LLM to generate preferred and dispreferred responses given certain preference criteria\n",
    "- In this notebook, we consider approach 3\n",
    "- This notebook uses a 70-billion-parameter Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
    "- The expected format of the instruction dataset is as follows:\n",
    "\n",
    "### Input\n",
    "\n",
    "```json\n",
    "[\n",
    "    {\n",
    "        \"instruction\": \"What is the state capital of California?\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"The state capital of California is Sacramento.\"\n",
    "    },\n",
    "    {\n",
    "        \"instruction\": \"Provide a synonym for 'fast'.\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"A synonym for 'fast' is 'quick'.\"\n",
    "    },\n",
    "    {\n",
    "        \"instruction\": \"What is the capital of Greece?\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"The capital of Greece is Athens.\"\n",
    "    },\n",
    "...\n",
    "]\n",
    "```\n",
    "\n",
    "### Output\n",
    "\n",
    "The output dataset will look as follows, where more polite responses are preferred (`'chosen'`), and more impolite responses are dispreferred (`'rejected'`):\n",
    "\n",
    "```json\n",
    "[\n",
    "    {\n",
    "        \"instruction\": \"What is the state capital of California?\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"The state capital of California is Sacramento.\",\n",
    "        \"rejected\": \"Look, the state capital of California is obviously Sacramento.\",\n",
    "        \"chosen\": \"The state capital of California is Sacramento.\"\n",
    "    },\n",
    "    {\n",
    "        \"instruction\": \"Provide a synonym for 'fast'.\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"A synonym for 'fast' is 'quick'.\",\n",
    "        \"chosen\": \"A suitable alternative to 'fast' would be 'quick'.\",\n",
    "        \"rejected\": \"A synonym for 'fast' is 'quick'.\"\n",
    "    },\n",
    "    {\n",
    "        \"instruction\": \"What is the capital of Greece?\",\n",
    "        \"input\": \"\",\n",
    "        \"output\": \"The capital of Greece is Athens.\",\n",
    "        \"chosen\": \"I'd be happy to help! The capital of Greece is indeed Athens.\",\n",
    "        \"rejected\": \"The capital of Greece is Athens.\"\n",
    "    },\n",
    "...\n",
    "]\n",
    "```\n",
    "\n",
    "- The code doesn't require a GPU and runs on a laptop given enough RAM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "63610acc-db94-437f-8d38-e99dca0299cb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tqdm version: 4.66.4\n"
     ]
    }
   ],
   "source": [
    "from importlib.metadata import version\n",
    "\n",
    "pkgs = [\"tqdm\",    # Progress bar\n",
    "        ]\n",
    "\n",
    "for p in pkgs:\n",
    "    print(f\"{p} version: {version(p)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8bcdcb34-ac75-4f4f-9505-3ce0666c42d5",
   "metadata": {},
   "source": [
    "## Installing Ollama and Downloading Llama 3.1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a092280-5462-4709-a3fe-8669a4a8a0a6",
   "metadata": {},
   "source": [
    "- Ollama is an application to run LLMs efficiently\n",
    "- It is a wrapper around [llama.cpp](https://github.com/ggerganov/llama.cpp), which implements LLMs in pure C/C++ to maximize efficiency\n",
    "- Note that it is a tool for using LLMs to generate text (inference), not for training or finetuning LLMs\n",
    "- Prior to running the code below, install ollama by visiting [https://ollama.com](https://ollama.com) and following the instructions (for instance, clicking on the \"Download\" button and downloading the ollama application for your operating system)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9558a522-650d-401a-84fc-9fd7b1f39da7",
   "metadata": {},
   "source": [
    "- For macOS and Windows users, click on the ollama application you downloaded; if it prompts you to install the command line usage, say \"yes\"\n",
    "- Linux users can use the installation command provided on the ollama website\n",
    "\n",
    "- In general, before we can use ollama from the command line, we have to either start the ollama application or run `ollama serve` in a separate terminal\n",
    "\n",
    "<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/bonus/ollama-eval/ollama-serve.webp?1\">\n",
    "\n",
    "- With the ollama application or `ollama serve` running, in a different terminal, execute the following command on the command line to try out the 70-billion-parameter Llama 3.1 model:\n",
    "\n",
    "```bash\n",
    "# 70B model\n",
    "ollama run llama3.1:70b\n",
    "```\n",
    "\n",
    "- The output looks as follows:\n",
    "\n",
    "```\n",
    "$ ollama run llama3.1:70b\n",
    "pulling manifest\n",
    "pulling aa81b541aae6... 100% ▕████████████████▏ 39 GB\n",
    "pulling 8cf247399e57... 100% ▕████████████████▏ 1.7 KB\n",
    "pulling f1cd752815fc... 100% ▕████████████████▏ 12 KB\n",
    "pulling 56bb8bd477a5... 100% ▕████████████████▏ 96 B\n",
    "pulling 3c1c2d3df5b3... 100% ▕████████████████▏ 486 B\n",
    "verifying sha256 digest\n",
    "writing manifest\n",
    "removing any unused layers\n",
    "success\n",
    "```\n",
    "\n",
    "- Note that `llama3.1:70b` refers to the instruction-finetuned 70-billion-parameter Llama 3.1 model\n",
    "\n",
    "- Alternatively, you can also use the smaller, more resource-efficient 8-billion-parameter Llama 3.1 model by replacing `llama3.1:70b` with `llama3.1`\n",
    "\n",
    "- After the download has been completed, you will see a command line prompt that allows you to chat with the model\n",
    "\n",
    "- Try a prompt like \"What do llamas eat?\", which should return an output similar to the following:\n",
    "\n",
    "```\n",
    ">>> What do llamas eat?\n",
    "Llamas are ruminant animals, which means they have a four-chambered \n",
    "stomach and eat plants that are high in fiber. In the wild, llamas \n",
    "typically feed on:\n",
    "1. Grasses: They love to graze on various types of grasses, including tall \n",
    "grasses, wheat, oats, and barley.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0b5addcb-fc7d-455d-bee9-6cc7a0d684c7",
   "metadata": {},
   "source": [
    "- You can end this session using the input `/bye`"
   ]
  },
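  {
   "cell_type": "markdown",
   "id": "3c6e1a2b-7e4d-4b1a-9f3e-2a1b0c4d5e6f",
   "metadata": {},
   "source": [
    "- Optionally, we can also verify from within Python that an ollama session is running before we send any requests\n",
    "- The following is a minimal sketch of such a check; it assumes that the third-party `psutil` package is installed and that the running ollama process name contains the string \"ollama\":"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d9e0f1a-2b3c-4d5e-6f70-8192a3b4c5d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Minimal sketch: check whether an ollama process is running\n",
    "# (assumes the third-party `psutil` package is installed)\n",
    "import psutil\n",
    "\n",
    "\n",
    "def check_if_running(process_name):\n",
    "    # Scan the process table for a process whose name contains `process_name`\n",
    "    for proc in psutil.process_iter([\"name\"]):\n",
    "        if proc.info[\"name\"] and process_name in proc.info[\"name\"]:\n",
    "            return True\n",
    "    return False\n",
    "\n",
    "\n",
    "print(\"Ollama running:\", check_if_running(\"ollama\"))"
   ]
  },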
  {
   "cell_type": "markdown",
   "id": "dda155ee-cf36-44d3-b634-20ba8e1ca38a",
   "metadata": {},
   "source": [
    "## Using Ollama's REST API"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89343a84-0ddc-42fc-bf50-298a342b93c0",
   "metadata": {},
   "source": [
    "- Now, an alternative way to interact with the model is through its REST API in Python, using the following function\n",
    "- Before you run the next cells in this notebook, make sure that ollama is still running, as described above, via\n",
    "  - `ollama serve` in a terminal\n",
    "  - the ollama application\n",
    "- Next, run the following code cell to query the model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16642a48-1cab-40d2-af08-ab8c2fbf5876",
   "metadata": {},
   "source": [
    "- First, let's try the API with a simple example to make sure it works as intended:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "65b0ba76-1fb1-4306-a7c2-8f3bb637ccdb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Llamas are herbivores, which means they primarily eat plants and plant-based foods. Their diet consists of:\n",
      "\n",
      "1. **Grasses**: Various types of grasses, including timothy grass, orchard grass, and brome grass.\n",
      "2. **Hay**: High-quality hay, such as alfalfa or clover hay, is a staple in a llama's diet.\n",
      "3. **Leaves**: Leaves from trees and shrubs, like willow, cottonwood, and mesquite, are also eaten.\n",
      "4. **Fruits and vegetables**: Llamas enjoy fruits like apples, carrots, and sweet potatoes, as well as leafy greens like kale and spinach.\n",
      "5. **Grains**: In moderation, llamas can eat grains like oats, barley, and corn.\n",
      "\n",
      "It's essential to note that llamas have a unique digestive system, with a three-part stomach and a large cecum (a specialized part of the large intestine). This allows them to break down and extract nutrients from plant material more efficiently than many other animals.\n",
      "\n",
      "A typical llama diet might consist of:\n",
      "\n",
      "* 1-2% of their body weight in hay per day\n",
      "* 0.5-1% of their body weight in grains per day (if fed)\n",
      "* Free-choice access to fresh water\n",
      "* Limited amounts of fruits and vegetables as treats\n",
      "\n",
      "It's also important to ensure that llamas have access to a mineral supplement, such as a salt lick or loose minerals, to help maintain optimal health.\n",
      "\n",
      "Remember, every llama is different, and their dietary needs may vary depending on factors like age, size, and activity level. Consult with a veterinarian or experienced llama breeder for specific guidance on feeding your llama.\n"
     ]
    }
   ],
   "source": [
    "import urllib.request\n",
    "import json\n",
    "\n",
    "\n",
    "def query_model(prompt, model=\"llama3.1:70b\", url=\"http://localhost:11434/api/chat\"):\n",
    "    # Create the data payload as a dictionary\n",
    "    data = {\n",
    "        \"model\": model,\n",
    "        \"messages\": [\n",
    "            {\n",
    "                \"role\": \"user\",\n",
    "                \"content\": prompt\n",
    "            }\n",
    "        ],\n",
    "        \"options\": {\n",
    "            \"seed\": 123,\n",
    "            \"temperature\": 0,\n",
    "        }\n",
    "    }\n",
    "\n",
    "    # Convert the dictionary to a JSON formatted string and encode it to bytes\n",
    "    payload = json.dumps(data).encode(\"utf-8\")\n",
    "\n",
    "    # Create a request object, setting the method to POST and adding necessary headers\n",
    "    request = urllib.request.Request(url, data=payload, method=\"POST\")\n",
    "    request.add_header(\"Content-Type\", \"application/json\")\n",
    "\n",
    "    # Send the request and capture the response\n",
    "    response_data = \"\"\n",
    "    with urllib.request.urlopen(request) as response:\n",
    "        # Read and decode the streamed response line by line\n",
    "        while True:\n",
    "            line = response.readline().decode(\"utf-8\")\n",
    "            if not line:\n",
    "                break\n",
    "            response_json = json.loads(line)\n",
    "            response_data += response_json[\"message\"][\"content\"]\n",
    "\n",
    "    return response_data\n",
    "\n",
    "\n",
    "result = query_model(\"What do Llamas eat?\")\n",
    "print(result)"
   ]
  },
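  {
   "cell_type": "markdown",
   "id": "1f2e3d4c-5b6a-4978-8c9d-0e1f2a3b4c5d",
   "metadata": {},
   "source": [
    "- As a side note, `query_model` above reads ollama's reply as a line-by-line stream of JSON objects\n",
    "- The Ollama REST API can also return the whole reply as a single JSON object if we set `\"stream\": false` in the payload; the variant below is a minimal sketch of this, assuming the same local endpoint and options as above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a8b7c6d-5e4f-4a3b-8c2d-1e0f9a8b7c6d",
   "metadata": {},
   "outputs": [],
   "source": [
    "def query_model_nonstreaming(prompt, model=\"llama3.1:70b\", url=\"http://localhost:11434/api/chat\"):\n",
    "    # Same request as in `query_model`, but ask ollama for a single,\n",
    "    # non-streamed reply by setting \"stream\" to False\n",
    "    data = {\n",
    "        \"model\": model,\n",
    "        \"messages\": [{\"role\": \"user\", \"content\": prompt}],\n",
    "        \"options\": {\"seed\": 123, \"temperature\": 0},\n",
    "        \"stream\": False\n",
    "    }\n",
    "\n",
    "    payload = json.dumps(data).encode(\"utf-8\")\n",
    "    request = urllib.request.Request(url, data=payload, method=\"POST\")\n",
    "    request.add_header(\"Content-Type\", \"application/json\")\n",
    "\n",
    "    # The entire reply arrives as one JSON object\n",
    "    with urllib.request.urlopen(request) as response:\n",
    "        response_json = json.loads(response.read().decode(\"utf-8\"))\n",
    "\n",
    "    return response_json[\"message\"][\"content\"]"
   ]
  },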
  {
   "cell_type": "markdown",
   "id": "162a4739-6f03-4092-a5c2-f57a0b6a4c4d",
   "metadata": {},
   "source": [
    "## Load JSON Entries"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca011a8b-20c5-4101-979e-9b5fccf62f8a",
   "metadata": {},
   "source": [
    "- Now, let's get to the data generation part\n",
    "- Here, for a hands-on example, we use the `instruction-data.json` file that we originally used to instruction-finetune the model in chapter 7:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "8b2d393a-aa92-4190-9d44-44326a6f699b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of entries: 1100\n"
     ]
    }
   ],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "json_file = Path(\"..\", \"01_main-chapter-code\", \"instruction-data.json\")\n",
    "\n",
    "with open(json_file, \"r\") as file:\n",
    "    json_data = json.load(file)\n",
    "\n",
    "print(\"Number of entries:\", len(json_data))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6c9751b-59b7-43fe-acc7-14e8daf2fa66",
   "metadata": {},
   "source": [
    "- The structure of this file is as follows, where `'output'` is the given response in the dataset that we trained the model to generate via instruction finetuning, based on the `'instruction'` and `'input'`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7222fdc0-5684-4f2b-b741-3e341851359e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'instruction': 'Evaluate the following phrase by transforming it into the spelling given.',\n",
       " 'input': 'freind --> friend',\n",
       " 'output': 'The spelling of the given phrase \"freind\" is incorrect, the correct spelling is \"friend\".'}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "json_data[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fcf0331b-6024-4bba-89a9-a088b14a1046",
   "metadata": {},
   "source": [
    "- Below is a small utility function that formats the instruction and input:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "43263cd3-e5fb-4ab5-871e-3ad6e7d21a8c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def format_input(entry):\n",
    "    # Alpaca-style prompt formatting, as used in chapter 7\n",
    "    instruction_text = (\n",
    "        f\"Below is an instruction that describes a task. Write a response that \"\n",
    "        f\"appropriately completes the request.\"\n",
    "        f\"\\n\\n### Instruction:\\n{entry['instruction']}\"\n",
    "    )\n",
    "\n",
    "    input_text = f\"\\n\\n### Input:\\n{entry['input']}\" if entry[\"input\"] else \"\"\n",
    "\n",
    "    return instruction_text + input_text"
   ]
  },
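  {
   "cell_type": "markdown",
   "id": "2b3c4d5e-6f70-4182-93a4-b5c6d7e8f901",
   "metadata": {},
   "source": [
    "- For example, applied to the first dataset entry shown earlier, `format_input` produces the Alpaca-style prompt the model was finetuned on:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e8f9a0b-1c2d-4e3f-8a5b-6c7d8e9f0a1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Apply the formatting utility to the first dataset entry\n",
    "print(format_input(json_data[0]))"
   ]
  },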
  {
   "cell_type": "markdown",
   "id": "39a55283-7d51-4136-ba60-f799d49f4098",
   "metadata": {},
   "source": [
    "- Now, let's try the ollama API to generate a `'chosen'` and `'rejected'` response for preference tuning a model\n",
    "- Here, for illustration purposes, we create answers that are more or less polite"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "735cc089-d127-480a-b39d-0782581f0c41",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Dataset response:\n",
      ">> The spelling of the given phrase \"freind\" is incorrect, the correct spelling is \"friend\".\n",
      "\n",
      "impolite response:\n",
      ">> The spelling of the given phrase \"freind\" is flat out wrong, get it together, the correct spelling is \"friend\".\n",
      "\n",
      "Dataset response:\n",
      ">> He goes to the park every day.\n",
      "\n",
      "polite response:\n",
      ">> He goes to the park daily, if I'm not mistaken.\n",
      "\n",
      "Dataset response:\n",
      ">> 45 kilometers is 45000 meters.\n",
      "\n",
      "polite response:\n",
      ">> 45 kilometers is equivalent to 45000 meters.\n",
      "\n",
      "Dataset response:\n",
      ">> Although it was raining, they went for a walk.\n",
      "\n",
      "polite response:\n",
      ">> Although it was raining outside, they still decided to go for a walk.\n",
      "\n",
      "Dataset response:\n",
      ">> 1, 4, 9, 16, 25, 36, 49, 64, 81, 100.\n",
      "\n",
      "impolite response:\n",
      ">> Here are your precious square numbers: 1, 4, 9, 16, 25, 36, 49, 64, 81, 100.\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "\n",
    "for entry in json_data[:5]:\n",
    "\n",
    "    politeness = random.choice([\"polite\", \"impolite\"])\n",
    "    prompt = (\n",
    "        f\"Given the input `{format_input(entry)}` \"\n",
    "        f\"and correct output `{entry['output']}`, \"\n",
    "        f\"slightly rewrite the output to be more {politeness}. \"\n",
    "        \"Keep the modification minimal. \"\n",
    "        \"Only return the generated response and nothing else.\"\n",
    "    )\n",
    "    print(\"\\nDataset response:\")\n",
    "    print(\">>\", entry['output'])\n",
    "    print(f\"\\n{politeness} response:\")\n",
    "    print(\">>\", query_model(prompt))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "142dfaa7-429f-4eb0-b74d-ff327f79547a",
   "metadata": {},
   "source": [
    "- If we find that the generated responses above look reasonable, we can go to the next step and apply the prompt to the whole dataset\n",
    "- Here, we add a `'chosen'` key for the preferred response and a `'rejected'` key for the dispreferred response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3349dbbc-963f-4af3-9790-12dbfdca63c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "from tqdm import tqdm\n",
    "\n",
    "\n",
    "def generate_model_responses(json_data):\n",
    "\n",
    "    for i, entry in enumerate(tqdm(json_data, desc=\"Writing entries\")):\n",
    "        politeness = random.choice([\"polite\", \"impolite\"])\n",
    "        prompt = (\n",
    "            f\"Given the input `{format_input(entry)}` \"\n",
    "            f\"and correct output `{entry['output']}`, \"\n",
    "            f\"slightly rewrite the output to be more {politeness}. \"\n",
    "            \"Keep the modification minimal. \"\n",
    "            \"Only return the generated response and nothing else.\"\n",
    "        )\n",
    "        response = query_model(prompt)\n",
    "\n",
    "        # The rewritten response is preferred if it is the polite one;\n",
    "        # otherwise, the original dataset response is preferred\n",
    "        if politeness == \"polite\":\n",
    "            json_data[i][\"chosen\"] = response\n",
    "            json_data[i][\"rejected\"] = entry[\"output\"]\n",
    "        else:\n",
    "            json_data[i][\"rejected\"] = response\n",
    "            json_data[i][\"chosen\"] = entry[\"output\"]"
   ]
  },
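  {
   "cell_type": "markdown",
   "id": "5d6e7f80-91a2-4b3c-8d4e-5f60718293a4",
   "metadata": {},
   "source": [
    "- Optionally, we can seed Python's pseudo-random number generator so that the polite/impolite assignment is reproducible (note that the model responses themselves may still vary slightly across systems, as discussed below):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a1b2c3d-4e5f-4607-8192-a3b4c5d6e7f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: make the polite/impolite coin flips reproducible\n",
    "random.seed(123)"
   ]
  },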
  {
   "cell_type": "markdown",
   "id": "b071ce84-1866-427f-a272-b46700f364b2",
   "metadata": {},
   "source": [
    "- Let's now apply this response generation to the whole dataset (this takes about 17 minutes with the 70B model on an M3 MacBook Air laptop, as the progress bar below shows)\n",
    "- Note that ollama is not fully deterministic across operating systems (as of this writing), so the responses you are getting might slightly differ from the ones shown below"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "4f700d4b-19e5-4404-afa7-b0f093024232",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Writing entries: 100%|██████████| 1100/1100 [17:20<00:00,  1.06it/s]\n"
     ]
    }
   ],
   "source": [
    "generate_model_responses(json_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "838d9747-0f7d-46fe-aab5-9ee6b765d021",
   "metadata": {},
   "outputs": [],
   "source": [
    "with open(\"instruction-data-with-preference.json\", \"w\") as file:\n",
    "    json.dump(json_data, file, indent=4)"
   ]
  },
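  {
   "cell_type": "markdown",
   "id": "6f708192-a3b4-4c5d-8e6f-708192a3b4c5",
   "metadata": {},
   "source": [
    "- As a quick sanity check, we can reload the saved file and inspect the first entry (the exact `'chosen'` and `'rejected'` texts will depend on the model run):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c5d6e7f-8091-42a3-b4c5-d6e7f8091a2b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reload the saved preference dataset and inspect the first entry\n",
    "with open(\"instruction-data-with-preference.json\", \"r\") as file:\n",
    "    preference_data = json.load(file)\n",
    "\n",
    "print(json.dumps(preference_data[0], indent=4))"
   ]
  }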
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}