diff --git a/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb b/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb
index a8721b6..4b7303b 100644
--- a/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb
+++ b/ch07/04_preference-tuning-with-dpo/create-preference-data-ollama.ipynb
@@ -41,13 +41,13 @@
     " 2. We use the instruction-finetuned LLM to generate multiple responses and have LLMs rank them based on given preference criteria\n",
     " 3. We use an LLM to generate preferred and dispreferred responses given certain preference criteria\n",
     "- In this notebook, we consider approach 3\n",
-    "- This notebook uses a 70 billion parameter Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
+    "- This notebook uses a 70-billion-parameter Llama 3.1-Instruct model through Ollama to generate preference labels for an instruction dataset\n",
     "- The expected format of the instruction dataset is as follows:\n",
     "\n",
     "\n",
     "### Input\n",
     "\n",
-    "```python\n",
+    "```json\n",
     "[\n",
     "    {\n",
     "        \"instruction\": \"What is the state capital of California?\",\n",
@@ -71,7 +71,7 @@
     "\n",
     "The output dataset will look as follows, where more polite responses are preferred (`'chosen'`), and more impolite responses are dispreferred (`'rejected'`):\n",
     "\n",
-    "```python\n",
+    "```json\n",
     "[\n",
     "    {\n",
     "        \"instruction\": \"What is the state capital of California?\",\n",
@@ -98,7 +98,7 @@
     "]\n",
     "```\n",
     "\n",
-    "### Ouput\n",
+    "### Output\n",
     "\n",
     "\n",
     "\n",
@@ -135,7 +135,7 @@
    "id": "8bcdcb34-ac75-4f4f-9505-3ce0666c42d5",
    "metadata": {},
    "source": [
-    "## Installing Ollama and Downloading Llama 3"
+    "## Installing Ollama and Downloading Llama 3.1"
    ]
   },
   {
@@ -353,7 +353,7 @@
    "source": [
     "from pathlib import Path\n",
     "\n",
-    "json_file = Path(\"..\") / \"01_main-chapter-code\" / \"instruction-data.json\"\n",
+    "json_file = Path(\"..\", \"01_main-chapter-code\", \"instruction-data.json\")\n",
     "\n",
     "with open(json_file, \"r\") as file:\n",
     "    json_data = json.load(file)\n",
@@ -498,7 +498,7 @@
    "metadata": {},
    "source": [
     "- If we find that the generated responses above look reasonable, we can go to the next step and apply the prompt to the whole dataset\n",
-    "- Here, we add a `'chosen`' key for the preferred response and a `'rejected'` response for the dispreferred response"
+    "- Here, we add a `'chosen'` key for the preferred response and a `'rejected'` key for the dispreferred response"
    ]
   },
   {
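The `Path` change in the last code hunk swaps the repeated `/` operator for `pathlib`'s multi-argument constructor. As a quick standalone sanity check (not part of the notebook itself), the two spellings build the same path object:

```python
from pathlib import Path

# Operator form: each "/" joins one more component onto the path
via_operator = Path("..") / "01_main-chapter-code" / "instruction-data.json"

# Constructor form: all components passed at once, same result
via_args = Path("..", "01_main-chapter-code", "instruction-data.json")

assert via_operator == via_args
```

Since the two forms are equivalent, the change is purely stylistic; the notebook's subsequent `open(json_file, "r")` call behaves identically either way.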