fixed typos and formatting (#291)

Daniel Kleine authored 2024-07-28 17:04:33 +02:00, committed by GitHub
parent 75a24c4897
commit 60752e3b3a


@@ -41,13 +41,13 @@
 " 2. We use the instruction-finetuned LLM to generate multiple responses and have LLMs rank them based on given preference criteria\n",
 " 3. We use an LLM to generate preferred and dispreferred responses given certain preference criteria\n",
 "- In this notebook, we consider approach 3\n",
-"- This notebook uses a 70 billion parameter Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
+"- This notebook uses a 70 billion parameters Llama 3.1-Instruct model through ollama to generate preference labels for an instruction dataset\n",
 "- The expected format of the instruction dataset is as follows:\n",
 "\n",
 "\n",
 "### Input\n",
 "\n",
-"```python\n",
+"```json\n",
 "[\n",
 " {\n",
 " \"instruction\": \"What is the state capital of California?\",\n",
@@ -71,7 +71,7 @@
 "\n",
 "The output dataset will look as follows, where more polite responses are preferred (`'chosen'`), and more impolite responses are dispreferred (`'rejected'`):\n",
 "\n",
-"```python\n",
+"```json\n",
 "[\n",
 " {\n",
 " \"instruction\": \"What is the state capital of California?\",\n",
@@ -98,7 +98,7 @@
 "]\n",
 "```\n",
 "\n",
-"### Ouput\n",
+"### Output\n",
 "\n",
 "\n",
 "\n",
@@ -135,7 +135,7 @@
 "id": "8bcdcb34-ac75-4f4f-9505-3ce0666c42d5",
 "metadata": {},
 "source": [
-"## Installing Ollama and Downloading Llama 3"
+"## Installing Ollama and Downloading Llama 3.1"
 ]
 },
 {
@@ -353,7 +353,7 @@
 "source": [
 "from pathlib import Path\n",
 "\n",
-"json_file = Path(\"..\") / \"01_main-chapter-code\" / \"instruction-data.json\"\n",
+"json_file = Path(\"..\", \"01_main-chapter-code\", \"instruction-data.json\")\n",
 "\n",
 "with open(json_file, \"r\") as file:\n",
 " json_data = json.load(file)\n",
@@ -498,7 +498,7 @@
 "metadata": {},
 "source": [
 "- If we find that the generated responses above look reasonable, we can go to the next step and apply the prompt to the whole dataset\n",
-"- Here, we add a `'chosen`' key for the preferred response and a `'rejected'` response for the dispreferred response"
+"- Here, we add a `'chosen'` key for the preferred response and a `'rejected'` response for the dispreferred response"
 ]
 },
 {
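
The corrected sentence in the last hunk describes the labeling step itself. A hedged sketch of how that loop might look, reusing the assumed `query_model` helper and the `json_data` list from the earlier hunks; the prompt wording and the random polite/impolite choice are assumptions, not necessarily the notebook's exact logic.

```python
import random


def generate_preference_response(entry, politeness):
    # Ask the model to rewrite the reference output to be more polite or
    # more impolite; the prompt wording here is an assumption.
    return query_model(
        f"Given the input `{entry['instruction']}` "
        f"and the correct output `{entry['output']}`, "
        f"slightly rewrite the output to be more {politeness}. "
        "Keep the content otherwise unchanged and return only the rewritten response."
    )


for entry in json_data:
    politeness = random.choice(["polite", "impolite"])
    response = generate_preference_response(entry, politeness)
    if politeness == "polite":
        entry["chosen"] = response           # preferred: the more polite rewrite
        entry["rejected"] = entry["output"]  # dispreferred: the original output
    else:
        entry["chosen"] = entry["output"]
        entry["rejected"] = response
```
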