LLMs-from-scratch/ch07/03_model-evaluation/README.md

# Chapter 7: Finetuning to Follow Instructions

This folder contains utility code that can be used for model evaluation.


&nbsp;
## Evaluating Instruction Responses Using the OpenAI API


- The [llm-instruction-eval-openai.ipynb](llm-instruction-eval-openai.ipynb) notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:

```python
{
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
    "model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
},
```

&nbsp;
## Evaluating Instruction Responses Locally Using Ollama

- The [llm-instruction-eval-ollama.ipynb](llm-instruction-eval-ollama.ipynb) notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.
add instruction dataset 2024-06-08 10:38:41 -05:00			`# Chapter 7: Finetuning to Follow Instructions`
Add openai model eval utility code 2024-05-26 10:44:15 -05:00
			`This folder contains utility code that can be used for model evaluation.`



			` `
			`## Evaluating Instruction Responses Using the OpenAI API`

Ollama-based model evaluation 2024-06-05 08:21:28 -05:00
Add openai model eval utility code 2024-05-26 10:44:15 -05:00			`- The [llm-instruction-eval-openai.ipynb](llm-instruction-eval-openai.ipynb) notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:`

			```python
			`{`
			`"instruction": "What is the atomic number of helium?",`
			`"input": "",`
			`"output": "The atomic number of helium is 2.", # <-- The target given in the test set`
			`"model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM`
			`"model 2 response": "\nThe atomic number of helium is 3." # <-- Response by a 2nd LLM`
			`},`
			```
Ollama-based model evaluation 2024-06-05 08:21:28 -05:00
			` `
			`## Evaluating Instruction Responses Locally Using Ollama`

			`- The [llm-instruction-eval-ollama.ipynb](llm-instruction-eval-ollama.ipynb) notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.`