LLMs-from-scratch/ch07/03_model-evaluation
rasbt 134334ce21 Revert "Revert "newline""
This reverts commit 6aa2a587d22105910bd6f07c6c79a5abf83a5eb6.
2024-05-27 07:32:45 -05:00

Chapter 7: Instruction and Preference Finetuning

This folder contains utility code that can be used for model evaluation.

Install the additional package requirements via:

pip install -r requirements-extra.txt

Evaluating Instruction Responses Using the OpenAI API

  • The llm-instruction-eval-openai.ipynb notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction-finetuned models. It expects a JSON file whose entries have the following format (the # comments are annotations for the reader and are not part of the file):
{
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
    "model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
},
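As a rough illustration of how entries in this format could be consumed, here is a minimal sketch that loads a file, checks that each entry has the expected keys, and builds one scoring prompt per model response. It assumes the file holds a JSON list of such entries. The helper names (`format_input`, `build_eval_prompt`, `load_entries`) and the prompt wording are hypothetical and are not taken from the notebook. The call to the OpenAI API is deliberately left out.

```python
import json

# Key names follow the entry format shown above; everything else is illustrative.
RESPONSE_KEYS = ("model 1 response", "model 2 response")


def format_input(entry):
    """Combine the instruction and the optional input into one string."""
    text = entry["instruction"]
    if entry.get("input"):
        text += "\n\n" + entry["input"]
    return text


def build_eval_prompt(entry, response_key):
    """Build a scoring prompt for one model response (wording is an assumption)."""
    return (
        f"Given the input `{format_input(entry)}` "
        f"and the correct output `{entry['output']}`, "
        f"score the model response `{entry[response_key].strip()}` "
        "on a scale from 0 to 100, where 100 is the best score. "
        "Respond with the integer number only."
    )


def load_entries(path):
    """Load a JSON list of entries and verify that the required keys exist."""
    with open(path, "r", encoding="utf-8") as f:
        entries = json.load(f)
    required = {"instruction", "input", "output", *RESPONSE_KEYS}
    for i, entry in enumerate(entries):
        missing = required - entry.keys()
        if missing:
            raise KeyError(f"Entry {i} is missing keys: {sorted(missing)}")
    return entries


# The example entry from above, without the explanatory comments.
example = {
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",
    "model 1 response": "\nThe atomic number of helium is 2.0.",
    "model 2 response": "\nThe atomic number of helium is 3.",
}

for key in RESPONSE_KEYS:
    print(build_eval_prompt(example, key))
```

Each resulting prompt string would then be sent to the evaluating model, and the returned score collected per response key, so that the two models can be compared on the same test entries.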