mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-10-31 18:00:08 +00:00

History

Sebastian Raschka 5016499d1d Uv workflow improvements (#531 )

* Uv workflow improvements

* Uv workflow improvements

* linter improvements

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

2025-02-16 13:16:51 -06:00

scores

add spearman and kendall-tau analysis

2024-07-02 07:55:32 -05:00

config.json

Revert "Revert "newline""

2024-05-27 07:32:45 -05:00

eval-example-data.json

Add openai model eval utility code

2024-05-26 10:44:15 -05:00

llm-instruction-eval-ollama.ipynb

Fix 8-billion-parameter spelling

2024-07-28 10:48:56 -05:00

llm-instruction-eval-openai.ipynb

Uv workflow improvements (#531 )

2025-02-16 13:16:51 -06:00

README.md

add instruction dataset

2024-06-08 10:38:41 -05:00

requirements-extra.txt

Ollama-based model evaluation

2024-06-05 08:21:28 -05:00

README.md

Chapter 7: Finetuning to Follow Instructions

This folder contains utility code that can be used for model evaluation.

Evaluating Instruction Responses Using the OpenAI API

The llm-instruction-eval-openai.ipynb notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:

{
    "instruction": "What is the atomic number of helium?",
    "input": "",
    "output": "The atomic number of helium is 2.",               # <-- The target given in the test set
    "model 1 response": "\nThe atomic number of helium is 2.0.", # <-- Response by an LLM
    "model 2 response": "\nThe atomic number of helium is 3."    # <-- Response by a 2nd LLM
},

Evaluating Instruction Responses Locally Using Ollama

The llm-instruction-eval-ollama.ipynb notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.