Chapter 7: Finetuning to Follow Instructions
Main Chapter Code
- 01_main-chapter-code contains the main chapter code and exercise solutions
Bonus Materials
- 02_dataset-utilities contains utility code that can be used for preparing an instruction dataset
- 03_model-evaluation contains utility code for evaluating instruction responses using a local Llama 3 model and the GPT-4 API
- 04_preference-tuning-with-dpo implements code for preference finetuning with Direct Preference Optimization (DPO); a sketch of the DPO loss follows this list
- 05_dataset-generation contains code to generate synthetic datasets for instruction finetuning
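To give a sense of what the DPO bonus material covers: the core of DPO is a single loss that trains the policy model to prefer the chosen response over the rejected one relative to a frozen reference model. Below is a minimal PyTorch sketch of that loss; the function name `dpo_loss`, its argument names, and the `beta=0.1` default are illustrative assumptions, and the full from-scratch walkthrough lives in the 04_preference-tuning-with-dpo notebook.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             reference_chosen_logps, reference_rejected_logps,
             beta=0.1):
    """DPO loss from summed per-sequence log-probabilities.

    Each argument is a tensor of shape (batch,): the log-probability that
    the policy (trainable) or reference (frozen) model assigns to the
    chosen (preferred) or rejected response.
    """
    # Implicit reward: log-ratio of policy vs. frozen reference model
    chosen_logratios = policy_chosen_logps - reference_chosen_logps
    rejected_logratios = policy_rejected_logps - reference_rejected_logps
    # Maximize the margin between preferred and rejected responses
    logits = beta * (chosen_logratios - rejected_logratios)
    return -F.logsigmoid(logits).mean()

# Toy example (random tensors as stand-ins for real log-probabilities)
batch = [torch.randn(4) for _ in range(4)]
print(dpo_loss(*batch))  # scalar loss
```

The appeal of DPO, which the notebook explores in detail, is that this loss optimizes for human preferences directly from (chosen, rejected) response pairs, without training a separate reward model.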