yujunjun/LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-09-01 12:27:59 +00:00

Sebastian Raschka 09dc080cf3 Direct Preference Optimization from scratch (#294)

2024-08-04 08:57:36 -05:00

Chapter 7: Finetuning to Follow Instructions

create-preference-data-ollama.ipynb: A notebook that creates a synthetic dataset for preference finetuning using Llama 3.1 and Ollama
dpo-from-scratch.ipynb: A notebook that implements Direct Preference Optimization (DPO) for LLM alignment
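
For orientation, the DPO objective that the second notebook targets can be sketched for a single preference pair as follows. This is a minimal illustration in plain Python, not code taken from the notebook: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are assumptions chosen for this example. It assumes the summed log-probabilities of the chosen and rejected responses under the policy and reference models have already been computed.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of response log-probabilities.

    All names here are hypothetical; beta=0.1 is an assumed default.
    """
    # How much more the policy prefers the chosen response than the rejected one
    policy_margin = policy_chosen_logp - policy_rejected_logp
    # The same preference margin under the frozen reference model
    ref_margin = ref_chosen_logp - ref_rejected_logp
    # Scaled difference of margins; positive when the policy prefers the
    # chosen response more strongly than the reference model does
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(logits)) written as a numerically stable softplus(-logits)
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: the loss equals log(2)
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy favors the chosen response more than the reference does: lower loss
improved = dpo_loss(-4.0, -9.0, -5.0, -7.0)
```

The loss shrinks as the policy widens the gap between chosen and rejected responses relative to the reference model, and `beta` controls how strongly that gap is weighted.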