# Chapter 7: Finetuning to Follow Instructions

- [create-preference-data-ollama.ipynb](create-preference-data-ollama.ipynb): A notebook that creates a synthetic dataset for preference finetuning using Llama 3.1 and Ollama

- In progress ...


In the meantime, also see:

- LLM Training: RLHF and Its Alternatives, [https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives](https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives)
- Tips for LLM Pretraining and Evaluating Reward Models, [https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html](https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html)