From f78ad1f95b9610833e4bb5e4bd87ac5ace123ad5 Mon Sep 17 00:00:00 2001
From: Sebastian Raschka
Date: Sun, 23 Jun 2024 08:25:01 -0500
Subject: [PATCH] Update README.md

---
 ch07/04_preference-tuning-with-dpo/README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ch07/04_preference-tuning-with-dpo/README.md b/ch07/04_preference-tuning-with-dpo/README.md
index 330a658..bbbcc5e 100644
--- a/ch07/04_preference-tuning-with-dpo/README.md
+++ b/ch07/04_preference-tuning-with-dpo/README.md
@@ -1,3 +1,8 @@
 # Chapter 7: Finetuning to Follow Instructions
 
-In progress ...
\ No newline at end of file
+In progress ...
+
+In the meantime, see
+
+- LLM Training: RLHF and Its Alternatives, [https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives](https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives)
+- Tips for LLM Pretraining and Evaluating Reward Models, [https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html](https://sebastianraschka.com/blog/2024/research-papers-in-march-2024.html)