Sebastian Raschka
|
7114ccd10d
|
Add PyPI package (#576)
* Add PyPI package
* fixes
* fixes
|
2025-03-23 19:28:49 -05:00 |
|
rasbt
|
bed5f89378
|
fix reward margins plot label in dpo nb
|
2025-01-12 14:04:05 -06:00 |
|
Sebastian Raschka
|
992f3068d1
|
Auto download DPO dataset if not already available in path (#479)
* Auto download DPO dataset if not already available in path
* update tests to account for latest HF transformers release in unit tests
* pep 8
|
2025-01-12 12:27:28 -06:00 |
|
Sebastian Raschka
|
05f2a398b8
|
adds no-grad context for reference model to DPO (#473)
|
2025-01-07 20:49:01 -06:00 |
|
QS
|
976c92010c
|
typo fixed (#468)
* typo fixed
* only update plot
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
|
2025-01-05 09:17:13 -06:00 |
|
Jinge Wang
|
0dbc203f66
|
Fix 2 typos in 04_preferene-tuning-with-dpo (#356)
|
2024-09-15 07:36:22 -05:00 |
|
rasbt
|
7a5771932b
|
note about logistic sigmoid
|
2024-08-06 19:48:30 -05:00 |
|
rasbt
|
2245f8d9c1
|
extend equation description
|
2024-08-06 19:46:50 -05:00 |
|
rasbt
|
a65e06ff99
|
add more explanations
|
2024-08-06 19:45:11 -05:00 |
|
rasbt
|
089901db26
|
small figure update
|
2024-08-05 17:57:16 -05:00 |
|
Daniel Kleine
|
dcdf04e3bd
|
minor DPO fixes (#298)
* fixed issues, updated .gitignore
* added closing paren
* fixed CEL spelling
* fixed more minor issues
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
* Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
|
2024-08-05 08:40:46 -05:00 |
|
rasbt
|
6030071e3f
|
update model path
|
2024-08-05 07:36:08 -05:00 |
|
rasbt
|
f302f5e8d5
|
improve latex rendering in dpo notebook
|
2024-08-04 09:19:59 -05:00 |
|
Sebastian Raschka
|
09dc080cf3
|
Direct Preference Optimization from scratch (#294)
|
2024-08-04 08:57:36 -05:00 |
|