6 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Jake Poznanski | 8ef7f8085a | isort and black | 2025-09-30 17:37:10 +00:00 |
| Jake Poznanski | d70208d98a | Moving test code around, adding format reward since some runs stop outputting the front matter thing in grpo training | 2025-08-27 18:22:05 +00:00 |
| Jake Poznanski | 8383865392 | Fixing up subscripts and superscripts in synth data | 2025-08-27 18:15:36 +00:00 |
| Jake Poznanski | d36357f3db | Some fixes to validating math which was not working otherwise | 2025-08-22 20:40:14 +00:00 |
| Jake Poznanski | dcc932dc2c | Markdown cleanup | 2025-08-22 17:21:13 +00:00 |
| Jake Poznanski | d2bec31595 | Markdown front matter corrector | 2025-08-22 16:43:36 +00:00 |