1713 Commits

Author SHA1 Message Date
Jake Poznanski
d9b6978499 Some scripts 2025-08-25 18:44:18 +00:00
Jake Poznanski
2415cdcfda Adding support for a blank page output check 2025-08-25 17:20:47 +00:00
Jake Poznanski
33ecc2316c Starting miner for blank pages, we want to make sure we don't hallucinate 2025-08-25 16:54:23 +00:00
Jake Poznanski
55b7101d7e Add some new rotation tests to a branch of the bench 2025-08-25 16:25:00 +00:00
Jake Poznanski
59321af018 Merge pull request #319 from haydn-jones/main: External vLLM instance 2025-08-25 08:43:25 -07:00
Haydn Jones
2c63836648 Black and mock 2025-08-23 20:07:05 -04:00
Jake Poznanski
5c6225b227 Fix for some math equations stuff 2025-08-22 20:52:41 +00:00
Jake Poznanski
d36357f3db Some fixes to validating math which was not working otherwise 2025-08-22 20:40:14 +00:00
Jake Poznanski
f3ea1527ef Local path supported 2025-08-22 20:11:35 +00:00
Jake Poznanski
5cbe331259 Async mode 2025-08-22 19:51:51 +00:00
Jake Poznanski
0df56e958e Even more test cleanup 2025-08-22 18:56:56 +00:00
Jake Poznanski
9831e65161 Fixing some logs, verifying 100% self test accuracy 2025-08-22 18:39:39 +00:00
Jake Poznanski
c1c83fd86c Better quality synth data from both sides 2025-08-22 18:37:59 +00:00
Jake Poznanski
d9789947d5 Refactoring of loading tests 2025-08-22 18:31:37 +00:00
Jake Poznanski
afac12b839 Cleaning up test cases 2025-08-22 18:24:39 +00:00
Jake Poznanski
7ea692198b Spaces for newlines in markdown is better 2025-08-22 17:27:47 +00:00
Jake Poznanski
dcc932dc2c Markdown cleanup 2025-08-22 17:21:13 +00:00
Jake Poznanski
aed755de38 Image alt tags 2025-08-22 16:52:52 +00:00
Jake Poznanski
d2bec31595 Markdown front matter corrector 2025-08-22 16:43:36 +00:00
Jake Poznanski
b1101c1a2f Adding front matter to markdown, still debugging a bit 2025-08-22 16:36:03 +00:00
Jake Poznanski
dc3aba9891 Sorting keys in dataset to repro nicely 2025-08-21 22:18:48 +00:00
Jake Poznanski
c0aee06c8f grpo startup script works 2025-08-21 22:15:21 +00:00
Haydn Jones
261c722f56 Update README + arg name 2025-08-21 17:49:07 -04:00
Jake Poznanski
6e3e2d8abc Type checks better 2025-08-21 18:51:05 +00:00
Jake Poznanski
33d889c748 Fixing for conv format 2025-08-21 18:47:53 +00:00
Jake Poznanski
0f8d515d8c Logging 2025-08-21 18:42:04 +00:00
Jake Poznanski
0fd7d07e73 GRPO reward fixups 2025-08-21 18:33:11 +00:00
Jake Poznanski
55b705ce50 Log completions 2025-08-21 18:20:26 +00:00
Jake Poznanski
1dd6ff9b03 Olmocr bench grpo stuff 2025-08-21 18:17:07 +00:00
Jake Poznanski
6184c94c3c Vllm enable 2025-08-21 17:33:56 +00:00
Jake Poznanski
6fb136deee Random func 2025-08-21 17:23:26 +00:00
Jake Poznanski
5556e204cf Testing 2025-08-21 17:17:43 +00:00
Jake Poznanski
892429965a Hmm 2025-08-21 17:14:14 +00:00
Jake Poznanski
de719edf49 Logs 2025-08-21 17:11:33 +00:00
Jake Poznanski
6fe630516a Score longer 2025-08-21 17:09:09 +00:00
Jake Poznanski
7f2dd85e5b More configs 2025-08-21 17:06:18 +00:00
Jake Poznanski
dc4c0864ac Don't default to tensorboard 2025-08-21 17:03:01 +00:00
Jake Poznanski
a443b89854 Cleanup 2025-08-21 17:01:40 +00:00
Jake Poznanski
d08068218c Setting some defaults 2025-08-21 16:59:16 +00:00
Jake Poznanski
f865e49624 Options 2025-08-21 16:55:24 +00:00
Jake Poznanski
3c8410f22c fix 2025-08-21 16:51:51 +00:00
Jake Poznanski
1dbb4332c0 Fixing up 2025-08-21 16:50:56 +00:00
Haydn Jones
b34c3611e1 oopsy woopsy 2025-08-20 19:22:48 -04:00
Haydn Jones
b8a2b92174 External vLLM 2025-08-20 19:21:38 -04:00
Jake Poznanski
7c446e1679 Trying to fix script 2025-08-20 22:44:10 +00:00
Jake Poznanski
a2ee4d46c0 grpo trainer test 1 2025-08-20 22:35:19 +00:00
Jake Poznanski
77164e909f Decent grpo script 2025-08-20 22:25:26 +00:00
Jake Poznanski
cc918ca03e Setting up GRPO trainer 2025-08-20 22:18:38 +00:00
Jake Poznanski
d046ba554a Mining math tests too 2025-08-20 21:05:56 +00:00
Jake Poznanski
c32dced59c More fixes to data gen script 2025-08-20 20:24:38 +00:00