257 Commits

Author SHA1 Message Date
Jake Poznanski
54cd5a3438 Going to train on the new transcripts data 2025-09-08 22:30:40 +00:00
Jake Poznanski
ef09c73bf2 Fixing up some rewards stuff 2025-09-04 17:34:53 +00:00
Jake Poznanski
ede0dc51b1 Adding drop last to prevent any weirdnesses 2025-09-04 16:50:08 +00:00
Jake Poznanski
14a882db9a Fixing to new version, adjusting scale rewards stuff 2025-09-03 22:43:35 +00:00
Jake Poznanski
755c221024 Trying some more things 2025-09-03 22:11:16 +00:00
Jake Poznanski
a41d04660a Cleaning script 2025-09-03 21:31:21 +00:00
Jake Poznanski
e6cff25b6b Cleanup stuff 2025-09-03 20:34:12 +00:00
Jake Poznanski
bade86fe91 Cleaned up things 2025-09-03 20:23:01 +00:00
Jake Poznanski
b689a8e5f8 Giving more memory buffer 2025-09-03 19:56:53 +00:00
Jake Poznanski
7346d12322 Better cleaning, augusta version 2025-09-03 18:47:02 +00:00
Jake Poznanski
f20f1a0b54 Doing some cleaning 2025-09-03 18:41:36 +00:00
Jake Poznanski
94d19c51c6 Cleaning up scripts, multi gpu trainer more flexible 2025-09-03 18:25:10 +00:00
Jake Poznanski
c612293a59 Remove device map auto 2025-09-03 18:04:42 +00:00
Jake Poznanski
1fb49cefc1 Working on multi gpu trainer 2025-09-03 17:25:14 +00:00
Jake Poznanski
3be381b375 Adding some params 2025-08-26 20:46:06 +00:00
Jake Poznanski
82fd50263f Launcher for grpo training 2025-08-26 16:28:38 +00:00
Jake Poznanski
ed6f483074 Fixing run_benchmark 2025-08-25 20:28:40 +00:00
Jake Poznanski
d84eb95ba2 Saving some extra data mixes 2025-08-25 20:26:29 +00:00
Jake Poznanski
b16e4051f6 Saving bench results to s3 2025-08-25 19:53:55 +00:00
Jake Poznanski
d9b6978499 Some scripts 2025-08-25 18:44:18 +00:00
Jake Poznanski
55b7101d7e Add some new rotation tests to a branch of the bench 2025-08-25 16:25:00 +00:00
Jake Poznanski
c0aee06c8f grpo startup script works 2025-08-21 22:15:21 +00:00
Jake Poznanski
1dd6ff9b03 Olmocr bench grpo stuff 2025-08-21 18:17:07 +00:00
Jake Poznanski
6184c94c3c Vllm enable 2025-08-21 17:33:56 +00:00
Jake Poznanski
1dbb4332c0 Fixing up 2025-08-21 16:50:56 +00:00
Jake Poznanski
7c446e1679 Trying to fix script 2025-08-20 22:44:10 +00:00
Jake Poznanski
a2ee4d46c0 grpo trainer test 1 2025-08-20 22:35:19 +00:00
Jake Poznanski
c075f3071f New configs for new data 2025-08-16 17:31:42 +00:00
Jake Poznanski
2fca448105 Using new budget code 2025-08-06 16:31:08 +00:00
Jake Poznanski
8b8c6bb837 Cleaning up some training requirements installation steps 2025-08-05 19:42:46 +00:00
Jake Poznanski
c9b8088bc6 Adding some preempt flags 2025-08-05 18:00:46 +00:00
Jake Poznanski
55f8ba0ac0 Fixing configs 2025-08-04 22:54:39 +00:00
Jake Poznanski
12f8a90f1b Copying preprocessed files to local ssd in trainer script 2025-08-04 22:18:38 +00:00
Jake Poznanski
7c098955a9 Trying fix for transformers benchmark 2025-08-04 19:50:05 +00:00
Jake Poznanski
df52cb0e0e Small fixes for transformers test runner 2025-07-25 03:18:24 +00:00
Jake Poznanski
cf1912dec4 Some transformer bench ideas 2025-07-24 21:20:15 +00:00
Jake Poznanski
16145a4b32 Need accelerate 2025-07-16 18:51:37 +00:00
Jake Poznanski
31c834dcdd Constants 2025-07-16 02:15:17 +00:00
Jake Poznanski
5ea4e8a6e2 Compare vllm script 2025-07-15 22:55:49 +00:00
Jake Poznanski
24608956a0 Working on comparing to vllm 2025-07-15 22:21:54 +00:00
Jake Poznanski
e6c98236b6 Adding more pipeline retry stats, compress code fixed 2025-07-15 21:41:10 +00:00
Jake Poznanski
4dbbf91e1c Compression script 2025-07-15 21:26:15 +00:00
Jake Poznanski
1092213c5f Merge branch 'jakep/new_traininer_nojson_newprompt' into jakep/new_trainer 2025-07-15 17:44:55 +00:00
Jake Poznanski
43ae28dde4 Prepare checkpoint works for older models too 2025-07-14 21:30:32 +00:00
Jake Poznanski
f014c2aaf9 Need to reserve all 8 gpus for reliable performance benchmark, even if you only use 1 2025-07-14 21:02:14 +00:00
Jake Poznanski
65d0edcaae Adding guided decoding option 2025-07-10 15:13:26 +00:00
Jake Poznanski
a7e2f719bf Start a preemptible one at least once 2025-07-02 19:26:30 +00:00
Jake Poznanski
79a7818517 New trainer launch script for beaker 2025-07-01 01:43:38 +00:00
Jake Poznanski
dcf026a63c Better script 2025-07-01 01:40:55 +00:00
Jake Poznanski
9f0f912101 Ugh 2025-06-30 23:37:35 +00:00