1589 Commits

Author SHA1 Message Date
Jake Poznanski
0656e1fdc1 Probably this has a deeper root cause that I don't see 2025-07-28 03:44:58 +00:00
Jake Poznanski
3a6a158a25 Muon config 2025-07-25 19:36:13 +00:00
Jake Poznanski
0acad6faf6 Adding muon support 2025-07-25 17:29:12 +00:00
Jake Poznanski
df52cb0e0e Small fixes for transformers test runner 2025-07-25 03:18:24 +00:00
Jake Poznanski
cf1912dec4 Some transformer bench ideas 2025-07-24 21:20:15 +00:00
Jake Poznanski
26c6281075 Formatting 2025-07-24 18:50:30 +00:00
Jake Poznanski
4a70b2ebad Docs and parameter groups 2025-07-24 18:47:03 +00:00
Jake Poznanski
fc983ca039 README 2025-07-24 18:11:38 +00:00
Jake Poznanski
476e20c936 Bump version to v0.2.1 for release v0.2.1 2025-07-23 21:49:22 +00:00
Jake Poznanski
2545408d97 Minor release cleaning up a few pipeline things 2025-07-23 21:49:11 +00:00
Jake Poznanski
54719b679f Fixed 2025-07-23 21:38:29 +00:00
Jake Poznanski
c63e97f6cc Default max model len cleanup 2025-07-23 20:37:48 +00:00
Jake Poznanski
4acc85ea45 bolds in tables 2025-07-23 17:21:58 +00:00
Jake Poznanski
c13f5aa720 Readme 2025-07-23 17:20:38 +00:00
Jake Poznanski
44cb957a2f Readmes 2025-07-23 16:54:25 +00:00
Jake Poznanski
783cacdeb6 Merge branch 'main' of https://github.com/allenai/olmocr 2025-07-23 16:48:58 +00:00
Jake Poznanski
ce32ceb83f Hopefully a cleaner pipeline 2025-07-23 16:48:56 +00:00
Jake Poznanski
538cb7645d Update README.md 2025-07-23 09:43:41 -07:00
Jake Poznanski
b4c5913772 Bump version to v0.2.0 for release v0.2.0 2025-07-23 15:37:18 +00:00
Jake Poznanski
35a5329516 New version 2025-07-23 15:37:08 +00:00
Jake Poznanski
0a6b2fe42d Lints - bringing back files 2025-07-23 04:49:37 +00:00
Jake Poznanski
56296d6927 Bringing back a few files 2025-07-23 04:49:13 +00:00
Jake Poznanski
6e8272413c Lint fixes 2025-07-23 03:40:05 +00:00
Jake Poznanski
5ec49672ea New default model 2025-07-23 03:33:25 +00:00
Jake Poznanski
a4752b5ef9 Merge remote-tracking branch 'origin/main' into jakep/new_trainer 2025-07-23 03:32:49 +00:00
Jake Poznanski
6d5711fa3e Noop fix 2025-07-22 23:17:07 +00:00
Jake Poznanski
eb271b1ea9 New step tracking changes 2025-07-22 22:22:06 +00:00
Jake Poznanski
d0efc70de6 Fixes 2025-07-22 20:48:46 +00:00
Jake Poznanski
61a13f96cf Merge pull request #270 from lgibelli/fix-kv-cache-under-allocation: Expose --gpu_memory_utilization / --max_model_len flags and startup hint 2025-07-22 12:40:54 -07:00
Jake Poznanski
9ef3fd704a Adjusting temp by attempt 2025-07-22 19:35:40 +00:00
Jake Poznanski
94c78b4f37 Fixing up seeding 2025-07-22 19:28:11 +00:00
Jake Poznanski
38517176ea Fix 2025-07-22 18:13:38 +00:00
Jake Poznanski
731ca95fda Fixes 2025-07-22 17:57:55 +00:00
Jake Poznanski
19b548c5ae Small fixes 2025-07-22 17:17:13 +00:00
Jake Poznanski
60c39440b7 More configs 2025-07-22 04:30:42 +00:00
Jake Poznanski
f44d03f608 Don't break on errors 2025-07-22 03:26:06 +00:00
Jake Poznanski
8eb3786fbe Fixing compressor again 2025-07-21 23:30:26 +00:00
Jake Poznanski
1009ec4188 Cleanup 2025-07-21 23:28:44 +00:00
Jake Poznanski
b2a950f9f4 1 epoch custom trainer 2025-07-21 22:34:35 +00:00
Jake Poznanski
28a207b912 Grok was asked to drop the hf trainer and implement it custom 2025-07-21 22:33:50 +00:00
Jake Poznanski
da6bc458cd Fix for compressor 2025-07-21 22:21:04 +00:00
Jake Poznanski
eb200e71c6 Fixing some default configs on quantizer 2025-07-21 21:43:33 +00:00
Jake Poznanski
0aa74795d3 More calibration samples by default 2025-07-21 21:33:43 +00:00
Jake Poznanski
6e480129d1 Trying out an idea for dataset augmentation 2025-07-19 15:27:53 +00:00
Jake Poznanski
df960cbc15 2 epoch config just to try 2025-07-18 23:25:27 +00:00
Jake Poznanski
a326c965dd Default to 1288 2025-07-17 19:46:35 +00:00
Jake Poznanski
b88c71e00c Rounding to better image size, full soups 2025-07-16 20:41:35 +00:00
Jake Poznanski
75bfa6a603 Adding full soup configs 2025-07-16 20:32:58 +00:00
Jake Poznanski
0f733ffc30 Fixes to compare vllm script 2025-07-16 19:58:35 +00:00
Jake Poznanski
16145a4b32 Need accelerate 2025-07-16 18:51:37 +00:00