Jake Poznanski
|
9f8df232b6
|
Readme updates
|
2025-08-13 22:03:03 +00:00 |
|
Jake Poznanski
|
36ca700669
|
Bump version to v0.3.0 for release
v0.3.0
|
2025-08-13 21:41:30 +00:00 |
|
Jake Poznanski
|
3e5351c028
|
version bump
|
2025-08-13 21:41:22 +00:00 |
|
Jake Poznanski
|
894c617ea4
|
Merge pull request #303 from allenai/jakep/olmocr_v03
olmOCR v.0.3.0
|
2025-08-13 14:39:54 -07:00 |
|
Jake Poznanski
|
463cef7ea2
|
New default model
|
2025-08-13 20:57:15 +00:00 |
|
Jake Poznanski
|
e86267a01c
|
Making local results directory properly
|
2025-08-13 20:40:04 +00:00 |
|
Jake Poznanski
|
11302feb8c
|
Move open cv2 import only into experimental data loader class
|
2025-08-13 20:28:31 +00:00 |
|
Jake Poznanski
|
93411a80a0
|
Lint fixes
|
2025-08-13 20:21:04 +00:00 |
|
Jake Poznanski
|
05330150ad
|
New work queue code is cleaner
|
2025-08-13 20:20:27 +00:00 |
|
Jake Poznanski
|
9a8fa335ae
|
One more scheme to try
|
2025-08-13 18:21:58 +00:00 |
|
Jake Poznanski
|
ffb0c6abc5
|
Adding some more quant schemes
|
2025-08-13 18:00:38 +00:00 |
|
Jake Poznanski
|
b921922f25
|
Cleaning up some pipeline logs
|
2025-08-13 17:39:02 +00:00 |
|
Jake Poznanski
|
332a818614
|
useless config
|
2025-08-12 17:31:19 +00:00 |
|
Jake Poznanski
|
b873d66dae
|
resumable
|
2025-08-12 16:35:21 +00:00 |
|
Jake Poznanski
|
98d457c502
|
2epoch config fix
|
2025-08-11 22:21:55 +00:00 |
|
Jake Poznanski
|
387e7947c4
|
Another 2 epoch run
|
2025-08-06 22:39:09 +00:00 |
|
Jake Poznanski
|
2a3c534a84
|
2 epoch resumable config
|
2025-08-06 22:38:38 +00:00 |
|
Jake Poznanski
|
c7a533c945
|
Sorting data loader samples to maintain consistency between runs
|
2025-08-06 21:46:13 +00:00 |
|
Jake Poznanski
|
2fca448105
|
Using new budget code
|
2025-08-06 16:31:08 +00:00 |
|
Jake Poznanski
|
e664dc5f36
|
typo
|
2025-08-05 19:43:11 +00:00 |
|
Jake Poznanski
|
8b8c6bb837
|
Cleaning up some training requirements installation steps
|
2025-08-05 19:42:46 +00:00 |
|
Jake Poznanski
|
c9b8088bc6
|
Adding some preempt flags
|
2025-08-05 18:00:46 +00:00 |
|
Jake Poznanski
|
8b7006d75d
|
One more thing to try
|
2025-08-05 17:38:59 +00:00 |
|
Jake Poznanski
|
51ec1d34b2
|
Adding a bigger config with augments
|
2025-08-05 17:38:00 +00:00 |
|
Jake Poznanski
|
8b595b63ec
|
Adding a decent augmentations pipeline
|
2025-08-05 17:37:02 +00:00 |
|
Jake Poznanski
|
7dca33db60
|
Getting things ready for a bit more augmentation
|
2025-08-05 16:34:46 +00:00 |
|
Jake Poznanski
|
55f8ba0ac0
|
Fixing configs
|
2025-08-04 22:54:39 +00:00 |
|
Jake Poznanski
|
c4de7dce80
|
Dataloader fix for loading blank yamls
|
2025-08-04 22:42:57 +00:00 |
|
Jake Poznanski
|
3ae173bd72
|
Merge branch 'main' into jakep/olmocr_v03
|
2025-08-04 22:28:29 +00:00 |
|
Jake Poznanski
|
12f8a90f1b
|
Copying preprocessed files to local ssd in trainer script
|
2025-08-04 22:18:38 +00:00 |
|
Jake Poznanski
|
be1f845da4
|
Fixing issue with blank documents
|
2025-08-04 21:50:54 +00:00 |
|
Jake Poznanski
|
8715ccd245
|
Rotation augmentation config
|
2025-08-04 21:17:40 +00:00 |
|
Jake Poznanski
|
0792c03a9a
|
Ok, rotation augmentation is in
|
2025-08-04 21:15:36 +00:00 |
|
Jake Poznanski
|
3bc2c0b8e3
|
Adding batch skipping in data loader
|
2025-08-04 21:07:04 +00:00 |
|
Jake Poznanski
|
66c7d823b5
|
Cleaning up some new config files
|
2025-08-04 20:49:33 +00:00 |
|
Jake Poznanski
|
d7cb315878
|
Merge branch 'main' into jakep/olmocr_v03
|
2025-08-04 20:46:56 +00:00 |
|
Jake Poznanski
|
6216896102
|
Accidentally committed too many files
|
2025-08-04 20:41:21 +00:00 |
|
Jake Poznanski
|
4d2ddd3245
|
Merge branch 'jakep/flip_prompt' into jakep/olmocr_v03
|
2025-08-04 20:35:40 +00:00 |
|
Jake Poznanski
|
6417b2e8ba
|
Merge branch 'main' of https://github.com/allenai/olmocr
v0.2.3
|
2025-08-04 20:34:02 +00:00 |
|
Jake Poznanski
|
75a8b05255
|
Bump version to v0.2.3 for release
|
2025-08-04 20:33:54 +00:00 |
|
Jake Poznanski
|
ea465bebf3
|
Update README.md
|
2025-08-04 13:26:52 -07:00 |
|
Jake Poznanski
|
f3aedf2c12
|
Bumping version
|
2025-08-04 20:02:55 +00:00 |
|
Jake Poznanski
|
becd15d3cf
|
Reformatting fix
|
2025-08-04 20:01:54 +00:00 |
|
Jake Poznanski
|
d6591c04a1
|
Saving extra metadata that will be useful for finetuning
|
2025-08-04 20:01:30 +00:00 |
|
Jake Poznanski
|
7c098955a9
|
Trying fix for transformers benchmark
|
2025-08-04 19:50:05 +00:00 |
|
Jake Poznanski
|
8712534e81
|
Fix for docker ignore
|
2025-08-04 19:39:55 +00:00 |
|
Jake Poznanski
|
0bdcd4471e
|
Fix docker ignore
|
2025-08-04 18:58:45 +00:00 |
|
Jake Poznanski
|
0536c0e9b8
|
Lint fixes
|
2025-08-04 18:21:47 +00:00 |
|
Jake Poznanski
|
08b263ba46
|
Cumulative rotation support
|
2025-08-04 18:21:31 +00:00 |
|
Jake Poznanski
|
5e991b67e5
|
Merge pull request #291 from haydn-jones/main
Forward unknown args to vLLM
|
2025-08-04 11:04:46 -07:00 |
|