aman-17
|
0130a970c2
|
fixed style
|
2025-02-25 08:57:02 -08:00 |
|
Jake Poznanski
|
25ec87b66d
|
CI
|
2025-02-14 20:46:55 +00:00 |
|
Jake Poznanski
|
c05e01532c
|
Hopefully CI runs now
|
2025-02-14 20:42:19 +00:00 |
|
Jake Poznanski
|
dcaca8aa90
|
Black formatting
|
2025-01-29 15:30:39 -08:00 |
|
Jake Poznanski
|
4a1762d455
|
isort
|
2025-01-29 15:25:10 -08:00 |
|
Jake Poznanski
|
b2894d0280
|
Massive refactor from pdelfin to olmocr
|
2025-01-27 18:30:41 +00:00 |
|
Jake Poznanski
|
6a4a55f9e0
|
Hopefully working molmo HF trainer config
|
2024-10-30 14:00:27 -07:00 |
|
Jake Poznanski
|
bede854cd5
|
Starting to write molmo formatters
|
2024-10-30 13:24:11 -07:00 |
|
Jake Poznanski
|
ffe470bf0e
|
Fix
|
2024-10-23 22:55:50 +00:00 |
|
Jake Poznanski
|
180dde03c5
|
dataprep sampling tests
|
2024-10-23 22:53:05 +00:00 |
|
Jake Poznanski
|
2826bcad18
|
Yay all unit tests pass cleanly now too
|
2024-10-17 17:05:55 +00:00 |
|
Jake Poznanski
|
124aaf5fe0
|
Hmm, can't repro failing anchor case
|
2024-10-17 17:00:02 +00:00 |
|
Jake Poznanski
|
2864f907e1
|
Dataloader fix with nicer tests
|
2024-10-10 16:58:45 +00:00 |
|
Jake Poznanski
|
b7c80cd17f
|
Fix up some tests but I don't see why this isn't working
|
2024-10-10 16:58:40 +00:00 |
|
Jake Poznanski
|
e42cecf96c
|
Adding anchor code based off of pypdf that visits each text block, hopefully so we can make it output good bboxes
|
2024-10-01 22:10:58 +00:00 |
|
Jake Poznanski
|
decfd7fbc1
|
Fixing the refiner input prompt to something simpler that doesn't depend on the training data. Fixing beaker job workspace and bumping priority to high.
|
2024-09-27 22:54:07 +00:00 |
|
Jake Poznanski
|
4eddb1b45f
|
Okay, reasonably happy with the dataprep pipeline
|
2024-09-20 13:04:47 -07:00 |
|
Jake Poznanski
|
a47afe5c8d
|
Adding test to make sure the training and inference time tokenization stays identical, currently failing
|
2024-09-20 12:01:05 -07:00 |
|