Jake Poznanski
|
bede854cd5
|
Starting to write molmo formatters
|
2024-10-30 13:24:11 -07:00 |
|
Jake Poznanski
|
ffe470bf0e
|
Fix
|
2024-10-23 22:55:50 +00:00 |
|
Jake Poznanski
|
180dde03c5
|
dataprep sampling tests
|
2024-10-23 22:53:05 +00:00 |
|
Jake Poznanski
|
2826bcad18
|
Yay all unit tests pass cleanly now too
|
2024-10-17 17:05:55 +00:00 |
|
Jake Poznanski
|
124aaf5fe0
|
Hmm, can't repro failing anchor case
|
2024-10-17 17:00:02 +00:00 |
|
Jake Poznanski
|
2864f907e1
|
Dataloader fix with nicer tests
|
2024-10-10 16:58:45 +00:00 |
|
Jake Poznanski
|
b7c80cd17f
|
Fix up some tests but I don't see why this isn't working
|
2024-10-10 16:58:40 +00:00 |
|
Jake Poznanski
|
e42cecf96c
|
Adding anchor code based off of pypdf that visits each text block, hopefully so we can make it output good bboxes
|
2024-10-01 22:10:58 +00:00 |
|
Jake Poznanski
|
decfd7fbc1
|
Fixing the refiner input prompt to something simpler that doesn't depend on the training data. Fixing beaker job workspace and bumping priority to high.
|
2024-09-27 22:54:07 +00:00 |
|
Jake Poznanski
|
4eddb1b45f
|
Okay, reasonably happy with the dataprep pipeline
|
2024-09-20 13:04:47 -07:00 |
|
Jake Poznanski
|
a47afe5c8d
|
Adding test to make sure the training and inference time tokenization stays identical, currently failing
|
2024-09-20 12:01:05 -07:00 |
|