Jake Poznanski | 3fef3f914f | Gemini support, some debugging stuff | 2025-03-10 16:26:48 +00:00
Jake Poznanski | fc857f9c6d | Starting on math dataset | 2025-03-07 21:30:37 +00:00
Jake Poznanski | d006e8f331 | Working on equation matching | 2025-03-06 16:09:26 -08:00
Jake Poznanski | 7003e9cfe1 | Working on a better compare function | 2025-03-06 15:16:20 -08:00
Jake Poznanski | e144200276 | Fix markdown parsing for mistral | 2025-03-06 13:41:51 -08:00
Jake Poznanski | bdc0d75799 | Adding mistral ocr to eval | 2025-03-06 13:29:56 -08:00
Jake Poznanski | 4053ea58a4 | Work on image matching | 2025-03-06 13:11:08 -08:00
Jake Poznanski | b03d840238 | Better error handling on eqn rendering | 2025-03-06 11:59:20 -08:00
Jake Poznanski | 438e68ec68 | Some more math stuff | 2025-03-06 11:00:50 -08:00
Jake Poznanski | 7f36ac86f3 | First math tests | 2025-03-06 10:34:05 -08:00
Jake Poznanski | b62ccc25dd | Equation rendering code, first pass | 2025-03-06 09:59:36 -08:00
Jake Poznanski | 9be696fa30 | Adding a trailing repetition test | 2025-03-06 08:56:16 -08:00
Jake Poznanski | 07466e1ae4 | Stats tests | 2025-03-06 08:18:05 -08:00
Jake Poznanski | eeb2733c9e | Marker rerun, stats changes | 2025-03-06 07:55:44 -08:00
Jake Poznanski | 50e55f45ab | Conversion fixes | 2025-03-05 15:31:45 -08:00
Jake Poznanski | fb0a729fe6 | Better convert script | 2025-03-05 14:31:39 -08:00
Jake Poznanski | fa68c6b6ce | Better conversion script, run on more things | 2025-03-05 14:16:29 -08:00
Jake Poznanski | c9ecd8e040 | Need those chat templates | 2025-03-05 14:01:14 -08:00
Jake Poznanski | 5611d79bb2 | Model runners | 2025-03-05 13:55:40 -08:00
Jake Poznanski | 5cb32c3289 | Convert script work with server backends | 2025-03-05 13:33:39 -08:00
Jake Poznanski | 87875b3e2f | Merge branch 'main' of https://github.com/allenai/olmocr into main | 2025-03-05 12:33:02 -08:00
Jake Poznanski | 2982526a10 | Convert scripts for benchmark | 2025-03-05 12:03:34 -08:00
Jake Poznanski | dbbe6cea11 | Merge branch 'main' of https://github.com/allenai/olmocr | 2025-03-05 19:37:10 +00:00
Jake Poznanski | abeaf028fd | Docker file builds faster now | 2025-03-05 19:37:09 +00:00
Jake Poznanski | 1545a6d515 | Adding more work on diffs | 2025-03-04 15:08:59 -08:00
Jake Poznanski | 004486f014 | Nice tables support | 2025-03-04 14:22:03 -08:00
Jake Poznanski | 3a0bcb6afd | Better table tests | 2025-03-04 14:04:50 -08:00
Jake Poznanski | 748fd62e8a | Adding basic table relative tests | 2025-03-04 13:34:33 -08:00
Jake Poznanski | 76476f9992 | Synth rendering ideas | 2025-03-04 09:59:51 -08:00
Jake Poznanski | c4f6b11834 | Fixing the mine diffs script, but it still doesn't work great | 2025-03-04 09:11:53 -08:00
Jake Poznanski | fcb1eab98f | Consistent ordering on convert, with data dir script | 2025-03-04 08:39:35 -08:00
Jake Poznanski | ecac3847e4 | Making a nicer warning message when waiting for sglang server | 2025-03-04 08:28:15 -08:00
Jake Poznanski | 03ef3532b4 | One last lint fix | 2025-03-04 04:14:33 +00:00
Jake Poznanski | 7d7e81ef91 | Internal version bump | 2025-03-04 04:09:02 +00:00
Luca Soldaini | 7a7c87805c | double parentheses for proper escaping | 2025-03-03 20:08:27 -08:00
Jake Poznanski | dc7cb5c8b5 | Ruff fixes to CI | 2025-03-03 15:56:39 -08:00
Jake Poznanski | 1348a29ce8 | Merge branch 'main' of https://github.com/allenai/olmocr into main | 2025-03-03 15:54:54 -08:00
Jake Poznanski | ca0f911997 | Probably need at least 20GB GPU ram to have a good time with olmocr | 2025-03-03 15:54:47 -08:00
Jake Poznanski | 9390831f6e | Update action.yml to use cache v3 | 2025-03-03 15:18:40 -08:00
Jake Poznanski | 22418534e1 | Merge branch 'main' of https://github.com/allenai/olmocr into main | 2025-03-03 14:45:14 -08:00
Jake Poznanski | a701a37629 | Fix for calling --pdfs with an invalid pdf | 2025-03-03 14:45:06 -08:00
Jake Poznanski | 90f7b590fd | Update README.md | 2025-03-03 13:48:21 -08:00
Jake Poznanski | 622540eaae | Fix so that the pipeline.py attempts to download the model weights first, before starting the loading timeout | 2025-03-03 13:42:13 -08:00
Jake Poznanski | 010fdf87ea | Small fix | 2025-03-03 13:04:42 -08:00
Jake Poznanski | 7dd44ed717 | convert script | 2025-03-03 10:01:12 -08:00
Jake Poznanski | 701abdb955 | Some new entries | 2025-02-28 15:14:06 -08:00
Jake Poznanski | 1148b475e9 | Minor fixes | 2025-02-28 15:10:51 -08:00
Jake Poznanski | 361ed2a038 | Merge branch 'main' of https://github.com/allenai/olmocr into main | 2025-02-28 14:58:36 -08:00
Jake Poznanski | 9f12917e10 | Organizing things for data entry | 2025-02-28 14:58:29 -08:00
Jake Poznanski | af02c63531 | Working viewer | 2025-02-28 14:00:22 -08:00