708 Commits

Author SHA1 Message Date
Jake Poznanski
87875b3e2f Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-03-05 12:33:02 -08:00
Jake Poznanski
2982526a10 Convert scripts for benchmark 2025-03-05 12:03:34 -08:00
Jake Poznanski
dbbe6cea11 Merge branch 'main' of https://github.com/allenai/olmocr 2025-03-05 19:37:10 +00:00
Jake Poznanski
abeaf028fd Docker file builds faster now 2025-03-05 19:37:09 +00:00
Jake Poznanski
1545a6d515 Adding more work on diffs 2025-03-04 15:08:59 -08:00
Jake Poznanski
004486f014 Nice tables support 2025-03-04 14:22:03 -08:00
Jake Poznanski
3a0bcb6afd Better table tests 2025-03-04 14:04:50 -08:00
Jake Poznanski
748fd62e8a Adding basic table relative tests 2025-03-04 13:34:33 -08:00
Jake Poznanski
76476f9992 Synth rendering ideas 2025-03-04 09:59:51 -08:00
Jake Poznanski
c4f6b11834 Fixing the mine diffs script, but it still doesn't work great 2025-03-04 09:11:53 -08:00
Jake Poznanski
fcb1eab98f Consistent ordering on convert, with data dir script 2025-03-04 08:39:35 -08:00
Jake Poznanski
ecac3847e4 Making a nicer warning message when waiting for sglang server 2025-03-04 08:28:15 -08:00
Jake Poznanski
03ef3532b4 One last lint fix 2025-03-04 04:14:33 +00:00
Jake Poznanski
7d7e81ef91 Internal version bump 2025-03-04 04:09:02 +00:00
Luca Soldaini
7a7c87805c double parentheses for proper escaping 2025-03-03 20:08:27 -08:00
Jake Poznanski
dc7cb5c8b5 Ruff fixes to CI 2025-03-03 15:56:39 -08:00
Jake Poznanski
1348a29ce8 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-03-03 15:54:54 -08:00
Jake Poznanski
ca0f911997 Probably need at least 20GB GPU ram to have a good time with olmocr 2025-03-03 15:54:47 -08:00
Jake Poznanski
9390831f6e Update action.yml to use cache v3 2025-03-03 15:18:40 -08:00
Jake Poznanski
22418534e1 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-03-03 14:45:14 -08:00
Jake Poznanski
a701a37629 Fix for calling --pdfs with an invalid pdf 2025-03-03 14:45:06 -08:00
Jake Poznanski
90f7b590fd Update README.md 2025-03-03 13:48:21 -08:00
Jake Poznanski
622540eaae Fix so that the pipeline.py attempts to download the model weights first, before starting the loading timeout 2025-03-03 13:42:13 -08:00
Jake Poznanski
010fdf87ea Small fix 2025-03-03 13:04:42 -08:00
Jake Poznanski
7dd44ed717 convert script 2025-03-03 10:01:12 -08:00
Jake Poznanski
701abdb955 Some new entries 2025-02-28 15:14:06 -08:00
Jake Poznanski
1148b475e9 Minor fixes 2025-02-28 15:10:51 -08:00
Jake Poznanski
361ed2a038 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-28 14:58:36 -08:00
Jake Poznanski
9f12917e10 Organizing things for data entry 2025-02-28 14:58:29 -08:00
Jake Poznanski
af02c63531 Working viewer 2025-02-28 14:00:22 -08:00
Aman Rangapur
37b32d7b06 Merge pull request #62 from allenai/amanr/bench: Added Gemini and Claude runners with a viewer. 2025-02-28 13:20:38 -08:00
“aman-17”
17d8b0ec1f resolved viewer merge conflict 2025-02-28 13:19:20 -08:00
Jake Poznanski
8061aacd58 Working on viewer/editor for rules 2025-02-28 13:05:56 -08:00
Jake Poznanski
ab13ac6054 Mining diff script outputs candidate rules 2025-02-28 12:55:22 -08:00
“aman-17”
484b9acbde resolved Jake's comments 2025-02-28 11:55:02 -08:00
Jake Poznanski
99ab0464e5 Autominer work 2025-02-28 11:50:18 -08:00
“aman-17”
b49fa89cae updated changelog 2025-02-28 11:30:32 -08:00
“aman-17”
452ac01fda cleaning 2025-02-28 11:25:33 -08:00
“aman-17”
7add91a4fe more cleaning 2025-02-28 11:16:35 -08:00
“aman-17”
15731e49d6 restored changes wrt main 2025-02-28 11:13:59 -08:00
Jake Poznanski
143769bcbc Merge pull request #61 from allenai/kylel/elo: Adds data and scripts for ELO ratings 2025-02-28 10:18:00 -08:00
Jake Poznanski
1b78ec9572 More work on automining 2025-02-28 10:14:47 -08:00
kyleclo
25df26fefd readme 2025-02-28 10:12:07 -08:00
kyleclo
7e434d8466 Merge branch 'main' into kylel/elo 2025-02-28 10:06:40 -08:00
Jake Poznanski
3670219a8f commits 2025-02-28 08:54:47 -08:00
Jake Poznanski
2d4c1a1290 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-27 16:02:55 -08:00
Jake Poznanski
a03673e126 Working on some progress for the autominer, fixing more options in convert script 2025-02-27 16:02:48 -08:00
Jake Poznanski
e68329800a Update README.md 2025-02-27 14:56:05 -08:00
Jake Poznanski
11e89dcd22 Script fixups 2025-02-27 14:32:10 -08:00
Jake Poznanski
505e08cbb1 automine draft 2025-02-27 13:59:40 -08:00