693 Commits

Author SHA1 Message Date
Jake Poznanski
dc7cb5c8b5 Ruff fixes to CI 2025-03-03 15:56:39 -08:00
Jake Poznanski
1348a29ce8 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-03-03 15:54:54 -08:00
Jake Poznanski
ca0f911997 Probably need at least 20GB GPU ram to have a good time with olmocr 2025-03-03 15:54:47 -08:00
Jake Poznanski
9390831f6e Update action.yml to use cache v3 2025-03-03 15:18:40 -08:00
Jake Poznanski
22418534e1 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-03-03 14:45:14 -08:00
Jake Poznanski
a701a37629 Fix for calling --pdfs with an invalid pdf 2025-03-03 14:45:06 -08:00
Jake Poznanski
90f7b590fd Update README.md 2025-03-03 13:48:21 -08:00
Jake Poznanski
622540eaae Fix so that the pipeline.py attempts to download the model weights first, before starting the loading timeout 2025-03-03 13:42:13 -08:00
Jake Poznanski
010fdf87ea Small fix 2025-03-03 13:04:42 -08:00
Jake Poznanski
7dd44ed717 convert script 2025-03-03 10:01:12 -08:00
Jake Poznanski
701abdb955 Some new entries 2025-02-28 15:14:06 -08:00
Jake Poznanski
1148b475e9 Minor fixes 2025-02-28 15:10:51 -08:00
Jake Poznanski
361ed2a038 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-28 14:58:36 -08:00
Jake Poznanski
9f12917e10 Organizing things for data entry 2025-02-28 14:58:29 -08:00
Jake Poznanski
af02c63531 Working viewer 2025-02-28 14:00:22 -08:00
Aman Rangapur
37b32d7b06 Merge pull request #62 from allenai/amanr/bench: Added Gemini and Claude runners with a viewer. 2025-02-28 13:20:38 -08:00
“aman-17”
17d8b0ec1f resolved viewer merge conflict 2025-02-28 13:19:20 -08:00
Jake Poznanski
8061aacd58 Working on viewer/editor for rules 2025-02-28 13:05:56 -08:00
Jake Poznanski
ab13ac6054 Mining diff script outputs candidate rules 2025-02-28 12:55:22 -08:00
“aman-17”
484b9acbde resolved Jake's comments 2025-02-28 11:55:02 -08:00
Jake Poznanski
99ab0464e5 Autominer work 2025-02-28 11:50:18 -08:00
“aman-17”
b49fa89cae updated changelog 2025-02-28 11:30:32 -08:00
“aman-17”
452ac01fda cleaning 2025-02-28 11:25:33 -08:00
“aman-17”
7add91a4fe more cleaning 2025-02-28 11:16:35 -08:00
“aman-17”
15731e49d6 restored changes wrt main 2025-02-28 11:13:59 -08:00
Jake Poznanski
143769bcbc Merge pull request #61 from allenai/kylel/elo: Adds data and scripts for ELO ratings 2025-02-28 10:18:00 -08:00
Jake Poznanski
1b78ec9572 More work on automining 2025-02-28 10:14:47 -08:00
kyleclo
25df26fefd readme 2025-02-28 10:12:07 -08:00
kyleclo
7e434d8466 Merge branch 'main' into kylel/elo 2025-02-28 10:06:40 -08:00
Jake Poznanski
3670219a8f commits 2025-02-28 08:54:47 -08:00
Jake Poznanski
2d4c1a1290 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-27 16:02:55 -08:00
Jake Poznanski
a03673e126 Working on some progress for the autominer, fixing more options in convert script 2025-02-27 16:02:48 -08:00
Jake Poznanski
e68329800a Update README.md 2025-02-27 14:56:05 -08:00
Jake Poznanski
11e89dcd22 Script fixups 2025-02-27 14:32:10 -08:00
Jake Poznanski
505e08cbb1 automine draft 2025-02-27 13:59:40 -08:00
Jake Poznanski
ae7efd3580 Refactoring 2025-02-27 13:15:33 -08:00
Jake Poznanski
9e019f17b5 More factoring 2025-02-27 13:11:47 -08:00
“aman-17”
158be488c9 added viewer for gemini vs chatgpt 2025-02-27 11:52:08 -08:00
“aman-17”
7fbca7c766 update 2025-02-26 14:03:51 -08:00
“aman-17”
98f376630a restored the fine-tuning prompt 2025-02-26 13:36:20 -08:00
“aman-17”
9481b29da3 update 2025-02-26 13:28:01 -08:00
Jake Poznanski
bd08fdb476 fixes missing OSS code for Issue #36 2025-02-26 17:49:04 +00:00
“aman-17”
88910e20fa updated gemini 2025-02-26 09:42:35 -08:00
“aman-17”
9a9f9cbddb added gemini and claude 2025-02-25 16:57:39 -08:00
“aman-17”
3a6df83168 update 2025-02-25 14:41:48 -08:00
Jake Poznanski
d4b902cea2 Olmocr runner implemented 2025-02-25 14:25:02 -08:00
Jake Poznanski
aac0c1503d chatgpt converter 2025-02-25 13:46:36 -08:00
Jake Poznanski
8a6e8b965f Basic rule viewer 2025-02-25 13:11:54 -08:00
Jake Poznanski
9081f7f7e6 Update README.md 2025-02-25 09:17:15 -08:00
aman-17
0130a970c2 fixed style 2025-02-25 08:57:02 -08:00