387 Commits

Author SHA1 Message Date
Jake Poznanski
4053ea58a4 Work on image matching 2025-03-06 13:11:08 -08:00
Jake Poznanski
b03d840238 Better error handling on eqn rendering 2025-03-06 11:59:20 -08:00
Jake Poznanski
438e68ec68 Some more math stuff 2025-03-06 11:00:50 -08:00
Jake Poznanski
7f36ac86f3 First math tests 2025-03-06 10:34:05 -08:00
Jake Poznanski
b62ccc25dd Equation rendering code, first pass 2025-03-06 09:59:36 -08:00
Jake Poznanski
9be696fa30 Adding a trailing repetition test 2025-03-06 08:56:16 -08:00
Jake Poznanski
07466e1ae4 Stats tests 2025-03-06 08:18:05 -08:00
Jake Poznanski
eeb2733c9e Marker rerun, stats changes 2025-03-06 07:55:44 -08:00
Jake Poznanski
50e55f45ab Conversion fixes 2025-03-05 15:31:45 -08:00
Jake Poznanski
fb0a729fe6 Better convert script 2025-03-05 14:31:39 -08:00
Jake Poznanski
fa68c6b6ce Better conversion script, run on more things 2025-03-05 14:16:29 -08:00
Jake Poznanski
c9ecd8e040 Need those chat templates 2025-03-05 14:01:14 -08:00
Jake Poznanski
5611d79bb2 Model runners 2025-03-05 13:55:40 -08:00
Jake Poznanski
5cb32c3289 Convert script work with server backends 2025-03-05 13:33:39 -08:00
Jake Poznanski
2982526a10 Convert scripts for benchmark 2025-03-05 12:03:34 -08:00
Jake Poznanski
1545a6d515 Adding more work on diffs 2025-03-04 15:08:59 -08:00
Jake Poznanski
004486f014 Nice tables support 2025-03-04 14:22:03 -08:00
Jake Poznanski
3a0bcb6afd Better table tests 2025-03-04 14:04:50 -08:00
Jake Poznanski
748fd62e8a Adding basic table relative tests 2025-03-04 13:34:33 -08:00
Jake Poznanski
76476f9992 Synth rendering ideas 2025-03-04 09:59:51 -08:00
Jake Poznanski
c4f6b11834 Fixing the mine diffs script, but it still doesn't work great 2025-03-04 09:11:53 -08:00
Jake Poznanski
fcb1eab98f Consistent ordering on convert, with data dir script 2025-03-04 08:39:35 -08:00
Jake Poznanski
ecac3847e4 Making a nicer warning message when waiting for sglang server 2025-03-04 08:28:15 -08:00
Jake Poznanski
03ef3532b4 One last lint fix 2025-03-04 04:14:33 +00:00
Jake Poznanski
7d7e81ef91 Internal version bump 2025-03-04 04:09:02 +00:00
Jake Poznanski
dc7cb5c8b5 Ruff fixes to CI 2025-03-03 15:56:39 -08:00
Jake Poznanski
ca0f911997 Probably need at least 20GB GPU ram to have a good time with olmocr 2025-03-03 15:54:47 -08:00
Jake Poznanski
a701a37629 Fix for calling --pdfs with an invalid pdf 2025-03-03 14:45:06 -08:00
Jake Poznanski
622540eaae Fix so that the pipeline.py attempts to download the model weights first, before starting the loading timeout 2025-03-03 13:42:13 -08:00
Jake Poznanski
010fdf87ea Small fix 2025-03-03 13:04:42 -08:00
Jake Poznanski
7dd44ed717 convert script 2025-03-03 10:01:12 -08:00
Jake Poznanski
701abdb955 Some new entries 2025-02-28 15:14:06 -08:00
Jake Poznanski
1148b475e9 Minor fixes 2025-02-28 15:10:51 -08:00
Jake Poznanski
361ed2a038 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-28 14:58:36 -08:00
Jake Poznanski
9f12917e10 Organizing things for data entry 2025-02-28 14:58:29 -08:00
Jake Poznanski
af02c63531 Working viewer 2025-02-28 14:00:22 -08:00
Aman Rangapur
37b32d7b06 Merge pull request #62 from allenai/amanr/bench: Added Gemini and Claude runners with a viewer. 2025-02-28 13:20:38 -08:00
“aman-17”
17d8b0ec1f resolved viewer merge conflict 2025-02-28 13:19:20 -08:00
Jake Poznanski
8061aacd58 Working on viewer/editor for rules 2025-02-28 13:05:56 -08:00
Jake Poznanski
ab13ac6054 Mining diff script outputs candidate rules 2025-02-28 12:55:22 -08:00
“aman-17”
484b9acbde resolved Jake's comments 2025-02-28 11:55:02 -08:00
Jake Poznanski
99ab0464e5 Autominer work 2025-02-28 11:50:18 -08:00
“aman-17”
452ac01fda cleaning 2025-02-28 11:25:33 -08:00
“aman-17”
7add91a4fe more cleaning 2025-02-28 11:16:35 -08:00
“aman-17”
15731e49d6 restored changes wrt main 2025-02-28 11:13:59 -08:00
Jake Poznanski
1b78ec9572 More work on automining 2025-02-28 10:14:47 -08:00
Jake Poznanski
3670219a8f commits 2025-02-28 08:54:47 -08:00
Jake Poznanski
a03673e126 Working on some progress for the autominer, fixing more options in convert script 2025-02-27 16:02:48 -08:00
Jake Poznanski
11e89dcd22 Script fixups 2025-02-27 14:32:10 -08:00
Jake Poznanski
505e08cbb1 automine draft 2025-02-27 13:59:40 -08:00