693 Commits

Author SHA1 Message Date
aman-17
c2b54d8525 updated readme 2025-02-25 08:44:36 -08:00
Jake Poznanski
d841216ffc Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-24 11:39:46 -08:00
Jake Poznanski
813a355f44 Fixing mineru runner, added a few sample docs 2025-02-24 11:39:38 -08:00
Jake Poznanski
e8387ece2f Update README.md 2025-02-24 11:00:31 -08:00
Jake Poznanski
6b06fe83ec Update README.md 2025-02-24 09:37:12 -08:00
Jake Poznanski
cc1f476b3e Bugfixes 2025-02-21 14:56:32 -08:00
Jake Poznanski
9da1f92628 Cleaner implementations of benchmark stuff 2025-02-21 14:08:24 -08:00
Jake Poznanski
53494d9c7e Refactoring 2025-02-21 12:47:24 -08:00
Jake Poznanski
ff465f7f36 Starting refactor 2025-02-21 11:58:39 -08:00
Jake Poznanski
a348cd6e8f olmocr bench runner 2025-02-21 09:57:07 -08:00
Jake Poznanski
c20e3c0702 Pdf for dataset 2025-02-21 09:31:25 -08:00
Jake Poznanski
16a32445a2 olmocr running 2025-02-19 16:10:46 -08:00
Jake Poznanski
422d08f4b8 Adding more rules and seeing how they should work 2025-02-19 15:13:19 -08:00
Jake Poznanski
f2f761973c Adding mineru script 2025-02-19 15:07:47 -08:00
Jake Poznanski
e5a80c572c Fixing up benchmark a bit 2025-02-19 14:43:47 -08:00
Jake Poznanski
c3d0ce99f2 Some readmes and instructions 2025-02-19 13:25:31 -08:00
Jake Poznanski
4e0339f965 Runner for olmocr bench 2025-02-19 21:04:49 +00:00
Jake Poznanski
a8f6921dd3 Benchmark runners for other systems 2025-02-19 19:50:26 +00:00
Jake Poznanski
318abf22ad Adding runbench 2025-02-19 19:27:08 +00:00
Jake Poznanski
1230aefe98 Making progress 2025-02-19 18:59:51 +00:00
Jake Poznanski
072bc1d142 Making some progress 2025-02-19 18:48:02 +00:00
Jake Poznanski
823629d046 Sample code for olmocrbench 2025-02-19 18:35:55 +00:00
Jake Poznanski
9e62003727 Adding readme for olmocr bench 2025-02-18 23:40:38 +00:00
Jake Poznanski
e4f9b1962f Infinigram counting script for paper 2025-02-18 19:01:17 +00:00
Jake Poznanski
602012267e Match script 2025-02-18 17:53:46 +00:00
Jake Poznanski
b871e4b425 Small helper to measure overlap 2025-02-18 17:14:56 +00:00
Jake Poznanski
a2c0887b3f Bump version to v0.1.58 for release v0.1.58 2025-02-15 00:16:07 +00:00
Jake Poznanski
0e7b3972c2 Update README.md 2025-02-14 16:00:57 -08:00
Jake Poznanski
c95343d4a1 Bump version to v0.1.57 for release v0.1.57 2025-02-14 22:57:51 +00:00
Jake Poznanski
58db354532 Fixing release script 2025-02-14 22:57:43 +00:00
Jake Poznanski
c4303074e6 Bump version to v0.1.56 for release v0.1.56 2025-02-14 22:27:44 +00:00
Jake Poznanski
f50f37efb8 pyproject.toml changes 2025-02-14 22:27:36 +00:00
Jake Poznanski
bcf967b105 Bump version to v0.1.55 for release v0.1.55 2025-02-14 22:09:43 +00:00
Jake Poznanski
3ee8b7b45e toml fix 2025-02-14 22:09:29 +00:00
Jake Poznanski
95853fb25b Merge branch 'main' of https://github.com/allenai/olmocr v0.1.54 2025-02-14 21:43:01 +00:00
Jake Poznanski
7e02e199ba Adjusting tools to include html templates 2025-02-14 21:42:59 +00:00
Jake Poznanski
81f543019e Update README.md 2025-02-14 13:30:39 -08:00
Jake Poznanski
08f76121c3 Bump version to v0.1.53 for release v0.1.53 2025-02-14 20:55:21 +00:00
Jake Poznanski
58bdfa512b CI 2025-02-14 20:51:04 +00:00
Jake Poznanski
25ec87b66d CI 2025-02-14 20:46:55 +00:00
Jake Poznanski
c05e01532c Hopefully CI runs now 2025-02-14 20:42:19 +00:00
Jake Poznanski
15f9b8b9dc Install poppler in CI 2025-02-14 20:02:05 +00:00
Jake Poznanski
229da8cb17 unused imports 2025-02-14 19:54:48 +00:00
Jake Poznanski
32aa359458 Formatting fix 2025-02-14 19:50:19 +00:00
Jake Poznanski
0dcdbcc61a Update README.md 2025-02-14 11:07:42 -08:00
Jake Poznanski
6583fb641a hfupload scripts 2025-02-14 17:36:00 +00:00
kyleclo
86b17d0ea3 add boxplot drawing 2025-02-13 19:38:09 -08:00
kyleclo
a790ba73ee update args; include output 2025-02-13 17:06:36 -08:00
kyleclo
88c18b3afa human eval data; elo ratings script; dependencies 2025-02-13 16:59:09 -08:00
Jake Poznanski
8297955290 Making my parquets 2025-02-14 00:02:07 +00:00