764 Commits

Author SHA1 Message Date
Jake Poznanski
3670219a8f commits 2025-02-28 08:54:47 -08:00
Jake Poznanski
2d4c1a1290 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-27 16:02:55 -08:00
Jake Poznanski
a03673e126 Working on some progress for the autominer, fixing more options in convert script 2025-02-27 16:02:48 -08:00
Jake Poznanski
e68329800a Update README.md 2025-02-27 14:56:05 -08:00
Jake Poznanski
11e89dcd22 Script fixups 2025-02-27 14:32:10 -08:00
Jake Poznanski
505e08cbb1 automine draft 2025-02-27 13:59:40 -08:00
Jake Poznanski
ae7efd3580 Refactoring 2025-02-27 13:15:33 -08:00
Jake Poznanski
9e019f17b5 More factoring 2025-02-27 13:11:47 -08:00
“aman-17”
158be488c9 added viewer for gemini vs chatgpt 2025-02-27 11:52:08 -08:00
“aman-17”
7fbca7c766 update 2025-02-26 14:03:51 -08:00
“aman-17”
98f376630a restored the fine-tuning prompt 2025-02-26 13:36:20 -08:00
“aman-17”
9481b29da3 update 2025-02-26 13:28:01 -08:00
Jake Poznanski
bd08fdb476 fixes missing OSS code for Issue #36 2025-02-26 17:49:04 +00:00
“aman-17”
88910e20fa updated gemini 2025-02-26 09:42:35 -08:00
“aman-17”
9a9f9cbddb added gemini and claude 2025-02-25 16:57:39 -08:00
“aman-17”
3a6df83168 update 2025-02-25 14:41:48 -08:00
Jake Poznanski
d4b902cea2 Olmocr runner implemented 2025-02-25 14:25:02 -08:00
Jake Poznanski
aac0c1503d chatgpt converter 2025-02-25 13:46:36 -08:00
Jake Poznanski
8a6e8b965f Basic rule viewer 2025-02-25 13:11:54 -08:00
Jake Poznanski
9081f7f7e6 Update README.md 2025-02-25 09:17:15 -08:00
aman-17
0130a970c2 fixed style 2025-02-25 08:57:02 -08:00
aman-17
c2b54d8525 updated readme 2025-02-25 08:44:36 -08:00
Jake Poznanski
d841216ffc Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-24 11:39:46 -08:00
Jake Poznanski
813a355f44 Fixing mineru runner, added a few sample docs 2025-02-24 11:39:38 -08:00
Jake Poznanski
e8387ece2f Update README.md 2025-02-24 11:00:31 -08:00
Jake Poznanski
6b06fe83ec Update README.md 2025-02-24 09:37:12 -08:00
Jake Poznanski
cc1f476b3e Bugfixes 2025-02-21 14:56:32 -08:00
Jake Poznanski
9da1f92628 Cleaner implementations of benchmark stuff 2025-02-21 14:08:24 -08:00
Jake Poznanski
53494d9c7e Refactoring 2025-02-21 12:47:24 -08:00
Jake Poznanski
ff465f7f36 Starting refactor 2025-02-21 11:58:39 -08:00
Jake Poznanski
a348cd6e8f olmocr bench runner 2025-02-21 09:57:07 -08:00
Jake Poznanski
c20e3c0702 Pdf for dataset 2025-02-21 09:31:25 -08:00
Jake Poznanski
16a32445a2 olmocr running 2025-02-19 16:10:46 -08:00
Jake Poznanski
422d08f4b8 Adding more rules and seeing how they should work 2025-02-19 15:13:19 -08:00
Jake Poznanski
f2f761973c Adding mineru script 2025-02-19 15:07:47 -08:00
Jake Poznanski
e5a80c572c Fixing up benchmark a bit 2025-02-19 14:43:47 -08:00
Jake Poznanski
c3d0ce99f2 Some readmes and instructions 2025-02-19 13:25:31 -08:00
Jake Poznanski
4e0339f965 Runner for olmocr bench 2025-02-19 21:04:49 +00:00
Jake Poznanski
a8f6921dd3 Benchmark runners for other systems 2025-02-19 19:50:26 +00:00
Jake Poznanski
318abf22ad Adding runbench 2025-02-19 19:27:08 +00:00
Jake Poznanski
1230aefe98 Making progress 2025-02-19 18:59:51 +00:00
Jake Poznanski
072bc1d142 Making some progress 2025-02-19 18:48:02 +00:00
Jake Poznanski
823629d046 Sample code for olmocrbench 2025-02-19 18:35:55 +00:00
Jake Poznanski
9e62003727 Adding readme for olmocr bench 2025-02-18 23:40:38 +00:00
Jake Poznanski
e4f9b1962f Infinigram counting script for paper 2025-02-18 19:01:17 +00:00
Jake Poznanski
602012267e Match script 2025-02-18 17:53:46 +00:00
Jake Poznanski
b871e4b425 Small helper to measure overlap 2025-02-18 17:14:56 +00:00
Jake Poznanski
a2c0887b3f Bump version to v0.1.58 for release v0.1.58 2025-02-15 00:16:07 +00:00
Jake Poznanski
0e7b3972c2 Update README.md 2025-02-14 16:00:57 -08:00
Jake Poznanski
c95343d4a1 Bump version to v0.1.57 for release v0.1.57 2025-02-14 22:57:51 +00:00