1145 Commits

Author SHA1 Message Date
aman-17
bc89f90216 removed convert file 2025-04-15 15:12:35 -07:00
aman-17
8abc475a0b added old_scans and old_scans math miners and review app 2025-04-15 15:11:20 -07:00
Jake Poznanski
1d0c560455 Upping version to fix issue with work queue and delimited paths 2025-04-15 18:50:13 +00:00
aman-17
7703f0c9fa update 2025-04-14 19:40:17 -07:00
aman-17
7c1c43649a added old latex 2025-04-14 18:53:20 -07:00
Jake Poznanski
786b14aef5 Final adjustments 2025-04-14 23:27:27 +00:00
Jake Poznanski
4d8a8affdb Adjusting prolific script 2025-04-14 23:21:28 +00:00
Jake Poznanski
dc2512c2f0 Adjusted annotation script 2025-04-14 20:27:06 +00:00
aman-17
3a1f98ca65 added more testcases for old_docs 2025-04-14 13:25:20 -07:00
Jake Poznanski
ee41449ff6 Instructions updated in annotation tool 2025-04-14 19:07:13 +00:00
Jake Poznanski
5ebec4664a Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-14 17:14:53 +00:00
Jake Poznanski
0b5cd40664 Staggering model downloads in big sharded jobs 2025-04-14 17:14:51 +00:00
Jake Poznanski
f7529f4e60 Update README.md 2025-04-11 11:23:17 -07:00
Jake Poznanski
b3c3a13e03 Update README.md 2025-04-10 16:06:17 -07:00
Jake Poznanski
52e11d3f38 Update README.md 2025-04-10 16:03:02 -07:00
Jake Poznanski
7b53714e27 Update README.md 2025-04-10 16:02:02 -07:00
Jake Poznanski
d781121e44 Update README.md 2025-04-10 16:00:05 -07:00
Jake Poznanski
3f34969a85 Rendering math in review app 2025-04-10 21:58:32 +00:00
Jake Poznanski
590a92ec2f Ruff fix 2025-04-10 21:50:14 +00:00
aman-17
c7d0510fc1 removed rejected instances and cleaned up 2025-04-08 16:28:40 -07:00
aman-17
4e2a534f84 updated old_docs 2025-04-08 16:02:51 -07:00
Jake Poznanski
4e990e2584 Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-08 22:31:01 +00:00
Jake Poznanski
a13a50143a Formatting, fixes to annotation tool 2025-04-08 22:30:59 +00:00
Jake Poznanski
c7ddad0cc0 Decent prompt 2025-04-08 14:55:12 -07:00
Jake Poznanski
cf0d07d8d7 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-08 14:09:30 -07:00
Jake Poznanski
df6a96d90d Prompting improving 2025-04-08 14:09:28 -07:00
Jake Poznanski
a74800f528 New flowchart based annotation tool 2025-04-08 21:04:56 +00:00
Jake Poznanski
cdc7fae4f9 Adjusting annotation script 2025-04-08 20:50:00 +00:00
Jake Poznanski
2f74a2a996 Prompt6 for qwen2.5 vl 2025-04-08 13:25:15 -07:00
aman-17
3e7d4b17ec update 2025-04-08 13:21:34 -07:00
aman-17
92e168a91e added old docs 2025-04-08 11:38:19 -07:00
Jake Poznanski
8c287a0255 Basic prompt edits 2025-04-08 10:28:41 -07:00
Jake Poznanski
ecbd3a246f Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 21:19:37 -07:00
Jake Poznanski
474e0ef6ed Lint fixes, adjusting qwen2.5 vl prompt 2025-04-07 21:19:36 -07:00
Jake Poznanski
1fc548d2b5 Update README.md 2025-04-07 20:01:22 -07:00
Jake Poznanski
aa5cb95169 Typos fixed up 2025-04-07 16:31:57 -07:00
Jake Poznanski
141fc69cd4 Vllm based qwen2.5 evals 2025-04-07 15:17:14 -07:00
Jake Poznanski
9d8a4cf478 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 14:53:32 -07:00
Jake Poznanski
f5641c68d9 Convert script updated a bit 2025-04-07 14:53:28 -07:00
Jake Poznanski
500dedc11c Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-07 21:39:57 +00:00
Jake Poznanski
f0d18e8b80 Final version for prolific 2025-04-07 21:39:55 +00:00
Jake Poznanski
ae4fda7429 Bugfixes 2025-04-07 14:15:32 -07:00
Jake Poznanski
aa5837074e Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 14:13:53 -07:00
Jake Poznanski
613a4f3758 Adding additional runners and updating convert script 2025-04-07 14:13:52 -07:00
Jake Poznanski
b97e90ce3a Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-07 20:27:35 +00:00
Jake Poznanski
b626b4a1e1 Adjusting labeling task 2025-04-07 20:27:32 +00:00
Jake Poznanski
b607aecbbc Lints 2025-04-07 10:21:35 -07:00
Jake Poznanski
95b03a1df0 Fixing gemini conver script to use new API 2025-04-07 10:20:58 -07:00
Jake Poznanski
3d1925067b Removing progress bar in annotation UI 2025-04-04 21:41:36 +00:00
Jake Poznanski
caf21b9664 Lints 2025-04-04 19:45:38 +00:00