918 Commits

Author SHA1 Message Date
aman-17
c7d0510fc1 removed rejected instances and cleaned up 2025-04-08 16:28:40 -07:00
aman-17
4e2a534f84 updated old_docs 2025-04-08 16:02:51 -07:00
aman-17
3e7d4b17ec update 2025-04-08 13:21:34 -07:00
aman-17
92e168a91e added old docs 2025-04-08 11:38:19 -07:00
Jake Poznanski
8c287a0255 Basic prompt edits 2025-04-08 10:28:41 -07:00
Jake Poznanski
ecbd3a246f Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 21:19:37 -07:00
Jake Poznanski
474e0ef6ed Lint fixes, adjusting qwen2.5 vl prompt 2025-04-07 21:19:36 -07:00
Jake Poznanski
1fc548d2b5 Update README.md 2025-04-07 20:01:22 -07:00
Jake Poznanski
aa5cb95169 Typos fixed up 2025-04-07 16:31:57 -07:00
Jake Poznanski
141fc69cd4 Vllm based qwen2.5 evals 2025-04-07 15:17:14 -07:00
Jake Poznanski
9d8a4cf478 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 14:53:32 -07:00
Jake Poznanski
f5641c68d9 Convert script updated a bit 2025-04-07 14:53:28 -07:00
Jake Poznanski
500dedc11c Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-07 21:39:57 +00:00
Jake Poznanski
f0d18e8b80 Final version for prolific 2025-04-07 21:39:55 +00:00
Jake Poznanski
ae4fda7429 Bugfixes 2025-04-07 14:15:32 -07:00
Jake Poznanski
aa5837074e Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-07 14:13:53 -07:00
Jake Poznanski
613a4f3758 Adding additional runners and updating convert script 2025-04-07 14:13:52 -07:00
Jake Poznanski
b97e90ce3a Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-07 20:27:35 +00:00
Jake Poznanski
b626b4a1e1 Adjusting labeling task 2025-04-07 20:27:32 +00:00
Jake Poznanski
b607aecbbc Lints 2025-04-07 10:21:35 -07:00
Jake Poznanski
95b03a1df0 Fixing gemini convert script to use new API 2025-04-07 10:20:58 -07:00
Jake Poznanski
3d1925067b Removing progress bar in annotation UI 2025-04-04 21:41:36 +00:00
Jake Poznanski
caf21b9664 Lints 2025-04-04 19:45:38 +00:00
Jake Poznanski
f1188dc85d Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-04 19:44:55 +00:00
Jake Poznanski
a0f8b028f8 Reporting results 2025-04-04 19:44:54 +00:00
Jake Poznanski
cc7b1131c6 Editing 2025-04-04 19:38:59 +00:00
Jake Poznanski
9338f5359f Saving pdf paths 2025-04-04 19:36:10 +00:00
Jake Poznanski
ee70b68a19 Merge pull request #164 from allenai/amanr/multi_columns (Added multi_column miner script) 2025-04-04 12:33:25 -07:00
Jake Poznanski
c8cc61b95f Merge pull request #163 from franzbischoff/main (Add script to convert JSONL files to Markdown format) 2025-04-04 12:30:54 -07:00
aman-17
71e44a1b4e fixed style 2025-04-04 11:10:00 -07:00
aman-17
9fd7bc8a96 added multi_column script 2025-04-04 11:01:59 -07:00
Jake Poznanski
61624a37ff Fixed 2025-04-04 17:53:26 +00:00
Jake Poznanski
d299119c65 Links updated 2025-04-04 17:18:41 +00:00
Jake Poznanski
a113fd3015 Review app 2025-04-04 17:18:19 +00:00
Jake Poznanski
e8c14fc496 Saving prolific codes 2025-04-04 17:12:46 +00:00
Jake Poznanski
cd9e370c92 Tinyhosting automatically 2025-04-04 16:29:58 +00:00
Jake Poznanski
02cd002488 Step by step annotation 2025-04-04 16:19:04 +00:00
Jake Poznanski
6a0dbfc925 Adjusting buttons 2025-04-04 16:05:04 +00:00
Francisco Bischoff
c2193ddc93 Remove first line 2025-04-04 16:44:21 +01:00
Francisco Bischoff
c96143c3b1 Add script to convert JSONL files to Markdown format 2025-04-04 12:52:58 +01:00
Jake Poznanski
d4d87f7c65 Force flag for review app, tests fixed for difference comparison in tables 2025-04-03 20:27:01 +00:00
Jake Poznanski
e856e9de1d Test mining not including line numbers 2025-04-02 23:07:32 +00:00
Jake Poznanski
2614fc9050 Merge branch 'main' of https://github.com/allenai/olmocr 2025-04-02 21:46:35 +00:00
Jake Poznanski
a96f1541c4 Hopefully avoiding comparison issues now 2025-04-02 21:46:34 +00:00
Jake Poznanski
46ca990663 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-04-02 14:46:13 -07:00
Jake Poznanski
0d94d15341 Test validation 2025-04-02 14:46:07 -07:00
Jake Poznanski
b8b780faca More mining of synthetic tests code 2025-04-02 21:39:50 +00:00
Jake Poznanski
360b1be07c Better filtering of tests 2025-04-02 21:24:00 +00:00
Jake Poznanski
6d3a7d634e Adding autorender if katex into synthetic pipeline 2025-04-02 21:14:14 +00:00
Jake Poznanski
4604b59661 Synth mining 2025-04-02 20:25:16 +00:00