1764 Commits

Author SHA1 Message Date
Jake Poznanski
0c3d2d2e16 One more args fix 2025-10-20 22:38:49 +00:00
Jake Poznanski
8118680b4b Fixes 2025-10-20 22:12:33 +00:00
Jake Poznanski
7472ef905e More args 2025-10-20 22:09:05 +00:00
Jake Poznanski
3d3fd78499 Test 2025-10-20 22:08:19 +00:00
Jake Poznanski
d211276a73 Adjust again 2025-10-20 22:07:29 +00:00
Jake Poznanski
096cb3e521 Ugh 2025-10-20 22:02:22 +00:00
Jake Poznanski
255ee48594 Fixing other way to run benchmark 2025-10-20 22:00:32 +00:00
Jake Poznanski
eaf83026d3 Lints 2025-10-20 18:43:13 +00:00
Jake Poznanski
4fc9cd112b Improving docs 2025-10-20 18:42:49 +00:00
Jake Poznanski
47ed6bbe66 VLLM based nanonets ocr2 2025-10-20 17:32:30 +00:00
Jake Poznanski
76e05f8165 Fixes, adding more runners 2025-10-20 17:11:12 +00:00
Jake Poznanski
e796448482 Adding paddlevl script 2025-10-20 16:26:08 +00:00
Jake Poznanski
7a744cc0b4 Final docs on model setup 2025-10-19 18:21:45 +00:00
Jake Poznanski
6c32ff2c7d Update dates 2025-10-16 18:21:18 +00:00
Jake Poznanski
b1b29b2206 Cleanup of paths 2025-10-15 21:31:32 +00:00
Jake Poznanski
80f18cc2bc Fixes 2025-10-15 21:14:53 +00:00
Jake Poznanski
5695e46a21 Adding docs, refactoring how urls are passed in 2025-10-15 21:12:15 +00:00
Jake Poznanski
ab7b02a431 Readme updates 2025-10-15 20:07:02 +00:00
Jake Poznanski
b44c30b482 Starting to add support for parasail 2025-10-15 20:02:41 +00:00
Jake Poznanski
e2a5d9f8f3 Cleaning up dependencies 2025-10-15 19:40:58 +00:00
Jake Poznanski
569311c461 Workspace stuff 2025-10-14 22:54:03 +00:00
Jake Poznanski
36d6228ffa Prepare workspace fix 2025-10-14 22:48:59 +00:00
Jake Poznanski
05d85264ca Cleaning up some table test creation stuff, but it's still not great 2025-10-14 20:20:24 +00:00
Jake Poznanski
08a7c32b62 A few more fixes 2025-10-14 18:58:36 +00:00
Jake Poznanski
654fdc3271 Adjusting step 0 filtering 2025-10-14 18:14:34 +00:00
Jake Poznanski
da1607c0c0 Refinement 2025-10-14 18:12:45 +00:00
Jake Poznanski
93e8a0663d Adding meta tags to head with git version, also filtering out badly rotated docs 2025-10-14 16:30:03 +00:00
Jake Poznanski
a17aa6f94d Fixing up some things with mine_html_templates 2025-10-14 16:07:50 +00:00
Jake Poznanski
52c6dcd523 More reliable mine html templates 2025-10-13 22:27:34 +00:00
Jake Poznanski
aa239eb34c Lints 2025-10-13 21:15:19 +00:00
Jake Poznanski
369fd4d23a Adjusting some things 2025-10-13 21:14:53 +00:00
Jake Poznanski
9480508642 Mineru 2025-10-13 20:47:52 +00:00
Jake Poznanski
417fbed4ad Fix 2025-10-13 19:46:27 +00:00
Jake Poznanski
7d6db61446 Mineru runner 2025-10-13 19:43:39 +00:00
Jake Poznanski
7487e3673a More graceful tar extraction 2025-10-13 17:27:45 +00:00
Jake Poznanski
5b81bc61c6 Filtering downloads 2025-10-13 17:22:57 +00:00
Jake Poznanski
b86e3071da More bench results 2025-10-13 16:37:08 +00:00
Jake Poznanski
62faa003d3 Fix for some corrupted data 2025-10-10 22:34:32 +00:00
Jake Poznanski
fc4934c9b4 URL packaging 2025-10-10 16:52:42 +00:00
Jake Poznanski
87a2b8a9a3 More lint fixes 2025-10-09 22:16:46 +00:00
Jake Poznanski
875337f962 Lints 2025-10-09 22:12:19 +00:00
Jake Poznanski
702c42f8e7 Packaging working better now 2025-10-09 22:12:02 +00:00
Jake Poznanski
557bb9a5e9 Repackager is still not working right 2025-10-09 22:01:01 +00:00
Jake Poznanski
4c21e15d0e Packaging and repackaging test works 2025-10-09 21:52:05 +00:00
Jake Poznanski
9f4a2d4177 Tests 2025-10-09 21:42:32 +00:00
Jake Poznanski
35fc9ca025 Testing the packager 2025-10-09 21:30:38 +00:00
Jake Poznanski
74eb910b95 Now you can just run pytest . cleanly 2025-10-09 20:31:28 +00:00
Jake Poznanski
f01f7183e4 Test fixes 2025-10-09 20:28:29 +00:00
Jake Poznanski
bc8c044dd4 Preparing olmocr mix packaging scripts 2025-10-09 20:14:43 +00:00
Jake Poznanski
743e48361c New claude sonnet, going to add multilingual tests to olmocr bench 1025 internal version 2025-10-09 19:43:22 +00:00