1803 Commits

Author SHA1 Message Date
Jake Poznanski
cce7a6c4de Adding more row span col span tests 2025-10-24 21:50:32 +00:00
Jake Poznanski
c9f0b2c709 Table checking code refactored 2025-10-24 21:37:25 +00:00
Jake Poznanski
81d11ced06 Simplifying table test code 2025-10-24 20:28:41 +00:00
Jake Poznanski
e25b4e4bac Table parsing improved 2025-10-24 20:12:01 +00:00
Jake Poznanski
8633256ddb Ok, making the heading crawls a bit different, it's nice now 2025-10-24 20:06:45 +00:00
Jake Poznanski
8079609004 Launching qwen3vl version again 2025-10-24 19:50:28 +00:00
Jake Poznanski
f31f68a4d2 New transformers for qwen3 vl stuff 2025-10-24 16:56:39 +00:00
Jake Poznanski
ba9e14154e Setting new transformers version for training 2025-10-23 22:27:04 +00:00
Jake Poznanski
1ef66fd313 Working on table parsing 2025-10-23 21:19:27 +00:00
Jake Poznanski
88937c6e40 Prepping to train qwen3 vl 2025-10-23 18:54:38 +00:00
Jake Poznanski
2d2c2c9202 Adjusted table 2025-10-23 18:43:16 +00:00
Jake Poznanski
7d551ecb56 Cleanup bench results table 2025-10-23 18:42:51 +00:00
Jake Poznanski
7c5a3854d2 Copy over results table to main readme 2025-10-23 18:42:07 +00:00
Jake Poznanski
2c90e6aaf2 More readme updated with paper benchmark scores 2025-10-23 18:36:11 +00:00
Jake Poznanski
53be91a0ed Add citations and arxiv paper links 2025-10-23 18:26:39 +00:00
Jake Poznanski
ab108a5c5a Lint fixes 2025-10-23 04:03:39 +00:00
Jake Poznanski
210389f0f1 Merge branch 'main' of https://github.com/allenai/olmocr 2025-10-22 22:22:50 +00:00
Jake Poznanski
5083967589 Cleaning up repo to move all unit tests to a consistent place 2025-10-22 22:22:49 +00:00
Jake Poznanski
e7d6036ab3 Add files via upload 2025-10-22 09:30:40 -07:00
Jake Poznanski
d41584772e Update README.md 2025-10-22 09:09:09 -07:00
Jake Poznanski
f5569fc443 Add files via upload 2025-10-22 09:07:03 -07:00
Jake Poznanski
197be00aa4 Update README.md 2025-10-22 08:44:11 -07:00
Jake Poznanski
da3b1a8b60 Update README.md 2025-10-22 08:23:44 -07:00
Luca Soldaini
ffa4ecc9c2 Add files via upload 2025-10-22 10:49:31 -04:00
Kyle Lo
4a4e5a5406 paper 2025-10-22 02:47:50 -05:00
Jake Poznanski
f5fad405c0 Bump version to v0.4.2 for release v0.4.2 2025-10-22 04:49:12 +00:00
Jake Poznanski
ee37a6a0ac Updating tests and a few CI fixes 2025-10-22 04:49:01 +00:00
Jake Poznanski
fe0bde009c Bump version to v0.4.1 for release v0.4.1 2025-10-22 04:27:51 +00:00
Jake Poznanski
b21c933af2 Version bump to rebuild 2025-10-22 04:27:47 +00:00
Jake Poznanski
970c5d08d0 Adding dependencies so unit tests run in CI 2025-10-22 04:26:51 +00:00
Jake Poznanski
a426f0c462 Bump version to v0.4.0 for release v0.4.0 2025-10-22 04:13:54 +00:00
Jake Poznanski
ea414c9e71 Version bump 2025-10-22 04:13:32 +00:00
Jake Poznanski
87137db70c Merge branch 'jakep/new_data' 2025-10-22 04:12:09 +00:00
Jake Poznanski
4b8146c532 Unit test fixes 2025-10-22 03:47:58 +00:00
Jake Poznanski
3786c4c5ba Renaming 2025-10-21 20:15:26 +00:00
Jake Poznanski
5c16f52d3b Paddle vl benchmark runner saves off data 2025-10-21 20:09:39 +00:00
Jake Poznanski
0c3d2d2e16 One more args fix 2025-10-20 22:38:49 +00:00
Jake Poznanski
8118680b4b Fixes 2025-10-20 22:12:33 +00:00
Jake Poznanski
7472ef905e More args 2025-10-20 22:09:05 +00:00
Jake Poznanski
3d3fd78499 Test 2025-10-20 22:08:19 +00:00
Jake Poznanski
d211276a73 Adjust again 2025-10-20 22:07:29 +00:00
Jake Poznanski
096cb3e521 Ugh 2025-10-20 22:02:22 +00:00
Jake Poznanski
255ee48594 Fixing other way to run benchmark 2025-10-20 22:00:32 +00:00
Jake Poznanski
eaf83026d3 Lints 2025-10-20 18:43:13 +00:00
Jake Poznanski
4fc9cd112b Improving docs 2025-10-20 18:42:49 +00:00
Jake Poznanski
47ed6bbe66 VLLM based nanonets ocr2 2025-10-20 17:32:30 +00:00
Jake Poznanski
76e05f8165 Fixes, adding more runners 2025-10-20 17:11:12 +00:00
Jake Poznanski
e796448482 Adding paddlevl script 2025-10-20 16:26:08 +00:00
Jake Poznanski
7a744cc0b4 Final docs on model setup 2025-10-19 18:21:45 +00:00
Jake Poznanski
6c32ff2c7d Update dates 2025-10-16 18:21:18 +00:00