708 Commits

Author SHA1 Message Date
Jake Poznanski
7e02e199ba Adjusting tools to include html templates 2025-02-14 21:42:59 +00:00
Jake Poznanski
81f543019e Update README.md 2025-02-14 13:30:39 -08:00
Jake Poznanski
08f76121c3 Bump version to v0.1.53 for release v0.1.53 2025-02-14 20:55:21 +00:00
Jake Poznanski
58bdfa512b CI 2025-02-14 20:51:04 +00:00
Jake Poznanski
25ec87b66d CI 2025-02-14 20:46:55 +00:00
Jake Poznanski
c05e01532c Hopefully CI runs now 2025-02-14 20:42:19 +00:00
Jake Poznanski
15f9b8b9dc Install poppler in CI 2025-02-14 20:02:05 +00:00
Jake Poznanski
229da8cb17 unused imports 2025-02-14 19:54:48 +00:00
Jake Poznanski
32aa359458 Formatting fix 2025-02-14 19:50:19 +00:00
Jake Poznanski
0dcdbcc61a Update README.md 2025-02-14 11:07:42 -08:00
Jake Poznanski
6583fb641a hfupload scripts 2025-02-14 17:36:00 +00:00
kyleclo
86b17d0ea3 add boxplot drawing 2025-02-13 19:38:09 -08:00
kyleclo
a790ba73ee update args; include output 2025-02-13 17:06:36 -08:00
kyleclo
88c18b3afa human eval data; elo ratings script; dependencies 2025-02-13 16:59:09 -08:00
Jake Poznanski
8297955290 Making my parquets 2025-02-14 00:02:07 +00:00
Jake Poznanski
51cfdbd64f Better converter 2025-02-13 22:30:20 +00:00
Jake Poznanski
e369569f99 Update README.md 2025-02-13 13:46:02 -08:00
Jake Poznanski
91eef279b3 Adding some gnarly 1 pager pdfs from kyle 2025-02-11 18:45:42 +00:00
Jake Poznanski
87cb9573d8 First pass at dataset builder script 2025-02-11 18:38:41 +00:00
Jake Poznanski
6ed6f85c42 Generating parquets for hugging face 2025-02-10 23:12:38 +00:00
Jake Poznanski
84c0c71393 Merge branch 'main' of https://github.com/allenai/olmocr 2025-02-10 22:00:42 +00:00
Jake Poznanski
7d67a59c31 Remove unused 2025-02-10 22:00:40 +00:00
Jake Poznanski
6471f28ec8 Random git ignores, remove unused code 2025-02-10 22:00:35 +00:00
Jake Poznanski
f04d1207a5 Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-02-10 12:40:29 -08:00
Jake Poznanski
e73ff9d7a1 Updating to new model name on HF 2025-02-10 12:39:49 -08:00
Jake Poznanski
e627842b77 Merge pull request #28 from allenai/amanr/code_documentation: Resolved Git checks and updated readme 2025-02-10 11:53:54 -08:00
aman-17
f57c6f3f7b restored modeling_molmo.py file 2025-02-10 11:07:35 -08:00
aman-17
4bff92053b updated changelog 2025-02-07 16:34:53 -08:00
aman-17
b6e5dab306 fixed lint check 2025-02-07 16:29:27 -08:00
aman-17
a036133fdd resolved all the mypy, black and isort issues and updated readme 2025-02-07 16:05:00 -08:00
Jake Poznanski
9bf3d35cdb Comment fix 2025-01-30 16:02:08 -08:00
Jake Poznanski
2ab7cb280c Removing pymupdf 2025-01-30 15:51:54 -08:00
Jake Poznanski
ddeea92591 More dev dependecies 2025-01-30 15:38:29 -08:00
Jake Poznanski
72f4b9a590 Project setup 2025-01-30 15:33:04 -08:00
Jake Poznanski
cdd830235f Shortened some sample docs 2025-01-30 15:28:31 -08:00
Jake Poznanski
10094ffc19 Even newer mypy crashes still 2025-01-30 14:32:08 -08:00
Jake Poznanski
c74d47a553 Pipeline fixes 2025-01-30 22:30:39 +00:00
Jake Poznanski
04844b3f87 More beaker and docker fixes 2025-01-30 22:14:57 +00:00
Jake Poznanski
9df86da271 Beaker fixes 2025-01-30 21:44:22 +00:00
Jake Poznanski
cf6673cecf Pipeline fixes 2025-01-30 13:42:42 -08:00
Jake Poznanski
7fbbb572ae Remove mypy for now 2025-01-30 13:37:01 -08:00
Jake Poznanski
d36e556f19 Hopefully fixes build 2025-01-30 13:11:37 -08:00
Jake Poznanski
c69e0d6762 More cleanup, removing dead adv anchor code 2025-01-30 12:58:11 -08:00
Jake Poznanski
d4d711d12a Nicer glob handing for pipeline.py 2025-01-30 12:48:10 -08:00
Jake Poznanski
84477b50f4 More formatting 2025-01-30 10:54:21 -08:00
Jake Poznanski
e3d04ee79f Merge branch 'main' of https://github.com/allenai/olmocr into main 2025-01-30 10:53:40 -08:00
Jake Poznanski
c37e545d25 running isort again 2025-01-30 10:53:35 -08:00
Jake Poznanski
358a24f6cb Update README.md 2025-01-30 10:33:54 -08:00
Jake Poznanski
c58e13392b Update README.md 2025-01-30 10:28:57 -08:00
Jake Poznanski
2c2953329e Fixing most ruff errors 2025-01-29 15:57:26 -08:00