Merge branch 'main' of https://github.com/allenai/olmocr into main

Jake Poznanski 2025-04-23 15:55:04 -07:00
commit 811d267bd5

@@ -105,7 +105,7 @@ Several categories of tests have been made so far:
- [ ] Review math equations in old_scans_math.jsonl using chat gpt script
- [X] Add test category of long_texts which are still ~1 standard printed page, but with dense/small text
- [ ] Review multicolumn_tests, make sure they are correct, clean, and don't have order tests between regions
-- [ ] Remove [] and other special symbols from old_scans
+- [X] Remove [] and other special symbols from old_scans
- [ ] Full review of old_scans, somehow, chatgpt or prolific
- [ ] Adjust scoring to weight each test category equally in final score distribution
- [ ] Double check marker inline math outputs