diff --git a/olmocr/bench/README.md b/olmocr/bench/README.md
index 4e32c98..8018c09 100644
--- a/olmocr/bench/README.md
+++ b/olmocr/bench/README.md
@@ -105,7 +105,7 @@ Several categories of tests have been made so far:
  - [ ] Review math equations in old_scans_math.jsonl using chat gpt script
  - [X] Add test category of long_texts which are still ~1 standard printed page, but with dense/small text
  - [ ] Review multicolumn_tests, make sure they are correct, clean, and don't have order tests between regions
- - [ ] Remove [] and other special symbols from old_scans
+ - [X] Remove [] and other special symbols from old_scans
  - [ ] Full review of old_scans, somehow, chatgpt or prolific
  - [ ] Adjust scoring to weight each test category equally in final score distribution
  - [ ] Double check marker inline math outputs