Jake Poznanski
|
587b73f23e
|
Try with more aggressive anchor changing
|
2025-05-29 22:33:16 +00:00 |
|
Jake Poznanski
|
8f5d5bdf28
|
Revert "Trying to add repetition penalty"
This reverts commit 90f754e7b182f5978f60f5e4734f6ebb0aa3e735.
|
2025-05-29 21:59:23 +00:00 |
|
Jake Poznanski
|
90f754e7b1
|
Trying to add repetition penalty
|
2025-05-29 21:27:13 +00:00 |
|
Jake Poznanski
|
9dcdef6ca3
|
Going to try with up to 5k tokens
|
2025-05-29 20:34:05 +00:00 |
|
Jake Poznanski
|
8d92620d3c
|
Merge remote-tracking branch 'origin/main' into retry_improvements
|
2025-05-29 20:33:45 +00:00 |
|
Jake Poznanski
|
cd5b524d20
|
Some benchmark cleanup
|
2025-05-29 20:32:25 +00:00 |
|
Jake Poznanski
|
2cb14cceae
|
Allowing more tokens
|
2025-05-29 19:59:58 +00:00 |
|
Jake Poznanski
|
22ee068d88
|
Merge remote-tracking branch 'origin/main' into retry_improvements
|
2025-05-29 18:25:10 +00:00 |
|
Jake Poznanski
|
fbcd82ad30
|
Cleanup attempt lookup code a bit
|
2025-05-29 16:01:26 +00:00 |
|
aman-17
|
ce616c6514
|
addressed Jake's comments
|
2025-05-28 19:01:01 -07:00 |
|
aman-17
|
8a63093663
|
fixed lint
|
2025-05-28 14:45:07 -07:00 |
|
aman-17
|
cd5db7f281
|
fixed style and lint
|
2025-05-28 14:42:07 -07:00 |
|
aman-17
|
acc2687f21
|
Updated dockerfile and added a file
|
2025-05-28 14:35:23 -07:00 |
|
Jake Poznanski
|
f8fd234093
|
Idea to improve retry performance
|
2025-05-28 18:27:40 +00:00 |
|
Jake Poznanski
|
76270f5538
|
Upping to v70 to test new docker builds
|
2025-05-23 20:09:45 +00:00 |
|
Jake Poznanski
|
bea1873300
|
Update README.md
|
2025-05-23 11:32:02 -07:00 |
|
Jake Poznanski
|
7996a7dac4
|
Update README.md
|
2025-05-22 16:00:29 -07:00 |
|
Jake Poznanski
|
71275cce76
|
Bumping version, adding more docs, more to come
|
2025-05-20 16:42:21 +00:00 |
|
Jake Poznanski
|
8d8e32331a
|
Adding markdown flag to directly generate markdown outputs
|
2025-05-19 19:42:48 +00:00 |
|
Jake Poznanski
|
2c1c8a693b
|
Updating readme more
|
2025-05-19 17:30:21 +00:00 |
|
Jake Poznanski
|
db9972c39a
|
Readme updates
|
2025-05-19 16:56:22 +00:00 |
|
Jake Poznanski
|
c97ce8bcd4
|
Lints
|
2025-05-16 22:40:54 +00:00 |
|
Jake Poznanski
|
08806fdec6
|
Fixups
|
2025-05-16 21:32:24 +00:00 |
|
Jake Poznanski
|
10b5e9e31e
|
Includes
|
2025-05-16 21:30:09 +00:00 |
|
Jake Poznanski
|
63aee2c1e5
|
Code cleanup, version bump, remove unused permutation test
|
2025-05-16 21:25:32 +00:00 |
|
Jake Poznanski
|
5de52e7d13
|
Update README.md
|
2025-05-16 14:20:21 -07:00 |
|
Jake Poznanski
|
0da6fa0c59
|
Update README.md
|
2025-05-15 20:41:37 -07:00 |
|
Jake Poznanski
|
f0768bba3e
|
Merge branch 'main' of https://github.com/allenai/olmocr
|
2025-05-15 22:50:30 +00:00 |
|
Jake Poznanski
|
c4a0fb9af5
|
Adding back in proper CI estimation
|
2025-05-15 22:50:29 +00:00 |
|
Aman Rangapur
|
d047bc6712
|
Updated README.md
|
2025-05-15 11:34:07 -07:00 |
|
Jake Poznanski
|
ffee4c9740
|
Big bug fix, moving the prompt to match how training was done, 2.3 point boost on olmocr-bench
|
2025-05-14 19:51:00 +00:00 |
|
Jake Poznanski
|
2e8753af26
|
Docling runner based on CLI, but it's too slow to use. PII rule fixes
|
2025-05-14 16:31:56 +00:00 |
|
Jake Poznanski
|
74ef2b6f65
|
Fixes for some PII taggers
|
2025-05-13 16:19:50 +00:00 |
|
Jake Poznanski
|
b3b405d077
|
dedupe script
|
2025-05-12 17:02:35 +00:00 |
|
Jake Poznanski
|
1538163f6f
|
Merge branch 'main' of https://github.com/allenai/olmocr
|
2025-05-10 17:41:44 +00:00 |
|
Jake Poznanski
|
623c66c85c
|
Fixing up tagging pipeline
|
2025-05-10 17:41:43 +00:00 |
|
Jake Poznanski
|
1c59130b55
|
Update README.md
|
2025-05-09 14:51:18 -07:00 |
|
Jake Poznanski
|
225b705eef
|
Update README.md
|
2025-05-09 14:48:49 -07:00 |
|
Jake Poznanski
|
03db04cb7e
|
Fixing handling of new lines in some test cases
|
2025-05-08 17:21:06 +00:00 |
|
Aman Rangapur
|
6f62e05b1f
|
Merge pull request #188 from allenai/amanr/miners
added checker for `hea_foo` and miner to get `old_scans` img's
|
2025-05-07 11:41:29 -07:00 |
|
Jake Poznanski
|
ef083bf845
|
Stats fix
|
2025-05-06 21:21:06 +00:00 |
|
Jake Poznanski
|
a2ec95e0f5
|
Testing out to see where we stand on qwen2.5
|
2025-05-05 17:15:09 +00:00 |
|
aman-17
|
57720564ee
|
fixed lint and style
|
2025-05-02 16:24:03 -07:00 |
|
aman-17
|
281ca51916
|
added checker for hea_foo and miner to get old scans img's
|
2025-05-02 16:22:45 -07:00 |
|
Jake Poznanski
|
97e4992a3f
|
Merge branch 'main' of https://github.com/allenai/olmocr
|
2025-05-02 21:51:24 +00:00 |
|
Jake Poznanski
|
dcbe6543b8
|
Report for benchmarking
|
2025-05-02 21:51:23 +00:00 |
|
Jake Poznanski
|
18de822269
|
Update README.md
|
2025-05-01 13:31:19 -07:00 |
|
Jake Poznanski
|
472ee108d7
|
Lints
|
2025-04-30 21:18:59 +00:00 |
|
Jake Poznanski
|
0a320e9870
|
Some helper scripts for Aman
|
2025-04-30 18:47:10 +00:00 |
|
Jake Poznanski
|
1067f80160
|
Update README.md
|
2025-04-29 15:43:43 -07:00 |
|