Jake Poznanski
|
a957ab2aaf
|
Adjusting how the blank pages test is run, skipping image tags
|
2025-09-08 17:18:51 +00:00 |
|
Jake Poznanski
|
0710debf75
|
Cleaner front matter reward
|
2025-08-27 19:49:42 +00:00 |
|
Jake Poznanski
|
d70208d98a
|
Moving test code around, adding format reward since some runs stop outputting the front matter in GRPO training
|
2025-08-27 18:22:05 +00:00 |
|
Jake Poznanski
|
8383865392
|
Fixing up subscripts and superscripts in synth data
|
2025-08-27 18:15:36 +00:00 |
|
Jake Poznanski
|
d36357f3db
|
Some fixes to math validation, which was not working otherwise
|
2025-08-22 20:40:14 +00:00 |
|
Jake Poznanski
|
dcc932dc2c
|
Markdown cleanup
|
2025-08-22 17:21:13 +00:00 |
|
Jake Poznanski
|
d2bec31595
|
Markdown front matter corrector
|
2025-08-22 16:43:36 +00:00 |
|
Jake Poznanski
|
0fd7d07e73
|
GRPO reward fixups
|
2025-08-21 18:33:11 +00:00 |
|
Jake Poznanski
|
1dd6ff9b03
|
Olmocr bench grpo stuff
|
2025-08-21 18:17:07 +00:00 |
|
Jake Poznanski
|
cc918ca03e
|
Setting up GRPO trainer
|
2025-08-20 22:18:38 +00:00 |
|
Jake Poznanski
|
41201b6317
|
Lints
|
2025-08-19 21:30:41 +00:00 |
|
Jake Poznanski
|
768cb33937
|
Better filtering coming in
|
2025-08-19 21:22:54 +00:00 |
|
Jake Poznanski
|
84a0c432e7
|
Adding some filtering rules and tests for them
|
2025-08-19 18:14:15 +00:00 |
|
Jake Poznanski
|
93411a80a0
|
Lint fixes
|
2025-08-13 20:21:04 +00:00 |
|
Jake Poznanski
|
05330150ad
|
New work queue code is cleaner
|
2025-08-13 20:20:27 +00:00 |
|
Jake Poznanski
|
6216896102
|
Accidentally committed too many files
|
2025-08-04 20:41:21 +00:00 |
|
Jake Poznanski
|
0536c0e9b8
|
Lint fixes
|
2025-08-04 18:21:47 +00:00 |
|
Jake Poznanski
|
08b263ba46
|
Cumulative rotation support
|
2025-08-04 18:21:31 +00:00 |
|
Jake Poznanski
|
ed8a5d10cf
|
Ok fixed rotation stuff finally
|
2025-08-04 17:53:48 +00:00 |
|
Jake Poznanski
|
e0158df210
|
Adding test file
|
2025-08-04 17:21:40 +00:00 |
|
Jake Poznanski
|
6cdcb06ae7
|
Removing some dead code and adding tests
|
2025-08-04 16:54:42 +00:00 |
|
Jake Poznanski
|
a8d5299433
|
Trying to add a test for rotation correction
|
2025-08-04 16:24:13 +00:00 |
|
Jake Poznanski
|
56296d6927
|
Bringing back a few files
|
2025-07-23 04:49:13 +00:00 |
|
Jake Poznanski
|
b588ae27d2
|
Removing sglang tests, switching to vllm
|
2025-06-17 16:07:16 +00:00 |
|
Jake Poznanski
|
5faf570e30
|
Format fixes
|
2025-05-29 23:23:02 +00:00 |
|
Jake Poznanski
|
f8fd234093
|
Idea to improve retry performance
|
2025-05-28 18:27:40 +00:00 |
|
Jake Poznanski
|
63aee2c1e5
|
Code cleanup, version bump, remove unused permutation test
|
2025-05-16 21:25:32 +00:00 |
|
Jake Poznanski
|
1854ae1269
|
A bit more work on tagging
|
2025-05-09 19:31:07 +00:00 |
|
Jake Poznanski
|
03db04cb7e
|
Fixing handling of new lines in some test cases
|
2025-05-08 17:21:06 +00:00 |
|
Jake Poznanski
|
8f46b6e966
|
Running more tests in CI
|
2025-04-17 14:26:06 -07:00 |
|
Jake Poznanski
|
1d0c560455
|
Upping version to fix issue with work queue and delimited paths
|
2025-04-15 18:50:13 +00:00 |
|
Jake Poznanski
|
79e2677319
|
Hmm, these should be passing!
|
2025-03-14 02:52:13 +00:00 |
|
Jake Poznanski
|
f5d92bdb14
|
Trying to get new CI to work
|
2025-03-14 02:43:55 +00:00 |
|
Chris Wilhelm
|
c585415797
|
for now, only process one pdf in the ci script
|
2025-03-13 15:48:47 -07:00 |
|
Chris Wilhelm
|
9b958e65f1
|
moves what happens where around a bit and updates readme
|
2025-03-13 15:31:55 -07:00 |
|
Chris Wilhelm
|
29b9054749
|
basic docker image and test
|
2025-03-13 15:31:55 -07:00 |
|
aman-17
|
0130a970c2
|
fixed style
|
2025-02-25 08:57:02 -08:00 |
|
Jake Poznanski
|
58bdfa512b
|
CI
|
2025-02-14 20:51:04 +00:00 |
|
Jake Poznanski
|
25ec87b66d
|
CI
|
2025-02-14 20:46:55 +00:00 |
|
Jake Poznanski
|
c05e01532c
|
Hopefully CI runs now
|
2025-02-14 20:42:19 +00:00 |
|
Jake Poznanski
|
91eef279b3
|
Adding some gnarly 1 pager pdfs from kyle
|
2025-02-11 18:45:42 +00:00 |
|
aman-17
|
a036133fdd
|
resolved all the mypy, black and isort issues and updated readme
|
2025-02-07 16:05:00 -08:00 |
|
Jake Poznanski
|
9bf3d35cdb
|
Comment fix
|
2025-01-30 16:02:08 -08:00 |
|
Jake Poznanski
|
2ab7cb280c
|
Removing pymupdf
|
2025-01-30 15:51:54 -08:00 |
|
Jake Poznanski
|
72f4b9a590
|
Project setup
|
2025-01-30 15:33:04 -08:00 |
|
Jake Poznanski
|
cdd830235f
|
Shortened some sample docs
|
2025-01-30 15:28:31 -08:00 |
|
Jake Poznanski
|
10094ffc19
|
Even newer mypy crashes still
|
2025-01-30 14:32:08 -08:00 |
|
Jake Poznanski
|
fb402297ce
|
Isort and black update
|
2025-01-29 15:42:34 -08:00 |
|
Jake Poznanski
|
dcaca8aa90
|
Black formatting
|
2025-01-29 15:30:39 -08:00 |
|
Jake Poznanski
|
4a1762d455
|
isort
|
2025-01-29 15:25:10 -08:00 |
|