Jake Poznanski
|
386374bd72
|
More prints
|
2024-11-25 16:08:24 -08:00 |
|
Jake Poznanski
|
04d6123037
|
Doing some experiments
|
2024-11-25 15:36:04 -08:00 |
|
Jake Poznanski
|
51614efc83
|
More log probs investigation
|
2024-11-25 11:24:21 -08:00 |
|
Jake Poznanski
|
28d52602e9
|
More test code
|
2024-11-25 11:00:03 -08:00 |
|
Jake Poznanski
|
606e81bfea
|
Not happy here with this test
|
2024-11-25 10:32:18 -08:00 |
|
Jake Poznanski
|
d7838372e8
|
Full test
|
2024-11-25 10:25:55 -08:00 |
|
Jake Poznanski
|
2e4f7d7827
|
Working on HF test for comparison
|
2024-11-25 10:12:29 -08:00 |
|
Jake Poznanski
|
5e3080db28
|
Sglang based unit test
|
2024-11-25 09:48:05 -08:00 |
|
Jake Poznanski
|
60f24ad2d6
|
tests
|
2024-11-25 09:39:55 -08:00 |
|
Jake Poznanski
|
5289092076
|
Starting on sglang test
|
2024-11-25 09:34:59 -08:00 |
|
Jake Poznanski
|
ba8eba245b
|
Unit tests fixes
|
2024-11-25 09:13:13 -08:00 |
|
Jake Poznanski
|
c9e1a4c540
|
More tests
|
2024-11-20 19:37:00 +00:00 |
|
Jake Poznanski
|
8793fc7d99
|
Adding more retries, and it was able to process more complicated books
|
2024-11-18 14:25:32 -08:00 |
|
Jake Poznanski
|
e499413089
|
Better work queue
|
2024-11-18 11:04:51 -08:00 |
|
Jake Poznanski
|
04429b2862
|
Basic work queue from claude
|
2024-11-18 10:07:03 -08:00 |
|
Jake Poznanski
|
fcabb8e55a
|
Handling more error cases
|
2024-11-18 09:12:04 -08:00 |
|
Jake Poznanski
|
96984fcd77
|
Fix a reliability issue
|
2024-11-18 09:03:24 -08:00 |
|
Jake Poznanski
|
6a4a55f9e0
|
Hopefully working molmo HF trainer config
|
2024-10-30 14:00:27 -07:00 |
|
Jake Poznanski
|
bede854cd5
|
Starting to write molmo formatters
|
2024-10-30 13:24:11 -07:00 |
|
Jake Poznanski
|
85e0e2a61b
|
Fixing issues with pdf parsing
|
2024-10-30 16:26:02 +00:00 |
|
Jake Poznanski
|
08d51b7183
|
Adding some rotation retry control
|
2024-10-28 20:16:06 +00:00 |
|
Jake Poznanski
|
ffe470bf0e
|
Fix
|
2024-10-23 22:55:50 +00:00 |
|
Jake Poznanski
|
180dde03c5
|
dataprep sampling tests
|
2024-10-23 22:53:05 +00:00 |
|
Jake Poznanski
|
999f64dd46
|
Adding empty anchor support
|
2024-10-23 22:17:20 +00:00 |
|
Jake Poznanski
|
a1a4798ce7
|
Some crazy idea I had to simplify futures and memory limits
|
2024-10-23 21:51:37 +00:00 |
|
Jake Poznanski
|
302eee3da5
|
Yay matches between birr and hf
|
2024-10-21 16:58:30 +00:00 |
|
Jake Poznanski
|
9d35d3ca8f
|
Birr tokenization test
|
2024-10-18 23:02:37 +00:00 |
|
Jake Poznanski
|
7dbcbc154b
|
Birr tests that don't do anything but help me understand the universe
|
2024-10-18 22:39:17 +00:00 |
|
Jake Poznanski
|
dd4f9670b5
|
Filter refactor
|
2024-10-17 22:36:38 +00:00 |
|
Jake Poznanski
|
7d4cff53b5
|
Nice test for picking proper page in birrpipeline
|
2024-10-17 20:26:02 +00:00 |
|
Jake Poznanski
|
2826bcad18
|
Yay all unit tests pass cleanly now too
|
2024-10-17 17:05:55 +00:00 |
|
Jake Poznanski
|
124aaf5fe0
|
Hmm, can't repro failing anchor case
|
2024-10-17 17:00:02 +00:00 |
|
Jake Poznanski
|
202d81cece
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-10-16 11:38:33 -07:00 |
|
Jake Poznanski
|
e2552b2f28
|
Adding test case
|
2024-10-16 11:38:31 -07:00 |
|
Jake Poznanski
|
3c1b7de293
|
Refactoring of train dataloaders
|
2024-10-16 18:26:25 +00:00 |
|
Jake Poznanski
|
23d129fd2c
|
Organizing around a new style of dataloader
|
2024-10-16 18:06:27 +00:00 |
|
Jake Poznanski
|
a2546e0b04
|
more stuff
|
2024-10-16 17:06:03 +00:00 |
|
Jake Poznanski
|
96682b2ecb
|
Refactoring
|
2024-10-16 16:18:27 +00:00 |
|
Jake Poznanski
|
2cd863ddce
|
Dolma viewer improvements
|
2024-10-16 16:05:44 +00:00 |
|
Jake Poznanski
|
6d53683001
|
More stats hopefully running faster
|
2024-10-14 21:37:14 +00:00 |
|
Jake Poznanski
|
7b161533e2
|
Code to do local inference on fine tuned models for testing
|
2024-10-14 08:38:18 -07:00 |
|
Jake Poznanski
|
2864f907e1
|
Dataloader fix with nicer tests
|
2024-10-10 16:58:45 +00:00 |
|
Jake Poznanski
|
b7c80cd17f
|
Fix up some tests but I don't see why this isn't working
|
2024-10-10 16:58:40 +00:00 |
|
Jake Poznanski
|
a90feda42f
|
bugfixes
|
2024-10-09 20:20:06 +00:00 |
|
Jake Poznanski
|
4bf6e7a430
|
Refactoring
|
2024-10-09 18:11:18 +00:00 |
|
Jake Poznanski
|
dc6440d068
|
Cleaning up anchor text to deal with abnormally long lines
|
2024-10-09 16:29:20 +00:00 |
|
Jake Poznanski
|
230c8a9f9a
|
Trying new run that will rewrite the prompts as it goes
|
2024-10-08 22:10:18 +00:00 |
|
Jake Poznanski
|
97291b3f6a
|
Anchor is fixed to sample text elements better
|
2024-10-08 21:51:43 +00:00 |
|
Jake Poznanski
|
c8a4d14c57
|
Adding image merging to pdf report/hint/anchor
|
2024-10-08 21:23:21 +00:00 |
|
Jake Poznanski
|
ebd40f9084
|
Hopefully fixing dataloader for now
|
2024-10-07 12:59:27 -07:00 |
|