Jake Poznanski | c93fc36f72 | Missing import | 2024-11-26 22:22:36 +00:00 |
Jake Poznanski | dd17185cfd | More things to try | 2024-11-23 21:49:33 +00:00 |
Jake Poznanski | 46fe4acc0b | Trying fixes for live lock | 2024-11-23 21:41:49 +00:00 |
Jake Poznanski | 41accfe867 | Error out if you see a broken process pool, might need a better check for this | 2024-11-22 22:07:43 +00:00 |
Jake Poznanski | a95487e44c | Adding check for possible sglang livelock | 2024-11-22 21:50:45 +00:00 |
Jake Poznanski | cff97990bf | Moving to official sglang release | 2024-11-22 19:37:31 +00:00 |
Jake Poznanski | f8dcdf625a | Better catching of httpx errors and retrying them | 2024-11-21 23:35:42 +00:00 |
Jake Poznanski | d6a00135a7 | Faster init by caching pdf filter | 2024-11-21 23:23:11 +00:00 |
Jake Poznanski | a91befc4ad | Fix for fallback stuff | 2024-11-21 11:08:42 -08:00 |
Jake Poznanski | 8c858a9d15 | New version | 2024-11-21 10:49:31 -08:00 |
Jake Poznanski | 66fff4f44b | Merge branch 'main' of https://github.com/allenai/pdelfin | 2024-11-21 18:39:22 +00:00 |
Jake Poznanski | 212d391933 | More conservative filtering | 2024-11-21 18:39:21 +00:00 |
Jake Poznanski | b8b786e003 | Applying pdf filter | 2024-11-21 10:20:58 -08:00 |
Jake Poznanski | cb800d6e2c | Merge branch 'main' of https://github.com/allenai/pdelfin into main | 2024-11-21 08:58:30 -08:00 |
Jake Poznanski | 7dd20460a3 | New version | 2024-11-21 08:58:28 -08:00 |
Jake Poznanski | 219cc7eca8 | Merge branch 'main' of https://github.com/allenai/pdelfin | 2024-11-21 16:56:20 +00:00 |
Jake Poznanski | 98e40143dd | Adding mass filtering script | 2024-11-21 16:56:19 +00:00 |
Jake Poznanski | af8ce518ac | Merge branch 'main' of https://github.com/allenai/pdelfin into main | 2024-11-21 08:45:19 -08:00 |
Jake Poznanski | 9112d81bd1 | No keep alive connection to try to resolve sglang livelock | 2024-11-21 08:45:17 -08:00 |
Jake Poznanski | 2443c22fde | Projected output tokens | 2024-11-20 23:57:10 +00:00 |
Jake Poznanski | 09319a64ea | new version | 2024-11-20 22:58:06 +00:00 |
Jake Poznanski | 53a510479b | Merge branch 'main' of https://github.com/allenai/pdelfin into main | 2024-11-20 14:45:24 -08:00 |
Jake Poznanski | 67d11ec0e6 | TODOs and client fix | 2024-11-20 14:45:12 -08:00 |
Jake Poznanski | 092480573b | Baseline repeat detect | 2024-11-20 19:58:20 +00:00 |
Jake Poznanski | c9e1a4c540 | More tests | 2024-11-20 19:37:00 +00:00 |
Jake Poznanski | 3153aea260 | Merge branch 'main' of https://github.com/allenai/pdelfin into main | 2024-11-20 10:42:39 -08:00 |
Jake Poznanski | 9b8d58b59e | Better stats and metadata | 2024-11-20 10:42:26 -08:00 |
Jake Poznanski | 878a21b48d | Update README.md | 2024-11-20 08:55:57 -08:00 |
Jake Poznanski | 5704bb89ad | Update README.md | 2024-11-20 08:54:30 -08:00 |
Jake Poznanski | 273a8b0d0a | Logging fallback pages | 2024-11-19 15:11:02 -08:00 |
Jake Poznanski | b0acfa870e | Adding support for fallback pages | 2024-11-19 14:59:20 -08:00 |
Jake Poznanski | 204a4a8e5b | Better stats | 2024-11-19 13:41:32 -08:00 |
Jake Poznanski | 3ef4609bdd | Fixing args | 2024-11-19 11:48:45 -08:00 |
Jake Poznanski | 27d23525b7 | Claude recommends httpx instead of aiohttp, seeing if that will help with straggler timeouts | 2024-11-19 10:41:58 -08:00 |
Jake Poznanski | 4469f4b2ce | Version patch | 2024-11-18 19:55:26 -08:00 |
Jake Poznanski | 9e2e09bd06 | More fixes | 2024-11-18 15:04:50 -08:00 |
Jake Poznanski | 8793fc7d99 | Adding more retries, and it was able to process more complicated books | 2024-11-18 14:25:32 -08:00 |
Jake Poznanski | 2f55a3ddb7 | fix | 2024-11-18 13:58:25 -08:00 |
Jake Poznanski | d4d47369cb | more gcs | 2024-11-18 13:20:28 -08:00 |
Jake Poznanski | e48d4bef00 | Fix | 2024-11-18 13:16:19 -08:00 |
Jake Poznanski | 8c3b5753c9 | Gcs support better | 2024-11-18 13:07:27 -08:00 |
Jake Poznanski | 9381bf862a | docs | 2024-11-18 12:44:34 -08:00 |
Jake Poznanski | f287f2451c | Fixing a few stats things | 2024-11-18 11:50:22 -08:00 |
Jake Poznanski | e499413089 | Better work queue | 2024-11-18 11:04:51 -08:00 |
Jake Poznanski | 04429b2862 | Basic work queue from claude | 2024-11-18 10:07:03 -08:00 |
Jake Poznanski | 995b1d15fc | Fixes, mocking out queue into separate file | 2024-11-18 09:55:45 -08:00 |
Jake Poznanski | fcabb8e55a | Handling more error cases | 2024-11-18 09:12:04 -08:00 |
Jake Poznanski | 96984fcd77 | Fix a reliability issue | 2024-11-18 09:03:24 -08:00 |
Jake Poznanski | 0af29f1f44 | Adding page rotation | 2024-11-18 08:29:32 -08:00 |
Jake Poznanski | e2303f28af | Running on l40s, fixing queue | 2024-11-18 08:25:36 -08:00 |