Jake Poznanski
|
386374bd72
|
More prints
|
2024-11-25 16:08:24 -08:00 |
|
Jake Poznanski
|
04d6123037
|
Doing some experiments
|
2024-11-25 15:36:04 -08:00 |
|
Jake Poznanski
|
51614efc83
|
More log probs investigation
|
2024-11-25 11:24:21 -08:00 |
|
Jake Poznanski
|
28d52602e9
|
More test code
|
2024-11-25 11:00:03 -08:00 |
|
Jake Poznanski
|
606e81bfea
|
Not happy here with this test
|
2024-11-25 10:32:18 -08:00 |
|
Jake Poznanski
|
d7838372e8
|
Full test
|
2024-11-25 10:25:55 -08:00 |
|
Jake Poznanski
|
2e4f7d7827
|
Working on HF test for comparison
|
2024-11-25 10:12:29 -08:00 |
|
Jake Poznanski
|
5e3080db28
|
Sglang based unit test
|
2024-11-25 09:48:05 -08:00 |
|
Jake Poznanski
|
60f24ad2d6
|
tests
|
2024-11-25 09:39:55 -08:00 |
|
Jake Poznanski
|
5289092076
|
Starting on sglang test
|
2024-11-25 09:34:59 -08:00 |
|
Jake Poznanski
|
ba8eba245b
|
Unit tests fixes
|
2024-11-25 09:13:13 -08:00 |
|
Jake Poznanski
|
dd17185cfd
|
More things to try
|
2024-11-23 21:49:33 +00:00 |
|
Jake Poznanski
|
46fe4acc0b
|
Trying fixes for live lock
|
2024-11-23 21:41:49 +00:00 |
|
Jake Poznanski
|
41accfe867
|
Error out if you see a broken process pool, might need a better check for this
|
2024-11-22 22:07:43 +00:00 |
|
Jake Poznanski
|
a95487e44c
|
Adding check for possible sglang livelock
|
2024-11-22 21:50:45 +00:00 |
|
Jake Poznanski
|
cff97990bf
|
Moving to official sglang release
|
2024-11-22 19:37:31 +00:00 |
|
Jake Poznanski
|
f8dcdf625a
|
Better catching of httpx errors and retrying them
|
2024-11-21 23:35:42 +00:00 |
|
Jake Poznanski
|
d6a00135a7
|
Faster init by caching pdf filter
|
2024-11-21 23:23:11 +00:00 |
|
Jake Poznanski
|
a91befc4ad
|
Fix for fallback stuff
|
2024-11-21 11:08:42 -08:00 |
|
Jake Poznanski
|
8c858a9d15
|
New version
|
2024-11-21 10:49:31 -08:00 |
|
Jake Poznanski
|
66fff4f44b
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-11-21 18:39:22 +00:00 |
|
Jake Poznanski
|
212d391933
|
More conservative filtering
|
2024-11-21 18:39:21 +00:00 |
|
Jake Poznanski
|
b8b786e003
|
Applying pdf filter
|
2024-11-21 10:20:58 -08:00 |
|
Jake Poznanski
|
cb800d6e2c
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-11-21 08:58:30 -08:00 |
|
Jake Poznanski
|
7dd20460a3
|
New version
|
2024-11-21 08:58:28 -08:00 |
|
Jake Poznanski
|
219cc7eca8
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-11-21 16:56:20 +00:00 |
|
Jake Poznanski
|
98e40143dd
|
Adding mass filtering script
|
2024-11-21 16:56:19 +00:00 |
|
Jake Poznanski
|
af8ce518ac
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-11-21 08:45:19 -08:00 |
|
Jake Poznanski
|
9112d81bd1
|
No keep alive connection to try to resolve sglang livelock
|
2024-11-21 08:45:17 -08:00 |
|
Jake Poznanski
|
2443c22fde
|
Projected output tokens
|
2024-11-20 23:57:10 +00:00 |
|
Jake Poznanski
|
09319a64ea
|
new version
|
2024-11-20 22:58:06 +00:00 |
|
Jake Poznanski
|
53a510479b
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-11-20 14:45:24 -08:00 |
|
Jake Poznanski
|
67d11ec0e6
|
TODOs and client fix
|
2024-11-20 14:45:12 -08:00 |
|
Jake Poznanski
|
092480573b
|
Baseline repeat detect
|
2024-11-20 19:58:20 +00:00 |
|
Jake Poznanski
|
c9e1a4c540
|
More tests
|
2024-11-20 19:37:00 +00:00 |
|
Jake Poznanski
|
3153aea260
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-11-20 10:42:39 -08:00 |
|
Jake Poznanski
|
9b8d58b59e
|
Better stats and metadata
|
2024-11-20 10:42:26 -08:00 |
|
Jake Poznanski
|
878a21b48d
|
Update README.md
|
2024-11-20 08:55:57 -08:00 |
|
Jake Poznanski
|
5704bb89ad
|
Update README.md
|
2024-11-20 08:54:30 -08:00 |
|
Jake Poznanski
|
273a8b0d0a
|
Logging fallback pages
|
2024-11-19 15:11:02 -08:00 |
|
Jake Poznanski
|
b0acfa870e
|
Adding support for fallback pages
|
2024-11-19 14:59:20 -08:00 |
|
Jake Poznanski
|
204a4a8e5b
|
Better stats
|
2024-11-19 13:41:32 -08:00 |
|
Jake Poznanski
|
3ef4609bdd
|
Fixing args
|
2024-11-19 11:48:45 -08:00 |
|
Jake Poznanski
|
27d23525b7
|
Claude recommends httpx instead of aiohttp, seeing if that will help with straggler timeouts
|
2024-11-19 10:41:58 -08:00 |
|
Jake Poznanski
|
4469f4b2ce
|
Version patch
|
2024-11-18 19:55:26 -08:00 |
|
Jake Poznanski
|
9e2e09bd06
|
More fixes
|
2024-11-18 15:04:50 -08:00 |
|
Jake Poznanski
|
8793fc7d99
|
Adding more retries, and it was able to process more complicated books
|
2024-11-18 14:25:32 -08:00 |
|
Jake Poznanski
|
2f55a3ddb7
|
fix
|
2024-11-18 13:58:25 -08:00 |
|
Jake Poznanski
|
d4d47369cb
|
more gcs
|
2024-11-18 13:20:28 -08:00 |
|
Jake Poznanski
|
e48d4bef00
|
Fix
|
2024-11-18 13:16:19 -08:00 |
|