Jake Poznanski | a7cd7467c3 | mathjax | 2024-10-16 16:45:07 +00:00
Jake Poznanski | baa82a4a9a | Fixing links, rendering tables | 2024-10-16 16:37:08 +00:00
Jake Poznanski | 19e56ec7ce | dolma viewer runs much faster now | 2024-10-16 16:21:25 +00:00
Jake Poznanski | 96682b2ecb | Refactoring | 2024-10-16 16:18:27 +00:00
Jake Poznanski | 2cd863ddce | Dolma viewer improvements | 2024-10-16 16:05:44 +00:00
Jake Poznanski | 35558dbddc | Make the prompt hint randomly select lines | 2024-10-16 16:05:07 +00:00
Jake Poznanski | 9eb252f8f6 | Better tracking of completion_errors | 2024-10-15 22:43:31 +00:00
Jake Poznanski | 4ef14ec813 | More stats | 2024-10-15 22:26:31 +00:00
Jake Poznanski | 4a280e55df | Nicer dolma viewer | 2024-10-15 21:03:28 +00:00
Jake Poznanski | 42cf6a639f | Dolma viewer | 2024-10-15 18:37:31 +00:00
Jake Poznanski | b8cd414022 | tiny fix | 2024-10-15 16:54:19 +00:00
Jake Poznanski | a7fae0e659 | fix | 2024-10-15 16:36:54 +00:00
Jake Poznanski | 4669eb7134 | Adjusting workflow so I can do s2 pdfs | 2024-10-15 16:22:55 +00:00
Jake Poznanski | 6d61ae4aa8 | Some pipeline cleanup stuff | 2024-10-15 16:02:08 +00:00
Jake Poznanski | fc8fcfaeba | Fixing dataloader hopefully | 2024-10-15 15:13:25 +00:00
Jake Poznanski | 6d53683001 | More stats hopefully running faster | 2024-10-14 21:37:14 +00:00
Jake Poznanski | 350061906e | Adding nicer output stats | 2024-10-14 20:48:33 +00:00
Jake Poznanski | 194af5ff52 | Robustness | 2024-10-14 20:31:37 +00:00
Jake Poznanski | 1ed9e4c947 | Runs to the end now | 2024-10-14 20:28:54 +00:00
Jake Poznanski | 879b974af2 | More and more fixes | 2024-10-14 20:06:07 +00:00
Jake Poznanski | 77a850d7ef | Tracking rounds of inference better | 2024-10-14 18:42:50 +00:00
Jake Poznanski | af992bd603 | More refactoring | 2024-10-14 18:23:22 +00:00
Jake Poznanski | cd8e28e459 | Pipeline working hopefully soon | 2024-10-14 18:19:17 +00:00
Jake Poznanski | f2f578cca9 | More pipeline code | 2024-10-14 17:23:09 +00:00
Jake Poznanski | 39333f2c96 | New pipeline stuff | 2024-10-14 17:09:11 +00:00
Jake Poznanski | 4d6eaf654d | Merge branch 'main' of https://github.com/allenai/pdelfin | 2024-10-14 16:30:51 +00:00
Jake Poznanski | 89d4ee2145 | Pipeline work | 2024-10-14 16:30:49 +00:00
Jake Poznanski | 7b161533e2 | Code to do local inference on fine tuned models for testing | 2024-10-14 08:38:18 -07:00
Jake Poznanski | 5a7377af30 | Refactoring | 2024-10-11 22:57:49 +00:00
Jake Poznanski | 4fd6066600 | gpt cleanup | 2024-10-11 22:41:09 +00:00
Jake Poznanski | a45f86e4a4 | More cleanup | 2024-10-11 22:37:32 +00:00
Jake Poznanski | 53fdb6108c | More pipeline code | 2024-10-11 21:50:09 +00:00
Jake Poznanski | 10b7a58d28 | fix | 2024-10-11 20:22:58 +00:00
Jake Poznanski | f477a68621 | dbmanager | 2024-10-11 16:24:29 +00:00
Jake Poznanski | 2dccc4be3b | Oops removing print | 2024-10-11 16:23:14 +00:00
Jake Poznanski | aea3f7f1fe | Fix for anchor generation on pdfs with no text elements | 2024-10-11 15:01:01 +00:00
Jake Poznanski | af03358c47 | assemble | 2024-10-10 22:36:09 +00:00
Jake Poznanski | 312847acac | Ok, finally working nicely to build the page index | 2024-10-10 22:30:09 +00:00
Jake Poznanski | 312ee8d953 | pipeline script | 2024-10-10 22:13:43 +00:00
Jake Poznanski | 49b5b233c3 | Working on new pipeline script | 2024-10-10 22:10:26 +00:00
Jake Poznanski | a8b50ae8fa | Preloading the datasets directly | 2024-10-10 19:57:51 +00:00
Jake Poznanski | 85f2dc6d26 | Fixes | 2024-10-10 18:52:42 +00:00
Jake Poznanski | 2864f907e1 | Dataloader fix with nicer tests | 2024-10-10 16:58:45 +00:00
Jake Poznanski | b7c80cd17f | Fix up some tests but I don't see why this isn't working | 2024-10-10 16:58:40 +00:00
Jake Poznanski | 3245990216 | Faster eval script | 2024-10-10 15:22:33 +00:00
Jake Poznanski | 931f48c3d1 | Allow eval script to support one more type of jsonls, runpipeline multiglobs, other fixes | 2024-10-09 23:39:13 +00:00
Jake Poznanski | c6bdf69d8f | First stab at document assembly | 2024-10-09 22:19:16 +00:00
Jake Poznanski | 847064f46f | Taking notes, starting on document assembly | 2024-10-09 22:14:28 +00:00
Jake Poznanski | 8e5809da71 | runpipeline | 2024-10-09 20:29:59 +00:00
Jake Poznanski | a90feda42f | bugfixes | 2024-10-09 20:20:06 +00:00