Jake Poznanski
|
1cf3cd8caa
|
Had to switch to conda env override for gantry due to cu118 compat
|
2024-09-23 22:35:42 +00:00 |
|
Jake Poznanski
|
cb0b97a16a
|
Gantry requirements
|
2024-09-23 15:08:39 -07:00 |
|
Jake Poznanski
|
15793975dd
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:42:27 +00:00 |
|
Jake Poznanski
|
0691e1a77f
|
chmodding
|
2024-09-23 21:42:26 +00:00 |
|
Jake Poznanski
|
a30ca16e1f
|
Script adjustment
|
2024-09-23 14:41:35 -07:00 |
|
Jake Poznanski
|
79feb986a6
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 14:32:12 -07:00 |
|
Jake Poznanski
|
a3feca01fc
|
Setting up for a real train run
|
2024-09-23 14:32:10 -07:00 |
|
Jake Poznanski
|
d589b5651d
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:19:26 +00:00 |
|
Jake Poznanski
|
9ae26472d3
|
Silver dataset adjustments
|
2024-09-23 21:19:24 +00:00 |
|
Jake Poznanski
|
0812b0dd77
|
Prepping for gantry
|
2024-09-23 14:04:22 -07:00 |
|
Jake Poznanski
|
f78d021f50
|
Should be merging the LoRA adapters back into the model for the final checkpoint
|
2024-09-23 12:55:01 -07:00 |
|
Jake Poznanski
|
5967a525fd
|
Flash attention and mixed precision training, works quite a bit faster
|
2024-09-23 11:26:18 -07:00 |
|
Jake Poznanski
|
a7782255d5
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 10:44:39 -07:00 |
|
Jake Poznanski
|
45e5823823
|
Much happier gpu utilization
|
2024-09-23 10:44:25 -07:00 |
|
Jake Poznanski
|
5535e3ab2e
|
Moving the openai data generation stuff to this repo now
|
2024-09-23 17:20:18 +00:00 |
|
Jake Poznanski
|
dc71b28ddd
|
No need to save tokenizer
|
2024-09-23 10:06:04 -07:00 |
|
Jake Poznanski
|
5916239cd8
|
typos
|
2024-09-23 09:43:36 -07:00 |
|
Jake Poznanski
|
ea3af0143c
|
Loading dataset from config now
|
2024-09-23 09:40:24 -07:00 |
|
Jake Poznanski
|
ab9458b913
|
Basic LoRA trainer, doesn't seem to make any speed difference
|
2024-09-23 09:08:00 -07:00 |
|
Jake Poznanski
|
3ed14a9ea5
|
Prepping new training stuff
|
2024-09-23 08:53:56 -07:00 |
|
Jake Poznanski
|
b915e7de00
|
Smaller config for now, fixing a few requirements
|
2024-09-23 08:20:08 -07:00 |
|
Jake Poznanski
|
256d77c232
|
Hoping to get a basic hf Trainer to run
|
2024-09-20 15:53:11 -07:00 |
|
Jake Poznanski
|
55035b02c9
|
Tries to run a forward pass but OOMs
|
2024-09-20 15:05:23 -07:00 |
|
Jake Poznanski
|
4eddb1b45f
|
Okay, reasonably happy with the dataprep pipeline
|
2024-09-20 13:04:47 -07:00 |
|
Jake Poznanski
|
a47afe5c8d
|
Adding test to make sure the training and inference-time tokenization stay identical, currently failing
|
2024-09-20 12:01:05 -07:00 |
|
Jake Poznanski
|
fcb67ebd61
|
Prepping data to be in a trainable format
|
2024-09-20 09:25:54 -07:00 |
|
Jake Poznanski
|
dc86a99a97
|
Pyproject dependency cleanup
|
2024-09-20 08:22:10 -07:00 |
|
Jake Poznanski
|
962fb7eb6d
|
merge
|
2024-09-20 15:10:47 +00:00 |
|
Jake Poznanski
|
0cc2b5d7cf
|
Pyproject stuff
|
2024-09-20 15:09:45 +00:00 |
|
Jake Poznanski
|
0f2c42a6d3
|
Fixing formatting in pyproject
|
2024-09-20 08:01:48 -07:00 |
|
Jake Poznanski
|
84e68f313e
|
Basic forward generation pass with openai dataset and qwen2vl
|
2024-09-19 22:16:59 +00:00 |
|
Jake Poznanski
|
7d2c447dd3
|
Importing core training config stuff from dolma refine
|
2024-09-19 21:55:07 +00:00 |
|
Jake Poznanski
|
bab32aa9b3
|
Formatting
|
2024-09-18 22:52:42 +00:00 |
|
Jake Poznanski
|
f4d18cb287
|
Dataloader capable of loading 38k rows reasonably fast
|
2024-09-18 22:48:38 +00:00 |
|
Jake Poznanski
|
d22b311340
|
Starting to write dataloader for visual lm data
|
2024-09-18 21:42:09 +00:00 |
|
Jake Poznanski
|
fb4fc4229e
|
Fixing close file warning
|
2024-09-17 20:31:32 +00:00 |
|
Jake Poznanski
|
af2126df99
|
450 tok/sec/core with SmolLM, which appears to work well
|
2024-09-17 19:59:02 +00:00 |
|
Jake Poznanski
|
2f71cb9232
|
Using SmolLM, seems a lot better and is able to pass some tests
|
2024-09-17 18:47:27 +00:00 |
|
Jake Poznanski
|
57e80aacd2
|
Testing coherence with distilgpt2, but it doesn't work great
|
2024-09-17 16:58:45 +00:00 |
|
Jake Poznanski
|
cb9b6efb3c
|
Trying distilgpt2 instead of kenlm
|
2024-09-17 16:50:01 +00:00 |
|
Jake Poznanski
|
01bc0b2f10
|
Moving a whole bunch of code over, still broken
|
2024-09-17 16:26:55 +00:00 |
|
Jake Poznanski
|
a534a0180d
|
Moving pdf filter code over with tests
|
2024-09-17 15:16:58 +00:00 |
|
Jake Poznanski
|
9662718bfd
|
Running personalize script on template
|
2024-09-17 15:06:59 +00:00 |
|
Jake Poznanski
|
7d71e2d643
|
Update README.md
|
2024-09-17 07:58:39 -07:00 |
|
Jake Poznanski
|
68b2c0e8d6
|
Initial commit
|
2024-09-17 07:53:43 -07:00 |
|