Jake Poznanski
|
07c0323c91
|
Adding lora config to try to address OOMs
|
2024-09-25 07:57:01 -07:00 |
|
Jake Poznanski
|
ea0226c499
|
More flexibility in dataloader dims
|
2024-09-24 19:47:13 -07:00 |
|
Jake Poznanski
|
ff3d6aa61a
|
Merge remote-tracking branch 'origin/main'
|
2024-09-24 15:52:45 -07:00 |
|
Jake Poznanski
|
f6905c39ea
|
Hopefully the last changes
|
2024-09-24 15:52:34 -07:00 |
|
Jake Poznanski
|
4fb78c29ef
|
Fixing runeval to work with qwen2vl batch inferences
|
2024-09-24 22:11:53 +00:00 |
|
Jake Poznanski
|
2579931ae2
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-24 21:57:52 +00:00 |
|
Jake Poznanski
|
a50ffe27c9
|
Adding in eval scripts from oe-data-internal now all in one place
|
2024-09-24 21:57:51 +00:00 |
|
Jake Poznanski
|
ea731055d7
|
More realistic configuration
|
2024-09-24 14:50:23 -07:00 |
|
Jake Poznanski
|
0442a33209
|
New images work much better now, and device map fix
|
2024-09-24 12:58:18 -07:00 |
|
Jake Poznanski
|
6173f55845
|
More typos
|
2024-09-24 11:58:10 -07:00 |
|
Jake Poznanski
|
0d9917367b
|
Flash attention as part of the image
|
2024-09-24 11:57:56 -07:00 |
|
Jake Poznanski
|
3c8e05362f
|
New image, don't need to install
|
2024-09-24 11:30:19 -07:00 |
|
Jake Poznanski
|
66c29dd44f
|
Moving to making a new dockerfile
|
2024-09-24 11:24:14 -07:00 |
|
Jake Poznanski
|
e64d4f7103
|
More pip stuff
|
2024-09-24 17:56:19 +00:00 |
|
Jake Poznanski
|
bf1239deea
|
Use mini dataset now for testing
|
2024-09-24 10:55:03 -07:00 |
|
Jake Poznanski
|
596fc55628
|
Enabling model eval
|
2024-09-24 10:48:53 -07:00 |
|
Jake Poznanski
|
5a0bcb7b1d
|
batch inference slowness
|
2024-09-24 09:13:47 -07:00 |
|
Jake Poznanski
|
28bcf72e11
|
Hoping to get a quick batch inference pipeline rolling
|
2024-09-24 08:56:36 -07:00 |
|
Jake Poznanski
|
45f691c718
|
Starting batch inference script to measure performance, train script using proper model from config now
|
2024-09-24 08:40:46 -07:00 |
|
Jake Poznanski
|
b0777dcb87
|
missing libaio
|
2024-09-24 15:32:31 +00:00 |
|
Jake Poznanski
|
1bb222bdd8
|
Datasets version
|
2024-09-24 14:57:53 +00:00 |
|
Jake Poznanski
|
7b76b66262
|
extra index
|
2024-09-24 14:50:10 +00:00 |
|
Jake Poznanski
|
5287ba50b9
|
Back to pip... sigh
|
2024-09-24 14:45:44 +00:00 |
|
Jake Poznanski
|
357f2c6960
|
More env stuff
|
2024-09-23 22:48:46 +00:00 |
|
Jake Poznanski
|
491b7383a1
|
env fix
|
2024-09-23 22:40:35 +00:00 |
|
Jake Poznanski
|
1cf3cd8caa
|
Had to switch to conda env override for gantry due to cu118 compat
|
2024-09-23 22:35:42 +00:00 |
|
Jake Poznanski
|
cb0b97a16a
|
Gantry requirements
|
2024-09-23 15:08:39 -07:00 |
|
Jake Poznanski
|
15793975dd
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:42:27 +00:00 |
|
Jake Poznanski
|
0691e1a77f
|
chmodding
|
2024-09-23 21:42:26 +00:00 |
|
Jake Poznanski
|
a30ca16e1f
|
Script adjustment
|
2024-09-23 14:41:35 -07:00 |
|
Jake Poznanski
|
79feb986a6
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 14:32:12 -07:00 |
|
Jake Poznanski
|
a3feca01fc
|
Setting up for a real train run
|
2024-09-23 14:32:10 -07:00 |
|
Jake Poznanski
|
d589b5651d
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:19:26 +00:00 |
|
Jake Poznanski
|
9ae26472d3
|
Silver dataset adjustments
|
2024-09-23 21:19:24 +00:00 |
|
Jake Poznanski
|
0812b0dd77
|
Prepping for gantry
|
2024-09-23 14:04:22 -07:00 |
|
Jake Poznanski
|
f78d021f50
|
Should be merging the LORA adapters back into the model for the final checkpoint
|
2024-09-23 12:55:01 -07:00 |
|
Jake Poznanski
|
5967a525fd
|
Flash attention and mixed precision training, works quite a bit faster
|
2024-09-23 11:26:18 -07:00 |
|
Jake Poznanski
|
a7782255d5
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 10:44:39 -07:00 |
|
Jake Poznanski
|
45e5823823
|
Much happier gpu utilization
|
2024-09-23 10:44:25 -07:00 |
|
Jake Poznanski
|
5535e3ab2e
|
Moving the openai data generation stuff to this repo now
|
2024-09-23 17:20:18 +00:00 |
|
Jake Poznanski
|
dc71b28ddd
|
No need to save tokenizer
|
2024-09-23 10:06:04 -07:00 |
|
Jake Poznanski
|
5916239cd8
|
typos
|
2024-09-23 09:43:36 -07:00 |
|
Jake Poznanski
|
ea3af0143c
|
Loading dataset from config now
|
2024-09-23 09:40:24 -07:00 |
|
Jake Poznanski
|
ab9458b913
|
Basic LORA trainer, doesn't seem to make any speed difference
|
2024-09-23 09:08:00 -07:00 |
|
Jake Poznanski
|
3ed14a9ea5
|
Prepping new training stuff
|
2024-09-23 08:53:56 -07:00 |
|
Jake Poznanski
|
b915e7de00
|
Smaller config for now, fixing a few requirements
|
2024-09-23 08:20:08 -07:00 |
|
Jake Poznanski
|
256d77c232
|
Hoping to get a basic hf Trainer to run
|
2024-09-20 15:53:11 -07:00 |
|
Jake Poznanski
|
55035b02c9
|
Tries to run a forward pass but OOMs
|
2024-09-20 15:05:23 -07:00 |
|
Jake Poznanski
|
4eddb1b45f
|
Okay, reasonably happy with the dataprep pipeline
|
2024-09-20 13:04:47 -07:00 |
|
Jake Poznanski
|
a47afe5c8d
|
Adding test to make sure the training and inference time tokenization stays identical, currently failing
|
2024-09-20 12:01:05 -07:00 |
|