Jake Poznanski
|
a0bec4ee41
|
7b script
|
2024-09-25 22:08:36 +00:00 |
|
Jake Poznanski
|
385c1bf9a7
|
Lora config
|
2024-09-25 22:07:04 +00:00 |
|
Jake Poznanski
|
24b30b2333
|
Prepping for 7b training
|
2024-09-25 20:51:25 +00:00 |
|
Jake Poznanski
|
5f9b2341c7
|
Some prompt tweaks I thought of for next time
|
2024-09-25 20:47:33 +00:00 |
|
Jake Poznanski
|
8ebe751196
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-25 20:28:05 +00:00 |
|
Jake Poznanski
|
ed502549d3
|
Adding script to convert silver data that we send to openai into something we can run through mise/birr
|
2024-09-25 20:27:49 +00:00 |
|
Jake Poznanski
|
3a5b438a6f
|
Lora misconfiguration
|
2024-09-25 10:48:39 -07:00 |
|
Jake Poznanski
|
86813fe210
|
Filtering off the weird tail ends of the distribution to make training smoother
|
2024-09-25 09:49:03 -07:00 |
|
Jake Poznanski
|
5f313266a4
|
Adding linear layers from visual network to target modules LORA
|
2024-09-25 09:09:24 -07:00 |
|
Jake Poznanski
|
b2341ed4f4
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-25 09:05:46 -07:00 |
|
Jake Poznanski
|
9cbc128553
|
Sampling some sequence lengths
|
2024-09-25 09:05:11 -07:00 |
|
Jake Poznanski
|
d0deac5ea7
|
Lora config
|
2024-09-25 08:34:58 -07:00 |
|
Jake Poznanski
|
07c0323c91
|
Adding lora config to try to address OOMs
|
2024-09-25 07:57:01 -07:00 |
|
Jake Poznanski
|
ea0226c499
|
More flexibility in dataloader dims
|
2024-09-24 19:47:13 -07:00 |
|
Jake Poznanski
|
ff3d6aa61a
|
Merge remote-tracking branch 'origin/main'
|
2024-09-24 15:52:45 -07:00 |
|
Jake Poznanski
|
f6905c39ea
|
Hopefully the last changes
|
2024-09-24 15:52:34 -07:00 |
|
Jake Poznanski
|
4fb78c29ef
|
Fixing runeval to work with qwen2vl batch inferences
|
2024-09-24 22:11:53 +00:00 |
|
Jake Poznanski
|
2579931ae2
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-24 21:57:52 +00:00 |
|
Jake Poznanski
|
a50ffe27c9
|
Adding in eval scripts from oe-data-internal now all in one place
|
2024-09-24 21:57:51 +00:00 |
|
Jake Poznanski
|
ea731055d7
|
More realistic configuration
|
2024-09-24 14:50:23 -07:00 |
|
Jake Poznanski
|
0442a33209
|
New images work much better now, and device map fix
|
2024-09-24 12:58:18 -07:00 |
|
Jake Poznanski
|
6173f55845
|
More typos
|
2024-09-24 11:58:10 -07:00 |
|
Jake Poznanski
|
0d9917367b
|
Flash attention as part of the image
|
2024-09-24 11:57:56 -07:00 |
|
Jake Poznanski
|
3c8e05362f
|
New image, don't need to install
|
2024-09-24 11:30:19 -07:00 |
|
Jake Poznanski
|
66c29dd44f
|
Moving to making a new dockerfile
|
2024-09-24 11:24:14 -07:00 |
|
Jake Poznanski
|
e64d4f7103
|
More pip stuff
|
2024-09-24 17:56:19 +00:00 |
|
Jake Poznanski
|
bf1239deea
|
Use mini dataset now for testing
|
2024-09-24 10:55:03 -07:00 |
|
Jake Poznanski
|
596fc55628
|
Enabling model eval
|
2024-09-24 10:48:53 -07:00 |
|
Jake Poznanski
|
5a0bcb7b1d
|
batch inference slowness
|
2024-09-24 09:13:47 -07:00 |
|
Jake Poznanski
|
28bcf72e11
|
Hoping to get a quick batch inference pipeline rolling
|
2024-09-24 08:56:36 -07:00 |
|
Jake Poznanski
|
45f691c718
|
Starting batch inference script to measure performance, train script using proper model from config now
|
2024-09-24 08:40:46 -07:00 |
|
Jake Poznanski
|
b0777dcb87
|
missing libaio
|
2024-09-24 15:32:31 +00:00 |
|
Jake Poznanski
|
1bb222bdd8
|
Datasets version
|
2024-09-24 14:57:53 +00:00 |
|
Jake Poznanski
|
7b76b66262
|
extra index
|
2024-09-24 14:50:10 +00:00 |
|
Jake Poznanski
|
5287ba50b9
|
Back to pip... sigh
|
2024-09-24 14:45:44 +00:00 |
|
Jake Poznanski
|
357f2c6960
|
More env stuff
|
2024-09-23 22:48:46 +00:00 |
|
Jake Poznanski
|
491b7383a1
|
env fix
|
2024-09-23 22:40:35 +00:00 |
|
Jake Poznanski
|
1cf3cd8caa
|
Had to switch to conda env override for gantry due to cu118 compat
|
2024-09-23 22:35:42 +00:00 |
|
Jake Poznanski
|
cb0b97a16a
|
Gantry requirements
|
2024-09-23 15:08:39 -07:00 |
|
Jake Poznanski
|
15793975dd
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:42:27 +00:00 |
|
Jake Poznanski
|
0691e1a77f
|
chmodding
|
2024-09-23 21:42:26 +00:00 |
|
Jake Poznanski
|
a30ca16e1f
|
Script adjustment
|
2024-09-23 14:41:35 -07:00 |
|
Jake Poznanski
|
79feb986a6
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 14:32:12 -07:00 |
|
Jake Poznanski
|
a3feca01fc
|
Setting up for a real train run
|
2024-09-23 14:32:10 -07:00 |
|
Jake Poznanski
|
d589b5651d
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-23 21:19:26 +00:00 |
|
Jake Poznanski
|
9ae26472d3
|
Silver dataset adjustments
|
2024-09-23 21:19:24 +00:00 |
|
Jake Poznanski
|
0812b0dd77
|
Prepping for gantry
|
2024-09-23 14:04:22 -07:00 |
|
Jake Poznanski
|
f78d021f50
|
Should be merging the LORA adapters back into the model for the final checkpoint
|
2024-09-23 12:55:01 -07:00 |
|
Jake Poznanski
|
5967a525fd
|
Flash attention and mixed precision training, works quite a bit faster
|
2024-09-23 11:26:18 -07:00 |
|
Jake Poznanski
|
a7782255d5
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-23 10:44:39 -07:00 |
|