582 Commits

Author SHA1 Message Date
Jake Poznanski
a0bec4ee41 7b scripto 2024-09-25 22:08:36 +00:00
Jake Poznanski
385c1bf9a7 Lora config 2024-09-25 22:07:04 +00:00
Jake Poznanski
24b30b2333 Prepping for 7b training 2024-09-25 20:51:25 +00:00
Jake Poznanski
5f9b2341c7 Some prompt tweaks I thought of for next time 2024-09-25 20:47:33 +00:00
Jake Poznanski
8ebe751196 Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-25 20:28:05 +00:00
Jake Poznanski
ed502549d3 Adding script to convert silver data that we send to openai into something we can run through mise/birr 2024-09-25 20:27:49 +00:00
Jake Poznanski
3a5b438a6f Lora misconfiguration 2024-09-25 10:48:39 -07:00
Jake Poznanski
86813fe210 Filtering off the weird tail ends of the distribution to make training smoother 2024-09-25 09:49:03 -07:00
Jake Poznanski
5f313266a4 Adding linear layers from visual network to LORA target modules 2024-09-25 09:09:24 -07:00
Jake Poznanski
b2341ed4f4 Merge branch 'main' of https://github.com/allenai/pdelfin into main 2024-09-25 09:05:46 -07:00
Jake Poznanski
9cbc128553 Sampling some sequence lengths 2024-09-25 09:05:11 -07:00
Jake Poznanski
d0deac5ea7 Lora config 2024-09-25 08:34:58 -07:00
Jake Poznanski
07c0323c91 Adding lora config to try to address OOMs 2024-09-25 07:57:01 -07:00
Jake Poznanski
ea0226c499 More flexibility in dataloader dims 2024-09-24 19:47:13 -07:00
Jake Poznanski
ff3d6aa61a Merge remote-tracking branch 'origin/main' 2024-09-24 15:52:45 -07:00
Jake Poznanski
f6905c39ea Hopefully the last changes 2024-09-24 15:52:34 -07:00
Jake Poznanski
4fb78c29ef Fixing runeval to work with qwen2vl batch inferences 2024-09-24 22:11:53 +00:00
Jake Poznanski
2579931ae2 Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-24 21:57:52 +00:00
Jake Poznanski
a50ffe27c9 Adding in eval scripts from oe-data-internal now all in one place 2024-09-24 21:57:51 +00:00
Jake Poznanski
ea731055d7 More realistic configuration 2024-09-24 14:50:23 -07:00
Jake Poznanski
0442a33209 New images work much better now, and device map fix 2024-09-24 12:58:18 -07:00
Jake Poznanski
6173f55845 More typos 2024-09-24 11:58:10 -07:00
Jake Poznanski
0d9917367b Flash attention as part of the image 2024-09-24 11:57:56 -07:00
Jake Poznanski
3c8e05362f New image, don't need to install 2024-09-24 11:30:19 -07:00
Jake Poznanski
66c29dd44f Moving to making a new dockerfile 2024-09-24 11:24:14 -07:00
Jake Poznanski
e64d4f7103 More pip stuff 2024-09-24 17:56:19 +00:00
Jake Poznanski
bf1239deea Use mini dataset now for testing 2024-09-24 10:55:03 -07:00
Jake Poznanski
596fc55628 Enabling model eval 2024-09-24 10:48:53 -07:00
Jake Poznanski
5a0bcb7b1d batch inference slowness 2024-09-24 09:13:47 -07:00
Jake Poznanski
28bcf72e11 Hoping to get a quick batch inference pipeline rolling 2024-09-24 08:56:36 -07:00
Jake Poznanski
45f691c718 Starting batch inference script to measure performance, train script using proper model from config now 2024-09-24 08:40:46 -07:00
Jake Poznanski
b0777dcb87 missing libaio 2024-09-24 15:32:31 +00:00
Jake Poznanski
1bb222bdd8 Datasets version 2024-09-24 14:57:53 +00:00
Jake Poznanski
7b76b66262 extra index 2024-09-24 14:50:10 +00:00
Jake Poznanski
5287ba50b9 Back to pip... sigh 2024-09-24 14:45:44 +00:00
Jake Poznanski
357f2c6960 More env stuff 2024-09-23 22:48:46 +00:00
Jake Poznanski
491b7383a1 env fix 2024-09-23 22:40:35 +00:00
Jake Poznanski
1cf3cd8caa Had to switch to conda env override for gantry due to cu118 compat 2024-09-23 22:35:42 +00:00
Jake Poznanski
cb0b97a16a Gantry requirements 2024-09-23 15:08:39 -07:00
Jake Poznanski
15793975dd Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-23 21:42:27 +00:00
Jake Poznanski
0691e1a77f chmodding 2024-09-23 21:42:26 +00:00
Jake Poznanski
a30ca16e1f Script adjustment 2024-09-23 14:41:35 -07:00
Jake Poznanski
79feb986a6 Merge branch 'main' of https://github.com/allenai/pdelfin into main 2024-09-23 14:32:12 -07:00
Jake Poznanski
a3feca01fc Setting up for a real train run 2024-09-23 14:32:10 -07:00
Jake Poznanski
d589b5651d Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-23 21:19:26 +00:00
Jake Poznanski
9ae26472d3 Silver dataset adjustments 2024-09-23 21:19:24 +00:00
Jake Poznanski
0812b0dd77 Prepping for gantry 2024-09-23 14:04:22 -07:00
Jake Poznanski
f78d021f50 Should be merging the LORA adapters back into the model for the final checkpoint 2024-09-23 12:55:01 -07:00
Jake Poznanski
5967a525fd Flash attention and mixed precision training, works quite a bit faster 2024-09-23 11:26:18 -07:00
Jake Poznanski
a7782255d5 Merge branch 'main' of https://github.com/allenai/pdelfin into main 2024-09-23 10:44:39 -07:00