307 Commits

Author SHA1 Message Date
Jake Poznanski
2c7323d1c4 Convert silver adjustments 2024-09-30 22:41:51 +00:00
Jake Poznanski
80bb0cbc23 Open ai to openai comparison now supported, new prompts 2024-09-30 22:08:30 +00:00
Jake Poznanski
e179453cc5 Fixing qwen checkpoint script 2024-09-30 20:34:06 +00:00
Jake Poznanski
963e946233 Convertsilver birr script can go in and out of S3 now 2024-09-30 20:06:45 +00:00
Jake Poznanski
b856b4551f Fixes to convertsilver to birr script 2024-09-30 19:54:30 +00:00
Jake Poznanski
da1982acb8 Refactoring prompts into their own new folder 2024-09-30 18:48:17 +00:00
Jake Poznanski
d74f9a352b Send silver script tries to open file first, before sending an API request 2024-09-30 18:41:50 +00:00
Jake Poznanski
1216d9c7c9 retrieve silver script reports errors better 2024-09-30 18:41:33 +00:00
Jake Poznanski
b4e9d6a2b8 Buildsilver script supports reservoir sampling so it can sample 100M+ paths now efficiently 2024-09-30 18:41:18 +00:00
Jake Poznanski
8ec9e35f22 dataprep issue 2024-09-28 04:31:11 +00:00
Jake Poznanski
e53f782b0f Datasetdict fix 2024-09-28 03:38:29 +00:00
Jake Poznanski
decfd7fbc1 Fixing the refiner input prompt to something simpler that doesn't depend on the training data. Fixing beaker job workspace and bumping priority to high. 2024-09-27 22:54:07 +00:00
Jake Poznanski
22b765e6be Going back to non iterable dataset, so shuffling works better, applying a light filter 2024-09-27 15:48:56 +00:00
Jake Poznanski
65a9c9981e Hopefully will train now 2024-09-27 15:16:12 +00:00
Jake Poznanski
e864b9d88f weird dataloader stuff now 2024-09-27 02:53:59 +00:00
Jake Poznanski
37f10051f6 typo 2024-09-27 01:19:21 +00:00
Jake Poznanski
c00e40d1c4 More fixes 2024-09-26 23:10:07 +00:00
Jake Poznanski
d098a87ed2 Column name fix 2024-09-26 22:29:19 +00:00
Jake Poznanski
84e9da637c Removing lambda due to pickling errors 2024-09-26 21:39:08 +00:00
Jake Poznanski
61dd7bb61f Fix for map in iterable mode 2024-09-26 20:44:47 +00:00
Jake Poznanski
49efa5cb40 Typo 2024-09-26 19:57:53 +00:00
Jake Poznanski
cf1aa0176e Proper use of iterable_dataset 2024-09-26 19:55:54 +00:00
Jake Poznanski
05fdb81da2 map and filter on iterable dataset 2024-09-26 19:01:34 +00:00
Jake Poznanski
f14e910175 bnb 2024-09-26 03:30:35 +00:00
Jake Poznanski
7707bc08da trying cheaper optimizer to solve ooms 2024-09-25 22:56:05 +00:00
Jake Poznanski
a0bec4ee41 7b scripto 2024-09-25 22:08:36 +00:00
Jake Poznanski
385c1bf9a7 Lora config 2024-09-25 22:07:04 +00:00
Jake Poznanski
24b30b2333 Prepping for 7b training 2024-09-25 20:51:25 +00:00
Jake Poznanski
5f9b2341c7 Some prompt tweaks I thought of for next time 2024-09-25 20:47:33 +00:00
Jake Poznanski
8ebe751196 Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-25 20:28:05 +00:00
Jake Poznanski
ed502549d3 Adding script to convert silver data that we send to openai into something we can run through mise/birr 2024-09-25 20:27:49 +00:00
Jake Poznanski
3a5b438a6f Lora misconfiguration 2024-09-25 10:48:39 -07:00
Jake Poznanski
86813fe210 Filtering off the weird tail ends of the distribution to make training smoother 2024-09-25 09:49:03 -07:00
Jake Poznanski
5f313266a4 Adding linear layers from visual network to target modules LORA 2024-09-25 09:09:24 -07:00
Jake Poznanski
b2341ed4f4 Merge branch 'main' of https://github.com/allenai/pdelfin into main 2024-09-25 09:05:46 -07:00
Jake Poznanski
9cbc128553 Sampling some sequence lengths 2024-09-25 09:05:11 -07:00
Jake Poznanski
d0deac5ea7 Lora config 2024-09-25 08:34:58 -07:00
Jake Poznanski
07c0323c91 Adding lora config to try to address OOMs 2024-09-25 07:57:01 -07:00
Jake Poznanski
ea0226c499 More flexibility in dataloader dims 2024-09-24 19:47:13 -07:00
Jake Poznanski
ff3d6aa61a Merge remote-tracking branch 'origin/main' 2024-09-24 15:52:45 -07:00
Jake Poznanski
f6905c39ea Hopefully the last changes 2024-09-24 15:52:34 -07:00
Jake Poznanski
4fb78c29ef Fixing runeval to work with qwen2vl batch inferences 2024-09-24 22:11:53 +00:00
Jake Poznanski
2579931ae2 Merge branch 'main' of https://github.com/allenai/pdelfin 2024-09-24 21:57:52 +00:00
Jake Poznanski
a50ffe27c9 Adding in eval scripts from oe-data-internal now all in one place 2024-09-24 21:57:51 +00:00
Jake Poznanski
ea731055d7 More realistic configuration 2024-09-24 14:50:23 -07:00
Jake Poznanski
0442a33209 New images work much better now, and device map fix 2024-09-24 12:58:18 -07:00
Jake Poznanski
6173f55845 More typos 2024-09-24 11:58:10 -07:00
Jake Poznanski
0d9917367b Flash attention as part of the image 2024-09-24 11:57:56 -07:00
Jake Poznanski
3c8e05362f New image, don't need to install 2024-09-24 11:30:19 -07:00
Jake Poznanski
66c29dd44f Moving to making a new dockerfile 2024-09-24 11:24:14 -07:00
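
Note on commit b4e9d6a2b8: the message says the buildsilver script uses reservoir sampling to sample from 100M+ paths. The repository code is not shown in this listing, so the following is only a minimal sketch of the general technique (the classic "Algorithm R"), not the script's actual implementation. The function name `reservoir_sample`, its parameters, and the example paths are hypothetical and chosen for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Hypothetical helper for illustration. Only k items are held in memory,
    so the input can be an arbitrarily long iterator (e.g. a path listing).
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing one
            # only if the slot falls inside the reservoir (probability k / (i + 1)).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Hypothetical paths generated lazily; nothing is materialised up front.
    paths = (f"s3://example-bucket/docs/{n}.pdf" for n in range(1_000_000))
    print(reservoir_sample(paths, k=5, seed=0))
```

The stream is read once and memory use stays proportional to k rather than to the number of paths, which is the property the commit message appeals to.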