Jake Poznanski
|
2c7323d1c4
|
Convert silver adjustments
|
2024-09-30 22:41:51 +00:00 |
|
Jake Poznanski
|
80bb0cbc23
|
Open ai to openai comparison now supported, new prompts
|
2024-09-30 22:08:30 +00:00 |
|
Jake Poznanski
|
e179453cc5
|
Fixing qwen checkpoint script
|
2024-09-30 20:34:06 +00:00 |
|
Jake Poznanski
|
963e946233
|
Convertsilver birr script can go in and out of S3 now
|
2024-09-30 20:06:45 +00:00 |
|
Jake Poznanski
|
b856b4551f
|
Fixes to convertsilver to birr script
|
2024-09-30 19:54:30 +00:00 |
|
Jake Poznanski
|
da1982acb8
|
Refactoring prompts into their own new folder
|
2024-09-30 18:48:17 +00:00 |
|
Jake Poznanski
|
d74f9a352b
|
Send silver script tries to open file first, before sending an API request
|
2024-09-30 18:41:50 +00:00 |
|
Jake Poznanski
|
1216d9c7c9
|
retrieve silver script reports errors better
|
2024-09-30 18:41:33 +00:00 |
|
Jake Poznanski
|
b4e9d6a2b8
|
Buildsilver script supports reservoir sampling so it can now sample 100M+ paths efficiently
|
2024-09-30 18:41:18 +00:00 |
|
Jake Poznanski
|
8ec9e35f22
|
dataprep issue
|
2024-09-28 04:31:11 +00:00 |
|
Jake Poznanski
|
e53f782b0f
|
Datasetdict fix
|
2024-09-28 03:38:29 +00:00 |
|
Jake Poznanski
|
decfd7fbc1
|
Fixing the refiner input prompt to something simpler that doesn't depend on the training data. Fixing beaker job workspace and bumping priority to high.
|
2024-09-27 22:54:07 +00:00 |
|
Jake Poznanski
|
22b765e6be
|
Going back to non iterable dataset, so shuffling works better, applying a light filter
|
2024-09-27 15:48:56 +00:00 |
|
Jake Poznanski
|
65a9c9981e
|
Hopefully will train now
|
2024-09-27 15:16:12 +00:00 |
|
Jake Poznanski
|
e864b9d88f
|
weird dataloader stuff now
|
2024-09-27 02:53:59 +00:00 |
|
Jake Poznanski
|
37f10051f6
|
typo
|
2024-09-27 01:19:21 +00:00 |
|
Jake Poznanski
|
c00e40d1c4
|
More fixes
|
2024-09-26 23:10:07 +00:00 |
|
Jake Poznanski
|
d098a87ed2
|
Column name fix
|
2024-09-26 22:29:19 +00:00 |
|
Jake Poznanski
|
84e9da637c
|
Removing lambda due to pickling errors
|
2024-09-26 21:39:08 +00:00 |
|
Jake Poznanski
|
61dd7bb61f
|
Fix for map in iterable mode
|
2024-09-26 20:44:47 +00:00 |
|
Jake Poznanski
|
49efa5cb40
|
Typo
|
2024-09-26 19:57:53 +00:00 |
|
Jake Poznanski
|
cf1aa0176e
|
Proper use of iterable_dataset
|
2024-09-26 19:55:54 +00:00 |
|
Jake Poznanski
|
05fdb81da2
|
map and filter on iterable dataset
|
2024-09-26 19:01:34 +00:00 |
|
Jake Poznanski
|
f14e910175
|
bnb
|
2024-09-26 03:30:35 +00:00 |
|
Jake Poznanski
|
7707bc08da
|
trying cheaper optimizer to solve ooms
|
2024-09-25 22:56:05 +00:00 |
|
Jake Poznanski
|
a0bec4ee41
|
7b scripto
|
2024-09-25 22:08:36 +00:00 |
|
Jake Poznanski
|
385c1bf9a7
|
Lora config
|
2024-09-25 22:07:04 +00:00 |
|
Jake Poznanski
|
24b30b2333
|
Prepping for 7b training
|
2024-09-25 20:51:25 +00:00 |
|
Jake Poznanski
|
5f9b2341c7
|
Some prompt tweaks I thought of for next time
|
2024-09-25 20:47:33 +00:00 |
|
Jake Poznanski
|
8ebe751196
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-25 20:28:05 +00:00 |
|
Jake Poznanski
|
ed502549d3
|
Adding script to convert silver data that we send to openai into something we can run through mise/birr
|
2024-09-25 20:27:49 +00:00 |
|
Jake Poznanski
|
3a5b438a6f
|
Lora misconfiguration
|
2024-09-25 10:48:39 -07:00 |
|
Jake Poznanski
|
86813fe210
|
Filtering off the weird tail ends of the distribution to make training smoother
|
2024-09-25 09:49:03 -07:00 |
|
Jake Poznanski
|
5f313266a4
|
Adding linear layers from visual network to target modules LORA
|
2024-09-25 09:09:24 -07:00 |
|
Jake Poznanski
|
b2341ed4f4
|
Merge branch 'main' of https://github.com/allenai/pdelfin into main
|
2024-09-25 09:05:46 -07:00 |
|
Jake Poznanski
|
9cbc128553
|
Sampling some sequence lengths
|
2024-09-25 09:05:11 -07:00 |
|
Jake Poznanski
|
d0deac5ea7
|
Lora config
|
2024-09-25 08:34:58 -07:00 |
|
Jake Poznanski
|
07c0323c91
|
Adding lora config to try to address OOMs
|
2024-09-25 07:57:01 -07:00 |
|
Jake Poznanski
|
ea0226c499
|
More flexibility in dataloader dims
|
2024-09-24 19:47:13 -07:00 |
|
Jake Poznanski
|
ff3d6aa61a
|
Merge remote-tracking branch 'origin/main'
|
2024-09-24 15:52:45 -07:00 |
|
Jake Poznanski
|
f6905c39ea
|
Hopefully the last changes
|
2024-09-24 15:52:34 -07:00 |
|
Jake Poznanski
|
4fb78c29ef
|
Fixing runeval to work with qwen2vl batch inferences
|
2024-09-24 22:11:53 +00:00 |
|
Jake Poznanski
|
2579931ae2
|
Merge branch 'main' of https://github.com/allenai/pdelfin
|
2024-09-24 21:57:52 +00:00 |
|
Jake Poznanski
|
a50ffe27c9
|
Adding in eval scripts from oe-data-internal now all in one place
|
2024-09-24 21:57:51 +00:00 |
|
Jake Poznanski
|
ea731055d7
|
More realistic configuration
|
2024-09-24 14:50:23 -07:00 |
|
Jake Poznanski
|
0442a33209
|
New images work much better now, and device map fix
|
2024-09-24 12:58:18 -07:00 |
|
Jake Poznanski
|
6173f55845
|
More typos
|
2024-09-24 11:58:10 -07:00 |
|
Jake Poznanski
|
0d9917367b
|
Flash attention as part of the image
|
2024-09-24 11:57:56 -07:00 |
|
Jake Poznanski
|
3c8e05362f
|
New image, don't need to install
|
2024-09-24 11:30:19 -07:00 |
|
Jake Poznanski
|
66c29dd44f
|
Moving to making a new dockerfile
|
2024-09-24 11:24:14 -07:00 |
|