Jake Poznanski | 07c0323c91 | Adding LoRA config to try to address OOMs | 2024-09-25 07:57:01 -07:00
Jake Poznanski | ea0226c499 | More flexibility in dataloader dims | 2024-09-24 19:47:13 -07:00
Jake Poznanski | f6905c39ea | Hopefully the last changes | 2024-09-24 15:52:34 -07:00
Jake Poznanski | ea731055d7 | More realistic configuration | 2024-09-24 14:50:23 -07:00
Jake Poznanski | 0442a33209 | New images work much better now, and device map fix | 2024-09-24 12:58:18 -07:00
Jake Poznanski | bf1239deea | Use mini dataset now for testing | 2024-09-24 10:55:03 -07:00
Jake Poznanski | 596fc55628 | Enabling model eval | 2024-09-24 10:48:53 -07:00
Jake Poznanski | 5a0bcb7b1d | Batch inference slowness | 2024-09-24 09:13:47 -07:00
Jake Poznanski | 28bcf72e11 | Hoping to get a quick batch inference pipeline rolling | 2024-09-24 08:56:36 -07:00
Jake Poznanski | 45f691c718 | Starting batch inference script to measure performance; train script using proper model from config now | 2024-09-24 08:40:46 -07:00
Jake Poznanski | a30ca16e1f | Script adjustment | 2024-09-23 14:41:35 -07:00
Jake Poznanski | a3feca01fc | Setting up for a real train run | 2024-09-23 14:32:10 -07:00
Jake Poznanski | f78d021f50 | Should be merging the LoRA adapters back into the model for the final checkpoint | 2024-09-23 12:55:01 -07:00
Jake Poznanski | 5967a525fd | Flash attention and mixed precision training; works quite a bit faster | 2024-09-23 11:26:18 -07:00
Jake Poznanski | 45e5823823 | Much happier GPU utilization | 2024-09-23 10:44:25 -07:00
Jake Poznanski | dc71b28ddd | No need to save tokenizer | 2024-09-23 10:06:04 -07:00
Jake Poznanski | 5916239cd8 | Typos | 2024-09-23 09:43:36 -07:00
Jake Poznanski | ea3af0143c | Loading dataset from config now | 2024-09-23 09:40:24 -07:00
Jake Poznanski | ab9458b913 | Basic LoRA trainer; doesn't seem to make any speed difference | 2024-09-23 09:08:00 -07:00
Jake Poznanski | 3ed14a9ea5 | Prepping new training stuff | 2024-09-23 08:53:56 -07:00
Jake Poznanski | b915e7de00 | Smaller config for now, fixing a few requirements | 2024-09-23 08:20:08 -07:00
Jake Poznanski | 256d77c232 | Hoping to get a basic HF Trainer to run | 2024-09-20 15:53:11 -07:00
Jake Poznanski | 55035b02c9 | Tries to run a forward pass but OOMs | 2024-09-20 15:05:23 -07:00
Jake Poznanski | 4eddb1b45f | Okay, reasonably happy with the data prep pipeline | 2024-09-20 13:04:47 -07:00
Jake Poznanski | a47afe5c8d | Adding test to make sure the training and inference-time tokenization stays identical; currently failing | 2024-09-20 12:01:05 -07:00
Jake Poznanski | fcb67ebd61 | Prepping data to be in a trainable format | 2024-09-20 09:25:54 -07:00
Jake Poznanski | 84e68f313e | Basic forward generation pass with OpenAI dataset and Qwen2-VL | 2024-09-19 22:16:59 +00:00
Jake Poznanski | 7d2c447dd3 | Importing core training config stuff from dolma refine | 2024-09-19 21:55:07 +00:00
Jake Poznanski | bab32aa9b3 | Formatting | 2024-09-18 22:52:42 +00:00
Jake Poznanski | f4d18cb287 | Dataloader capable of loading 38k rows reasonably fast | 2024-09-18 22:48:38 +00:00
Jake Poznanski | d22b311340 | Starting to write dataloader for visual LM data | 2024-09-18 21:42:09 +00:00