18 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Jake Poznanski | 24b30b2333 | Prepping for 7b training | 2024-09-25 20:51:25 +00:00 |
| Jake Poznanski | 0442a33209 | New images work much better now, and device map fix | 2024-09-24 12:58:18 -07:00 |
| Jake Poznanski | 45f691c718 | Starting batch inference script to measure performance, train script using proper model from config now | 2024-09-24 08:40:46 -07:00 |
| Jake Poznanski | f78d021f50 | Should be merging the LoRA adapters back into the model for the final checkpoint | 2024-09-23 12:55:01 -07:00 |
| Jake Poznanski | 5967a525fd | Flash attention and mixed precision training, works quite a bit faster | 2024-09-23 11:26:18 -07:00 |
| Jake Poznanski | dc71b28ddd | No need to save tokenizer | 2024-09-23 10:06:04 -07:00 |
| Jake Poznanski | 5916239cd8 | Typos | 2024-09-23 09:43:36 -07:00 |
| Jake Poznanski | ea3af0143c | Loading dataset from config now | 2024-09-23 09:40:24 -07:00 |
| Jake Poznanski | ab9458b913 | Basic LoRA trainer, doesn't seem to make any speed difference | 2024-09-23 09:08:00 -07:00 |
| Jake Poznanski | 3ed14a9ea5 | Prepping new training stuff | 2024-09-23 08:53:56 -07:00 |
| Jake Poznanski | 256d77c232 | Hoping to get a basic HF Trainer to run | 2024-09-20 15:53:11 -07:00 |
| Jake Poznanski | 55035b02c9 | Tries to run a forward pass but OOMs | 2024-09-20 15:05:23 -07:00 |
| Jake Poznanski | a47afe5c8d | Adding test to make sure the training and inference time tokenization stays identical, currently failing | 2024-09-20 12:01:05 -07:00 |
| Jake Poznanski | fcb67ebd61 | Prepping data to be in a trainable format | 2024-09-20 09:25:54 -07:00 |
| Jake Poznanski | 84e68f313e | Basic forward generation pass with openai dataset and qwen2vl | 2024-09-19 22:16:59 +00:00 |
| Jake Poznanski | 7d2c447dd3 | Importing core training config stuff from dolma refine | 2024-09-19 21:55:07 +00:00 |
| Jake Poznanski | bab32aa9b3 | Formatting | 2024-09-18 22:52:42 +00:00 |
| Jake Poznanski | d22b311340 | Starting to write dataloader for visual lm data | 2024-09-18 21:42:09 +00:00 |