Jake Poznanski
|
500bd2de5b
|
flash attn
|
2024-10-30 22:33:10 +00:00 |
|
Jake Poznanski
|
6a4a55f9e0
|
Hopefully working molmo HF trainer config
|
2024-10-30 14:00:27 -07:00 |
|
Jake Poznanski
|
e65747e591
|
Some better logging
|
2024-10-30 11:22:52 -07:00 |
|
Jake Poznanski
|
bcb47946e5
|
Starting on molmo changes
|
2024-10-30 08:39:48 -07:00 |
|
Jake Poznanski
|
89fcff233a
|
Fixing saving bug again
|
2024-10-17 20:37:28 +00:00 |
|
Jake Poznanski
|
51f1669451
|
fix
|
2024-10-16 13:30:06 -07:00 |
|
Jake Poznanski
|
d94713e73e
|
Truncation handled in a custom collator
|
2024-10-16 13:28:12 -07:00 |
|
Jake Poznanski
|
cbc667ce78
|
Prepping to train
|
2024-10-16 13:18:24 -07:00 |
|
Jake Poznanski
|
446773dbc8
|
First part of new dataloader
|
2024-10-16 11:54:06 -07:00 |
|
Jake Poznanski
|
fdcd77eadd
|
typo
|
2024-10-07 14:32:47 -07:00 |
|
Jake Poznanski
|
7416b42023
|
Adding support for parquet datasets which are precached
|
2024-10-07 21:14:33 +00:00 |
|
Jake Poznanski
|
4557a5b296
|
Typo
|
2024-10-07 13:03:31 -07:00 |
|
Jake Poznanski
|
ebd40f9084
|
Hopefully fixing dataloader for now
|
2024-10-07 12:59:27 -07:00 |
|
Jake Poznanski
|
974ddd3773
|
I'm pretty sure we only need to save on rank0
|
2024-10-03 11:30:44 -07:00 |
|
Jake Poznanski
|
8f1fa4f796
|
Running a mini config again with metric
|
2024-10-03 11:12:30 -07:00 |
|
Jake Poznanski
|
046d4a4534
|
Adding eval on start and seed params
|
2024-10-03 10:54:25 -07:00 |
|
Jake Poznanski
|
0ddaf9023d
|
Getting ready to launch a new training run
|
2024-10-02 23:04:56 +00:00 |
|
Jake Poznanski
|
e53f782b0f
|
DatasetDict fix
|
2024-09-28 03:38:29 +00:00 |
|
Jake Poznanski
|
22b765e6be
|
Going back to a non-iterable dataset so shuffling works better, and applying a light filter
|
2024-09-27 15:48:56 +00:00 |
|
Jake Poznanski
|
65a9c9981e
|
Hopefully will train now
|
2024-09-27 15:16:12 +00:00 |
|
Jake Poznanski
|
e864b9d88f
|
weird dataloader stuff now
|
2024-09-27 02:53:59 +00:00 |
|
Jake Poznanski
|
37f10051f6
|
typo
|
2024-09-27 01:19:21 +00:00 |
|
Jake Poznanski
|
c00e40d1c4
|
More fixes
|
2024-09-26 23:10:07 +00:00 |
|
Jake Poznanski
|
d098a87ed2
|
Column name fix
|
2024-09-26 22:29:19 +00:00 |
|
Jake Poznanski
|
84e9da637c
|
Removing lambda due to pickling errors
|
2024-09-26 21:39:08 +00:00 |
|
Jake Poznanski
|
61dd7bb61f
|
Fix for map in iterable mode
|
2024-09-26 20:44:47 +00:00 |
|
Jake Poznanski
|
49efa5cb40
|
Typo
|
2024-09-26 19:57:53 +00:00 |
|
Jake Poznanski
|
cf1aa0176e
|
Proper use of iterable_dataset
|
2024-09-26 19:55:54 +00:00 |
|
Jake Poznanski
|
05fdb81da2
|
map and filter on iterable dataset
|
2024-09-26 19:01:34 +00:00 |
|
Jake Poznanski
|
24b30b2333
|
Prepping for 7b training
|
2024-09-25 20:51:25 +00:00 |
|
Jake Poznanski
|
0442a33209
|
New images work much better now, and device map fix
|
2024-09-24 12:58:18 -07:00 |
|
Jake Poznanski
|
45f691c718
|
Starting batch inference script to measure performance, train script using proper model from config now
|
2024-09-24 08:40:46 -07:00 |
|
Jake Poznanski
|
f78d021f50
|
Should be merging the LoRA adapters back into the model for the final checkpoint
|
2024-09-23 12:55:01 -07:00 |
|
Jake Poznanski
|
5967a525fd
|
Flash attention and mixed precision training, works quite a bit faster
|
2024-09-23 11:26:18 -07:00 |
|
Jake Poznanski
|
dc71b28ddd
|
No need to save tokenizer
|
2024-09-23 10:06:04 -07:00 |
|
Jake Poznanski
|
5916239cd8
|
typos
|
2024-09-23 09:43:36 -07:00 |
|
Jake Poznanski
|
ea3af0143c
|
Loading dataset from config now
|
2024-09-23 09:40:24 -07:00 |
|
Jake Poznanski
|
ab9458b913
|
Basic LoRA trainer, doesn't seem to make any speed difference
|
2024-09-23 09:08:00 -07:00 |
|
Jake Poznanski
|
3ed14a9ea5
|
Prepping new training stuff
|
2024-09-23 08:53:56 -07:00 |
|
Jake Poznanski
|
256d77c232
|
Hoping to get a basic HF Trainer to run
|
2024-09-20 15:53:11 -07:00 |
|
Jake Poznanski
|
55035b02c9
|
Tries to run a forward pass but OOMs
|
2024-09-20 15:05:23 -07:00 |
|
Jake Poznanski
|
a47afe5c8d
|
Adding test to make sure the training and inference time tokenization stays identical, currently failing
|
2024-09-20 12:01:05 -07:00 |
|
Jake Poznanski
|
fcb67ebd61
|
Prepping data to be in a trainable format
|
2024-09-20 09:25:54 -07:00 |
|
Jake Poznanski
|
84e68f313e
|
Basic forward generation pass with openai dataset and qwen2vl
|
2024-09-19 22:16:59 +00:00 |
|
Jake Poznanski
|
7d2c447dd3
|
Importing core training config stuff from dolma refine
|
2024-09-19 21:55:07 +00:00 |
|
Jake Poznanski
|
bab32aa9b3
|
Formatting
|
2024-09-18 22:52:42 +00:00 |
|
Jake Poznanski
|
d22b311340
|
Starting to write dataloader for visual lm data
|
2024-09-18 21:42:09 +00:00 |
|