Jake Poznanski
|
500bd2de5b
|
flash attn
|
2024-10-30 22:33:10 +00:00 |
|
Jake Poznanski
|
d45b34fdd5
|
Trust remote code
|
2024-10-30 21:22:39 +00:00 |
|
Jake Poznanski
|
8f001bf74c
|
Config updates
|
2024-10-30 14:02:57 -07:00 |
|
Jake Poznanski
|
6a4a55f9e0
|
Hopefully working molmo HF trainer config
|
2024-10-30 14:00:27 -07:00 |
|
Jake Poznanski
|
bede854cd5
|
Starting to write molmo formatters
|
2024-10-30 13:24:11 -07:00 |
|
Jake Poznanski
|
e65747e591
|
Some better logging
|
2024-10-30 11:22:52 -07:00 |
|
Jake Poznanski
|
43aa4f2508
|
Proper selection of LoRA weights
|
2024-10-30 10:42:53 -07:00 |
|
Jake Poznanski
|
bcb47946e5
|
Starting on molmo changes
|
2024-10-30 08:39:48 -07:00 |
|
Jake Poznanski
|
f13d0a5741
|
List configs to list
|
2024-10-24 03:07:32 +00:00 |
|
Jake Poznanski
|
180dde03c5
|
dataprep sampling tests
|
2024-10-23 22:53:05 +00:00 |
|
Jake Poznanski
|
64041bd6d7
|
Allow sampling different anchor text lens
|
2024-10-23 15:37:23 -07:00 |
|
Jake Poznanski
|
6a22900b8a
|
Allow for sampling anchor and other params
|
2024-10-23 22:26:12 +00:00 |
|
Jake Poznanski
|
f44dbd15ef
|
Small fixes
|
2024-10-21 16:45:06 +00:00 |
|
Jake Poznanski
|
a4822718ea
|
train more steps
|
2024-10-19 14:12:44 +00:00 |
|
Jake Poznanski
|
c9ac48bd9d
|
Try to save at the last second only
|
2024-10-19 02:07:57 +00:00 |
|
Jake Poznanski
|
3ecbeae6dc
|
Trying save to s3 but with threaded saver
|
2024-10-17 21:39:01 +00:00 |
|
Jake Poznanski
|
89fcff233a
|
Fixing saving bug again
|
2024-10-17 20:37:28 +00:00 |
|
Jake Poznanski
|
529d51d57d
|
Put LR back, need to save larger checkpoints to weka to prevent timeouts
|
2024-10-17 19:46:25 +00:00 |
|
Jake Poznanski
|
e141c91e5e
|
Try lora run higher LR
|
2024-10-17 17:12:35 +00:00 |
|
Jake Poznanski
|
124aaf5fe0
|
Hmm, can't repro failing anchor case
|
2024-10-17 17:00:02 +00:00 |
|
Jake Poznanski
|
1c42a08d06
|
Fixes to prevent errors later in dataloading
|
2024-10-17 02:28:43 +00:00 |
|
Jake Poznanski
|
f13bcad943
|
Adding check that pdfs are valid in the new anchor text generation format
|
2024-10-16 23:31:40 +00:00 |
|
Jake Poznanski
|
5018d591f6
|
will try lower lr
|
2024-10-16 23:27:00 +00:00 |
|
Jake Poznanski
|
5c36c22bf7
|
Prepping for more training
|
2024-10-16 23:01:40 +00:00 |
|
Jake Poznanski
|
277723fa2c
|
Adding cache
|
2024-10-16 21:18:52 +00:00 |
|
Jake Poznanski
|
87182ab573
|
Ensuring unique names
|
2024-10-16 20:44:23 +00:00 |
|
Jake Poznanski
|
4884b8288b
|
Full dataset
|
2024-10-16 13:30:25 -07:00 |
|
Jake Poznanski
|
51f1669451
|
fix
|
2024-10-16 13:30:06 -07:00 |
|
Jake Poznanski
|
d94713e73e
|
Truncation handled in a custom collator
|
2024-10-16 13:28:12 -07:00 |
|
Jake Poznanski
|
cbc667ce78
|
Prepping to train
|
2024-10-16 13:18:24 -07:00 |
|
Jake Poznanski
|
9d647b13b8
|
fix
|
2024-10-16 11:58:35 -07:00 |
|
Jake Poznanski
|
446773dbc8
|
First part of new dataloader
|
2024-10-16 11:54:06 -07:00 |
|
Jake Poznanski
|
d4f64ed82a
|
Config work
|
2024-10-16 18:37:52 +00:00 |
|
Jake Poznanski
|
3c1b7de293
|
Refactoring of train dataloaders
|
2024-10-16 18:26:25 +00:00 |
|
Jake Poznanski
|
23d129fd2c
|
Organizing around a new style of dataloader
|
2024-10-16 18:06:27 +00:00 |
|
Jake Poznanski
|
fc8fcfaeba
|
Fixing dataloader hopefully
|
2024-10-15 15:13:25 +00:00 |
|
Jake Poznanski
|
7b161533e2
|
Code to do local inference on fine tuned models for testing
|
2024-10-14 08:38:18 -07:00 |
|
Jake Poznanski
|
2dccc4be3b
|
Oops removing print
|
2024-10-11 16:23:14 +00:00 |
|
Jake Poznanski
|
a8b50ae8fa
|
Preloading the datasets directly
|
2024-10-10 19:57:51 +00:00 |
|
Jake Poznanski
|
2864f907e1
|
Dataloader fix with nicer tests
|
2024-10-10 16:58:45 +00:00 |
|
Jake Poznanski
|
7c19a9a856
|
fix
|
2024-10-08 23:54:17 +00:00 |
|
Jake Poznanski
|
ad10add6c1
|
try lower lr
|
2024-10-08 23:52:56 +00:00 |
|
Jake Poznanski
|
230c8a9f9a
|
Trying new run that will rewrite the prompts as it goes
|
2024-10-08 22:10:18 +00:00 |
|
Jake Poznanski
|
085937859f
|
Lower lr
|
2024-10-08 17:52:00 +00:00 |
|
Jake Poznanski
|
f5fd9ff53a
|
Trying grad checkpoint
|
2024-10-08 16:11:31 +00:00 |
|
Jake Poznanski
|
fb4e585e9f
|
Trying out non-lora training
|
2024-10-08 15:20:37 +00:00 |
|
Jake Poznanski
|
ec09408ca9
|
Filtering based on cpu count
|
2024-10-07 15:40:29 -07:00 |
|
Jake Poznanski
|
a90eb94951
|
Fix dataloader bug
|
2024-10-07 15:25:48 -07:00 |
|
Jake Poznanski
|
3d36545fa5
|
loading fix for parquets again...
|
2024-10-07 14:48:53 -07:00 |
|
Jake Poznanski
|
fdcd77eadd
|
typo
|
2024-10-07 14:32:47 -07:00 |
|