12 Commits

Author | SHA1 | Message | Date
Jake Poznanski | 64041bd6d7 | Allow sampling different anchor text lens | 2024-10-23 15:37:23 -07:00
Jake Poznanski | 6a22900b8a | Allow for sampling anchor and other params | 2024-10-23 22:26:12 +00:00
Jake Poznanski | 124aaf5fe0 | Hmm, cant repro failing anchor case | 2024-10-17 17:00:02 +00:00
Jake Poznanski | 446773dbc8 | First part of new dataloader | 2024-10-16 11:54:06 -07:00
Jake Poznanski | d4f64ed82a | Config work | 2024-10-16 18:37:52 +00:00
Jake Poznanski | dc26541da2 | Starting code to build parquets... | 2024-10-07 20:59:43 +00:00
Jake Poznanski | 8f1fa4f796 | Running a mini config again with metric | 2024-10-03 11:12:30 -07:00
Jake Poznanski | 0ddaf9023d | Getting ready to launch a new training run | 2024-10-02 23:04:56 +00:00
Jake Poznanski | ea3af0143c | Loading dataset from config now | 2024-09-23 09:40:24 -07:00
Jake Poznanski | 256d77c232 | Hoping to get a basic hf Trainer to run | 2024-09-20 15:53:11 -07:00
Jake Poznanski | 84e68f313e | Basic forward generation pass with openai dataset and qwen2vl | 2024-09-19 22:16:59 +00:00
Jake Poznanski | 7d2c447dd3 | Importing core training config stuff from dolma refine | 2024-09-19 21:55:07 +00:00