10 Commits

Author SHA1 Message Date
Jake Poznanski
decfd7fbc1 Fixing the refiner input prompt to something simpler that doesn't depend on the training data. Fixing beaker job workspace and bumping priority to high. 2024-09-27 22:54:07 +00:00
Jake Poznanski
22b765e6be Going back to non-iterable dataset so shuffling works better, applying a light filter 2024-09-27 15:48:56 +00:00
Jake Poznanski
c00e40d1c4 More fixes 2024-09-26 23:10:07 +00:00
Jake Poznanski
d098a87ed2 Column name fix 2024-09-26 22:29:19 +00:00
Jake Poznanski
61dd7bb61f Fix for map in iterable mode 2024-09-26 20:44:47 +00:00
Jake Poznanski
cf1aa0176e Proper use of iterable_dataset 2024-09-26 19:55:54 +00:00
Jake Poznanski
9cbc128553 Sampling some sequence lengths 2024-09-25 09:05:11 -07:00
Jake Poznanski
bab32aa9b3 Formatting 2024-09-18 22:52:42 +00:00
Jake Poznanski
f4d18cb287 Dataloader capable of loading 38k rows reasonably fast 2024-09-18 22:48:38 +00:00
Jake Poznanski
d22b311340 Starting to write dataloader for visual lm data 2024-09-18 21:42:09 +00:00