1279 Commits

Author SHA1 Message Date
Jake Poznanski
6d6476b31a One idea for resume fix 2025-07-02 01:33:37 +00:00
Jake Poznanski
2a20607d37 Get rid of fused 2025-07-02 01:13:32 +00:00
Jake Poznanski
59f11c7e2e Better names 2025-07-02 01:05:40 +00:00
Jake Poznanski
210d170b15 Adding a standard JSON output option 2025-07-01 22:13:06 +00:00
Jake Poznanski
6f2a426986 Fresh prompt configs 2025-07-01 21:24:58 +00:00
Jake Poznanski
5e8017b5cd Oops 2025-07-01 21:15:45 +00:00
Jake Poznanski
4a6ef91b5e Matching old trainer config 2025-07-01 21:15:17 +00:00
Jake Poznanski
5e2f703ee6 Trying some config changes 2025-07-01 21:01:34 +00:00
Jake Poznanski
94d7900887 Default configs are better 2025-07-01 20:36:06 +00:00
Jake Poznanski
56e51ea23a Improving regex even more 2025-07-01 20:35:57 +00:00
Jake Poznanski
98df1d5fb7 Adding max length option 2025-07-01 20:22:59 +00:00
Jake Poznanski
abdc907a3c Pipeline fix 2025-07-01 20:03:02 +00:00
Jake Poznanski
e691ea176c Better regex for structured decoding, adding some new prompts to train with 2025-07-01 18:12:32 +00:00
Jake Poznanski
a651cf0ca6 Adding guided regex decoder 2025-07-01 17:44:02 +00:00
Jake Poznanski
748e2ae9eb With yaml formatted responses, make sure response finishes with code stop 2025-07-01 17:31:21 +00:00
Jake Poznanski
9bf8e9e0fa Preparing pipeline for new format 2025-07-01 17:01:33 +00:00
Jake Poznanski
c6c1fbd0eb Better prepare checkpoint script 2025-07-01 16:44:19 +00:00
Jake Poznanski
8dcfdd0418 Checkpoint prep tool 2025-07-01 16:34:29 +00:00
Jake Poznanski
c029ccdbfb Added a few more configs to try 2025-07-01 01:46:53 +00:00
Jake Poznanski
79a7818517 New trainer launch script for beaker 2025-07-01 01:43:38 +00:00
Jake Poznanski
dcf026a63c Better script 2025-07-01 01:40:55 +00:00
Jake Poznanski
9f0f912101 Ugh 2025-06-30 23:37:35 +00:00
Jake Poznanski
1d007d1bf2 Perhaps fixing default config 2025-06-30 22:58:21 +00:00
Jake Poznanski
e7020c7f50 More configs 2025-06-30 22:49:42 +00:00
Jake Poznanski
7cf98794fd Image 1600 configuration 2025-06-30 22:49:13 +00:00
Jake Poznanski
d2ef9d78f1 Four basic training configs for new version 2025-06-30 22:31:02 +00:00
Jake Poznanski
a3ad61bd4d Small config updates 2025-06-30 22:22:49 +00:00
Jake Poznanski
ee8bd9b220 Better resume logic I hope 2025-06-30 22:18:15 +00:00
Jake Poznanski
208fabcb69 Validating on process pool 2025-06-30 22:10:59 +00:00
Jake Poznanski
4f46f10e0c At least get resuming from checkpoints to work perhaps 2025-06-30 21:56:12 +00:00
Jake Poznanski
2375079758 Torch compile off, gives warnings and no speed boost, padding to do multi batch is not working either 2025-06-30 21:47:17 +00:00
Jake Poznanski
c11120a3fa Trying to do batch size > 1 2025-06-30 21:37:50 +00:00
Jake Poznanski
5c2d69a3d7 Some cleanup stuff 2025-06-30 21:24:35 +00:00
Jake Poznanski
e86511e11b Weka fix 2025-06-30 17:46:13 +00:00
Jake Poznanski
656dbef833 Frontier configs 2025-06-30 17:43:30 +00:00
Jake Poznanski
e2f2d36e4f More typos 2025-06-30 17:41:19 +00:00
Jake Poznanski
ea72ea2645 Ugh stupid fix 2025-06-30 17:40:19 +00:00
Jake Poznanski
55a737ca6b script 2025-06-30 17:32:01 +00:00
Jake Poznanski
ba49fd53d9 frontier train script let's see what happens 2025-06-30 17:30:17 +00:00
Jake Poznanski
bde6f2955e Bf16 only 2025-06-30 17:25:53 +00:00
Jake Poznanski
44dd966850 Wandb fixes 2025-06-30 17:23:47 +00:00
Jake Poznanski
f8071c7457 Loss config 2025-06-30 17:17:48 +00:00
Jake Poznanski
a3997419b3 Naming config entries better 2025-06-30 17:15:58 +00:00
Jake Poznanski
8e5e18f54c Checking that anchor text works for each pdf page when initializing dataloader 2025-06-30 16:29:33 +00:00
Jake Poznanski
dc7fff5bf7 Collator fix 2025-06-29 19:52:53 +00:00
Jake Poznanski
12b5cc3101 Lowering size of default data load for testing 2025-06-28 23:09:44 +00:00
Jake Poznanski
c36b5df2af Cleanup collator 2025-06-28 22:46:12 +00:00
Jake Poznanski
887190e961 Cleanup 2025-06-27 21:54:31 +00:00
Jake Poznanski
330f465d5d Small fixes 2025-06-27 21:53:06 +00:00
Jake Poznanski
214c44df36 Reporting to wandb, better eval dataset loading 2025-06-27 21:16:22 +00:00