Jake Poznanski
|
da5f8f2f78
|
wsd config
|
2025-07-10 01:13:54 +00:00 |
|
Jake Poznanski
|
336b000416
|
Adding wsd as an option
|
2025-07-09 22:35:57 +00:00 |
|
Jake Poznanski
|
69581cca23
|
More config fixes
|
2025-07-09 17:59:59 +00:00 |
|
Jake Poznanski
|
ca8e503870
|
Ugh, lost some training runs because files got saved to the wrong place
|
2025-07-09 17:57:34 +00:00 |
|
Jake Poznanski
|
02f0706edc
|
Reverting back to json pipeline as it seems better by default
|
2025-07-09 17:46:54 +00:00 |
|
Luca G
|
073cdd066b
|
Expose --gpu_memory_utilization / --max_model_len flags and startup hint
|
2025-07-05 10:42:52 +02:00 |
|
Jake Poznanski
|
8ae9104bb3
|
Calling it with a new name
|
2025-07-03 23:04:58 +00:00 |
|
Jake Poznanski
|
3976cee141
|
Adding 8192 cap on day2 config
|
2025-07-03 23:04:29 +00:00 |
|
Jake Poznanski
|
ca2609cb52
|
No doc anchoring version
|
2025-07-03 18:24:16 +00:00 |
|
Jake Poznanski
|
560a585523
|
Configs with proper names
|
2025-07-03 18:12:09 +00:00 |
|
Jake Poznanski
|
53cc1a0ba9
|
Fixed json configuration
|
2025-07-03 18:01:28 +00:00 |
|
Jake Poznanski
|
2c54c6d06c
|
Allow unicode in json
|
2025-07-03 16:43:51 +00:00 |
|
Jake Poznanski
|
b1ab9964ee
|
Day 2 json config
|
2025-07-03 16:42:17 +00:00 |
|
Jake Poznanski
|
a1c2ee82a6
|
More workers by default
|
2025-07-03 16:36:07 +00:00 |
|
Jake Poznanski
|
d26ae4bb4d
|
Easier way to test configs
|
2025-07-03 16:30:25 +00:00 |
|
Jake Poznanski
|
a7e2f719bf
|
Start a preemptible one at least once
|
2025-07-02 19:26:30 +00:00 |
|
Jake Poznanski
|
6d6476b31a
|
One idea for resume fix
|
2025-07-02 01:33:37 +00:00 |
|
Jake Poznanski
|
2a20607d37
|
Get rid of fused
|
2025-07-02 01:13:32 +00:00 |
|
Jake Poznanski
|
59f11c7e2e
|
Better names
|
2025-07-02 01:05:40 +00:00 |
|
Jake Poznanski
|
210d170b15
|
Adding a standard JSON output option
|
2025-07-01 22:13:06 +00:00 |
|
Jake Poznanski
|
6f2a426986
|
Fresh prompt configs
|
2025-07-01 21:24:58 +00:00 |
|
Jake Poznanski
|
5e8017b5cd
|
Oops
|
2025-07-01 21:15:45 +00:00 |
|
Jake Poznanski
|
4a6ef91b5e
|
Matching old trainer config
|
2025-07-01 21:15:17 +00:00 |
|
Jake Poznanski
|
5e2f703ee6
|
Trying some config changes
|
2025-07-01 21:01:34 +00:00 |
|
Jake Poznanski
|
94d7900887
|
Default configs are better
|
2025-07-01 20:36:06 +00:00 |
|
Jake Poznanski
|
56e51ea23a
|
Improving regex even more
|
2025-07-01 20:35:57 +00:00 |
|
Jake Poznanski
|
98df1d5fb7
|
Adding max length option
|
2025-07-01 20:22:59 +00:00 |
|
Jake Poznanski
|
abdc907a3c
|
Pipeline fix
|
2025-07-01 20:03:02 +00:00 |
|
Jake Poznanski
|
e691ea176c
|
Better regex for structured decoding, adding some new prompts to train with
|
2025-07-01 18:12:32 +00:00 |
|
Jake Poznanski
|
a651cf0ca6
|
Adding guided regex decoder
|
2025-07-01 17:44:02 +00:00 |
|
Jake Poznanski
|
748e2ae9eb
|
With yaml formatted responses, make sure response finishes with code stop
|
2025-07-01 17:31:21 +00:00 |
|
Jake Poznanski
|
9bf8e9e0fa
|
Preparing pipeline for new format
|
2025-07-01 17:01:33 +00:00 |
|
Jake Poznanski
|
c6c1fbd0eb
|
Better prepare checkpoint script
|
2025-07-01 16:44:19 +00:00 |
|
Jake Poznanski
|
8dcfdd0418
|
Checkpoint prep tool
|
2025-07-01 16:34:29 +00:00 |
|
Jake Poznanski
|
c029ccdbfb
|
Added a few more configs to try
|
2025-07-01 01:46:53 +00:00 |
|
Jake Poznanski
|
79a7818517
|
New trainer launch script for beaker
|
2025-07-01 01:43:38 +00:00 |
|
Jake Poznanski
|
dcf026a63c
|
Better script
|
2025-07-01 01:40:55 +00:00 |
|
Jake Poznanski
|
9f0f912101
|
Ugh
|
2025-06-30 23:37:35 +00:00 |
|
Jake Poznanski
|
1d007d1bf2
|
Perhaps fixing default config
|
2025-06-30 22:58:21 +00:00 |
|
Jake Poznanski
|
e7020c7f50
|
More configs
|
2025-06-30 22:49:42 +00:00 |
|
Jake Poznanski
|
7cf98794fd
|
Image 1600 configuration
|
2025-06-30 22:49:13 +00:00 |
|
Jake Poznanski
|
d2ef9d78f1
|
Four basic training configs for new version
|
2025-06-30 22:31:02 +00:00 |
|
Jake Poznanski
|
a3ad61bd4d
|
Small config updates
|
2025-06-30 22:22:49 +00:00 |
|
Jake Poznanski
|
ee8bd9b220
|
Better resume logic I hope
|
2025-06-30 22:18:15 +00:00 |
|
Jake Poznanski
|
208fabcb69
|
Validating on process pool
|
2025-06-30 22:10:59 +00:00 |
|
Jake Poznanski
|
4f46f10e0c
|
At least get resuming from checkpoints to work perhaps
|
2025-06-30 21:56:12 +00:00 |
|
Jake Poznanski
|
2375079758
|
Torch compile off, gives warnings and no speed boost, padding to do multi batch is not working either
|
2025-06-30 21:47:17 +00:00 |
|
Jake Poznanski
|
c11120a3fa
|
Trying to do batch size > 1
|
2025-06-30 21:37:50 +00:00 |
|
Jake Poznanski
|
5c2d69a3d7
|
Some cleanup stuff
|
2025-06-30 21:24:35 +00:00 |
|
Jake Poznanski
|
e86511e11b
|
Weka fix
|
2025-06-30 17:46:13 +00:00 |
|