1589 Commits

Author SHA1 Message Date
Jake Poznanski
4785759ba1 Adding some souping support to prepare checkpoint 2025-07-16 18:16:47 +00:00
Jake Poznanski
2b638559a1 Compare has better downloader 2025-07-16 17:56:39 +00:00
Jake Poznanski
0b40bd3528 Better docker ignore 2025-07-16 17:36:32 +00:00
Jake Poznanski
d21a164bac Fixing async stuff 2025-07-16 17:20:07 +00:00
Jake Poznanski
3ca305d0b8 Adding some souping configs 2025-07-16 03:42:55 +00:00
Jake Poznanski
c0bf3105df Fixing import 2025-07-16 03:36:12 +00:00
Jake Poznanski
31c834dcdd Constants 2025-07-16 02:15:17 +00:00
Jake Poznanski
5ea4e8a6e2 Compare vllm script 2025-07-15 22:55:49 +00:00
Jake Poznanski
939a76a4d1 Adding a compare vllm checkpoint script 2025-07-15 22:38:55 +00:00
Jake Poznanski
24608956a0 Working on comparing to vllm 2025-07-15 22:21:54 +00:00
Jake Poznanski
e6c98236b6 Adding more pipeline retry stats, compress code fixed 2025-07-15 21:41:10 +00:00
Jake Poznanski
4dbbf91e1c Compression script 2025-07-15 21:26:15 +00:00
Jake Poznanski
feb2daba9a Adjust config 2025-07-15 21:06:53 +00:00
Jake Poznanski
022f437b41 w8a8-int8 version 2025-07-15 21:02:35 +00:00
Jake Poznanski
5a4a836be7 Calibration 2025-07-15 20:59:33 +00:00
Jake Poznanski
9115a02c67 Fixes 2025-07-15 20:46:40 +00:00
Jake Poznanski
4b0960bb69 Test 2025-07-15 20:43:43 +00:00
Jake Poznanski
ee69faa87d Dataset 2025-07-15 20:39:13 +00:00
Jake Poznanski
bd92f08cd7 Errors propagated 2025-07-15 20:32:16 +00:00
Jake Poznanski
fcd373d831 Calibration stuff 2025-07-15 20:27:05 +00:00
Jake Poznanski
2218bf8460 Merge branch 'jakep/new_trainer_vllm092' into jakep/new_trainer 2025-07-15 19:51:29 +00:00
Jake Poznanski
b5f480d19d Working on calibration set for compressor, seems like qwen2.5 is not working 2025-07-15 18:59:48 +00:00
Jake Poznanski
e77bcd20ab Upping vllm versions 2025-07-15 18:43:38 +00:00
Jake Poznanski
3f9fc8bd1b Better compressor hopefully 2025-07-15 18:08:17 +00:00
Jake Poznanski
287c8278f5 Starting to cleanup and merge yaml front matter stuff in 2025-07-15 18:00:01 +00:00
Jake Poznanski
1092213c5f Merge branch 'jakep/new_traininer_nojson_newprompt' into jakep/new_trainer 2025-07-15 17:44:55 +00:00
Jake Poznanski
679063aba5 Adding some more logging to compressor 2025-07-15 17:42:33 +00:00
Jake Poznanski
43ae28dde4 Prepare checkpoint works for older models too 2025-07-14 21:30:32 +00:00
Jake Poznanski
f306a52fe1 Compress fix 2025-07-14 21:16:48 +00:00
Jake Poznanski
f014c2aaf9 Need to reserve all 8 gpus for reliable performance benchmark, even if you only use 1 2025-07-14 21:02:14 +00:00
Jake Poznanski
01360ba21d Compressor script 2025-07-14 20:56:51 +00:00
Jake Poznanski
1ede76d0b2 Cleaning up compress and prepare checkpoint scripts 2025-07-14 20:36:20 +00:00
Jake Poznanski
2674162d02 New prompt test 2025-07-14 17:35:29 +00:00
Jake Poznanski
a5a0cd7478 Trying a few more configs 2025-07-11 20:19:48 +00:00
Jake Poznanski
384a1b19c7 Qwen 2 config too 2025-07-11 17:20:59 +00:00
Jake Poznanski
24a3fb87e8 128batch config, wsd config 2025-07-11 17:19:02 +00:00
Jake Poznanski
0c773c40af Let's do a 1280 no anchor yaml 2025-07-10 21:44:39 +00:00
Jake Poznanski
65d0edcaae Adding guided decoding option 2025-07-10 15:13:26 +00:00
Jake Poznanski
da5f8f2f78 wsd config 2025-07-10 01:13:54 +00:00
Jake Poznanski
336b000416 Adding wsd as an option 2025-07-09 22:35:57 +00:00
Jake Poznanski
69581cca23 More config fixes 2025-07-09 17:59:59 +00:00
Jake Poznanski
ca8e503870 Ugh, lost some training runs because files got saved to the wrong place 2025-07-09 17:57:34 +00:00
Jake Poznanski
02f0706edc Reverting back to json pipeline as it seems better by default 2025-07-09 17:46:54 +00:00
Luca G
073cdd066b Expose --gpu_memory_utilization / --max_model_len flags and startup hint 2025-07-05 10:42:52 +02:00
Jake Poznanski
8ae9104bb3 Calling it with a new name 2025-07-03 23:04:58 +00:00
Jake Poznanski
3976cee141 Adding 8192 cap on day2 config 2025-07-03 23:04:29 +00:00
Jake Poznanski
ca2609cb52 No doc anchoring version 2025-07-03 18:24:16 +00:00
Jake Poznanski
560a585523 Configs with proper names 2025-07-03 18:12:09 +00:00
Jake Poznanski
53cc1a0ba9 Fixed json configuration 2025-07-03 18:01:28 +00:00
Jake Poznanski
2c54c6d06c Allow unicode in json 2025-07-03 16:43:51 +00:00