1362 Commits

Author SHA1 Message Date
Jake Poznanski
538cb7645d Update README.md 2025-07-23 09:43:41 -07:00
Jake Poznanski
b4c5913772 Bump version to v0.2.0 for release v0.2.0 2025-07-23 15:37:18 +00:00
Jake Poznanski
35a5329516 New version 2025-07-23 15:37:08 +00:00
Jake Poznanski
0a6b2fe42d Lints - bringing back files 2025-07-23 04:49:37 +00:00
Jake Poznanski
56296d6927 Bringing back a few files 2025-07-23 04:49:13 +00:00
Jake Poznanski
6e8272413c Lint fixes 2025-07-23 03:40:05 +00:00
Jake Poznanski
5ec49672ea New default model 2025-07-23 03:33:25 +00:00
Jake Poznanski
a4752b5ef9 Merge remote-tracking branch 'origin/main' into jakep/new_trainer 2025-07-23 03:32:49 +00:00
Jake Poznanski
61a13f96cf Merge pull request #270 from lgibelli/fix-kv-cache-under-allocation: Expose --gpu_memory_utilization / --max_model_len flags and startup hint 2025-07-22 12:40:54 -07:00
Jake Poznanski
9ef3fd704a Adjusting temp by attempt 2025-07-22 19:35:40 +00:00
Jake Poznanski
60c39440b7 More configs 2025-07-22 04:30:42 +00:00
Jake Poznanski
f44d03f608 Don't break on errors 2025-07-22 03:26:06 +00:00
Jake Poznanski
8eb3786fbe Fixing compressor again 2025-07-21 23:30:26 +00:00
Jake Poznanski
da6bc458cd Fix for compressor 2025-07-21 22:21:04 +00:00
Jake Poznanski
eb200e71c6 Fixing some default configs on quantizer 2025-07-21 21:43:33 +00:00
Jake Poznanski
0aa74795d3 More calibration samples by default 2025-07-21 21:33:43 +00:00
Jake Poznanski
6e480129d1 Trying out an idea for dataset augmentation 2025-07-19 15:27:53 +00:00
Jake Poznanski
df960cbc15 2 epoch config just to try 2025-07-18 23:25:27 +00:00
Jake Poznanski
a326c965dd Default to 1288 2025-07-17 19:46:35 +00:00
Jake Poznanski
b88c71e00c Rounding to better image size, full soups 2025-07-16 20:41:35 +00:00
Jake Poznanski
75bfa6a603 Adding full soup configs 2025-07-16 20:32:58 +00:00
Jake Poznanski
0f733ffc30 Fixes to compare vllm script 2025-07-16 19:58:35 +00:00
Jake Poznanski
16145a4b32 Need accelerate 2025-07-16 18:51:37 +00:00
Jake Poznanski
4785759ba1 Adding some souping support to prepare checkpoint 2025-07-16 18:16:47 +00:00
Jake Poznanski
2b638559a1 Compare has better downloader 2025-07-16 17:56:39 +00:00
Jake Poznanski
0b40bd3528 Better docker ignore 2025-07-16 17:36:32 +00:00
Jake Poznanski
d21a164bac Fixing async stuff 2025-07-16 17:20:07 +00:00
Jake Poznanski
3ca305d0b8 Adding some souping configs 2025-07-16 03:42:55 +00:00
Jake Poznanski
c0bf3105df Fixing import 2025-07-16 03:36:12 +00:00
Jake Poznanski
31c834dcdd Constants 2025-07-16 02:15:17 +00:00
Jake Poznanski
5ea4e8a6e2 Compare vllm script 2025-07-15 22:55:49 +00:00
Jake Poznanski
939a76a4d1 Adding a compare vllm checkpoint script 2025-07-15 22:38:55 +00:00
Jake Poznanski
24608956a0 Working on comparing to vllm 2025-07-15 22:21:54 +00:00
Jake Poznanski
e6c98236b6 Adding more pipeline retry stats, compress code fixed 2025-07-15 21:41:10 +00:00
Jake Poznanski
4dbbf91e1c Compression script 2025-07-15 21:26:15 +00:00
Jake Poznanski
feb2daba9a Adjust config 2025-07-15 21:06:53 +00:00
Jake Poznanski
022f437b41 w8a8-int8 version 2025-07-15 21:02:35 +00:00
Jake Poznanski
5a4a836be7 Calibration 2025-07-15 20:59:33 +00:00
Jake Poznanski
9115a02c67 Fixes 2025-07-15 20:46:40 +00:00
Jake Poznanski
4b0960bb69 Test 2025-07-15 20:43:43 +00:00
Jake Poznanski
ee69faa87d Dataset 2025-07-15 20:39:13 +00:00
Jake Poznanski
bd92f08cd7 Errors propagated 2025-07-15 20:32:16 +00:00
Jake Poznanski
fcd373d831 Calibration stuff 2025-07-15 20:27:05 +00:00
Jake Poznanski
2218bf8460 Merge branch 'jakep/new_trainer_vllm092' into jakep/new_trainer 2025-07-15 19:51:29 +00:00
Jake Poznanski
b5f480d19d Working on calibration set for compressor, seems like qwen2.5 is not working 2025-07-15 18:59:48 +00:00
Jake Poznanski
e77bcd20ab Upping vllm versions 2025-07-15 18:43:38 +00:00
Jake Poznanski
3f9fc8bd1b Better compressor hopefully 2025-07-15 18:08:17 +00:00
Jake Poznanski
287c8278f5 Starting to cleanup and merge yaml front matter stuff in 2025-07-15 18:00:01 +00:00
Jake Poznanski
1092213c5f Merge branch 'jakep/new_traininer_nojson_newprompt' into jakep/new_trainer 2025-07-15 17:44:55 +00:00
Jake Poznanski
679063aba5 Adding some more logging to compressor 2025-07-15 17:42:33 +00:00