| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Sebastian Raschka | 3bdf18a599 | Update Llama 3 table for consistency with Qwen3 | 2025-06-23 18:33:04 -05:00 |
| Sebastian Raschka | 81eda38d3b | Improve KV cache code for torch.compile (#705): Improve KV cache code for torch.compile; cleanup; cleanup | 2025-06-23 18:08:49 -05:00 |
| Sebastian Raschka | 0a2e8c39c4 | Qwen3 KV cache (#688) | 2025-06-21 17:34:39 -05:00 |
| Sebastian Raschka | 3be0f3202a | Llama 3 KV Cache (#685): Llama 3 KV Cache; skip expensive tests on Gh actions; Update `__init__.py` | 2025-06-21 10:55:20 -05:00 |
| Sebastian Raschka | c4cde1c21b | Reduce Llama 3 RoPE memory requirements (#658): Llama3 from scratch improvements; Fix Llama 3 expensive RoPE memory issue; updates; update package; benchmark; remove unused rescale_theta | 2025-06-12 11:08:02 -05:00 |
| Sebastian Raschka | 47c036058d | Llama3 from scratch improvements (#621): Llama3 from scratch improvements; restore | 2025-04-16 18:08:26 -05:00 |
| Sebastian Raschka | aedad7efc3 | Add Llama 3.2 to pkg (#591): Add Llama 3.2 to pkg; remove redundant attributes; update tests; updates; updates; updates; fix link; fix link | 2025-03-31 18:59:47 -05:00 |
| Sebastian Raschka | b44096acef | Implement Llama 3.2 (#383) | 2024-10-05 07:30:47 -05:00 |
| Sebastian Raschka | 0467c8289b | GPT to Llama (#368): GPT to Llama; fix urls | 2024-09-23 07:34:06 -05:00 |