18 Commits

Author SHA1 Message Date
Sebastian Raschka
0405b0c8e7
Handle other Qwen3 tokenizer settings (#716)
2025-06-30 17:49:51 -05:00
Sebastian Raschka
c4ec55edac
Support different Qwen3 sizes in pkg (#714)
2025-06-28 08:00:23 -05:00
Sebastian Raschka
81eda38d3b
Improve KV cache code for torch.compile (#705)
* Improve KV cache code for torch.compile

* cleanup

* cleanup
2025-06-23 18:08:49 -05:00
Sebastian Raschka
37b26c2e04
CPU compile performance for Qwen3 models (#704)
* Ch06 classifier function asserts

* Qwen3 cpu compilation perf
2025-06-23 11:06:10 -05:00
Sajjad Baloch
661a6e84ee
Fix: Typo in appendix_d.py comments. (#682)
* Fix: pkg/llms_from_scratch/appendix_d.py

* minor language typo fix

* fix 691

---------

Co-authored-by: PrinceSajjadHussain <PrinceSajjadHussain@users.noreply.github.com>
Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-06-22 12:15:12 -05:00
Sebastian Raschka
0a2e8c39c4
Qwen3 KV cache (#688)
2025-06-21 17:34:39 -05:00
Daniel Kleine
14c054d36c
added pkg fixes (#676)
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2025-06-21 16:07:50 -05:00
Sebastian Raschka
fdc3e1b701
Add GPT-2 KV cache to pkg (#687)
2025-06-21 12:29:04 -05:00
Sebastian Raschka
3be0f3202a
Llama 3 KV Cache (#685)
* Llama 3 KV Cache

* skip expensive tests on Gh actions

* Update __init__.py
2025-06-21 10:55:20 -05:00
Sebastian Raschka
e719bd86ad
Qwen3 From Scratch (#678)
* Qwen3 From Scratch

* rev other file

* upd

* upd

* upd

* url fixes
2025-06-19 18:44:38 -05:00
Daniel Kleine
c2cfb47b1a
fixed gqa qkv code comments (#660)
2025-06-13 08:21:28 -05:00
Sebastian Raschka
c4cde1c21b
Reduce Llama 3 RoPE memory requirements (#658)
* Llama3 from scratch improvements

* Fix Llama 3 expensive RoPE memory issue

* updates

* update package

* benchmark

* remove unused rescale_theta
2025-06-12 11:08:02 -05:00
Sebastian Raschka
43e25a5165
Llama3Fast (#593)
* Llama3Fast

* Update pkg/llms_from_scratch/tests/test_llama3.py
2025-04-01 12:56:11 -05:00
Sebastian Raschka
aedad7efc3
Add Llama 3.2 to pkg (#591)
* Add Llama 3.2 to pkg

* remove redundant attributes

* update tests

* updates

* updates

* updates

* fix link

* fix link
2025-03-31 18:59:47 -05:00
Sebastian Raschka
3f93d73d6d
Alt weight loading code via PyTorch (#585)
* Alt weight loading code via PyTorch

* commit additional files
2025-03-27 20:10:23 -05:00
Sebastian Raschka
ffd4035144
Add GPTModelFast (#584)
* Add GPTModelFast

* update
2025-03-27 14:00:25 -05:00
Sebastian Raschka
feb1e9a83d
Add readme (#577)
2025-03-23 19:35:12 -05:00
Sebastian Raschka
c21bfe4a23
Add PyPI package (#576)
* Add PyPI package

* fixes

* fixes
2025-03-23 19:28:49 -05:00