LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-11-10 23:07:28 +00:00

Author	SHA1	Message	Date
Sebastian Raschka	7ca7c47e4a	Make quote style consistent (#891)	2025-10-21 19:42:33 -05:00
Sebastian Raschka	e742d8af2c	Improve MoE implementation (#841)	2025-09-22 15:21:06 -05:00
casinca	42c130623b	`Qwen3Tokenizer` fix for Qwen3 Base models and generation mismatch with HF (#828) * prevent `self.apply_chat_template` being applied for base Qwen models * - added no chat template comparison in `test_chat_wrap_and_equivalence` - removed duplicate comparison * Revert "- added no chat template comparison in `test_chat_wrap_and_equivalence`" This reverts commit 3a5ee8cfa19aa7e4874cd5f35171098be760b05f. * Revert "prevent `self.apply_chat_template` being applied for base Qwen models" This reverts commit df504397a8957886c6d6d808615545e37ceffcad. * copied `download_file` in `utils` from https://github.com/rasbt/reasoning-from-scratch/blob/main/reasoning_from_scratch/utils.py * added copy of test `def test_tokenizer_equivalence()` from `reasoning-from-scratch` in `test_qwen3.py` * removed duplicate code fragment in `test_chat_wrap_and_equivalence` * use apply_chat_template * add toggle for instruct model * Update tokenizer usage --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2025-09-17 08:14:11 -05:00
Sebastian Raschka	9eee9296d9	Interactive qwen3 chat interface (#801) * Interactive qwen3 chat interface * update * update * update url	2025-09-01 20:50:25 -05:00
Sebastian Raschka	70edd53809	Improve RoPE (#799)	2025-08-31 11:46:36 -05:00
Sebastian Raschka	b14325e56d	Qwen3 and Llama3 equivalency tests with HF transformers (#768) * Qwen3 and Llama3 equivalency tests with HF transformers * update	2025-08-14 18:36:07 -05:00
Sebastian Raschka	f92b40e4ab	Qwen3 Coder Flash & MoE from Scratch (#760) * Qwen3 Coder Flash & MoE from Scratch * update * refinements * updates * update * update * update	2025-08-01 19:13:17 -05:00
Sebastian Raschka	a354555049	Batched KV Cache Inference for Qwen3 (#735)	2025-07-10 08:09:35 -05:00
Sebastian Raschka	b8c8237251	Qwen3 tokenizer sanity checks (#730)	2025-07-09 13:52:35 -05:00
Sebastian Raschka	21c41721cc	Add more sophisticated Qwen3 tokenizer (#729)	2025-07-09 13:16:26 -05:00
Sebastian Raschka	9cf64170ed	Update Qwen3 tokenizer test (#727) * Update Qwen3 tokenizer test * add tokenizers to dev dependencies * add tokenizers to dev dependencies	2025-07-08 06:59:46 -05:00
Sebastian Raschka	0405b0c8e7	Handle other Qwen3 tokenizer settings (#716)	2025-06-30 17:49:51 -05:00
Sebastian Raschka	0a2e8c39c4	Qwen3 KV cache (#688)	2025-06-21 17:34:39 -05:00
Sebastian Raschka	e719bd86ad	Qwen3 From Scratch (#678) * Qwen3 From Scratch * rev other file * upd * upd * upd * url fixes	2025-06-19 18:44:38 -05:00

14 Commits