Clarify Qwen3 notebook purpose (#812)

* Clarify Qwen3 notebook purpose
* Update README.md
* Update README.md
2025-12-11 06:51:21 +00:00 · 2025-09-06 15:31:35 -05:00 · 2025-09-06 15:31:35 -05:00 · 4b0021416a
commit 4b0021416a
parent 6d175a22df
1 changed files with 16 additions and 0 deletions
--- a/ch05/11_qwen3/README.md
+++ b/ch05/11_qwen3/README.md
@@ -9,6 +9,22 @@ This [standalone-qwen3-moe.ipynb](standalone-qwen3-moe.ipynb) and [standalone-qw

 <img src="https://sebastianraschka.com/images/LLMs-from-scratch-images/bonus/qwen/qwen3-coder-flash-overview.webp?123" width="430px">

+&nbsp;
+# Qwen3 from-scratch code
+
+The standalone notebooks in this folder contain from-scratch code in a linear fashion:
+
+1. [standalone-qwen3.ipynb](standalone-qwen3.ipynb): The dense Qwen3 model without bells and whistles
+2. [standalone-qwen3-plus-kvcache.ipynb](standalone-qwen3-plus-kvcache.ipynb): Same as above but with KV cache for better inference efficiency
+3. [standalone-qwen3-moe.ipynb](standalone-qwen3-moe.ipynb): Like the first notebook, but for the Mixture-of-Experts (MoE) variant
+4. [standalone-qwen3-moe-plus-kvcache.ipynb](standalone-qwen3-moe-plus-kvcache.ipynb): Same as above but with KV cache for better inference efficiency
+
+Alternatively, I also organized the code into a Python package [here](../../pkg/llms_from_scratch/) (including unit tests and CI), which you can run as described below.
+
+&nbsp;
+# Training
+
+The `Qwen3Model` class is implemented in a similar style to the `GPTModel` class, so it can be used as a drop-in replacement for training in chapter 5 and finetuning in chapters 6 and 7.


 &nbsp;