51 Commits

Author SHA1 Message Date
rasbt
7e35370518 note about random numbers 2024-09-22 12:02:03 -05:00
Sebastian Raschka
2aa964765e update gpt-2 paper url 2024-09-20 07:00:06 -07:00
rasbt
f57088285b update gpt-2 paper link 2024-09-09 06:31:28 -05:00
rasbt
e105df0ced update gpt-2 paper link 2024-09-08 15:49:44 -05:00
Sebastian Raschka
6dd8666d9c Test code in pytorch 2.4 (#285)
* test code in pytorch 2.4

* update
2024-07-24 21:53:41 -05:00
Thanh Tran
a2bb045984 fix typos & inconsistent texts (#269)
Co-authored-by: TRAN <you@example.com>
2024-07-17 07:34:51 -05:00
Jeroen Van Goey
70cfced899 fix typos, add codespell pre-commit hook (#264)
* fix typos, add codespell pre-commit hook

* Update .pre-commit-config.yaml

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-07-16 07:07:04 -05:00
rasbt
3b79631672 add missing "be" to figure 2024-07-15 08:06:05 -05:00
rasbt
52e10c7360 use correct chapter reference 2024-07-02 17:29:57 -05:00
rasbt
5e24a042c1 add links to summary sections 2024-06-29 07:33:26 -05:00
rasbt
1ffb7500e4 add clarifying note about GELU 2024-06-29 07:14:36 -05:00
rasbt
1e61943bf2 force refresh figure 2024-06-29 07:01:37 -05:00
rasbt
8b50915fc9 remove redundant plus sign 2024-06-29 06:59:36 -05:00
Daniel Kleine
7a54d383e7 minor fixes (#246)
* removed duplicated white spaces

* Update ch07/01_main-chapter-code/ch07.ipynb

* Update ch07/05_dataset-generation/llama3-ollama.ipynb

* removed duplicated white spaces

* fixed title again

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-06-25 17:30:30 -05:00
rasbt
fe8bb9291e update formatting 2024-05-24 07:20:37 -05:00
rasbt
d93fbbd4b9 flops analysis 2024-05-23 20:35:41 -05:00
rasbt
9e149417b2 fix swiglu acronym 2024-05-01 20:26:17 -05:00
Sebastian Raschka
a5b353667d Rename drop_resid to drop_shortcut (#136) 2024-04-28 14:31:27 -05:00
rasbt
90fb214822 update figures 2024-04-20 11:42:03 -05:00
rasbt
c8cffefb6f cleanup 2024-04-04 07:58:41 -05:00
Sebastian Raschka
ccd7cebbb3 Rename variable to context_length to make it easier on readers (#106)
* rename to context length

* fix spacing
2024-04-04 07:27:41 -05:00
Sebastian Raschka
5beff4e25a Remove redundant dropout in MLP module (#105) 2024-04-03 20:19:08 -05:00
Sebastian Raschka
a2cd8436cb Ch05 supplementary code (#81) 2024-03-19 09:26:26 -05:00
rasbt
4fc6de7afa add notes 2024-03-17 09:29:06 -05:00
rasbt
d60da19fd0 add more notes and embed figures externally to save space 2024-03-17 09:08:38 -05:00
rasbt
861c296312 add imports and version on top 2024-03-16 09:50:00 -05:00
joel-foo
dbb5e65a29 Remove duplicate cells 2024-03-10 21:40:57 +08:00
rasbt
e0df4df433 add dropout for embedding layers 2024-03-04 07:05:06 -06:00
rasbt
267e33cfaf remove redundant import 2024-02-29 19:59:05 -06:00
Rayed Bin Wahed
2fb035435e Update ch04.ipynb
Add missing import
2024-02-27 23:05:36 +08:00
rasbt
f6266c3756 improve code comments 2024-02-27 06:40:35 -06:00
rasbt
3f186ab072 use .shape instead of .size() for consistency 2024-02-25 08:47:25 -06:00
rasbt
f057156181 use smaller number of tokens to emphasize next token prediction goal 2024-02-15 20:09:20 -06:00
rasbt
557ddfc684 make a new example for shortcut connections 2024-02-15 19:34:12 -06:00
rasbt
250e6306e2 use attn_scores from sec 3.4 instead of 3.3 2024-02-14 20:23:59 -06:00
rasbt
231a854ae7 use less ambiguous var name 2024-02-13 07:05:37 -06:00
rasbt
fe332006de ch4 exercise solutions 2024-02-11 11:51:39 -06:00
rasbt
352b83d225 make softmax explicit 2024-02-11 08:42:21 -06:00
rasbt
7d86023fc4 make softmax explicit 2024-02-11 08:41:45 -06:00
rasbt
5840b4b5f8 update name of last section 2024-02-11 07:35:07 -06:00
rasbt
baa8617921 variable name fix 2024-02-10 17:53:54 -06:00
rasbt
496b52f842 format the other GPT architecture sizes 2024-02-10 17:47:56 -06:00
rasbt
10aa2d099d add print statements for illustration purposes 2024-02-10 10:10:14 -06:00
rasbt
5d1d8ce511 add shape information for clarity 2024-02-08 20:16:54 -06:00
rasbt
3a5fc79b38 add and update readme files 2024-02-05 06:51:58 -06:00
rasbt
2b38b63a7a move overview up 2024-02-04 15:57:03 -06:00
rasbt
bb50de7210 adjust figure width 2024-02-04 10:12:11 -06:00
rasbt
1653f6953a adjust figure width 2024-02-04 10:09:36 -06:00
rasbt
ec312e581b add chapter 4 code 2024-02-04 10:02:05 -06:00
rasbt
d261abce4c add forward pass 2024-01-31 08:00:19 -06:00