Sebastian Raschka
c21bfe4a23
Add PyPI package (#576)
...
* Add PyPI package
* fixes
2025-03-23 19:28:49 -05:00
Sebastian Raschka
a08d7aaa84
Uv workflow improvements (#531)
...
* Uv workflow improvements
* linter improvements
* pyproject.toml fixes
* windows fixes
* win32 fix
2025-02-16 13:16:51 -06:00
Sebastian Raschka
68e2efe1c9
Mention small discrepancy due to Dropout non-reproducibility in PyTorch (#519)
...
* Mention small discrepancy due to Dropout non-reproducibility in PyTorch
* bump pytorch version
2025-02-06 14:59:52 -06:00
rasbt
dc1b1a05b0
note about random numbers
2024-09-22 12:02:03 -05:00
Sebastian Raschka
222f7b16f8
update gpt-2 paper url
2024-09-20 07:00:06 -07:00
rasbt
8ad50a3315
update gpt-2 paper link
2024-09-09 06:31:28 -05:00
rasbt
1e48c13e89
update gpt-2 paper link
2024-09-08 15:49:44 -05:00
Sebastian Raschka
08040f024c
Test code in pytorch 2.4 (#285)
...
* test code in pytorch 2.4
* update
2024-07-24 21:53:41 -05:00
Thanh Tran
070a69fc8b
fix typos & inconsistent texts (#269)
...
Co-authored-by: TRAN <you@example.com>
2024-07-17 07:34:51 -05:00
Jeroen Van Goey
48bd72c890
fix typos, add codespell pre-commit hook (#264)
...
* fix typos, add codespell pre-commit hook
* Update .pre-commit-config.yaml
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-07-16 07:07:04 -05:00
rasbt
6ffd628bb6
add missing "be" to figure
2024-07-15 08:06:05 -05:00
rasbt
921e91a05f
use correct chapter reference
2024-07-02 17:29:57 -05:00
rasbt
31806828d0
add links to summary sections
2024-06-29 07:33:26 -05:00
rasbt
796f0e2a30
add clarifying note about GELU
2024-06-29 07:14:36 -05:00
rasbt
ab23ca5b1b
force refresh figure
2024-06-29 07:01:37 -05:00
rasbt
6a8acf5135
remove redundant plus sign
2024-06-29 06:59:36 -05:00
Daniel Kleine
81c843bdc0
minor fixes (#246)
...
* removed duplicated white spaces
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/05_dataset-generation/llama3-ollama.ipynb
* removed duplicated white spaces
* fixed title again
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-06-25 17:30:30 -05:00
rasbt
98d453b666
update formatting
2024-05-24 07:20:37 -05:00
rasbt
e5e6aaf9f1
flops analysis
2024-05-23 20:35:41 -05:00
rasbt
c735c21e87
fix swiglu acronym
2024-05-01 20:26:17 -05:00
Sebastian Raschka
97ed38116a
Rename drop_resid to drop_shortcut (#136)
2024-04-28 14:31:27 -05:00
rasbt
d202cabdee
update figures
2024-04-20 11:42:03 -05:00
rasbt
6de0417321
cleanup
2024-04-04 07:58:41 -05:00
Sebastian Raschka
2de60d1bfb
Rename variable to context_length to make it easier on readers (#106)
...
* rename to context length
* fix spacing
2024-04-04 07:27:41 -05:00
Sebastian Raschka
3829ccdb34
Remove redundant dropout in MLP module (#105)
2024-04-03 20:19:08 -05:00
Sebastian Raschka
a2cd8436cb
Ch05 supplementary code (#81)
2024-03-19 09:26:26 -05:00
rasbt
4fc6de7afa
add notes
2024-03-17 09:29:06 -05:00
rasbt
d60da19fd0
add more notes and embed figures externally to save space
2024-03-17 09:08:38 -05:00
rasbt
861c296312
add imports and version on top
2024-03-16 09:50:00 -05:00
joel-foo
dbb5e65a29
Remove duplicate cells
2024-03-10 21:40:57 +08:00
rasbt
e0df4df433
add dropout for embedding layers
2024-03-04 07:05:06 -06:00
rasbt
267e33cfaf
remove redundant import
2024-02-29 19:59:05 -06:00
Rayed Bin Wahed
2fb035435e
Update ch04.ipynb
...
Add missing import
2024-02-27 23:05:36 +08:00
rasbt
f6266c3756
improve code comments
2024-02-27 06:40:35 -06:00
rasbt
3f186ab072
use .shape instead of .size() for consistency
2024-02-25 08:47:25 -06:00
rasbt
f057156181
use smaller number of tokens to emphasize next token prediction goal
2024-02-15 20:09:20 -06:00
rasbt
557ddfc684
make a new example for shortcut connections
2024-02-15 19:34:12 -06:00
rasbt
250e6306e2
use attn_scores from sec 3.4 instead of 3.3
2024-02-14 20:23:59 -06:00
rasbt
231a854ae7
use less ambiguous var name
2024-02-13 07:05:37 -06:00
rasbt
fe332006de
ch4 exercise solutions
2024-02-11 11:51:39 -06:00
rasbt
352b83d225
make softmax explicit
2024-02-11 08:42:21 -06:00
rasbt
7d86023fc4
make softmax explicit
2024-02-11 08:41:45 -06:00
rasbt
5840b4b5f8
update name of last section
2024-02-11 07:35:07 -06:00
rasbt
baa8617921
variable name fix
2024-02-10 17:53:54 -06:00
rasbt
496b52f842
format the other GPT architecture sizes
2024-02-10 17:47:56 -06:00
rasbt
10aa2d099d
add print statements for illustration purposes
2024-02-10 10:10:14 -06:00
rasbt
5d1d8ce511
add shape information for clarity
2024-02-08 20:16:54 -06:00
rasbt
3a5fc79b38
add and update readme files
2024-02-05 06:51:58 -06:00
rasbt
2b38b63a7a
move overview up
2024-02-04 15:57:03 -06:00
rasbt
bb50de7210
adjust figure width
2024-02-04 10:12:11 -06:00