73 Commits

Author SHA1 Message Date
Sebastian Raschka
c21bfe4a23
Add PyPI package (#576)
* Add PyPI package

* fixes

* fixes
2025-03-23 19:28:49 -05:00
Sebastian Raschka
a08d7aaa84
Uv workflow improvements (#531)
* Uv workflow improvements

* Uv workflow improvements

* linter improvements

* pyproject.toml fixes

* pyproject.toml fixes

* pyproject.toml fixes

* pyproject.toml fixes

* pyproject.toml fixes

* pyproject.toml fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix
2025-02-16 13:16:51 -06:00
Sebastian Raschka
68e2efe1c9
Mention small discrepancy due to Dropout non-reproducibility in PyTorch (#519)
* Mention small discrepancy due to Dropout non-reproducibility in PyTorch

* bump pytorch version
2025-02-06 14:59:52 -06:00
Sebastian Raschka
126adb7663
Include mathematical breakdown for exercise solution 4.1 (#483) 2025-01-14 19:23:00 -06:00
rasbt
dc1b1a05b0
note about random numbers 2024-09-22 12:02:03 -05:00
Sebastian Raschka
222f7b16f8 update gpt-2 paper url 2024-09-20 07:00:06 -07:00
rasbt
8ad50a3315
update gpt-2 paper link 2024-09-09 06:31:28 -05:00
rasbt
1e48c13e89
update gpt-2 paper link 2024-09-08 15:49:44 -05:00
Sebastian Raschka
08040f024c
Test code in pytorch 2.4 (#285)
* test code in pytorch 2.4

* update
2024-07-24 21:53:41 -05:00
Thanh Tran
070a69fc8b
fix typos & inconsistent texts (#269)
Co-authored-by: TRAN <you@example.com>
2024-07-17 07:34:51 -05:00
Jeroen Van Goey
48bd72c890
fix typos, add codespell pre-commit hook (#264)
* fix typos, add codespell pre-commit hook

* Update .pre-commit-config.yaml

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-07-16 07:07:04 -05:00
rasbt
6ffd628bb6
add missing "be" to figure 2024-07-15 08:06:05 -05:00
rasbt
921e91a05f
use correct chapter reference 2024-07-02 17:29:57 -05:00
rasbt
31806828d0
add links to summary sections 2024-06-29 07:33:26 -05:00
rasbt
796f0e2a30
add clarifying note about GELU 2024-06-29 07:14:36 -05:00
rasbt
ab23ca5b1b
force refresh figure 2024-06-29 07:01:37 -05:00
rasbt
6a8acf5135
remove redundant plus sign 2024-06-29 06:59:36 -05:00
Daniel Kleine
81c843bdc0
minor fixes (#246)
* removed duplicated white spaces

* Update ch07/01_main-chapter-code/ch07.ipynb

* Update ch07/05_dataset-generation/llama3-ollama.ipynb

* removed duplicated white spaces

* fixed title again

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-06-25 17:30:30 -05:00
rasbt
283397aaf2
add main and optional sections 2024-06-19 17:48:25 -05:00
Daniel Kleine
bbb2a0c3d5
fixed num_workers (#229)
* fixed num_workers

* ch06 & ch07: added num_workers to create_dataloader_v1
2024-06-19 17:36:46 -05:00
Daniel Kleine
dcbdc1d2e5
fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
rasbt
39c4a887eb
add allowed_special={"<|endoftext|>"} 2024-06-09 06:04:02 -05:00
Sebastian Raschka
72a073bbbf
Remove leftover instances of self.tokenizer (#201)
* Remove leftover instances of self.tokenizer

* add endoftext token
2024-06-08 14:57:34 -05:00
rasbt
98d453b666
update formatting 2024-05-24 07:20:37 -05:00
rasbt
e5e6aaf9f1
flops analysis 2024-05-23 20:35:41 -05:00
rasbt
c735c21e87
fix swiglu acronym 2024-05-01 20:26:17 -05:00
Sebastian Raschka
97ed38116a
Rename drop_resid to drop_shortcut (#136) 2024-04-28 14:31:27 -05:00
rasbt
d202cabdee
update figures 2024-04-20 11:42:03 -05:00
Sebastian Raschka
dd51d4ad83
Make datasets and loaders compatible with multiprocessing (#118) 2024-04-13 13:57:56 -05:00
James Holcombe
05718c6b94
Use instance tokenizer (#116)
* Use instance tokenizer

* consistency updates

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-04-10 21:16:19 -04:00
rasbt
6de0417321
cleanup 2024-04-04 07:58:41 -05:00
Sebastian Raschka
2de60d1bfb
Rename variable to context_length to make it easier on readers (#106)
* rename to context length

* fix spacing
2024-04-04 07:27:41 -05:00
Sebastian Raschka
3829ccdb34
Remove redundant dropout in MLP module (#105) 2024-04-03 20:19:08 -05:00
Sebastian Raschka
a2cd8436cb Ch05 supplementary code (#81) 2024-03-19 09:26:26 -05:00
Sebastian Raschka
ca96abac8a Set up basic test gh workflows (#79)
* Set up basic test gh workflows

* update file paths

* env check

* add env check

* Update requirements.txt

* simplify

* upd
2024-03-18 11:58:37 -05:00
Sebastian Raschka
9d6da22ebb Update pep8 (#78)
* simplify requirements file

* style

* apply linter
2024-03-18 08:16:17 -05:00
rasbt
4fc6de7afa add notes 2024-03-17 09:29:06 -05:00
rasbt
d60da19fd0 add more notes and embed figures externally to save space 2024-03-17 09:08:38 -05:00
rasbt
861c296312 add imports and version on top 2024-03-16 09:50:00 -05:00
joel-foo
dbb5e65a29 Remove duplicate cells 2024-03-10 21:40:57 +08:00
rasbt
da33ce8054 remove redundant unsqueeze in mask 2024-03-09 17:42:31 -06:00
rasbt
87fcfd9245 mha variants 2024-03-06 08:30:32 -06:00
rasbt
e0df4df433 add dropout for embedding layers 2024-03-04 07:05:06 -06:00
Sebastian Raschka
c9dccb0c40 Merge pull request #33 from rayedbw/patch-1
Update ch04.ipynb
2024-02-29 20:00:09 -06:00
rasbt
267e33cfaf remove redundant import 2024-02-29 19:59:05 -06:00
rasbt
b827bf4eea remove redundant double-unsqueeze 2024-02-29 08:31:07 -06:00
Rayed Bin Wahed
2fb035435e Update ch04.ipynb
Add missing import
2024-02-27 23:05:36 +08:00
rasbt
f6266c3756 improve code comments 2024-02-27 06:40:35 -06:00
rasbt
3f186ab072 use .shape instead of .size() for consistency 2024-02-25 08:47:25 -06:00
rasbt
cdcd73ba7f drop_last=True 2024-02-25 07:23:38 -06:00