35 Commits

Author SHA1 Message Date
Sajjad Baloch
661a6e84ee
Fix: Typo in appendix_d.py comments. (#682)
* Fix: pkg/llms_from_scratch/appendix_d.py

* minor language typo fix

* fix 691

---------

Co-authored-by: PrinceSajjadHussain <PrinceSajjadHussain@users.noreply.github.com>
Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-06-22 12:15:12 -05:00
Sebastian Raschka
c21bfe4a23
Add PyPI package (#576)
* Add PyPI package

* fixes

* fixes
2025-03-23 19:28:49 -05:00
Sebastian Raschka
fd8d77a79d
A few cosmetic updates (#504) 2025-01-23 09:38:55 -06:00
casinca
9ce0be333b
potential little fixes appendix-D4 .ipynb (#427)
* Update appendix-D.ipynb

- lr missing argument for passing peak_lr to the optimizer
- filling 1 step gap for gradient clipping

* adjustments

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-11-03 12:12:58 -06:00
rasbt
f03f545a17
Note about warm-up steps 2024-11-01 16:47:12 -05:00
Sebastian Raschka
01cb137bfd
Note about MPS devices (#329) 2024-08-19 20:58:45 -05:00
Daniel Kleine
bbb2a0c3d5
fixed num_workers (#229)
* fixed num_workers

* ch06 & ch07: added num_workers to create_dataloader_v1
2024-06-19 17:36:46 -05:00
Sebastian Raschka
72a073bbbf
Remove leftover instances of self.tokenizer (#201)
* Remove leftover instances of self.tokenizer

* add endoftext token
2024-06-08 14:57:34 -05:00
rasbt
054cdfa4b1
restore file 2024-06-03 07:17:56 -05:00
rasbt
7fdbd16551
add number of workers to data loader 2024-06-03 07:12:47 -05:00
rasbt
6f0a5c320b
fix learning rate scheduler 2024-06-03 07:06:42 -05:00
rasbt
98d453b666
update formatting 2024-05-24 07:20:37 -05:00
rasbt
b40c260859
update how to retrieve learning rate 2024-05-23 17:19:01 -05:00
DrCesar
ecd2855334 fix move model to device before calculating loss 2024-05-14 22:28:00 -07:00
rasbt
a740a62239
tests and exercises 2024-05-13 07:45:59 -05:00
Sebastian Raschka
97ed38116a
Rename drop_resid to drop_shortcut (#136) 2024-04-28 14:31:27 -05:00
Sebastian Raschka
c70ddff558
Return nan if val loader is empty (#124) 2024-04-20 08:02:30 -05:00
Sebastian Raschka
e0ce5ca459
Calculate warmup steps as a fraction (#121) 2024-04-17 20:30:42 -05:00
Sebastian Raschka
dd51d4ad83
Make datasets and loaders compatible with multiprocessing (#118) 2024-04-13 13:57:56 -05:00
James Holcombe
05718c6b94
Use instance tokenizer (#116)
* Use instance tokenizer

* consistency updates

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-04-10 21:16:19 -04:00
rasbt
6de0417321
cleanup 2024-04-04 07:58:41 -05:00
Sebastian Raschka
2de60d1bfb
Rename variable to context_length to make it easier on readers (#106)
* rename to context length

* fix spacing
2024-04-04 07:27:41 -05:00
Sebastian Raschka
3829ccdb34
Remove redundant dropout in MLP module (#105) 2024-04-03 20:19:08 -05:00
rasbt
776a517d18 figure scaling 2024-04-01 08:05:01 -05:00
rasbt
005835bfce make figures for appendix d 2024-03-31 21:24:41 -05:00
rasbt
ac2bdb02bd make figures for appendix d 2024-03-31 21:22:49 -05:00
rasbt
88b2dd780a make batch loss calculation more efficient 2024-03-27 07:11:56 -05:00
rasbt
3cb5a52a1b simplify calc_loss_loader 2024-03-26 20:34:50 -05:00
rasbt
de576296de simplify .view code 2024-03-25 08:09:31 -05:00
Sebastian Raschka
cf39abac04 Add and link bonus material (#84) 2024-03-23 07:27:43 -05:00
Sebastian Raschka
a2cd8436cb Ch05 supplementary code (#81) 2024-03-19 09:26:26 -05:00
Sebastian Raschka
9d6da22ebb Update pep8 (#78)
* simplify requirements file

* style

* apply linter
2024-03-18 08:16:17 -05:00
rasbt
ff8657ac92 fix ipywidgets formatting issue 2024-03-16 08:35:43 -05:00
rasbt
a155879d71 update formatting 2024-03-16 08:10:58 -05:00
rasbt
6a585e08bc Add appendix D 2024-03-11 07:07:36 -05:00