Sajjad Baloch
661a6e84ee
Fix: Typo in appendix_d.py comments. ( #682 )
...
* Fix: pkg/llms_from_scratch/appendix_d.py
* minor language typo fix
* fix 691
---------
Co-authored-by: PrinceSajjadHussain <PrinceSajjadHussain@users.noreply.github.com>
Co-authored-by: rasbt <mail@sebastianraschka.com>
2025-06-22 12:15:12 -05:00
Sebastian Raschka
c21bfe4a23
Add PyPI package ( #576 )
...
* Add PyPI package
* fixes
* fixes
2025-03-23 19:28:49 -05:00
Sebastian Raschka
fd8d77a79d
A few cosmetic updates ( #504 )
2025-01-23 09:38:55 -06:00
casinca
9ce0be333b
potential little fixes appendix-D4 .ipynb ( #427 )
...
* Update appendix-D.ipynb
- lr missing argument for passing peak_lr to the optimizer
- filling 1 step gap for gradient clipping
* adjustments
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-11-03 12:12:58 -06:00
rasbt
f03f545a17
Note about warm-up steps
2024-11-01 16:47:12 -05:00
Sebastian Raschka
01cb137bfd
Note about MPS devices ( #329 )
2024-08-19 20:58:45 -05:00
Daniel Kleine
bbb2a0c3d5
fixed num_workers ( #229 )
...
* fixed num_workers
* ch06 & ch07: added num_workers to create_dataloader_v1
2024-06-19 17:36:46 -05:00
Sebastian Raschka
72a073bbbf
Remove leftover instances of self.tokenizer ( #201 )
...
* Remove leftover instances of self.tokenizer
* add endoftext token
2024-06-08 14:57:34 -05:00
rasbt
054cdfa4b1
restore file
2024-06-03 07:17:56 -05:00
rasbt
7fdbd16551
add number of workers to data loader
2024-06-03 07:12:47 -05:00
rasbt
6f0a5c320b
fix learning rate scheduler
2024-06-03 07:06:42 -05:00
rasbt
98d453b666
update formatting
2024-05-24 07:20:37 -05:00
rasbt
b40c260859
update how to retrieve learning rate
2024-05-23 17:19:01 -05:00
DrCesar
ecd2855334
fix move model to device before calculating loss
2024-05-14 22:28:00 -07:00
rasbt
a740a62239
tests and exercises
2024-05-13 07:45:59 -05:00
Sebastian Raschka
97ed38116a
Rename drop_resid to drop_shortcut ( #136 )
2024-04-28 14:31:27 -05:00
Sebastian Raschka
c70ddff558
Return nan if val loader is empty ( #124 )
2024-04-20 08:02:30 -05:00
Sebastian Raschka
e0ce5ca459
Calculate warmup steps as a fraction ( #121 )
2024-04-17 20:30:42 -05:00
Sebastian Raschka
dd51d4ad83
Make datasets and loaders compatible with multiprocessing ( #118 )
2024-04-13 13:57:56 -05:00
James Holcombe
05718c6b94
Use instance tokenizer ( #116 )
...
* Use instance tokenizer
* consistency updates
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-04-10 21:16:19 -04:00
rasbt
6de0417321
cleanup
2024-04-04 07:58:41 -05:00
Sebastian Raschka
2de60d1bfb
Rename variable to context_length to make it easier on readers ( #106 )
...
* rename to context length
* fix spacing
2024-04-04 07:27:41 -05:00
Sebastian Raschka
3829ccdb34
Remove redundant dropout in MLP module ( #105 )
2024-04-03 20:19:08 -05:00
rasbt
776a517d18
figure scaling
2024-04-01 08:05:01 -05:00
rasbt
005835bfce
make figures for appendix d
2024-03-31 21:24:41 -05:00
rasbt
ac2bdb02bd
make figures for appendix d
2024-03-31 21:22:49 -05:00
rasbt
88b2dd780a
make batch loss calculation more efficient
2024-03-27 07:11:56 -05:00
rasbt
3cb5a52a1b
simplify calc_loss_loader
2024-03-26 20:34:50 -05:00
rasbt
de576296de
simplify .view code
2024-03-25 08:09:31 -05:00
Sebastian Raschka
cf39abac04
Add and link bonus material ( #84 )
2024-03-23 07:27:43 -05:00
Sebastian Raschka
a2cd8436cb
Ch05 supplementary code ( #81 )
2024-03-19 09:26:26 -05:00
Sebastian Raschka
9d6da22ebb
Update pep8 ( #78 )
...
* simplify requirements file
* style
* apply linter
2024-03-18 08:16:17 -05:00
rasbt
ff8657ac92
fix ipywidgets formatting issue
2024-03-16 08:35:43 -05:00
rasbt
a155879d71
update formatting
2024-03-16 08:10:58 -05:00
rasbt
6a585e08bc
Add appendix D
2024-03-11 07:07:36 -05:00