9 Commits

Author SHA1 Message Date
Daniel Kleine
79210eb393 fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
rasbt
5a1e0eecce fix learning rate scheduler 2024-06-03 07:06:42 -05:00
rasbt
aa084656e0 update how to retrieve learning rate 2024-05-23 17:19:01 -05:00
Sebastian Raschka
4557d5830e Return nan if val loader is empty (#124) 2024-04-20 08:02:30 -05:00
Sebastian Raschka
bae4b0fb08 Make datasets and loaders compatible with multiprocessing (#118) 2024-04-13 13:57:56 -05:00
Sebastian Raschka
ccd7cebbb3 Rename variable to context_length to make it easier on readers (#106)
* rename to context length

* fix spacing
2024-04-04 07:27:41 -05:00
rasbt
88b2dd780a make batch loss calculation more efficient 2024-03-27 07:11:56 -05:00
rasbt
3cb5a52a1b simplify calc_loss_loader 2024-03-26 20:34:50 -05:00
Sebastian Raschka
cf39abac04 Add and link bonus material (#84) 2024-03-23 07:27:43 -05:00