Daniel Kleine
|
dcbdc1d2e5
|
fixes for code (#206)
* updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
|
2024-06-11 20:59:48 -05:00 |
|
rasbt
|
6f0a5c320b
|
fix learning rate scheduler
|
2024-06-03 07:06:42 -05:00 |
|
rasbt
|
b40c260859
|
update how to retrieve learning rate
|
2024-05-23 17:19:01 -05:00 |
|
Sebastian Raschka
|
c70ddff558
|
Return nan if val loader is empty (#124)
|
2024-04-20 08:02:30 -05:00 |
|
Sebastian Raschka
|
dd51d4ad83
|
Make datasets and loaders compatible with multiprocessing (#118)
|
2024-04-13 13:57:56 -05:00 |
|
Sebastian Raschka
|
2de60d1bfb
|
Rename variable to context_length to make it easier on readers (#106)
* rename to context length
* fix spacing
|
2024-04-04 07:27:41 -05:00 |
|
rasbt
|
88b2dd780a
|
make batch loss calculation more efficient
|
2024-03-27 07:11:56 -05:00 |
|
rasbt
|
3cb5a52a1b
|
simplify calc_loss_loader
|
2024-03-26 20:34:50 -05:00 |
|
Sebastian Raschka
|
cf39abac04
|
Add and link bonus material (#84)
|
2024-03-23 07:27:43 -05:00 |
|