8 Commits

Author SHA1 Message Date
Sebastian Raschka
5016499d1d Uv workflow improvements (#531)
* Uv workflow improvements

* Uv workflow improvements

* linter improvements

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix
2025-02-16 13:16:51 -06:00
Sebastian Raschka
fd24a3679a Alternative weight loading via .safetensors (#507) 2025-01-29 08:15:29 -06:00
Daniel Kleine
e5c3c5ce99 minor bug fixes (#207)
* fixed path arg for create_dataset_csvs()

* updated assign_check() to remove user warning
2024-06-12 06:27:56 -05:00
Daniel Kleine
79210eb393 fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
rasbt
fe8bb9291e update formatting 2024-05-24 07:20:37 -05:00
Sebastian Raschka
ccd7cebbb3 Rename variable to context_length to make it easier on readers (#106)
* rename to context length

* fix spacing
2024-04-04 07:27:41 -05:00
rasbt
83adc4a2ac add weight sizes 2024-03-31 08:48:19 -05:00
Sebastian Raschka
4582995ced Add alternative weight loading strategy as backup (#82) 2024-03-20 08:43:18 -05:00