Sebastian Raschka
|
e55e3e88e1
|
Alt weight loading code via PyTorch (#585)
* Alt weight loading code via PyTorch
* commit additional files
|
2025-03-27 20:10:23 -05:00 |
|
Sebastian Raschka
|
7114ccd10d
|
Add PyPI package (#576)
* Add PyPI package
* fixes
* fixes
|
2025-03-23 19:28:49 -05:00 |
|
Sebastian Raschka
|
5016499d1d
|
Uv workflow improvements (#531)
* Uv workflow improvements
* Uv workflow improvements
* linter improvements
* pyproject.toml fixes
* pyproject.toml fixes
* pyproject.toml fixes
* pyproject.toml fixes
* pyproject.toml fixes
* pyproject.toml fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
|
2025-02-16 13:16:51 -06:00 |
|
Sebastian Raschka
|
fd24a3679a
|
Alternative weight loading via .safetensors (#507)
|
2025-01-29 08:15:29 -06:00 |
|
Daniel Kleine
|
73be1c592f
|
fixed num_workers (#229)
* fixed num_workers
* ch06 & ch07: added num_workers to create_dataloader_v1
|
2024-06-19 17:36:46 -05:00 |
|
Daniel Kleine
|
e5c3c5ce99
|
minor bug fixes (#207)
* fixed path arg for create_dataset_csvs()
* updated assign_check() to remove user warning
|
2024-06-12 06:27:56 -05:00 |
|
Daniel Kleine
|
79210eb393
|
fixes for code (#206)
* updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
|
2024-06-11 20:59:48 -05:00 |
|
Sebastian Raschka
|
40ba3a4068
|
Remove leftover instances of self.tokenizer (#201)
* Remove leftover instances of self.tokenizer
* add endoftext token
|
2024-06-08 14:57:34 -05:00 |
|
Sebastian Raschka
|
a0b5603423
|
Make header more clear
|
2024-05-25 10:44:12 -05:00 |
|
rasbt
|
fe8bb9291e
|
update formatting
|
2024-05-24 07:20:37 -05:00 |
|
rasbt
|
bc5cbbf1bd
|
change defaults to 0 temp
|
2024-05-19 09:04:49 -05:00 |
|
rasbt
|
59f5ed8d68
|
use default value for temperature
|
2024-05-19 08:48:10 -05:00 |
|
rasbt
|
9d84935b69
|
add eos_id option for ch07
|
2024-05-18 12:35:40 -05:00 |
|
Daniel Kleine
|
e6012b944e
|
fixed empty space
|
2024-05-17 10:44:18 +02:00 |
|
Sebastian Raschka
|
a5b353667d
|
Rename drop_resid to drop_shortcut (#136)
|
2024-04-28 14:31:27 -05:00 |
|
rasbt
|
df4fc602d8
|
update numbering
|
2024-04-22 07:00:20 -05:00 |
|
rasbt
|
2dd7bf9cda
|
file header
|
2024-04-22 06:53:38 -05:00 |
|
Sebastian Raschka
|
bae4b0fb08
|
Make datasets and loaders compatible with multiprocessing (#118)
|
2024-04-13 13:57:56 -05:00 |
|
James Holcombe
|
0b866c133f
|
Use instance tokenizer (#116)
* Use instance tokenizer
* consistency updates
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
|
2024-04-10 21:16:19 -04:00 |
|
Sebastian Raschka
|
ccd7cebbb3
|
Rename variable to context_length to make it easier on readers (#106)
* rename to context length
* fix spacing
|
2024-04-04 07:27:41 -05:00 |
|
Sebastian Raschka
|
5beff4e25a
|
Remove redundant dropout in MLP module (#105)
|
2024-04-03 20:19:08 -05:00 |
|
rasbt
|
83adc4a2ac
|
add weight sizes
|
2024-03-31 08:48:19 -05:00 |
|
Sebastian Raschka
|
4582995ced
|
Add alternative weight loading strategy as backup (#82)
|
2024-03-20 08:43:18 -05:00 |
|