Kasen
|
af4b73ca7b
|
Improve BPE vocabulary saving and pair frequency handling (#539)
|
2025-02-19 09:51:04 -06:00 |
|
Kasen
|
0a5214b804
|
Fix incorrect indentation (#536)
|
2025-02-18 14:47:31 -06:00 |
|
Austin Welch
|
654734053a
|
fix: preserve newline tokens in BPE encoder (#495)
* fix: preserve newline tokens in BPE encoder
* further fixes
* more fixes
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
|
2025-01-21 12:47:15 -06:00 |
|
Daniel Kleine
|
3f9facbc55
|
BPE: fixed typo (#492)
* fixed typo
* use rel path if exists
* mod gitignore and use existing vocab files
---------
Co-authored-by: rasbt <mail@sebastianraschka.com>
|
2025-01-20 20:49:53 -06:00 |
|
Sebastian Raschka
|
b17d097742
|
Implementing the BPE Tokenizer from Scratch (#487)
|
2025-01-17 12:22:00 -06:00 |
|