44 Commits

Author SHA1 Message Date
Sebastian Raschka
c278745aff
DeBERTa-v3 baseline (#630)
* Llama3 from scratch improvements

* deberta-baseline

* restore
2025-04-19 21:16:17 -05:00
Sebastian Raschka
9df572fdf4
Improve ModernBERT comments (#606)
* Improve modernbert comments

* bash code formatting
2025-04-06 18:29:22 -05:00
Sebastian Raschka
371ab9e8ff
Correct BERT experiments (#600) 2025-04-05 10:05:15 -05:00
Sebastian Raschka
4a9654173c
Add ModernBERT (#598) 2025-04-05 09:13:30 -05:00
Sebastian Raschka
86b714a5e0
Specify UTF-8 encoding in the json load command explicitely (#557) 2025-03-05 11:46:21 -06:00
Sebastian Raschka
d1e99f6092
Fix timeout issue related to spam data backup url (#544)
* Add backup url for Spam Dataset

* import urllib

* fix url

* fix timeout issue
2025-02-20 09:26:23 -06:00
Sebastian Raschka
c39aa32ef5
Add backup url for Spam Dataset (#543)
* Add backup url for Spam Dataset

* import urllib

* fix url
2025-02-20 08:08:28 -06:00
Sebastian Raschka
a08d7aaa84
Uv workflow improvements (#531)
* Uv workflow improvements

* Uv workflow improvements

* linter improvements

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* pytproject.toml fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* windows fixes

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix

* win32 fix
2025-02-16 13:16:51 -06:00
Sebastian Raschka
8cfa52bf1d
More pythonic way to find the longest sequence (#512)
* More pythonic way to find the longest sequence

* pep8 fix
2025-02-01 10:22:47 -06:00
Sebastian Raschka
701090815e
Add backup URL for gpt2 weights (#469)
* Add backup URL for gpt2 weights

* newline
2025-01-05 11:28:09 -06:00
Sebastian Raschka
38969864e6
Add mean pooling experiment to classifier bonus experiments (#406)
* Add mean pooling experiment to classifier bonus experiments

* formatting

* add average embeddings option

* pep8
2024-10-20 11:04:18 -05:00
Daniel Kleine
c7267c3b09
ch06/03 fixes (#336)
* fixed bash commands

* fixed help docstrings

* added missing logreg bash cmd

* Update train_bert_hf.py

* Update train_bert_hf_spam.py

* Update README.md

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-27 08:23:25 +02:00
rasbt
91cdfe3309
sklearn baseline and roberta-large update 2024-08-26 10:31:54 +02:00
TITC
4f791e6cc2
add RoBERTa and params frozen (#335)
* add roberta experiment result

* add roberta & params frozen

* Update README.md

* modify lr

* modify lr

---------

Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
2024-08-26 10:27:09 +02:00
Sebastian Raschka
564362044a
add BERT experiment results (#333)
* add BERT experiment results

* cleanup

* formatting
2024-08-23 08:40:40 -05:00
Sebastian Raschka
8d02cb1cee
Add download help message (#274) 2024-07-19 08:29:29 -05:00
Daniel Kleine
90b25ece3d
fixed spelling typos (#258) 2024-07-03 07:47:33 -05:00
Daniel Kleine
bbb2a0c3d5
fixed num_workers (#229)
* fixed num_workers

* ch06 & ch07: added num_workers to create_dataloader_v1
2024-06-19 17:36:46 -05:00
Daniel Kleine
dcbdc1d2e5
fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
rasbt
1b1fd21d64
fix typo in comment 2024-06-09 06:14:02 -05:00
Sebastian Raschka
72a073bbbf
Remove leftover instances of self.tokenizer (#201)
* Remove leftover instances of self.tokenizer

* add endoftext token
2024-06-08 14:57:34 -05:00
rasbt
98d453b666
update formatting 2024-05-24 07:20:37 -05:00
Daniel Kleine
982cbe5e40
removed empty line 2024-05-22 16:15:13 +00:00
rasbt
cbe9664ef4
fix link 2024-05-17 08:20:35 -05:00
Sebastian Raschka
e631823762
improve bonus code in chapter 06 2024-05-14 20:35:50 -04:00
Sebastian Raschka
717b294680
Merge branch 'main' into main 2024-05-14 08:28:02 -05:00
rasbt
52f15dff30
fix file path name 2024-05-14 08:27:46 -05:00
Sebastian Raschka
fa52c3bc78
Merge branch 'main' into main 2024-05-14 08:12:19 -05:00
rasbt
6cfec73490
add previous chapters file 2024-05-14 08:11:58 -05:00
Sebastian Raschka
abd29ce7c2
Merge branch 'main' into main 2024-05-14 08:07:58 -05:00
rasbt
25fb63e14a
add missing gpt-download.py 2024-05-14 08:05:56 -05:00
Daniel Kleine
4bf268f398
added missing python run statement 2024-05-14 12:17:09 +00:00
rasbt
ad41c6e3cc
use validation path 2024-05-12 09:41:46 -05:00
rasbt
33dda489a1
use path 2024-05-12 09:36:35 -05:00
rasbt
188d3cd262
basepath 2024-05-12 09:27:38 -05:00
rasbt
a733a7eb42
basepath 2024-05-12 09:25:56 -05:00
rasbt
2e47a6e61c
update dataset naming 2024-05-12 09:22:42 -05:00
rasbt
55c3a91838
rename download_and_unzip to make it more specific 2024-05-12 08:36:24 -05:00
Sebastian Raschka
58c591c0e0
add header 2024-05-11 14:37:21 -05:00
rasbt
756ff780de
experiments with largest model 2024-05-09 07:40:09 -05:00
rasbt
6f486460bc
ouput -> output 2024-05-05 12:21:10 -05:00
rasbt
0ac19a1e50
use training set len 2024-04-29 21:50:07 -05:00
Sebastian Raschka
70cd174091
add roberta option (#135) 2024-04-28 13:57:36 -05:00
Sebastian Raschka
59b4fd3e25
IMDB experiments (#128)
* IMDB experiments

* style fixes

* Update README.md
2024-04-25 07:20:53 -05:00