44 Commits

Author SHA1 Message Date
rasbt
c8c0fd4fb5 fix spelling 2024-06-18 05:50:40 -05:00
rasbt
88ad21490c replace figure 2024-06-18 05:46:36 -05:00
Daniel Kleine
79210eb393 fixes for code (#206)
* updated .gitignore

* removed unused GELU import

* fixed model_configs, fixed all tensors on same device

* removed unused tiktoken

* update

* update hparam search

* remove redundant tokenizer argument

---------

Co-authored-by: rasbt <mail@sebastianraschka.com>
2024-06-11 20:59:48 -05:00
rasbt
f0e4c99bc3 fix typo in comment 2024-06-09 06:14:02 -05:00
Sebastian Raschka
40ba3a4068 Remove leftover instances of self.tokenizer (#201)
* Remove leftover instances of self.tokenizer

* add endoftext token
2024-06-08 14:57:34 -05:00
rasbt
fe8bb9291e update formatting 2024-05-24 07:20:37 -05:00
rasbt
c35cf65dbf add assertion about data set length 2024-05-23 06:50:43 -05:00
rasbt
c4cd48475c Fix device setting 2024-05-22 17:51:51 -05:00
rasbt
3b72e55c26 remove duplicated text 2024-05-19 11:34:47 -05:00
rasbt
5541f7c8fe add test mode for dataset download 2024-05-18 17:38:19 -05:00
rasbt
87bf79e888 tokens seen -> examples seen 2024-05-13 20:08:48 -05:00
rasbt
d9e364c04a spelling 2024-05-13 20:06:38 -05:00
rasbt
b350daaa93 add readme 2024-05-13 08:50:55 -05:00
rasbt
c95abad6d1 pep8 fixes 2024-05-13 07:50:51 -05:00
rasbt
13e4282567 tests and exercises 2024-05-13 07:45:59 -05:00
rasbt
c8bcdf5206 fix tests 2024-05-12 19:03:14 -05:00
rasbt
37c33d6fee add chapter 6 unit test 2024-05-12 18:51:28 -05:00
rasbt
6b5bc7a1cd add missing figure 2024-05-12 18:37:02 -05:00
rasbt
ccb862cc36 chapter 06 summary file 2024-05-12 18:27:50 -05:00
rasbt
98c0723b3d update dataset naming 2024-05-12 09:22:42 -05:00
rasbt
beeaf323f1 rename download_and_unzip to make it more specific 2024-05-12 08:36:24 -05:00
rasbt
84edcfaf43 use spam / not spam labels 2024-05-11 13:42:18 -05:00
rasbt
c94f24e759 reorder section 6.6 2024-05-11 08:27:07 -05:00
rasbt
db29f5c685 explain how class labels are obtained 2024-05-11 07:42:13 -05:00
rasbt
774974de97 6 -> 4 2024-05-10 07:02:14 -05:00
rasbt
dadd0f7ea3 clarify overfitting 2024-05-09 09:09:26 -05:00
rasbt
1638dc8b7f spelling improvements 2024-05-09 07:25:52 -05:00
rasbt
1e34f5a429 add note about worker number 2024-05-08 21:20:43 -05:00
rasbt
1e7d1f3bcb update figure 6.6 2024-05-08 20:46:54 -05:00
rasbt
a31d571625 text -> dataset 2024-05-08 08:14:03 -05:00
rasbt
6cc9cf9f4e make spam spelling consistent 2024-05-08 06:48:28 -05:00
rasbt
7082ecac80 formatting improvements 2024-05-06 20:35:51 -05:00
rasbt
0448162fdc show downloads 2024-05-06 07:40:09 -05:00
rasbt
78829f28e9 tokenizing example 2024-05-06 07:16:40 -05:00
rasbt
c6528ede9e ch06 dataset 2024-05-06 06:55:56 -05:00
rasbt
e574d04eba classfication -> classification 2024-05-06 06:50:38 -05:00
Ikko Eltociear Ashimine
d361cef65f Update ch06.ipynb (#143)
ouput -> output
2024-05-05 12:18:20 -05:00
rasbt
a63b0f626c make code more general for larger models 2024-05-05 10:18:46 -05:00
Sebastian Raschka
3328b29521 cosmetics 2024-05-05 08:15:46 -05:00
rasbt
244593ce01 add text-to-token-id fn 2024-05-05 08:05:20 -05:00
Sebastian Raschka
c6fcadb087 Add figures for ch06 (#141) 2024-05-05 07:10:04 -05:00
rasbt
97106950c1 add description 2024-05-04 07:34:29 -05:00
Sebastian Raschka
004b0614fc Ch06 draft (#138)
* Ch06 first draft

* add utility files
2024-05-03 08:37:58 -05:00
Sebastian Raschka
f656ef996d Chapter 6 ablation studies (#127)
* Chapter 6 ablation studies

* add table

* formatting

* formatting

* formatting
2024-04-23 09:51:52 -05:00