| Author | Commit | Message | Date |
| rasbt | b40c260859 | update how to retrieve learning rate | 2024-05-23 17:19:01 -05:00 |
| rasbt | b43a7a8820 | trainable token -> trainable token position | 2024-05-23 11:43:20 -05:00 |
| Sebastian Raschka | 95cea64824 | Merge pull request #176 from rasbt/dataset-length-warning (Add assertion about data set length) | 2024-05-23 07:58:47 -04:00 |
| rasbt | 18e729643d | add assertion about data set length | 2024-05-23 06:50:43 -05:00 |
| Sebastian Raschka | ec70194d19 | Merge pull request #174 from rasbt/update-regex (update regex) | 2024-05-22 21:37:50 -04:00 |
| rasbt | 5b1dcf0b33 | reset cell count for better nbdiff | 2024-05-22 20:27:09 -05:00 |
| rasbt | 7686d4569f | update regex | 2024-05-22 20:15:31 -05:00 |
| Sebastian Raschka | 9587b58cf7 | Merge pull request #173 from rasbt/device-setting (Fix device setting) | 2024-05-22 18:59:58 -04:00 |
| rasbt | 86f6c2df43 | Fix device setting | 2024-05-22 17:51:51 -05:00 |
| Sebastian Raschka | 7a2a157844 | Merge pull request #171 from d-kleine/main (fixed last_two_blocks) | 2024-05-22 18:42:29 -04:00 |
| Daniel Kleine | 982cbe5e40 | removed empty line | 2024-05-22 16:15:13 +00:00 |
| Daniel Kleine | 195849cf8d | fixed last_two_blocks | 2024-05-22 02:02:43 +00:00 |
| Sebastian Raschka | 9f8377a154 | Merge pull request #170 from rasbt/last-two-blocks (Experiment with last two blocks) | 2024-05-21 20:57:55 -04:00 |
| rasbt | caf3725001 | fix table alignment | 2024-05-21 19:51:22 -05:00 |
| rasbt | 725bed56f7 | experiment with last two blocks | 2024-05-21 19:49:34 -05:00 |
| Sebastian Raschka | af73fa6055 | Merge pull request #169 from d-kleine/main (minor: 2nd gitignore / add. exp. table) | 2024-05-21 20:27:33 -04:00 |
| Daniel Kleine | f39087e573 | improved readability of Additional Experiments table | 2024-05-21 19:26:25 +00:00 |
| Daniel Kleine | aa67e6e1ac | removed unnecessary .gitignore | 2024-05-21 19:25:16 +00:00 |
| Sebastian Raschka | 451a62994d | Merge pull request #166 from rasbt/update-lora-init (Update lora init) | 2024-05-19 21:33:28 -04:00 |
| rasbt | f3a2e93160 | 100x -> 50x | 2024-05-19 20:26:53 -05:00 |
| rasbt | 0d48725b5c | use macbook version | 2024-05-19 20:19:02 -05:00 |
| rasbt | c2028871e4 | update lora init | 2024-05-19 20:11:56 -05:00 |
| rasbt | a8a28017c0 | remove duplicated text | 2024-05-19 11:34:47 -05:00 |
| Sebastian Raschka | 1dd9c0a71e | Merge pull request #165 from d-kleine/main (updated .gitignore) | 2024-05-19 12:34:31 -04:00 |
| Daniel Kleine | e7914182c6 | updated .gitignore | 2024-05-19 16:07:20 +00:00 |
| rasbt | a5593f9860 | change defaults to 0 temp | 2024-05-19 09:04:49 -05:00 |
| rasbt | 1463b2ae47 | use default value for temperature | 2024-05-19 08:48:10 -05:00 |
| rasbt | 1b340c9eb6 | add ignore index experiment | 2024-05-19 07:24:49 -05:00 |
| rasbt | 02e6f06a11 | add test mode for dataset download | 2024-05-18 17:38:19 -05:00 |
| rasbt | 5ef4edf2b5 | new experiment w/o causal mask | 2024-05-18 17:03:36 -05:00 |
| Sebastian Raschka | 57634f2045 | fix row number typo | 2024-05-18 15:54:13 -05:00 |
| Sebastian Raschka | e8212c3f7c | Merge pull request #164 from rasbt/eos_id-token (Add eos_id option for ch07) | 2024-05-18 16:10:25 -04:00 |
| rasbt | 4851d5a0fa | add eos_id option for ch07 | 2024-05-18 12:35:40 -05:00 |
| rasbt | 3b57b6d8c4 | make consistent with the latest production version | 2024-05-18 12:08:39 -05:00 |
| rasbt | ea9da3a89c | formatting for consistency with production chapter | 2024-05-18 11:03:42 -05:00 |
| Sebastian Raschka | 217ab77a6c | Merge pull request #163 from rasbt/add-gradient-accumulation (Add experiment with gradient accumulation) | 2024-05-17 22:45:57 -04:00 |
| rasbt | 42cb0cbd59 | Add experiment with gradient accumulation | 2024-05-17 21:31:22 -05:00 |
| rasbt | fc88fefd9c | fix no padding option | 2024-05-17 21:06:51 -05:00 |
| Sebastian Raschka | 08f0cfd438 | Merge pull request #162 from d-kleine/main (minor: fixed variable name in text) | 2024-05-17 16:45:43 -04:00 |
| Daniel Kleine | d7fb3f34ec | Merge branch 'rasbt:main' into main | 2024-05-17 15:59:44 +02:00 |
| Daniel Kleine | 7e3638649e | fixed var name | 2024-05-17 13:58:07 +00:00 |
| Sebastian Raschka | ceb53648ae | Merge pull request #161 from rasbt/no-padding (Add new experiment without padding) | 2024-05-17 09:35:02 -04:00 |
| rasbt | cbe9664ef4 | fix link | 2024-05-17 08:20:35 -05:00 |
| rasbt | 5cfc64d038 | fix indent | 2024-05-17 07:58:01 -05:00 |
| rasbt | 04b9540938 | Add new experiment without padding | 2024-05-17 07:55:51 -05:00 |
| Sebastian Raschka | b8451e5077 | Merge pull request #160 from d-kleine/main (small changes Docker / OpenAI) | 2024-05-17 07:51:26 -04:00 |
| Daniel Kleine | cf8b6c1094 | fixed empty space | 2024-05-17 10:44:18 +02:00 |
| Daniel Kleine | cb0e1b2f37 | added missing step 2 and prettyfied readme | 2024-05-17 10:43:35 +02:00 |
| rasbt | 37a17e2228 | simplify code | 2024-05-16 20:16:25 -05:00 |
| Sebastian Raschka | 738ec44bf9 | Merge pull request #157 from DrCesar/main (fix move model to device before calculating loss) | 2024-05-15 20:58:43 -04:00 |