12 Commits

Author  SHA1        Message                                                Date
rasbt   1870b4bacd  update stride param                                    2024-03-13 08:39:59 -05:00
rasbt   da33ce8054  remove redundant unsqueeze in mask                     2024-03-09 17:42:31 -06:00
rasbt   87fcfd9245  mha variants                                           2024-03-06 08:30:32 -06:00
rasbt   d4754f1bdd  change dim=1 to dim=-1                                 2024-03-04 18:54:43 -06:00
rasbt   b827bf4eea  remove redundant double-unsequeeze                     2024-02-29 08:31:07 -06:00
rasbt   8860e16e05  <|endoftext|> token in dataset v1                      2024-01-21 12:03:04 -06:00
rasbt   92896d817c  add toggle for qkv_bias                                2024-01-17 07:50:57 -06:00
rasbt   dfe2c3b46f  use blocksize in positional embedding                  2024-01-15 08:15:33 -06:00
rasbt   9e85f13ba9  readability improvements                               2024-01-15 07:36:19 -06:00
rasbt   a7b4880179  small readability updates                              2024-01-14 11:58:42 -06:00
rasbt   4f161bd549  use block size variable in positional embedding layer  2023-12-28 19:05:06 +01:00
rasbt   31980a6ef1  add ch03 and TOC                                       2023-12-09 17:13:56 -06:00