Sebastian Raschka | 44b0febe68 | Merge pull request #71 from Intelligence-Manifesto/patch-6 (the above -> the following) | 2024-03-15 16:07:22 -05:00
Intelligence-Manifesto | d4b4e3d0f0 | the above -> the following | 2024-03-15 05:00:28 +08:00
rasbt | ee8efcbcf6 | fix plotting | 2024-03-14 07:41:45 -05:00
Sebastian Raschka | f25760c394 | Merge pull request #70 from d-kleine/main (Updated Docker readme) | 2024-03-14 06:50:26 -05:00
Daniel Kleine | 809ea9d196 | Update README.md (updated readme for Docker with CUDA support instructions) | 2024-03-13 18:51:20 +01:00
rasbt | 1870b4bacd | update stride param | 2024-03-13 08:39:59 -05:00
Sebastian Raschka | 0b66c55950 | Merge pull request #69 from rasbt/pretraining-on-proj-gutenberg (Pretraining on Project Gutenberg) | 2024-03-13 08:38:33 -05:00
rasbt | 0d517e98b9 | update | 2024-03-13 08:37:54 -05:00
rasbt | f2c8eeb6b8 | pretraining on project gutenberg | 2024-03-13 08:34:39 -05:00
rasbt | 569f6bc7f0 | benchmark numbers | 2024-03-13 07:12:10 -05:00
Sebastian Raschka | 319e919062 | Merge pull request #68 from taihaozesong/fix_ch03_impl_wrapper (Fix mha wrapper implementations in ch03 bonus) | 2024-03-13 07:02:13 -05:00
taihaozesong | f1fa9df15c | Fix mha wrapper implementations in ch03 bonus | 2024-03-13 18:02:26 +08:00
Sebastian Raschka | 00b121a5af | Merge pull request #66 from rasbt/appendix-d (Add appendix D) | 2024-03-11 07:08:57 -05:00
rasbt | 6a585e08bc | Add appendix D | 2024-03-11 07:07:36 -05:00
Sebastian Raschka | 8c1871f16e | Merge pull request #65 from d-kleine/main (Updated Dockerfile) | 2024-03-11 06:39:33 -05:00
Sebastian Raschka | e524ddb6f4 | Merge pull request #64 from shenxiangzhuang/fix/chap2_notebook_links (fix: chap2 inner links) | 2024-03-11 06:38:07 -05:00
Daniel Kleine | 3787227c41 | Updated Dockerfile with following changes: changed CUDA files to pytorch 2.0.1 (for reproducibility); fixed RUN command (for updating Ubuntu and installing Git) | 2024-03-11 08:06:48 +00:00
Xiangzhuang Shen | fa2864ddbf | fix: inner links | 2024-03-11 10:52:56 +08:00
rasbt | 321f3d33f9 | add cuda warmup | 2024-03-10 10:31:55 -05:00
Sebastian Raschka | 4d67a8be61 | Merge pull request #63 from joel-foo/main (Remove duplicate cells) | 2024-03-10 09:48:52 -05:00
joel-foo | dbb5e65a29 | Remove duplicate cells | 2024-03-10 21:40:57 +08:00
rasbt | 244137e8a1 | amend | 2024-03-10 08:05:22 -05:00
rasbt | 76205521d7 | different dropout behavior on macos and linux | 2024-03-10 07:58:10 -05:00
rasbt | 73822b8bfa | move ex 3.3 solution outside main chapter | 2024-03-10 07:18:24 -05:00
rasbt | da33ce8054 | remove redundant unsqueeze in mask | 2024-03-09 17:42:31 -06:00
rasbt | 6ba97adaee | add PyTorch version | 2024-03-09 17:42:30 -06:00
Sebastian Raschka | 1d819c3d9c | Merge pull request #56 from eltociear/patch-2 (Update compare-bpe-tiktoken.ipynb) | 2024-03-09 10:27:11 -06:00
rasbt | 5ca60321c4 | add a100 numbers | 2024-03-09 10:20:08 -06:00
Ikko Eltociear Ashimine | 47519f4d14 | Update compare-bpe-tiktoken.ipynb (HuggingFace -> Hugging Face) | 2024-03-10 01:11:35 +09:00
rasbt | 29ca41799a | use need_weights=False | 2024-03-09 10:09:17 -06:00
rasbt | 5643c88db9 | add pytorch mha | 2024-03-08 09:30:55 -06:00
rasbt | 3beaea46ce | add lowres figs for better navigation | 2024-03-08 07:18:06 -06:00
rasbt | 404f48aa74 | automatically run on gpu or cpu | 2024-03-07 20:14:03 -06:00
rasbt | c5b17c3d67 | simplify | 2024-03-07 07:52:24 -06:00
rasbt | f454944d5d | add setup recommendations | 2024-03-07 07:32:30 -06:00
Sebastian Raschka | 083d11fbd0 | Merge pull request #55 from rayedbw/patch-4 (Update mha-implementations.ipynb) | 2024-03-07 06:31:01 -06:00
rasbt | 99a5e28def | rename q,k,v for consistency with chapter 3 | 2024-03-07 06:30:40 -06:00
Rayed Bin Wahed | 496079c61e | Update mha-implementations.ipynb (Fix variable spelling in comments to keep consistent with code) | 2024-03-06 23:03:57 +08:00
rasbt | b6fe1a37b3 | also add simple wrapper | 2024-03-06 08:38:53 -06:00
rasbt | 571377a2d6 | update title | 2024-03-06 08:34:04 -06:00
Sebastian Raschka | d2835931b7 | Merge pull request #54 from rasbt/mha-variants (Add More Multihead Attention Variants as Bonus Material) | 2024-03-06 08:32:53 -06:00
rasbt | 87fcfd9245 | mha variants | 2024-03-06 08:30:32 -06:00
rasbt | d4754f1bdd | change dim=1 to dim=-1 | 2024-03-04 18:54:43 -06:00
Sebastian Raschka | b50c42ffbb | Merge pull request #52 from rasbt/use-embedding-dropout (Add dropout for embedding layers) | 2024-03-04 07:07:46 -06:00
rasbt | e0df4df433 | add dropout for embedding layers | 2024-03-04 07:05:06 -06:00
rasbt | 3198363c4f | add wording from three to four | 2024-03-04 06:42:58 -06:00
rasbt | 29672da3b0 | stride consistency | 2024-03-03 19:37:06 -06:00
rasbt | 742f0a6d29 | add missing output in bonus | 2024-03-03 17:29:46 -06:00
rasbt | f526a8d7fb | add requirements file for bonus notebook | 2024-03-02 16:54:24 -06:00
rasbt | cc2383c4de | remove duplicated exercise code | 2024-03-02 16:44:36 -06:00