704 Commits

Author SHA1 Message Date
Ikko Eltociear Ashimine
47519f4d14 Update compare-bpe-tiktoken.ipynb
HuggingFace -> Hugging Face
2024-03-10 01:11:35 +09:00
rasbt
29ca41799a use need_weights=False 2024-03-09 10:09:17 -06:00
rasbt
5643c88db9 add pytorch mha 2024-03-08 09:30:55 -06:00
rasbt
3beaea46ce add lowres figs for better navigation 2024-03-08 07:18:06 -06:00
rasbt
404f48aa74 automatically run on gpu or cpu 2024-03-07 20:14:03 -06:00
rasbt
c5b17c3d67 simplify 2024-03-07 07:52:24 -06:00
rasbt
f454944d5d add setup recommendations 2024-03-07 07:32:30 -06:00
Sebastian Raschka
083d11fbd0 Merge pull request #55 from rayedbw/patch-4
Update mha-implementations.ipynb
2024-03-07 06:31:01 -06:00
rasbt
99a5e28def rename q,k,v for consistency with chapter 3 2024-03-07 06:30:40 -06:00
Rayed Bin Wahed
496079c61e Update mha-implementations.ipynb
Fix variable spelling in comments to keep consistent with code
2024-03-06 23:03:57 +08:00
rasbt
b6fe1a37b3 also add simple wrapper 2024-03-06 08:38:53 -06:00
rasbt
571377a2d6 update title 2024-03-06 08:34:04 -06:00
Sebastian Raschka
d2835931b7 Merge pull request #54 from rasbt/mha-variants
Add More Multihead Attention Variants as Bonus Material
2024-03-06 08:32:53 -06:00
rasbt
87fcfd9245 mha variants 2024-03-06 08:30:32 -06:00
rasbt
d4754f1bdd change dim=1 to dim=-1 2024-03-04 18:54:43 -06:00
Sebastian Raschka
b50c42ffbb Merge pull request #52 from rasbt/use-embedding-dropout
Add dropout for embedding layers
2024-03-04 07:07:46 -06:00
rasbt
e0df4df433 add dropout for embedding layers 2024-03-04 07:05:06 -06:00
rasbt
3198363c4f add wording from three to four 2024-03-04 06:42:58 -06:00
rasbt
29672da3b0 stride consistency 2024-03-03 19:37:06 -06:00
rasbt
742f0a6d29 add missing output in bonus 2024-03-03 17:29:46 -06:00
rasbt
f526a8d7fb add requirements file for bonus notebook 2024-03-02 16:54:24 -06:00
rasbt
cc2383c4de remove duplicated exercise code 2024-03-02 16:44:36 -06:00
Sebastian Raschka
c071ea73f9 Update DDP-script.py
Fix for-loop
2024-03-01 18:31:05 -06:00
Sebastian Raschka
c9dccb0c40 Merge pull request #33 from rayedbw/patch-1
Update ch04.ipynb
2024-02-29 20:00:09 -06:00
rasbt
267e33cfaf remove redundant import 2024-02-29 19:59:05 -06:00
Sebastian Raschka
d419c02792 Merge pull request #39 from rayedbw/patch-3
Update Dockerfile
2024-02-29 12:30:50 -06:00
Rayed Bin Wahed
32087331ae Update Dockerfile
Use significantly smaller docker image
2024-03-01 02:10:01 +08:00
Sebastian Raschka
a94d53a752 Merge pull request #38 from rayedbw/patch-2
Update README.md
2024-02-29 12:06:05 -06:00
Rayed Bin Wahed
c47e434162 Update README.md
Correct spelling mistake
2024-03-01 01:56:58 +08:00
rasbt
7d732a5db0 add readme for devcontainer 2024-02-29 09:00:06 -06:00
rasbt
ee24acd481 Merge branch 'main' of https://github.com/rasbt/LLMs-from-scratch 2024-02-29 08:31:20 -06:00
rasbt
b827bf4eea remove redundant double-unsqueeze 2024-02-29 08:31:07 -06:00
Sebastian Raschka
3278243dd5 Merge pull request #31 from rayedbw/main
Add devcontainer
2024-02-29 08:24:29 -06:00
rasbt
fb770ef97c update docker files and docs 2024-02-29 08:22:53 -06:00
Rayed Bin Wahed
2fb035435e Update ch04.ipynb
Add missing import
2024-02-27 23:05:36 +08:00
rasbt
d89aaf319d update folder name 2024-02-27 08:53:04 -06:00
Sebastian Raschka
a060f923d3 Merge pull request #32 from rasbt/hparam
Add hparam tuning script
2024-02-27 08:52:01 -06:00
rasbt
87a743076d hparam tuning script 2024-02-27 08:51:03 -06:00
rasbt
f6266c3756 improve code comments 2024-02-27 06:40:35 -06:00
Rayed Bin Wahed
45a10dd823 Add devcontainer starter doc 2024-02-27 13:04:06 +08:00
Rayed Bin Wahed
fa7e659eb3 Add devcontainer 2024-02-26 22:29:27 +08:00
Sebastian Raschka
78ed2e35bc Add requirements.txt to main repo 2024-02-25 13:32:30 -06:00
Sebastian Raschka
3debb2f0df Update README.md 2024-02-25 13:31:32 -06:00
rasbt
3f186ab072 use .shape instead of .size() for consistency 2024-02-25 08:47:25 -06:00
rasbt
cdcd73ba7f drop_last=True 2024-02-25 07:23:38 -06:00
rasbt
6243726ab3 rename to dataloader v1 2024-02-24 07:48:18 -06:00
rasbt
4e68649f16 comment update 2024-02-24 06:52:17 -06:00
rasbt
f057156181 use smaller number of tokens to emphasize next token prediction goal 2024-02-15 20:09:20 -06:00
rasbt
557ddfc684 make a new example for shortcut connections 2024-02-15 19:34:12 -06:00
rasbt
250e6306e2 use attn_scores from sec 3.4 instead of 3.3 2024-02-14 20:23:59 -06:00