Ikko Eltociear Ashimine
|
47519f4d14
|
Update compare-bpe-tiktoken.ipynb
HuggingFace -> Hugging Face
|
2024-03-10 01:11:35 +09:00 |
|
rasbt
|
29ca41799a
|
use need_weights=False
|
2024-03-09 10:09:17 -06:00 |
|
rasbt
|
5643c88db9
|
add pytorch mha
|
2024-03-08 09:30:55 -06:00 |
|
rasbt
|
3beaea46ce
|
add lowres figs for better navigation
|
2024-03-08 07:18:06 -06:00 |
|
rasbt
|
404f48aa74
|
automatically run on gpu or cpu
|
2024-03-07 20:14:03 -06:00 |
|
rasbt
|
c5b17c3d67
|
simplify
|
2024-03-07 07:52:24 -06:00 |
|
rasbt
|
f454944d5d
|
add setup recommendations
|
2024-03-07 07:32:30 -06:00 |
|
Sebastian Raschka
|
083d11fbd0
|
Merge pull request #55 from rayedbw/patch-4
Update mha-implementations.ipynb
|
2024-03-07 06:31:01 -06:00 |
|
rasbt
|
99a5e28def
|
rename q,k,v for consistency with chapter 3
|
2024-03-07 06:30:40 -06:00 |
|
Rayed Bin Wahed
|
496079c61e
|
Update mha-implementations.ipynb
Fix variable spelling in comments to keep consistent with code
|
2024-03-06 23:03:57 +08:00 |
|
rasbt
|
b6fe1a37b3
|
also add simple wrapper
|
2024-03-06 08:38:53 -06:00 |
|
rasbt
|
571377a2d6
|
update title
|
2024-03-06 08:34:04 -06:00 |
|
Sebastian Raschka
|
d2835931b7
|
Merge pull request #54 from rasbt/mha-variants
Add More Multihead Attention Variants as Bonus Material
|
2024-03-06 08:32:53 -06:00 |
|
rasbt
|
87fcfd9245
|
mha variants
|
2024-03-06 08:30:32 -06:00 |
|
rasbt
|
d4754f1bdd
|
change dim=1 to dim=-1
|
2024-03-04 18:54:43 -06:00 |
|
Sebastian Raschka
|
b50c42ffbb
|
Merge pull request #52 from rasbt/use-embedding-dropout
Add dropout for embedding layers
|
2024-03-04 07:07:46 -06:00 |
|
rasbt
|
e0df4df433
|
add dropout for embedding layers
|
2024-03-04 07:05:06 -06:00 |
|
rasbt
|
3198363c4f
|
add wording from three to four
|
2024-03-04 06:42:58 -06:00 |
|
rasbt
|
29672da3b0
|
stride consistency
|
2024-03-03 19:37:06 -06:00 |
|
rasbt
|
742f0a6d29
|
add missing output in bonus
|
2024-03-03 17:29:46 -06:00 |
|
rasbt
|
f526a8d7fb
|
add requirements file for bonus notebook
|
2024-03-02 16:54:24 -06:00 |
|
rasbt
|
cc2383c4de
|
remove duplicated exercise code
|
2024-03-02 16:44:36 -06:00 |
|
Sebastian Raschka
|
c071ea73f9
|
Update DDP-script.py
Fix for-loop
|
2024-03-01 18:31:05 -06:00 |
|
Sebastian Raschka
|
c9dccb0c40
|
Merge pull request #33 from rayedbw/patch-1
Update ch04.ipynb
|
2024-02-29 20:00:09 -06:00 |
|
rasbt
|
267e33cfaf
|
remove redundant import
|
2024-02-29 19:59:05 -06:00 |
|
Sebastian Raschka
|
d419c02792
|
Merge pull request #39 from rayedbw/patch-3
Update Dockerfile
|
2024-02-29 12:30:50 -06:00 |
|
Rayed Bin Wahed
|
32087331ae
|
Update Dockerfile
Use significantly smaller docker image
|
2024-03-01 02:10:01 +08:00 |
|
Sebastian Raschka
|
a94d53a752
|
Merge pull request #38 from rayedbw/patch-2
Update README.md
|
2024-02-29 12:06:05 -06:00 |
|
Rayed Bin Wahed
|
c47e434162
|
Update README.md
Correct spelling mistake
|
2024-03-01 01:56:58 +08:00 |
|
rasbt
|
7d732a5db0
|
add readme for devcontainer
|
2024-02-29 09:00:06 -06:00 |
|
rasbt
|
ee24acd481
|
Merge branch 'main' of https://github.com/rasbt/LLMs-from-scratch
|
2024-02-29 08:31:20 -06:00 |
|
rasbt
|
b827bf4eea
|
remove redundant double-unsqueeze
|
2024-02-29 08:31:07 -06:00 |
|
Sebastian Raschka
|
3278243dd5
|
Merge pull request #31 from rayedbw/main
Add devcontainer
|
2024-02-29 08:24:29 -06:00 |
|
rasbt
|
fb770ef97c
|
update docker files and docs
|
2024-02-29 08:22:53 -06:00 |
|
Rayed Bin Wahed
|
2fb035435e
|
Update ch04.ipynb
Add missing import
|
2024-02-27 23:05:36 +08:00 |
|
rasbt
|
d89aaf319d
|
update folder name
|
2024-02-27 08:53:04 -06:00 |
|
Sebastian Raschka
|
a060f923d3
|
Merge pull request #32 from rasbt/hparam
Add hparam tuning script
|
2024-02-27 08:52:01 -06:00 |
|
rasbt
|
87a743076d
|
hparam tuning script
|
2024-02-27 08:51:03 -06:00 |
|
rasbt
|
f6266c3756
|
improve code comments
|
2024-02-27 06:40:35 -06:00 |
|
Rayed Bin Wahed
|
45a10dd823
|
Add devcontainer starter doc
|
2024-02-27 13:04:06 +08:00 |
|
Rayed Bin Wahed
|
fa7e659eb3
|
Add devcontainer
|
2024-02-26 22:29:27 +08:00 |
|
Sebastian Raschka
|
78ed2e35bc
|
Add requirements.txt to main repo
|
2024-02-25 13:32:30 -06:00 |
|
Sebastian Raschka
|
3debb2f0df
|
Update README.md
|
2024-02-25 13:31:32 -06:00 |
|
rasbt
|
3f186ab072
|
use .shape instead of .size() for consistency
|
2024-02-25 08:47:25 -06:00 |
|
rasbt
|
cdcd73ba7f
|
drop_last=True
|
2024-02-25 07:23:38 -06:00 |
|
rasbt
|
6243726ab3
|
rename to dataloader v1
|
2024-02-24 07:48:18 -06:00 |
|
rasbt
|
4e68649f16
|
comment update
|
2024-02-24 06:52:17 -06:00 |
|
rasbt
|
f057156181
|
use smaller number of tokens to emphasize next token prediction goal
|
2024-02-15 20:09:20 -06:00 |
|
rasbt
|
557ddfc684
|
make a new example for shortcut connections
|
2024-02-15 19:34:12 -06:00 |
|
rasbt
|
250e6306e2
|
use attn_scores from sec 3.4 instead of 3.3
|
2024-02-14 20:23:59 -06:00 |
|