rasbt
3beaea46ce
add lowres figs for better navigation
2024-03-08 07:18:06 -06:00
rasbt
404f48aa74
automatically run on gpu or cpu
2024-03-07 20:14:03 -06:00
rasbt
c5b17c3d67
simplify
2024-03-07 07:52:24 -06:00
rasbt
f454944d5d
add setup recommendations
2024-03-07 07:32:30 -06:00
Sebastian Raschka
083d11fbd0
Merge pull request #55 from rayedbw/patch-4
...
Update mha-implementations.ipynb
2024-03-07 06:31:01 -06:00
rasbt
99a5e28def
rename q,k,v for consistency with chapter 3
2024-03-07 06:30:40 -06:00
Rayed Bin Wahed
496079c61e
Update mha-implementations.ipynb
...
Fix variable spelling in comments to keep consistent with code
2024-03-06 23:03:57 +08:00
rasbt
b6fe1a37b3
also add simple wrapper
2024-03-06 08:38:53 -06:00
rasbt
571377a2d6
update title
2024-03-06 08:34:04 -06:00
Sebastian Raschka
d2835931b7
Merge pull request #54 from rasbt/mha-variants
...
Add More Multihead Attention Variants as Bonus Material
2024-03-06 08:32:53 -06:00
rasbt
87fcfd9245
mha variants
2024-03-06 08:30:32 -06:00
rasbt
d4754f1bdd
change dim=1 to dim=-1
2024-03-04 18:54:43 -06:00
Sebastian Raschka
b50c42ffbb
Merge pull request #52 from rasbt/use-embedding-dropout
...
Add dropout for embedding layers
2024-03-04 07:07:46 -06:00
rasbt
e0df4df433
add dropout for embedding layers
2024-03-04 07:05:06 -06:00
rasbt
3198363c4f
add wording from three to four
2024-03-04 06:42:58 -06:00
rasbt
29672da3b0
stride consistency
2024-03-03 19:37:06 -06:00
rasbt
742f0a6d29
add missing output in bonus
2024-03-03 17:29:46 -06:00
rasbt
f526a8d7fb
add requirements file for bonus notebook
2024-03-02 16:54:24 -06:00
rasbt
cc2383c4de
remove duplicated exercise code
2024-03-02 16:44:36 -06:00
Sebastian Raschka
c071ea73f9
Update DDP-script.py
...
Fix for-loop
2024-03-01 18:31:05 -06:00
Sebastian Raschka
c9dccb0c40
Merge pull request #33 from rayedbw/patch-1
...
Update ch04.ipynb
2024-02-29 20:00:09 -06:00
rasbt
267e33cfaf
remove redundant import
2024-02-29 19:59:05 -06:00
Sebastian Raschka
d419c02792
Merge pull request #39 from rayedbw/patch-3
...
Update Dockerfile
2024-02-29 12:30:50 -06:00
Rayed Bin Wahed
32087331ae
Update Dockerfile
...
Use significantly smaller docker image
2024-03-01 02:10:01 +08:00
Sebastian Raschka
a94d53a752
Merge pull request #38 from rayedbw/patch-2
...
Update README.md
2024-02-29 12:06:05 -06:00
Rayed Bin Wahed
c47e434162
Update README.md
...
Correct spelling mistake
2024-03-01 01:56:58 +08:00
rasbt
7d732a5db0
add readme for devcontainer
2024-02-29 09:00:06 -06:00
rasbt
ee24acd481
Merge branch 'main' of https://github.com/rasbt/LLMs-from-scratch
2024-02-29 08:31:20 -06:00
rasbt
b827bf4eea
remove redundant double-unsqueeze
2024-02-29 08:31:07 -06:00
Sebastian Raschka
3278243dd5
Merge pull request #31 from rayedbw/main
...
Add devcontainer
2024-02-29 08:24:29 -06:00
rasbt
fb770ef97c
update docker files and docs
2024-02-29 08:22:53 -06:00
Rayed Bin Wahed
2fb035435e
Update ch04.ipynb
...
Add missing import
2024-02-27 23:05:36 +08:00
rasbt
d89aaf319d
update folder name
2024-02-27 08:53:04 -06:00
Sebastian Raschka
a060f923d3
Merge pull request #32 from rasbt/hparam
...
Add hparam tuning script
2024-02-27 08:52:01 -06:00
rasbt
87a743076d
hparam tuning script
2024-02-27 08:51:03 -06:00
rasbt
f6266c3756
improve code comments
2024-02-27 06:40:35 -06:00
Rayed Bin Wahed
45a10dd823
Add devcontainer starter doc
2024-02-27 13:04:06 +08:00
Rayed Bin Wahed
fa7e659eb3
Add devcontainer
2024-02-26 22:29:27 +08:00
Sebastian Raschka
78ed2e35bc
Add requirements.txt to main repo
2024-02-25 13:32:30 -06:00
Sebastian Raschka
3debb2f0df
Update README.md
2024-02-25 13:31:32 -06:00
rasbt
3f186ab072
use .shape instead of .size() for consistency
2024-02-25 08:47:25 -06:00
rasbt
cdcd73ba7f
drop_last=True
2024-02-25 07:23:38 -06:00
rasbt
6243726ab3
rename to dataloader v1
2024-02-24 07:48:18 -06:00
rasbt
4e68649f16
comment update
2024-02-24 06:52:17 -06:00
rasbt
f057156181
use smaller number of tokens to emphasize next token prediction goal
2024-02-15 20:09:20 -06:00
rasbt
557ddfc684
make a new example for shortcut connections
2024-02-15 19:34:12 -06:00
rasbt
250e6306e2
use attn_scores from sec 3.4 instead of 3.3
2024-02-14 20:23:59 -06:00
rasbt
231a854ae7
use less ambiguous var name
2024-02-13 07:05:37 -06:00
Sebastian Raschka
320f63829f
Merge pull request #29 from Intelligence-Manifesto/patch-5
...
**step 2**
2024-02-12 07:34:37 -06:00
Intelligence-Manifesto
6a09e7b03a
**step 2**
...
step 2: According to the context, the formatting here should be **step 2**.
Additionally, this section appears to be missing a text description for step 1; the other sections label steps 1, 2, and 3 in order, which makes the sequence clear.
2024-02-12 18:32:28 +08:00