
# Chapter 3: Coding Attention Mechanisms

## Main Chapter Code
- [01_main-chapter-code](01_main-chapter-code) contains the main chapter code.

## Bonus Materials
- [02_bonus_efficient-multihead-attention](02_bonus_efficient-multihead-attention) implements and compares different implementations of multi-head attention
- [03_understanding-buffers](03_understanding-buffers) explains the idea behind PyTorch buffers, which are used to implement the causal attention mechanism in Chapter 3 (a minimal sketch follows below)
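
To make the connection between the two bonus notebooks concrete, here is a minimal sketch of a causal multi-head attention module that registers its attention mask as a PyTorch buffer. The class name, argument names, and hyperparameters are illustrative assumptions, not the exact code from the notebooks:

```python
import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    """Hypothetical minimal sketch of causal multi-head attention;
    not necessarily identical to the notebook variants."""

    def __init__(self, d_in, d_out, context_length, num_heads, dropout=0.0):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        self.out_proj = nn.Linear(d_out, d_out)
        self.dropout = nn.Dropout(dropout)
        # A buffer moves with the module (state_dict, .to(device)) but is
        # not a trainable parameter -- a natural fit for a fixed causal mask.
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1).bool(),
        )

    def forward(self, x):
        b, num_tokens, _ = x.shape

        # Project, then split the last dimension into heads:
        # (b, num_tokens, d_out) -> (b, num_heads, num_tokens, head_dim)
        def split(t):
            return t.view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)

        queries = split(self.W_query(x))
        keys = split(self.W_key(x))
        values = split(self.W_value(x))

        attn_scores = queries @ keys.transpose(2, 3)
        # Causal masking: each position may attend only to itself and earlier tokens.
        attn_scores.masked_fill_(self.mask[:num_tokens, :num_tokens], float("-inf"))
        attn_weights = torch.softmax(attn_scores / self.head_dim**0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)

        # Merge heads back: (b, num_heads, num_tokens, head_dim) -> (b, num_tokens, d_out)
        context = (attn_weights @ values).transpose(1, 2).reshape(b, num_tokens, -1)
        return self.out_proj(context)
```

A quick usage check (the input sizes here are arbitrary):

```python
torch.manual_seed(123)
x = torch.randn(2, 6, 16)  # (batch, tokens, d_in)
mha = MultiHeadAttention(d_in=16, d_out=32, context_length=6, num_heads=4)
print(mha(x).shape)                         # torch.Size([2, 6, 32])
print("mask" in dict(mha.named_buffers()))  # True: the mask is a buffer, not a parameter
```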