Chapter 3: Coding Attention Mechanisms
Main Chapter Code
- 01_main-chapter-code contains the main chapter code.
Bonus Materials
- 02_bonus_efficient-multihead-attention implements and compares several variants of multi-head attention
- 03_understanding-buffers explains the idea behind PyTorch buffers, which are used to implement the causal attention mechanism in chapter 3
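
As a rough illustration of the buffer idea mentioned above, the following minimal sketch registers a causal mask as a PyTorch buffer and applies it to a matrix of attention scores. The class name `CausalMaskDemo` and the `context_length` argument are hypothetical and chosen only for this example; this is not the chapter's actual implementation.

```python
import torch
import torch.nn as nn


class CausalMaskDemo(nn.Module):
    """Hypothetical demo module (not the chapter's class)."""

    def __init__(self, context_length):
        super().__init__()
        # Ones above the main diagonal mark the "future" positions to hide.
        mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)
        # A buffer is stored on the module and in its state_dict,
        # but it is not a trainable parameter.
        self.register_buffer("mask", mask)

    def forward(self, attn_scores):
        num_tokens = attn_scores.shape[-1]
        # Set future positions to -inf so softmax assigns them zero weight.
        masked = attn_scores.masked_fill(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        return torch.softmax(masked, dim=-1)


demo = CausalMaskDemo(context_length=4)
weights = demo(torch.zeros(4, 4))  # uniform scores, for illustration only
num_params = len(list(demo.parameters()))
in_state_dict = "mask" in demo.state_dict()
```

Because the mask is a buffer rather than a plain tensor attribute, calling `demo.to(device)` moves it together with the rest of the module, which avoids device-mismatch errors when the attention scores live on a GPU.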
In the video below, I provide a code-along session that covers some of the chapter contents as supplementary material.
