mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-11-29 16:40:08 +00:00

Chapter 3: Coding Attention Mechanisms

Main Chapter Code

01_main-chapter-code contains the main chapter code.

Bonus Materials

02_bonus_efficient-multihead-attention implements and compares several variants of multi-head attention.
03_understanding-buffers explains the idea behind PyTorch buffers, which are used to implement the causal attention mechanism in chapter 3.
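
To give a rough idea of what the buffers material is about, here is a minimal sketch of a single-head causal attention module that stores its mask as a buffer. It is not the chapter's code: the class name `CausalSelfAttention` and the dimensions used are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn


class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention (illustrative sketch, hypothetical name)."""

    def __init__(self, d_in, d_out, context_length):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        # True above the diagonal marks "future" positions to hide.
        # Registering the mask as a buffer makes it part of the module's
        # state (state_dict, .to(device)) without making it trainable.
        mask = torch.triu(
            torch.ones(context_length, context_length), diagonal=1
        ).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):
        # x has shape (batch, num_tokens, d_in)
        num_tokens = x.shape[1]
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)
        scores = queries @ keys.transpose(1, 2)
        # Hide future tokens before the softmax so they get zero weight.
        scores = scores.masked_fill(
            self.mask[:num_tokens, :num_tokens], float("-inf")
        )
        weights = torch.softmax(scores / keys.shape[-1] ** 0.5, dim=-1)
        return weights @ values


torch.manual_seed(0)
attn = CausalSelfAttention(d_in=4, d_out=3, context_length=6)
x = torch.randn(2, 5, 4)  # 2 sequences, 5 tokens each, 4 features per token
out = attn(x)
print(out.shape)
print(list(attn.state_dict().keys()))
```

Because the mask is a buffer rather than a plain tensor attribute, calling `attn.to(device)` moves it together with the weights, and it is saved in the `state_dict` even though the optimizer never sees it as a parameter.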

In the video below, I provide a code-along session that covers some of the chapter contents as supplementary material.