LLMs-from-scratch/ch03/02_bonus_efficient-multihead-attention
(from Sebastian Raschka's LLMs-from-scratch repository)

More Efficient Multi-Head Attention Implementations

Summary

The figures below summarize the performance benchmarks for the different implementations (lower is better).

- Figure: Forward pass only
- Figure: Forward and backward pass
- Figure: Forward and backward pass after compilation
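The benchmarked implementations differ mainly in how the per-head attention computation is organized. As a framework-free sketch of the core idea (assuming NumPy; the function names `mha_loop` and `mha_batched` are illustrative, not from the repository's notebooks), the snippet below contrasts a per-head Python loop with a single batched matmul over all heads, which computes the same result but avoids Python-level loop overhead:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mha_loop(q, k, v):
    # q, k, v: (batch, heads, seq, head_dim); one attention call per head
    b, h, t, d = q.shape
    out = np.empty_like(q)
    for i in range(h):
        scores = q[:, i] @ k[:, i].transpose(0, 2, 1) / np.sqrt(d)
        out[:, i] = softmax(scores) @ v[:, i]
    return out

def mha_batched(q, k, v):
    # same computation as mha_loop, but as one batched matmul over all heads
    d = q.shape[-1]
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d)
    return softmax(scores) @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((2, 4, 8, 16)) for _ in range(3))
print(np.allclose(mha_loop(q, k, v), mha_batched(q, k, v)))  # True
```

In PyTorch, the batched variant additionally benefits from fused kernels (e.g. `torch.nn.functional.scaled_dot_product_attention`) and from `torch.compile`, which is what the "after compilation" benchmark measures.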