# More Efficient Multi-Head Attention Implementations
- [mha-implementations.ipynb](mha-implementations.ipynb) implements and benchmarks several multi-head attention variants against each other (a rough sketch of two such variants follows this list)
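
The exact variants benchmarked live in the notebook itself; as a rough illustration only, the sketch below contrasts a hand-written causal multi-head attention with a version that delegates the attention math to PyTorch's fused `scaled_dot_product_attention`. The class names are placeholders, not the notebook's:

```python
# Illustrative sketch, not the notebook's exact code.
import torch
import torch.nn as nn


class MHAManual(nn.Module):
    """Multi-head attention with explicit softmax(QK^T / sqrt(d)) V math."""

    def __init__(self, d_in, d_out, num_heads):
        super().__init__()
        assert d_out % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.qkv = nn.Linear(d_in, 3 * d_out)   # combined Q, K, V projection
        self.proj = nn.Linear(d_out, d_out)

    def forward(self, x):
        b, t, _ = x.shape
        qkv = self.qkv(x).view(b, t, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)    # each: (b, heads, t, head_dim)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim**0.5
        # Causal mask: each position may only attend to itself and the past.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        attn = attn.masked_fill(mask, float("-inf")).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.proj(out)


class MHAFused(MHAManual):
    """Same projections, but the attention math runs in the fused kernel."""

    def forward(self, x):
        b, t, _ = x.shape
        qkv = self.qkv(x).view(b, t, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)
        out = nn.functional.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(out.transpose(1, 2).reshape(b, t, -1))
```

Both versions produce the same output up to numerical noise; the fused path can avoid materializing the full t×t attention matrix on backends where FlashAttention-style kernels are available, which is where most of its speedup comes from.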
## Summary
The figures below summarize the benchmark results (lower runtimes are better); a minimal sketch of the underlying timing measurement follows the figures.
**Forward pass only**

*[benchmark figure]*

**Forward and backward pass**

*[benchmark figure]*
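The numbers behind such figures come from timing each implementation in forward-only mode and in forward-plus-backward mode. Below is a minimal sketch of that measurement pattern, using PyTorch's built-in `nn.MultiheadAttention` as a stand-in module; the notebook's actual benchmark code, tensor shapes, and utilities may differ:

```python
# Hedged timing sketch; shapes and the benchmarked module are placeholders.
import time

import torch
import torch.nn as nn


def timed(fn, repeats=10, warmup=3):
    """Average seconds per call of fn, synchronizing around CUDA kernels."""
    for _ in range(warmup):          # warm-up so one-time setup is not measured
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats


device = "cuda" if torch.cuda.is_available() else "cpu"
mha = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True).to(device)
x = torch.randn(8, 1024, 768, device=device)

fwd = lambda: mha(x, x, x, need_weights=False)
fwd_bwd = lambda: mha(x, x, x, need_weights=False)[0].sum().backward()

print(f"forward only:       {timed(fwd):.4f} s/iter")
print(f"forward + backward: {timed(fwd_bwd):.4f} s/iter")
```

Synchronizing before and after the timed loop matters on GPU: CUDA kernels launch asynchronously, so without `torch.cuda.synchronize()` the timer would mostly measure kernel-launch overhead rather than the actual compute.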