# More Efficient Multi-Head Attention Implementations
- [mha-implementations.ipynb](mha-implementations.ipynb) implements and compares different multi-head attention variants (a minimal sketch of one such variant is shown below)
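
For orientation, here is a minimal sketch of one such variant: a causal multi-head attention module that uses a fused QKV projection together with PyTorch's `torch.nn.functional.scaled_dot_product_attention`. The class name and hyperparameters below are illustrative and may not match the notebook's code exactly:

```python
import torch
import torch.nn as nn


class MHAScaledDotProduct(nn.Module):
    """Causal multi-head attention with a fused QKV projection and
    the fused scaled_dot_product_attention kernel (illustrative sketch)."""

    def __init__(self, d_in, d_out, num_heads, dropout=0.0, qkv_bias=False):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.d_out = d_out
        self.qkv = nn.Linear(d_in, 3 * d_out, bias=qkv_bias)  # one projection for Q, K, V
        self.proj = nn.Linear(d_out, d_out)
        self.dropout = dropout

    def forward(self, x):
        b, num_tokens, _ = x.shape

        # (b, num_tokens, 3 * d_out) -> (b, num_tokens, 3, num_heads, head_dim)
        qkv = self.qkv(x).view(b, num_tokens, 3, self.num_heads, self.head_dim)
        # -> (3, b, num_heads, num_tokens, head_dim), then split into Q, K, V
        queries, keys, values = qkv.permute(2, 0, 3, 1, 4)

        # Fused attention kernel; is_causal=True applies the autoregressive mask
        context = nn.functional.scaled_dot_product_attention(
            queries, keys, values,
            dropout_p=self.dropout if self.training else 0.0,
            is_causal=True,
        )

        # (b, num_heads, num_tokens, head_dim) -> (b, num_tokens, d_out)
        context = context.transpose(1, 2).reshape(b, num_tokens, self.d_out)
        return self.proj(context)
```

Trade-offs along these lines (separate vs. fused QKV projections, a manual attention computation vs. the fused kernel) are the kind of differences the notebook benchmarks.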
## Summary
The figures below summarize the performance benchmarks (lower is better).
### Forward pass only

*(benchmark figure)*

### Forward and backward pass

*(benchmark figure)*
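
The notebook's exact benchmarking setup is not reproduced here, but as a rough sketch of how such forward-only and forward-plus-backward timings can be collected with `torch.utils.benchmark` (reusing the illustrative `MHAScaledDotProduct` module from the sketch above; batch size, sequence length, and embedding dimension are arbitrary placeholder values):

```python
import torch
from torch.utils import benchmark


def time_forward(model, x, num_runs=100):
    """Median wall-clock time (seconds) of a forward pass; Timer handles CUDA sync."""
    timer = benchmark.Timer(
        stmt="model(x)",
        globals={"model": model, "x": x},
    )
    return timer.timeit(num_runs).median


def time_forward_backward(model, x, num_runs=100):
    """Median wall-clock time (seconds) of a forward + backward pass."""
    timer = benchmark.Timer(
        stmt="model(x).sum().backward()",
        globals={"model": model, "x": x},
    )
    return timer.timeit(num_runs).median


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.randn(8, 1024, 768, device=device)  # (batch, num_tokens, embed_dim)
    mha = MHAScaledDotProduct(d_in=768, d_out=768, num_heads=12).to(device)

    print(f"forward only:       {time_forward(mha, x) * 1e3:.2f} ms")
    print(f"forward + backward: {time_forward_backward(mha, x) * 1e3:.2f} ms")
```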