# More Efficient Multi-Head Attention Implementations

- [mha-implementations.ipynb](mha-implementations.ipynb) contains and compares different implementations of multi-head attention.

### Summary

The figures below summarize the performance benchmarks (lower is better).

#### Forward pass only

#### Forward and backward pass

#### Forward and backward pass after compilation