mjun0812 / flash-attention-prebuild-wheels

Provides pre-built flash-attention package wheels for Linux and Windows, built with GitHub Actions.

1,122 stars
60 forks
4 issues

Repository Overview (README excerpt)

# flash-attention pre-build wheels

This repository provides pre-built wheels for flash-attention.

Since building flash-attention takes a **very long time** and is resource-intensive, I also build and provide wheels for combinations of CUDA and PyTorch that are not officially distributed.

**This repository uses a self-hosted runner and AWS CodeBuild to build the wheels. If you find this project helpful, please consider sponsoring to help maintain the infrastructure!**

**Special thanks to @KiralyCraft for providing the computing resources used to build wheels. Thank you!!**

## Install

1. Select the versions of Python, CUDA, PyTorch, and flash_attn.
2. Find the corresponding wheel on the **Useful Search Page**, the Packages page, or the releases page.
3. Install the wheel directly, or download it and install it locally.

## Packages

### Coverage

| Platform | Existing | Missing | Coverage |
|----------|----------|---------|----------|
| Linux x86_64 | 242 | 18 | 93.1% |
| Linux ARM64 | 30 | 36 | 45.5% |
| Windows | 36 | 30 | 54.5% |
| **Total** | **308** | **84** | **78.6%** |

> [!NOTE]
> Since v0.8.0, Flash Attention 3 wheels are also available.
> Flash Attention 3 requires Hopper (SM90) or newer GPUs and CUDA 12.3+.

> [!NOTE]
> Since v0.7.0, wheels are built for the manylinux2_28 platform.
> The Linux x86_64 manylinux wheels are compatible with old glibc versions.

> [!NOTE]
> Since v0.5.0, wheels are built with a local version label indicating the CUDA and PyTorch versions.

See ./doc/packages.md for the full list of available packages.

## History

The history of this repository is available here.

## Citation

If you use this repository in your research and find it helpful, please cite it!

## Acknowledgments

- @okaris: Sponsored me!
- @xhiroga: Sponsored me!
- cjustus613: Bought me a coffee!
- @KiralyCraft: Provided computing resources!
- @kun432: Bought me a coffee 3 times!
- @wodeyuzhou: Sponsored me!
- Gabr1e1: Bought me a coffee!
- wp: Bought me a coffee!
- wangxiyu191: Bought me a coffee!
- @sr99622: Sponsored me!

## Star History and Download Statistics

## Original Repository

repo

## Self build

If you cannot find the version you are looking for, you can fork this repository and build a wheel with GitHub Actions.

1. Fork this repository.
2. (Optional) Set up your self-hosted runner.
3. Edit the Python script to set the versions you want to build. You can use GitHub-hosted runners or self-hosted runners.
4. Add a tag to trigger the build workflow.

Note that some version combinations may fail to build.

### Self-Hosted Runner Build

For some version combinations, wheels cannot be built on GitHub-hosted runners because of job time limits. To build wheels for these versions, use self-hosted runners.

See self-hosted-runner/README.md for detailed setup instructions.

## Build Environments

This repository builds wheels across multiple platforms and environments:

| Platform           | Runner Type   | Container Image |
| ------------------ | ------------- | --------------- |
| **Linux x86_64**   | GitHub-hosted | -               |
| **Linux x86_64**   | Self-hosted   |                 |
| **Linux ARM64**    | GitHub-hosted | -               |
| **Windows x86_64** | GitHub-hosted | -               |
| **Windows x86_64** | Self-hosted   | -               |
| **Windows x86_64** | AWS CodeBuild | -               |
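
The percentages in the coverage table above can be reproduced from the Existing and Missing counts. The following is a minimal sketch that assumes coverage means `Existing / (Existing + Missing)`, which is consistent with the published figures; the names `COUNTS`, `coverage`, and `summarize` are hypothetical and are not part of this repository.

```python
# Counts copied from the coverage table: platform -> (existing, missing).
COUNTS = {
    "Linux x86_64": (242, 18),
    "Linux ARM64": (30, 36),
    "Windows": (36, 30),
}


def coverage(existing: int, missing: int) -> float:
    """Return existing / (existing + missing) as a percentage.

    Assumption: this is how the table's Coverage column is defined.
    """
    total = existing + missing
    return 100.0 * existing / total if total else 0.0


def summarize(counts: dict) -> dict:
    """Return per-platform rows plus a Total row, rounded to one decimal."""
    rows = {
        name: (existing, missing, round(coverage(existing, missing), 1))
        for name, (existing, missing) in counts.items()
    }
    total_existing = sum(existing for existing, _ in counts.values())
    total_missing = sum(missing for _, missing in counts.values())
    rows["Total"] = (
        total_existing,
        total_missing,
        round(coverage(total_existing, total_missing), 1),
    )
    return rows


if __name__ == "__main__":
    for name, (existing, missing, pct) in summarize(COUNTS).items():
        print(f"{name:<14}{existing:>6}{missing:>6}{pct:>8.1f}%")
```

Computing the Total row from the summed counts, rather than averaging the per-platform percentages, matters here because the platforms have very different numbers of possible wheels.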