GPU Python packages
that just install.

Pre-built CUDA wheels for flash-attn, deepspeed, mamba-ssm, and 7 more packages.
One command. No compiling.

bash

$ easywheels install flash-attn

Detected: Python 3.12, CUDA 12.8, RTX 4090 (sm_89), torch 2.9

Resolving flash-attn...

Found: flash_attn-2.8.3+cu128torch2.9-cp312-cp312-linux_x86_64.whl

Downloading flash_attn-2.8.3 (250.7 MB)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 250.7/250.7 MB 9.2s

Successfully installed flash-attn-2.8.3

14-day free trial. No credit card required.

flash-attn flash-attn-3 deepspeed mamba-ssm causal-conv1d llama-cpp-python exllamav2 gptqmodel sageattention flashinfer flash-attn flash-attn-3 deepspeed mamba-ssm causal-conv1d llama-cpp-python exllamav2 gptqmodel sageattention flashinfer

The Problem

You know the drill.

45+ minutes

Average compile time for flash-attn from source. Per environment. Per CUDA version.

CUDA mismatch

Wrong toolkit version? Incompatible driver? Start over from scratch.

cmake errors

Missing headers, broken builds, cryptic C++ template errors three screens long.

45 min

Build from source

9 seconds

EasyWheels

flash-attn 2.8.3 on Python 3.12. Measured on a real install.

Every ML developer has lost an afternoon to this.
We built EasyWheels so you never have to again.

How It Works

Three steps.
Seriously, that's it.

Install the CLI

$ pip install easywheels
$ easywheels login

One-time setup. Auto-detects your environment.

Install any package

$ easywheels install flash-attn

Detects your Python, CUDA, GPU, and torch version. Picks the exact right wheel.

Done in seconds

✓ flash-attn 2.8.3 installed (3.2s)

Pre-compiled wheel. No cmake. No waiting.

Founding Member Pricing

Start free. Scale when you're ready.

14-day trial. 3 downloads. No credit card required.

Early supporter rates — locked in for founding members. Prices will increase as the registry grows.

Trial

$0 / 14 days

Full registry access
3 downloads included

Lite

$9 /mo

10 downloads / month
Registry access
Email support

RECOMMENDED

Pro

$19 /mo

Unlimited downloads
Custom builds (coming soon)
Priority support

Team

$49 /mo

Unlimited downloads
Custom builds (coming soon)
5 team seats included
Dedicated support

Need more? Enterprise plans start at $199/mo. Contact us →

Or pay per download — $2/wheel, no subscription needed.

FAQ

Got questions?
We got answers.

A registry of pre-built GPU Python wheels with a CLI that auto-detects your CUDA version, GPU, torch, and Python. Instead of compiling packages like flash-attn or deepspeed from source (30-60 minutes), you run easywheels install flash-attn and it picks the exact right wheel for your setup. We cover 10 packages across CUDA 11.8-13.0 and Python 3.10-3.13.

Install the CLI and log in:

pip install easywheels
easywheels login
easywheels install flash-attn

The CLI auto-detects your CUDA, GPU, torch, and Python version. You can also use pip directly with --extra-index-url.

GPU Python packagesthat just install.