How to use mlx-community/Mega-ASR-8bit with MLX:
# Download the model from the Hub
pip install huggingface_hub[hf_xet]
huggingface-cli download --local-dir Mega-ASR-8bit mlx-community/Mega-ASR-8bit
Mega-ASR-8bit
8-bit quantized robust-merged variant of Mega-ASR, in MLX format, for mlx-audio.
No router — always-on robust. The Mega-ASR robustness LoRA is merged into the Qwen3-ASR-1.7B base and then quantized, so the per-utterance clean/degraded router is not present (you cannot add fp32 LoRA deltas to quantized weights). This model always runs the robust path.
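The merge-then-quantize order described above can be illustrated with a toy NumPy sketch. Everything in it is an assumption made for illustration: the matrix shapes, `alpha`, `rank`, the group size of 64, and the simple group-wise affine scheme are not taken from Mega-ASR or from MLX internals. It only shows that the LoRA update is folded into the dense weight first, after which only integer codes plus per-group scales and offsets remain.

```python
import numpy as np

def merge_lora(W, A, B, alpha, rank):
    """Fold a LoRA update into the dense weight: W' = W + (alpha / rank) * B @ A."""
    return W + (alpha / rank) * (B @ A)

def quantize_affine(W, bits=8, group_size=64):
    """Toy group-wise affine quantization (illustrative, not the MLX kernel).

    Each run of `group_size` values in a row gets its own scale and offset.
    """
    rows, cols = W.shape
    groups = W.reshape(rows, cols // group_size, group_size)
    lo = groups.min(axis=-1, keepdims=True)
    hi = groups.max(axis=-1, keepdims=True)
    levels = 2**bits - 1
    scale = np.maximum(hi - lo, 1e-12) / levels
    codes = np.round((groups - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize_affine(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    groups = codes.astype(np.float32) * scale + lo
    return groups.reshape(groups.shape[0], -1)

# Hypothetical layer: a 128x256 base weight with a rank-8 adapter.
rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 128, 256, 8, 16.0
W = rng.standard_normal((d_out, d_in)).astype(np.float32)
A = 0.01 * rng.standard_normal((rank, d_in)).astype(np.float32)
B = 0.01 * rng.standard_normal((d_out, rank)).astype(np.float32)

# Step 1: merge in full precision. Step 2: quantize the merged weight.
W_merged = merge_lora(W, A, B, alpha, rank)
codes, scale, lo = quantize_affine(W_merged, bits=8)
W_hat = dequantize_affine(codes, scale, lo)

max_err = float(np.abs(W_hat - W_merged).max())
print(f"codes dtype: {codes.dtype}, max reconstruction error: {max_err:.5f}")
```

Once the weight is stored as `codes`, there is no dense float matrix left to which a separate adapter delta could be added or removed per utterance, which is why this variant ships with the robust path permanently merged.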
For the full dynamic Mega-ASR (clean speech on the base path, noisy speech on the LoRA path), use mlx-community/Mega-ASR-bf16.
Use this 8-bit variant for noisy-only or memory-constrained deployments: ~2.5 GB and ~4× faster than the dynamic model (no per-clip LoRA toggling). For the smallest effectively lossless option, prefer mlx-community/Mega-ASR-6bit.
Use with mlx-audio
pip install mlx-audio
from mlx_audio.stt import load
model = load("mlx-community/Mega-ASR-8bit")
result = model.generate("audio.wav", language="en")
print(result.text)
Quality
8-bit is effectively lossless versus bf16 on noisy speech. WER on a NOIZEUS subset (merged-robust path):
| Precision | overall WER | size |
|---|---|---|
| bf16 | 7.95 | 4.08 GB |
| 6-bit | 7.89 | 2.04 GB |
| 8-bit (this model) | 8.06 | 2.47 GB |
(4-bit degrades to 10.78 WER and is not published.)
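To make the "effectively lossless" claim concrete, the short sketch below computes the WER change relative to bf16 and the size ratio for each precision. The numbers are copied from the table and the 4-bit note above; the `results` dictionary and the `compare` helper are names introduced here for illustration only.

```python
# Values copied from the table above; 4-bit has no published size.
results = {
    "bf16": {"wer": 7.95, "size_gb": 4.08},
    "6-bit": {"wer": 7.89, "size_gb": 2.04},
    "8-bit": {"wer": 8.06, "size_gb": 2.47},
    "4-bit": {"wer": 10.78, "size_gb": None},
}
baseline = results["bf16"]

def compare(name):
    """Return (absolute WER delta, relative WER change in %, size ratio) vs bf16."""
    entry = results[name]
    wer_delta = entry["wer"] - baseline["wer"]
    rel_change = 100.0 * wer_delta / baseline["wer"]
    size_ratio = None
    if entry["size_gb"] is not None:
        size_ratio = entry["size_gb"] / baseline["size_gb"]
    return wer_delta, rel_change, size_ratio

for name in ("6-bit", "8-bit", "4-bit"):
    wer_delta, rel_change, size_ratio = compare(name)
    size_text = "n/a" if size_ratio is None else f"{size_ratio:.2f}x"
    print(f"{name}: WER delta {wer_delta:+.2f} ({rel_change:+.1f}% relative), size {size_text} of bf16")
```

On these figures the 8-bit model is about 1.4% worse than bf16 in relative WER at roughly 61% of the size, while 4-bit is roughly 36% worse, which is consistent with 4-bit not being published.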
License & attribution
Apache-2.0. Built on zhifeixie/Mega-ASR (adapter + router) and Qwen/Qwen3-ASR-1.7B (base).