Mega-ASR-8bit

8-bit quantized robust-merged variant of Mega-ASR, in MLX format, for mlx-audio.

No router: this model is always on the robust path. The Mega-ASR robustness LoRA is merged into the Qwen3-ASR-1.7B base and the merged weights are then quantized. The per-utterance clean/degraded router is therefore not present, because once the weights are quantized the fp32 LoRA deltas can no longer be added or removed per clip.
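As a rough illustration of why a merged model has no separate LoRA path, the toy sketch below folds a low-rank update into a base weight matrix and checks that the merged matrix gives the same output as running the base and LoRA paths side by side. The matrices, the `scale` value, and the helper names are made up for illustration; this is not the conversion script used for this model.

```python
# Toy LoRA merge in pure Python. All numbers are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, elementwise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

W = [[1.0, 2.0], [3.0, 4.0]]  # base weight, shape (out=2, in=2)
A = [[0.5, -0.5]]             # LoRA down-projection, shape (rank=1, in=2)
B = [[2.0], [1.0]]            # LoRA up-projection, shape (out=2, rank=1)
scale = 0.5                   # hypothetical alpha / rank

# Merge: W_merged = W + scale * (B @ A)
W_merged = add(W, matmul(B, A), scale)

x = [[1.0], [2.0]]            # one input column vector

# Dynamic model: base output plus the scaled LoRA output, which a router could switch off.
y_dynamic = add(matmul(W, x), matmul(B, matmul(A, x)), scale)

# Merged model: a single matrix multiply, with no way to remove the update afterwards.
y_merged = matmul(W_merged, x)

print(y_dynamic, y_merged)
```

After merging, only `W_merged` is stored. Quantizing it rounds the combined values, so the original `W` and the LoRA contribution can no longer be separated.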

For the full dynamic Mega-ASR — clean speech on the base path, noisy speech on the LoRA path — use mlx-community/Mega-ASR-bf16.

Use this 8-bit variant for noisy-only or memory-constrained deployments: it is ~2.5 GB and ~4× faster than the dynamic model, since there is no per-clip LoRA toggling. For the smallest effectively lossless option, prefer mlx-community/Mega-ASR-6bit.

Use with mlx-audio

pip install mlx-audio

from mlx_audio.stt import load

model = load("mlx-community/Mega-ASR-8bit")
result = model.generate("audio.wav", language="en")
print(result.text)

Quality

8-bit is effectively lossless versus bf16 on noisy speech. WER on a NOIZEUS subset (merged-robust path):

Precision	overall WER	size
bf16	7.95	4.08 GB
6-bit	7.89	2.04 GB
8-bit (this model)	8.06	2.47 GB

(4-bit degrades to 10.78 WER and is not published.)
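For readers unfamiliar with the metric, the sketch below shows the standard word error rate calculation: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is a minimal illustration, not the evaluation script behind the table; the function name and the example sentences are assumptions, and it returns a fraction, whereas the table values look like percentages (the source does not state the unit).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{100 * wer:.2f}")
```

Real evaluations usually normalize text first (casing, punctuation, numerals), which can shift the score noticeably; that step is omitted here.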

License & attribution

Apache-2.0. Built on zhifeixie/Mega-ASR (adapter + router) and Qwen/Qwen3-ASR-1.7B (base).
