Audio
  stabilityai/stable-audio-open-small • Text-to-Audio • Updated May 27, 2025 • 2.17k • 243
  ONNX Model Explorer 🔍 • Running • Featured • 86 • Explore ONNX models interactively
  microsoft/VibeVoice-1.5B • Text-to-Speech • 3B • Updated 14 days ago • 328k • 2.21k
  nvidia/audio-flamingo-3 • Audio-Text-to-Text • Updated Nov 28, 2025 • 752 • 140
Play-Ground
  Inference Playground 🔋 • Running on CPU Upgrade • 247 • Customize theme based on user preference
OCR
  SkalskiP/paligemma2_latex_ocr_v5 • Updated Dec 11, 2024 • 1 • 2
  nanonets/Nanonets-OCR-s • Image-Text-to-Text • 4B • Updated Jun 20, 2025 • 26.8k • 1.58k
  nvidia/NVIDIA-Nemotron-Parse-v1.1 • Image-Text-to-Text • Updated 8 days ago • 116k • 138
Multimode
  microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 270k • 1.57k
  ByteDance/Sa2VA-8B • Image-Text-to-Text • 8B • Updated Sep 8, 2025 • 851 • 65
Speako
  ibm-granite/granite-speech-3.2-8b • Automatic Speech Recognition • 8B • Updated Apr 16, 2025 • 150 • 85
  ByteDance/MegaTTS3 • Text-to-Speech • Updated Apr 4, 2025 • 107 • 415
  Demo 🚀 • Running • 2 • Transcribe and translate audio/video files into text
  nvidia/audio-flamingo-3-hf • Audio-Text-to-Text • 8B • Updated 8 days ago • 71.1k • 169