Audio-Text-to-Text
Transformers
Safetensors
step_audio_2
text-generation
audio-reasoning
chain-of-thought
multi-modal
step-audio-r1
custom_code
8-bit precision
compressed-tensors
How to use TransWithAI/Step-Audio-R1-NVFP4A16 with Transformers:
```python
# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TransWithAI/Step-Audio-R1-NVFP4A16",
    trust_remote_code=True,
    dtype="auto",
)
```
```json
{
  "additional_special_tokens": [
    "<|EOT|>",
    "<|BOT|>",
    "<|CALL_START|>",
    "<|CALL_END|>",
    "<|THINK_START|>",
    "<|THINK_END|>",
    "<|IMG_START|>",
    "<|IMG_END|>",
    "<|META_START|>",
    "<|META_END|>",
    "<im_patch>",
    "<im_start>",
    "<im_end>",
    "<dream>",
    "<dream_start>",
    "<dream_end>",
    "<|MASK_1e69f|>",
    "<|UNMASK_1e69f|>",
    "<video_start>",
    "<video_end>",
    "<patch_start>",
    "<patch_end>",
    "<patch_newline>",
    "<audio_start>",
    "<audio_end>",
    "<audio_patch>",
    "<audio_patch_pad>",
    "<|SC|>",
    "<tts_start>",
    "<tts_end>",
    "<tts_pad>"
  ],
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
```
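The special tokens map above can be inspected with nothing but the Python standard library. The following is a minimal sketch, not part of the repository: it embeds an abbreviated copy of the map (only 5 of the 31 additional tokens), checks whether the pad token reuses the EOS token, and lists markers that come in start/end pairs. The helper names `token_content`, `pad_shares_eos`, and `paired_markers` are hypothetical and chosen for illustration only.

```python
import json

# Abbreviated copy of the map above; only a few additional tokens are kept.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "additional_special_tokens": [
    "<|THINK_START|>", "<|THINK_END|>",
    "<audio_start>", "<audio_end>", "<audio_patch>"
  ],
  "eos_token": {"content": "<|endoftext|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false},
  "pad_token": {"content": "<|endoftext|>", "lstrip": false,
                "normalized": false, "rstrip": false, "single_word": false}
}
""")


def token_content(entry):
    """Return the token string for a plain string or a dict with a "content" key."""
    return entry if isinstance(entry, str) else entry["content"]


def pad_shares_eos(token_map):
    """True when the pad token and the EOS token are the same string."""
    return token_content(token_map["pad_token"]) == token_content(token_map["eos_token"])


def paired_markers(tokens):
    """Return (start, end) tuples for tokens whose closing marker is also listed."""
    present = set(tokens)
    pairs = []
    for tok in tokens:
        if "start" not in tok.lower():
            continue
        # Try both the lowercase and the uppercase spelling of the closing marker.
        for candidate in (tok.replace("start", "end"), tok.replace("START", "END")):
            if candidate != tok and candidate in present:
                pairs.append((tok, candidate))
                break
    return pairs


if __name__ == "__main__":
    print("pad == eos:", pad_shares_eos(SPECIAL_TOKENS_MAP))
    for start, end in paired_markers(SPECIAL_TOKENS_MAP["additional_special_tokens"]):
        print(start, "...", end)
```

Because the pad token and the EOS token share the string `<|endoftext|>`, padded positions cannot be told apart from a genuine end of sequence by token content alone, so batched inputs generally need an explicit attention mask.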