Training / Fine Tuning Script?
It's quite astonishing how little data and training went into this. I would like to push it further to see if it can be taken to production.
Can you help with training or fine-tuning scripts?
Same here, bro. I am also looking for fine-tuning scripts, but I don't think a fine-tuning script is available as open source. I will share the script with you soon.
Use the speech codec to convert your audio data into speech codes. Map those codes to the token format and then you can do simple SFT training just like for text.
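A minimal sketch of the "map codes to the token format" step, assuming the codec has already produced a list of integer codes. The token strings (`<|s_N|>`, `<|speech_start|>`, `<|speech_end|>`) and the helper names are hypothetical placeholders, not the model's confirmed format; check the tokenizer's vocabulary for the real one.

```python
from typing import Dict, List

# Hypothetical token format; the real one depends on the model's tokenizer.
SPEECH_TOKEN = "<|s_{}|>"
SPEECH_START = "<|speech_start|>"
SPEECH_END = "<|speech_end|>"


def codes_to_tokens(codes: List[int]) -> str:
    """Render integer codec codes as a string of speech tokens."""
    return "".join(SPEECH_TOKEN.format(c) for c in codes)


def build_sft_example(text: str, codes: List[int]) -> Dict[str, str]:
    """Pair a transcript (prompt) with its speech tokens (completion)."""
    return {
        "prompt": text,
        "completion": SPEECH_START + codes_to_tokens(codes) + SPEECH_END,
    }


# Made-up codes for illustration; real ones come from the speech codec.
example = build_sft_example("namaste", [12, 7, 512])
print(example["completion"])
```

Each resulting prompt/completion pair is plain text, so it can be fed to an ordinary text SFT pipeline once the speech tokens exist in the tokenizer's vocabulary.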
Thank you very much for your response, dear.
But, dear developer, I have a question about how to add emotion tags like sad, angry, happy, etc. Did you add these as special tokens, or did you apply some kind of conditioning? Please respond; I would love to see your response. One possibility is that you fine-tuned on labeled data, for example by collecting audio for each emotion and adding emotion tags to the vocabulary.
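To make the "emotion tags in the vocabulary" idea concrete, here is a rough sketch using a toy vocabulary dictionary. It is only a guess at one way this could be done, not the developer's confirmed method; the tag names and both helper functions are hypothetical.

```python
from typing import Dict, List

# Hypothetical tag names; the actual model may use different ones or none.
EMOTION_TAGS = ["<sad>", "<angry>", "<happy>"]


def add_tags_to_vocab(vocab: Dict[str, int], tags: List[str]) -> Dict[str, int]:
    """Return a copy of vocab with each new tag assigned the next free id."""
    extended = dict(vocab)
    for tag in tags:
        if tag not in extended:
            extended[tag] = max(extended.values(), default=-1) + 1
    return extended


def tag_transcript(text: str, emotion: str) -> str:
    """Prefix a transcript with its emotion tag for labeled fine-tuning data."""
    tag = f"<{emotion}>"
    if tag not in EMOTION_TAGS:
        raise ValueError(f"unknown emotion: {emotion}")
    return f"{tag} {text}"


toy_vocab = {"a": 0, "b": 1}  # stand-in for a real tokenizer vocabulary
extended_vocab = add_tags_to_vocab(toy_vocab, EMOTION_TAGS)
print(tag_transcript("hello", "sad"))
```

With a real tokenizer, the new tags would also need embedding rows in the model before fine-tuning on the tagged transcripts.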
@rumourscape, please respond. I am thinking about more ideas, and it would be a big help if you could share how you did it. I am also thinking of adding non-verbal SSML tags (laugh, sighs, chuckles, giggle, clears throat) with intensity levels.