---
title: Live Football Commentary Translator
emoji: 
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.42.0
python_version: "3.11"
app_file: app.py
pinned: false
license: apache-2.0
---

# Live Football Commentary Translator

Speak (or upload) commentary in one language, hear it spoken in another.

## What this is

A Hugging Face Space that translates football commentary between languages.
It has two modes:

- **Single clip**: record or upload one clip, get one translation.
- **Continuous live**: start a session and speak naturally. The audio is cut
  into chunks at natural pauses (~0.8 s of silence), and the translated chunks
  play back in order.
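The pause-based chunking in continuous mode can be sketched roughly as below. This is a minimal illustration, not the code in `app.py`: the function name `split_at_pauses`, the RMS threshold, and the 20 ms frame size are assumptions; only the ~0.8 s pause length comes from this README.

```python
from typing import List, Sequence, Tuple


def split_at_pauses(
    samples: Sequence[float],
    sample_rate: int,
    min_silence_s: float = 0.8,  # pause length mentioned in this README
    threshold: float = 0.01,     # assumed RMS level below which a frame is "silent"
    frame_ms: int = 20,          # assumed analysis frame size
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of speech chunks separated by long pauses."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    frames_needed = max(1, round(min_silence_s * 1000 / frame_ms))

    chunks: List[Tuple[int, int]] = []
    start = None          # sample index where the current chunk began
    last_voiced_end = 0   # end index of the most recent non-silent frame
    silent_run = 0        # consecutive silent frames seen inside a chunk

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        if rms >= threshold:
            if start is None:
                start = pos
            last_voiced_end = pos + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= frames_needed:
                # Pause was long enough: close the chunk at the last voiced frame.
                chunks.append((start, last_voiced_end))
                start = None
                silent_run = 0

    if start is not None:
        chunks.append((start, last_voiced_end))
    return chunks
```

Each returned range would then be sent for translation as its own clip. Pauses shorter than `min_silence_s` stay inside a chunk, which is why short bursts with clear gaps chunk more predictably.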

**Sources:** English, Scottish English, German, Spanish, Arabic

**Targets:** all of the above, plus Swahili, Amharic, Afrikaans

## How it works

Two pipelines, routed by target language:

| Target language | Pipeline | Cost |
|---|---|---|
| English, Scottish English, German, Spanish, Arabic | Single Qwen-Omni call: audio in → translated speech out | 1 API call |
| Swahili, Amharic, Afrikaans | Qwen-Omni (audio → translated text), then YourVoic (text → speech) | 2 API calls |

Qwen-Omni is `qwen3.5-omni-plus` on DashScope International. The split exists
because Qwen-Omni does not produce intelligible speech in Swahili, Amharic,
or Afrikaans on its own. For those targets it only produces the translated
text, and YourVoic synthesizes the speech.
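The routing in the table above amounts to a lookup on the target language. A minimal sketch follows; the step labels and the function name `plan_pipeline` are hypothetical and are not taken from `app.py`.

```python
# Targets Qwen-Omni can speak directly (one API call), per the table above.
QWEN_NATIVE_TARGETS = {"English", "Scottish English", "German", "Spanish", "Arabic"}
# Targets that need YourVoic for the speech step (two API calls).
YOURVOIC_TARGETS = {"Swahili", "Amharic", "Afrikaans"}


def plan_pipeline(target: str) -> list[str]:
    """Return the ordered list of steps (hypothetical labels) for a target language."""
    if target in QWEN_NATIVE_TARGETS:
        return ["qwen_omni_audio_to_speech"]
    if target in YOURVOIC_TARGETS:
        return ["qwen_omni_audio_to_text", "yourvoic_text_to_speech"]
    raise ValueError(f"Unsupported target language: {target!r}")
```

The length of the returned list matches the "Cost" column: one step for natively supported targets, two for the YourVoic languages.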

## Deploy

1. Create a new Hugging Face Space with the SDK set to Gradio
2. Upload `app.py`, `requirements.txt`, and this `README.md`
3. Add secrets in **Settings → Variables and secrets**:
   - `DASHSCOPE_API_KEY` (required) — get one from DashScope International
   - `YOURVOIC_API_KEY` (required for Swahili/Amharic/Afrikaans only)
4. (Recommended) Set hardware to **ZeroGPU** if you have access. CPU also works
   but will be slower on the audio-decode steps.
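Spaces expose secrets to the app as environment variables, so a startup check along these lines can catch a missing key before the first request. This is a sketch under that assumption; the helper name `missing_secrets` is hypothetical, and `app.py` may handle this differently.

```python
import os
from typing import Mapping, Optional

# Targets that go through YourVoic and therefore need its API key.
YOURVOIC_TARGETS = {"Swahili", "Amharic", "Afrikaans"}


def missing_secrets(target: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required secrets that are unset or empty for this target."""
    env = os.environ if env is None else env
    required = ["DASHSCOPE_API_KEY"]  # always required
    if target in YOURVOIC_TARGETS:
        required.append("YOURVOIC_API_KEY")  # only for the YourVoic languages
    return [name for name in required if not env.get(name)]
```

Passing `env` explicitly makes the check easy to exercise without touching the real environment; in the Space it would be called with the default.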

## Expected latency

On free ZeroGPU, expect 3-8 seconds from end-of-speech to start-of-output. The
demo is designed to feel "live-ish" but not simultaneous-interpretation grade.
Speak in short bursts — one play, one tackle, one moment — rather than long
monologues.

## Known limitations

- "Scottish English" is treated as accented English in the system prompt rather
  than a separate language. Qwen-Omni's Scottish accent is decent but not
  authentic.
- YourVoic voice support per language is sparsely documented. The code falls
  back to a universal voice ("Peter") if the primary choice fails.
- Arabic voice cloning is intentionally not exposed — the underlying
  `qwen3-tts-vc` model doesn't support Arabic.
- Free-tier ZeroGPU has cold-start delays. First call after idle is slower.
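The YourVoic voice fallback mentioned in the list above could be structured as follows. The `tts` callable stands in for whatever function actually calls YourVoic; its signature is an assumption made for illustration, and only the fallback voice name "Peter" comes from this README.

```python
from typing import Callable

FALLBACK_VOICE = "Peter"  # universal voice named in this README


def synthesize_with_fallback(
    text: str,
    language: str,
    primary_voice: str,
    tts: Callable[[str, str, str], bytes],  # hypothetical: (text, language, voice) -> audio
) -> bytes:
    """Try the primary voice first; on any failure, retry once with the fallback voice."""
    try:
        return tts(text, language, primary_voice)
    except Exception:
        if primary_voice == FALLBACK_VOICE:
            raise  # nothing left to fall back to
        return tts(text, language, FALLBACK_VOICE)
```

If the fallback call also fails, its exception propagates to the caller, so the UI can report that speech synthesis was unavailable for that chunk.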

## Files

- `app.py` — Gradio UI and pipeline
- `requirements.txt` — Python dependencies
- `README.md` — this file (also the Space metadata header)