---
title: Multi-LLM API Gateway
emoji: 🛡️
colorFrom: indigo
colorTo: red
sdk: docker
pinned: false
license: apache-2.0
short_description: 'Secure Multi-LLM Gateway (Streamable HTTP / SSE)'
---
# Multi-LLM API Gateway
… or Universal MCP Hub (Sandboxed)
… or a secure AI wrapper with a dual interface: REST + MCP
aka: a clean, secure starting point for your own projects.
Pick the description that fits your use case. They're all correct.
> A production-grade **the-thing** that actually thinks about security.
> Built on [PyFundaments](PyFundaments.md) – running on **simpleCity**.
```
No key → no tool → no crash → no exposed secrets
```
> [!WARNING]
> Most MCP servers are prompts dressed up as servers. This one has a real architecture.
---
> [!IMPORTANT]
> This project is under active development – always use the latest release from [Codey Lab](https://github.com/Codey-LAB/Multi-LLM-API-Gateway) *(more stable builds land here first)*.
> This repo ([DEV](https://github.com/VolkanSah/Multi-LLM-API-Gateway)) is where the chaos happens. 🔬 A ⭐ on the repos will be cool.
---
## Why this exists
The AI ecosystem is full of servers with hardcoded keys, `os.environ` scattered everywhere, zero sandboxing. One misconfigured fork and your API keys are gone.
This is exactly the kind of negligence (and worse – outright fraud) that [Wall of Shames](https://github.com/Wall-of-Shames) documents: fake "AI tools" exploiting non-technical users – API wrappers dressed up as custom models, Telegram payment funnels, bought stars. If you build on open source, you should know this exists.
This hub is the antidote:
- **Structural sandboxing** – `app/*` can never touch `fundaments/` or `.env`. Not by convention. By design.
- **Guardian pattern** – `main.py` is the only process that reads secrets. It injects validated services as a dict. `app/*` never sees the raw environment.
- **Graceful degradation** – No key? Tool doesn't register. Server still starts. No crash, no error, no empty `None` floating around.
- **Single source of truth** – All tool/provider/model config lives in `app/.pyfun`. Adding a provider = edit one file. No code changes.
---
## Two Interfaces – One Server
This hub exposes **two completely independent interfaces** on the same hypercorn instance:
```
POST /api       → REST interface → for custom clients, desktop apps, CMS plugins
GET+POST /mcp   → MCP interface  → for Claude Desktop, Cursor, Windsurf, any MCP client
GET /           → Health check   → uptime, status
```
They share the same tool registry, provider config, and fallback chain. Adding a tool once makes it available on both interfaces automatically.
### REST API (`/api`)
Simple JSON POST – no protocol overhead, works with any HTTP client:
```json
POST /api
{"tool": "llm_complete", "params": {"prompt": "Hello", "provider": "anthropic"}}
```
Used by: Desktop Client (`DESKTOP_CLIENT/hub.py`), WordPress plugin, any custom integration.
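Want to poke it without a client? A few lines of `httpx` are enough – a minimal sketch, using the same payload as above; the hub's exact response schema isn't guaranteed here, so just print the raw JSON:

```python
# Minimal REST call against the hub's /api endpoint.
# Assumption: the hub is reachable at HUB_URL and answers with JSON.
import httpx

HUB_URL = "https://YOUR_USERNAME-universal-mcp-hub.hf.space"  # placeholder

payload = {
    "tool": "llm_complete",
    "params": {"prompt": "Hello", "provider": "anthropic"},
}

resp = httpx.post(f"{HUB_URL}/api", json=payload, timeout=60.0)
resp.raise_for_status()
print(resp.json())  # inspect the raw JSON to see the hub's response schema
```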
### MCP Interface (`/mcp`)
Full MCP protocol – tool discovery, structured calls, streaming responses.
**Primary transport: Streamable HTTP** (MCP spec 2025-11-25)
**Fallback transport: SSE** (legacy, configurable via `.pyfun`)
Configured via `HUB_TRANSPORT` in `app/.pyfun [HUB]`:
```ini
HUB_TRANSPORT = "streamable-http"   # default – MCP spec 2025-11-25
# HUB_TRANSPORT = "sse"             # legacy fallback for older clients
```
Used by: Claude Desktop, Cursor, Windsurf, any MCP-compatible client.
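For a quick protocol-level check from Python, the official `mcp` SDK ships a Streamable HTTP client (in recent SDK versions). A hedged sketch – the Space URL is a placeholder:

```python
# List the hub's tools over Streamable HTTP with the official MCP Python SDK.
# Requires a recent `mcp` package that includes the streamable HTTP client.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    url = "https://YOUR_USERNAME-universal-mcp-hub.hf.space/mcp"  # placeholder
    async with streamablehttp_client(url) as (read_stream, write_stream, _):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()           # MCP handshake
            tools = await session.list_tools()   # only key-gated tools appear
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())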
---
## Architecture
```
main.py (Guardian)
  │
  │ reads .env / HF Secrets
  │ initializes fundaments/* conditionally
  │ injects validated services as dict
  │
  └──► app/app.py (Orchestrator, sandboxed)
         │
         │ unpacks fundaments ONCE, at startup, never stores globally
         │ starts hypercorn (async ASGI)
         │ routes: GET / | POST /api | /mcp (transport-dependent)
         │
         ├── app/mcp.py       → FastMCP + transport handler (Streamable HTTP / SSE)
         ├── app/tools.py     → Tool registry (key-gated)
         ├── app/providers.py → LLM + Search execution + fallback chain
         ├── app/models.py    → Model limits, costs, capabilities
         ├── app/config.py    → .pyfun parser (single source of truth)
         └── app/db_sync.py   → Internal SQLite IPC (app/* state only)
                  │ fundaments/postgresql.py (Guardian-only)
```
**The sandbox is structural:**
```python
# app/app.py – fundaments unpacked ONCE, NEVER stored globally
from typing import Any, Dict

async def start_application(fundaments: Dict[str, Any]) -> None:
    config_service = fundaments["config"]
    db_service = fundaments["db"]                          # None if not configured
    encryption_service = fundaments["encryption"]          # None if keys missing
    access_control_service = fundaments["access_control"]
    ...
    # From here: app/* reads its own config from app/.pyfun only.
    # fundaments are never passed into other app/* modules.
```
`app/app.py` never touches `os.environ`. Never imports from `fundaments/`. Never reads `.env`.
This isn't documentation. It's enforced by the import structure.
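For illustration, the Guardian side boils down to this pattern – a sketch, not the project's actual `main.py`; the placeholder values stand in for the real service objects from `fundaments/`:

```python
# Sketch of the Guardian pattern – illustrative, not the project's main.py.
# Only this process reads os.environ; app/* only ever sees the resulting dict.
import asyncio
import os
from typing import Any, Dict

def build_fundaments() -> Dict[str, Any]:
    """Initialize each service only if its secrets exist; inject None otherwise."""
    fundaments: Dict[str, Any] = {
        "config": {"log_level": os.getenv("LOG_LEVEL", "INFO")},
        "db": None,           # stays None unless DATABASE_URL is set
        "encryption": None,   # stays None unless key material exists
        "access_control": None,
    }
    if os.getenv("DATABASE_URL"):
        # The real code builds an asyncpg pool in fundaments/postgresql.py here.
        fundaments["db"] = "<asyncpg pool placeholder>"
    return fundaments

if __name__ == "__main__":
    from app.app import start_application  # sandboxed orchestrator
    asyncio.run(start_application(build_fundaments()))
```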
### Why Quart + hypercorn?
**Quart** is async Flask – fully `async/await` native. FastMCP's handlers are async; mixing sync Flask would require thread hacks. With Quart, `/mcp` hands off directly to FastMCP – no bridging, no blocking.
**hypercorn** is an ASGI server (vs. waitress/gunicorn, which are WSGI). WSGI servers handle one request per thread – wrong for long-lived MCP connections. hypercorn handles both Streamable HTTP and SSE natively, and runs without extra config on HuggingFace Spaces. HTTP/2 support (`config.h2 = True`) is built-in – relevant for Streamable HTTP performance at scale.
The `/mcp` route in `app.py` remains the natural interception point regardless of transport β auth checks, rate limiting, and logging can all be added there before the request reaches FastMCP.
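In miniature, the stack looks like this – a hedged sketch of Quart served by hypercorn, not the hub's actual `app.py`; routes mirror the table above, everything else is illustrative:

```python
# Minimal Quart + hypercorn skeleton – illustrative, not the hub's app.py.
import asyncio

from hypercorn.asyncio import serve
from hypercorn.config import Config
from quart import Quart, request

app = Quart(__name__)

@app.get("/")
async def health():
    return {"status": "ok"}  # health check, as on the hub's GET /

@app.post("/api")
async def api():
    body = await request.get_json()  # async-native: no thread hacks needed
    return {"echo": body}            # real hub: dispatch to the tool registry

config = Config()
config.bind = ["0.0.0.0:7860"]  # HF Spaces expects port 7860

asyncio.run(serve(app, config))
```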
---
## Two Databases – One Architecture
```
┌──────────────────────────────────────────────────────────────┐
│  Guardian Layer (fundaments/*)                               │
│                                                              │
│  postgresql.py   → Cloud DB (e.g. Neon, Supabase)            │
│                    asyncpg pool, SSL enforced                │
│                                                              │
│  user_handler.py → SQLite (users + sessions tables)          │
│                    PBKDF2-SHA256 password hashing            │
│                    Session validation incl. IP + UserAgent   │
│                    Account lockout after 5 failed attempts   │
│                                                              │
└───────────────────────┬──────────────────────────────────────┘
                        │ inject as fundaments dict
                        ▼
┌──────────────────────────────────────────────────────────────┐
│  App Layer (app/*)                                           │
│                                                              │
│  db_sync.py → SQLite (hub_state + tool_cache tables)         │
│               aiosqlite (async, non-blocking)                │
│               NEVER touches users/sessions tables            │
│               Relocated to /tmp/ on HF Spaces automatically  │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```
**Table ownership – hard rule:**
| Table | Owner | Access |
| :--- | :--- | :--- |
| `users` | `fundaments/user_handler.py` | Guardian only |
| `sessions` | `fundaments/user_handler.py` | Guardian only |
| `hub_state` | `app/db_sync.py` | app/* only |
| `tool_cache` | `app/db_sync.py` | app/* only |
| `hub_results` | PostgreSQL / Guardian | via `persist_result` tool |
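App-side access to `hub_state` could look roughly like this with aiosqlite – a sketch; the key/value schema is an assumption, the real layout lives in `app/db_sync.py`:

```python
# Sketch: app-side state access with aiosqlite (hub_state only – never
# users/sessions, which belong to the Guardian). Schema is an assumption.
import aiosqlite

DB_PATH = "/tmp/hub_state.db"  # relocated to /tmp/ on HF Spaces

async def set_state(key: str, value: str) -> None:
    async with aiosqlite.connect(DB_PATH) as db:
        await db.execute(
            "CREATE TABLE IF NOT EXISTS hub_state (key TEXT PRIMARY KEY, value TEXT)"
        )
        await db.execute(
            "INSERT INTO hub_state (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
            (key, value),
        )
        await db.commit()  # non-blocking: the event loop stays free
```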
---
## Tools
Tools register at startup – only if the required API key exists. No key, no tool. Server always starts.
| ENV Secret | Tool | Notes |
| :--- | :--- | :--- |
| `ANTHROPIC_API_KEY` | `llm_complete` | Claude Haiku / Sonnet / Opus |
| `GEMINI_API_KEY` | `llm_complete` | Gemini 2.0 / 2.5 / 3.x Flash & Pro |
| `OPENROUTER_API_KEY` | `llm_complete` | 100+ models via OpenRouter |
| `HF_TOKEN` | `llm_complete` | HuggingFace Inference API |
| `BRAVE_API_KEY` | `web_search` | Independent web index |
| `TAVILY_API_KEY` | `web_search` | AI-optimized search with synthesized answers |
| `DATABASE_URL` | `db_query`, `persist_result` | Cloud PostgreSQL (e.g. Neon, Supabase) – SQLite read + PostgreSQL write |
| *(always)* | `list_active_tools` | Shows key names only – never values |
| *(always)* | `health_check` | Status + uptime + active transport |
| *(always)* | `get_model_info` | Limits, costs, capabilities per model |
For all key names see [`app/.pyfun`](app/.pyfun).
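The gating itself is a small conditional loop. A sketch – the registry shape is illustrative, and note that in the real hub the key check happens Guardian-side, since `app/*` never reads the environment:

```python
# Sketch of key-gated tool registration. `available_keys` would be derived
# by the Guardian from the environment and injected – app/* never reads os.environ.
from typing import Callable, Dict

TOOL_REQUIREMENTS = {
    "llm_complete": {"ANTHROPIC_API_KEY", "GEMINI_API_KEY", "OPENROUTER_API_KEY", "HF_TOKEN"},
    "web_search": {"BRAVE_API_KEY", "TAVILY_API_KEY"},
    "health_check": set(),  # no key required – always registers
}

def register_tools(available_keys: set) -> Dict[str, Callable]:
    registry: Dict[str, Callable] = {}
    for name, required in TOOL_REQUIREMENTS.items():
        # Register if the tool needs no key, or at least one provider key exists.
        if not required or required & available_keys:
            registry[name] = lambda **params: ...  # real handler goes here
        # No key → tool silently absent. Server still starts. No crash.
    return registry
```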
**Tools are configured in `.pyfun` – including system prompts:**
```ini
[TOOL.code_review]
active = "true"
description = "Review code for bugs, security issues and improvements"
provider_type = "llm"
default_provider = "anthropic"
timeout_sec = "60"
system_prompt = "You are an expert code reviewer. Analyze the given code for bugs, security issues, and improvements. Be specific and concise."
[TOOL.code_review_END]
```
Current built-in tools: `llm_complete`, `code_review`, `summarize`, `translate`, `web_search`, `db_query`
Future hooks (commented, ready): `image_gen`, `code_exec`, `shellmaster_2.0`, Discord, GitHub webhooks
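The `[SECTION] … [SECTION_END]` block format above is not standard INI, so it needs a small custom parser. A sketch of how such blocks could be read – illustrative only; the hub's real parser lives in `app/config.py`:

```python
# Sketch: parse `[NAME] ... [NAME_END]` blocks of `key = "value"` lines.
# Illustrative only – naive comment stripping breaks on '#' inside values.
import re
from typing import Dict

_HEADER = re.compile(r"^\[(?P<name>[^\]]+)\]$")
_PAIR = re.compile(r'^(?P<key>\w+)\s*=\s*"(?P<value>[^"]*)"$')

def parse_pyfun(text: str) -> Dict[str, Dict[str, str]]:
    sections: Dict[str, Dict[str, str]] = {}
    current = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments + whitespace
        if not line:
            continue
        m = _HEADER.match(line)
        if m:
            name = m.group("name")
            # `[X_END]` closes the block; any other header opens one.
            current = None if name.endswith("_END") else name
            if current:
                sections[current] = {}
            continue
        if current and (m := _PAIR.match(line)):
            sections[current][m.group("key")] = m.group("value")
    return sections
```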
---
## LLM Fallback Chain
All LLM providers share one `llm_complete` tool. If a provider fails, the hub walks the fallback chain from `.pyfun`:
```
e.g. anthropic → gemini → openrouter → huggingface
```
```ini
[LLM_PROVIDER.anthropic]
fallback_to = "gemini"
[LLM_PROVIDER.anthropic_END]
[LLM_PROVIDER.gemini]
fallback_to = "openrouter"
[LLM_PROVIDER.gemini_END]
```
Same pattern applies to search providers (`brave → tavily`).
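Walking the chain is a loop with cycle protection. A sketch – `call` is a stand-in for the actual provider request:

```python
# Sketch of the fallback walk. `providers` maps name → config dict with a
# "fallback_to" entry, as in .pyfun; `call` stands in for the real request.
from typing import Any, Callable, Dict, Optional

def complete_with_fallback(
    start: str,
    providers: Dict[str, Dict[str, str]],
    call: Callable[[str], Any],
) -> Optional[Any]:
    seen = set()
    name: Optional[str] = start
    while name and name not in seen:  # `seen` guards against fallback cycles
        seen.add(name)
        try:
            return call(name)  # e.g. anthropic → gemini → openrouter → huggingface
        except Exception:
            name = providers.get(name, {}).get("fallback_to") or None
    return None  # every provider in the chain failed
```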
---
## Quick Start
### HuggingFace Spaces (recommended)
1. Fork / duplicate this Space
2. Go to **Settings → Variables and secrets**
3. Add the API keys you have (any subset works)
4. Space starts automatically – only tools with valid keys register
[→ Live Demo Space](https://huggingface.co/spaces/codey-lab/Multi-LLM-API-Gateway) (no LLM keys set)
### Local / Docker
```bash
git clone https://github.com/VolkanSah/Multi-LLM-API-Gateway
cd Multi-LLM-API-Gateway
cp example-mcp___.env .env
# fill in your keys
pip install -r requirements.txt
python main.py
```
Minimum required ENV vars (everything else is optional):
```env
PYFUNDAMENTS_DEBUG=""
LOG_LEVEL="INFO"
LOG_TO_TMP=""
ENABLE_PUBLIC_LOGS="true"
HF_TOKEN=""
HUB_SPACE_URL=""
```
Transport is configured in `app/.pyfun [HUB]` – not via ENV.
---
## Connect an MCP Client
### Streamable HTTP (default – MCP spec 2025-11-25)
```json
{
"mcpServers": {
"universal-mcp-hub": {
"url": "https://YOUR_USERNAME-universal-mcp-hub.hf.space/mcp"
}
}
}
```
### Streamable HTTP β Private Space (with HF token)
```json
{
"mcpServers": {
"universal-mcp-hub": {
"url": "https://YOUR_USERNAME-universal-mcp-hub.hf.space/mcp",
"headers": {
"Authorization": "Bearer hf_..."
}
}
}
}
```
### SSE legacy fallback (set `HUB_TRANSPORT = "sse"` in `.pyfun`)
```json
{
"mcpServers": {
"universal-mcp-hub": {
"url": "https://YOUR_USERNAME-universal-mcp-hub.hf.space/mcp"
}
}
}
```
> Same URL (`/mcp`) for both transports – the protocol is negotiated automatically.
> SSE fallback is for older clients that don't support Streamable HTTP yet.
---
## Desktop Client
###### (experimental – ~80% AI generated)
A full PySide6 desktop client is included in `DESKTOP_CLIENT/hub.py`.
Communicates via the REST `/api` endpoint – no MCP protocol overhead.
Ideal for private or non-public Spaces.
```bash
pip install PySide6 httpx
# optional file handling:
pip install Pillow PyPDF2 pandas openpyxl
python DESKTOP_CLIENT/hub.py
```
**Features:**
- Multi-chat with persistent history
- Tool / Provider / Model selector loaded live from your Hub
- File attachments: images, PDF, CSV, Excel, ZIP, source code
- Connect tab with health check + auto-load
- Settings: HF Token + Hub URL saved locally, never sent anywhere except your own Hub
- Full request/response log with timestamps
- Runs on Windows, Linux, macOS
[→ Desktop Client docs](DESKTOP_CLIENT/README.md)
---
## CMS & Custom Clients
| Client | Interface used | Notes |
| :--- | :--- | :--- |
| [Desktop Client](DESKTOP_CLIENT/hub.py) | REST `/api` | PySide6, local |
| [WP AI Hub](https://github.com/VolkanSah/WP-AI-HUB/) | REST `/api` | WordPress plugin |
| TYPO3 (soon) | REST `/api` | – |
| Claude Desktop | MCP `/mcp` | Streamable HTTP |
| Cursor / Windsurf | MCP `/mcp` | Streamable HTTP |
---
## Configuration (.pyfun)
`app/.pyfun` is the single source of truth for all app behavior. Three tiers:
```
LAZY:       [HUB] + one [LLM_PROVIDER.*]           → works
NORMAL:     + [SEARCH_PROVIDER.*] + [MODELS.*]     → works better
PRODUCTIVE: + [TOOLS] + [HUB_LIMITS] + [DB_SYNC]   → full power
```
Key settings in `[HUB]`:
```ini
[HUB]
HUB_TRANSPORT = "streamable-http" # streamable-http | sse
HUB_STATELESS = "true" # true = HF Spaces safe, no session state
HUB_PORT = "7860"
[HUB_END]
```
Adding a new LLM provider – two steps:
```ini
# 1. app/.pyfun
[LLM_PROVIDER.mistral]
active = "true"
base_url = "https://api.mistral.ai/v1"
env_key = "MISTRAL_API_KEY"
default_model = "mistral-large-latest"
models = "mistral-large-latest, mistral-small-latest"
fallback_to = ""
[LLM_PROVIDER.mistral_END]
```
```python
# 2. app/providers.py – uncomment the dummy
_PROVIDER_CLASSES = {
    ...
    "mistral": MistralProvider,  # ← uncomment to activate
}
```
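If no dummy class exists yet, a provider only has to satisfy the shape the registry expects. A hypothetical sketch – the constructor and method names are assumptions, check `app/providers.py` for the real interface; Mistral's API is OpenAI-compatible:

```python
# Hypothetical provider class – the actual base interface is defined in
# app/providers.py; names and signatures here are assumptions.
import httpx

class MistralProvider:
    def __init__(self, api_key: str, base_url: str = "https://api.mistral.ai/v1"):
        self.api_key = api_key
        self.base_url = base_url

    async def complete(self, prompt: str, model: str = "mistral-large-latest") -> str:
        async with httpx.AsyncClient(timeout=60.0) as client:
            resp = await client.post(
                f"{self.base_url}/chat/completions",  # OpenAI-compatible endpoint
                headers={"Authorization": f"Bearer {self.api_key}"},
                json={"model": model, "messages": [{"role": "user", "content": prompt}]},
            )
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]
```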
---
## Dependencies
```
# PyFundaments Core (always required)
asyncpg        → async PostgreSQL pool (Guardian/cloud DB)
python-dotenv  → .env loading
passlib        → PBKDF2 password hashing in user_handler.py
cryptography   → encryption layer in fundaments/
# MCP Hub
mcp            → MCP protocol + FastMCP (Streamable HTTP + SSE)
httpx          → async HTTP for all provider API calls
quart          → async Flask (ASGI) – needed for MCP + hypercorn
hypercorn      → ASGI server – Streamable HTTP + SSE, HF Spaces native
requests       → sync HTTP for tool workers
# Optional (uncomment in requirements.txt as needed)
# aiofiles        → async file ops (ML pipelines, file uploads)
# discord.py      → Discord bot integration (planned)
# PyNaCl          → Discord signature verification
# psycopg2-binary → alternative PostgreSQL driver
```
> **Note:** The package is `mcp` (not `fastmcp`) – `FastMCP` is imported from `mcp.server.fastmcp`.
> Streamable HTTP support requires `mcp >= 1.6.0`.
---
## Security Design
- API keys live in HF Secrets / `.env` – never in `.pyfun`, never in code
- `list_active_tools` returns key **names** only – never values
- `db_query` is SELECT-only, enforced at application level (not just docs) – see the sketch after this list
- `app/*` has zero import access to `fundaments/` internals
- Direct execution of `app/app.py` is blocked by design – warning + null-fundaments fallback
- `fundaments/` is initialized conditionally – missing services degrade gracefully, never crash
- Streamable HTTP uses standard Bearer headers – no token-in-URL (unlike SSE)
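As referenced in the list above, the SELECT-only rule lives in code. A minimal sketch of such a guard – illustrative; the hub's actual validation may be stricter:

```python
# Sketch of an application-level SELECT-only guard for db_query.
# Illustrative – a production guard should cover more edge cases
# (e.g. data-modifying CTEs, vendor-specific statements).
def assert_select_only(sql: str) -> str:
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:
        raise ValueError("multiple statements are not allowed")
    if not stripped.upper().startswith("SELECT"):
        raise ValueError("db_query accepts SELECT statements only")
    return stripped
```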
> PyFundaments is not perfect. But it's more secure than most of what runs in production today.
[β Full Security Policy](SECURITY.md)
---
## Foundation
Built on [PyFundaments](PyFundaments.md) – a security-first Python boilerplate:
- `config_handler.py` → env loading with validation
- `postgresql.py` → async DB pool (Guardian-only)
- `encryption.py` → key-based encryption layer
- `access_control.py` → role/permission management
- `user_handler.py` → user lifecycle management
- `security.py` → unified security manager composing the above
None accessible from `app/*`. Injected as a validated dict by `main.py`.
[→ PyFundaments Function Overview](PyFundaments%20–%20Function%20Overview.md)
[→ Module Docs](docs/app/)
[→ Source Repo](https://github.com/VolkanSah/Multi-LLM-API-Gateway)
---
## Related Projects
- [Customs LLMs for free – Build Your Own LLM Service](https://github.com/VolkanSah/SmolLM2-customs/)
- [WP AI Hub (WordPress Client)](https://github.com/VolkanSah/WP-AI-HUB/)
- [ShellMaster (2023 precursor)](https://github.com/VolkanSah/ChatGPT-ShellMaster)
---
## History
[ShellMaster](https://github.com/VolkanSah/ChatGPT-ShellMaster) (2023, MIT) was the precursor – browser-accessible shell for ChatGPT with session memory, built before MCP was a concept. Universal MCP Hub is its natural evolution: same idea, proper architecture, dual interface.
---
## License
Dual-licensed:
- [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
- [Ethical Security Operations License v1.1 (ESOL)](ESOL) – mandatory, non-severable
By using this software you agree to all ethical constraints defined in ESOL v1.1.
---
*Architecture, security decisions, and PyFundaments by Volkan KΓΌcΓΌkbudak.*
*Built with Claude (Anthropic) as a typing assistant for docs (and the occasional bug).*
> crafted with passion – just wanted to understand how it works, don't actually need it, have a CLI.