Proto_AGI
PRO
mayafree
19
224
43 followers · 273 following
AI & ML interests
None yet
Recent Activity
upvoted a collection about 13 hours ago
VKAE Accelerated
upvoted an article 1 day ago
Adding a GPU Without Building One
reacted to SeaWolf-AI's post with ❤️ 1 day ago
Adding a GPU without building one

AI is usually framed as "how smart is the model / how many GPUs did you buy." The real bottleneck is elsewhere: how efficiently you use the GPUs you already have. Training happens once; inference runs the entire time users use your product. So a service's economics come down to cost per token.

Inference acceleration uses software to pull several times more out of the same GPU, the effect of plugging in one more "virtual GPU."

VIDRAFT's VKAE, measured (B200, same-harness, no quality loss):
Qwen3.5-35B-A3B (MoE): 25.7 → 601 tok/s (23.4×)
Darwin-36B-Opus (in-house MoE): 25.0 → 280.8 tok/s (11.2×)
10,000+ tok/s peak aggregate under concurrency

The key: it's reproducible, with model + serving shipped as one container.
docker pull vidraft/qwen35-vkae:601

Don't take our word for it; run it yourself. The mechanism will be released as a paper.

Leaderboard & demo: https://huggingface.co/spaces/VIDraft/vkae
Articles: https://huggingface.co/blog/FINAL-Bench/vkae-leaderboard
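The post's arithmetic can be checked with a short sketch. It recomputes the speedups from the quoted before/after throughputs and shows how throughput translates into cost per token. The hourly GPU price is a placeholder assumption, not a figure from the post, and treating each quoted tok/s figure as the GPU's total throughput is a simplification. The function names are illustrative only.

```python
# Throughput figures quoted in the post: (baseline tok/s, accelerated tok/s).
MEASUREMENTS = {
    "Qwen3.5-35B-A3B": (25.7, 601.0),
    "Darwin-36B-Opus": (25.0, 280.8),
}

# ASSUMPTION: placeholder hourly GPU price, chosen only for illustration.
HYPOTHETICAL_GPU_USD_PER_HOUR = 5.00


def speedup(baseline_tps: float, accelerated_tps: float) -> float:
    """Ratio of accelerated to baseline throughput."""
    return accelerated_tps / baseline_tps


def usd_per_million_tokens(tokens_per_second: float, usd_per_hour: float) -> float:
    """Cost of generating one million tokens at a fixed hourly GPU price."""
    tokens_per_hour = tokens_per_second * 3600
    return usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    for name, (before, after) in MEASUREMENTS.items():
        ratio = speedup(before, after)
        cost_before = usd_per_million_tokens(before, HYPOTHETICAL_GPU_USD_PER_HOUR)
        cost_after = usd_per_million_tokens(after, HYPOTHETICAL_GPU_USD_PER_HOUR)
        print(f"{name}: {ratio:.1f}x, "
              f"${cost_before:.2f} -> ${cost_after:.2f} per 1M tokens")
```

At a fixed hourly price, cost per token falls by the same factor that throughput rises. The computed ratios round to the 23.4× and 11.2× stated in the post.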
Organizations
mayafree's activity
published an article 5 months ago
Open NPC AI: Design Principles of a Proto-AGI Society
MAYA-AI • Feb 5 • 4