All HF Hub posts

ST-x-Tony 
posted an update 1 day ago
Hello AI Community! 👋

We are thrilled to announce the release of **NRS_QWEN_MYTHOS_1M**, a high-performance reasoning model built on the powerful **Qwen 3.5 9B** base. At **SKT AI LABS**, we’ve applied our proprietary **Neural Reasoning System (NRS)** to push the boundaries of what a 9B model can do.

🔥 **Why this model is a Game-Changer:**

✅ **100x High Reasoning Capacity:** Deep logical thinking and complex problem-solving via NRS Boosting.
✅ **1 Million Token Context:** Handle massive codebases, long documents, and multi-turn agentic tasks with ease (YaRN Scaling).
✅ **Advanced Thinking Mode:** Native `<think>` tags for step-by-step Chain-of-Thought reasoning.
✅ **Tool-Use Ready:** Optimized for Python execution and Web Search with self-correction.
✅ **Blazing Fast:** Efficient 9B architecture that runs smoothly on consumer hardware (RTX 3090/4090).

🛠️ **Technical Highlights:**
* **Base:** Qwen 3.5 9B
* **Tuning:** NRS-specific tuning on high-quality samples.
* **License:** NRS DOCS

Whether you are a developer building coding agents, a researcher dealing with long-context data, or just someone who loves deep reasoning, this model is built for you.

👇 **Try it now on Hugging Face:**
SKT-NRS/NRS_QWEN_MYTHOS_1M
constannnt 
posted an update 3 days ago
We are excited to announce Sipp.sh: a high-performance library for running AI inference locally and in the cloud through a unified API.

We began to realize that an LLM isn't just a chat interface for information retrieval. It can be integrated directly into web, game, or productivity apps to handle continuous monitoring and decision-making. It can act as a sort of "second brain," the silent hand that guides and helps a user without them even realizing it. We see this as the next frontier of UX design, but this is only possible if developers have access to low-cost, zero-latency compute and absolute data privacy.

That's why we created Sipp. It’s an opinionated library that lets developers integrate local AI into any application, giving them the superpowers to completely rethink user experiences across the web, games, and desktop.

To achieve this, we built an entirely new stack in Rust and C++, working alongside the llama.cpp project. Through our work, we were able to contribute back to that community to help upgrade the GGML WebGPU backend. This deep optimization is what enables our fast, responsive decode speeds directly in the browser. Sipp ships as a zero-dependency library for desktop and web, achieving a 3x to 5x speedup in token decode compared to popular alternatives.

We are already seeing some incredible use cases emerge from this, from continuous monitoring using local vision to the dynamic generation of game elements in a real-time wizard vs. wizard game.

The best part? It's fully open-source!

We see this as the start of a dialogue about what the future of user interaction is going to look like, and we built Sipp to lay the foundation for that exciting future. Check out the live demos on our site, run your own benchmarks, or come hang out with us in our Discord.

Website: https://www.sipp.sh/
GitHub: https://github.com/noumena-labs/Sipp
pankajpandey-dev 
posted an update about 8 hours ago
🇮🇳 New in my Hindi LLM Series: Gemma-4 E4B, fine-tuned for Hindi — and it runs on your laptop's CPU.
I fine-tuned Google's new Gemma-4 E4B on ~10k Hindi instruction pairs (AI4Bharat: anudesh + dolly) using Unsloth + LoRA, on a single L4 GPU.
Then I ran an honest side-by-side eval: base Gemma-4 vs my fine-tune, across 25 Hindi prompts. The results were interesting 👇
✅ My fine-tune is more concise — ask for "3 tips" and it gives exactly 3. Base writes a 1,200-character essay.

✅ Pure native Hindi — base keeps slipping into English ("संतुलित आहार (Eat a Balanced Diet)", "तारा (Star)"). My fine-tune stays in clean Hindi.

✅ Tighter instruction-following — ask for a "short message" and it gives one, not a menu of options.
⚖️ And to be honest: base Gemma-4 is more detailed and comprehensive. I didn't build a "smarter" model — I built a focused, Hindi-native, edge-friendly one that runs as a 5GB GGUF (Q4) on CPU.
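The two checks described above (staying in native Hindi, and returning exactly the number of items requested) can be approximated with a small script. This is only a rough sketch and not the evaluation used in the post: the function names `latin_share` and `count_list_items` are hypothetical, and it assumes that "slipping into English" can be approximated by counting Latin letters against characters in the Devanagari Unicode block.

```python
import re

# Characters in the Devanagari Unicode block (letters, vowel signs, digits, danda).
DEVANAGARI = re.compile(r"[\u0900-\u097F]")
# Basic Latin letters, used here as a proxy for English text.
LATIN = re.compile(r"[A-Za-z]")
# Lines that start like list items: "1.", "2)", "-", "*", or "•".
LIST_ITEM = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+", re.MULTILINE)


def latin_share(text: str) -> float:
    """Share of script characters that are Latin rather than Devanagari."""
    devanagari = len(DEVANAGARI.findall(text))
    latin = len(LATIN.findall(text))
    total = devanagari + latin
    return latin / total if total else 0.0


def count_list_items(text: str) -> int:
    """Number of lines that look like numbered or bulleted list items."""
    return len(LIST_ITEM.findall(text))


# The two styles quoted in the post: mixed Hindi/English vs. Hindi only.
mixed = "संतुलित आहार (Eat a Balanced Diet)"
native = "संतुलित आहार"

print(latin_share(mixed) > latin_share(native))  # mixed output has more Latin text
print(count_list_items("1. पहला\n2. दूसरा\n3. तीसरा"))  # counts three list items
```

A low `latin_share` and a `count_list_items` value equal to the requested number would match the behaviour the post reports for the fine-tune, but judging the actual quality of the Hindi still needs a human reader.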
🔗 Try it:

Live demo (CPU): pankajpandey-dev/gemma-4-e4b-hindi-demo
GGUF (Ollama/llama.cpp): pankajpandey-dev/gemma-4-e4b-hindi-instruct-GGUF
16-bit model: pankajpandey-dev/gemma-4-e4b-hindi-instruct

Built with @unsloth · Data by @ai4bharat 🙏
#Hindi #LLM #Gemma #Unsloth #IndicNLP #GGUF
Shrijanagain 
posted an update about 17 hours ago
🚀 Big News for the AI Community! 🔥

We’re excited to release NRS_QWEN_MYTHOS_1M — a powerful reasoning model built on Qwen 3.5 9B!
At SKT AI LABS, we’ve supercharged this 9B model with our proprietary Neural Reasoning System (NRS) to deliver next-level performance.

🔥 Why This Model is a Game-Changer:
✅ 100x Reasoning Capacity — Exceptional deep logical thinking and complex problem-solving
✅ 1 Million Token Context — Perfect for massive codebases, long documents, and multi-turn agentic workflows
✅ Advanced Thinking Mode — Native <think> tags for true step-by-step Chain-of-Thought reasoning
✅ Tool-Use Ready — Optimized for Python execution, Web Search, and self-correction
✅ Blazing Fast — Runs smoothly on consumer GPUs like RTX 3090/4090

Technical Highlights:

Base: Qwen 3.5 9B
Tuning: NRS-specific high-quality reasoning data
Context: 1M Tokens (YaRN Scaling)
License: NRS DOCS

Whether you’re a developer building coding agents, a researcher working with long-context data, or someone who loves powerful reasoning — this model is built for you.

👉 Try it now on Hugging Face:
SKT-NRS/NRS_QWEN_MYTHOS_1M

Drop a comment: What will you build with it first? 👇
#AI #OpenSource #LLM #Qwen #ReasoningModel #HuggingFace #NewModel #AICommunity
ovi054 
posted an update about 20 hours ago
Qwen3-14B Manim Expert LoRA

For the "Build Small Hackathon", I built a Gradio app that turns any concept into a Manim explainer video.

It is powered by Qwen3-14B plus a Manim LoRA that I trained on a synthetic 10k dataset I generated.

👉 Try it now:
build-small-hackathon/anim-vid-ai
kanaria007 
posted an update 1 day ago
✅ Article highlight: Structural Abstraction Stack: From Raw Perception to Reusable Jumps (art-60-183, v0.1)

TL;DR:
This article argues that abstraction is not summary polish.

Once embodied systems parse, regulate, react, and act with receipts, they still need a way to learn reusable structure from real episodes. Article 183 defines that stack: extract invariant relation form, neutralize local semantics, preserve evaluative caution, and register only bounded jump anchors.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• prevents pattern learning from becoming a hidden heuristic library
• keeps abstractions downstream of parsed, receipted episodes
• preserves contradiction, missingness, fit limits, and failure modes
• separates structural abstraction from surface analogy
• makes reusable jumps bounded, reviewable, and revisable

What’s inside:
• candidate records from observation, reflex, actuation, posture, and failure traces
• structural abstraction records for invariant relation form
• semantic maps that keep source terms and provenance visible
• evaluative profiles for fit, non-fit, failure modes, and sandbox-first caution
• jump registration objects with thresholds, constraints, review hooks, and revision triggers
• rejection and reentry receipts for patterns that stay local, sandbox-only, quarantined, or blocked

Key idea:
Do not say:

“the system generalized from prior cases.”

Say:

“this pattern came from these parsed episodes, preserved this relation form, generalized these terms without erasing provenance, carried these fit and failure conditions, and registered only this bounded jump anchor.”

Abstraction is not a clever sentence.

It is governed reuse.
AxionLab-official 
posted an update 22 days ago
THIS IS CRAZY! THE MODEL IN THE IMAGE (Supra-50M-Reasoning) answered correctly and it's QUANTIZED TO 2-BIT! THE RESPONSE IS CORRECT, IN A 15MB FILE!
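As a rough sanity check on that file size, here is a back-of-the-envelope calculation. It assumes that "50M" in the model name means about 50 million parameters and that every weight is stored at exactly 2 bits. Both are assumptions, and real quantized files also carry scales, metadata, and often some higher-precision tensors.

```python
def quantized_weight_bytes(num_params: int, bits_per_weight: float) -> float:
    """Bytes needed for the weights alone, ignoring scales and metadata."""
    return num_params * bits_per_weight / 8


# Assumption: "50M" in the model name means 50 million parameters.
weights_only = quantized_weight_bytes(50_000_000, 2)
print(weights_only / 1_000_000)  # 12.5 (MB, decimal)
```

Under those assumptions the weights alone come to about 12.5 MB, so a 15MB file is plausible once format overhead is included.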
lbourdois 
posted an update 26 days ago
New blog post!
An introduction to a little-known but highly effective model reduction method: 𝗧𝗿𝗶𝗺𝗺𝗶𝗻𝗴✂️
We show how to reduce model size (we reached up to an 87.24% reduction) while preserving performance.

We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text.
From these 16 families, we generated over 𝟱,𝟱𝟬𝟬 𝗺𝗼𝗻𝗼𝗹𝗶𝗻𝗴𝘂𝗮𝗹 𝗺𝗼𝗱𝗲𝗹𝘀 𝗶𝗻 𝟭𝟮𝟰 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲𝘀 🌍

Key takeaways from our experiments:
1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.
2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).
3️⃣ A trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, just fine-tune it to recover or even surpass the original performance.
4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show it more data and ultimately get a better model than the original.
5️⃣ Trimming is a competitive alternative to distillation and quantization. E.g. we obtained our alternative to DistilBERT in 9 minutes on CPU vs. 90 hours of GPU for the latter.
6️⃣ Trimming could generate reasoning traces in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.

Many other things (such as how much data is needed, the impact of the database used, the order in which it should be done, etc.) are covered in the blog post!
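The post does not spell out the mechanism, so the following is only a toy sketch under a loud assumption: that trimming here means vocabulary trimming, i.e. keeping only the embedding rows for tokens that actually occur in a target-language corpus and renumbering them. The function `trim_vocabulary` and all data below are hypothetical, and the real method in the blog post may differ.

```python
def trim_vocabulary(embeddings, vocab, corpus_tokens, always_keep=()):
    """Keep only embedding rows for tokens seen in the corpus (plus special tokens).

    embeddings: list of vectors, indexed by old token id
    vocab: dict mapping token -> old token id
    Returns (new_embeddings, new_vocab) with ids renumbered from 0.
    """
    needed = {tok for tok in corpus_tokens if tok in vocab}
    needed |= {tok for tok in always_keep if tok in vocab}
    # Preserve the original id order so the new mapping is deterministic.
    kept = sorted(needed, key=lambda tok: vocab[tok])
    new_vocab = {tok: new_id for new_id, tok in enumerate(kept)}
    new_embeddings = [embeddings[vocab[tok]] for tok in kept]
    return new_embeddings, new_vocab


# Toy multilingual vocabulary and a two-dimensional embedding table.
vocab = {"<unk>": 0, "hello": 1, "bonjour": 2, "monde": 3, "world": 4, "hola": 5}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]

# Hypothetical French-only corpus, already tokenized.
french_corpus = ["bonjour", "monde", "bonjour"]

small_embeddings, small_vocab = trim_vocabulary(
    embeddings, vocab, french_corpus, always_keep=("<unk>",)
)
row_reduction = 1 - len(small_embeddings) / len(embeddings)

print(small_vocab)    # {'<unk>': 0, 'bonjour': 1, 'monde': 2}
print(row_reduction)  # 0.5
```

Because this only selects and reindexes rows, it involves no gradient computation, which is consistent with the post's remark that trimming does not require a GPU.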

Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: alphaedge-ai/Trimming_models_search
prithivMLmods 
posted an update about 1 month ago
PiD — Pixel Diffusion Decoder Image Edit Upscale and Image Generation Upscale, an all-in-one demo, is now live on Spaces! Great improvements in realism-based image generation and editing are powered by FLUX.2-Klein, while image generation is paired with Z-Image, and upscaling is enabled by default!

🤗 Space: prithivMLmods/PiD-Image-Upscaler
🔗 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

🤗 > To learn more, visit the app page or the respective model pages.
bartowski 
posted an update May 8
You may have noticed that my upload of MiMo-V2.5 didn't have the author in the model name:

bartowski/MiMo-V2.5-GGUF

Going forward, I plan to upload models from major 1st party developers without the author name attached for cleanliness. I feel it results in a nicer and more expected user experience

I will continue to upload fine tunes with that author + "_" appended for clarity. I personally feel it's nice to know at a glance whose tune it is, but it's also for the reason I first started doing it: to avoid it being confused for a new version of the official release
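To make the convention concrete, here is a minimal sketch of the naming rule as I read it. The helper `gguf_repo_name` is hypothetical, the fine-tune author and model names are made-up placeholders, and placing the author and "_" in front of the model name is an assumption about where it is attached.

```python
def gguf_repo_name(model_name: str, author: str, first_party: bool) -> str:
    """Build a GGUF repo name under the convention described above."""
    if first_party:
        # Major first-party release: no author in the name.
        return f"{model_name}-GGUF"
    # Fine tune: author plus "_" attached so it is not mistaken for an official release.
    return f"{author}_{model_name}-GGUF"


# First-party example matching the repo named in the post (author value is a placeholder).
print(gguf_repo_name("MiMo-V2.5", "first-party-lab", first_party=True))
# MiMo-V2.5-GGUF

# Hypothetical fine tune by a hypothetical author.
print(gguf_repo_name("Example-Tune-7B", "example-author", first_party=False))
# example-author_Example-Tune-7B-GGUF
```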

I hope this change makes sense. It seemed most reasonable to me, and a poll I did (forever ago, I move slow sometimes) made it seem likely others would find it reasonable as well (feel free to let me know if you disagree; it may not change my mind, but I do value knowing what others think)

Thanks for downloading :)