Building on HF

Prithiv Sakthi PRO

prithivMLmods

hugging-science

https://linktr.ee/prithivsakthi

AI & ML interests

computer vision, nlp, multimodality - HuggingFace Fellow🤗

Recent Activity

new activity 19 minutes ago

prithivMLmods/Qwen3-TTS-Daggr-UI:update app [support_qwen_asr] ✅

liked a Space about 4 hours ago

merve/daggr-image-to-3d

liked a Space about 5 hours ago

victor/image-model-comparison-video

upvoted a changelog about 6 hours ago

Changelog

View Running Jobs Count from the User Menu

about 7 hours ago • 10

upvoted a collection 1 day ago

HunyuanImage

4 items • Updated 2 days ago • 13

upvoted a paper 3 days ago

Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking

Paper • 2601.04720 • Published 22 days ago • 51

upvoted a collection 4 days ago

QIE Jan 23, 26

adapter LoRA developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model • 6 items • Updated 2 days ago • 2

upvoted 3 papers 6 days ago

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Paper • 2601.15165 • Published 8 days ago • 68

EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

Paper • 2601.15876 • Published 7 days ago • 89

BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries

Paper • 2601.15197 • Published 8 days ago • 54

upvoted a paper 7 days ago

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published 9 days ago • 35

upvoted a changelog 7 days ago

Changelog

Sort Models by Parameter Size

7 days ago • 23

upvoted an article 8 days ago

Article

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

10 days ago • 70

upvoted 2 papers 8 days ago

Think3D: Thinking with Space for Spatial Reasoning

Paper • 2601.13029 • Published 10 days ago • 45

Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization

Paper • 2601.12993 • Published 10 days ago • 75

upvoted an article 10 days ago

Article

Open Responses: What you need to know

15 days ago • 101

upvoted a paper 11 days ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published 16 days ago • 146

upvoted a collection 11 days ago

Qwen Image Edit (exps)

adapter LoRA developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model • 12 items • Updated 6 days ago • 5

upvoted 2 collections 13 days ago

Jan 5 Releases

35 items • Updated 16 days ago • 5

YOLO26 Models

YOLO26 models: detection, segmentation, classification, pose, and OBB variants with demos and ONNX variants. • 42 items • Updated 10 days ago • 31

upvoted 3 papers 14 days ago

A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5

Paper • 2601.10527 • Published 14 days ago • 24

FlowAct-R1: Towards Interactive Humanoid Video Generation

Paper • 2601.10103 • Published 15 days ago • 69

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 15 days ago • 189