Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 22 days ago • 51
QIE Jan 23, 26 Collection adapter LoRA developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model • 6 items • Updated 2 days ago • 2
The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models Paper • 2601.15165 • Published 8 days ago • 68
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published 7 days ago • 89
BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries Paper • 2601.15197 • Published 8 days ago • 54
GutenOCR: A Grounded Vision-Language Front-End for Documents Paper • 2601.14490 • Published 9 days ago • 35
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 10 days ago • 70
Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization Paper • 2601.12993 • Published 10 days ago • 75
Qwen Image Edit (exps) Collection adapter LoRA developed for Qwen’s Qwen-Image-Edit-2511 image-to-image model • 12 items • Updated 6 days ago • 5
YOLO26 Models Collection YOLO26 models: detection, segmentation, classification, pose, and OBB variants with demos and ONNX variants. • 42 items • Updated 10 days ago • 31
A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Doubao 1.8, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5 Paper • 2601.10527 • Published 14 days ago • 24
FlowAct-R1: Towards Interactive Humanoid Video Generation Paper • 2601.10103 • Published 15 days ago • 69