J C

dark-pen

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 hour ago
BEE-spoke-data/awesome-python-apps
reacted to buddhi19's post with 🚀 about 1 hour ago
Article Highlight: SyntheticGen, Controllable Diffusion for Long-Tail Remote Sensing 🛰️

Why is remote-sensing segmentation still hard, even with strong models? Because the issue is not only the model… it’s the data. In real-world datasets like LoveDA, class distributions are highly imbalanced, and the problem is compounded by Urban/Rural domain shifts, where visual characteristics and class frequencies differ significantly. This leads to poor learning for minority classes and weak generalization.

⚖️ The Idea: Make Data Controllable
Instead of treating data augmentation as a random process, SyntheticGen turns it into a controllable pipeline.

👉 What if you could:
- Specify which classes you want more of?
- Control how much of each class appears?
- Generate data that respects domain (Urban/Rural) characteristics?

That’s exactly what SyntheticGen enables.

🧠 How It Works
SyntheticGen introduces a structured generation process:
- Layout Generation (Stage A): A ratio-conditioned discrete diffusion model generates semantic layouts that match user-defined class distributions.
- Image Synthesis (Stage B): A ControlNet-guided Stable Diffusion pipeline converts layouts into realistic remote-sensing imagery.

💡 This separation between semantic control and visual realism is key: it allows both precision and high-quality generation.

Why It Matters
- Tackles long-tail imbalance directly at the data level
- Improves minority-class segmentation performance
- Enhances cross-domain generalization (Urban ↔ Rural)
- Moves toward data-centric AI, where we design training data, not just models

Recent research shows that diffusion-based synthetic data can significantly improve performance in long-tailed settings by generating high-value samples for rare or difficult cases. SyntheticGen takes this further by making the process explicitly controllable, not just generative.

📄 Paper: https://arxiv.org/abs/2602.04749
💻 Code & Synthetic Data: https://github.com/Buddhi19/SyntheticGen
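
The "user-defined class distributions" that condition Stage A can be pictured with a small sketch. This is not SyntheticGen's code or API: the function names, the toy layout, and the class ids are hypothetical, and the boosting rule is only one simple way a target ratio vector might be chosen. It measures per-class pixel ratios in a label map, then raises the share of a rare class and renormalizes.

```python
from collections import Counter

def class_ratios(layout, num_classes):
    """Return the fraction of pixels belonging to each class id in a 2D label map."""
    counts = Counter(label for row in layout for label in row)
    total = sum(counts.values())
    return [counts.get(c, 0) / total for c in range(num_classes)]

def boost_ratios(ratios, boosts):
    """Multiply selected class ratios by a factor, then renormalize so they sum to 1."""
    scaled = [r * boosts.get(c, 1.0) for c, r in enumerate(ratios)]
    total = sum(scaled)
    return [s / total for s in scaled]

# Toy 4x4 layout with three hypothetical classes:
# 0 = background, 1 = building, 2 = water (the rare class here).
layout = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 2],
]

observed = class_ratios(layout, num_classes=3)
# Ask for four times more of the rare class, relative to the others.
target = boost_ratios(observed, {2: 4.0})

print("observed:", observed)
print("target:  ", target)
```

In a pipeline like the one described, a vector such as `target` would be the conditioning signal for layout generation; how SyntheticGen actually encodes and consumes it is defined in the paper and repository.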

Organizations

Hugging Face MCP Course, Agents-MCP-Hackathon