VectorSynth

VectorSynth is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions Stable Diffusion 2.1 Base on control images rendered from CLIP embeddings of OSM text to synthesize realistic aerial imagery.

Model Description

VectorSynth uses a two-stage pipeline:

  1. RenderEncoder: Projects 768-dim CLIP text embeddings of OSM text to 3-channel control images
  2. ControlNet: Conditions Stable Diffusion 2.1 on the rendered control images

This model uses standard CLIP embeddings. For the COSA embedding variant, see VectorSynth-COSA.
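As a rough illustration of the first stage, the sketch below embeds OSM text with CLIP and projects it to a control image. The specific CLIP checkpoint (openai/clip-vit-large-patch14), the RenderEncoder constructor arguments, and the output shape are assumptions; the actual interface is the one defined in render.py.

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

from render import RenderEncoder  # class definition shipped with this repo (render.py)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Embed the OSM text with a CLIP text encoder whose projected output is 768-dim.
# The exact checkpoint is an assumption; a ViT-L/14 text tower yields 768-dim embeddings.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_model = CLIPTextModelWithProjection.from_pretrained("openai/clip-vit-large-patch14").to(device)

osm_text = "residential blocks with detached houses, a park, and a two-lane road"
tokens = tokenizer(osm_text, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    clip_embedding = text_model(**tokens).text_embeds  # (1, 768)

# Project the 768-dim embedding to a 3-channel control image.
# Default constructor arguments are assumed; see render.py for the real signature.
render_encoder = RenderEncoder().to(device)
render_encoder.load_state_dict(
    torch.load("render_encoder/clip-render_encoder.pth", map_location=device)
)
render_encoder.eval()
with torch.no_grad():
    control_image = render_encoder(clip_embedding)  # assumed shape (1, 3, H, W)
```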

Files

  • config.json - ControlNet configuration
  • diffusion_pytorch_model.safetensors - ControlNet weights
  • render_encoder/clip-render_encoder.pth - RenderEncoder weights
  • render.py - RenderEncoder class definition
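A minimal loading and generation sketch for the files above, assuming the repository follows the standard diffusers ControlNet layout (config.json plus diffusion_pytorch_model.safetensors at the repo root) and reusing the control image produced in the sketch earlier; the repo id MVRL/VectorSynth and the prompt wording are illustrative assumptions.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from huggingface_hub import hf_hub_download

# config.json + diffusion_pytorch_model.safetensors -> the ControlNet branch.
controlnet = ControlNetModel.from_pretrained("MVRL/VectorSynth", torch_dtype=torch.float16)

# The ControlNet plugs into the Stable Diffusion 2.1 Base pipeline it conditions.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# render_encoder/clip-render_encoder.pth can be fetched the same way and loaded
# into the RenderEncoder class from render.py, as in the previous sketch.
render_encoder_ckpt = hf_hub_download("MVRL/VectorSynth", "render_encoder/clip-render_encoder.pth")

# Generate, conditioning on a control image produced by the RenderEncoder.
# Passing a (1, 3, H, W) tensor in [0, 1] is assumed to be acceptable here.
image = pipe(
    prompt="satellite image",
    image=control_image.clamp(0, 1),
    num_inference_steps=30,
).images[0]
image.save("vectorsynth_sample.png")
```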

Citation

@misc{cher2025vectorsynth,
  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
  year={2025},
  eprint={2511.07744},
  archivePrefix={arXiv}
}

Related Models

  • VectorSynth-COSA - variant of this model that uses COSA embeddings instead of standard CLIP embeddings