VectorSynth

VectorSynth is a ControlNet model that generates satellite imagery from OpenStreetMap (OSM) vector data embeddings. It conditions Stable Diffusion 2.1 Base on control images rendered from CLIP embeddings of OSM text to synthesize realistic aerial imagery.

Model Description

VectorSynth uses a two-stage pipeline:

  1. RenderEncoder: Projects 768-dim CLIP text embeddings of OSM text to 3-channel control images
  2. ControlNet: Conditions Stable Diffusion 2.1 on the rendered control images

This model uses standard CLIP embeddings. For the COSA embedding variant, see VectorSynth-COSA.
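As a rough illustration of the first stage, the sketch below embeds OSM text with CLIP and projects it to a control image. The specific CLIP checkpoint (openai/clip-vit-large-patch14), the RenderEncoder constructor arguments, and the output shape are assumptions; the actual interface is the one defined in render.py.

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

from render import RenderEncoder  # class definition shipped with this repo (render.py)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Embed the OSM text with a CLIP text encoder whose projected output is 768-dim.
# The exact checkpoint is an assumption; a ViT-L/14 text tower yields 768-dim embeddings.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_model = CLIPTextModelWithProjection.from_pretrained("openai/clip-vit-large-patch14").to(device)

osm_text = "residential blocks with detached houses, a park, and a two-lane road"
tokens = tokenizer(osm_text, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    clip_embedding = text_model(**tokens).text_embeds  # (1, 768)

# Project the 768-dim embedding to a 3-channel control image.
# Default constructor arguments are assumed; see render.py for the real signature.
render_encoder = RenderEncoder().to(device)
render_encoder.load_state_dict(
    torch.load("render_encoder/clip-render_encoder.pth", map_location=device)
)
render_encoder.eval()
with torch.no_grad():
    control_image = render_encoder(clip_embedding)  # assumed shape (1, 3, H, W)
```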

Files

  • config.json - ControlNet configuration
  • diffusion_pytorch_model.safetensors - ControlNet weights
  • render_encoder/clip-render_encoder.pth - RenderEncoder weights
  • render.py - RenderEncoder class definition
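A minimal loading and generation sketch for the files above, assuming the repository follows the standard diffusers ControlNet layout (config.json plus diffusion_pytorch_model.safetensors at the repo root) and reusing the control image produced in the sketch earlier; the repo id MVRL/VectorSynth and the prompt wording are illustrative assumptions.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from huggingface_hub import hf_hub_download

# config.json + diffusion_pytorch_model.safetensors -> the ControlNet branch.
controlnet = ControlNetModel.from_pretrained("MVRL/VectorSynth", torch_dtype=torch.float16)

# The ControlNet plugs into the Stable Diffusion 2.1 Base pipeline it conditions.
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# render_encoder/clip-render_encoder.pth can be fetched the same way and loaded
# into the RenderEncoder class from render.py, as in the previous sketch.
render_encoder_ckpt = hf_hub_download("MVRL/VectorSynth", "render_encoder/clip-render_encoder.pth")

# Generate, conditioning on a control image produced by the RenderEncoder.
# Passing a (1, 3, H, W) tensor in [0, 1] is assumed to be acceptable here.
image = pipe(
    prompt="satellite image",
    image=control_image.clamp(0, 1),
    num_inference_steps=30,
).images[0]
image.save("vectorsynth_sample.png")
```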

Citation

@misc{cher2025vectorsynth,
  title={VectorSynth: Fine-Grained Satellite Image Synthesis with Structured Semantics},
  author={Cher, Daniel and Wei, Brian and Sastry, Srikumar and Jacobs, Nathan},
  year={2025},
  eprint={2511.07744},
  archivePrefix={arXiv}
}

Related Models

  • VectorSynth-COSA - variant of this model that uses COSA embeddings instead of standard CLIP embeddings