ATOMIC-LLaVA

ATOMIC-LLaVA is a domain-specific Vision-Language Model for Transmission Electron Microscopy (TEM), fine-tuned from LLaVA-v1.5-7B (Vicuna-v1.5-7B) using a two-stage training pipeline on 32,564 TEM subfigures collected from Nature Portfolio journals.

This model is introduced in the ECCV 2026 paper:

ATOMIC: A Domain-Specific Vision-Language Model for Transmission Electron Microscopy

For code, evaluation scripts, and the dataset, see our GitHub repository: 👉 https://github.com/SemiMRTLab-NCKU/ATOMIC

Model Details


Base Model: LLaVA-v1.5-7B (Vicuna-v1.5-7B)
Training Stages: Stage 1 (alignment) + Stage 2 (instruction tuning)
Training Data: 120K Stage 1 pairs + 60K Stage 2 conversations
Domain: Transmission Electron Microscopy (TEM)
Modalities: CTEM, HR-TEM, STEM, Diffraction

Important: Inference Requirements

ATOMIC-LLaVA is built on LLaVA and cannot be loaded directly with the Hugging Face transformers library. Inference requires an installed copy of the LLaVA repository.

Step 1 — Clone LLaVA:

git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
pip install -e .

Step 2 — Download weights:

from huggingface_hub import snapshot_download
snapshot_download(repo_id="LabSmart/ATOMIC-LLaVA", local_dir="./ATOMIC-LLaVA")
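
Before moving on, it can help to confirm that the snapshot landed where expected. The following is a minimal sketch, not part of the official instructions. The helper name `missing_snapshot_files` is hypothetical, and the expectation of a `config.json` plus one or more `.safetensors` weight files at the top level is an assumption based on the usual Hugging Face layout for safetensors checkpoints.

```python
from pathlib import Path


def missing_snapshot_files(local_dir):
    """Return a list of problems found in a downloaded snapshot directory.

    Assumption: a usable checkpoint has config.json and at least one
    *.safetensors weight file at the top level of local_dir.
    """
    root = Path(local_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json not found")
    if not any(root.glob("*.safetensors")):
        problems.append("no *.safetensors weight files found")
    return problems


# Example: an empty list means the basic layout looks complete.
# print(missing_snapshot_files("./ATOMIC-LLaVA"))
```

An empty result only says the expected files are present; it does not validate the weights themselves.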

Step 3 — Run inference using our evaluation scripts:

Please refer to evaluation/ in our GitHub repository for inference and evaluation scripts.
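
For a quick single-image check outside those scripts, the sketch below follows the quick-start pattern from the upstream LLaVA repository (`eval_model` in `llava.eval.run_llava`). It is an illustration under stated assumptions, not the authors' evaluation procedure. The helper names `build_eval_args` and `run_inference`, the image path, and the prompt are hypothetical. Setting `conv_mode` to `llava_v1` is an assumption based on the Vicuna-v1.5 base, and `model_base=None` assumes the repository holds full weights rather than an adapter.

```python
from pathlib import Path
from types import SimpleNamespace


def build_eval_args(model_path, image_file, query, max_new_tokens=512):
    """Bundle the settings that upstream LLaVA's eval_model reads from `args`."""
    return SimpleNamespace(
        model_path=model_path,
        model_base=None,                   # assumption: full weights, no separate base
        model_name=Path(model_path).name,  # e.g. "ATOMIC-LLaVA"
        query=query,
        conv_mode="llava_v1",              # assumption: Vicuna-v1.5 chat template
        image_file=image_file,
        sep=",",                           # separator for multiple image paths
        temperature=0,                     # greedy decoding
        top_p=None,
        num_beams=1,
        max_new_tokens=max_new_tokens,
    )


def run_inference(args):
    """Load the checkpoint and print the model's answer."""
    # Imported lazily so this file can be imported without LLaVA installed.
    from llava.eval.run_llava import eval_model

    eval_model(args)


# Usage (hypothetical image path and prompt; needs the Step 1 install,
# the Step 2 weights, and a GPU that can hold a 7B model):
# run_inference(build_eval_args("./ATOMIC-LLaVA", "example_tem.png",
#                               "What TEM modality is shown in this image?"))
```

The conversation template is passed explicitly because upstream LLaVA appears to infer it from the model directory name, and a name without a version string may fall back to an older template. If the evaluation scripts in the GitHub repository specify a different template or decoding settings, those take precedence.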

Training Data

The training data is available on Hugging Face: 👉 https://huggingface.co/datasets/LabSmart/ATOMIC_dataset

Citation

@inproceedings{atomic2026eccv,
  title     = {ATOMIC: A Domain-Specific Vision-Language Model
               for Transmission Electron Microscopy},
  author    = {Tu, C. and Hsu, Shu-han and others},
  booktitle = {Proceedings of ECCV 2026},
  year      = {2026},
  note      = {BibTeX will be updated upon publication}
}

License

This model is released under the Llama 2 Community License. It is intended for academic research only and may not be used commercially.
