ATOMIC-Gemma

ATOMIC-Gemma is a domain-specific Vision-Language Model for Transmission Electron Microscopy (TEM), fine-tuned from Gemma3-4B-IT using Stage 2 instruction tuning on TEM conversation data.

Note: ATOMIC-Gemma was developed after the ECCV 2026 submission deadline and is not part of the published paper. It is released here to demonstrate that the ATOMIC training pipeline generalizes across different base model architectures.

For the published paper and full pipeline, please refer to our GitHub repository: 👉 https://github.com/SemiMRTLab-NCKU/ATOMIC

Model Details


Base Model	Gemma3-4B-IT (`google/gemma-3-4b-it`)
Training Stage	Stage 2 (instruction tuning) only
Training Data	60K Stage 2 conversations
Domain	Transmission Electron Microscopy (TEM)
Modalities	CTEM, HR-TEM, STEM, Diffraction

Inference

ATOMIC-Gemma can be loaded directly via transformers:

from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from PIL import Image
import torch

model_id = "LabSmart/ATOMIC-Gemma"

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16
).eval()
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("your_TEM_image.png").convert("RGB")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "What type of TEM image is this?"}
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

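# Record the prompt length so the prompt tokens can be stripped from the output.
input_len = inputs["input_ids"].shape[-1]

# Greedy decoding (do_sample=False) keeps the answer deterministic.
with torch.inference_mode():
    generation = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Keep only the newly generated tokens and decode them to text.
generation = generation[0][input_len:]
response = processor.decode(generation, skip_special_tokens=True)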
print(response)
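
When asking several questions about the same image, it can help to wrap the steps above in small helpers. The following is a minimal sketch, not part of the released code: the names `build_messages`, `modality_question`, and `ask` are hypothetical, the wording of the modality question is only an illustration, and `ask` assumes `model` and `processor` were loaded as in the example above.

```python
from typing import Any

# Modalities listed under Model Details; the question wording below is illustrative.
TEM_MODALITIES = ("CTEM", "HR-TEM", "STEM", "Diffraction")


def build_messages(image: Any, question: str) -> list:
    """Build a single-turn chat in the same structure as the example above."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": question},
            ],
        }
    ]


def modality_question(modalities=TEM_MODALITIES) -> str:
    """Hypothetical prompt asking the model to choose among the listed modalities."""
    options = ", ".join(modalities)
    return f"Which of the following TEM modalities does this image show: {options}?"


def ask(model, processor, image, question: str, max_new_tokens: int = 256) -> str:
    """Run one question through the model; mirrors the generation steps above."""
    import torch  # imported here so the pure helpers work without torch installed

    inputs = processor.apply_chat_template(
        build_messages(image, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device, dtype=torch.bfloat16)
    input_len = inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only the generated answer.
    return processor.decode(output[0][input_len:], skip_special_tokens=True)
```

With the model loaded, `ask(model, processor, image, modality_question())` would return the decoded answer for a single image; looping over a list of questions reuses the same loaded model.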

Training Data

Training data is available on HuggingFace: 👉 https://huggingface.co/datasets/LabSmart/ATOMIC_dataset
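
One way to fetch the dataset files locally is `snapshot_download` from the `huggingface_hub` library. This is a sketch under assumptions: the function name `download_atomic_dataset` is hypothetical, the file layout inside the repository is not described here, and the download may require accepting terms or logging in if the dataset is gated.

```python
# Dataset repository ID taken from the link above.
DATASET_REPO = "LabSmart/ATOMIC_dataset"


def download_atomic_dataset(local_dir=None):
    """Download the dataset repository and return the local folder path.

    Assumption: the repository is publicly accessible. Inspect the returned
    folder to see how the conversations and images are organized.
    """
    from huggingface_hub import snapshot_download  # requires `pip install huggingface_hub`

    return snapshot_download(
        repo_id=DATASET_REPO,
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling `download_atomic_dataset("./atomic_data")` would place the files under `./atomic_data`; with no argument, the files go to the default Hugging Face cache.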

Citation

@inproceedings{atomic2026eccv,
  title     = {ATOMIC: A Domain-Specific Vision-Language Model
               for Transmission Electron Microscopy},
  author    = {Tu, C. and Hsu, Shu-han and others},
  booktitle = {Proceedings of ECCV 2026},
  year      = {2026},
  note      = {BibTeX will be updated upon publication}
}

License

This model is released under the Gemma Terms of Use. It is intended for academic research purposes only.
