ATOMIC-Gemma

ATOMIC-Gemma is a domain-specific Vision-Language Model for Transmission Electron Microscopy (TEM), fine-tuned from Gemma3-4B-IT using Stage 2 instruction tuning on TEM conversation data.

Note: ATOMIC-Gemma was developed after the ECCV 2026 submission deadline and is not part of the published paper. It is released here to demonstrate that the ATOMIC training pipeline generalizes across different base model architectures.

For the published paper and full pipeline, please refer to our GitHub repository: 👉 https://github.com/SemiMRTLab-NCKU/ATOMIC


Model Details

Base Model: Gemma3-4B-IT (google/gemma-3-4b-it)
Training Stage: Stage 2 (instruction tuning) only
Training Data: 60K Stage 2 conversations
Domain: Transmission Electron Microscopy (TEM)
Modalities: CTEM, HR-TEM, STEM, Diffraction

Inference

ATOMIC-Gemma can be loaded directly via transformers:

from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from PIL import Image
import torch

model_id = "LabSmart/ATOMIC-Gemma"

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16
).eval()
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("your_TEM_image.png").convert("RGB")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "What type of TEM image is this?"}
        ]
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

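# Number of prompt tokens; used below to strip the prompt from the output.
input_len = inputs["input_ids"].shape[-1]

# Greedy decoding (do_sample=False), capped at 256 new tokens.
with torch.inference_mode():
    generation = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Keep only the newly generated tokens and decode them to text.
generation = generation[0][input_len:]
response = processor.decode(generation, skip_special_tokens=True)
print(response)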
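
When asking several questions about the same image, the steps above can be wrapped in small helpers. The following is a minimal sketch, not part of the released code: the function names `build_messages` and `ask` are hypothetical, and the sketch assumes `model` and `processor` were loaded as shown above.

```python
from typing import Any, Dict, List


def build_messages(image: Any, question: str) -> List[Dict[str, Any]]:
    """Return a single-turn chat with one image and one text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": question},
            ],
        }
    ]


def ask(model: Any, processor: Any, image: Any, question: str,
        max_new_tokens: int = 256) -> str:
    """Run one greedy-decoded query, mirroring the inference steps above."""
    import torch  # imported lazily so build_messages works without torch

    inputs = processor.apply_chat_template(
        build_messages(image, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device, dtype=torch.bfloat16)
    # Number of prompt tokens, used to strip the prompt from the output.
    input_len = inputs["input_ids"].shape[-1]
    with torch.inference_mode():
        output = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    return processor.decode(output[0][input_len:], skip_special_tokens=True)
```

With `model`, `processor`, and `image` defined as in the example above, a call such as `ask(model, processor, image, "What type of TEM image is this?")` should return the same kind of response as the original script.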

Training Data

Training data is available on HuggingFace: 👉 https://huggingface.co/datasets/LabSmart/ATOMIC_dataset
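
One way to fetch the dataset files locally is `snapshot_download` from the `huggingface_hub` package. This is only a sketch: it assumes the dataset repository is publicly accessible, the helper name `download_atomic_dataset` and the target folder are hypothetical, and nothing is assumed about the file layout inside the repository.

```python
from pathlib import Path

# Repository id taken from the dataset URL above.
DATASET_REPO_ID = "LabSmart/ATOMIC_dataset"


def download_atomic_dataset(local_dir: str = "ATOMIC_dataset") -> Path:
    """Download the dataset repository and return the local folder path."""
    # Imported lazily; requires `pip install huggingface_hub` and network access.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=DATASET_REPO_ID,
        repo_type="dataset",   # the repo lives under huggingface.co/datasets/
        local_dir=local_dir,   # hypothetical target folder
    )
    return Path(path)
```

Calling `download_atomic_dataset()` downloads the repository contents into the chosen folder, after which the files can be inspected directly.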


Citation

@inproceedings{atomic2026eccv,
  title     = {ATOMIC: A Domain-Specific Vision-Language Model
               for Transmission Electron Microscopy},
  author    = {Tu, C. and Hsu, Shu-han and others},
  booktitle = {Proceedings of ECCV 2026},
  year      = {2026},
  note      = {BibTeX will be updated upon publication}
}

License

This model is released under the Gemma Terms of Use. It is intended for academic research purposes only.
