LAION-CLAP — AEmotionStudio mirror

Mirror of LAION-CLAP audio-text joint-embedding model weights, used by:

Tessera's Find-Similar grain overlay (corpus map → click → top-K)
The standalone CLAP panel: Text Search · Similar Clips · Auto-tag

Upstream: https://huggingface.co/lukewys/laion_clap
License: CC0-1.0.

Format

We ship .safetensors only: no pickle, no PyTorch 2.6+ weights_only=True gotchas, and files roughly 3× smaller than the upstream .pt because training metadata is dropped. Each file contains only the audio-encoder and text-encoder state_dict. Load it with safetensors.torch.load_file(path) and pass the result to module.model.load_state_dict(sd, strict=False). The legacy load_ckpt(ckpt=...) API still works with the upstream .pt files but not with these files.
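
To check what a downloaded file contains before loading any weights, the header can be read with the standard library alone. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names read_safetensors_header and tensor_shapes are illustrative and not part of any library.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        # The first 8 bytes hold the header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", prefix)
        header_bytes = f.read(header_len)
        if len(header_bytes) != header_len:
            raise ValueError("truncated safetensors header")
    return json.loads(header_bytes.decode("utf-8"))


def tensor_shapes(path):
    """Map each tensor name to its shape, skipping the optional __metadata__ entry."""
    header = read_safetensors_header(path)
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling tensor_shapes('630k-audioset-best.safetensors') lists the state_dict keys and shapes, which is a quick way to confirm that a file matches the backbone you intend to build.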

Files

630k-audioset-best.safetensors (variant general, amodel=HTSAT-tiny): non-fusion HTSAT-tiny checkpoint trained on 630k clips plus AudioSet (best-validation checkpoint). Pass amodel='HTSAT-tiny' to laion_clap.CLAP_Module(...).
music_audioset_epoch_15_esc_90.14.safetensors (variant music, amodel=HTSAT-base): music-specialized LAION-CLAP fine-tune; 90.14% on ESC-50. Better on music corpora at the cost of a marginal regression on speech/SFX. Pass amodel='HTSAT-base', NOT HTSAT-tiny: the music variant uses a larger backbone.
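
Because a mismatched amodel is the easiest mistake to make here, it can help to keep the file-to-backbone pairing in one place. The following is a minimal sketch: the Checkpoint class, the CHECKPOINTS table, and clap_module_kwargs are hypothetical helpers, and enable_fusion=False for the music variant is an assumption (this page only states it for the general variant).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    filename: str
    amodel: str
    # Assumption: both variants are non-fusion; only the general one is stated as such.
    enable_fusion: bool = False


# Variant name -> file and backbone, as listed in the Files section.
CHECKPOINTS = {
    "general": Checkpoint("630k-audioset-best.safetensors", "HTSAT-tiny"),
    "music": Checkpoint("music_audioset_epoch_15_esc_90.14.safetensors", "HTSAT-base"),
}


def clap_module_kwargs(variant):
    """Return the keyword arguments for laion_clap.CLAP_Module for a given variant."""
    try:
        ckpt = CHECKPOINTS[variant]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None
    return {"enable_fusion": ckpt.enable_fusion, "amodel": ckpt.amodel}
```

With this in place, laion_clap.CLAP_Module(**clap_module_kwargs('music')) builds the HTSAT-base architecture, and CHECKPOINTS['music'].filename gives the file to pass to load_file.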

Loading

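import laion_clap
from safetensors.torch import load_file

# Build the architecture that matches the checkpoint (HTSAT-tiny for the general variant).
m = laion_clap.CLAP_Module(enable_fusion=False, amodel='HTSAT-tiny')
# Read the tensors from the .safetensors file and load them into the wrapped model.
sd = load_file('630k-audioset-best.safetensors')
m.model.load_state_dict(sd, strict=False)

# audio_array_list: audio waveform arrays, one per clip, prepared by the caller.
emb = m.get_audio_embedding_from_data(audio_array_list)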
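
Once embeddings are available, a Find-Similar style top-K lookup reduces to ranking corpus vectors against a query vector. Tessera's actual implementation is not shown on this page; the sketch below assumes cosine similarity as the ranking metric and uses plain Python lists so it runs without extra dependencies. The names cosine and top_k_similar are illustrative.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, corpus, k=5):
    """Return up to k (index, score) pairs from corpus, most similar to query first."""
    scored = ((cosine(query, vec), idx) for idx, vec in enumerate(corpus))
    return [(idx, score) for score, idx in heapq.nlargest(k, scored)]
```

For a real corpus, convert each row of emb with list(row) or replace the loops with a vectorized matrix product; the ranking logic stays the same.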
