DINOv2 ViT-L/14

A self-supervised vision foundation model from Meta AI, built on a ViT-L backbone with 14×14 pixel patches, that produces general-purpose image embeddings for downstream vision tasks without task-specific labels.

1 min

Wav2Vec2 Large 960h

A self-supervised speech representation model from Meta AI, fine-tuned on 960 hours of LibriSpeech, for automatic speech recognition and audio understanding tasks.

1 min