Provider: Sentence Transformers / SBERT.net
License: Apache 2.0
Access: Fully open on Hugging Face
Architecture: BERT-like MiniLM encoder (6 layers)
Overview
all-MiniLM-L6-v2 is one of the most popular sentence embedding models from the Sentence Transformers library. It delivers strong semantic similarity performance while being extremely compact and fast, making it suitable for real-time systems and large-scale semantic indexing.
Key highlights:
- Fast & Lightweight: Only 6 transformer layers, optimized for low-latency use
- High-Quality Embeddings: Trained on NLI and paraphrase datasets to encode semantic meaning
- Versatile: Performs well on tasks like clustering, semantic search, duplicate detection, and retrieval
Technical Specs
- Architecture: MiniLM (6 layers, ~22M parameters)
- Embedding Dimension: 384
- Input: Natural language sentences or short paragraphs
- Training Data: Over 1B sentence pairs, including NLI and paraphrase datasets, fine-tuned with a contrastive objective
- Tokenizer: WordPiece (BERT-compatible)
Deployment
- Model Card: all-MiniLM-L6-v2 on Hugging Face
- Libraries: sentence-transformers, transformers, faiss for indexing
- Use Cases: Semantic search, retrieval-based QA, deduplication, clustering
- Hardware: Runs on CPU or low-end GPU with minimal latency