Provider: Meta AI
License: Llama 4 Community License (open-weight license; use is governed by its terms and Meta's Acceptable Use Policy)
Access: Open weights via Hugging Face
Architecture: Sparse Mixture-of-Experts (MoE) Transformer
Experts: 16 routed experts; 2 experts active per token (one shared expert plus one routed expert)
Parameters: 17B active per token, ~109B total


🔍 Overview

LLaMA 4 Scout 17B 16E is a sparse MoE model released by Meta as part of the LLaMA 4 family. Its openly released weights make it a convenient testbed for studying model scaling, routing behavior, and performance trade-offs in sparse expert-based large language models.

Highlights:

  • 🧪 Sparse Routing: Activates only two experts per token (a shared expert plus one of 16 routed experts), keeping per-token compute far below that of a dense model of the same total size; see the parameter sketch after this list
  • 🧠 Open Weights: Released openly to encourage study of next-gen sparse architectures
  • 🔍 Open Metrics: Published with benchmark results for reproducibility and comparison
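As a rough illustration of that efficiency argument, the sketch below estimates active versus total parameters for a shared-expert-plus-top-1 MoE. All figures in it are hypothetical placeholders chosen only to show the shape of the trade-off, not official LLaMA 4 Scout numbers.

```python
# Back-of-the-envelope sketch of why sparse routing saves compute.
# All figures below are hypothetical placeholders, not Meta's numbers.

def moe_param_estimate(always_on_params: float,
                       params_per_expert: float,
                       num_routed_experts: int,
                       routed_experts_per_token: int) -> tuple[float, float]:
    """Return (total_params, active_params_per_token) for a sparse MoE."""
    total = always_on_params + num_routed_experts * params_per_expert
    active = always_on_params + routed_experts_per_token * params_per_expert
    return total, active

# Placeholder split: attention + embeddings + shared expert vs. one routed expert.
total, active = moe_param_estimate(
    always_on_params=11e9,        # hypothetical
    params_per_expert=6e9,        # hypothetical
    num_routed_experts=16,
    routed_experts_per_token=1,   # one routed expert per token, plus the always-on path
)
print(f"total ≈ {total / 1e9:.0f}B parameters, "
      f"~{active / 1e9:.0f}B touched per token ({active / total:.0%})")
```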

⚙️ Technical Details

  • Model Type: Decoder-only MoE transformer
  • Parameters: 17B active per token, ~109B total
  • Routing: Each token is processed by a shared expert plus one of 16 routed experts (illustrated in the sketch after this list)
  • Training Corpus: Internally curated data (details not fully disclosed)
  • Tokenizer: LLaMA 4 tokenizer (BPE-based, same family as the LLaMA 3 tokenizer)
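To make the routing scheme concrete, here is a minimal PyTorch sketch of a single shared-expert-plus-top-1 MoE feed-forward block. It is an illustrative reimplementation of the general technique, not Meta's code: the SwiGLU and SharedPlusTop1MoE classes, the plain-softmax gating, and all dimensions are invented for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """LLaMA-style gated feed-forward block (toy sizes)."""
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)
        self.w_up = nn.Linear(d_model, d_ff, bias=False)
        self.w_down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

class SharedPlusTop1MoE(nn.Module):
    """One always-on shared expert plus one routed expert per token."""
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 16):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.shared_expert = SwiGLU(d_model, d_ff)
        self.experts = nn.ModuleList([SwiGLU(d_model, d_ff) for _ in range(num_experts)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model), batch and sequence dims already flattened together
        gate_probs = self.router(x).softmax(dim=-1)        # (tokens, num_experts)
        weight, expert_idx = gate_probs.max(dim=-1)        # top-1 routed expert per token
        routed = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e                         # tokens assigned to expert e
            if mask.any():
                routed[mask] = weight[mask, None] * expert(x[mask])
        # The shared expert runs for every token; the routed output is added on top.
        return self.shared_expert(x) + routed

tokens = torch.randn(8, 512)                               # 8 tokens, toy hidden size
layer = SharedPlusTop1MoE(d_model=512, d_ff=2048)
print(layer(tokens).shape)                                 # torch.Size([8, 512])
```

A production implementation batches tokens per expert and fuses these loops, but the dispatch logic above is the essence of the compute savings described in the Highlights.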

🚀 Deployment

  • Model Card: Meta LLaMA 4 Scout on Hugging Face
  • Use Case: General text generation, plus academic and scaling-law research
  • Inference: Requires an inference stack with MoE support (e.g., a recent Hugging Face Transformers release, vLLM, or Meta's PyTorch reference code); a minimal loading sketch follows below
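A minimal loading sketch under stated assumptions: the Hugging Face checkpoint id below is presumed to be meta-llama/Llama-4-Scout-17B-16E-Instruct, and a Transformers release with LLaMA 4 support plus accelerate (for device_map="auto") is presumed installed. Verify the exact id and gated-access terms against the official model card; multi-GPU or quantized setups will need additional configuration.

```python
# Minimal text-generation sketch with Hugging Face Transformers.
# Assumptions: a transformers version with Llama 4 support, `accelerate`
# installed for device_map="auto", enough GPU memory for the sharded
# checkpoint, and license access granted on the Hub. The model id below
# is an assumption -- verify it against the official model card.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed checkpoint id

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,   # bf16 roughly halves memory vs. fp32
    device_map="auto",            # shard the experts across available GPUs
)

out = generator("Mixture-of-experts routing works by", max_new_tokens=64)
print(out[0]["generated_text"])
```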

🔗 Resources