VLLM on Cognaptus

Recent content in VLLM on Cognaptus
https://cognaptus.com/tags/vllm/
Last updated: Wed, 27 May 2026 00:00:00 +0000

The Experts Are Sparse Inside: Why MoE Cost Cuts Stop at 1.2x
https://cognaptus.com/blog/2026-05-27-the-experts-are-sparse-inside-why-moe-cost-cuts-stop-at-12x/
Published: Wed, 27 May 2026 00:00:00 +0000
A mechanism-first reading of intra-expert activation sparsity in MoE models, and why large theoretical sparsity translates into only modest, though still useful, inference savings in production.