Provider: Google DeepMind
License: Gemma Terms of Use (permits commercial use, subject to a prohibited-use policy and responsible AI guidelines)
Access: Open weights via Hugging Face and Vertex AI
Architecture: Decoder-only Transformer
Parameters: 7 billion
Tuning: Base (not instruction-tuned)
🔍 Overview

Gemma 7B is Google’s open-weight foundation model, designed for flexible and responsible deployment in both research and production contexts. Trained with the same infrastructure and data-filtering pipeline as the Gemini models, Gemma is optimized for strong performance relative to its compact size.

Key features:

  • Responsible AI Focus: Aligned with Google’s safety and fairness frameworks
  • Versatile Backbone: A base model well suited to fine-tuning for chat, retrieval-augmented generation (RAG), and language-understanding tasks
  • Efficient Serving: Works well with quantization and hardware optimization libraries

⚙️ Technical Specs

  • Architecture: Decoder-only transformer
  • Parameters: 7B
  • Context Length: 8K tokens
  • Tokenizer: SentencePiece
  • Training Data: Mixture of web-scale public data filtered for quality and safety
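Because the context window is fixed at 8K tokens, prompts plus generated output must fit within that budget. A minimal sketch of a pre-flight check is below; the ~4-characters-per-token ratio is a rough heuristic for English text (an assumption here), and an accurate count requires the model’s actual SentencePiece tokenizer.

```python
# Rough token-budget check against Gemma 7B's 8K context window.
# Heuristic only: assumes ~4 characters per token for English text;
# real counts come from the model's SentencePiece tokenizer.
CONTEXT_LEN = 8192

def fits_context(prompt: str, max_new_tokens: int = 256,
                 chars_per_token: float = 4.0) -> bool:
    """Estimate whether prompt + generation fits in the context window."""
    est_prompt_tokens = len(prompt) / chars_per_token
    return est_prompt_tokens + max_new_tokens <= CONTEXT_LEN

print(fits_context("Summarize this paragraph."))  # short prompt: fits
print(fits_context("x" * 40_000))                 # ~10K tokens: does not fit
```

In production, replace the heuristic with `len(tokenizer(prompt)["input_ids"])` from the model’s own tokenizer before dispatching a request.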

🚀 Deployment

  • Hugging Face Repo: google/gemma-7b
  • Google Cloud Support: Deployable via Vertex AI and Colab
  • Frameworks: 🤗 Transformers, PyTorch, JAX, TensorFlow, and GGUF/llama.cpp support
  • Fine-Tuning: Compatible with LoRA/QLoRA and Google’s KerasNLP tools
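For parameter-efficient fine-tuning with LoRA, a typical starting configuration using the `peft` library might look like the fragment below. The rank, alpha, and dropout values are illustrative defaults, not recommendations from this card, and the `target_modules` names assume the Hugging Face implementation’s attention projection layers.

```python
from peft import LoraConfig

# Illustrative LoRA configuration for google/gemma-7b (values are
# assumptions, not tuned recommendations). target_modules follow the
# Hugging Face Transformers naming for Gemma's attention projections.
lora_config = LoraConfig(
    r=8,                    # adapter rank
    lora_alpha=16,          # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

This config is then passed to `peft.get_peft_model(model, lora_config)` on a loaded base model; for QLoRA, the base model would additionally be loaded with 4-bit quantization.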

🔗 Resources