Gemma 3 (Keras)

Google's Gemma 3 open-weight language model, implemented in Keras 3 so it runs on the JAX, TensorFlow, or PyTorch backends (including TPUs via JAX), and distributed through the Kaggle Models platform.

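Since this entry highlights the Keras 3/JAX packaging, here is a minimal loading sketch. It assumes the keras_hub package and a Gemma 3 causal-LM class and preset name patterned on earlier Gemma releases; the class and preset identifiers below are assumptions, so check the Kaggle model card for the published names.

```python
import os

# Keras 3 is multi-backend; select JAX before importing Keras-based libraries.
os.environ["KERAS_BACKEND"] = "jax"

import keras_hub  # assumed package; earlier Gemma releases also shipped via keras_nlp

# Assumed class and preset name, following the pattern of earlier Gemma presets.
gemma = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_4b")

# Generate up to 64 tokens from a prompt.
print(gemma.generate("Why is the sky blue?", max_length=64))
```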

Gemma 7B

A 7-billion-parameter open-weight language model developed by Google, optimized for efficiency, safety, and general-purpose reasoning.

Grok-1

A 314-billion-parameter Mixture-of-Experts language model released as open weights by xAI (Elon Musk's AI company) under the Apache 2.0 license, intended for research use and benchmarked as comparable to top-tier 2023 models.

Grok-2

xAI's next-generation successor to Grok-1, built on a new architecture and integrated into X (formerly Twitter) as the platform's AI assistant.

Llama 2 7B (Base)

Meta's 7-billion-parameter pretrained base model from the Llama 2 series, intended as a general-purpose foundation for customizable fine-tuning rather than out-of-the-box chat.

Llama 2 7B Chat (Hugging Face)

Meta's 7-billion-parameter instruction-tuned Llama 2 model, distributed in Hugging Face transformers format and optimized for chat, dialogue, and assistant-style applications.

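A minimal sketch of loading this build with the Hugging Face transformers library. The repo id meta-llama/Llama-2-7b-chat-hf is the standard gated Hugging Face listing, so you must accept Meta's license and authenticate before the download succeeds.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated repo: accept Meta's license first

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Llama 2 chat checkpoints expect the [INST]-style prompt format; the chat
# template bundled with the tokenizer applies it automatically.
messages = [{"role": "user", "content": "Explain instruction tuning in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```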

Llama 4 Maverick 17B 128E (Original)

Meta's Llama 4 Maverick, a sparse Mixture-of-Experts model with 17 billion active parameters and 128 routed experts (roughly 400 billion parameters in total), published here in its original checkpoint format.

Llama 4 Scout 17B 16E

Meta's Llama 4 Scout, a sparse Mixture-of-Experts model with 17 billion active parameters and 16 routed experts (about 109 billion parameters in total), the smaller member of the Llama 4 family.

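Both Llama 4 entries turn on the same idea: a learned router scores every expert for each token and only the top-k expert networks actually run, which is why active parameters (17B) stay far below total parameters. Below is a minimal NumPy sketch of generic top-k expert routing; it is illustrative only and not Meta's implementation, whose router details (for example, shared experts) differ.

```python
import numpy as np

def top_k_moe(x, router_w, experts, k=2):
    """Route one token through the top-k of len(experts) expert networks.

    x: (d,) token activation; router_w: (d, n_experts) router weights;
    experts: list of callables, each standing in for an expert MLP.
    """
    logits = x @ router_w                 # score every expert for this token
    top = np.argsort(logits)[-k:]         # indices of the k highest-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                  # softmax over the selected experts only
    # Only the chosen experts execute, so compute scales with k, not n_experts.
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 64, 16                     # 16 experts, matching the Scout entry
router_w = rng.normal(size=(d, n_experts))
experts = [
    (lambda w: (lambda x: np.tanh(x @ w)))(rng.normal(size=(d, d)))
    for _ in range(n_experts)
]
print(top_k_moe(rng.normal(size=d), router_w, experts).shape)  # (64,)
```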

Meta Llama 3 8B

An 8-billion-parameter open-weight language model from Meta's Llama 3 family, trained on a substantially larger corpus than Llama 2 and optimized for reasoning and general-purpose tasks.

Mixtral 8x7B Instruct v0.1

Mistral AI's instruction-tuned sparse Mixture-of-Experts (MoE) model, which activates 2 of its 8 expert feed-forward networks per token (about 13 billion of roughly 47 billion total parameters), combining efficiency with strong performance for chat and task-oriented generation.

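The "8x7B" naming reflects the MoE hyperparameters, which can be confirmed from the model's configuration without downloading the weights. A small sketch, assuming the standard Hugging Face repo id and the Mixtral config field names used by the transformers library:

```python
from transformers import AutoConfig

# Fetches only config.json, not the ~47B-parameter weights.
cfg = AutoConfig.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")

print(cfg.num_local_experts)    # 8  expert feed-forward networks per layer
print(cfg.num_experts_per_tok)  # 2  experts activated per token
```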