Gemma 3 (Keras)
An experimental LLM built with Keras 3 on JAX/TPU, designed to showcase research-focused model development on the Kaggle Models platform.
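For context, a minimal sketch of how a Keras 3 model like this is typically loaded from Kaggle Models via keras-hub on the JAX backend. The preset name `gemma3_instruct_1b` is an assumption for illustration; check the model page for the presets actually published with this release.

```python
import os

os.environ["KERAS_BACKEND"] = "jax"  # select the JAX backend before importing Keras

import keras_hub

# Download the weights from Kaggle Models and build the causal LM.
# Preset name is an assumption; see the model page for real presets.
gemma_lm = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_1b")

# Generate text; max_length bounds prompt + completion tokens.
print(gemma_lm.generate("Why is the sky blue?", max_length=64))
```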
Meta’s experimental ultra-sparse MoE model with 128 experts, designed to explore efficient scaling and expert-routing strategies for future LLaMA architectures.
Meta’s experimental LLaMA 4-series MoE model with 17 billion parameters and 16 experts, designed to explore sparse expert routing and scaling strategies; see the routing sketch below.
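Both Meta blurbs above center on sparse expert routing, where a learned router sends each token to a small top-k subset of experts. Below is a minimal JAX sketch of that mechanism; the shapes, k=2, and the dense per-token gather are illustrative assumptions, not details of Meta's implementation (production MoE kernels dispatch tokens to experts instead).

```python
import jax
import jax.numpy as jnp

def moe_layer(x, router_w, expert_w, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:        [tokens, d_model]              token activations
    router_w: [d_model, n_experts]           router projection
    expert_w: [n_experts, d_model, d_model]  one weight matrix per expert
    """
    logits = x @ router_w                     # [tokens, n_experts]
    probs = jax.nn.softmax(logits, axis=-1)
    top_p, top_idx = jax.lax.top_k(probs, k)  # [tokens, k]
    # Renormalize the gate weights over the selected experts only.
    gates = top_p / top_p.sum(axis=-1, keepdims=True)

    # Gather each token's k expert weight matrices (fine for a sketch).
    w = expert_w[top_idx]                     # [tokens, k, d, d]
    expert_out = jnp.einsum("td,tkdo->tko", x, w)
    # Weighted sum of the k expert outputs per token.
    return jnp.einsum("tk,tko->to", gates, expert_out)

key = jax.random.PRNGKey(0)
tokens, d, n_experts = 8, 16, 128             # 128 experts, as in the blurb above
x = jax.random.normal(key, (tokens, d))
router_w = jax.random.normal(key, (d, n_experts)) * 0.02
expert_w = jax.random.normal(key, (n_experts, d, d)) * 0.02
print(moe_layer(x, router_w, expert_w).shape)  # (8, 16)
```

With 128 experts and k=2, each token touches under 2% of the expert parameters per layer, which is what lets total parameter count scale far faster than per-token compute.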