Provider: Hexgrad
License: MIT
Access: Fully open weights hosted on Hugging Face
Architecture: Decoder-only transformer
Parameters: 82 million
Tokenizer: Custom SentencePiece tokenizer trained on Japanese data
Overview
Kokoro-82M is a small-scale, expressive Japanese language model designed for creative dialogue and casual speech generation. Trained on a blend of Japanese fiction, web forums, and conversational text, the model is a rare example of a lightweight LLM tailored for stylized Japanese use cases.
Key characteristics:
- Compact and Lightweight: Only 82M parameters, fast to run and fine-tune
- Stylistic Awareness: Capable of generating emotionally nuanced responses
- Language Specific: Focused solely on Japanese; not multilingual
Technical Details
- Architecture: Decoder-only GPT-style transformer (a configuration sketch follows this list)
- Parameters: 82M
- Vocabulary: 8,000-token Japanese SentencePiece vocabulary
- Training Data: Mix of open Japanese corpora, fictional dialogue, and user comments
- Context Window: Not specified; assumed to be short-form (roughly 512 tokens)
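To make the figures above concrete, the sketch below assembles a GPT-2-style decoder-only configuration of roughly this size with 🤗 Transformers. The hidden size, depth, and head count are illustrative assumptions (they are not published in this card), so the printed parameter count will only approximate the quoted 82M.

```python
# Illustrative only: a GPT-2-style decoder-only config sized roughly like the
# hyperparameters described above (8,000-token vocabulary, ~512-token context,
# ~82M parameters). Hidden size, layer count, and head count are assumptions,
# not the checkpoint's actual configuration.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    vocab_size=8_000,   # 8,000-token Japanese vocabulary (from the card)
    n_positions=512,    # assumed short-form context window
    n_embd=768,         # assumed hidden size
    n_layer=11,         # assumed depth; lands in the low-to-mid-80M range
    n_head=12,          # assumed attention head count
)
model = GPT2LMHeadModel(config)
print(f"total parameters: {model.num_parameters():,}")
```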
Deployment
- Hugging Face Repo: hexgrad/Kokoro-82M
- Inference: Supports 🤗 Transformers; can run on CPU or low-end GPUs (see the loading sketch below)
- Applications: Character modeling, emotional NLG, creative experimentation
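Assuming the repository ships a standard causal-LM checkpoint loadable through the regular Auto classes (the card only states 🤗 Transformers support, so the exact loading path is not confirmed), a minimal CPU inference sketch might look like this. The prompt and sampling settings are illustrative choices, not values from the card.

```python
# Minimal inference sketch. Assumes hexgrad/Kokoro-82M exposes a standard
# causal-LM checkpoint loadable with the Auto classes; if the repo relies on
# custom loading code, its own instructions should be followed instead.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "hexgrad/Kokoro-82M"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # small enough for CPU

prompt = "こんにちは、今日はどんな気分？"  # "Hello, how are you feeling today?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,     # sampling suits the casual-dialogue use cases above
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Sampling with a moderate temperature fits the creative and conversational applications listed above; dropping `do_sample` gives deterministic greedy output instead.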