Run powerful AI on your CPU, mobile device, or edge hardware. 51% faster inference, 50%+ smaller models, fully local and private.
From multimodal powerhouses to ultra-lightweight mobile models, optimized for every use case.
Full-featured AI with vision and audio support
Mobile-first AI that actually works on real devices
Pushing the boundaries of AI optimization
One command to run powerful local AI on your hardware
# Install Ollama from https://ollama.com

# For multimodal AI with vision:
ollama run ssfdre38/gemma4-turbo:e4b

# For ultra-lightweight mobile/edge:
ollama run ssfdre38/gemma4-nano:e2b

# That's it! Start chatting with local AI 🚀
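Once a model is pulled, Ollama also exposes a local REST API (by default on port 11434), so you can script against the model instead of using the interactive prompt. A minimal sketch, assuming a default Ollama install and the gemma4-turbo:e4b tag pulled above:

# Send a one-shot prompt to the local Ollama server (default port 11434)
curl http://localhost:11434/api/generate -d '{
  "model": "ssfdre38/gemma4-turbo:e4b",
  "prompt": "Describe what you can help with in one sentence.",
  "stream": false
}'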
# Download from Hugging Face:
wget https://huggingface.co/ssfdre38/gemma4-turbo-gguf/resolve/main/gemma4-e4b-iq4xs-turbo.gguf

# Use with llama.cpp, Ollama, or any GGUF-compatible tool
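To sanity-check the download, you can run the GGUF file directly with llama.cpp. A minimal sketch, assuming a recent llama.cpp build where the CLI binary is named llama-cli (older builds call it main):

# Run a quick prompt against the downloaded GGUF with llama.cpp
./llama-cli -m gemma4-e4b-iq4xs-turbo.gguf \
  -p "Explain GGUF quantization in one sentence." \
  -n 128  # cap the response at 128 tokens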
# Turbo (multimodal):
e2b → 4.3 GB   # Smallest with vision
e4b → 6.1 GB   # Recommended default
26b → 15 GB    # High capability
31b → 18 GB    # Maximum performance

# Nano (text-only):
e2b → 3.1 GB   # Mobile-ready
e4b → 4.7 GB   # Balanced
26b → 12 GB    # High quality
31b → 14 GB    # Best performance
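When picking a size, a rough rule of thumb (an assumption, not a published requirement for these models) is that the model file plus 1-2 GB of headroom for the context/KV cache should fit in available RAM. A quick check on Linux:

# Print available memory in GB (Linux, procps free)
free -g | awk '/^Mem:/ {print $7 " GB available"}'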