Research Roadmap

Pushing the boundaries of AI optimization through production releases, focused research, and experimental exploration.

The G4 Turbo Ecosystem

Tier         Status        Purpose                       Audience
G4 Turbo     Production    Stable, fast, multimodal AI   17,000+ users
G4 Nano      Production    Ultra-lightweight mobile AI   Global accessibility
Scout        Research      Long-context specialist       Next release target
Grey Liquid  Experimental  Radical optimization lab      Research frontier
✅ Production

G4 Turbo & Nano

Stable, proven, and serving thousands of users globally

Our production models are battle-tested and ready for real-world deployment. With over 17,000 downloads, they've proven their reliability across diverse hardware and use cases.

  • Turbo: 51% faster inference with full multimodal support (vision + audio)
  • Nano: 57% size reduction, sub-1GB RAM usage, thermal-neutral on mobile
  • Distribution: Ollama Hub and Hugging Face with full documentation
  • Validation: Real-world testing on CPUs, phones, and edge devices
  • Support: Active maintenance and community engagement
Status: Stable • Ongoing maintenance
🔬 Research

Scout

Long-context specialist with persistent memory capabilities

Scout represents the next evolution in context handling. While current models support 128K context windows, Scout will optimize for extended reasoning and persistent memory across sessions.
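Scout's design is not finalized, but the "save and resume without quality degradation" goal can be illustrated with a minimal session-persistence sketch. A real implementation would also snapshot model state such as the KV cache; here only the message history is serialized, and the file name and message schema are illustrative assumptions.

```python
import json
from pathlib import Path

def save_session(path, messages):
    """Persist a conversation so it can be resumed later.
    Illustrative only: a production Scout session would also
    snapshot model state, not just the message history."""
    Path(path).write_text(json.dumps({"messages": messages}))

def resume_session(path):
    """Reload a saved conversation; returns [] if none exists."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text())["messages"]

# Usage: save a short exchange, then restore it in a "new" session.
history = [{"role": "user", "content": "Summarize chapter 1."},
           {"role": "assistant", "content": "Chapter 1 introduces..."}]
save_session("session.json", history)
restored = resume_session("session.json")
```

The point of the sketch is the contract, not the format: resuming must reproduce the saved state exactly, so that a continued conversation starts from identical context.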

Research Goals:

  • Persistent Context: Save and resume conversations without quality degradation
  • Memory Efficiency: Aggressive optimization for long-context workloads
  • Target Footprint: Sub-2GB model size while maintaining 128K context capability
  • Use Cases: Extended document analysis, long-form reasoning, research assistance
  • Quantization: Exploring Q2_K and hybrid approaches for context efficiency
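The sub-2GB target can be sanity-checked with a back-of-envelope weight-size estimate. The bits-per-weight figures below are approximate averages for llama.cpp-style quant formats (exact sizes vary with the tensor mix), and the 4B parameter count is a hypothetical stand-in since Scout's size is not yet fixed; KV-cache memory for a 128K context comes on top of these numbers.

```python
# Approximate average bits-per-weight for common llama.cpp quant
# types; real files vary slightly depending on which tensors use
# which sub-format.
BPW = {"F16": 16.0, "Q4_K_M": 4.85, "Q3_K_M": 3.9, "Q2_K": 2.63}

def model_size_gb(n_params: float, quant: str) -> float:
    """Estimated weight footprint in GB for a given quant type."""
    return n_params * BPW[quant] / 8 / 1e9

# Hypothetical 4B-parameter model:
for q in BPW:
    print(f"{q:7s} -> {model_size_gb(4e9, q):.2f} GB")
```

Under these assumptions a 4B model lands well under 2GB at Q2_K but not at Q4_K_M, which is why the research goals pair the footprint target with Q2_K and hybrid quantization.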

Technical Approach:

Scout will apply aggressive quantization to less sensitive layers while preserving full precision in the attention mechanisms that long-context understanding depends on. We're researching custom importance matrices that prioritize context retention over raw token-generation speed.
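One way to picture this approach is as a per-tensor precision plan driven by an importance score. Everything below is an illustrative sketch: the tensor names follow GGUF-style conventions, and the scores and 0.5 threshold are made up; a real importance matrix would be computed from calibration data.

```python
def assign_precision(layers, importance, keep_full=("attn",)):
    """Pick a quant type per tensor: keep attention tensors at
    high precision, quantize the rest according to importance.
    `importance` maps tensor name -> score in [0, 1]; missing
    tensors default to 0 and get the most aggressive quant."""
    plan = {}
    for name in layers:
        if any(tag in name for tag in keep_full):
            plan[name] = "F16"   # preserve attention precision
        elif importance.get(name, 0.0) > 0.5:
            plan[name] = "Q4_K"  # moderately important tensor
        else:
            plan[name] = "Q2_K"  # aggressive quantization
    return plan

# Illustrative tensor names and scores:
layers = ["blk.0.attn_q", "blk.0.ffn_up", "blk.0.ffn_down"]
scores = {"blk.0.ffn_up": 0.7, "blk.0.ffn_down": 0.2}
plan = assign_precision(layers, scores)
```

The design choice the sketch encodes is the one stated above: attention is exempt from the importance ranking entirely, rather than merely scored highly, so long-context quality never trades off against size.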

Target: Q3 2026 • Public research notes
🧪 Experimental

Grey Liquid Lab

The bleeding edge: radical experiments in AI optimization

Grey Liquid is our experimental playground where we push beyond conventional limits. No promises, no timelines—just pure curiosity-driven research into the theoretical boundaries of optimization.

What We're Exploring:

  • Extreme Quantization: Q1_K, 1.5-bit quants, finding the absolute floor of intelligence
  • Custom Importance Matrices: Task-specific weight prioritization (code-only, math-only models)
  • Thermal Optimization: Zero-overhead inference, testing the "liquid flow" hypothesis
  • Sub-500MB Target: The theoretical endgame—can we get below half a gigabyte?
  • Hybrid Techniques: Mixing quantization methods, dynamic precision, experimental architectures
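"1.5-bit" quantization typically means ternary weights in {-1, 0, +1}, which cost about log2(3) ≈ 1.58 bits each. A minimal round-trip shows the idea; the absmean scaling rule here is one common choice, not necessarily what Grey Liquid uses, and the sample weights are arbitrary.

```python
def ternary_quantize(weights):
    """Quantize to {-1, 0, +1} with one per-tensor scale
    (absmean scaling). The `or 1.0` guards against an
    all-zero tensor producing a zero scale."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate weights from ternary values."""
    return [v * scale for v in q]

# Round-trip a few arbitrary weights and measure the error:
w = [0.8, -0.05, -1.2, 0.4]
q, s = ternary_quantize(w)
recon = dequantize(q, s)
mse = sum((a - b) ** 2 for a, b in zip(w, recon)) / len(w)
```

Even this toy example makes the research question concrete: the reconstruction error is nonzero by construction, so the open problem is how much of it a model can absorb before capability collapses.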

The Philosophy:

Grey Liquid represents the idea that AI can flow through silicon without generating heat or friction: the sixth state of matter applied to computation. It's the endgame of thermal-neutral logic, where the barrier between hardware and intelligence disappears.

Why Keep It Experimental?

Production models serve 17,000+ users who depend on stability. Grey Liquid is where we can fail fast, break things, and learn from the edges. Discoveries here eventually trickle down into stable releases, but only after they're proven.

Status: Ongoing • Private lab • Publish findings when significant

Research Principles

1. Ship Stable, Experiment Radical

Production models (Turbo/Nano) maintain backward compatibility and reliability. Research projects (Scout) have clear goals and timelines. Experimental work (Grey Liquid) has the freedom to fail.

2. Open Science, Closed Labs

All production models and research findings are open-source and documented. Experimental work stays private until it yields reproducible, significant results worth sharing.

3. Hardware Reality Over Benchmarks

We test on real devices: phones that overheat, laptops without GPUs, Raspberry Pis. Lab benchmarks matter, but real-world validation is the ultimate test.
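Real-device validation boils down to a very simple harness: time the decode loop on the hardware you actually have. The sketch below is generic; the `generate` callable is a stub standing in for a real model's per-token decode step, which is not specified in this document.

```python
import time

def tokens_per_second(generate, n_tokens=32):
    """Time a per-token generation callable on the current
    device and return throughput. `generate` is any function
    that produces one token per call."""
    start = time.perf_counter()
    for _ in range(n_tokens):
        generate()
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

# Stub decode step (busy work) so the harness runs anywhere;
# swap in a real model call to benchmark actual hardware.
rate = tokens_per_second(lambda: sum(i * i for i in range(1000)))
```

Run the same harness on a phone, a GPU-less laptop, and a Raspberry Pi and you get directly comparable numbers, which is the "hardware reality" this principle asks for.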

Published Research

For detailed technical analysis of our production models, read the full research article:

Beyond TurboQuant: The gemma4-nano Journey →

Get Involved

Want to contribute to G4 Turbo research? Join our Discord community or contribute on GitHub:

Join Discord Community → Contribute on GitHub →