Research Roadmap
Pushing the boundaries of AI optimization through production releases, focused research, and experimental exploration.
The G4 Turbo Ecosystem
| Tier | Status | Purpose | Audience |
|---|---|---|---|
| G4 Turbo | Production | Stable, fast, multimodal AI | 17,000+ downloads |
| G4 Nano | Production | Ultra-lightweight mobile AI | Global accessibility |
| Scout | Research | Long-context specialist | Next release target |
| Grey Liquid | Experimental | Radical optimization lab | Research frontier |
G4 Turbo & Nano
Stable, proven, and serving thousands of users globally
Our production models are battle-tested and ready for real-world deployment. With over 17,000 downloads, they've proven their reliability across diverse hardware and use cases.
- Turbo: 51% faster inference with full multimodal support (vision + audio)
- Nano: 57% size reduction, sub-1GB RAM usage, thermal-neutral on mobile
- Distribution: Ollama Hub and Hugging Face with full documentation
- Validation: Real-world testing on CPUs, phones, and edge devices
- Support: Active maintenance and community engagement
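As a rough sanity check on footprint figures like the ones above, a quantized model's on-disk size scales with parameter count times effective bits per weight, plus a small overhead for scales and metadata. The sketch below uses illustrative assumptions (a 3B-parameter model, ~2.6 effective bits for a Q2_K-class quant, 5% overhead), not the actual G4 Nano configuration:

```python
# Back-of-envelope model size: parameters * bits-per-weight / 8, plus a
# small multiplier for quantization scales and file metadata.
# All numbers here are illustrative, not the real G4 Nano figures.

def model_size_gb(n_params, bits_per_weight, overhead=1.05):
    return n_params * bits_per_weight / 8 * overhead / 1e9

print(f"{model_size_gb(3e9, 16):.2f} GB")   # fp16 baseline: 6.30 GB
print(f"{model_size_gb(3e9, 2.6):.2f} GB")  # Q2_K-class quant: ~1.02 GB
```

The ratio between those two numbers is why low-bit quantization, not architecture changes alone, does most of the work in getting a model onto a phone.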
Scout
Long-context specialist with persistent memory capabilities
Scout represents the next evolution in context handling. While current models support 128K context windows, Scout will optimize for extended reasoning and persistent memory across sessions.
Research Goals:
- Persistent Context: Save and resume conversations without quality degradation
- Memory Efficiency: Aggressive optimization for long-context workloads
- Target Footprint: Sub-2GB model size while maintaining 128K context capability
- Use Cases: Extended document analysis, long-form reasoning, research assistance
- Quantization: Exploring Q2_K and hybrid approaches for context efficiency
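The memory-efficiency goal is easiest to see with a back-of-envelope KV-cache calculation: at 128K tokens, the cache alone can dwarf the model weights. The sketch below assumes an illustrative architecture (32 layers, 8 KV heads via grouped-query attention, head dimension 128), not Scout's actual one:

```python
# Back-of-envelope KV-cache size for a long-context model.
# Architecture parameters are illustrative, not Scout's real configuration.

def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # Per token, each layer stores one K and one V vector per KV head.
    per_token = n_layers * n_kv_heads * head_dim * 2 * bytes_per_elem
    return context_len * per_token

# 128K tokens, 32 layers, 8 KV heads, head_dim 128
fp16 = kv_cache_bytes(131_072, 32, 8, 128, 2)  # 16-bit cache
int8 = kv_cache_bytes(131_072, 32, 8, 128, 1)  # 8-bit quantized cache

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")  # 16.0 GiB
print(f"int8 KV cache: {int8 / 2**30:.1f} GiB")  # 8.0 GiB
```

Even halving the cache with 8-bit storage leaves it far above a sub-2GB budget at full context, which is why long-context work needs cache quantization, eviction, or offloading rather than weight compression alone.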
Technical Approach:
Scout will apply aggressive quantization to less sensitive layers while preserving full precision in the attention mechanisms that long-context understanding depends on. We're researching custom importance matrices that prioritize context retention over token-generation speed.
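A mixed-precision approach like this can be thought of as a per-tensor bit-width plan driven by importance scores. The sketch below is a minimal illustration of the idea; the tensor names follow GGUF-style conventions (`blk.N.attn_q`, `blk.N.ffn_down`), but the names, threshold, and bit widths are hypothetical, not Scout's actual recipe:

```python
# Sketch of a per-tensor quantization plan: keep attention projections at
# higher precision, quantize feed-forward weights aggressively.
# Names, threshold, and bit widths are illustrative only.

ATTENTION_KEYS = ("attn_q", "attn_k", "attn_v", "attn_output")

def plan_quantization(tensor_names, importance):
    """Map each weight tensor to a bit width.

    importance: dict of tensor name -> sensitivity score (e.g. derived
    from calibration activations; higher = more quantization-sensitive).
    """
    plan = {}
    for name in tensor_names:
        if any(key in name for key in ATTENTION_KEYS):
            plan[name] = 8   # preserve attention precision
        elif importance.get(name, 0.0) > 0.5:
            plan[name] = 4   # sensitive feed-forward tensors
        else:
            plan[name] = 2   # aggressive Q2-style quantization
    return plan

tensors = ["blk.0.attn_q", "blk.0.ffn_gate", "blk.0.ffn_down"]
scores = {"blk.0.ffn_down": 0.9}
print(plan_quantization(tensors, scores))
# {'blk.0.attn_q': 8, 'blk.0.ffn_gate': 2, 'blk.0.ffn_down': 4}
```

The real research question is in how the importance scores are computed: a context-retention-oriented importance matrix would weight calibration data toward long-range attention behavior rather than short-form generation quality.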
Grey Liquid Lab
The bleeding edge: radical experiments in AI optimization
Grey Liquid is our experimental playground where we push beyond conventional limits. No promises, no timelines—just pure curiosity-driven research into the theoretical boundaries of optimization.
What We're Exploring:
- Extreme Quantization: Q1_K, 1.5-bit quants, finding the absolute floor of intelligence
- Custom Importance Matrices: Task-specific weight prioritization (code-only, math-only models)
- Thermal Optimization: Zero-overhead inference, testing the "liquid flow" hypothesis
- Sub-500MB Target: The theoretical endgame—can we get below half a gigabyte?
- Hybrid Techniques: Mixing quantization methods, dynamic precision, experimental architectures
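To make the extreme-quantization direction concrete, the toy below snaps weights to ternary values with a single per-tensor scale, roughly 1.58 bits of information per weight, in the spirit of published extreme low-bit work such as BitNet b1.58. It is a conceptual sketch, not the Q1_K format or any Grey Liquid implementation:

```python
# Toy ~1.58-bit ternary quantization: each weight becomes one of {-1, 0, +1}
# multiplied by a single per-tensor scale. Illustrative only.

def ternary_quantize(weights, eps=1e-8):
    # Absmean scaling: one scale for the whole tensor.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quants = [max(-1, min(1, round(w / scale))) for w in weights]
    return quants, scale

def dequantize(quants, scale):
    return [q * scale for q in quants]

w = [0.8, -0.05, -1.2, 0.3]
q, s = ternary_quantize(w)
print(q)  # [1, 0, -1, 1]
print([round(x, 4) for x in dequantize(q, s)])
```

Note how the small weight collapses to zero and the large one saturates at -1: the open question at this bit depth is how much of that information loss a model can absorb before capability falls off a cliff.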
The Philosophy:
Grey Liquid represents the idea that AI can flow through silicon without generating heat or friction: the sixth state of matter applied to computation. It's the endgame of thermal-neutral logic, where the barrier between hardware and intelligence disappears.
Why Keep It Experimental?
Production models serve 17,000+ users who depend on stability. Grey Liquid is where we can fail fast, break things, and learn from the edges. Discoveries here eventually trickle down into stable releases, but only after they're proven.
Research Principles
1. Ship Stable, Experiment Radical
Production models (Turbo/Nano) maintain backward compatibility and reliability. Research projects (Scout) have clear goals and timelines. Experimental work (Grey Liquid) has the freedom to fail.
2. Open Science, Closed Labs
All production models and research findings are open-source and documented. Experimental work stays private until it yields reproducible, significant results worth sharing.
3. Hardware Reality Over Benchmarks
We test on real devices: phones that overheat, laptops without GPUs, Raspberry Pis. Lab benchmarks matter, but real-world validation is the ultimate test.
Published Research
For detailed technical analysis of our production models, read the full research article:
Get Involved
Want to contribute to G4 Turbo research? Join our Discord community or contribute on GitHub:
- Real-world testing reports (especially mobile and edge devices)
- Quantization experiments and findings
- Performance benchmarks on diverse hardware
- Use case feedback and optimization suggestions
- Documentation improvements and tutorials