Research Roadmap
Pushing the boundaries of AI optimization through production releases, focused research, and experimental exploration.
The G4 Turbo Ecosystem
| Tier | Status | Purpose | Audience |
|---|---|---|---|
| G4 Turbo | Production | Stable, fast, multimodal AI | 17,000+ downloads |
| G4 Nano | Production | Ultra-lightweight mobile AI | Global accessibility |
| Scout | Research | Long-context specialist | Next release target |
| Grey Liquid | Experimental | Radical optimization lab | Research frontier |
G4 Turbo & Nano
Stable, proven, and serving thousands of users globally
Our production models are battle-tested and ready for real-world deployment. With over 17,000 downloads, they've proven their reliability across diverse hardware and use cases.
- Turbo: 51% faster inference with full multimodal support (vision + audio)
- Nano: 57% size reduction, sub-1GB RAM usage, thermal-neutral on mobile
- Distribution: Ollama Hub and Hugging Face with full documentation
- Validation: Real-world testing on CPUs, phones, and edge devices
- Support: Active maintenance and community engagement
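As a rough sanity check on footprint figures like the ones above, a quantized model's on-disk size scales with parameter count times effective bits per weight, plus a small overhead for scales and metadata. The sketch below uses illustrative assumptions (a 3B-parameter model, ~2.6 effective bits for a Q2_K-class quant, 5% overhead), not the actual G4 Nano configuration:

```python
# Back-of-envelope model size: parameters * bits-per-weight / 8, plus a
# small multiplier for quantization scales and file metadata.
# All numbers here are illustrative, not the real G4 Nano figures.

def model_size_gb(n_params, bits_per_weight, overhead=1.05):
    return n_params * bits_per_weight / 8 * overhead / 1e9

print(f"{model_size_gb(3e9, 16):.2f} GB")   # fp16 baseline: 6.30 GB
print(f"{model_size_gb(3e9, 2.6):.2f} GB")  # Q2_K-class quant: ~1.02 GB
```

The ratio between those two numbers is why low-bit quantization, not architecture changes alone, does most of the work in getting a model onto a phone.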
Scout
Long-context specialist with persistent memory capabilities
Scout represents the next evolution in context handling. While current models support 128K context windows, Scout will optimize for extended reasoning and persistent memory across sessions.
Research Goals:
- Persistent Context: Save and resume conversations without quality degradation
- Memory Efficiency: Aggressive optimization for long-context workloads
- Target Footprint: Sub-2GB model size while maintaining 128K context capability
- Use Cases: Extended document analysis, long-form reasoning, research assistance
- Quantization: Exploring Q2_K and hybrid approaches for context efficiency
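The memory-efficiency goal is easiest to see with a back-of-envelope KV-cache calculation: at 128K tokens, the cache alone can dwarf the model weights. The sketch below assumes an illustrative architecture (32 layers, 8 KV heads via grouped-query attention, head dimension 128), not Scout's actual one:

```python
# Back-of-envelope KV-cache size for a long-context model.
# Architecture parameters are illustrative, not Scout's real configuration.

def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    # Per token, each layer stores one K and one V vector per KV head.
    per_token = n_layers * n_kv_heads * head_dim * 2 * bytes_per_elem
    return context_len * per_token

# 128K tokens, 32 layers, 8 KV heads, head_dim 128
fp16 = kv_cache_bytes(131_072, 32, 8, 128, 2)  # 16-bit cache
int8 = kv_cache_bytes(131_072, 32, 8, 128, 1)  # 8-bit quantized cache

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")  # 16.0 GiB
print(f"int8 KV cache: {int8 / 2**30:.1f} GiB")  # 8.0 GiB
```

Even halving the cache with 8-bit storage leaves it far above a sub-2GB budget at full context, which is why long-context work needs cache quantization, eviction, or offloading rather than weight compression alone.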
Technical Approach:
Scout will apply aggressive quantization to less sensitive layers while preserving full precision in the attention mechanisms that long-context understanding depends on. We're researching custom importance matrices that prioritize context retention over token-generation speed.
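A mixed-precision approach like this can be thought of as a per-tensor bit-width plan driven by importance scores. The sketch below is a minimal illustration of the idea; the tensor names follow GGUF-style conventions (`blk.N.attn_q`, `blk.N.ffn_down`), but the names, threshold, and bit widths are hypothetical, not Scout's actual recipe:

```python
# Sketch of a per-tensor quantization plan: keep attention projections at
# higher precision, quantize feed-forward weights aggressively.
# Names, threshold, and bit widths are illustrative only.

ATTENTION_KEYS = ("attn_q", "attn_k", "attn_v", "attn_output")

def plan_quantization(tensor_names, importance):
    """Map each weight tensor to a bit width.

    importance: dict of tensor name -> sensitivity score (e.g. derived
    from calibration activations; higher = more quantization-sensitive).
    """
    plan = {}
    for name in tensor_names:
        if any(key in name for key in ATTENTION_KEYS):
            plan[name] = 8   # preserve attention precision
        elif importance.get(name, 0.0) > 0.5:
            plan[name] = 4   # sensitive feed-forward tensors
        else:
            plan[name] = 2   # aggressive Q2-style quantization
    return plan

tensors = ["blk.0.attn_q", "blk.0.ffn_gate", "blk.0.ffn_down"]
scores = {"blk.0.ffn_down": 0.9}
print(plan_quantization(tensors, scores))
# {'blk.0.attn_q': 8, 'blk.0.ffn_gate': 2, 'blk.0.ffn_down': 4}
```

The real research question is in how the importance scores are computed: a context-retention-oriented importance matrix would weight calibration data toward long-range attention behavior rather than short-form generation quality.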
Grey Liquid Lab
The bleeding edge: radical experiments in AI optimization
Grey Liquid is our experimental playground where we push beyond conventional limits. No promises, no timelines—just pure curiosity-driven research into the theoretical boundaries of optimization.
What We're Exploring:
- Extreme Quantization: Q1_K, 1.5-bit quants, finding the absolute floor of intelligence
- Custom Importance Matrices: Task-specific weight prioritization (code-only, math-only models)
- Thermal Optimization: Zero-overhead inference, testing the "liquid flow" hypothesis
- Sub-500MB Target: The theoretical endgame—can we get below half a gigabyte?
- Hybrid Techniques: Mixing quantization methods, dynamic precision, experimental architectures
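To make the extreme-quantization direction concrete, the toy below snaps weights to ternary values with a single per-tensor scale, roughly 1.58 bits of information per weight, in the spirit of published extreme low-bit work such as BitNet b1.58. It is a conceptual sketch, not the Q1_K format or any Grey Liquid implementation:

```python
# Toy ~1.58-bit ternary quantization: each weight becomes one of {-1, 0, +1}
# multiplied by a single per-tensor scale. Illustrative only.

def ternary_quantize(weights, eps=1e-8):
    # Absmean scaling: one scale for the whole tensor.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quants = [max(-1, min(1, round(w / scale))) for w in weights]
    return quants, scale

def dequantize(quants, scale):
    return [q * scale for q in quants]

w = [0.8, -0.05, -1.2, 0.3]
q, s = ternary_quantize(w)
print(q)  # [1, 0, -1, 1]
print([round(x, 4) for x in dequantize(q, s)])
```

Note how the small weight collapses to zero and the large one saturates at -1: the open question at this bit depth is how much of that information loss a model can absorb before capability falls off a cliff.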
The Philosophy:
Grey Liquid represents the idea that AI can flow through silicon without generating heat or friction: the sixth state of matter applied to computation. It's the endgame of thermal-neutral logic, where the barrier between hardware and intelligence disappears.
Why Keep It Experimental?
Production models serve 17,000+ users who depend on stability. Grey Liquid is where we can fail fast, break things, and learn from the edges. Discoveries here eventually trickle down into stable releases, but only after they're proven.
Research Principles
1. Ship Stable, Experiment Radical
Production models (Turbo/Nano) maintain backward compatibility and reliability. Research projects (Scout) have clear goals and timelines. Experimental work (Grey Liquid) has the freedom to fail.
2. Open Science, Closed Labs
All production models and research findings are open-source and documented. Experimental work stays private until it yields reproducible, significant results worth sharing.
3. Hardware Reality Over Benchmarks
We test on real devices: phones that overheat, laptops without GPUs, Raspberry Pis. Lab benchmarks matter, but real-world validation is the ultimate test.
Published Research
For detailed technical analysis of our production models, read the full research article:
Get Involved
Want to contribute to G4 Turbo research? Join our Discord community or contribute on GitHub:
- Real-world testing reports (especially mobile and edge devices)
- Quantization experiments and findings
- Performance benchmarks on diverse hardware
- Use case feedback and optimization suggestions
- Documentation improvements and tutorials