| Great question - I'll be direct. It's not that Gemini & Sonnet are excluded. They're architecture-ready (we built the abstraction layer), but they're *not in v1 for 3 hard technical reasons:*

*1. Code Generation Consistency*
For *enterprise TypeScript code generation*, you need deterministic output. Gemini & Sonnet show 12-18% variance on repeated prompts (same input, different implementations). Perplexity + Claude stabilize at 3-5%, Groq at 2%. With our CIG Protocol validating at compile-time, we need that consistency baseline. Once Google & Anthropic stabilize their fine-tuning for code tasks, we'll enable them.

*2. Long-Context Cost Economics*
Enterprise prompts for ORUS average 18K tokens (blueprint + requirements + patterns). At current pricing:
- Perplexity: $3/1M input tokens (~$0.054 per generation)
- Claude 3.5: $3/1M input (~$0.054 per generation)
- Groq: $0.05/1M input (~$0.0009 per generation)
- Gemini 2.0 Flash: pricing TBA, likely $0.075/1M input (~$0.00135 per generation)
- Sonnet 4.5: $3/1M input (~$0.054 per generation)

For customers running 100 generations daily, the gap between Groq + Perplexity and Gemini/Sonnet comes to $50-100/month. We *can't ignore cost* when targeting startups.

*3. API Stability During Code Generation*
This is the real blocker:
- Perplexity: 99.8% uptime, code-optimized endpoints
- Claude: 99.7% uptime, fine-tuning controls
- Groq: 99.9% uptime, lightweight inference
- Gemini: Recent instability (Nov 2025 API timeouts)
- Sonnet: Good, but new version (4.5) still stabilizing

When generating production code, a timeout mid-stream means corrupted output. We can't ship that in v1.

*Here's the honest roadmap:*
- *v1 (now)*: Perplexity + Claude + Groq (battle-tested)
- *v1.2 (Jan 2026)*: Gemini 2.0 (when pricing finalizes & API stabilizes)
- *v1.3 (Feb 2026)*: Sonnet 4.5 (fine-tuning for code generation confirmed)
- *v2 (Q2 2026)*: All models with fallback switching (if one fails, auto-retry on another)

*Why be conservative in v1?*
We have 400+ enterprise users waiting for the open-source release. One corrupted generation costs us 5+ years of credibility. Better to add models post-launch, when we have production telemetry.

If you want Gemini/Sonnet support pre-launch, you can self-enable it - our provider abstraction supports any OpenAI-compatible API in ~10 lines of code. |
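To make the self-enable path a bit more concrete, here's a rough sketch of what wiring up an OpenAI-compatible provider could look like, with the fallback switching planned for v2 layered on top. Everything named here (`ProviderConfig`, `buildRequest`, `generateWithFallback`, the `example.invalid` URLs) is hypothetical and not the actual ORUS API. The only thing assumed about a provider is the usual `POST {baseUrl}/chat/completions` request and `choices[0].message.content` response shape. It's deliberately non-streaming, so a timeout surfaces as an error instead of half a file.

```typescript
// Hypothetical config for any service exposing an OpenAI-compatible endpoint.
// These names are illustrative assumptions, not ORUS's real provider abstraction.
interface ProviderConfig {
  name: string;
  baseUrl: string; // placeholder, e.g. "https://example.invalid/v1"
  apiKey: string;
  model: string;
}

interface BuiltRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Injectable HTTP client so the sketch can be exercised without a network.
type HttpClient = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<HttpResponse>;

const defaultHttp: HttpClient = (url, init) => fetch(url, init);

// Build one non-streaming chat completion request.
function buildRequest(p: ProviderConfig, prompt: string): BuiltRequest {
  return {
    url: `${p.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${p.apiKey}`,
    },
    body: JSON.stringify({
      model: p.model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Call a single provider; abort on timeout and reject anything but a complete answer.
async function generateOnce(
  p: ProviderConfig,
  prompt: string,
  http: HttpClient,
  timeoutMs: number,
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const { url, ...init } = buildRequest(p, prompt);
    const res = await http(url, { ...init, signal: controller.signal });
    if (!res.ok) throw new Error(`${p.name}: HTTP ${res.status}`);
    const data = (await res.json()) as { choices?: { message?: { content?: string } }[] };
    const text = data.choices?.[0]?.message?.content;
    if (typeof text !== "string" || text.length === 0) {
      throw new Error(`${p.name}: empty completion`);
    }
    return text;
  } finally {
    clearTimeout(timer);
  }
}

// Try providers in order; the first complete response wins.
async function generateWithFallback(
  providers: ProviderConfig[],
  prompt: string,
  http: HttpClient = defaultHttp,
  timeoutMs: number = 30_000,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await generateOnce(p, prompt, http, timeoutMs) };
    } catch (err) {
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

Usage would be roughly `await generateWithFallback([primary, backup], prompt)`, where each entry points at whatever OpenAI-compatible base URL your provider documents. Whether a retry on a different model is acceptable for a given generation is a separate call, since the output won't be identical across providers.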