Changelog
Latest updates, new features, and improvements
Track all updates to SkyAIApp. Subscribe to our mailing list for release notifications.
Get notified of updates
Subscribe to learn about new features first
v2.0.2
latestDeepSeek V4 mini added to cost pool
DeepSeek V4 mini (2026-05-09 release) added as the new low-cost default for batch / classification workloads. Replaces V3.2 in cost pool routing.
Qwen 3.5 Max routable in Alibaba region
Alibaba's Qwen 3.5 Max (released May 2026) joins the model pool. Auto-selected for tenants with cn-east-1 residency requirement and zh-CN prompts.
Diagram connectors now have arrowheads everywhere
Every architecture diagram across docs and use-cases now has visible arrowheads on directional flows. Fixed broken elbow paths in support / sales-CRM diagrams.
Migration guide updated to GPT-5.5
Migration guide had stale gpt-4 model code example. Now shows gpt-5.5-pro with the four sub-tier reference.
v2.0.0
stableGPT-5.5 family added (Pro / Default / Instant / mini / nano)
OpenAI GPT-5.5 went GA on April 23, 2026 with a 1M-token context window. SkyAIApp router now routes to GPT-5.5 Pro for high-stakes tasks and GPT-5.5 Instant as ChatGPT-grade default. Pricing on the API: $5 / 1M input, $30 / 1M output.
Claude Opus 4.7 + Sonnet 4.6 + Haiku 4.5 + advisor tool
Full Anthropic Claude 4.x lineup is online, including the new advisor tool that lets Opus 4.7 guide Sonnet/Haiku executors at a fraction of the cost.
Gemini 3.1 Pro + Gemini 3 (Pro / Flash / Flash-Lite)
Google's Gemini 3.1 Pro (released Feb 19, 2026) is the new quality-first default for Google in our router. Gemini 3 Flash remains the default for low-latency tasks.
DeepSeek V4 + Mistral Medium 3.5 + Llama 4 Maverick/Scout
Frontier open-weight coverage refreshed: DeepSeek V4 (April 2026), Mistral Medium 3.5 dense flagship with agentic coding, and Meta Llama 4 with 10M-token Scout context.
Router v3 with regional fallback and policy versioning
Per-region failover, cross-cloud fallback chains, and immutable policy version pinning for audited rollouts.
Semantic cache 2.0 — hit rate +52%
New embedding-based cache key plus paraphrase clustering raises hit rate by 52% on average across pilot tenants.
v1.5.0
stableAdded support for Claude Opus 4.5 and GPT-5.2 series
Full support for Anthropic's Claude Opus 4.5 with extended thinking capabilities and OpenAI's latest GPT-5.2 series (Instant/Thinking/Pro) with improved reasoning.
Agent Runtime v2 with parallel tool execution
New Agent Runtime supports parallel tool execution, reducing complex workflow execution time by up to 60%.
Improved semantic caching with 40% better hit rate
Enhanced semantic similarity algorithm now achieves 40% better cache hit rate while maintaining accuracy.
Fixed streaming response truncation in high-latency networks
Resolved an issue where streaming responses could be prematurely terminated on networks with >500ms latency.
v1.4.2
stableEnhanced PII detection with 15 new patterns
Added detection patterns for international phone numbers, passport numbers, and healthcare identifiers across 30+ countries.
Reduced API latency by 23% across all regions
Infrastructure optimizations including edge caching and connection pooling improvements.
Fixed rate limit header inconsistency
X-RateLimit-Remaining header now correctly reflects remaining quota in all edge cases.
v1.4.0
Knowledge Base API for RAG applications
New API endpoints for creating, managing, and querying vector knowledge bases with automatic chunking and embedding.
Custom routing policies via dashboard
Create and manage custom routing policies with visual rule builder. Define model preferences, fallback chains, and cost caps.
Python SDK async/await support
Full async/await support in Python SDK with connection pooling and automatic retry logic.
Deprecated v0 API endpoints removed
Legacy v0 endpoints have been removed. Please migrate to v1 endpoints. See migration guide for details.
v1.3.5
DeepSeek V3.1 and Mistral Large 3 support
Added support for DeepSeek's V3.1 model with enhanced MoE architecture and Mistral's Large 3 with expanded context window.
Enhanced error messages with actionable suggestions
Error responses now include specific suggestions for resolution and links to relevant documentation.
Fixed webhook retry mechanism for 5xx errors
Webhooks now properly implement exponential backoff with jitter for transient failures.
v1.3.0
Multi-modal routing with vision support
Route requests with images to vision-capable models. Automatic format conversion and optimization.
Real-time usage dashboard with cost alerts
New dashboard showing real-time API usage, cost breakdown by model/project, and configurable budget alerts.
SOC 2 readiness pack published
SkyAIApp published the SOC 2 readiness worksheet, control map, and enterprise security questionnaire pack for diligence requests.
Go SDK v1.0 stable release
Go SDK graduates from beta to stable with full feature parity, comprehensive tests, and performance optimizations.
View older updates
Full History on GitHubReady to try the latest features?
Get started with SkyAIApp today
Was this page helpful?
Let us know how we can improve