API Reference
REST API v1 · interface may receive non-breaking adjustments before public beta
Complete REST API reference: parameter tables, request/response schemas, status codes, error patterns, rate limits, idempotency, and pagination for every endpoint. All examples use cURL — equivalent TypeScript / Python / Go versions live under SDKs.
Base information
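| Item | Value |
|---|---|
| Production base URL | https://api.skyaiapp.com |
| Sandbox base URL | https://api-sandbox.skyaiapp.com |
| API version | v1 |
| Authentication | Bearer token |
| Content type | application/json; UTF-8 |
| Timestamps | RFC 3339 (ISO 8601) |
| Encoding | UTF-8 only |
| Transport | TLS 1.2+ required |
Authentication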
Every request needs a Bearer token. Keys are created in the console with live / test prefixes.
Authorization: Bearer sk_live_01JEXAMPLE...
# or for sandbox / CI:
Authorization: Bearer sk_test_01JEXAMPLE...
sk_live_…: Production keys. Billed, real model calls, real trace retention.
sk_test_…: Sandbox keys. They hit api-sandbox.skyaiapp.com, are never billed, and return canned model output. Use these in CI.
If a key leaks
Revoke the key in the console; revocation propagates globally in under 5 seconds. All traces from that key over the last 24h are flagged as compromised. Already-billed requests are not auto-refunded; file a ticket for manual review.
Full flow: Security guide · key rotation.
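As a minimal sketch of the key rules above, the helper below picks a host from the key prefix and builds the required headers. The function names `base_url_for_key` and `auth_headers` are hypothetical and are not part of any SDK; only the two hosts and the `sk_live_` / `sk_test_` prefixes come from this page.

```python
PRODUCTION_BASE_URL = "https://api.skyaiapp.com"
SANDBOX_BASE_URL = "https://api-sandbox.skyaiapp.com"


def base_url_for_key(api_key: str) -> str:
    """Pick the host that matches the key prefix (hypothetical helper)."""
    if api_key.startswith("sk_test_"):
        return SANDBOX_BASE_URL
    if api_key.startswith("sk_live_"):
        return PRODUCTION_BASE_URL
    raise ValueError("expected a key starting with sk_live_ or sk_test_")


def auth_headers(api_key: str) -> dict:
    """Headers every request needs: Bearer auth plus a JSON content type."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Deriving the host from the prefix keeps a test key from being sent to production by accident, which is one reasonable way to wire CI.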
Model routing
POST /v1/route
Route one request. The router picks a primary and fallback from 50+ models, executes the call, and returns the response plus a decision trace.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| messages | Message[] | required | OpenAI-compatible chat messages (system / user / assistant / tool). |
| goal | string | optional | cost / quality / stability. Defaults to cost under balanced strategy. |
| strategy | string | optional | balanced (default) / cost-optimized / quality-first / latency-optimized. |
| policy_id | string | optional | Pre-created policy ID from console. Overrides inline goal/strategy when present. |
| models | string[] | optional | Constrain candidates to this whitelist. |
| fallback | Fallback | optional | { models: string[], maxRetries: number }. Overrides the default fallback chain. |
| budget | Budget | optional | { maxCostUsd?, maxTokens? }. Hard constraint: candidates exceeding it are rejected. |
| cache | boolean \| CacheOpts | optional | true enables default semantic cache; CacheOpts customizes threshold + TTL. |
| stream | boolean | optional | true returns an SSE stream; defaults to false. |
| tools | Tool[] | optional | Tool/function declarations (OpenAI tools compatible). Chat context only. |
| tool_choice | string \| object | optional | auto / none / required / specific tool name. |
| metadata | object | optional | Arbitrary K/V tags, indexed into traces and billing reports. |
| timeout_ms | number | optional | End-to-end timeout in milliseconds (default 60000, i.e. 60 s); on timeout the API returns 504 + RouterTimeoutError. |
| idempotency_key | string | optional | Idempotency key (≤ 64 bytes). Repeat requests within 24h return the same result with no double-billing. |
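For readers working outside cURL, the fields above map onto a plain JSON body. The sketch below builds (but does not send) a `POST /v1/route` request with the Python standard library. `build_route_request` is a hypothetical helper, not an SDK function, and it performs no validation beyond checking that `messages` is present.

```python
import json
import urllib.request


def build_route_request(api_key, messages, base_url="https://api.skyaiapp.com", **options):
    """Build a POST /v1/route request; optional fields pass through as-is."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    body = {"messages": messages, **options}
    return urllib.request.Request(
        f"{base_url}/v1/route",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen(req, timeout=...)`; keeping construction separate from sending makes the payload easy to inspect in tests.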
Example
curl https://api.skyaiapp.com/v1/route \
-H "Authorization: Bearer $SKYAIAPP_API_KEY" \
-H "Content-Type: application/json" \
-H "Idempotency-Key: req_user42_2026-05-14T10:30Z" \
-d '{
"goal": "quality",
"strategy": "quality-first",
"messages": [
{ "role": "system", "content": "You are a senior travel planner." },
{ "role": "user", "content": "Plan a 7-day European city-hopping trip in October." }
],
"fallback": {
"models": ["claude-opus-4.7", "gemini-3.1-pro"],
"maxRetries": 2
},
"budget": { "maxCostUsd": 0.05 },
"cache": true,
"timeout_ms": 30000,
"metadata": { "tenant": "acme-corp", "workflow": "trip-planner", "env": "prod" }
}'
Response
{
"trace_id": "tr_01JFGYZ7K8M2N3P4Q5R6S7T8U9",
"output": "Day 1: Land in Lisbon...",
"routing": {
"selected_model": "gpt-5.5-pro",
"fallback_chain": ["claude-opus-4.7", "gemini-3.1-pro"],
"policy_version": "policy_prod_v3@2026-05-12",
"decision_reason": "quality-first + quality goal: gpt-5.5-pro wins on benchmark composite within budget.",
"cost_usd": 0.0421,
"latency_ms": 1820,
"cache_hit": false,
"tokens": {
"input": 256,
"output": 1840,
"cached": 0
}
},
"usage": {
"rate_limit_remaining": 4998,
"rate_limit_reset_unix": 1715680800
}
}
Status codes
| Code | Meaning | Retry? |
|---|---|---|
| 200 | Success (including fallback-rescued) | — |
| 400 | Request validation failed — see error.detail | No (fix request) |
| 401 | Missing or invalid key | No (fix key) |
| 402 | Insufficient balance / quota exhausted | No (top up) |
| 403 | Blocked by policy or RBAC | No (change policy) |
| 422 | All candidates filtered by budget/constraints | No (relax budget) |
| 429 | Rate-limited — see Retry-After header | Yes (per Retry-After) |
| 500 | Router internal error | Yes (exp backoff) |
| 502 | Upstream provider failed — fallback also failed | Yes (exp backoff + jitter) |
| 504 | End-to-end timeout (timeout_ms) | Yes (increase timeout or pick faster model) |
Full error envelope and SDK class mapping: Error handling.
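The Retry? column above can be turned into a small client-side policy. The sketch below is one possible reading, not the SDK's behavior: it treats 429, 500, 502, and 504 as retryable, honors Retry-After on 429, and otherwise uses exponential backoff with full jitter. The `base` and `cap` values are assumptions chosen for illustration, and jitter is applied to all retryable codes for simplicity.

```python
import random

# Codes marked "Yes" in the Retry? column of the status table.
RETRYABLE_STATUS = {429, 500, 502, 504}


def should_retry(status: int) -> bool:
    """True when the status table marks the code as retryable."""
    return status in RETRYABLE_STATUS


def retry_delay(status, attempt, retry_after=None, base=0.5, cap=30.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (1-based)."""
    if status == 429 and retry_after is not None:
        # The server told us exactly how long to wait.
        return float(retry_after)
    # Exponential ceiling: base, 2*base, 4*base, ... limited by cap.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Full jitter: uniform in [0, ceiling) so clients do not retry in lockstep.
    return ceiling * rng()
```

For 504 the table also suggests raising timeout_ms or picking a faster model, which a delay alone does not address.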
Available models
GET /v1/models
List currently routable models. Each entry has capability, context window, pricing, and provider availability status.
curl https://api.skyaiapp.com/v1/models \
-H "Authorization: Bearer $SKYAIAPP_API_KEY"
{
"models": [
{
"id": "gpt-5.5-pro",
"provider": "openai",
"context_window": 1000000,
"modalities": ["text", "image", "tool"],
"pricing_per_1m": { "input_usd": 5.00, "output_usd": 30.00 },
"availability": { "status": "available", "regions": ["us", "eu"] },
"deprecated_at": null
},
{ "id": "claude-opus-4.7", "provider": "anthropic", "...": "..." },
{ "id": "gemini-3.1-pro", "provider": "google", "...": "..." }
]
}
Agent runtime
POST /v1/agents/run
Execute a multi-step agent: it picks tools and maintains state, while the runtime handles timeouts and retries.
Request body (key fields)
| Field | Type | Description |
|---|---|---|
| task | string | Natural-language task description (required). |
| tools | Tool[] | Tools the agent may call: built-in (web_search, calculator, code_exec) + custom. |
| max_steps | number | Hard cap (default 8). On overrun returns partial result + max_steps_reached. |
| per_step_timeout_ms | number | Per-step timeout in milliseconds (default 30000, i.e. 30 s). |
| total_budget_usd | number | Cumulative cost cap for the whole run. |
| model_strategy | object | Underlying LLM calls reuse the /v1/route goal/strategy semantics. |
| stream_steps | boolean | true streams per-step events as SSE. |
curl https://api.skyaiapp.com/v1/agents/run \
-H "Authorization: Bearer $SKYAIAPP_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"task": "Find this month overdue invoices and email a polite reminder to each.",
"tools": [
"web_search",
{
"name": "lookup_invoice",
"description": "Fetch invoice by ID from internal billing system.",
"parameters": { "type": "object", "properties": { "id": { "type": "string" } }, "required": ["id"] }
}
],
"max_steps": 10,
"per_step_timeout_ms": 30000,
"total_budget_usd": 0.50,
"model_strategy": { "goal": "quality", "strategy": "balanced" },
"metadata": { "tenant": "acme-corp", "workflow": "billing-followup" }
}'
Response (excerpt)
{
"trace_id": "tr_01JFAGENT2N3P4Q5R6S7T8U9",
"status": "completed", // completed | partial | failed | budget_exceeded
"output": "Sent reminders to 7 customers; logged failures for 2.",
"steps": [
{
"number": 1,
"action": "tool_call",
"tool": "lookup_invoice",
"input": { "id": "inv_2901" },
"output": { "status": "overdue", "amount_usd": 1820 },
"duration_ms": 220,
"cost_usd": 0
},
{ "number": 2, "action": "llm_call", "...": "..." }
],
"summary": {
"total_steps": 9,
"total_cost_usd": 0.0421,
"total_latency_ms": 18420,
"tools_used": ["lookup_invoice", "send_email"]
}
}
Traces
GET /v1/traces/{trace_id}
Fetch the full span tree of a single trace (with sub-traces, cache events, guardrail hits).
GET /v1/traces?from=…&to=…&filter=…&limit=100&cursor=…
Filter traces by time / metadata / error class. Cursor-based pagination; see the Pagination (cursor) section below.
# Filter on metadata.tenant + only failed traces in the last hour
curl -G https://api.skyaiapp.com/v1/traces \
-H "Authorization: Bearer $SKYAIAPP_API_KEY" \
--data-urlencode "from=2026-05-14T09:00:00Z" \
--data-urlencode "to=2026-05-14T10:00:00Z" \
--data-urlencode "filter=metadata.tenant=acme-corp,status=failed" \
--data-urlencode "limit=100"
Analytics
Low-cardinality aggregate endpoints for dashboards. For raw data use /v1/traces instead.
GET /v1/analytics/usage
Usage, cost, token counts; by time bucket + group_by (model / tenant / workflow).
GET /v1/analytics/cache
Hit rate, dollars saved, TTL distribution.
GET /v1/analytics/fallback
Fallback trigger counts, reason histogram, per-candidate success rate.
GET /v1/analytics/latency
P50 / P95 / P99 latency by model and strategy.
curl -G https://api.skyaiapp.com/v1/analytics/usage \
-H "Authorization: Bearer $SKYAIAPP_API_KEY" \
--data-urlencode "start_date=2026-05-01" \
--data-urlencode "end_date=2026-05-14" \
--data-urlencode "group_by=model" \
--data-urlencode "bucket=day"
Webhooks
Events like router.fallback_triggered, router.budget_exceeded, agent.run_completed, trace.error_burst can be pushed to your endpoint. Full event catalog, signature verification, retry policy, and replay tools live on the dedicated page.
Full Webhooks docs
Rate limits
Two-tier limiting: account-level (per plan) + key-level (configured in console). Both use sliding windows rather than fixed buckets, so the limit applies to any rolling interval and a burst cannot straddle a bucket boundary to exceed it.
| Plan | RPM (req/min) | TPM (tok/min) | Concurrency |
|---|---|---|---|
| Free | 60 | 60k | 5 |
| Pro | 600 | 600k | 25 |
| Team | 3,000 | 3M | 100 |
| Enterprise | Per contract | Per contract | Per contract |
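To make the sliding-window idea concrete, here is a client-side throttle that keeps a log of recent send times and admits a request only if fewer than `limit` fall inside the rolling window. This is an illustrative sketch; the server's actual algorithm is not documented here, and `SlidingWindowThrottle` is a hypothetical name.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most `limit` requests in any rolling `window_s` seconds."""

    def __init__(self, limit: int, window_s: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._sent = deque()  # timestamps of admitted requests, oldest first

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            self._sent.append(now)
            return True
        return False
```

A Pro-plan client could, for example, create `SlidingWindowThrottle(600)` and wait briefly whenever `try_acquire()` returns False.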
Response headers
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 582
X-RateLimit-Reset: 1715680800 # unix epoch
X-RateLimit-Resource: requests # or "tokens"
Retry-After: 4 # seconds; only on 429
Best practice: self-throttle from X-RateLimit-Remaining; on 429, back off per Retry-After (do not retry immediately). SDKs do this automatically.
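Clients that do not use an SDK can apply the same practice by reading the headers above after each response. The sketch below is a hypothetical helper: it assumes `headers` is a plain dict with the exact header casing shown, and it treats X-RateLimit-Reset as the time at which capacity returns, which is an assumption given that the limits use sliding windows.

```python
def pause_before_next_request(headers: dict, now_unix: float, low_water: int = 0) -> float:
    """Seconds to wait before the next call, based on rate-limit headers."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        # Present only on 429: always honor it.
        return float(retry_after)
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > low_water:
        return 0.0
    # Budget exhausted: wait until the advertised reset time.
    reset = float(headers.get("X-RateLimit-Reset", now_unix))
    return max(0.0, reset - now_unix)
```

Raising `low_water` above zero leaves headroom for other workers that share the same key.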
Idempotency
For side-effectful endpoints (route / agents/run), send an Idempotency-Key header. Repeating the same key within 24h returns the original result with no double-billing, so your client can safely retry after network blips.
Idempotency-Key: req_<your-id>_<timestamp>
# Length ≤ 64 bytes; recommended format: <client>_<userId>_<utc-millis>
# Stored 24h then evicted.
Reusing a key with a different body returns 409 idempotency_key_reused_with_different_body.
Pagination (cursor)
All list endpoints use cursor pagination. First page omits cursor; if next_cursor is non-null in the response, there's another page. Cursors are valid for 24h.
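The cursor loop described above can be sketched as a generator. The transport is injected as `fetch_page`, a hypothetical callable that takes a cursor (or None for the first page) and returns the decoded response dict, so the sketch makes no assumption about the HTTP client in use.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages until next_cursor is null."""
    cursor = None  # first page omits the cursor
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break
```

Because cursors are valid for 24h, a long-running export should resume from a stored cursor only within that window.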
# First page
GET /v1/traces?limit=100
# Subsequent
GET /v1/traces?limit=100&cursor=eyJ0Ijoi...
# Response
{
"data": [ ... ],
"next_cursor": "eyJ0IjoiMjAyNi0wNS0xNFQwOToyMDoxMVoifQ==",
"has_more": true
}
Error codes & handling
Errors share a stable envelope and code field. SDKs map each code to a typed Error class for precise handling. Full list and retry matrix on the dedicated page.
{
"error": {
"code": "router.budget_exceeded",
"message": "All candidate models exceed budget.maxCostUsd=0.001",
"type": "validation_error",
"request_id": "req_01JFGYZ7K8M2N3P4Q5R6S7T8U9",
"trace_id": "tr_01JFGYZ7K8M2N3P4Q5R6S7T8U9",
"detail": {
"rejected_candidates": [
{ "model": "gpt-5.5-pro", "estimated_cost_usd": 0.012 },
{ "model": "claude-opus-4.7", "estimated_cost_usd": 0.015 }
],
"suggestion": "Increase budget.maxCostUsd or include cheaper models in the policy."
}
}
}
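Without an SDK, the envelope above can be mapped onto a single exception type by hand. The class and function names below (`ApiError`, `parse_error`) are hypothetical and do not reflect the SDKs' actual class hierarchy; only the field names come from the envelope shown.

```python
import json


class ApiError(Exception):
    """Carries the fields of the error envelope (hypothetical class)."""

    def __init__(self, code, message, error_type=None, request_id=None, trace_id=None, detail=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.error_type = error_type
        self.request_id = request_id
        self.trace_id = trace_id
        self.detail = detail or {}


def parse_error(body: str) -> ApiError:
    """Turn a JSON error response body into an ApiError."""
    err = json.loads(body)["error"]
    return ApiError(
        code=err["code"],
        message=err["message"],
        error_type=err.get("type"),
        request_id=err.get("request_id"),
        trace_id=err.get("trace_id"),
        detail=err.get("detail"),
    )
```

Branching on `code` rather than on the message text keeps handlers stable, since the page describes the code field as the stable part of the envelope.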