Gonka Provider Overview
Compact scorecards for uptime, latency, context handling, throughput, and price.
Recent Incidents
Latest incident windows opened by health probe failures.
Methodology and Probe Tiers
- One probe environment, so latency is comparable here but not globally representative.
- Health status changes after repeated failures or recoveries to reduce false alarms.
- Capability checks are shown separately from API uptime so feature gaps do not distort availability.
- Preview cards use estimated all-in spend, while provider detail cards show token pricing snapshots.
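The rule that health status changes only after repeated failures or recoveries can be sketched as a small streak counter. This is a minimal illustration, not the dashboard's actual implementation: the class name `HealthTracker` and both threshold values are assumptions, since the page does not state how many consecutive results are required.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Flip health status only after a run of contradicting probe results."""

    fail_threshold: int = 3      # assumed: consecutive failures needed to mark unhealthy
    recover_threshold: int = 2   # assumed: consecutive successes needed to mark healthy
    healthy: bool = True
    streak: int = 0              # consecutive results that contradict the current status

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the (possibly updated) status."""
        if probe_ok == self.healthy:
            # Result agrees with the current status, so any pending streak resets.
            self.streak = 0
            return self.healthy

        self.streak += 1
        needed = self.fail_threshold if self.healthy else self.recover_threshold
        if self.streak >= needed:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy


tracker = HealthTracker()
# A single failure followed by a success does not open an incident.
statuses = [tracker.record(ok) for ok in (False, True, False, False, False)]
```

With these assumed thresholds, an isolated failed probe leaves the status unchanged, and only the third consecutive failure flips it.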
- API uptime uses core health probes only: basic chat and streaming health.
- Performance reliability tracks short latency, TTFT, throughput, and stability separately from uptime.
- Context reliability tracks 2k, 8k, and 32k pass rates separately from core API health.
- Real-world gen combines practical reply, long-stream, and heavy-report generation probes to show whether a provider can sustain real working completions.
- Reply generation is a medium-length practical answer probe, Long stream is a longer streaming completion probe, and Heavy report is a long non-stream structured report probe.
- 32k is treated as a rare stress tier, and 64k remains calibration-only.
- Temporarily unavailable means a provider hit rate limits or routing pressure during a probe, not necessarily a missing feature.
- API uptime goes down when a core health probe fails. That includes no model response, HTTP 4xx/5xx errors, timeouts, failed validation, or protocol-level failures.
- Latency shows the median full-response time on the standard short non-stream probe over the last 24 hours. The p95 line shows the slower tail, not an average.
- Output speed is the median effective completion speed on the standard medium-generation throughput probe. It reflects both generation speed and request overhead.
- Context reliability is the average success rate across the 2k, 8k, and 32k context probes. The percentage drops when the provider returns an error, times out, fails validation, or does not complete the request correctly.
- Real-world gen falls when any practical generation probe fails due to timeouts, 5xx responses, truncated output, or missing required topics.
- Heavy report is primarily a completion test: it checks whether the provider can finish a long working report, not just return a fast short answer.
- Real spend / 1M estimates all-in USD spend per 1M tokens, including configured cash-in fees. For direct GNK wallet billing, the dashboard normalizes GNK-denominated pricing into USD via the live GNK / USDT midpoint. Provider detail cards keep a separate Token price / 1M (API) metric.
- Provider card ranking uses a weighted score: Failed probes 30%, API uptime (24h) 30%, Real spend / 1M 20%, Output speed 10%, and Heavy report 10%.
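The ranking weights above can be combined as a plain weighted sum. The sketch below uses the stated percentages, but everything else is an assumption: the page does not say how each metric is normalized, so the example supposes every component has already been scaled to the range 0 to 1 with higher meaning better (for instance, fewer failed probes or lower spend mapping to a higher value). The key names and the function `ranking_score` are hypothetical.

```python
# Weights as listed on the page; the dictionary keys are illustrative names.
WEIGHTS = {
    "failed_probes": 0.30,
    "api_uptime_24h": 0.30,
    "real_spend_per_1m": 0.20,
    "output_speed": 0.10,
    "heavy_report": 0.10,
}


def ranking_score(components: dict[str, float]) -> float:
    """Return the weighted score for one provider.

    Assumes each component is pre-normalized to [0, 1], higher is better.
    """
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


example = {
    "failed_probes": 0.9,
    "api_uptime_24h": 1.0,
    "real_spend_per_1m": 0.5,
    "output_speed": 0.4,
    "heavy_report": 1.0,
}
score = ranking_score(example)  # 0.27 + 0.30 + 0.10 + 0.04 + 0.10
```

Because failed probes and uptime together carry 60% of the weight, a provider with cheap pricing but frequent probe failures would still rank low under this scheme.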