Public inference observability

Gonka Router

Synthetic monitoring and capability tracking for Gonka inference providers.

Providers 0
Open incidents 0

Provider Status

Current health, latest latency, and rolling uptime.

Provider Detail

Select a provider card to inspect metrics and capability notes.

Recent Incidents

Latest incident windows opened by health probe failures.

Methodology

  • All probes run from a single environment, so latency is comparable between providers here but not globally representative.
  • Health status changes only after repeated failures or repeated recoveries, which reduces false alarms.
  • Capability checks are shown separately from live uptime so feature gaps do not distort health.
  • Pricing is configured manually, so cost estimates are informational only.
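
The "repeated failures or recoveries" rule above can be pictured as a small hysteresis counter. The sketch below is illustrative only: the class name `HealthTracker`, the thresholds of 3 failures and 2 recoveries, and the choice to count consecutive results are all assumptions, not values taken from Gonka Router.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Hypothetical status tracker; thresholds are assumed, not documented."""
    fail_threshold: int = 3      # assumed: consecutive failures before "unhealthy"
    recover_threshold: int = 2   # assumed: consecutive successes before "healthy"
    healthy: bool = True
    streak: int = 0              # consecutive results contradicting current status

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the (possibly updated) status."""
        if probe_ok == self.healthy:
            # Result agrees with the current status, so any streak is broken.
            self.streak = 0
            return self.healthy
        self.streak += 1
        needed = self.fail_threshold if self.healthy else self.recover_threshold
        if self.streak >= needed:
            # Enough contradicting results in a row: flip the status.
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy


tracker = HealthTracker()
statuses = [tracker.record(ok) for ok in (False, False, False, True, True)]
```

With these assumed thresholds, a single failed probe never opens an incident; the status flips only on the third consecutive failure and returns to healthy on the second consecutive success.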

Probe Tiers

  • `Short latency` is the primary fast-response metric for homepage comparisons.
  • `Streaming TTFT` measures time to first token and verifies full SSE completion.
  • `Medium throughput` tracks tokens per second on a controlled 160-word generation.
  • `2k` and `8k` context tiers run regularly; `32k` runs rarely as a stress probe, and `64k` is used for calibration only.
  • `Temporarily unavailable` means a provider hit rate limits or routing pressure during a probe, not necessarily a missing feature.
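
As a rough illustration of the `Streaming TTFT` and `Medium throughput` tiers, the sketch below derives time to first token, a completion flag, and tokens per second from timestamped stream events. Everything here is an assumption for illustration: the function names, the `(timestamp, payload)` event shape, and the `[DONE]` end-of-stream sentinel (a common convention in OpenAI-style SSE APIs) are not confirmed details of the actual probes.

```python
from typing import Iterable, Optional, Tuple


def summarize_stream(events: Iterable[Tuple[float, str]], started_at: float) -> dict:
    """Summarize one streamed response.

    events: (timestamp_seconds, payload) pairs in arrival order (assumed shape).
    started_at: timestamp at which the request was sent.
    """
    first_token_at: Optional[float] = None
    chunks = 0
    completed = False
    for ts, payload in events:
        if payload == "[DONE]":      # assumed end-of-stream sentinel
            completed = True
            break
        if first_token_at is None:
            first_token_at = ts      # first content chunk defines TTFT
        chunks += 1
    ttft = None if first_token_at is None else first_token_at - started_at
    return {"ttft_s": ttft, "chunks": chunks, "completed": completed}


def tokens_per_second(token_count: int, elapsed_s: float) -> Optional[float]:
    """Throughput for a finished generation; None if no time elapsed."""
    if elapsed_s <= 0:
        return None
    return token_count / elapsed_s


summary = summarize_stream(
    [(10.25, "Hel"), (10.5, "lo"), (10.75, "[DONE]")], started_at=10.0
)
```

A stream that ends without the sentinel yields `completed` as `False`, which is one way a probe could distinguish a fast first token from a fully delivered response.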