Monitoring Metrics Reference

Swarm Hosts monitoring clients request metrics by logical name, and each name must be on an allowlist. Browsers and API clients cannot submit raw PromQL. The monitor gateway maps each logical metric to a safe host-local query.
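
The allowlist lookup can be pictured with a minimal Python sketch. This is not the gateway's actual code: the names ALLOWED_METRICS, UnknownMetricError, and resolve_query are hypothetical, and the PromQL strings are illustrative placeholders rather than the queries the gateway really runs.

```python
# Hypothetical allowlist: logical metric name -> host-local query.
# The PromQL strings below are placeholders for illustration only.
ALLOWED_METRICS = {
    "host_cpu_percent": '100 * (1 - avg(rate(node_cpu_seconds_total{mode="idle"}[1m])))',
    "host_memory_used_percent": "100 * (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)",
}


class UnknownMetricError(ValueError):
    """Raised when a client asks for a name that is not on the allowlist."""


def resolve_query(metric_name: str) -> str:
    """Return the server-side query for an allowlisted logical metric name.

    Client input is only ever used as a dictionary key, so raw PromQL sent
    by a client is rejected instead of being executed.
    """
    try:
        return ALLOWED_METRICS[metric_name]
    except KeyError:
        raise UnknownMetricError(f"metric not in allowlist: {metric_name!r}") from None
```

Because the client-supplied string is never interpolated into a query, a request such as resolve_query("up{job='x'}") fails with UnknownMetricError in this sketch.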

Host Metrics

Metric                        Unit              Meaning
host_cpu_percent              percent           Overall swarm host CPU utilization
host_memory_used_percent      percent           Swarm host memory in use
host_filesystem_used_percent  percent           Persistent filesystem usage
host_disk_read_bps            bytes per second  Disk read throughput
host_disk_write_bps           bytes per second  Disk write throughput
host_network_rx_bps           bytes per second  Network receive throughput
host_network_tx_bps           bytes per second  Network transmit throughput

Deployment Metrics

Metric                              Unit              Meaning
container_cpu_percent               percent           Deployment container CPU utilization
container_memory_percent            percent           Deployment memory usage compared with its memory limit
container_memory_working_set_bytes  bytes             Deployment memory working set
container_network_rx_bps            bytes per second  Deployment network receive throughput
container_network_tx_bps            bytes per second  Deployment network transmit throughput
container_block_read_bps            bytes per second  Deployment filesystem or block read throughput
container_block_write_bps           bytes per second  Deployment filesystem or block write throughput

Game Metrics

Game metrics appear only when the game image or deployment config provides a metrics target.

Metric                   Unit              Meaning
game_players_connected   count             Current connected players
game_connections_active  count             Active game connections
game_tick_rate           ticks per second  Game server tick rate, when available

Query Behavior

Historical queries are bounded by the swarm host retention window and may be downsampled so large chart requests stay responsive. Live subscriptions have a minimum update interval of one second.
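
As a rough illustration of these bounds, the Python sketch below clamps a requested time range to a retention window and picks a step that keeps the number of returned points under a cap. The function names, the retention_seconds and max_points parameters, and the one-second step floor for historical queries are assumptions made for the example; only the one-second minimum for live subscriptions comes from this document.

```python
import math

# From this document: live subscriptions update at most once per second.
MIN_LIVE_INTERVAL_SECONDS = 1.0


def plan_history_query(start, end, now, retention_seconds, max_points):
    """Clamp [start, end] to the retention window and choose a step.

    All arguments are Unix timestamps or durations in seconds. The
    retention length and point cap are hypothetical inputs; the real
    values are decided by the swarm host and gateway.
    """
    start = max(start, now - retention_seconds)  # nothing older than retention
    end = min(end, now)                          # nothing from the future
    if end <= start:
        raise ValueError("requested range is outside the retention window")
    # Downsample: widen the step until the range fits in max_points samples.
    step = max(1, math.ceil((end - start) / max_points))
    return start, end, step


def clamp_live_interval(requested_seconds):
    """Raise a too-small live update interval to the one-second minimum."""
    return max(MIN_LIVE_INTERVAL_SECONDS, float(requested_seconds))
```

With a 24-hour retention window and a 500-point cap, for example, a request covering the whole window resolves to a 173-second step in this sketch.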

Metric labels are managed by Swarm Hosts. Custom game exporters should use stable, low-cardinality labels and must not include player IDs, IP addresses, session IDs, usernames, email addresses, request IDs, or tokens.
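
Exporter authors may find it useful to check label names before publishing a metric. The Python sketch below is one possible pre-flight check, not something Swarm Hosts provides: the denylist entries are hypothetical spellings of the categories listed above, and a name-based check cannot catch sensitive values hidden under an innocent-looking label name.

```python
# Hypothetical label names covering the categories this document forbids.
FORBIDDEN_LABEL_NAMES = {
    "player_id",
    "ip",
    "ip_address",
    "session_id",
    "username",
    "email",
    "request_id",
    "token",
}


def forbidden_labels(labels: dict[str, str]) -> list[str]:
    """Return the label names in `labels` that match the denylist, sorted."""
    return sorted(name for name in labels if name.lower() in FORBIDDEN_LABEL_NAMES)
```

For example, forbidden_labels({"region": "eu", "player_id": "123"}) returns ["player_id"], while a label set containing only map or mode names returns an empty list.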

Control Plane Product Metrics

The control plane exposes Prometheus text metrics at /metrics for aggregate product events and current resource summaries. Production deploys generate a PRODUCT_METRICS_TOKEN in the swarmhosts-web-secret Kubernetes secret; scrape the endpoint with Authorization: Bearer <token> or X-Metrics-Token: <token>.
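
A scrape with either header can be sketched using only the Python standard library. The helper names and the base URL in the usage note are placeholders; the /metrics path and the two header forms are taken from the paragraph above.

```python
import urllib.request


def build_metrics_request(base_url: str, token: str, use_bearer: bool = True) -> urllib.request.Request:
    """Build a GET request for the /metrics endpoint with one auth header.

    `base_url` is whatever address your control plane is reachable at;
    this document does not specify it.
    """
    request = urllib.request.Request(base_url.rstrip("/") + "/metrics")
    if use_bearer:
        request.add_header("Authorization", f"Bearer {token}")
    else:
        request.add_header("X-Metrics-Token", token)
    return request


def scrape(base_url: str, token: str, timeout: float = 10.0) -> str:
    """Fetch the Prometheus text exposition body as a string."""
    with urllib.request.urlopen(build_metrics_request(base_url, token), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call such as scrape("https://control-plane.example", token) would return the raw text metrics, where the host name is a stand-in and token is the PRODUCT_METRICS_TOKEN value read from the secret.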

The endpoint intentionally uses only bounded labels such as result, reason, status, role, region, and game slug. It does not expose user emails, user IDs, deployment IDs, swarm host IDs, tokens, password hashes, private keys, or raw credentials.

Initial product metrics include:

  • swarmhosts_logins_total
  • swarmhosts_user_registrations_total
  • swarmhosts_email_verifications_total
  • swarmhosts_deployment_requests_total
  • swarmhosts_swarmhost_registrations_total
  • swarmhosts_admin_actions_total
  • swarmhosts_deployment_runtime_events_total
  • swarmhosts_users_current
  • swarmhosts_deployments_current
  • swarmhosts_swarmhosts_current
  • aggregate deployment CPU usage gauges in percent and millicores
  • aggregate deployment memory usage gauges in percent and bytes
  • aggregate swarm host CPU/memory usage gauges

The VPS product-metrics stack in deploy/kustomize/product-metrics scrapes this endpoint into central VictoriaMetrics and provisions Grafana dashboards at https://metrics.swarmhosts.com.