Building a real-time video platform that works reliably at scale is genuinely hard. WebRTC’s peer-to-peer model breaks down beyond 4-5 participants. LiveKit’s Selective Forwarding Unit (SFU) architecture solves this — but deploying it correctly on Kubernetes requires understanding both systems deeply.
This guide distills what we’ve learned deploying LiveKit for platforms ranging from 100 to 50,000 concurrent users.
## Understanding LiveKit’s Architecture
LiveKit is an open-source WebRTC SFU. Instead of participants sending video to each other directly (mesh) or through a centralized transcoder (MCU), each participant sends one stream to the SFU, which selectively forwards it to other participants.
This means:
- Bandwidth scales with publishers, not participants: A 100-person meeting with one active speaker uses nearly the same bandwidth as a 10-person meeting
- CPU is dominated by packet routing, not transcoding: LiveKit is CPU-efficient and horizontally scalable
- Each room is handled by a single LiveKit server: Rooms don’t span servers (with the default config)
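The bandwidth difference between mesh and an SFU is easy to quantify. A back-of-the-envelope sketch counting upload streams only (per-stream bitrate and simulcast ignored):

```python
def mesh_upload_streams(participants: int) -> int:
    """Full mesh: every participant uploads a copy of their stream to every peer."""
    return participants * (participants - 1)

def sfu_upload_streams(publishers: int) -> int:
    """SFU: each publisher uploads exactly one stream; the server fans it out."""
    return publishers

# A 100-person mesh meeting would need 9,900 client uploads in total.
# With an SFU and one active speaker, there is a single client upload;
# client-side cost tracks publishers, not room size.
print(mesh_upload_streams(100))  # 9900
print(sfu_upload_streams(1))     # 1
```

This is why the 100-person room above costs roughly the same as a 10-person one when only one person is speaking.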
## Infrastructure Design: Single Region
For a single-region deployment serving up to 5,000 concurrent users:
```yaml
# livekit-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: livekit
  namespace: livekit
spec:
  replicas: 4
  selector:
    matchLabels:
      app: livekit
  template:
    metadata:
      labels:
        app: livekit
    spec:
      containers:
        - name: livekit
          image: livekit/livekit-server:v1.7.2
          ports:
            - containerPort: 7880 # HTTP API / WebSocket signaling
            - containerPort: 7881 # RTC over TCP
            - containerPort: 7882 # Prometheus metrics
          # UDP RTC range 50000-60000: containerPort can't express a range;
          # with hostNetwork enabled these ports are opened directly on the node.
          env:
            - name: LIVEKIT_CONFIG
              valueFrom:
                secretKeyRef:
                  name: livekit-config
                  key: config.yaml
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              memory: "8Gi" # No CPU limit — throttling causes severe WebRTC quality issues
      hostNetwork: true # Critical for UDP RTC performance
      dnsPolicy: ClusterFirstWithHostNet
```
Critical: run LiveKit with `hostNetwork: true` for optimal UDP performance. Without it, each UDP packet traverses kube-proxy’s NAT, adding 5-15ms of latency and causing packet reordering.
## The UDP Challenge
This is the biggest Kubernetes-specific challenge with LiveKit. WebRTC uses UDP for media transport. In Kubernetes:
- NodePort services don’t efficiently handle high-throughput UDP
- `hostNetwork: true` bypasses kube-proxy for UDP but means one LiveKit pod per node
- Karpenter or Cluster Autoscaler must provision appropriate node types (compute-optimised, not memory-optimised)
For AWS, we use c6i.2xlarge instances (8 vCPU, 16 GB) — one LiveKit pod per node.
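If Karpenter handles provisioning, the instance-type constraint can be pinned in a NodePool. A sketch against the Karpenter v1 API — the `EC2NodeClass` name and the CPU limit are assumptions to adapt:

```yaml
# karpenter-nodepool.yaml — sketch; verify against your Karpenter version
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: livekit
spec:
  template:
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c6i.2xlarge"]   # compute-optimised, per the sizing above
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # Spot matches the cost estimates later on
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default               # assumed EC2NodeClass name
  limits:
    cpu: "64"                       # caps the pool at eight 8-vCPU nodes
```

Pinning the instance type keeps the one-pod-per-node layout predictable as the cluster scales.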
## Redis for Room State
LiveKit uses Redis for room state when running multiple replicas. Since the config embeds your API secret, ship it as a Secret (matching the `secretKeyRef` in the Deployment above):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: livekit-config
  namespace: livekit
stringData:
  config.yaml: |
    port: 7880
    rtc:
      tcp_port: 7881
      port_range_start: 50000
      port_range_end: 60000
      use_external_ip: true
    redis:
      address: redis-master.redis.svc.cluster.local:6379
    turn:
      enabled: true
      domain: turn.yourdomain.com
      tls_port: 5349
    keys:
      your-api-key: your-api-secret
```
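The `keys` map is what client access tokens are signed against. LiveKit tokens are ordinary HS256 JWTs carrying a `video` grant; you'd generate them with the official server SDKs in production, but the shape is simple enough to sketch with the standard library (claim layout per LiveKit's token format; key, identity, and room values are placeholders):

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def livekit_access_token(api_key: str, api_secret: str,
                         identity: str, room: str, ttl: int = 3600) -> str:
    """Build an HS256 JWT with a LiveKit video grant (educational sketch)."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": api_key,        # the API key from the `keys` map
        "sub": identity,       # participant identity
        "nbf": now,
        "exp": now + ttl,
        "video": {"roomJoin": True, "room": room},
    }
    signing_input = ".".join(
        b64url(json.dumps(part).encode()) for part in (header, payload)
    )
    signature = hmac.new(api_secret.encode(), signing_input.encode(),
                         hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"

token = livekit_access_token("your-api-key", "your-api-secret",
                             "participant-1", "demo-room")
```

Clients present this token over the WebSocket connection; the server validates it against the matching secret in `keys`.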
## Global Multi-Region Deployment
For 10,000+ concurrent users across geographies, you need a multi-region approach:
### Architecture Overview

```
┌──────────────────────────────────────────────────────────┐
│                  Global Load Balancer                    │
│            (Cloudflare or Route53 Latency)               │
└──────┬───────────────────────────┬───────────────────────┘
       │                           │
┌──────▼──────┐             ┌──────▼──────┐
│  Mumbai AZ  │             │  Singapore  │
│ LiveKit x4  │             │ LiveKit x4  │
│    Redis    │             │    Redis    │
└─────────────┘             └─────────────┘
```
Key design decisions:
- Latency-based DNS routing: Users connect to the nearest LiveKit cluster automatically
- No cross-region room state: Rooms are anchored to a single region. If you need cross-region rooms, use LiveKit’s distributed config with a central Redis.
- TURN servers co-located: Deploy Coturn alongside LiveKit in each region for NAT traversal
## Ingress for WebSocket
LiveKit’s signaling uses WebSocket (established via an HTTP/1.1 Upgrade, so don’t force HTTP/2 on the upstream). Configure your ingress carefully:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: livekit-ingress
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-buffering: "off"
    nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr" # Sticky sessions
spec:
  rules:
    - host: livekit.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: livekit
                port:
                  number: 7880
```
Important: Set generous proxy timeouts. LiveKit WebSocket connections last for the duration of the room session.
## Observability for WebRTC
Standard infrastructure metrics aren’t enough for WebRTC. You need media quality metrics:
### Key Metrics to Monitor

```yaml
# Prometheus scrape config
- job_name: 'livekit'
  static_configs:
    - targets: ['livekit:7882']
  metrics_path: '/metrics'
```
Critical metrics from LiveKit’s Prometheus endpoint:
| Metric | Alert Threshold | Meaning |
|---|---|---|
| `livekit_room_total` | > 80% capacity | Approaching room limits |
| `livekit_participant_total` | > 80% capacity | Approaching participant limits |
| `livekit_packet_loss_*` | > 2% | Network quality degrading |
| `livekit_nack_total` | Rising | Packet retransmissions needed |
| `livekit_jitter_*` | > 30ms | Jitter buffer stress |
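Thresholds like these translate directly into Prometheus alerting rules. A sketch for the packet-loss threshold — the exact series name behind `livekit_packet_loss_*` is an assumption here, so verify it against your `/metrics` output first:

```yaml
# livekit-alerts.yaml — sketch; confirm the series name on /metrics
groups:
  - name: livekit-quality
    rules:
      - alert: LiveKitPacketLossHigh
        expr: avg(livekit_packet_loss) > 0.02   # assumed metric name; 2% threshold
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "LiveKit packet loss above 2% for 5 minutes"
```

The `for: 5m` clause avoids paging on transient network blips, which are normal in WebRTC.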
### Grafana Dashboard
We’ve open-sourced our LiveKit Grafana dashboard — it includes all WebRTC quality metrics, room lifecycle visualisations, and per-participant quality scoring. Contact us to request access.
## Load Testing LiveKit
Never deploy at scale without load testing. We use a combination of:
- livekit-load-tester (official tool) — simulates publish-only rooms
- Custom k6 scripts — for realistic join/leave patterns
- Chaos engineering — simulate node failures during active rooms
Run these in staging with your target concurrent user count × 1.5 for headroom.
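The 1.5× headroom rule makes capacity planning a one-liner. A sketch — the default per-node capacity is an assumption (roughly 4,000-5,000 participants per c6i.2xlarge, in line with the results reported below); measure your own ceiling before trusting it:

```python
import math

def nodes_needed(expected_peak_users: int,
                 per_node_capacity: int = 4000,
                 headroom: float = 1.5) -> int:
    """Nodes required to load-test (and serve) peak load with 1.5x headroom.

    per_node_capacity is an assumed figure, not a LiveKit guarantee.
    """
    target = expected_peak_users * headroom
    return math.ceil(target / per_node_capacity)

print(nodes_needed(10_000))  # 4 nodes to cover a 15,000-participant test
```

Running the load test at the node count this returns tells you whether the per-node assumption holds for your codec and room-size mix.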
## Results: What to Expect
With the architecture above on 4× c6i.2xlarge nodes:
- Concurrent participants: 15,000-20,000 per region
- Average latency: 60-90ms within region
- Packet loss (good network): < 0.1%
- CPU utilisation: ~60% at full load (room for spikes)
- Cost: ~$800-1,200/month per region on AWS Spot
At 50,000+ concurrent users, we add Karpenter for dynamic node provisioning and geographic expansion to 4+ regions.
## Getting Started
LiveKit is genuinely one of the best open-source WebRTC stacks available. The complexity is in the deployment — particularly the UDP networking on Kubernetes.
If you want KubeAce to architect and deploy a LiveKit platform for your use case — whether it’s education, telehealth, gaming, or enterprise conferencing — schedule a free architecture review. We typically scope and quote in one session.