Why Kubernetes cost optimization matters
Kubernetes makes it easy to scale workloads — but it also makes it easy to waste money. Overprovisioned nodes, idle pods, and unattended test environments drain budgets silently. For companies spending ₹5L–₹50L per month on cloud infrastructure, a disciplined cost optimization program can recover 20-40% of that spend without impacting performance.
This guide covers the practical levers engineering and finance teams can pull to reduce Kubernetes spend, improve unit economics, and establish lasting FinOps governance.
Step 1: Establish a cost baseline
Before optimizing, you need visibility. Start with:
- Namespace-level cost attribution — Allocate spend to teams or services using Kubecost or OpenCost
- Node utilization metrics — Identify average CPU and memory utilization across node pools
- Idle resource detection — Find pods whose actual usage sits well below their requests (healthy targets: CPU > 40% and memory > 50% of requested resources)
- Storage and data transfer costs — Often overlooked, these can represent 15-20% of total spend
Once you have a baseline, set a target savings range and prioritize actions by ROI.
Step 2: Rightsize workloads
Overprovisioned resource requests are the single largest source of waste in Kubernetes clusters.
Audit resource requests and limits
Run the following for each namespace:
```shell
kubectl get pods -n <namespace> -o json | jq '.items[].spec.containers[].resources'
```
Look for pods where:
- Requests are set much higher than actual usage
- No limits are set (risks noisy neighbor issues)
- Memory limits are 3x+ higher than usage
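As a reference point for the audit, a rightsized container spec might look like the sketch below. The image name and all request/limit values are illustrative — in practice they should come from observed usage, not guesses:

```yaml
# Hypothetical rightsized Deployment: requests reflect observed
# steady-state usage; limits leave burst headroom without the
# 3x+ overallocation flagged above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0        # placeholder image
          resources:
            requests:
              cpu: 250m            # near observed median usage
              memory: 256Mi
            limits:
              cpu: 500m            # headroom for bursts
              memory: 512Mi        # well under 3x the request
```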
Use Vertical Pod Autoscaler (VPA)
VPA in recommendation mode suggests optimal resource settings without enforcing them:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
```
Review VPA recommendations weekly and apply them to staging before production.
Step 3: Optimize autoscaling
Horizontal Pod Autoscaler (HPA)
Ensure HPA is configured for variable workloads. Scale on custom metrics (queue depth, RPS) rather than just CPU to avoid over-scaling.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
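To scale on queue depth rather than CPU, an External metric can be used instead. The sketch below assumes a metrics adapter (such as KEDA or a Prometheus adapter) already exposes a metric named sqs_queue_depth to the HPA — the metric name and target value are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 30
  metrics:
    - type: External
      external:
        metric:
          name: sqs_queue_depth   # assumed metric exposed by an adapter
        target:
          type: AverageValue
          averageValue: "100"     # ~100 queued messages per replica (illustrative)
```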
Cluster Autoscaler vs Karpenter
For AWS EKS, Karpenter significantly outperforms Cluster Autoscaler:
- Provisions nodes in ~30 seconds vs 3-5 minutes
- Bin-packs pods optimally, reducing node count
- Supports consolidation — automatically removes underutilized nodes
Migrating from Cluster Autoscaler to Karpenter typically reduces EC2 spend by 15-25%.
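Consolidation is opted into per node pool. The sketch below assumes Karpenter's v1 NodePool API (earlier releases used a Provisioner resource instead); the name, timings, and the referenced EC2NodeClass are illustrative:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumes a matching EC2NodeClass exists
  disruption:
    # Remove or replace nodes that are empty or underutilized,
    # which drives the bin-packing savings described above.
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```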
Step 4: Use spot and preemptible instances
Spot instances offer 60-90% savings over on-demand pricing. For Kubernetes workloads:
- Run stateless, fault-tolerant workloads (web servers, batch processors, ML training) on spot nodes
- Use node affinity and taints to isolate spot from on-demand workloads
- Implement pod disruption budgets to handle spot interruptions gracefully
- Keep production databases and stateful workloads on on-demand or reserved instances
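Two of the guardrails above — disruption budgets and taint-based isolation — can be sketched as manifests. The names, labels, and taint key are one possible convention, not a standard:

```yaml
# A PodDisruptionBudget that keeps at least one replica available
# while spot interruptions or node consolidation drain nodes.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-worker-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: batch-worker
---
# A Deployment fragment that opts into spot nodes via a toleration,
# assuming spot nodes carry a taint like capacity-type=spot:NoSchedule.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      tolerations:
        - key: capacity-type
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: worker
          image: batch-worker:1.0   # placeholder image
```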
AWS EKS spot configuration with Karpenter
```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-provisioner
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  limits:
    resources:
      cpu: 1000
```
Step 5: Optimize storage costs
Storage costs accumulate quickly and are often invisible until your cloud bill arrives.
- Delete unattached PersistentVolumes — Run monthly audits for orphaned PVCs
- Set StorageClass reclaim policies — Use Delete for ephemeral storage, Retain only for critical data
- Use object storage for logs — Route logs to S3/Azure Blob instead of retaining them in Prometheus long-term
- Implement lifecycle policies — Automatically tier old data to cheaper storage classes
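The reclaim-policy advice above can be expressed as a StorageClass. This sketch assumes the AWS EBS CSI driver; swap in your cluster's provisioner and volume type as needed:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ephemeral-gp3
provisioner: ebs.csi.aws.com     # assumes the AWS EBS CSI driver
parameters:
  type: gp3
reclaimPolicy: Delete            # volumes are removed with their PVCs
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```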
Step 6: Implement FinOps governance
One-time optimizations erode without governance. Build these processes:
Tagging and allocation
Enforce consistent resource tagging:
- Team/owner
- Environment (production, staging, dev)
- Service/application name
- Cost center
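In-cluster, the same attribution can be enforced with Kubernetes labels, which tools like Kubecost and OpenCost use for per-team allocation. The label keys below are one possible convention, not a standard, and the names are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
  labels:
    team: payments                                # owner
    environment: production
    app.kubernetes.io/name: checkout-service      # service name
    cost-center: cc-1042                          # illustrative code
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: checkout-service
  template:
    metadata:
      # Pod-level labels are what cost-allocation tools aggregate on.
      labels:
        team: payments
        environment: production
        app.kubernetes.io/name: checkout-service
        cost-center: cc-1042
    spec:
      containers:
        - name: checkout
          image: checkout-service:1.0             # placeholder image
```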
Use AWS Cost Allocation Tags or Azure Cost Management to produce per-team reports.
Showback and chargeback
Send weekly cost summaries to engineering leads. Teams that can see their costs tend to reduce them naturally; this cultural shift is often more impactful than any single technical change.
Cost anomaly detection
Configure alerts for:
- Spend spikes > 20% above baseline
- New resources without required tags
- Idle resources running for > 72 hours
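If OpenCost's Prometheus metrics are available, the spend-spike alert can be sketched as a PrometheusRule. This assumes the Prometheus Operator is installed and that the node_total_hourly_cost metric is being scraped from OpenCost; the baseline window and 20% threshold follow the guideline above:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-anomaly-alerts
spec:
  groups:
    - name: finops
      rules:
        - alert: ClusterSpendSpike
          # Fire when current hourly node cost exceeds the
          # 7-day average by more than 20%.
          expr: |
            sum(node_total_hourly_cost)
              > 1.2 * avg_over_time(sum(node_total_hourly_cost)[7d:1h])
          for: 2h
          labels:
            severity: warning
          annotations:
            summary: "Cluster spend is >20% above its 7-day baseline"
```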
Expected outcomes
A structured Kubernetes cost optimization program typically delivers:
| Optimization Area | Expected Savings |
|---|---|
| Rightsizing (VPA recommendations) | 10-20% |
| Spot instance migration | 15-30% |
| Karpenter node consolidation | 10-20% |
| Storage lifecycle policies | 5-10% |
| Unused resource cleanup | 5-15% |
Combined, most teams achieve 25-40% reduction within 60 days.
Next steps
If your Kubernetes spend is growing faster than your team, a structured optimization sprint can deliver measurable savings quickly. KubeAce runs 2-4 week Cloud Cost Optimization Sprints that combine tooling, automation, and governance to produce lasting results.
Book a free infra review to discuss your current spend and potential savings range.