Why Kubernetes cost optimization matters
Kubernetes makes it easy to scale workloads — but it also makes it easy to waste money. Overprovisioned nodes, idle pods, and unattended test environments drain budgets silently. For companies spending ₹5L–₹50L per month on cloud infrastructure, a disciplined cost optimization program can recover 20-40% of that spend without impacting performance.
This guide covers the practical levers engineering and finance teams can pull to reduce Kubernetes spend, improve unit economics, and establish lasting FinOps governance.
Step 1: Establish a cost baseline
Before optimizing, you need visibility. Start with:
- Namespace-level cost attribution — Allocate spend to teams or services using Kubecost or OpenCost
- Node utilization metrics — Identify average CPU and memory utilization across node pools
- Idle resource detection — Find pods whose actual usage sits well below their requests (healthy targets: CPU > 40% and memory > 50% of requested resources)
- Storage and data transfer costs — Often overlooked, these can represent 15-20% of total spend
Once you have a baseline, set a target savings range and prioritize actions by ROI.
Step 2: Rightsize workloads
Overprovisioned resource requests are the single largest source of waste in Kubernetes clusters.
Audit resource requests and limits
Run the following for each namespace:
```shell
kubectl get pods -n <namespace> -o json | jq '.items[].spec.containers[].resources'
```
Look for pods where:
- Requests are set much higher than actual usage
- No limits are set (risks noisy neighbor issues)
- Memory limits are 3x+ higher than usage
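As a reference point for the audit, a rightsized container spec might look like the sketch below. The image name and all request/limit values are illustrative — in practice they should come from observed usage, not guesses:

```yaml
# Hypothetical rightsized Deployment: requests reflect observed
# steady-state usage; limits leave burst headroom without the
# 3x+ overallocation flagged above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0        # placeholder image
          resources:
            requests:
              cpu: 250m            # near observed median usage
              memory: 256Mi
            limits:
              cpu: 500m            # headroom for bursts
              memory: 512Mi        # well under 3x the request
```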
Use Vertical Pod Autoscaler (VPA)
VPA in recommendation mode suggests optimal resource settings without enforcing them:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
```
Review VPA recommendations weekly and apply them to staging before production.
Step 3: Optimize autoscaling
Horizontal Pod Autoscaler (HPA)
Ensure HPA is configured for variable workloads. Scale on custom metrics (queue depth, RPS) rather than just CPU to avoid over-scaling.
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
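To scale on queue depth rather than CPU, an External metric can be used instead. The sketch below assumes a metrics adapter (such as KEDA or a Prometheus adapter) already exposes a metric named sqs_queue_depth to the HPA — the metric name and target value are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 1
  maxReplicas: 30
  metrics:
    - type: External
      external:
        metric:
          name: sqs_queue_depth   # assumed metric exposed by an adapter
        target:
          type: AverageValue
          averageValue: "100"     # ~100 queued messages per replica (illustrative)
```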
Cluster Autoscaler vs Karpenter
For AWS EKS, Karpenter significantly outperforms Cluster Autoscaler:
- Provisions nodes in ~30 seconds vs 3-5 minutes
- Bin-packs pods optimally, reducing node count
- Supports consolidation — automatically removes underutilized nodes
Migrating from Cluster Autoscaler to Karpenter typically reduces EC2 spend by 15-25%.
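Consolidation is opted into per node pool. The sketch below assumes Karpenter's v1 NodePool API (earlier releases used a Provisioner resource instead); the name, timings, and the referenced EC2NodeClass are illustrative:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumes a matching EC2NodeClass exists
  disruption:
    # Remove or replace nodes that are empty or underutilized,
    # which drives the bin-packing savings described above.
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
```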
Step 4: Use spot and preemptible instances
Spot instances offer 60-90% savings over on-demand pricing. For Kubernetes workloads:
- Run stateless, fault-tolerant workloads (web servers, batch processors, ML training) on spot nodes
- Use node affinity and taints to isolate spot from on-demand workloads
- Implement pod disruption budgets to handle spot interruptions gracefully
- Keep production databases and stateful workloads on on-demand or reserved instances
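Two of the guardrails above — disruption budgets and taint-based isolation — can be sketched as manifests. The names, labels, and taint key are one possible convention, not a standard:

```yaml
# A PodDisruptionBudget that keeps at least one replica available
# while spot interruptions or node consolidation drain nodes.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-worker-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: batch-worker
---
# A Deployment fragment that opts into spot nodes via a toleration,
# assuming spot nodes carry a taint like capacity-type=spot:NoSchedule.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      tolerations:
        - key: capacity-type
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: worker
          image: batch-worker:1.0   # placeholder image
```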
AWS EKS spot configuration with Karpenter
```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-provisioner
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  limits:
    resources:
      cpu: 1000
```
Step 5: Optimize storage costs
Storage costs accumulate quickly and are often invisible until your cloud bill arrives.
- Delete unattached PersistentVolumes — Run monthly audits for orphaned PVCs
- Set StorageClass reclaim policies — Use Delete for ephemeral storage, Retain only for critical data
- Use object storage for logs — Route logs to S3/Azure Blob instead of retaining them in Prometheus long-term
- Implement lifecycle policies — Automatically tier old data to cheaper storage classes
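The reclaim-policy advice above can be expressed as a StorageClass. This sketch assumes the AWS EBS CSI driver; swap in your cluster's provisioner and volume type as needed:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ephemeral-gp3
provisioner: ebs.csi.aws.com     # assumes the AWS EBS CSI driver
parameters:
  type: gp3
reclaimPolicy: Delete            # volumes are removed with their PVCs
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```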
Step 6: Implement FinOps governance
One-time optimizations erode without governance. Build these processes:
Tagging and allocation
Enforce consistent resource tagging:
- Team/owner
- Environment (production, staging, dev)
- Service/application name
- Cost center
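In-cluster, the same attribution can be enforced with Kubernetes labels, which tools like Kubecost and OpenCost use for per-team allocation. The label keys below are one possible convention, not a standard, and the names are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
  labels:
    team: payments                                # owner
    environment: production
    app.kubernetes.io/name: checkout-service      # service name
    cost-center: cc-1042                          # illustrative code
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: checkout-service
  template:
    metadata:
      # Pod-level labels are what cost-allocation tools aggregate on.
      labels:
        team: payments
        environment: production
        app.kubernetes.io/name: checkout-service
        cost-center: cc-1042
    spec:
      containers:
        - name: checkout
          image: checkout-service:1.0             # placeholder image
```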
Use AWS Cost Allocation Tags or Azure Cost Management to produce per-team reports.
Showback and chargeback
Send weekly cost summaries to engineering leads. Teams that can see their costs tend to reduce them naturally; this cultural shift is often more impactful than any single technical change.
Cost anomaly detection
Configure alerts for:
- Spend spikes > 20% above baseline
- New resources without required tags
- Idle resources running for > 72 hours
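If OpenCost's Prometheus metrics are available, the spend-spike alert can be sketched as a PrometheusRule. This assumes the Prometheus Operator is installed and that the node_total_hourly_cost metric is being scraped from OpenCost; the baseline window and 20% threshold follow the guideline above:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-anomaly-alerts
spec:
  groups:
    - name: finops
      rules:
        - alert: ClusterSpendSpike
          # Fire when current hourly node cost exceeds the
          # 7-day average by more than 20%.
          expr: |
            sum(node_total_hourly_cost)
              > 1.2 * avg_over_time(sum(node_total_hourly_cost)[7d:1h])
          for: 2h
          labels:
            severity: warning
          annotations:
            summary: "Cluster spend is >20% above its 7-day baseline"
```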
Expected outcomes
A structured Kubernetes cost optimization program typically delivers:
| Optimization Area | Expected Savings |
|---|---|
| Rightsizing (VPA recommendations) | 10-20% |
| Spot instance migration | 15-30% |
| Karpenter node consolidation | 10-20% |
| Storage lifecycle policies | 5-10% |
| Unused resource cleanup | 5-15% |
Combined, most teams achieve 25-40% reduction within 60 days.
Next steps
If your Kubernetes spend is growing faster than your team, a structured optimization sprint can deliver measurable savings quickly. KubeAce runs 2-4 week Cloud Cost Optimization Sprints that combine tooling, automation, and governance to produce lasting results.
Book a free infra review to discuss your current spend and potential savings range.