Cloud · Kubernetes · FinOps

The Complete Guide to Kubernetes Cost Optimization in 2025

Reduce Kubernetes infrastructure costs by 20-40% through rightsizing, autoscaling, spot instances, and FinOps governance practices.


DevOps & Kubernetes Consultants

10 min read

Why Kubernetes cost optimization matters

Kubernetes makes it easy to scale workloads — but it also makes it easy to waste money. Overprovisioned nodes, idle pods, and unattended test environments drain budgets silently. For companies spending ₹5L–₹50L per month on cloud infrastructure, a disciplined cost optimization program can cut spend by 20-40% without impacting performance.

This guide covers the practical levers engineering and finance teams can pull to reduce Kubernetes spend, improve unit economics, and establish lasting FinOps governance.

Step 1: Establish a cost baseline

Before optimizing, you need visibility. Start with:

  • Namespace-level cost attribution — Allocate spend to teams or services using Kubecost or OpenCost
  • Node utilization metrics — Identify average CPU and memory utilization across node pools
  • Idle resource detection — Find pods running well below their requests (healthy targets: CPU > 40% and memory > 50% of requested)
  • Storage and data transfer costs — Often overlooked, these can represent 15-20% of total spend

Once you have a baseline, set a target savings range and prioritize actions by ROI.
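The idle-pod check above can be sketched as a one-liner that flags pods below a CPU threshold. The sample input is hard-coded for illustration; against a live cluster you would pipe in real `kubectl top pods` output instead, and the 50m threshold is an assumption, not a recommendation:

```shell
# Flag pods whose CPU usage is below 50 millicores (illustrative threshold).
# The heredoc stands in for: kubectl top pods -A --no-headers | awk '{print $2, $3}'
cat <<'EOF' | awk '$2+0 < 50 {print $1}'
api-server 120m
worker 30m
cron-job 10m
EOF
# prints "worker" and "cron-job", each on its own line
```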

Step 2: Rightsize workloads

Overprovisioned resource requests are the single largest source of waste in Kubernetes clusters.

Audit resource requests and limits

Run the following for each namespace:

kubectl get pods -n <namespace> -o json | jq '.items[].spec.containers[].resources'

Look for pods where:

  • Requests are set much higher than actual usage
  • No limits are set (risks noisy neighbor issues)
  • Memory limits are 3x+ higher than usage
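As a sketch of what a rightsized spec looks like, the abridged container resources below keep requests close to observed steady-state usage and set limits with modest headroom. The `my-app` name and all numbers are illustrative, not recommendations:

```yaml
# Abridged Deployment fragment — only the resources stanza shown
containers:
- name: my-app
  resources:
    requests:
      cpu: 250m        # close to observed p95 usage
      memory: 256Mi
    limits:
      cpu: 500m        # modest headroom for bursts
      memory: 512Mi    # avoids limits 3x+ above actual usage
```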

Use Vertical Pod Autoscaler (VPA)

VPA in recommendation mode suggests optimal resource settings without enforcing them:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"

Review VPA recommendations weekly and apply them to staging before production.

Step 3: Optimize autoscaling

Horizontal Pod Autoscaler (HPA)

Ensure HPA is configured for variable workloads. Scale on custom metrics (queue depth, RPS) rather than just CPU to avoid over-scaling.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
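As a hedged sketch of scaling on queue depth rather than CPU: the manifest below assumes a metrics adapter (e.g. KEDA or prometheus-adapter) already exposes a `queue_depth` external metric — the metric name, target value, and `my-worker` names are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: External
    external:
      metric:
        name: queue_depth        # assumed to be exposed by a metrics adapter
      target:
        type: AverageValue
        averageValue: "30"       # target ~30 queued messages per replica
```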

Cluster Autoscaler vs Karpenter

For AWS EKS, Karpenter significantly outperforms Cluster Autoscaler:

  • Provisions nodes in ~30 seconds vs 3-5 minutes
  • Bin-packs pods optimally, reducing node count
  • Supports consolidation — automatically removes underutilized nodes

Migrating from Cluster Autoscaler to Karpenter typically reduces EC2 spend by 15-25%.

Step 4: Use spot and preemptible instances

Spot instances offer 60-90% savings over on-demand pricing. For Kubernetes workloads:

  • Run stateless, fault-tolerant workloads (web servers, batch processors, ML training) on spot nodes
  • Use node affinity and taints to isolate spot from on-demand workloads
  • Implement pod disruption budgets to handle spot interruptions gracefully
  • Keep production databases and stateful workloads on on-demand or reserved instances
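To make spot interruptions survivable, a PodDisruptionBudget caps how many replicas can be evicted at once. A minimal sketch (the `my-app` selector and `minAvailable` value are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2          # always keep at least 2 replicas running
  selector:
    matchLabels:
      app: my-app
```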

AWS EKS spot configuration with Karpenter

# Karpenter v1 NodePool (the older v1alpha5 Provisioner API has been removed)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default        # assumes an EC2NodeClass named "default" exists
  limits:
    cpu: 1000

Step 5: Optimize storage costs

Storage costs accumulate quickly and are often invisible until your cloud bill arrives.

  • Delete unattached PersistentVolumes — Run monthly audits for orphaned PVCs
  • Set StorageClass reclaim policies — Use Delete for ephemeral storage, Retain only for critical data
  • Use object storage for logs — Route logs to S3/Azure Blob instead of retaining in Prometheus long-term
  • Implement lifecycle policies — Automatically tier old data to cheaper storage classes
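As an illustration of the reclaim-policy point, a StorageClass for ephemeral data on AWS might look like the following — the EBS CSI provisioner and `gp3` volume type are assumptions; adjust for your cloud:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ephemeral-gp3
provisioner: ebs.csi.aws.com       # AWS EBS CSI driver (cloud-specific)
parameters:
  type: gp3
reclaimPolicy: Delete              # volume is removed when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```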

Step 6: Implement FinOps governance

One-time optimizations erode without governance. Build these processes:

Tagging and allocation

Enforce consistent resource tagging:

  • Team/owner
  • Environment (production, staging, dev)
  • Service/application name
  • Cost center

Use AWS Cost Allocation Tags or Azure Cost Management to produce per-team reports.
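Inside the cluster, the same scheme maps naturally onto Kubernetes labels, which Kubecost/OpenCost and cloud cost tools can group by. The label keys and values below are illustrative, not a standard:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
  labels:
    team: payments             # owner team (illustrative)
    environment: production    # production / staging / dev
    app: checkout-service      # service name
    cost-center: cc-1234       # finance cost center
```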

Showback and chargeback

Send weekly cost summaries to engineering leads. Teams that can see their own costs tend to reduce spend without mandates. This cultural shift is often more impactful than any single technical change.

Cost anomaly detection

Configure alerts for:

  • Spend spikes > 20% above baseline
  • New resources without required tags
  • Idle resources running for > 72 hours
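One possible sketch of the idle-resource alert, assuming the Prometheus Operator is installed and cAdvisor plus kube-state-metrics metrics are being scraped (the rule name, 10% threshold, and severity label are assumptions):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: idle-workload-alerts
spec:
  groups:
  - name: finops
    rules:
    - alert: IdleWorkload
      expr: |
        sum(rate(container_cpu_usage_seconds_total[1h])) by (namespace, pod)
          / sum(kube_pod_container_resource_requests{resource="cpu"}) by (namespace, pod) < 0.1
      for: 72h               # matches the "idle for > 72 hours" policy above
      labels:
        severity: info
```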

Expected outcomes

A structured Kubernetes cost optimization program typically delivers:

Optimization Area                    Expected Savings
Rightsizing (VPA recommendations)    10-20%
Spot instance migration              15-30%
Karpenter node consolidation         10-20%
Storage lifecycle policies           5-10%
Unused resource cleanup              5-15%

Combined, most teams achieve 25-40% reduction within 60 days.

Next steps

If your Kubernetes spend is growing faster than your team, a structured optimization sprint can deliver measurable savings quickly. KubeAce runs 2-4 week Cloud Cost Optimization Sprints that combine tooling, automation, and governance to produce lasting results.

Book a free infra review to discuss your current spend and potential savings range.