Day 17/40 — Kubernetes Autoscaling: HPA vs VPA Explained With Hands-On Practice


If you've ever wondered how Kubernetes knows when to spin up more pods or give a pod more memory, that's autoscaling — and it's one of those concepts that sounds intimidating until you actually do it yourself. Day 17 of the #40DaysOfKubernetes challenge is where it clicked for me.

What is Autoscaling in Kubernetes?

At its core, autoscaling means Kubernetes adjusts resources automatically based on demand. You don't manually intervene every time traffic spikes. There are two main types:

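HPA (Horizontal Pod Autoscaler): adds or removes pod replicas based on CPU or memory utilization, or on custom metrics
VPA (Vertical Pod Autoscaler): adjusts the CPU and memory requests of existing workloads instead of changing the pod count; it is installed as a separate add-on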

Think of HPA as hiring more staff when the shop gets busy. VPA is more like giving one staff member more tools to handle the workload alone.

What I Did — Setting Up HPA

First I deployed the sample php-apache app with defined CPU requests and limits:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

After applying the YAML file, the pod was up and running.

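The key part is setting resources.requests.cpu: HPA expresses utilization as a percentage of the requested CPU, so without a request the autoscaler has nothing to measure against. It also needs a metrics source, such as Metrics Server, running in the cluster to report actual usage.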
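To make the dependency on requests concrete, here is a minimal Python sketch of how a CPU utilization percentage can be derived from usage and requests. The function name and the sample usage figures are illustrative assumptions, not part of any Kubernetes API; only the 200m request and 500m limit come from the manifest above.

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Return CPU utilization as a percentage of the requested CPU.

    usage_millicores: per-pod CPU usage samples in millicores (hypothetical values).
    request_millicores: the per-pod CPU request in millicores (200 for php-apache).
    """
    if not usage_millicores:
        raise ValueError("at least one usage sample is required")
    if request_millicores <= 0:
        # No request means there is no baseline to compare usage against.
        raise ValueError("a positive CPU request is required")
    total_usage = sum(usage_millicores)
    total_request = request_millicores * len(usage_millicores)
    return 100.0 * total_usage / total_request


# One pod using 100m of its 200m request sits right at the 50% target.
print(cpu_utilization_percent([100], 200))
# A pod pinned at its 500m limit reports 250% of its 200m request.
print(cpu_utilization_percent([500], 200))
```

Because the percentage is relative to the request, the same 100m of usage would read as 100% against a 100m request.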

Then I created the HPA object targeting 50% average CPU utilization, with a minimum of 1 pod and maximum of 10. This is the declarative method:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

The imperative equivalent is a single command:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

Use one method or the other, since both create an HPA named php-apache. Once it was applied, the autoscaler was in place.

Generating Load to Watch It Scale

This is the fun part. I ran a load generator in a separate pod — basically a loop hammering the apache service with requests:

kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- \
  /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Then watched the HPA respond in real time:

kubectl get hpa php-apache --watch


Watching the replica count climb from 1 to several pods as CPU utilization crossed 50% made the whole concept land in a way that reading documentation never does.
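The scaling decision behind that climb follows the formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). Below is a rough Python sketch of it, assuming the default 10% tolerance and ignoring stabilization windows, pod readiness, and missing metrics. The function name and the sample utilization values are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single utilization metric.

    Simplified sketch: the real controller also accounts for unready pods,
    missing metrics, and scale-down stabilization, which are omitted here.
    """
    ratio = current_utilization / target_utilization
    # Within the tolerance band around 1.0, the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical readings against the 50% target used in this post.
print(desired_replicas(1, 250, 50))   # one hot pod
print(desired_replicas(4, 52, 50))    # close enough to target, no change
print(desired_replicas(3, 500, 50))   # capped by max_replicas
```

For example, one pod reporting 250% against a 50% target yields 5 replicas in a single step, while a reading of 52% across 4 pods falls inside the tolerance band and changes nothing.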

HPA vs VPA — When Do You Use Which?

	HPA	VPA
Scales	Number of pods	CPU/memory requests (and limits) per pod
Best for	Stateless apps with variable traffic	Apps where sizing is hard to predict upfront
Works with	CPU, memory, custom metrics	CPU and memory

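In practice, HPA is the more common choice for production workloads. VPA is useful during early deployment, when you're still figuring out the right resource requests for an app. Avoid pointing both at the same CPU or memory metric for one workload, since they would work against each other.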

Key Takeaway

Don't skip setting resources.requests in your deployment spec. HPA is blind without it. That one line is what connects your workload to the autoscaler.
