How to Deploy an HPA on an EKS Cluster: A Step-by-Step Guide
Horizontal scaling in Kubernetes is a method of scaling out (or in) the number of pod replicas in a deployment or replica set to adjust to workload demands. This is crucial for maintaining optimal performance and availability, especially for applications with varying loads. In this post, we will explore how to deploy a Horizontal Pod Autoscaler (HPA) for a deployment on an EKS cluster. Here’s a deeper look into how it works and its business applications:
How Horizontal Scaling Works in Kubernetes
- Pod Replicas: Kubernetes manages the scaling of applications by adjusting the number of pod replicas—instances of your application running on the cluster.
- Horizontal Pod Autoscaler (HPA): This component automatically scales the number of pod replicas in a deployment or replica set based on observed CPU utilization or other selected metrics. The HPA adjusts the `replicas` field of the deployment or replica set.
- Metrics Monitoring: Autoscaling decisions are based on metrics, including CPU utilization, memory usage, or custom metrics provided by the Kubernetes Metrics Server or external monitoring tools.
- Scaling Out and In:
- Scaling Out: Increases the number of pod replicas when the demand is high, ensuring that the workload is distributed and the service remains responsive.
- Scaling In: Decreases the number of pod replicas during low-demand periods to save resources, reducing costs without impacting performance.
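The scale-out/in decision follows the standard HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A minimal sketch in JavaScript (the function name is illustrative, and the real controller also applies a tolerance and a stabilization window):

```javascript
// Simplified sketch of the HPA scaling decision:
// desired = ceil(current * currentMetric / targetMetric)
function desiredReplicas(currentReplicas, currentUtilization, targetUtilization) {
  return Math.ceil(currentReplicas * (currentUtilization / targetUtilization));
}

// 3 replicas averaging 90% utilization against a 60% target -> scale out to 5
console.log(desiredReplicas(3, 90, 60)); // 5
// 4 replicas averaging 30% against a 60% target -> scale in to 2
console.log(desiredReplicas(4, 30, 60)); // 2
```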
Business Use Cases
- E-Commerce Websites: For online retailers, traffic can vary significantly due to promotions, sales events, or seasonal traffic. Horizontal scaling ensures that the website remains responsive during traffic spikes and cost-efficient during normal or low traffic.
- Media and Streaming Services: These platforms experience varying loads based on new content releases, live events, or user time zones. Scaling ensures that server resources meet the demand without degrading user experience.
- Financial Services: Banking and trading platforms require high responsiveness, especially during market hours or economic events. Horizontal scaling helps handle the load during peak trading and scales down in off-peak hours.
- SaaS Applications: As the user base grows or usage peaks due to specific operational periods (e.g., end-of-month reporting), SaaS platforms must scale dynamically to handle increased load and return to normal levels afterward.
- IoT Applications: IoT applications often have to process large influxes of data sent from devices. Scaling ensures data processing keeps up with the incoming data flow, critical for real-time analysis and response systems.
- Microservices Architectures: Different components might have different scaling needs in microservices environments. Kubernetes can scale these services independently based on their specific load, which optimizes resource usage across the cluster.
Benefits of Horizontal Scaling
- Cost Efficiency: By scaling down during low demand, businesses can minimize resource usage and reduce costs.
- Improved Availability and Performance: Scaling out during peak times helps maintain performance levels and prevent downtime or slowdowns.
- Flexibility and Agility: Businesses can respond quickly to changes in demand without manual intervention or over-provisioning of resources.
To create a Node.js application that can be deployed with horizontal scaling based on memory usage, you’ll need to prepare a few components:
- Node.js Application Code: A simple application.
- Dockerfile: To containerize the application.
- Kubernetes Configuration:
- Deployment: For the app deployment.
- Service: To expose the app.
- Horizontal Pod Autoscaler (HPA): To scale based on memory usage.
Node.js Application Code
First, let’s write a simple Node.js application. This example will create a basic HTTP server that responds to all requests with a “Hello, World!” message.
```javascript
// app.js
const http = require('http');

const server = http.createServer((req, res) => {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('Welcome to AWSTrainingwithJagan.com!\n');
});

const PORT = process.env.PORT || 8080;
server.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}/`);
});
```
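Because the Dockerfile below runs npm install, the project also needs a package.json next to app.js. A minimal one could look like this (the name and version are placeholders):

```json
{
  "name": "nodejs-app",
  "version": "1.0.0",
  "main": "app.js",
  "scripts": {
    "start": "node app.js"
  }
}
```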
Dockerfile
Next, create a Dockerfile to containerize the application.
```dockerfile
# Use an official Node runtime as a parent image
FROM node:14-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . .

# Install any needed packages specified in package.json
RUN npm install

# Make port 8080 available to the world outside this container
EXPOSE 8080

# Run app.js when the container launches
CMD ["node", "app.js"]
```
The project folder should contain app.js, package.json, and the Dockerfile at the same level. Build the Docker image:

```shell
docker build -t nodeapps .
```
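To keep the local node_modules folder and logs out of the image build context, it is also worth adding a .dockerignore file alongside the Dockerfile (a common convention, not something this setup strictly requires):

```
node_modules
npm-debug.log
```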
Try to run the container on your local machine as shown below. The app listens on port 8080 inside the container; since port 8080 is occupied on my machine, I map host port 8082 to it:

```shell
docker run -p 8082:8080 --name nodecont nodeapps
```
Kubernetes Configuration
Deployment -> deployment.yml

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-app
  template:
    metadata:
      labels:
        app: nodejs-app
    spec:
      containers:
        - name: nodejs-app
          # Replace with your registry image (e.g. an ECR URI) when deploying to EKS
          image: nodeapps:latest
          resources:
            requests:
              memory: "256Mi"
            limits:
              memory: "512Mi"
          ports:
            - containerPort: 8080
```
Service -> service.yml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nodejs-app
spec:
  type: ClusterIP
  selector:
    app: nodejs-app
  ports:
    - port: 80
      targetPort: 8080
```
Horizontal Pod Autoscaler (HPA) -> hpa.yml

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nodejs-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nodejs-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
```
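Note that averageUtilization for a resource metric is measured against the pod's memory request, not its limit. With the 256Mi request above and a 75% target, scale-out begins once average usage per pod crosses 192Mi:

```javascript
// Memory target is a percentage of the container's *request* (256Mi here),
// not its limit. The HPA starts adding replicas once average usage per pod
// exceeds request * target%.
const requestMi = 256;           // from resources.requests.memory
const targetUtilizationPct = 75; // from the HPA's averageUtilization
const scaleOutThresholdMi = requestMi * targetUtilizationPct / 100;
console.log(scaleOutThresholdMi); // 192
```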
Create an AWS access key and secret key, and configure them (for example with `aws configure`), so you can deploy the EKS cluster on AWS.
Create the EKS cluster for the HPA deployment using the eksctl command-line tool (install it first if you haven't):

```shell
eksctl create cluster --name hpademo --region us-east-1 --nodegroup-name hpanode --node-type t3.nano --nodes 3 --nodes-min 1 --nodes-max 4 --managed
```
Point kubectl at the new EKS cluster by updating your kubeconfig:

```shell
aws eks update-kubeconfig --region us-east-1 --name hpademo
```
Deploying the Application
- Build and Push Docker Image: Build your Docker image and push it to a container registry (for example, Amazon ECR), then update the image field in deployment.yml to match.
- Deploy on Kubernetes: Apply the Kubernetes configurations using `kubectl`:

```shell
kubectl apply -f deployment.yml
kubectl apply -f service.yml
kubectl apply -f hpa.yml
```
This setup ensures that your Node.js application on the EKS cluster can scale horizontally based on memory usage. Adjust the `averageUtilization` value in the HPA to suit your application's memory usage patterns. Remember that the HPA depends on the Kubernetes Metrics Server being installed in the cluster to collect resource metrics.
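To watch the HPA react, you can generate memory pressure inside the pods. A hypothetical helper (not part of the app above) that holds on to memory in fixed-size chunks:

```javascript
// loadtest.js - hypothetical helper that holds onto allocated memory so the
// pod's memory metric climbs. Each call allocates one chunk and keeps a
// reference to it so the buffer is not garbage-collected.
function allocateChunk(held, chunkMB) {
  held.push(Buffer.alloc(chunkMB * 1024 * 1024, 1)); // fill to force real allocation
  return held.length * chunkMB;                      // total MB currently held
}

const held = [];
console.log(allocateChunk(held, 10)); // 10 (MB held)
console.log(allocateChunk(held, 10)); // 20
```

Pair this with `kubectl get hpa nodejs-app-hpa --watch` to see the replica count change as memory usage rises and falls.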