rajatvarma/load-balancer-simulator
System Architecture

I use a VirtualBox VM running Debian 12 as my server machine. The machine is given 6 CPUs and 8GB of memory, along with 20GB of virtual disk space. The load balancer is written in Python, using Gunicorn as the server handler. To increase throughput, I implemented a multi-process load balancer, with shared resources and IPC via Redis. I use Locust as a load generator, and simulate a maximum of 100 users during the testing phase.

System Layout

Algorithms

Load Balancing

I implement two load balancing algorithms: a simple round robin one and one that looks for the lowest usage container.

Round Robin

This is a very simple algorithm, where requests are cycled through the containers. Given a list of containers $G$, the $n^{\text{th}}$ request is forwarded to the container $G[i]$, where $$i = n \bmod \mathrm{len}(G)$$
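The cycling above can be sketched as a small Python class (the container names here are placeholders, not the real worker identifiers):

```python
class RoundRobinBalancer:
    """Cycle requests through containers: request n goes to G[n mod len(G)]."""

    def __init__(self, containers):
        self.containers = containers
        self.n = 0  # running request counter

    def next_container(self):
        """Return the container for the current request and advance the counter."""
        container = self.containers[self.n % len(self.containers)]
        self.n += 1
        return container
```

For example, with three containers the sequence of targets repeats every three requests: `c0, c1, c2, c0, ...`.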

Container Cost

This algorithm measures the busyness of each container using the following busyness function: $$b(C) = C_{cpu\ use} + C_{memory\ use}$$ In the implementation, it adds the container's CPU usage percentage to the number of MBs of memory it is using to determine the container's cost $b(C)$, and sends the request to the container with the lowest cost. The idea is that it will not overload containers dealing with either a large watermark size or large images, and will instead send the request to a container that was dealing with a computationally smaller request.
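A minimal sketch of the cost-based selection, assuming per-container stats have already been collected (the `cpu_percent`/`memory_mb` field names are illustrative, not the project's actual schema):

```python
def busyness(stats):
    """b(C) = CPU usage (%) + memory usage (MB), per the cost function above."""
    return stats["cpu_percent"] + stats["memory_mb"]

def pick_container(container_stats):
    """Forward the request to the container with the lowest cost b(C)."""
    return min(container_stats, key=lambda name: busyness(container_stats[name]))
```

Note that the two terms are in different units (a percentage and megabytes), so the sum is a heuristic score rather than a physically meaningful quantity.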

Container Scaling

Response SLA based scaling

In this simple algorithm, the scaler considers a sliding window of 50 requests, and starts a new container whenever the average response time over the window rises above a pre-decided maximum. Once it starts a new container, it lets the system stabilize for 20 seconds before reconsidering starting a new container. This algorithm also shuts down one container for every 30 seconds of server idleness.
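The scale-up side of this can be sketched as follows; the 500 ms SLA threshold is an assumed value for illustration, while the 50-request window and 20-second cooldown come from the description above:

```python
from collections import deque

class SlaScaler:
    """Scale up when the mean response time over the last 50 requests
    exceeds the SLA threshold, with a 20 s cooldown after each scale-up."""

    WINDOW = 50
    SLA_MS = 500.0      # assumed SLA threshold (not specified in the text)
    COOLDOWN_S = 20.0

    def __init__(self):
        self.window = deque(maxlen=self.WINDOW)  # sliding window of response times
        self.last_scale = float("-inf")          # timestamp of last scale-up

    def record(self, response_ms, now):
        """Record a response time; return True if a new container should start."""
        self.window.append(response_ms)
        if len(self.window) < self.WINDOW:
            return False                          # window not full yet
        if now - self.last_scale < self.COOLDOWN_S:
            return False                          # still stabilizing
        if sum(self.window) / len(self.window) > self.SLA_MS:
            self.last_scale = now
            return True
        return False
```

The scale-down path (one container shut down per 30 seconds of idleness) would be driven by a separate idle timer and is omitted here.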

Request Rate based Scaling

This algorithm decides the number of containers to start based on the rate of requests received. The number of containers running at any time is based on the following equation: $$\text{number of containers running} = \frac{\text{requests per second}}{\text{predicted response time}}$$
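A direct transcription of the equation above, assuming the predicted response time is in seconds and rounding up so at least one container is always running (the rounding and floor-of-one are my assumptions, not stated in the text):

```python
import math

def containers_needed(requests_per_second, predicted_response_time_s):
    """Target container count = requests per second / predicted response time,
    per the scaling equation above, rounded up with a minimum of one."""
    return max(1, math.ceil(requests_per_second / predicted_response_time_s))
```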


Container Image

To build the container image, run the following command in the terminal:

podman image build --tag=worker:latest worker

About

FaaS Platform simulation using Python