Write Your Own Load Balancer: A Worked Example

I was out walking with a techie friend of mine I’d not seen for a while and he asked me if I’d written anything recently. I hadn’t, other than an article on data sharing a few months before and I realised I was missing it. Well, not the writing itself, but the end result.

In the last few weeks, another friend of mine, John Crickett, has been setting weekly code challenges via LinkedIn and his new website, https://codingchallenges.fyi/. They were all quite interesting, but one in particular on writing load balancers appealed, so I thought I’d kill two birds with one stone and write up a worked example.

You’ll find my worked example below. The challenge itself is in italics and the voice is that of John Crickett.

The Coding Challenge

https://codingchallenges.fyi/challenges/challenge-load-balancer/

Write Your Own Load Balancer

This challenge is to build your own application layer load balancer.

A load balancer sits in front of a group of servers and routes client requests across all of the servers that are capable of fulfilling those requests. The intention is to minimise response time and maximise utilisation whilst ensuring that no server is overloaded. If a server goes offline, the load balancer redirects the traffic to the remaining servers, and when a new server is added it automatically starts sending requests to it.

Load balancers can work at different levels of the OSI seven-layer network model; for example, most cloud providers offer application load balancers (layer seven) and network load balancers (layer four). We’re going to focus on a layer seven application load balancer, which will route HTTP requests from clients to a pool of HTTP servers.

A load balancer performs the following functions:

  • Distributes client requests/network load efficiently across multiple servers
  • Ensures high availability and reliability by sending requests only to servers that are online
  • Provides the flexibility to add or subtract servers as demand dictates

Therefore our goals for this project are to:

  1. Build a load balancer that can send traffic to two or more servers.
  2. Health check the servers.
  3. Handle a server going offline (failing a health check).
  4. Handle a server coming back online (passing a health check).

Step Zero

As usual this is where you select the programming language you’re going to use for this challenge, set up your IDE and grab a coffee (or other beverage of your choice).

If you’re not used to building systems that handle multiple concurrent events you might like to grab your beverage of choice and do some background reading on multi-threading, concurrency and asynchronous programming as it relates to your programming language of choice.


This is an easy choice for me as most of the work I’m doing at the moment is in Node and it supports the sort of concurrency I need. To make getting started easy, I created a Node project template which supports TypeScript, because all good developers love type safety, right?

https://github.com/pjgrenyer/node-typescript-template

Step 1

In this step your goal is to create a basic server that can start-up, listen for incoming connections and then forward them to a single server.

The first sub-step then is to create a program (I’ll call it ‘lb’) that will start up and listen for connections on a specified port (i.e. 80 for HTTP). I’d suggest you then log a message to standard out confirming an incoming connection, something like this:

./lb
Received request from 127.0.0.1
GET / HTTP/1.1
Host: localhost
User-Agent: curl/7.85.0
Accept: */*


For when you send a test request to our load balancer like this:

curl http://localhost/

Cool! Time to write some code. Fortunately, it’s really easy to create a program which listens for connections in Node using https://expressjs.com/, and really easy to get going by making a copy of the Node TypeScript template project I created:

https://github.com/pjgrenyer/coding-challenges-lb

And creating a new feature branch:

git flow feature start listener

Then all I needed was to install express:

npm install express --save
npm install @types/express --save-dev


and fire up an express server:

// index.ts

import express, { Request, Response } from 'express';

const port = 8080;
const app = express();

app.get('/', (req: Request, res: Response) => {
    res.send('Code Challenge!');
});

app.listen(port, () => {
    console.log(`Listening on port ${port}`);
});


run the app:

npm run dev

> coding-challenges-lb@1.0.1 dev
> npm run build && node dist/index.js


> coding-challenges-lb@1.0.1 build
> rimraf dist && tsc

Listening on port 8080


and check it works with curl:

curl localhost:8080

Code Challenge!


The code challenge suggests using port 80 for the load balancer, but I’m developing on Linux, where ports below 1024 need elevated privileges, so I’ve gone for port 8080 instead.

The code challenge suggests logging the request, which is easily done with node and express, by modifying the request handler:

app.get('/', (req: Request, res: Response) => {
    console.log(req);
    res.send('Code Challenge!');
});


I’ll not show the output here as, from express, it’s very large.
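
If you want output closer to the challenge’s example, a minimal sketch (using Express’s standard Request fields) could log just the request line and headers instead of the whole object:

app.get('/', (req: Request, res: Response) => {
    // Log an approximation of the challenge's suggested output
    console.log(`Received request from ${req.ip}`);
    console.log(`${req.method} ${req.url} HTTP/${req.httpVersion}`);
    Object.entries(req.headers).forEach(([name, value]) => console.log(`${name}: ${value}`));
    res.send('Code Challenge!');
});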

That’s it for this part of the step, so I committed the code:

git add .
git commit -m "feat: added listener"


And completed the feature branch:

git flow feature finish

before going back to the code challenge instructions.

Next up we want to forward the request made to the load balancer to a back end server. This involves opening a connection to the back end server, making the same request to it that we received, then passing the result back to the client.
In order to handle multiple clients making requests you’ll need to add some concurrency, either with your programming language’s async framework or threads.

I decided to use a modified version of my code from the beginning of this step as my backend. It was changed to respond with an HTTP status of 200 and the text: ‘Hello From Backend Server’.

I decided to do much the same thing. Node can handle multiple clients making calls out of the box. I created another new project:

https://github.com/pjgrenyer/coding-challenges-be

which is much the same as the load balancer, but with a few changes. I know from reading ahead in the code challenge I’m going to need to run the backend server on multiple ports, so I made that an environment variable and returned it as part of the response message:


// index.ts

import express, { Request, Response } from 'express';

const port = process.env.PORT ? +process.env.PORT : 8081;
const app = express();

app.get('/', (req: Request, res: Response) => {
    res.send(`Hello From Backend Server (${port})`);
});

app.listen(port, () => {
    console.log(`Listening on port ${port}`);
});


Now, if I run the app as normal it will run on port 8081 by default, but if I set the PORT environment variable:

PORT=8082 npm run dev

then the app will run on the specified port:

> coding-challenges-be@1.0.1 dev
> npm run build && node dist/index.js

> coding-challenges-be@1.0.1 build
> rimraf dist && tsc

Listening on port 8082


and the port number is returned as part of the response:

curl localhost:8082

Hello From Backend Server (8082)


Next, I need to modify the load balancer to call the backend server.

Node doesn’t support fetch natively like a browser’s implementation of JavaScript does, and even though there is a node-fetch package, I prefer Axios for no other reason than its implementation makes a little more sense to me:

npm i axios --save
npm i @types/axios --save-dev

With Axios installed I can call the backend directly and return the response:

import express, { Request, Response } from 'express';
import axios from 'axios';

const port = 8080;
const app = express();

app.get('/', async (req: Request, res: Response) => {
    const response = await axios.get('http://localhost:8081');
    res.send(response.data);
});

app.listen(port, () => {
    // eslint-disable-next-line no-console
    console.log(`Listening on port ${port}`);
});


Now with the backend running on port 8081 and the load balancer on 8080, making a request to the load balancer from curl gives the backend response:

curl localhost:8080

Hello From Backend Server (8081)
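
One thing worth noting: if the backend is down when a request comes in, the awaited axios.get will reject and the client won’t get a tidy response. The challenge doesn’t ask for this yet, so treat it as an optional sketch of guarding the proxy call:

app.get('/', async (req: Request, res: Response) => {
    try {
        const response = await axios.get('http://localhost:8081');
        res.send(response.data);
    } catch (error) {
        // The backend is unreachable or returned an error status
        res.status(502).send('Bad Gateway');
    }
});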


And now it’s on to step 2.

Step 2

In this step your goal is to distribute the incoming requests between two servers using a basic scheduling algorithm - round robin.

Round robin is the simplest form of static load balancing. In a nutshell it works by sending each new request to the next server in the list of servers. When we’ve sent a request to every server, we start back at the beginning of the list.
 

You can read more about load balancing algorithms on the Coding Challenges website.

So to do this we’ll need to do several things:

  1. Extend our load balancer to allow for multiple backend servers.
  2. Then route the request to the servers based on the round robin scheduling algorithm.


The code challenge goes on to suggest that a Python server could be used as the backend, but I’ve built my own which can be easily started on different ports, so I’m going to stick with that.

Although I came up with quite a few different ways of storing and iterating through backend server URLs, including using a database, the simplest idea I had is to maintain an array of the servers and, each time, take the server off the top, use it and push it back on at the bottom. Node modules allow arrays and functions to be easily shared between requests, so I created a new module with the array, initialised it and exported a function to get the next server:

// backend.ts

const backends: Array<string> = [
    'http://localhost:8081',
    'http://localhost:8082',
    'http://localhost:8083'
];

export const nextBackend = (): string | undefined => {
    const nextBackend = backends.shift();
    if (nextBackend) {
        backends.push(nextBackend);
    }
    return nextBackend;
};
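
Just to illustrate the rotation, calling nextBackend a few times (with the three URLs above) walks the list and wraps around:

nextBackend(); // 'http://localhost:8081'
nextBackend(); // 'http://localhost:8082'
nextBackend(); // 'http://localhost:8083'
nextBackend(); // 'http://localhost:8081' - round robin starts again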


Then I modified the request to get the next backend server and use its url each time a request is made:

import { nextBackend } from './backend';

app.get('/', async (req: Request, res: Response) => {
    const backend = nextBackend();
    if (backend) {
        const response = await axios.get(backend);
        res.send(response.data);
    } else {
        res.status(503).send('Error!');
    }
});


When you ‘shift’ (remove from the top) an element out of an array, you get undefined if the array is empty, so undefined is a possible value returned by nextBackend and must therefore be handled; in this case, by returning status code 503 and a simple error message.

This should be all which is needed, so I fired up three backend servers on three different ports:

PORT=8081 npm run dev
PORT=8082 npm run dev
PORT=8083 npm run dev


started the modified load balancer and called it a few times:

> curl localhost:8080

Hello From Backend Server (8081)

> curl localhost:8080

Hello From Backend Server (8082)

> curl localhost:8080

Hello From Backend Server (8083)

> curl localhost:8080

Hello From Backend Server (8081)


And it worked! I got a response from each of the backend servers in turn! It works from a web browser too.

Step 3

In this step your goal is to periodically health check the application servers that we’re forwarding traffic to. If any server fails the health check then we will stop sending requests to it.

For this exercise we’re going to use a HTTP GET request as the health check. If the status code returned is 200, the server is healthy. Any other response and the server is unhealthy and requests should no longer be sent to it.

Typically the health checks are sent periodically. I’d suggest you make this configurable via the command line so we can set a short period for testing - say 10 seconds. You will also need to be able to specify a health check URL.

So in summary the key tasks for this step:

  1. Allow a health check period to be specified on the command line.
  2. Every period, make a GET request to the health check URL; if the result is 200, carry on. Otherwise take the server out of the list of available servers to handle requests.
  3. If the health check of a server starts passing again, add it back to the list of servers available to handle requests.

It would be a good idea to run the health check as a background task, concurrently to handling client requests.


In my experience a health endpoint is usually a slightly oddly named endpoint with ‘health’ in the name. One I see frequently is ‘_health’. A health endpoint can be any endpoint which returns 200 and should have zero or minimal side effects. For example, it shouldn’t be calling a database, but it may log. The backend service already has such an endpoint, but that might change in the future, so I’m going to create a dedicated one.


app.get('/_health', (req: Request, res: Response) => {
    res.send();
});


Then, back in the load balancer, I need to call the health check endpoint on each backend every X seconds, where X is configurable. Creating a background task in Node is really easy using setInterval:

const healthCheckInterval = process.env.HEALTH_CHECK_INTERVAL
    ? +process.env.HEALTH_CHECK_INTERVAL
    : 10;


const timer = setInterval(() => {
    console.log('Health check!');
}, 1000 * healthCheckInterval);
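// unref() means this timer alone won't keep the Node process alive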
timer.unref();


Starting the load balancer will print the console message every 10 seconds. Next I need to get a list of the backends and call the health check endpoint on each. First I’m going to create a health check function and put it in a module of its own to keep things clean:

// healthChecks.ts

export const healthChecks = async () => {
    console.log('Health check!');
};


And then call it on startup, so the health check on each backend runs straight away, and then from the interval function:

// index.ts

healthChecks();
const timer = setInterval(async () => {
    await healthChecks();
}, 1000 * healthCheckInterval);
timer.unref();


Next I need a list of backends. The current backends array changes: backends are pulled off the top of the array and pushed back on at the bottom. In future they’ll also be removed from the array when the backend isn’t healthy, so I need a constant array of backends. I made a few changes for this:

// backend.ts

export const backends = [
    'http://localhost:8081',
    'http://localhost:8082',
    'http://localhost:8083'
];

let activeBackends: Array<string> = [];
backends.forEach((url) => activeBackends.push(url));

export const nextBackend = (): string | undefined => {
    const nextBackend = activeBackends.shift();
    if (nextBackend) {
        activeBackends.push(nextBackend);
    }
    return nextBackend;
};


Now the backends array is constant and a new activeBackends array is used for round robin load balancing. I also exported the backends array so that I can use it elsewhere.

Now I want to iterate through the backends:

// healthChecks.ts

import { backends } from './backend';

export const healthChecks = async () => {
    for (const backend of backends) {
    }
};


Next I want a function I can call to determine if the backend is healthy:

// healthChecks.ts

import axios from 'axios';

const healthCheckPath = process.env.HEALTH_CHECK_PATH ?? '_health';

const isHealthy = async (url: string): Promise<boolean> => {
    try {
        const response = await axios.get(`${url}/${healthCheckPath}`);
        return response.status === 200;
    } catch (error) {
        return false;
    }
};


The path to the health check endpoint should be configurable, so I’ve made it an environment variable. The response status from the endpoint is checked and if it isn’t 200 then the health check fails. If an exception is thrown, which can happen with Axios if the backend isn’t there, it’s caught and considered a health check failure.
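
As with the backend’s PORT variable, the path can be set when starting the load balancer. For example, assuming a backend exposed its health check at healthz instead:

HEALTH_CHECK_PATH=healthz npm run dev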

Now I can iterate through the backends, check each one and output a message about its health:

export const healthChecks = async () => {
    for (const backend of backends) {
        if (await isHealthy(backend)) {
            console.log(`${backend} is healthy`);
        } else {
            console.log(`${backend} is not healthy`);
        }
    }
};


This is the point where, if you’re like me, you fire up three instances of the backend (if you haven’t still got them running), fire up the load balancer and spend hours (well, several minutes) starting and stopping the backends and watching the health check messages:

Listening on port 8080
http://localhost:8081 is healthy
http://localhost:8082 is not healthy
http://localhost:8083 is healthy
http://localhost:8081 is healthy
http://localhost:8082 is not healthy
http://localhost:8083 is healthy
http://localhost:8081 is healthy
http://localhost:8082 is healthy
http://localhost:8083 is healthy


And while that’s a lot of fun, it’s not getting us anywhere as we’re not actually removing dead backends from the list or re-adding them when they’re revived. I added two new functions to do that:

// backend.ts

export const removeBackend = (url: string) => {
    activeBackends = activeBackends.filter((backend) => backend != url);
};

export const addBackend = (url: string) => {
    if (!activeBackends.find((backend) => backend == url)) {
        activeBackends.push(url);
    }
};


The removeBackend function simply filters the unhealthy backend out of the activeBackends array. addBackend looks to see if the backend already exists and only adds it to activeBackends if it doesn’t, so I don’t end up with duplicates.

Then, all that remains to do in this step is to use these methods in the healthChecks function:

// healthChecks.ts

export const healthChecks = async () => {
    for (const backend of backends) {
        if (await isHealthy(backend)) {
            console.log(`${backend} is healthy`);
            addBackend(backend);
        } else {
            console.log(`${backend} is not healthy`);
            removeBackend(backend);
        }
    }
};
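
As an aside, the await inside the for loop means the backends are checked one at a time. That’s fine for three backends, but with many more you could run the checks concurrently; a minimal sketch of the same function using Promise.all:

export const healthChecks = async () => {
    await Promise.all(
        backends.map(async (backend) => {
            if (await isHealthy(backend)) {
                console.log(`${backend} is healthy`);
                addBackend(backend);
            } else {
                console.log(`${backend} is not healthy`);
                removeBackend(backend);
            }
        })
    );
};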


Then restart the load balancer and test…

When it comes to testing this I suggest you start up a third backend server.

Then connect to your load balancer three to six times to verify that it is rotating through the backend servers as expected. Once you’ve verified that, kill one of the servers and verify the requests are only routed to the remaining two, without you, the end user, receiving any errors.

Once you’ve verified that, start the server back up, wait just a little longer than the health check duration and then check it is now back to serving content when requests are made through the load balancer.

As a final test, check your load balancer can handle multiple concurrent requests; I suggest using curl for this. First create a file containing the URLs to check - for this they’ll all be the same:

url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"
url = "http://localhost:8080"


Then invoke curl to make concurrent requests:

curl --parallel --parallel-immediate --parallel-max 3 --config urls.txt

Tweak the maximum parallelisation to see how well your server copes!

If that all works, congratulations, you’ve built a basic HTTP load balancer!

I tried all the tests against the load balancer and they all passed. I put more than 90 calls in the urls file and had 30 parallel calls before I got bored. It all worked really well.

Finally

Having built a rocking load balancer and got the recommended tests to pass, I’m going to leave this worked example here. However, the code challenge does suggest some further steps:

Beyond Step 3 - Further Extensions You Could Build

Having gotten this far you’ve built a basic working load balancer. That’s pretty awesome!

Here are some other areas you could explore if you wish to take the project further and dig deeper into what makes a load balancer useful and how it works:

  1. Read up about HTTP keep-alive and how it is used to reuse back end connections until the timeout expires (there’s a rough sketch of this after the list).
  2. Add some logging - think about the kinds of things that would be useful for a developer, e.g. which server did a client’s request go to, how long did the backend server take to process the request and so on.
  3. Build some automated tests that stand up the backend servers, a load balancer and a few clients. Check the load balancer can handle multiple clients at the same time.
  4. If you opted for threads, try converting it to use an async framework - or vice versa.
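
On the first of those, here’s a rough sketch of what connection reuse might look like with Axios and Node’s http.Agent (the file name and option values are illustrative, not tuned):

// keepAlive.ts (hypothetical)

import http from 'http';
import axios from 'axios';

// A keep-alive agent reuses TCP connections to the backends rather than
// opening a new one for every proxied request.
const keepAliveAgent = new http.Agent({ keepAlive: true, maxSockets: 50 });

export const proxyGet = async (url: string) => axios.get(url, { httpAgent: keepAliveAgent });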

These seem like a lot of fun, especially writing the automated tests. I may well return to this code challenge and progress it in the future.

Thank you to Michael Davey, John Crickett and Stephen Cresswell.

