Building my next HTTP server

Published on 2019-11-20

This article was originally published on dev.to.

So, I decided to hop into my next side project. I wanted to build something simple, but not in the traditional way I already know. I wanted to take a different approach from the one I would normally use.

What about an HTTP server? Seems like a good idea. I already know the nuts and bolts of the protocol and have built one in the past, so it would, in a sense, be a simple thing. You know, it's simple: open a TCP socket and wait for someone to perform a request. Once they're there, process the request and give back a nice response. Then, unless the connection is meant to be kept alive (the default in HTTP/1.1), close the client socket.

So far so good. The code would look something like this:

require "socket"

server = TCPServer.new 3000
loop do
  client = server.accept
  deal_with_http_stuff client
  client.close
end

(Oh, by the way, I'm using Ruby. It's kinda my favorite language nowadays.)

The magic function deal_with_http_stuff then just reads the HTTP request headers and sends the result back to the client.
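To make that a bit more concrete, here is a minimal sketch of what deal_with_http_stuff could look like. The header parsing, the fixed plain-text response and the Connection: close header are my own assumptions for illustration; request bodies, keep-alive and error handling are deliberately ignored.

```ruby
require "socket"

# Hypothetical sketch: read the request line and headers, then write back a
# small plain-text response. Not a complete HTTP implementation.
def deal_with_http_stuff(client)
  request_line = client.gets            # e.g. "GET / HTTP/1.1\r\n"
  return if request_line.nil?           # client went away without sending anything

  headers = {}
  # The header section ends at the first empty line.
  while (line = client.gets) && !line.strip.empty?
    name, value = line.split(":", 2)
    headers[name.strip.downcase] = value.to_s.strip
  end

  method, path, _version = request_line.split(" ", 3)
  body = "You asked for #{method} #{path}\n"

  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" \
               "#{body}"
end
```

Note that every client.gets and client.write here blocks until the network is ready, which is exactly the behavior the rest of this post is concerned with.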

Now, the deal_with_http_stuff function can be fairly simple. It can also get fairly complex, like when doing some FastCGI stuff or acting as a reverse proxy. Even so, that's not where the "different approach" I'm looking for would be. There is a different kind of complexity that can be added, and that is where I would like to experiment with a different approach.

But before diving that way, let's quickly analyze the solution above. Upon the connection of a client, the next steps are to receive the request information, process it and send back the results. It seems simple at first, but there already are some immediate issues.

The first one is that while one request is being read and processed, and its response is being sent back to the client, other requests have to wait in line. That is, this server can only process one request at a time. It lacks any parallelism or concurrency capabilities.

A common and relatively easy solution for this would be to delegate the deal_with_http_stuff processing to a separate processing thread or even a different process (by using fork). The threaded version of the code could look a bit like this:

require "socket"

server = TCPServer.new 3000
loop do
  client = server.accept
  Thread.new do
    deal_with_http_stuff client
    client.close
  end
end

The fork approach would look similar to that. But those approaches have their own issues as well. Also, that is not the way I want to go. I have been there in the past already and I'm looking for a different kind of fun!
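For completeness, a rough sketch of the fork variant. The helper name accept_and_fork and the block-based handler are made up for this example; the block stands in for deal_with_http_stuff.

```ruby
require "socket"

# Hypothetical helper: accept one connection and serve it in a child process.
def accept_and_fork(server, &handler)
  client = server.accept
  pid = fork do
    server.close           # the child never accepts new connections
    handler.call(client)
    client.close
  end
  client.close             # parent drops its copy; the child still holds its own
  Process.detach(pid)      # reap the child later so it does not linger as a zombie
  pid
end

# Usage (runs forever):
#   server = TCPServer.new 3000
#   loop { accept_and_fork(server) { |client| deal_with_http_stuff client } }
```

One of the issues hinted at above shows up right away: every connection costs a whole new process, and the parent has to remember to close its copy of the socket and reap its children.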

Enter asynchronicity

As you may already know, nginx is a well-known HTTP server. Somewhere on its wiki page, it says:

Unlike traditional servers, NGINX doesn’t rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture.

Right there. That is the approach I want to go with: asynchronous architecture.

But how is that different from the threaded approach? Threads are asynchronous citizens, aren't they? In fact, they are. One could argue, though, that in the MRI Ruby implementation threads are not really asynchronous, since MRI's global interpreter lock lets only one thread execute Ruby code at a time (in contrast to, for instance, JRuby, whose threads can run Ruby code in parallel on several CPU cores).

That is exactly where the difference lies. The threaded approach depends on the operating system and the CPU to manage the asynchronicity. The same applies to the fork approach. The developer writes code just as if everything were synchronous, and the OS, in partnership with the CPU, does the heavy lifting of scheduling different threads at different times onto different CPU cores.

The approach I want to use, which is what nginx uses, sort of moves that heavy lifting into the server itself. It is then the developer of the server who has to deal with the asynchronicity at all times (and by all times, I do mean it!). This way, it is possible to serve multiple requests concurrently using only one real thread, that is, a single CPU core.
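As a taste of what that could look like in Ruby, here is a rough single-threaded sketch built on IO.select. This is not how nginx is implemented, just the same general idea in miniature. The method name run_event_loop and the max_responses parameter are invented for this example (the limit only exists so the sketch can stop), and it cheats in two places: it answers as soon as any bytes arrive instead of parsing the request, and the write is still blocking.

```ruby
require "socket"

# Hypothetical single-threaded event loop: one thread watches the listening
# socket and every connected client, and only touches the ones that are ready.
def run_event_loop(server, max_responses:)
  clients = []
  served = 0

  while served < max_responses
    # Sleep until the server socket or at least one client has data to read.
    readable, = IO.select([server] + clients)

    readable.each do |io|
      if io.equal?(server)
        client = server.accept_nonblock(exception: false)
        clients << client unless client == :wait_readable
      else
        data = io.read_nonblock(4096, exception: false)
        next if data == :wait_readable   # spurious wakeup, try again later

        if data                          # nil would mean the client hung up
          io.write "HTTP/1.1 200 OK\r\nContent-Length: 3\r\nConnection: close\r\n\r\nok\n"
          served += 1
        end
        io.close
        clients.delete(io)
      end
    end
  end
end
```

A client that connects and then stays silent no longer holds up everybody else: the loop simply does not hear from it until it has something to say.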

Two things

The approach nginx uses, which I want to use, is really about two things: CPU and I/O. What this kind of concurrency really addresses is the fact that, with just regular programming, one of those is busy while the other sits idle.

For instance, reading a file from disk means telling the disk driver to fetch the contents of that file and then waiting for it to return them. It takes time. Likewise, sending a packet of data over the network means waiting for the network driver to do all the electrical signaling and other work required to deliver that packet. Setting a timer, like with JavaScript's setTimeout function, means just waiting for a certain amount of time to pass without doing anything useful at all.

The issue then is that while some I/O is being performed, the CPU is not being used. By doing the following:

require "json" # standard library, needed for JSON.load

content = File.read("/tmp/foo")

data = JSON.load(content)

puts(data['foobar'])

the File.read statement keeps the disk busy and leaves the CPU idle. It asks the disk for the file data and waits until the whole file has been read before proceeding to the next line of code. Once that is done, the data is handed to the JSON module, which parses the string in memory, producing Ruby objects along the way. The parsing itself is, of course, a CPU-intensive and I/O-free task. The last statement just prints the data. But pay attention here, because that puts call won't return until the output has been written, which also means waiting for I/O to finish.

What if we could minimize the total amount of time that either the CPU or the I/O sits idle? That is, keep both of them as busy as possible. While the CPU is already being used to the max, do some I/O. While the I/O is being used to the max, do some CPU-intensive tasks.
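A tiny sketch of that idea, under some loud assumptions: the method name read_all_interleaved is made up, the block stands in for "one slice of CPU work", and the loop simply polls, which a real server would avoid by sleeping in IO.select once it runs out of work.

```ruby
# Hypothetical sketch: read everything from `io` without ever blocking on it.
# Whenever no data is ready yet, run one slice of CPU work (the block) instead
# of waiting.
def read_all_interleaved(io)
  received = String.new
  slices = 0

  loop do
    chunk = io.read_nonblock(4096, exception: false)
    if chunk == :wait_readable
      yield                 # I/O is not ready: spend the time on the CPU
      slices += 1
    elsif chunk.nil?
      break                 # end of file, nothing more will arrive
    else
      received << chunk     # data was ready: take it and ask again
    end
  end

  [received, slices]
end
```

The return value pairs the data that was read with the number of work slices that fit in while the I/O was still pending.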

This is exactly my plan, and I hope to explain how that works in future posts.