The Compelling Case for Node.js

The basic premise of Node.js is that I/O is expensive, so a server can’t afford to block while waiting for it to complete.

Many traditional web servers adopt a one-thread-per-request approach, and any I/O (database, web service, file system call…) during the request blocks that thread of execution. This is inefficient because a thread that is blocked waiting for I/O can’t respond to other requests, yet it still ties up memory while it waits.

In short, dedicating one OS thread to each TCP connection is suboptimal.

The solution is not a mystery: non-blocking I/O. An asynchronous request is issued and nothing more is done with it until it completes, leaving the main thread of execution free to handle other requests.
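To make that concrete, here is a minimal sketch of a non-blocking file read in Node. The helper name nonBlockingRead and the scratch file are my own inventions for illustration; fs.readFile is Node's standard asynchronous read.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: records the order in which things happen.
function nonBlockingRead(file, done) {
  const order = [];

  // Issue the read and return immediately; the callback fires once the data is ready.
  fs.readFile(file, 'utf8', (err, data) => {
    if (err) return done(err);
    order.push('read finished: ' + data);
    done(null, order);
  });

  // Runs before the callback above: the thread was never blocked on the read.
  order.push('handled other work');
}

// Scratch file, created only so the sketch is self-contained.
const file = path.join(os.tmpdir(), 'node-nonblocking-demo.txt');
fs.writeFileSync(file, 'hello');

nonBlockingRead(file, (err, order) => {
  if (err) throw err;
  console.log(order); // [ 'handled other work', 'read finished: hello' ]
});
```

The "other work" is recorded first, even though the read was requested before it.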

Cooking Analogy

If you are a reasonably experienced cook, you’d go about making Pasta Bolognese something like this:

start boiling water
start frying ground beef
put pasta in water
finish meat sauce
serve once pasta and meat sauce are both done

But if you’re a rookie, you’d most likely end up with something like this:

start boiling water
wait for water to boil
put pasta in water
wait for pasta to cook
start frying ground beef
finish meat sauce
serve once meat sauce is done

It’s clear that you’re not using your culinary skills to your full potential with the rookie approach.

I know the analogy limps somewhat, but I think it still illustrates the basic problem with the “let’s wait for I/O to complete before we do anything else” approach.
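For what it's worth, the two cooks can be sketched in code. This is only an illustration: the task helper and the 100 ms durations are made up, and timers stand in for real I/O.

```javascript
// Hypothetical kitchen task: "takes" ms milliseconds without blocking the thread.
const task = (name, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(name), ms));

// Rookie: wait for each step to finish before starting the next one.
async function rookie() {
  const start = Date.now();
  await task('pasta', 100);
  await task('sauce', 100);
  return Date.now() - start;
}

// Experienced cook: start both, then serve once both are done.
async function experienced() {
  const start = Date.now();
  await Promise.all([task('pasta', 100), task('sauce', 100)]);
  return Date.now() - start;
}

(async () => {
  console.log('rookie ms:', await rookie());           // roughly 200
  console.log('experienced ms:', await experienced()); // roughly 100
})();
```

The total work is the same in both cases; only the waiting overlaps.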

That one OS thread per TCP connection is suboptimal has been known for quite a while. Apache and Tomcat have both traditionally used this model.

Solutions to the problem are well known and include:

multiplex I/O into each thread
use event notification mechanisms like epoll, kqueue, event ports
adopt non-blocking I/O
don’t share memory (or at least limit sharing)
spawn however many threads you want
start event loop on each thread

They all share the same basic principles:

event loop
non-blocking I/O
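As a rough illustration of the event loop idea, here is a toy model of my own (the ToyEventLoop class and its method names are invented). It is not how Node works internally; real Node delegates this to libuv and OS mechanisms such as epoll and kqueue. It only shows the shape: completed events are queued, and a single thread runs their handlers one at a time.

```javascript
// Toy model only, not Node's actual implementation.
class ToyEventLoop {
  constructor() {
    this.queue = []; // pending events, each with the handler to run
  }

  // Called when "I/O" completes: queue the handler instead of running it now.
  emit(handler, payload) {
    this.queue.push({ handler, payload });
  }

  // Single thread of execution: run one handler at a time until nothing is pending.
  run() {
    while (this.queue.length > 0) {
      const { handler, payload } = this.queue.shift();
      handler(payload);
    }
  }
}

const loop = new ToyEventLoop();
const seen = [];

loop.emit((msg) => {
  seen.push(msg);
  // Handlers may schedule follow-up work; it runs after events already queued.
  loop.emit((m) => seen.push(m), 'response for A');
}, 'request A');
loop.emit((msg) => seen.push(msg), 'request B');

loop.run();
console.log(seen); // [ 'request A', 'request B', 'response for A' ]
```

Request B is handled before A's follow-up, so no single request holds the loop while it waits.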

The Goal of Node.js

Make it easy to write high-performance servers.

Enter JavaScript, a language everybody “knows”. A generation of programmers has grown up learning to program in terms of ‘mouseover’ events. JavaScript is single-threaded and event-driven, it has no preconceived ideas about I/O, and right now there is an arms race among JavaScript engines for performance.

What is Node?

Node is a command-line tool that runs JavaScript (on Google’s V8 engine) for low-level work such as handling sockets and files. Its main characteristics:

exposes asynchronous (non-blocking) interfaces almost exclusively
has only one thread of execution (one call stack)
has low-level network features
strong HTTP support
purely non-blocking means decent concurrency, basically for free
no mutex locks
one thing at a time, means no thread safety issues
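Here is a minimal sketch of what that HTTP support looks like, using Node's built-in http module. The function names, the response text and the choice of port 0 (let the OS pick a free port) are mine, chosen so the example starts, makes one request against itself and exits.

```javascript
const http = require('http');

// One callback per request; no thread is created for the connection.
function createHelloServer() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello from Node\n');
  });
}

// Hypothetical demo: start the server, request it once, then shut it down.
function demo(done) {
  const server = createHelloServer();
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    // agent: false avoids connection reuse, so nothing lingers after the response.
    http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        server.close(); // stop listening so the process can exit
        done(res.statusCode, body);
      });
    });
  });
}

demo((status, body) => console.log(status, body.trim())); // 200 Hello from Node
```

Both the server and the client side here are callbacks on the same single thread.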

But hang on, if Node is single-threaded, what about scaling across multiple cores? The answer is simple: you start more Node instances. Given the single event loop and the potential for starvation (one long-running computation holds up every other request), Node is most likely not the solution for really CPU-intensive problems. It is a solution for I/O-bound problems.
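The starvation problem is easy to demonstrate. In this sketch (the function names and the 200 ms figure are just for illustration) a timer is due after 10 ms, but a busy loop keeps the only thread occupied, so the timer callback cannot run until the loop is done.

```javascript
// Simulated CPU-bound work: spin until ms milliseconds have passed.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // burn CPU; nothing else can run on this thread meanwhile
  }
}

// Hypothetical helper: reports how late a 10 ms timer actually fires.
function measureTimerDelay(blockMs, done) {
  const start = Date.now();
  setTimeout(() => done(Date.now() - start), 10);
  busyWait(blockMs); // monopolises the event loop
}

measureTimerDelay(200, (elapsed) => {
  console.log('10 ms timer fired after', elapsed, 'ms'); // at least 200
});
```

Every pending request in a real server would be delayed in the same way as that timer.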

When would I use Node?

I work mainly with web applications and the “modern Web Application” is a prime candidate for Node.

The Modern Web Application

has a JavaScript “fat” client
uses asynchronous client-server communication
uses JSON as the communication format
uses WebSockets or long polling where appropriate

Node excels at all of these things.
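As a small sketch of the JSON part, here is how a server-side handler might process a message from such a client. The handleMessage function and the message shape (a name field) are invented for illustration; JSON.parse and JSON.stringify are built into the language, so no conversion layer is needed.

```javascript
// Hypothetical handler: takes the raw JSON text a client sent
// and returns the JSON text to send back.
function handleMessage(rawJson) {
  let message;
  try {
    message = JSON.parse(rawJson);
  } catch (err) {
    return JSON.stringify({ ok: false, error: 'invalid JSON' });
  }
  if (message === null || typeof message !== 'object' || typeof message.name !== 'string') {
    return JSON.stringify({ ok: false, error: 'name is required' });
  }
  return JSON.stringify({ ok: true, greeting: 'Hello, ' + message.name });
}

console.log(handleMessage('{"name":"Ada"}')); // {"ok":true,"greeting":"Hello, Ada"}
console.log(handleMessage('not json'));       // {"ok":false,"error":"invalid JSON"}
```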

Finally, the idea of using the same language on both the client and the server is extremely attractive: code such as validation logic can, in principle, be shared between the two.