
Server-side I/O Performance: Node vs. PHP vs. Java vs. Go


Understanding the Input/Output (I/O) model of your application can mean the difference between an application that deals with the load it is subjected to, and one that crumples in the face of real-world use cases. While your application is small and does not serve high loads, it may matter far less. But as your application's traffic load increases, working with the wrong I/O model can get you into a world of hurt.

And like most any situation where multiple approaches are possible, it’s not just a matter of which one is better, it’s a matter of understanding the tradeoffs. Let’s take a walk across the I/O landscape and see what we can spy.

In this article, we’ll be comparing Node, Java, Go, and PHP with Apache, discussing how the different languages model their I/O, the advantages and disadvantages of each model, and conclude with some rudimentary benchmarks. If you’re concerned about the I/O performance of your next web application, this article is for you.

I/O Basics: A Quick Refresher

To understand the factors involved with I/O, we must first review the concepts down at the operating system level. While it is unlikely that you will have to deal with many of these concepts directly, you deal with them indirectly through your application's runtime environment all the time. And the details matter.

System Calls

Firstly, we have system calls, which can be described as follows:

  • Your program (in “user land,” as they say) must ask the operating system kernel to perform an I/O operation on its behalf.
  • A “syscall” is the means by which your program asks the kernel to do something. The specifics of how this is implemented vary between OSes, but the basic concept is the same. There is going to be some specific instruction that transfers control from your program over to the kernel (like a function call but with some special sauce specifically for dealing with this situation). Generally speaking, syscalls are blocking, meaning your program waits for the kernel to return back to your code.
  • The kernel performs the underlying I/O operation on the physical device in question (disk, network card, etc.) and replies to the syscall. In the real world, the kernel might have to do a number of things to fulfill your request including waiting for the device to be ready, updating its internal state, etc., but as an application developer, you don’t care about that. That’s the kernel’s job.

Syscalls Diagram

Blocking vs. Non-blocking Calls

Now, I just said above that syscalls are blocking, and that is true in a general sense. However, some calls are categorized as “non-blocking,” which means that the kernel takes your request, puts it in a queue or buffer somewhere, and then immediately returns without waiting for the actual I/O to occur. So it “blocks” for only a very brief time period, just long enough to enqueue your request.

Some examples (of Linux syscalls) might help clarify:

  • read() is a blocking call - you pass it a handle saying which file and a buffer of where to deliver the data it reads, and the call returns when the data is there. This has the advantage of being nice and simple.
  • epoll_create(), epoll_ctl() and epoll_wait() are calls that, respectively, let you create a group of handles to listen on, add/remove handles from that group, and then block until there is any activity. This allows you to efficiently control a large number of I/O operations with a single thread, but I'm getting ahead of myself. This is great if you need the functionality, but as you can see it's certainly more complex to use.

It’s important to understand the order of magnitude of difference in timing here. If a CPU core is running at 3GHz, without getting into optimizations the CPU can do, it’s performing 3 billion cycles per second (or 3 cycles per nanosecond). A non-blocking system call might take on the order of 10s of cycles to complete - or “a relatively few nanoseconds”. A call that blocks for information being received over the network might take a much longer time - let’s say for example 200 milliseconds (1/5 of a second). And let’s say, for example, the non-blocking call took 20 nanoseconds, and the blocking call took 200,000,000 nanoseconds. Your process just waited 10 million times longer for the blocking call.

Blocking vs. Non-blocking Syscalls

The kernel provides the means to do both blocking I/O (“read from this network connection and give me the data”) and non-blocking I/O (“tell me when any of these network connections have new data”). And which mechanism is used will block the calling process for dramatically different lengths of time.


Scheduling

The third thing that's critical to follow is what happens when you have a lot of threads or processes that start blocking.

For our purposes, there is not a huge difference between a thread and a process. In real life, the most noticeable performance-related difference is that since threads share the same memory, and processes each have their own memory space, making separate processes tends to take up a lot more memory. But when we're talking about scheduling, what it really boils down to is a list of things (threads and processes alike) that each need to get a slice of execution time on the available CPU cores. If you have 300 threads running and 8 cores to run them on, you have to divide the time up so each one gets its share, with each core running one thread for a short period of time and then switching to the next. This is done through a “context switch,” making the CPU switch from running one thread/process to the next.

These context switches have a cost associated with them - they take some time. In some fast cases, it may be less than 100 nanoseconds, but it is not uncommon for it to take 1000 nanoseconds or longer depending on the implementation details, processor speed/architecture, CPU cache, etc.

And the more threads (or processes), the more context switching. When we’re talking about thousands of threads, and hundreds of nanoseconds for each, things can get very slow.

However, non-blocking calls in essence tell the kernel “only call me when you have some new data or event on one of any of these connections.” These non-blocking calls are designed to efficiently handle large I/O loads and reduce context switching.

With me so far? Because now comes the fun part: Let’s look at what some popular languages do with these tools and draw some conclusions about the tradeoffs between ease of use and performance… and other interesting tidbits.

As a note, while the examples shown in this article are trivial (and partial, with only the relevant bits shown), database access, external caching systems (memcached, et al.) and anything else that requires I/O is going to end up performing some sort of I/O call under the hood, which will have the same effect as the simple examples shown. Also, for the scenarios where the I/O is described as “blocking” (PHP, Java), the HTTP request and response reads and writes are themselves blocking calls: again, more I/O hidden in the system, with its attendant performance issues to take into account.

There are a lot of factors that go into choosing a programming language for a project. There are even a lot of factors when you only consider performance. But if you are concerned that your program will be constrained primarily by I/O - if I/O performance is make or break for your project - these are things you need to know.

The “Keep It Simple” Approach: PHP

Back in the 90’s, a lot of people were wearing Converse shoes and writing CGI scripts in Perl. Then PHP came along and, as much as some people like to rag on it, it made making dynamic web pages much easier.

The model PHP uses is fairly simple. There are some variations to it but your average PHP server looks like:

  • An HTTP request comes in from a user's browser and hits your Apache web server.
  • Apache creates a separate process for each request, with some optimizations to re-use them in order to minimize how many it has to do (creating processes is, relatively speaking, slow).
  • Apache calls PHP and tells it to run the appropriate .php file on the disk.
  • PHP code executes and does blocking I/O calls. You call file_get_contents() in PHP, and under the hood it makes read() syscalls and waits for the results.

And of course the actual code is simply embedded right into your page, and operations are blocking:


// blocking file I/O
$file_data = file_get_contents('/path/to/file.dat');

// blocking network I/O
$curl = curl_init('http://example.com/example-microservice');
$result = curl_exec($curl);

// some more blocking network I/O
$result = $db->query('SELECT id, data FROM examples ORDER BY id DESC limit 100');


In terms of how this integrates with the system, it's like this:

I/O Model PHP

Pretty simple: one process per request. I/O calls just block. Advantage? It’s simple and it works. Disadvantage? Hit it with 20,000 clients concurrently and your server will burst into flames. This approach does not scale well because the tools provided by the kernel for dealing with high volume I/O (epoll, etc.) are not being used. And to add insult to injury, running a separate process for each request tends to use a lot of system resources, especially memory, which is often the first thing you run out of in a scenario like this.

Note: The approach used for Ruby is very similar to that of PHP, and in a broad, general, hand-wavy way they can be considered the same for our purposes.

The Multithreaded Approach: Java

So Java comes along, right about the time you bought your first domain name and it was cool to just randomly say “dot com” after a sentence. And Java has multithreading built into the language, which (especially for when it was created) is pretty awesome.

Most Java web servers work by starting a new thread of execution for each request that comes in and then in this thread eventually calling the function that you, as the application developer, wrote.

Doing I/O in a Java Servlet tends to look something like:

public void doGet(HttpServletRequest request,
	HttpServletResponse response) throws ServletException, IOException
{
	// blocking file I/O
	InputStream fileIs = new FileInputStream("/path/to/file");

	// blocking network I/O
	URLConnection urlConnection = (new URL("http://example.com/example-microservice")).openConnection();
	InputStream netIs = urlConnection.getInputStream();

	// some more blocking network I/O
}
Since our doGet method above corresponds to one request and is run in its own thread, instead of a separate process for each request which requires its own memory, we have a separate thread. This has some nice perks, like being able to share state, cached data, etc. between threads because they can access each other's memory, but the impact on how it interacts with the scheduler is still almost identical to what is being done in the PHP example previously. Each request gets a new thread and the various I/O operations block inside that thread until the request is fully handled. Threads are pooled to minimize the cost of creating and destroying them, but still, thousands of connections means thousands of threads, which is bad for the scheduler.

An important milestone is that Java gained the ability to do non-blocking I/O calls in version 1.4 (with a significant upgrade again in 1.7). Most applications, web and otherwise, don't use it, but at least it's available. Some Java web servers try to take advantage of this in various ways; however, the vast majority of deployed Java applications still work as described above.

I/O Model Java

Java gets us closer and certainly has some good out-of-the-box functionality for I/O, but it still doesn’t really solve the problem of what happens when you have a heavily I/O bound application that is getting pounded into the ground with many thousands of blocking threads.

Non-blocking I/O as a First Class Citizen: Node

The popular kid on the block when it comes to better I/O is Node.js. Anyone who has had even the briefest introduction to Node has been told that it’s “non-blocking” and that it handles I/O efficiently. And this is true in a general sense. But the devil is in the details and the means by which this witchcraft was achieved matter when it comes to performance.

Essentially the paradigm shift that Node implements is that instead of essentially saying “write your code here to handle the request”, they instead say “write code here to start handling the request.” Each time you need to do something that involves I/O, you make the request and give a callback function which Node will call when it’s done.

Typical Node code for doing an I/O operation in a request goes like this:

http.createServer(function(request, response) {
	fs.readFile('/path/to/file', 'utf8', function(err, data) {
		response.end(data);
	});
});

As you can see, there are two callback functions here. The first gets called when a request starts, and the second gets called when the file data is available.

What this does is basically give Node an opportunity to efficiently handle the I/O in between these callbacks. A scenario where it would be even more relevant is where you are doing a database call in Node, but I won’t bother with the example because it’s the exact same principle: You start the database call, and give Node a callback function, it performs the I/O operations separately using non-blocking calls and then invokes your callback function when the data you asked for is available. This mechanism of queuing up I/O calls and letting Node handle it and then getting a callback is called the “Event Loop.” And it works pretty well.

I/O Model Node.js

There is however a catch to this model. Under the hood, the reason for it has a lot more to do with how the V8 JavaScript engine (Chrome's JS engine that is used by Node) is implemented than anything else. The JS code that you write all runs in a single thread. Think about that for a moment. It means that while I/O is performed using efficient non-blocking techniques, your JS code that is doing CPU-bound operations runs in a single thread, each chunk of code blocking the next. A common example of where this might come up is looping over database records to process them in some way before outputting them to the client. Here's an example that shows how that works:

var handler = function(request, response) {

	connection.query('SELECT ...', function (err, rows) {

		if (err) { throw err };

		for (var i = 0; i < rows.length; i++) {
			// do processing on each row
		}

		response.end(...); // write out the results

	});

};

While Node does handle the I/O efficiently, that for loop in the example above is using CPU cycles inside your one and only main thread. This means that if you have 10,000 connections, that loop could bring your entire application to a crawl, depending on how long it takes. Each request must share a slice of time, one at a time, in your main thread.

The premise this whole concept is based on is that the I/O operations are the slowest part, thus it is most important to handle those efficiently, even if it means doing other processing serially. This is true in some cases, but not in all.

The other point, and while this is only an opinion, is that it can be quite tiresome to write a bunch of nested callbacks, and some argue that it makes the code significantly harder to follow. It's not uncommon to see callbacks nested four, five, or even more levels deep inside Node code.

We're back again to the trade-offs. The Node model works well if your main performance problem is I/O. However, its Achilles heel is that you can go into a function that is handling an HTTP request, put in CPU-intensive code, and bring every connection to a crawl if you're not careful.

Naturally Non-blocking: Go

Before I get into the section for Go, it’s appropriate for me to disclose that I am a Go fanboy. I’ve used it for many projects and I’m openly a proponent of its productivity advantages, and I see them in my work when I use it.

That said, let’s look at how it deals with I/O. One key feature of the Go language is that it contains its own scheduler. Instead of each thread of execution corresponding to a single OS thread, it works with the concept of “goroutines.” And the Go runtime can assign a goroutine to an OS thread and have it execute, or suspend it and have it not be associated with an OS thread, based on what that goroutine is doing. Each request that comes in from Go’s HTTP server is handled in a separate Goroutine.

The diagram of how the scheduler works looks like this:

I/O Model Go

Under the hood, this is implemented at various points in the Go runtime: the I/O call is performed by making the (non-blocking) request to write/read/connect/etc., the current goroutine is put to sleep, and the information needed to wake the goroutine back up is recorded so it can resume when further action can be taken.

In effect, the Go runtime is doing something not terribly dissimilar to what Node is doing, except that the callback mechanism is built into the implementation of the I/O call and interacts with the scheduler automatically. It also does not suffer from the restriction of having to have all of your handler code run in the same thread; Go will automatically map your goroutines to as many OS threads as it deems appropriate, based on the logic in its scheduler. The result is code like this:

func ServeHTTP(w http.ResponseWriter, r *http.Request) {

	// the underlying network call here is non-blocking
	rows, err := db.Query("SELECT ...")
	for _, row := range rows {
		// do something with the rows,
		// each request in its own goroutine
	}

	w.Write(...) // write the response, also non-blocking
}

As you can see above, the basic code structure of what we are doing resembles that of the more simplistic approaches, and yet achieves non-blocking I/O under the hood.

In most cases, this ends up being “the best of both worlds.” Non-blocking I/O is used for all of the important things, but your code looks like it is blocking and thus tends to be simpler to understand and maintain. The interaction between the Go scheduler and the OS scheduler handles the rest. It’s not complete magic, and if you build a large system, it’s worth putting in the time to understand more detail about how it works; but at the same time, the environment you get “out-of-the-box” works and scales quite well.

Go may have its faults, but generally speaking, the way it handles I/O is not among them.

Lies, Damned Lies and Benchmarks

It is difficult to give exact timings on the context switching involved with these various models. I could also argue that it’s less useful to you. So instead, I’ll give you some basic benchmarks that compare overall HTTP server performance of these server environments. Bear in mind that a lot of factors are involved in the performance of the entire end-to-end HTTP request/response path, and the numbers presented here are just some samples I put together to give a basic comparison.

For each of these environments, I wrote the appropriate code to read in a 64k file with random bytes, ran a SHA-256 hash on it N number of times (N being specified in the URL’s query string, e.g., .../test.php?n=100) and print the resulting hash in hex. I chose this because it’s a very simple way to run the same benchmarks with some consistent I/O and a controlled way to increase CPU usage.

See these benchmark notes for a bit more detail on the environments used.

First, let’s look at some low concurrency examples. Running 2000 iterations with 300 concurrent requests and only one hash per request (N=1) gives us this:

Mean number of milliseconds to complete a request across all concurrent requests, N=1

Times are the mean number of milliseconds to complete a request across all concurrent requests. Lower is better.

It's hard to draw a conclusion from just this one graph, but to me this suggests that, at this volume of connections and computation, we're seeing times that have more to do with the general execution of the languages themselves than with the I/O. Note that the languages which are considered “scripting languages” (loose typing, dynamic interpretation) perform the slowest.

But what happens if we increase N to 1000, still with 300 concurrent requests - the same load, but 1,000x more hash iterations (significantly more CPU load):

Mean number of milliseconds to complete a request across all concurrent requests, N=1000

Times are the mean number of milliseconds to complete a request across all concurrent requests. Lower is better.

All of a sudden, Node performance drops significantly, because the CPU-intensive operations in each request are blocking each other. And interestingly enough, PHP’s performance gets much better (relative to the others) and beats Java in this test. (It’s worth noting that in PHP the SHA-256 implementation is written in C and the execution path is spending a lot more time in that loop, since we’re doing 1000 hash iterations now).

Now let’s try 5000 concurrent connections (with N=1) - or as close to that as I could come. Unfortunately, for most of these environments, the failure rate was not insignificant. For this chart, we’ll look at the total number of requests per second. The higher the better:

Total number of requests per second, N=1, 5000 req/sec

Total number of requests per second. Higher is better.

And the picture looks quite different. It’s a guess, but it looks like at high connection volume the per-connection overhead involved with spawning new processes and the additional memory associated with it in PHP+Apache seems to become a dominant factor and tanks PHP’s performance. Clearly, Go is the winner here, followed by Java, Node and finally PHP.

While the factors involved with your overall throughput are many and also vary widely from application to application, the more you understand about the guts of what is going on under the hood and the tradeoffs involved, the better off you’ll be.

In Summary

With all of the above, it's pretty clear that as languages have evolved, the solutions for dealing with large-scale applications that do lots of I/O have evolved with them.

To be fair, both PHP and Java, despite the descriptions in this article, do have implementations of non-blocking I/O available for use in web applications. But these are not as common as the approaches described above, and the attendant operational overhead of maintaining servers using such approaches would need to be taken into account. Not to mention that your code must be structured in a way that works with such environments; your “normal” PHP or Java web application usually will not run without significant modifications in such an environment.

As a comparison, if we consider a few significant factors that affect performance as well as ease of use, we get this:

Language   Threads vs. Processes   Non-blocking I/O   Ease of Use
PHP        Processes               No                 -
Java       Threads                 Available          Requires Callbacks
Node.js    Threads                 Yes                Requires Callbacks
Go         Threads (Goroutines)    Yes                No Callbacks Needed

Threads are generally going to be much more memory efficient than processes, since they share the same memory space, whereas processes don't. Combining that with the factors related to non-blocking I/O, we can see that, at least with the factors considered above, as we move down the list the general setup as it relates to I/O improves. So if I had to pick a winner in the above contest, it would certainly be Go.

Even so, in practice, choosing an environment in which to build your application is closely connected to the familiarity your team has with said environment, and the overall productivity you can achieve with it. So it may not make sense for every team to just dive in and start developing web applications and services in Node or Go. Indeed, the difficulty of finding developers, or the unfamiliarity of your in-house team, is often cited as the main reason not to use a different language and/or environment. That said, times have changed a lot over the past fifteen years or so.

Hopefully the above helps paint a clearer picture of what is happening under the hood and gives you some ideas of how to deal with real-world scalability for your application. Happy inputting and outputting!

About the author

Brad Peabody, United States
member since January 9, 2017
Brad likes to build and improve software that solves real-world business problems and creates a positive experience for users, as well as having a positive business impact for the organization. He is inspired by a high-productivity/innovative work culture - walking the line between perfection and a getting-it-done mentality.


Interesting comparison. One little nitpick: Node doesn't "require" callbacks, certainly not since Promises became a native feature in ES6 and since v 7.6 it even has async/await (even though that's just a wrapper around promises).
Nice article, I would like that you also include as part of Servlet technologies son non blocking IO technologies like Netty, Vert.x and AKKA. Those are based on async calls and non blocking calls, Vertx. Uses a thread per core processor and takes both world advantages, Non Blocking IO / Async Calls and just a couple of threads. Best regards, Dimitri.
Srsly ? PHP v5.4.16; Apache v2.4.6 Your benchmark is unfair. You use newest version of node, java and go and oldest version of php and old apache. Use php 7.1 with nginx and show results ( I would do that but i have different machine and you did not provide source for benchmark files for me to redo all tests).
I would like to notice one thing. Each language has its own specific area of use. I am 100% sure that financial institution will not go with "non-blocking" languages even if they are "super-mega fast", because they need secure and consistent running cycle. While "messaging startups" can go with Go / Node, because it doesn't work with vital data.
Roland Harrison
You forgot writing async functions in c# Imo a great approach, reads like. Blocking code buy runs asynchronous.
Samuel Lawson
You should run those benchmarks with PHP7 now that it has gained wider acceptance in enterprise software. Given the reported 100% performance increase, it should give Go a run for its money.
Bar the comments about older versions of software I found the content of the article to be very informative. I was not surprised by the results but I did learn something about how the various environments work - which was useful. Thank you
Fadel Chafai
Man PHP7 release date is 3 December 2015
Qiong Wu
I am wondering, would you usually perform the hash operation within nodejs or wouldnt you write an interface for that since CPU intensive tasks are explicitly known to perform terribly in nodejs? how would nodejs compare in that case?
Great read. Thanks for the article Brad.
John Corry
php 7.1/nginx should show an improvement...but the results will be similar because of (as explained in depth in the article) PHP's IO blocking. The real takeaway from this is "PHP may be easy to use, but it's not 'performant'...Go is as performant as anything available".
Peter Kokot
Thanks for sharing these benchmarks. Always useful to see what is each platform doing good and not so good. I would add few notes for PHP that might change a lot. There is a Swoole extension for PHP. You might be surprised how fast that goes. several 10x faster than the usual setup as pointed above. But requires a bit of installation and adjustments since that is not traditional PHP application anymore.
Juan Pablo Carzolio
Thanks, Brad. Great read! I agree with some of the objections in other comments (Promises, PHP 7, etc.), but the explanations are very good and the article informative. I wasn't familiar with Go and find the concept quite interesting. The benchmarks are useful to give a rough idea - the exact numbers are not really that relevant IMHO.
Stas Slesarev
Did not found any word on how Node.js is running in your benchmarks. I mean, did you use clustering(e.g. run `pm2 start index.js -i 0` to use all CPUs) ? If not, then we could consider this benchmark as unfair for NodeJS, because Go uses all CPUs for his routines
Mike Critchley
Excellent read. And not just in a in a broad, general, hand-wavy way (LMAO @ that, btw). This isn't my field, but I definitely understand it a helluva lot better than I did 20 minutes ago thanks to this. Thanks for taking the time to write it up!
Use a nodejs cluster at least! How you can compare a multicore program with a single thread one? Everybody use nodejs clusters in production! This benchmark means nothing to me!
Ruan Kovalczyk
Who is Everybody? I do not know him. ;)
Great idea, that would be a good way to build a real-world Node app. This article would have been more transparent if the author had Fibonacci instead of SHA256 for this demo.
Julius Koronci
Great article..pretty happy with the PHP results..because adding a few more servers to PHP is still cheaper than developing with node or java :)
well you cannot always use promises to completely serialize things or you will need either a global object which will keep everything you need or returning an object all the time from promises to keep everything from the beginning so anyway even with promises things get complicated when the structure is complicated
are you drunk or what?
very useful comment. congratz. you are a real hero.
Ruan Kovalczyk
Thank you.
why do u think so ?
Ryan Winchester
Please include Elixir next time! - <a href="http://www.techworld.com/apps/how-elixir-helped-bleacher-report-handle-8x-more-traffic-3653957/">Bleacher Report goes from 150 servers to 5 moving from Rails to Elixir</a> - <a href="https://venturebeat.com/2015/12/18/pinterest-elixir/">Pinterest goes from 30 servers to 15 moving from Java to Elixir</a>
Thanks. True, promises can help with readability, but you still need a function that gets called back.
Mihai Tudor
Yes, this PHP benchmark looks like when dinosaurs where roaming around here. I don't say you should go with PHP when you are creating an intensely used API, but under an limit is an good competitor for everything else on the market. I will like to see this benchmark recreated with PHP 7.x.
Which mechanism are you referring to specifically? You're definitely right that there are ways to work around it and schedule CPU-intensive tasks so they do in fact run in parallel. But you also have to take into account the idea that if you have to use a thread pool to perform a simple operation, that's a lot of work on the developer's part to affect something simple. I'm not sure that accurately reflects how real world development typically goes. That said, you are correct that performance could certainly see a big improvement IF you do the extra dev work. It's easy to make calls in a handler function without knowing their CPU-intensity and inadvertently do what I've done here.
Alexander Roddis
https://uploads.disquscdn.com/images/90484bb6980dc60025ff6881661a55687b96bfbe0251fb27393a32fec18d6cd4.jpg To answer the PHP 7 comments: Bluntly, this is a valid criticism. But I also don't think it changes the overall point of the article, which is the models used, not the specific benchmarks. That said, CentOS (latest and greatest) comes out of the box with PHP 5.4. And PHP is also notorious for breaking things between versions (at least in my book and I've gone through the process with major apps multiple times) so getting things running on the latest version is not always that simple, there are a lot of PHP 5.x users still out there (https://seld.be/notes/php-versions-stats-2017-1-edition). I definitely concede that PHP 7 is bound to do better on performance, not including that is an oversight on my part.
Lol, there's truth to that :)
Valid point. I don't know for myself what the stats are on that but yes I'm sure a lot of production setups are clustered and this would certainly help relieve the CPU bottleneck. As an interesting note on this, it does have some side effects like the fact that as you spawn multiple processes you also make a copy of any per-process resources - overall process overhead memory, caches, mem-mapped files, etc. This is not necessarily a huge deal but in a complex app it could certainly add bloat that you didn't intend and wouldn't have when you just run a single copy of node.
Antonio Gallo
I hope he turned on XCache opcode caching at least :-P
Alexandr Cherednichenko
My two cents: In Node, you can also not use callbacks. With async/await and async generators, you can write really clean code. Of course, it is not a real "callback-less" flow, due to its internal implementation, but if we're talking about "Ease of Use," that doesn't really matter. And it's natively supported in Node 7. And with TS or Babel you can transpile to the previous versions you need to support.
It would be great to include Elixir in this. It's a different paradigm and different runtime model after all.
Give PHP4 a spin next.
Hi Ryan, we moved from PHP 5.6 to PHP 7, reduced the number of workers, and CPU usage on the workers is also half what we had with PHP 5.6. We average 1,500 hits per second on our HAProxy.
Wow!!! Insightful...wait a minute..."PHP v5.4.16; Apache v2.4.6"? @bradliusgp:disqus you really are a Go fanboi. But still a good read.
Alejandro Pablo
"An important milestone is that in version 1.4 Java (..) gained the ability to do non-blocking I/O calls." Yes, 15 years ago. Please provide the source code for each benchmark, without that this article does not make any sense.
Try it again with PHP-FPM 7.x.
Sid Wood
You shouldn't ever need to keep a global object. Checkout bluebird's .spread().
Kabir Baidya
When you say "secure and consistent running cycle," IMHO that's a relative statement. Could you elaborate on why you think Go / Node aren't "secure and consistent" enough for financial institutions in 2017?
Kabir Baidya
Node is as lightweight as PHP too, and hosting Node on servers isn't really expensive. You could buy the cheapest available servers from DigitalOcean or Linode and you're ready to go, and can scale up of course when the need arises.
When we are talking about financial operations, they require a few changes within a database (increment there, insert that, change something else, etc.). So there are a few things to overcome while using non-blocking languages: 1. Don't get lost in callbacks, because all of those operations should be called one by one. In case of failure, everything should be reverted. 2. Keep the order of requests (especially increments / decrements of the same balance from different resources (devices)). Also, I haven't come across any banking software, trading software, or even payment gateways that run on asynchronous languages — it's mostly Java (Python / C++ in some cases).
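A sketch of the two concerns above in async/await form. The in-memory `db` and `transfer` are hypothetical stand-ins; a real system would use actual database transactions rather than a snapshot:

```javascript
// Hypothetical in-memory "db"; `await` keeps the operations strictly
// ordered, and one try/catch reverts everything on failure.
const db = {
  balances: { a: 100, b: 0 },
  async update(acct, delta) {
    if (this.balances[acct] + delta < 0) throw new Error('insufficient funds');
    this.balances[acct] += delta;
  },
};

async function transfer(from, to, amount) {
  const snapshot = { ...db.balances }; // naive stand-in for a DB transaction
  try {
    await db.update(from, -amount);    // step 1
    await db.update(to, amount);       // step 2 runs only after step 1
  } catch (err) {
    db.balances = snapshot;            // revert on any failure
    throw err;
  }
}
```

This addresses the "don't get lost in callbacks" point for one request; ordering the same-balance updates arriving from different devices still needs queueing or locking at the database level.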
and compare it to plain ANSI C
You don't know how to use node.js
Julius Koronci
It is not the servers that are expensive; it's the development (developers, time, resources). That is why PHP takes 83% of the web.
I would argue with your position. It is relatively easy to use, for example, Python Tornado in the banking sphere without any drawbacks, at good speed, with high performance, and with a "secure and consistent running cycle." There are many ways to do so (e.g., with the help of some transaction blockers, etc.).
Andreas Galazis
Could you please post the code run for each language and the environment it was run in / how it was run (ideally linked in a repo)? Did you use clustering for Node? With how many workers?
I believe this is the more authoritative and most objective benchmarks on web frameworks: https://www.techempower.com/benchmarks/
Boban Petrovic
Performance of PHP + Apache depends a lot on Apache optimizations for the specific load, on the PHP version and settings, and on the opcode cache in use, so the benchmark here is not relevant. Also, a lot of people talk about nginx + php-fpm, but all of my single-PHP-page tests showed that apache + mod_ssl performs ~10% better. The best general site results were with nginx + apache + mod_ssl.
Abraham Sanchez
This article is amazing and so, so polemic, as I can see in the comments... xD I probably agree with some of the PHP people — it should be tested with PHP 7 and nginx. But this article helped me to understand a lot of things, and I'm still surprised by the power of Go. Thanks...
Decebal Dobrica
Great job at describing I/O and love the idea of comparing them across languages. Benchmarks are always bound to be subjective, I wouldn't mind conflicting comments around here. This article has good insights into I/O models for people that maybe are not looking around because they're busy doing something else. For anyone trying to contest the benchmarks or re-take them docker hub has all of these 4 languages available in official containers, easy to run and easy to switch between versions.
Ram Lal
PHP 5.4? Srsly?? :o You are comparing apples to oranges :) Have a look at http://rojan.com.np/scraping-nodejs-vs-php/#comment-1128148853 :) Then run a test on PHP 7.2 with opcache JIT using the swoole extension. This is still not apples-to-apples, but much closer :) Perhaps you can open-source your benchmark code as well :)
Jonathan Sterling
PHP takes 83% of the web because WordPress is so widespread. That is, people that don't even know PHP are using it and deploying the same core WordPress code over and over. WordPress sites are slow, clunky, and have security holes. Just because a language is widespread doesn't mean a larger percentage of developers know it, that it's cheaper and/or faster to develop in, or that it uses less resources. Further, if you look at the usage statistics for high traffic sites, Java and C# are more widespread than PHP ( https://w3techs.com/technologies/details/pl-php/all/all ). WordPress is a great option for smaller sites where performance isn't so critical, but when it comes to high-performance, multi-million-user, enterprise applications, it's not so widely used. Node, Go, Java, and C# are far more suitable in these situations, as the benchmarks above clearly show.
Julius Koronci
You are making a mistake in your assumption: while WordPress has more than 80 million websites, PHP has more than 800 million websites. I had a look at proper statistics and PHP is on top. Java and C# are used for corporate sites and banking systems, not high-traffic sites. And actually some banks are starting to adopt PHP, which is amazing, because the quality is the same but costs are 10 times lower — it is cheaper to throw in a few more servers than to hire 2 more Java devs for the next couple of years.
In fact, PHP supports non-blocking I/O as well. See https://github.com/amphp/amp + https://github.com/amphp/aerys.
Stefano Fratini
What I see here doesn't match what you see on https://www.techempower.com/benchmarks/ or my personal experience. Handling 5k concurrent requests with PHP? On one server?
Rossen Hristov
What about ASP.NET?
Jonathan Sterling
Ah yes, I'm wrong on the WordPress argument — you're right. That said, I think once you get past the entry-level PHP developers that are making simple sites, and start looking at experienced PHP developers that can build something like Facebook, the costs start to equalize. I know the owner of a PHP shop in Leeds, England that always laments that he can't find good PHP developers. He said the vast majority of people that come in to interview can only make simple sites, not the enterprise-grade stuff his company develops. So whilst PHP can definitely work on an enterprise scale (as we see at Facebook, Wikipedia, and others), at that level the costs start to equal those of Java/C#/Go developers anyway (just look at what Facebook pays its employees). For simpler stuff, I agree — no reason not to use PHP.
Oh... in my opinion the subtitle "Lies, Damned Lies and Benchmarks" would be a better title for the whole article. It looks like a set of examples from Stack Overflow. I have multiple questions for the author: - Why do you run the benchmark and the applications on one machine? The application and the benchmark must be started on different units. - Why do you read the whole file in your code for hashing? Byte buffers would be better. - Why do you use HTTP for benchmarking? A simple console app is a fairer way to create I/O benchmarks (no time spent on HTTP message parsing, starting an application server, and many other things). - Why didn't you implement a simple HTTP server for Java (as in the Go and Node examples) instead of using Tomcat? Junior developers can be confused by articles like this.
Niranjan Godbole
Go doesn't have callbacks and unlike java, its garbage collector is optimised for low latency ( < 1 millisecond). There are people already in finance using Go - Monzo bank.
Niranjan Godbole
Go is the clear winner here. :)
Julius Koronci
Yeah, people are always an issue — finding a proper developer is always hard no matter the language. I am not earning less than any Java developer, that is true. The real benefit of PHP is not in the salaries but in development time and the number of developers. PHP is just made for fast web development, and you can develop a website a hell of a lot faster than in Java or ASP. Also, finding proper developers in Java or ASP is harder than in PHP. If you have a bigger project, it is OK to have a few seniors and a lot of juniors; seniors are hard to find, but juniors are another thing. PHP is easy to learn and it's even easier to lead juniors to proper code, while with Java or ASP this is almost impossible — the languages themselves are very difficult and hard to learn, so having a team of juniors is more a hindrance than an asset. But to be fair, the experience of your friend is a common one: everyone can start writing PHP overnight, and even people with no programming experience can create a simple site within a week. This is good and bad at the same time, and there are hundreds of such people coming in for appointments.
Sadly it seems like the benchmarks have overshadowed the entire point of the article. Please read the article if you didn't. To answer your questions: "one machine", for practicality. Reading whole files and HTTP - because the point was to test I/O, not just the language performance itself. If anything, I should have removed the hashing. On the Java server I used Tomcat mainly because I felt it was more representative of an "average" Java deployment - I could be wrong on that, I didn't do a bunch of research on current Java deployment stats. And I'm not sure what the jab about SO copying is all about - those examples are intended to be as close as practically possible to equivalent functionality in each language, I wrote them as such.
Jimin Hsieh
If you want to compare I/O-intensive jobs in Java, no one would use servlets. There is Netty, Vert.x, and Akka HTTP.
Miguel García López
Benchmarking and comparisons aside (insightful as they are, however), I particularly appreciate the educational part on how sync vs. async I/O is ultimately driven down to the OS. This is something I keep finding people are not aware of (nor care about!) despite it being crucial to understanding it all. As already pointed out, for the Java world there is the Vert.x project (built on top of Netty for async I/O), which I definitely recommend taking a look into. Thanks for the post!
Danila Matveev
Terrible and harmful post. Please delete it for the sake of the greater good. 1) Random stacks are compared and then named after languages. 2) The environment configuration is missing. 3) There's a lack of understanding of comparable environments and practices. 4) Measurement details are missing but are extremely important, for example for Java.
Noir Alsafar
Can you post the source code that you benchmarked? It'd be helpful to see that you set up your test servers optimally and without bias.
See https://peabody.io/post/server-env-benchmarks/
Oleg Abrazhaev
Brad, please update your article with PHP 7.1 + php-fpm + nginx results. It's a shame for you and harms your reputation as a specialist to post results for such an old PHP 5.4 version with Apache =/
Oleg, there are many, many different configurations possible for all four languages discussed. As mentioned elsewhere, the PHP and Apache versions I used are the defaults that come with the latest CentOS. I think it would be very constructive to do further benchmarks with other common scenarios and link to them from here. If someone does a comprehensive enough set, I'm sure I can get the main article updated to include a link to it; otherwise, just include the link in the comments. But this is unfortunately not something I have time to work on. The code I used is here: https://peabody.io/post/server-env-benchmarks/ Please feel free to contribute your own comparisons.
Noir Alsafar
You set up the Node.js server wrong. You blocked the thread by using a sync version of sha256. You should use the vanilla native modules that come with Node.js to actually perform the test, and you should use the async version, not one that blocks the event loop. You know that. You should rerun your numbers. Your post is phenomenal except for this part. Happy to help if you want me to send you source code.
Oleg Abrazhaev
Take a look at this benchmark. I think this is the most accurate and well-known one: https://www.techempower.com/benchmarks/#section=data-r14&hw=ph&test=query In the latest round 14, raw PHP is faster than raw Go in the multiple-queries and fortunes benchmarks. This benchmark is reliable and done right. Your benchmark doesn't make sense if you took versions from different years for the languages — it's not accurate, it compares nothing, and if you look at the comments, many people think the same. Also, I came here from a Russian translation of this article on https://habrahabr.ru/company/mailru/blog/329258/ by mail.ru, and there in the comments people are also upset about the version of PHP you chose and why it's with Apache. In the real world nowadays, PHP developers use PHP 7 and nginx + php-fpm. And people in the comments are saying, like here, that the author isn't qualified, knows nothing about PHP, and that this comparison doesn't make sense. You see what I'm telling you? This article in its current state only harms your reputation in all the languages into which it's been translated. Even mail.ru realized that by translating this article they made a mistake. You need to add or fix the results for PHP; it can't be fixed otherwise.
Thanks Oleg. I do understand. The only things I can tell you are a) the article was never meant to be a comparison of general language performance — it's about I/O models and comparing how things work. And b) there are many possible criticisms (the PHP guys are after me for the version and Apache, the Node guys want to see it run in a cluster, and the Java guys think I should have used something that is natively NIO; the list goes on). And I will not be updating the article at all, simply because it was never meant to be a comprehensive comparison of language performance — again, it's about I/O models. I also think the text in the article makes that clear. I'm happy to see links to other information that provides other comparisons, and I appreciate your time in describing the situation, and to a degree I agree with you, but again, I won't be updating the article benchmarks. Regarding my reputation, the point still stands that the setup I used was the default with a major Linux distro. My intention was not to bash PHP by using an old version, but to use a simple, common setup, and simply doing "yum install" on a RedHat-based distro does constitute that. I think that's clear to anyone looking at it objectively. I'm sure many PHP devs will hate me for years to come because of it — comes with the territory. I don't hate them back — I welcome the additional data and views. Hopefully all that makes sense.
Thanks Noir, the source code is in the link above. Yes, you're right, I do know that, and I did feel I clearly stated that in the article text. The reason is that the article is about I/O models and how environments differ in their handling of them — in fact this difference was the point of the demonstration: to show how it works and what that difference is. I appreciate the input, but I will not be updating the article's benchmarks. However, if there are additional benchmarks that should be linked to in the comments, I think that's also a good way to provide counterpoints and more data. (If there is something that is really much more comprehensive on the same subject — comparing I/O models — I think it would make sense to update the article body with a link to it as well.)
Oleg Abrazhaev
It makes sense, thanks for the clarification. But still, more recent versions of Node and Java were used, and you can see that the Java and Node guys are satisfied in general; their requests to add clustering or use NIO look like additions and possible improvements. But with PHP it's completely screwed up — we're not even talking about additions there. The PHP version in "the latest CentOS" was always outdated. If you used the latest Ubuntu, for example, you would find PHP 7 there by default. Also, there are some well-known options for installing the latest PHP on an OS with an outdated package: there's Dotdeb for Debian — https://www.dotdeb.org/ — and the Sury PPA for Ubuntu — https://launchpad.net/~ondrej/+archive/ubuntu/php I don't know about equivalent sources for the yum world, as I'm mostly an apt user, but googling "php7 centos" gives me a solution in 5 seconds with only one additional yum command (install the rpm first). Look here — https://webtatic.com/packages/php70/ I understand that the point of this article is comparing I/O models and not language performance itself. But the main problem and issue here is a lack of preparation. For me, as an experienced programmer and engineer, it's obvious to spend at least 5-10 minutes reading about a technology I'm about to touch. If I were you, thinking about writing such an article, I would definitely have installed the latest versions of all compilers and interpreters, because for me as a programmer it's natural and obvious to work with the latest versions. Considering my clarification above, there is no real reason not to spend 10 minutes doing minimal research and installing the latest versions. That's the main issue here, and exactly this factor harms the article the most.
Oleg Abrazhaev
PHP 7 has almost doubled its performance compared to PHP 5.6, so we would definitely see different results. I'm not saying that PHP would win against Go, but at least it would lose with better results, not like in the article.
Oleg Abrazhaev
Here are his sources: https://peabody.io/post/server-env-benchmarks/
Oleg Abrazhaev
In general you are right, and the results would not be very different with PHP 7. But the thing that undermines this article's authority and meaning isn't the version of PHP itself; it's that it shows the author hasn't spent time preparing for this article. That fact harms the author's credibility and reputation. That's why it would have been better to use the latest versions — then there wouldn't be any arguments in the comments (which are now about 50% of all comments here, and it's the same situation under the Russian translation of this article). We are programmers and engineers, and this is the author's target audience. And this audience likes it when things are done smartly, precisely, and technically right.
Filip Petkovski
Nice article, thanks for investing the time and effort. However, I have to say I cringed when I saw your PHP setup. Your workflow was testing Apache, not PHP performance :) It is also a good practice to put labels on your chart axes so that they can be interpreted more easily.
Rohan Kapadia
Where can I see the date on which this article was posted? When it comes to benchmarking, old and stale information is worse than no information.
Dan Dollins
Excellent tutorial, so great and fluid : thanks !!
John Robie
Not using PHP7, and not even a mention of Elixir? I want my money back.
Marcin K
The Node process is relatively light: ~50 MB on startup, with a 1.4 GB maximum imposed by V8, depending on whether your code is abusing the GC or not. On a 4-core machine you should have 8 GB of RAM and start Node with pm2 (https://github.com/Unitech/pm2): pm2 start app.js -i 4 I would agree with phra, these benchmarks are pointless. They also do not test real-life applications; fetching something from a DB would be a more suitable test. In a CRUD-type app it is possible that Node would be an order of magnitude faster than PHP.
Wasin Thonkaew
Thanks for the article. It was enlightening; I wish I had found it sooner. For the testing, how do you simulate concurrent requests, i.e., aim at 3,000 or 5,000 concurrent requests? How do you achieve that exactly?
Steven Sagaert
I think the Java setup is not the most modern. Next to that classic "blocking io in threadpool" I'd like to see more modern JVM based server architectures like Akka based ones (either Java or Scala) or Vert.x (several JVM languages) or Kotlin coroutines. I'd also like to see Erlang web servers & Elixir+ Phoenix in the benchmark and .Net (with async/await).
Declan Nnadozie
All the same, you are GOOD :-p. From what I can see here, PHP and Java may not have the flexibility Node.js offers; writing C/C++ addons is relatively easier than Java's JNI/C interface. Once again thanks Brad, you expanded my horizons.
Adam Patterson
It comes down to the right tool for the job. If you're building an API, then maybe Python, Go, or Node make sense. But if you're tuning for high I/O on a website, then you need to consider the ecosystems' mature frameworks and the integration of front-end and back-end teams. I bet 9 times out of 10 you will have a PHP front end. Maybe you're building a web app; then again, you need to address those concerns. Brad mentioned this, but what kind of talent can you attract and retain in your area, and what kind of costs are associated with these environments? If you are going for I/O then you probably have some form of proxy caching or a load balancer, or both. I don't think you can look at a language's numbers alone and make a sound judgment call.
Alvaro Urbaez
It would be interesting to see what thread pool configuration you had in the Java test; that is a critical parameter when facing a bunch of traffic in a Java application. So I think these tests could really be improved, but at least they give a look at this matter. Thanks.
Michal Boška
With regards to java, have a look at https://www.playframework.com/ or http://vertx.io/ . It is not uncommon to write non blocking IO code in Java nowadays, having all the benefits of async IO processing (requiring only as many threads as CPU cores) and also benefits of multithreading (being able to share immutable data structures within a single process, not having to do IPC which often slows the process down etc) Would be interesting to see benchmark between these languages using Play or Vert.x with Java (instead of blocking servlet spec)
I opened this page because I was thinking "Should I learn a new language instead of continuing to use Go"... this made me have a laugh, I think I'll stick with Go
Hayden James
Plus PHP 5.5 has opcache enabled by default. So PHP 7 + opcache would be more like 5x faster than what was benched above.
Binh Thanh Nguyen
Thanks, nice post
And I'm also wondering who still uses Apache prefork with mod_php anymore?
Maxim Kuderko
PHP 7.x FPM with preforked workers + opcache + nginx, properly configured on a quad-core CPU, could go even further than 5k rps.
It appears that the author chose 5.4.16 because it is bundled in RedHat/Oracle/CentOS/Scientific Linux v7. # rpm -qa | grep php php-5.4.16-43.el7_4.x86_64 php-cli-5.4.16-43.el7_4.x86_64 php-common-5.4.16-43.el7_4.x86_64 I don't think Node or Go are bundled at all, but might be available in EPEL. RH7 bundles two JREs. # rpm -qa | grep ^java | sort java-1.7.0-openjdk- java-1.7.0-openjdk-headless- java-1.8.0-openjdk- java-1.8.0-openjdk-headless- javapackages-tools-3.4.1-11.el7.noarch
Tarun Ramakrishna
Frankly, when I last tried simple request/response benchmarking with some CPU activity in the request handling, the Java version smoked both node and go at high RPM after the JVM was sufficiently warmed up. Can you kindly share your source code for analysis ?
Arthur V Zhuk
Conclusion: No one agrees with the results.
WC Plunger
Promises are just callback wrappers...
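Exactly so — a sketch of wrapping a classic `(err, result)` callback API in a Promise by hand (`readConfigCb` is a hypothetical stand-in for any callback-style API):

```javascript
// Hypothetical callback-style API, e.g. something from an old library.
function readConfigCb(cb) {
  setImmediate(() => cb(null, { port: 8000 }));
}

// The Promise doesn't eliminate the callback; it just standardizes
// where the callback goes and how errors propagate.
function readConfig() {
  return new Promise((resolve, reject) => {
    readConfigCb((err, result) => (err ? reject(err) : resolve(result)));
  });
}
```

Node's `util.promisify` performs this same wrapping generically for any err-first callback function.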
Totally right; this benchmark does not even mention clustering Node.js, but talks about many web workers for Java and PHP. The author of this benchmark is unknowledgeable about Node.js in production. This is a real shame and removes any sense of a potential real-world benchmark.
Ruiheng Wang
it's Opcache now
...or Atari BASIC
Eddy WM
In Java these days, devs are using Vert.x, Akka, Reactor, and others, which are mostly event-driven, non-blocking, and adhere to the Reactive Manifesto. Servlets are the old way of doing backend systems in Java.
Jon Forrest
I suggest you go back and proofread this article. For example " your JS can that is doing CPU-bound operations".
Ahmad Samiei
Java is built for concurrent computing by default. Multithreading is one of the options you have, and you only go multi-threaded when it makes sense; it's not the default way to work with Java (it was popular many, many years ago). And your Java code is horribly old-school. The PHP version you're using is really, really old (2012), and nobody uses your setup with Apache anymore; PHP 7.x's raw performance is almost three times faster than the version you're using. Node had a significant performance improvement after version 8, but you're using v6.
Not even a mention of Elixir/Erlang ?? Wtf.
Dustin Deus
Hi, I would like to see how Node 9 performed (TurboFan) and then you can also use the native Crypto module.
Grant Jenks
Python comparison at https://gist.github.com/grantjenks/dacc0a1e7fa9a08264439b9c6a05ec5b The Python results are really good: 5,347 requests per second for Python vs 6,856 for Golang. I expected more from Golang. My guess is the benchmark is CPU-bound so all the time is spent reading the file and hashing it. For Python, that all happens in optimized C code.
What about WebSockets for PHP (http://socketo.me/)? PHP can do non-blocking I/O nowadays!
Paco Meraz
Ikr, no use of Elixir/Erlang, which would have shredded the others. Also no Ruby, no Python... nothing.
Rus Brushkoff
Node.js has async/await, which does not block. It seems like your synthetic tests are unaware of this: https://stackoverflow.com/questions/46004290/will-async-await-block-a-thread-node-js
Richard A
PHP 5.4 from 2012-2013, and Apache, vs. the rest. Fair fight... Why not PHP 7.1 and nginx?
giuseppe d
For a high request rate you should never use a single PHP backend app (it is not designed for that). You should use HHVM. Anyway, thanks for the clear explanation.
Sarath Uch
I realized that Go is compiled code, and it's a super amazing language with lightning performance.
Kolja Kutschera
To be fair, use Node.js with its native clustering to make use of all CPUs like the others, and then run the benchmark again. Please fix this post :D
Jonas Steinberg
I can't say for elixir, but I think Ruby and Python would have lost handily. Decent analysis (better than this, in general, I feel) is here (benchmarking setup seems better and more clear): https://www.linkedin.com/pulse/ruby-vs-php-python-simple-microservice-performance-comparison-per/
Jonas Steinberg
The explanation part I found very insightful, especially, believe it or not, from an operations perspective. My team has recently deployed ELK at scale and we found, basically for the first time ever, that tuning threads, IOPS, server specs, etc actually really, really matters at scale. And much of what you cover initially adds insight to what I've found for myself. I must say though: you lost me at the benchmarking. And I don't mean the apples/oranges/preferences arguments that seems to be afflicting everyone else, but rather the fact that, at least for me, you didn't elaborate on your benchmarking setup very well. I think the reason why there are so many criticisms in the comments is that this could've been a great piece of analysis, but ended up being fairly mediocre in the end, which is disappointing. But I think next time you'll nail it.
Jonas Steinberg
Additionally, according to https://www.techempower.com/benchmarks/#section=data-r15&hw=ph&test=fortune&b=2&s=1&l=gcrv5b NodeJS outperforms Go by 200% for fairly typical database queries and client responses. The setups were *highly* similar.
Ezequiel Valenzuela
Thank you for adding Python to the list. Having looked at your "gist" (and other comments here), I'd say that possibly the best framework and design (for Python) hasn't been put forward yet: tornado + a thread pool. The code would not be really that big, and it'd have the chance of performing the hashing in parallel (as the Python lock would be released just before performing the low-level "hashing"), so it'd benefit of the parallel execution that Go is doing for you (assuming it has been tuned to use the maximum number of threads it'd deem appropriate), making the comparison more representative of equivalent execution profiles.
Grant Jenks
I’m sure the Python code can be made faster just as the other solutions could be made faster too. But the point was to show how the simple solution was competitively fast. My solution already uses a pool of workers to process requests in parallel and completely sidesteps issues with the Python global interpreter lock. It’s in the last line where it says “workers=8”.
Ezequiel Valenzuela
I appreciate that. However, when you're "hammering" an HTTP server whose requests you're handling in your chosen language/framework (Python), having 8 "workers", each of them behaving (mostly) synchronously, is not going to do much to produce a low standard deviation in execution time for any given request. Even in the case where each "worker" is assigned a CPU thread (assuming you've got 8 CPU threads to play with), each of them might be busy doing CPU-intensive work (such as calculating a hash) instead of servicing the I/O-bound network traffic (or even file reading), which is ultimately a design inefficiency and, more importantly, adds latency to almost every request.

In that regard, my point was that tornado will deal with the I/O for each of those network connections (very) efficiently while freeing the CPU to do the CPU-intensive work. That way, you should basically be spending most of the CPU time doing the hashing (which, AFAIK, is implemented in C by the Python standard library), which is an inescapable cost. I'd guess that Python should do only slightly worse than a native (or JIT-ed) language/framework, to account for the Python bytecode execution.

Maybe having requests waiting to be serviced (in the "worker" queue) only makes those requests perform badly, while the ones being serviced perform very well (as they benefit from having the CPU to themselves), giving roughly the same mean for the sample set. But I'm not sure about that. It might need some measuring before venturing a verdict.
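A minimal sketch of the offloading this comment describes (the pool size and payloads here are illustrative, not taken from the article's gist): CPython's hashlib is implemented in C and releases the GIL while hashing sufficiently large buffers, so a thread pool really does hash in parallel across cores while the main thread (or an event loop such as tornado's) stays free for I/O.

```python
# Sketch: push CPU-heavy hashing into a thread pool so the main thread
# stays free for I/O. hashlib's C implementation releases the GIL while
# hashing large buffers, so these threads can hash in parallel.
import hashlib
from concurrent.futures import ThreadPoolExecutor

def sha256_hex(payload: bytes) -> str:
    # C-level hashing; the GIL is released for the duration of the digest
    return hashlib.sha256(payload).hexdigest()

# Illustrative 1 MB payloads standing in for request bodies
payloads = [bytes([i]) * 1_000_000 for i in range(8)]

with ThreadPoolExecutor(max_workers=8) as pool:
    digests = list(pool.map(sha256_hex, payloads))

print(len(digests))  # prints 8
```

In a tornado handler, the same pool would be used via something like `await IOLoop.current().run_in_executor(pool, sha256_hex, body)`, keeping the event loop responsive while the hashing runs on other cores.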
Grant Jenks
I added "japronto" to the comparison, which is on the bleeding edge of Python web server frameworks. Its performance is currently 13% faster than Go's. Its design is similar to bottle+gunicorn in that it uses a pre-fork worker model, but each of those workers uses async I/O (irrelevant for your benchmark) and a fast HTTP parser.
Could you share the source code you used to run these tests?
Dmitry Sokulev
Using curl_multi_exec with PHP will beat all of these.
Namjith Aravind
Our application is built on PHP 7, and what we've realized is that the real bottleneck is the blocking I/O calls to MySQL and Redis. We manage with persistent connections, since PHP does not have a connection pool (persistent connections close at the end of the process's life). What I'd like to know is: can language features like Go's goroutines or Akka HTTP's actors solve this with a connection pool, i.e., a small number of dedicated connections instead of one connection per request?
Have a look at the PHP extension called swoole [1], and then throw away nginx and Apache :-) Here's an example benchmark with PHP swoole wiping the floor with Node.js [2]. [1] https://www.swoole.com/index.en.html [2] https://www.w3c-lab.com/php-7-1-swoole-v1-9-5-vs-node-js-benchmark-test-php7-swoole-beats-node-js/
Namjith Aravind
Is it fine to use with Zend Framework 2?
Gary Fry
First of all, thank you, Brad, for spending your time on this research. Putting something out there for people to find and react to is how we get genuinely peer-reviewed results, and having people give you any feedback at all is a compliment. So well done! Clearly you're not massively familiar with the nuances of all of the languages in your benchmarks, but you've got a good instinct.

You mentioned "we’re seeing times that more to do with the general execution of the languages themselves, much more so that the I/O." You are right. In the case of Java, when you start Java code, it has to load a not-so-insignificant amount of code into memory from the file system before any code gets executed. That's an overhead which is a factor in any language, so timing from the shell would be bad; hopefully your timings were derived from code-land rather than script-land. Java also has a just-in-time (JIT) compiler that self-optimizes code execution paths after a number of iterations of a loop. If you allow the JVM to warm up (I think 10K iterations) - https://stackoverflow.com/questions/36370483/jvm-warmup-queries?rq=1 - then things get cooking, performance-wise. The garbage collector and heap size settings could also be tweaked to improve Java performance. No doubt Go and Node have a Swiss-army knife of performance enhancers up their sleeves too... so just saying. :-)

Getting each application to an optimal point where your benchmarks make sense is a challenge, but well worth the journey. There's clearly a demand for the work you're doing, so keep going!
I completely agree with this comment (I was actually checking whether it had been posted already). The author most probably doesn't know the subject very well, as using clustering is a basic scaling feature of Node.js. It is extremely easy to spin up workers, and relatively simple to set up separate child processes to handle heavier blocking computation tasks. Using a Node child process to send a scientific computation over to a separate Python interpreter and wait for the response is also common. All in all, a much more efficient utilization of the underlying CPU resources is possible than what is shown here.
To make this comparison fair, I hope you used a Node.js cluster; otherwise you would be running everything in a single thread. The idea of a Node.js cluster is that you run as many concurrent Node.js processes as you have CPU cores. This feature is built into the core packages of Node.js, and done the right way you need fewer than 10 lines of code to make use of it, without any changes to your source code. https://nodejs.org/api/cluster.html
Natalia N
It's a very comprehensive article, thank you for your work! Some say that Node is better for complex single-page applications, as they have heavy I/O workload characteristics. https://softmedialab.com/blog/nodejs-vs-php/ What do you think?
Lemuel Adane
I agree
Lemuel Adane
But if you put PHP 7 toe-to-toe with Go, Go will still win.
Robert Mennell
I found this in 2018 as a top search result. Don't come here. This benchmark is so badly written it doesn't even pass the test of "Is it fair" by a long shot. Can we get this taken down?
Kıran Ali
What about .NET Core? It's worth including here, I guess.
James Suárez
Your benchmark is not representative for real people. Many people speak badly of CPU-intensive work in Node because it has only one thread per core, but why not mention that Node.js has a cluster module? With a good cluster configuration (one worker per core), it can come close to Go or any other language with thread support in your benchmarks. I will try to run the benchmarks using a cluster and see whether there are noticeable differences.
James Suárez
You don't need to start a cluster for each request; forking one worker per core can beat the performance of the other languages you benchmarked. I always see people saying that Node.js is bad for CPU-intensive work, but is it really? With a good cluster setup you get the same performance other languages get using threads.
James Suárez
It's the same thing for Node.js. In the benchmark, all the languages get multicore support (via threads or processes), and Node.js also has multicore support available (via the cluster module), but it wasn't applied here. With a Node.js cluster you would get at least 2x the performance, depending on the number of CPU cores.
Thanks to all of you who provided constructive feedback and meaningful discussion on this topic. Please also note that responses which are primarily negative and not constructive don't help anyone. I wrote this article to provide information for people to use in their own projects, decision making, etc. - that's the point of the article. Opinions and feedback are welcome. But I don't feed trolls.
Well written! Performance was a primary consideration from the start of Go. Languages like Python and Ruby should not be mentioned in terms of non-blocking I/O: even if it were added tomorrow, it would still be a bolted-on concept, not part of the core language, leading to mediocre performance. For a high-scale, high-performance app, go with Go.
Hridyansh Thakur
How about binary?
Harish Sivakumar
Hey Brad, kindly give a quick view of the performance of REST API calls, with request/response times, for the following: Node.js vs. Django vs. PHP vs. Go vs. Java (JSP or servlet, even RMI). Thanks in advance. Regards, Hariharan
About the author
Brad Peabody
JavaScript Developer
Brad likes to build and improve software that solves real-world business problems and creates a positive experience for users, as well as having a positive business impact for the organization. He is inspired by a high-productivity/innovative work culture—walking the line between perfection and a getting-it-done mentality.