Given that a web application is correct, available, and secure, users care most about speed. They don’t care about your hardware utilization or how many requests each server handles, and they don’t care how fast your web app is on average; they care about how fast it is for them. Page load time and time-until-usable are what users are concerned about.
Looking at cnn.com, loading the page made 89 requests totalling a little less than 1MB, and it took 4.24s to load for me. Of those 89 requests, 3 were HTML requests from cnn.com (1 for the HTML page and 2 for weather). The HTML from cnn.com is about 30KB…out of 1MB! So if you want to speed up your site, where is the best place to focus? On the HTML from the server, which makes up 3% of the total weight, 3% of the total number of requests, and 9% of the total page load time? Or on reducing the number of requests, shrinking the assets, and caching those assets?
The HTML from cnn.com took 336 ms to get to me. Let’s say you made that 10x faster. You would then have cut the total page time by about 300 ms, or about 7% of the total page load time, and the page would still take about 4 seconds to load. You could have a 1 ms HTML response time and still have a slow site. HTML generation and return time is usually not where the problem is for web application performance.
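To make that arithmetic concrete, here is a small sketch using the numbers above. It assumes the whole HTML time sits on the critical path, so every millisecond saved there comes straight off the total; the function name and variables are just for illustration.

```python
# Figures quoted above for one cnn.com page load (my measurements).
total_load_ms = 4240   # 4.24 s total page load
html_ms = 336          # time for the HTML to arrive
speedup = 10           # hypothetical: HTML delivery made 10x faster


def load_after_html_speedup(total_ms, html_ms, speedup):
    """Return (new_total_ms, saved_ms, saved_fraction_of_total).

    Assumes the HTML time is fully on the critical path, which is the
    most generous case for server-side optimization.
    """
    new_html_ms = html_ms / speedup
    saved_ms = html_ms - new_html_ms
    return total_ms - saved_ms, saved_ms, saved_ms / total_ms


new_total, saved, frac = load_after_html_speedup(total_load_ms, html_ms, speedup)
print(f"saved {saved:.0f} ms ({frac:.1%}), new total {new_total / 1000:.2f} s")
# prints: saved 302 ms (7.1%), new total 3.94 s
```

Even in this best case, a 10x faster server leaves the page at roughly 3.9 seconds.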
Most of the assets on a web page are static (meaning they don’t change per request), so they can be served by a cache server (so the origin server isn’t hit) and by the browser cache (so not even the cache server is hit). The origin server can generate the HTML and serve up the static assets if needed, but it really shouldn’t do that very often, because the browser cache and cache servers should be serving them. So what you really need is a content server that is geared toward HTML generation, whether static or dynamic. The origin server then does two jobs: it generates dynamic but cacheable HTML (like templated but little-changing info pages), and it handles dynamic but non-cacheable HTML (like search).
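One way to express those three classes of content is through the Cache-Control response header. The sketch below is only an illustration: the `ContentKind` names and the specific max-age values are my assumptions, not recommendations for any particular site, though the directives themselves (`public`, `max-age`, `immutable`, `no-store`) are standard HTTP.

```python
from enum import Enum


class ContentKind(Enum):
    """Hypothetical labels for the three classes of content described above."""
    STATIC_ASSET = "static_asset"          # images, CSS, JS
    CACHEABLE_HTML = "cacheable_html"      # templated, little-changing pages
    UNCACHEABLE_HTML = "uncacheable_html"  # per-request pages such as search


def cache_control_for(kind: ContentKind) -> str:
    """Pick a Cache-Control header value for a response.

    The lifetimes are illustrative assumptions: one year for static
    assets, five minutes for cacheable HTML.
    """
    if kind is ContentKind.STATIC_ASSET:
        # Browser and cache servers may both keep it for a long time.
        return "public, max-age=31536000, immutable"
    if kind is ContentKind.CACHEABLE_HTML:
        # Short lifetime so edits show up reasonably quickly.
        return "public, max-age=300"
    # Dynamic, per-request HTML: nobody should store it.
    return "no-store"
```

With headers like these, the origin server only sees a static asset or an info page when a cache has nothing fresh to serve.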
The content server should need to do hardly any IO. Why would an HTML content server need to write to the filesystem? Even if it does, why does the web visitor need to wait on the result of that file write before seeing the server response? If you really need to write to the filesystem, spawn a thread or offload that operation to something else that can queue up write operations. Your content server doesn’t need to do the write itself; it just needs to invoke something else to do it.
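Here is a minimal sketch of that offloading idea, using only Python’s standard `queue` and `threading` modules. `BackgroundWriter` and `handle_request` are hypothetical names for illustration; a real setup might hand the write to a separate logging service or message queue instead.

```python
import queue
import threading


class BackgroundWriter:
    """Appends lines to a file on a worker thread so callers never wait on disk."""

    def __init__(self, path):
        self._path = path
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def submit(self, line):
        """Queue a line for writing and return immediately."""
        self._queue.put(line)

    def _drain(self):
        # Runs on the worker thread: pull lines off the queue and write them.
        with open(self._path, "a", encoding="utf-8") as f:
            while True:
                line = self._queue.get()
                if line is None:  # sentinel sent by close()
                    break
                f.write(line + "\n")
                f.flush()

    def close(self):
        """Write out everything still queued, then stop the worker."""
        self._queue.put(None)
        self._thread.join()


def handle_request(writer, user):
    """Hypothetical request handler: queue the write, respond right away."""
    writer.submit(f"visit from {user}")
    return f"<html><body>Hello, {user}</body></html>"
```

The handler returns as soon as the line is queued; the visitor never waits on the file write.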
If your content server is serving up dynamic content, what else can it be doing before it gets the data from the database? It’s primarily going to be formatting and creating presentation using the data from the DB, and if it has something it can be doing in the meantime I’m arguing it should be doing it. Something else can communicate with other services and cache HTML fragments or whatever. All the content server does is process content, so if it has to wait for the data, it waits.
But why would there be any IO for data that takes much time at all? If the data is so far removed from the presentation engine (the content server) that it blocks for any noticeable amount of time, you have a problem with data retrieval. The answer isn’t to create a callback for when the data finally arrives from the DB; the answer is to fix the problem of data coming back so slowly from the DB.