Reverse Proxy Performance – Varnish vs. Squid (Part 2)

In part one of this series I tested the raw throughput performance of Varnish and Squid. My results are consistent with all the blogs and comments floating around the blogosphere – Varnish blows away Squid.
Unfortunately, the first series of tests was somewhat uninformative. Because it only measured the raw performance of serving cached content from memory, it did not mimic a real-world scenario in which the proxy serves cached content while also fetching content from the backend and caching it.
While we would hope for a primed, full cache, that is unlikely to happen, and you will undoubtedly see a decent number of backend requests from your caching proxy.
A better test of the two proxies would involve a large set of random URLs, but not too random because we want to simulate both cache hits and cache misses. To accomplish this, I wrote a small PHP script that would take two parameters: total number of URLs to generate and the hostname for those URLs.
Generating a usable URL list
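Generating the list is simple. The script looks like this:
<?php
// makelist.php: print ?total=N random test URLs for the given ?host=
ob_start();
$total = (int) $_GET['total'];
$host  = $_GET['host'];
$random = $as = '';
for ($i = 0; $i < $total; $i++) {
    // mt_rand() seeds itself, so no srand() call is needed
    $random  = mt_rand(10, 40);   // response size in KB
    $random2 = mt_rand(1, 100);
    // Add the "as" parameter (cacheable) when the first digit is greater than 1
    if ((int) substr((string) $random2, 0, 1) > 1) {
        $as = "?as=$random2";
    } else {
        $as = "";
    }
    echo "http://$host/varnish/gen/$random$as\n";
    // Flush PHP's output buffer first, then the web server's
    ob_flush();
    flush();
}
// Repeats the last URL without a trailing newline (hence one extra URL)
echo "http://$host/varnish/gen/$random$as";
?>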
All this does is create a long list of URLs. I used PHP’s output buffering functions to flush the buffer as the list is generated, which is necessary when creating large URL lists so that you don’t wait forever for output. Maybe it could have been written better, but I don’t care; that wasn’t the point of this test.
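For anyone who would rather build the list without a web server in the loop, here is a rough shell equivalent. It is only a sketch and not the script used for these tests: the function name `gen_urls` and the optional seed argument are my own, and unlike the PHP version it does not emit the extra trailing URL.

```shell
#!/bin/sh
# gen_urls TOTAL HOST [SEED]
# Hypothetical local stand-in for makelist.php: prints TOTAL URLs in the
# same /varnish/gen/<size>[?as=<n>] format, one per line.
gen_urls() {
    awk -v total="$1" -v host="$2" -v seed="${3:-}" '
        BEGIN {
            # Seed once, outside the loop; a fixed seed gives a repeatable list.
            if (seed != "") srand(seed); else srand()
            for (i = 0; i < total; i++) {
                size  = 10 + int(rand() * 31)    # 10..40, like mt_rand(10, 40)
                asval = 1 + int(rand() * 100)    # 1..100, like mt_rand(1, 100)
                # Same rule as the PHP script: cacheable when the first digit exceeds 1.
                if (substr(asval, 1, 1) + 0 > 1)
                    printf "http://%s/varnish/gen/%d?as=%d\n", host, size, asval
                else
                    printf "http://%s/varnish/gen/%d\n", host, size
            }
        }
    '
}
```

Something like `gen_urls 10000 192.168.165.104:8080 > urls-10k.txt` would then produce a file in the same shape as the one used below.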
The URLs that are created are in the format of:
http://host/varnish/gen/50?as=100
http://host/varnish/gen/50
This URL is mapped to another PHP file that simply generates dummy data of the size specified in the URL. In the above cases, the files would be 50 KB in size. The query parameter “as” is just a useless piece of information that is meant to tell the proxy to cache the response. If the “as” query parameter does not exist, the proxy will forward the request to the backend and not cache it. It’s a simple way to generate cacheable and non-cacheable URLs.
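To make the URL convention concrete, here is a small sketch that reads the two pieces of information back out of a URL. The helper name `describe_url` is hypothetical; it simply restates the rule described above (the last path segment is the size in KB, and the presence of “as” marks the URL as cacheable).

```shell
#!/bin/sh
# describe_url URL
# Prints the payload size encoded in the path and whether the URL carries
# the "as" parameter that marks it as cacheable in this test setup.
describe_url() {
    url=$1
    path=${url%%\?*}        # drop the query string, if any
    size_kb=${path##*/}     # the last path segment is the size in KB
    case $url in
        *\?as=*) cacheable=yes ;;
        *)       cacheable=no ;;
    esac
    echo "size_kb=$size_kb cacheable=$cacheable"
}
```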
To generate the list and store it in a local file, I used this command:
curl "http://192.168.165.101/varnish/makelist.php?total=10000&host=192.168.165.104:8080" > urls-10k.txt
Verify the results of the script
For your own sanity, make sure that the script did in fact generate a list of URLs that suits your needs.
Count the number of URLs generated:
wc -l < urls-10k.txt
(Yes, I know it creates one extra URL. It’s fine by me.)
Count the number of cacheable URLs containing the “as” query parameter:
grep -c '?as=' urls-10k.txt
Count the number of unique cacheable URLs:
grep '?as=' urls-10k.txt | sort -u | wc -l
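The three counts can also be collected in one pass. The sketch below is my own helper (`summarize_urls` is a hypothetical name). It also prints a best-case hit ratio, which assumes every line is requested exactly once against a cold cache with no evictions. As I understand it, http_load picks URLs from the file at random, so treat that figure as a rough ceiling rather than a prediction. By my reading of the generator script, 88 of the 100 possible “as” values pass the first-digit rule, so about 88% of the list should be cacheable, and there can be at most 31 × 88 = 2,728 distinct cacheable URLs.

```shell
#!/bin/sh
# summarize_urls FILE
# Prints total, cacheable, and unique cacheable URL counts for a URL list,
# plus the best-case hit ratio for a single pass over a cold cache.
summarize_urls() {
    awk '
        { total++ }
        /[?]as=/ {
            cacheable++
            if (!($0 in seen)) { seen[$0] = 1; unique++ }
        }
        END {
            printf "total=%d cacheable=%d unique_cacheable=%d\n", total, cacheable, unique
            # Each unique cacheable URL misses once; every repeat can be a hit.
            if (total > 0)
                printf "best_case_hit_ratio=%.3f\n", (cacheable - unique) / total
        }
    ' "$1"
}
```

Running `summarize_urls urls-10k.txt` prints both lines for the generated file.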
Running the tests
In part one I used ApacheBench to load the servers, but for these tests I used Siege and http_load, both of which can load URLs from a file.
I started with Varnish using the following commands:
curl "http://192.168.165.101/varnish/makelist.php?total=100000&host=192.168.165.104:8080" > urls-100k.txt
http_load -parallel 10 -fetches 100000 urls-100k.txt
http_load -parallel 25 -fetches 100000 urls-100k.txt
http_load -parallel 50 -fetches 100000 urls-100k.txt
http_load -parallel 100 -fetches 100000 urls-100k.txt
http_load -parallel 200 -fetches 100000 urls-100k.txt
http_load -parallel 400 -fetches 100000 urls-100k.txt
In between each http_load command, I restarted the Varnish service so that each test ran with an empty cache. When I was done with the Varnish tests, I ran the same tests against Squid using the same commands above.
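The restart-then-load cycle is easy to script. The following is a minimal sketch under stated assumptions: the restart command (`service varnish restart`) is a guess at a typical setup and is not given in this post, and the `DRY_RUN` switch and function names are my own. Only the http_load invocation itself comes from the commands above.

```shell
#!/bin/sh
# Assumption: the restart command below is a placeholder; override
# RESTART_CMD to match how your proxy is actually managed.
RESTART_CMD=${RESTART_CMD:-"service varnish restart"}
URL_FILE=${URL_FILE:-urls-100k.txt}
FETCHES=${FETCHES:-100000}
LEVELS=${LEVELS:-"10 25 50 100 200 400"}

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# run_matrix: one http_load run per concurrency level, each against an
# empty cache because the proxy is restarted first.
run_matrix() {
    for parallel in $LEVELS; do
        run $RESTART_CMD
        run http_load -parallel "$parallel" -fetches "$FETCHES" "$URL_FILE"
    done
}
```

Calling `DRY_RUN=1 run_matrix` first shows exactly what would be executed. For Squid, point `RESTART_CMD` at whatever restarts Squid on your system.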
The results
The results of these tests represent the typical web application much better than the original tests did.
This first graph shows the average time for the proxy to accept a connection. As concurrency goes up, it is expected that the time to connect would go up too. Squid suffers more than Varnish does, but the difference is negligible.
The second graph is much more interesting. As concurrency goes up, the Time-To-First-Byte for Squid goes up very sharply while Varnish holds its ground and remains very quick around 25ms.
This third graph shows another interesting behavior. As concurrency goes up, Varnish begins to even itself out at just under 800 fetches per second, while Squid peaks at around 1100 fetches per second with around 50 concurrent connections and then sharply drops off as concurrency goes up.
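The three graphs correspond to three figures in http_load’s end-of-run summary: mean msecs/connect, mean msecs/first-response, and fetches/sec. Here is a rough sketch for pulling them out of a saved run. It assumes the summary layout that http_load normally prints (a “N fetches/sec, …” line and “msecs/connect:” / “msecs/first-response:” lines with the mean first), so check it against your own output before trusting it. The function name `parse_http_load` is hypothetical.

```shell
#!/bin/sh
# parse_http_load FILE
# Prints three space-separated values from a saved http_load summary:
#   fetches_per_sec  connect_ms_mean  first_response_ms_mean
parse_http_load() {
    awk '
        /fetches\/sec/            { fps = $1 }      # "<n> fetches/sec, ..."
        /^msecs\/connect:/        { connect = $2 }  # "msecs/connect: <mean> mean, ..."
        /^msecs\/first-response:/ { ttfb = $2 }     # "msecs/first-response: <mean> mean, ..."
        END { print fps, connect, ttfb }
    ' "$1"
}
```

Saving each run with something like `http_load -parallel 50 -fetches 100000 urls-100k.txt > run-50.txt` and then calling `parse_http_load run-50.txt` gives one row per concurrency level, ready to plot with concurrency on the X axis.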
Conclusion
Squid versus Varnish is just another holy war that may never end. The tests that I have performed have been very helpful for me and my team but your results may vary. Of course, there are many more things to consider and I plan to write about some of the major differences between Squid and Varnish.
My results show that in raw cache hit performance, Varnish puts Squid to shame. In real-world scenarios I found that Squid can hold its own when dealing with small amounts of traffic, but its performance drops off very sharply as it begins to handle more connections. Varnish handles them without breaking a sweat, as it was designed to do.
My next blog post will detail the differences between Varnish and Squid’s architecture, features, and the reasons I am pushing for Varnish in our environment.
Edit:
Some people are complaining in comments on Reddit and HackerNews that I have not provided any information about the hardware or operating system for my tests. This information was posted in Part one of this post.

September 9th, 2009 at 2:52 am
“I found that Squid can hold its own when dealing with small amounts of traffic, but its performance drops off very sharply as it begins to handle more connections. Varnish handles them without a sweat, as it was designed to do.”
I find that a bit disingenuous. As far as the graph depicts, even at the highest level of concurrency tested, Squid wins (at fetches/sec anyway). You could just assume that the number of fetches/sec will continue along the same trajectories, but without testing there’s really no evidence of that.
Also, if that drop is considered sharp, then Varnish’s drop from 1000 to 800 when going from 10 concurrent requests to 20 concurrent requests should be considered sharp as well, especially in comparison to Squid’s gaining in that same period.
Thank you for the information though, it may be quite useful.
September 9th, 2009 at 3:23 am
What’s up with your graphs? Vertical axis is… time, right? In ms? Scaled in proportion to total request number? Averaged across requests? I’ve got no idea what’s up with your horizontal axis–why not plot concurrency along X? That way we can get a good picture of how the two caching proxies scale.
You mentioned using siege in your performance tests, but only show commands for http_load. Did you combine results from both?
Did you observe number-of-request or order-of-url dependent behavior in your tests?
September 9th, 2009 at 4:01 am
Please post the settings you’re using to run varnish and squid. Your graphs don’t mean much without knowing how the servers are running.
September 9th, 2009 at 4:21 am
Nitpick: you should turn your line charts into a histogram with the number of http_load clients on the X-axis.
September 9th, 2009 at 5:07 am
The data supports using Squid. Throughput is a far more important metric, as long as first-byte response and complete response are fast enough. Users don’t notice 50ms of lag, and 200ms is just perceptible and certainly not going to affect how fast the user really sees the page; your site will still run pretty fast.
But being able to handle anywhere from 20-50% extra load is going to help more when your server is getting hammered, and it’s this figure you should be investigating further.
September 9th, 2009 at 7:56 am
From my point of view, the graphs are to be read like this:
This X axis represents the number of the test run. So the number 3 stands for test no. 3 which would be “http_load -parallel 50 -fetches 100000 urls-100k.txt”. So the higher the number on the X axis, the more concurrent connections.
As for the time to first byte: Yes, 25ms isn’t noticed by any user, but as you can see, Squid goes well beyond 250 ms in the scenario with the highest level of concurrency.
And for the fetches: As I understand these are fetches that get information from the web servers in the backend. I really can’t imagine how one would declare a system the winner that needs about 30 percent more backend calls to deliver content than the other contestant.
Now, having said the above, I have to say that I’m using both Squid and Varnish in some larger installations and found them both to perform very well. In my environment, Varnish has the advantage of putting less stress on the hardware (read: load on the system) than Squid does. Squid on the other hand has the advantage of being able to provide a distributed cache, meaning that using HTCP I can provide a system of distributed caches which together will act as one large cache whereas with Varnish right now all you can do is build caching islands.
So at the end of the day, both are just tools. And there are situations where Squid fits the job better and others where Varnish would be my tool of choice.
All in all, the results posted above reflect my personal findings quite closely. The only thing with synthetic tests is that they are, well, synthetic.
As for the tests run here, my objections would be that they consisted of just flat (HTML?) content, whereas on the real internet you’d have CSS, JS, graphics and video, gzipped content and whatever else. Also, the tests didn’t take into account that Varnish will save copies of the cached object based on the user agent string presented by the client, while Squid to my knowledge will happily ignore this. This results in Varnish normally requiring more cache storage for the same amount of information. Also not taken into account is the header mingling and munging often done in real-world setups, which Squid sort of does automatically while you’d have to tell Varnish to do so. So in an out-of-the-box setup, Varnish would obviously have an edge here, which is maybe reflected in the TTFB measurements above.
September 10th, 2009 at 12:34 pm
@Tyler
The graphs depict that Squid has a higher fetch rate than Varnish does, but as concurrency increases, that higher rate drops. More importantly, the time to first byte increases sharply as concurrency increases with Squid. That is more important because the whole purpose of a reverse proxy is to quickly serve up content to your visitors without hitting your app server.
If the time to first byte slows down as more users hit your server, then using Squid doesn’t have much of a point, at least in terms of speed.
@John Adams
Which settings would you like to see? Proxy configuration files? Server configs?
@Jacques Chester
Thanks – I’ll try that and post them when I get the chance to do so.
@Paul
Throughput is just as important as time to first byte and complete response time. They are all important to the user, though the user doesn’t know it. All the visitor knows is “this site took too long to load” and all of those things play into it.
I only tested up to 400 concurrent users, but if I went up to 800 or 1200 concurrent users, Squid’s TTFB would go up to (or possibly over) 1 second which would defeat the whole purpose of using Squid. Varnish on the other hand barely slowed down.
As Stefan says they “both are just tools” and either of them should help you out quite a bit. Varnish has the benefit of working on modern hardware with multiple CPUs and multiple cores while Squid does not (at least that’s my understanding, correct me if I am wrong).
For us, this is important because to get the same performance out of Squid, we would need several Squid boxes. That means purchasing more machines, taking up more rack space, and eating up more power at our datacenter. Those things add up to a bunch of money. Because of this, Varnish would be cheaper. Of course that is just one way of looking at it, and it would be easy to find arguments in favor of Squid. It’s a religious war.
Use what is best for you, but make sure you do your own testing. Don’t rely on mine or anyone else’s. I’ve done tests that are in line with the things that are important to our business; your goals may differ.
October 1st, 2009 at 11:21 am
You appear to have a graphing failure. Firstly, your X axis should be:
10 25 50 100 200 400
and the concurrency plot removed.
Secondly, your graphs are not continuous, and should not be line graphs. They should either be bar charts or just have the points plotted, but not connected.
October 11th, 2009 at 3:30 pm
Comparison of the wild
http://community.livejournal.com/ru_root/1867961.html
October 12th, 2009 at 11:45 pm
qiu zhi xing pao cun bu le.
October 16th, 2009 at 12:33 pm
I’d be curious to see a comparison of Squid and Varnish that deals with maximum concurrent connections. I currently use Squid as a simple reverse proxy (no caching) to insulate Apache from slow clients. It can handle ~5000 concurrent connections without problems. Unfortunately, Squid tops out around 3500 requests per second due to its high CPU usage and single-process, single-thread architecture.
Varnish, on the other hand, was able to do 13,000 requests per second in the same test setup, but when I tried it in a real-world environment it fell flat on its face with thousands of concurrent connections.
Anyone have any good tips for varnish running in simple reverse-proxy mode, and how to have it handle 10,000+ concurrent connections?
October 16th, 2009 at 12:52 pm
@John
That is a good question. I know both Facebook and Twitter use Varnish and I am sure they both handle an immense amount of traffic to their servers. I wish we could get that answer from them.
October 27th, 2009 at 4:47 pm
http_load in -parallel mode tests how a server handles overload, not its capacity.
In other words, you’re showing (very effectively) how the servers handle a load greater than they’re able to, which isn’t necessarily at all related to how they’re able to handle well-sized loads.
While overload metrics are important, they’re not really giving any indication of how fast the server actually is; you’re only showing what happens once you exceed the servers’ capabilities.
To do that, you need to run a series of tests at different request rates, e.g., using autobench.
A few other metrics to look at: how different response sizes are handled, and proxy performance vs. cache performance.
October 27th, 2009 at 8:35 pm
@Mark
Thanks for the pointer. I’ll rerun my tests using your recommendations.
November 7th, 2009 at 4:30 am
Did you publish the configuration somewhere? I seriously doubt Squid’s concurrency falls off so sharply. I’ve been running Squid and Lusca in production with between 10 and 20,000 concurrent connections over a WAN with no noticeable drop in connection timing.
You’re also not testing a “WAN” scenario well enough. You should introduce client and/or server side latency; packet loss; experiment with larger request and response bodies, etc, etc. You may find Varnish drops off quicker than you’d expect.
December 30th, 2009 at 12:54 am
Hey Brian,
Can u tell me how i can calculate the number of
— Maximum no.of Concurrent users during a time period(a day/a month)
— Maximum no.of concurrent sessions during a time period (a day/ a month)
Is it possible to calculate this from the access.logs?
Cud u tell me if u know any formula with which i can calculate the above.
Thanks
Nix
February 2nd, 2010 at 7:32 am
Great post! I just subscribed to your RSS feed. Your site is kinda messy in my browser. I used Konqueror. Just to let you know.
February 2nd, 2010 at 7:35 am
Thanks Ronald!
I’ll check it out in Konqueror, though that’s not the most common browser.
March 2nd, 2010 at 8:00 am
@Bryan
“I know both Facebook and Twitter use Varnish and I am sure they both handle and immense amount of traffic to their servers. I wish we could get that answer from them.”
There’s some information on these subjects in the list archives, for instance, John from Twitter posted about their setup here: http://projects.linpro.no/pipermail/varnish-dev/2009-February/000968.html
You’ll also want to look at Kristian’s blog; among other things, he describes his 143k req/s setup: http://kristian.blog.linpro.no/2010/01/13/pushing-varnish-even-further/
April 10th, 2010 at 4:05 am
Can you please do the same thing you’ve done on Squid vs. Varnish and do one for Lusca vs. Varnish?
Lusca is supposedly an improved port of Squid 2.
Thank you !