Reverse Proxy Performance – Varnish vs. Squid (Part 1)
Typical web applications require dozens of SQL queries to generate a single page. When your application is serving over 1,000,000 pages per day, you quickly realize that the performance bottleneck is your database. The typical answer to slow database queries is “just use memcached!” Memcached and other data caches can only take you so far. This is where reverse proxies come in. There are a handful of them out there, including Nginx, Perlbal, Squid and Varnish. Which to use is up to you.
Deciding what is best for you
Assuming that you have taken a step back and really analyzed your problem first, the next step is to analyze the possible solutions. For us, Varnish seems like the best option with Squid close behind. To be fair, I’ve set up a test server with both Varnish and Squid running. I’ll use ApacheBench to generate load and requests.
I’ve analyzed our pages to see what the typical page size is and recorded the average page sizes for 5 different page types. They range from around 10KB to 35KB (gzipped). For my test, I’ll be benchmarking with 10KB, 15KB, 20KB, 25KB, 30KB, 40KB, and 50KB files to get a good range of request sizes.
To test under different load levels, I’ll use ApacheBench to generate load with varying numbers of concurrent users, ranging from 10 to 400.
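A minimal sketch of the resulting test matrix, assuming the file sizes and concurrency levels are the ones that appear in the results table. The names `FILE_SIZES_KB`, `CONCURRENCY_LEVELS`, and `benchmark_matrix` are illustrative, not part of any tool used here.

```python
from itertools import product

# File sizes (KB) and concurrency levels as they appear in the results table.
FILE_SIZES_KB = [10, 15, 20, 25, 30, 40, 50]
CONCURRENCY_LEVELS = [10, 25, 50, 100, 200, 400]


def benchmark_matrix(sizes=FILE_SIZES_KB, levels=CONCURRENCY_LEVELS):
    """Return every (file_size_kb, concurrent_users) pair to benchmark."""
    return list(product(sizes, levels))


matrix = benchmark_matrix()
print(len(matrix), "benchmark runs per proxy")
print("first:", matrix[0], "last:", matrix[-1])
```

Each pair is run once against Varnish and once against Squid, which is why every row of the table carries two sets of numbers.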
The test
I’ll be using two identical machines on the same local class C network to eliminate (as much as possible) network latency.
The machines look something like this:
- Pentium 4 3GHz (8KB Level 1, 512KB Level 2)
- 2GB RAM (4×512MB DDR 400MHz)
- 120GB ATA Western Digital Caviar WD1200JB
- CentOS 5
(I don’t have more information than that. Suffice it to say that they are a few years old and not very powerful.)
I am using Varnish 2.0.4 and Squid 2.6.STABLE21. There are newer versions of Squid, but I am using this one because the 3.x branch is missing features found in the 2.x branch and I have read several reports of 2.7 crashing.
The command to run the load test looks something like this:
ab -c concurrent_users -n total_requests "url"
This will let you specify how many concurrent users to run and how many requests to make. I have the proxy servers running on ServerA and I run the benchmark from ServerB.
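As a rough sketch, the command can be assembled in a script and the figures of interest pulled from ApacheBench's text report. The helper names are hypothetical. The sample report text is an illustrative fragment in the format `ab` prints, built from the first Varnish row of the table rather than captured from a real run.

```python
import re


def ab_command(concurrent_users, total_requests, url):
    """Build the argv list for one ApacheBench run."""
    return ["ab", "-c", str(concurrent_users), "-n", str(total_requests), url]


def parse_ab_output(text):
    """Extract requests/sec and the two 'Time per request' figures (ms)."""
    rps = re.search(r"Requests per second:\s+([\d.]+)", text)
    times = re.findall(r"Time per request:\s+([\d.]+) \[ms\]", text)
    if rps is None or len(times) < 2:
        raise ValueError("unrecognized ab output")
    return {
        "requests_per_second": float(rps.group(1)),
        "time_per_request_ms": float(times[0]),   # mean per request
        "time_across_all_ms": float(times[1]),    # mean across all concurrent requests
    }


# Illustrative fragment only; not output captured from the original test.
SAMPLE_REPORT = """\
Requests per second:    6592.00 [#/sec] (mean)
Time per request:       1.517 [ms] (mean)
Time per request:       0.152 [ms] (mean, across all concurrent requests)
"""

print(" ".join(ab_command(10, 1000, "http://servera.example/10k.html")))
print(parse_ab_output(SAMPLE_REPORT))
```

The URL and the request count of 1000 are placeholders. To actually run a benchmark, the argv list could be passed to `subprocess.run(cmd, capture_output=True, text=True)` and its `stdout` handed to `parse_ab_output`.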
The results
In general, Varnish performs close to twice as well as Squid. Across the tests, Varnish serves between roughly 1.6x and 2.1x as many requests per second, with correspondingly lower average response times. The gap is narrowest for the largest (50KB) files.
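To make the comparison concrete, the throughput ratio can be computed row by row. This is a small sketch using a handful of rows copied from the results table; `speedup` is an illustrative helper name.

```python
# (file size, concurrent users, Varnish req/s, Squid req/s), copied from the results table
ROWS = [
    ("10k", 10, 6592, 3078),
    ("10k", 400, 7181, 3518),
    ("30k", 100, 3627, 1778),
    ("50k", 10, 2254, 1401),
    ("50k", 400, 2267, 1216),
]


def speedup(varnish_rps, squid_rps):
    """How many times more requests per second Varnish served than Squid."""
    return varnish_rps / squid_rps


for size, users, varnish_rps, squid_rps in ROWS:
    ratio = speedup(varnish_rps, squid_rps)
    print(f"{size:>4} @ {users:>3} users: {ratio:.2f}x")
```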
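| File Size | Concurrent Users | Varnish: Requests per second | Varnish: Mean time per request, across all concurrent requests (ms) | Varnish: Mean time per request (ms) | Squid: Requests per second | Squid: Mean time per request, across all concurrent requests (ms) | Squid: Mean time per request (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |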
| 10k | 10 | 6592 | 0.152 | 1 | 3078 | 0.325 | 3 |
| 10k | 25 | 6915 | 0.145 | 3 | 3568 | 0.280 | 7 |
| 10k | 50 | 7071 | 0.141 | 7 | 3539 | 0.283 | 14 |
| 10k | 100 | 6860 | 0.146 | 13 | 3565 | 0.280 | 28 |
| 10k | 200 | 7252 | 0.138 | 27 | 3506 | 0.285 | 57 |
| 10k | 400 | 7181 | 0.139 | 56 | 3518 | 0.284 | 113 |
| 15k | 10 | 4636 | 0.216 | 2 | 2949 | 0.339 | 3 |
| 15k | 25 | 5954 | 0.168 | 4 | 3168 | 0.316 | 7 |
| 15k | 50 | 6036 | 0.166 | 8 | 3118 | 0.321 | 16 |
| 15k | 100 | 6060 | 0.165 | 16 | 3247 | 0.308 | 30 |
| 15k | 200 | 6066 | 0.165 | 32 | 3226 | 0.310 | 61 |
| 15k | 400 | 6048 | 0.165 | 66 | 3092 | 0.323 | 129 |
| 20k | 10 | 4689 | 0.213 | 2 | 2553 | 0.392 | 3 |
| 20k | 25 | 5342 | 0.187 | 4 | 2675 | 0.374 | 9 |
| 20k | 50 | 5422 | 0.184 | 9 | 2799 | 0.357 | 17 |
| 20k | 100 | 5446 | 0.184 | 18 | 2861 | 0.349 | 34 |
| 20k | 200 | 5430 | 0.184 | 36 | 2795 | 0.358 | 71 |
| 20k | 400 | 5400 | 0.185 | 74 | 2656 | 0.376 | 150 |
| 25k | 10 | 4135 | 0.242 | 2 | 2331 | 0.429 | 4 |
| 25k | 25 | 4485 | 0.223 | 5 | 2308 | 0.433 | 10 |
| 25k | 50 | 4488 | 0.223 | 11 | 2221 | 0.450 | 22 |
| 25k | 100 | 4446 | 0.225 | 22 | 2217 | 0.451 | 45 |
| 25k | 200 | 4311 | 0.232 | 46 | 2180 | 0.459 | 91 |
| 25k | 400 | 4160 | 0.240 | 96 | 2026 | 0.493 | 197 |
| 30k | 10 | 3463 | 0.289 | 2 | 1936 | 0.516 | 5 |
| 30k | 25 | 3689 | 0.271 | 6 | 2002 | 0.499 | 12 |
| 30k | 50 | 3661 | 0.273 | 13 | 1887 | 0.530 | 26 |
| 30k | 100 | 3627 | 0.276 | 27 | 1778 | 0.562 | 56 |
| 30k | 200 | 3589 | 0.279 | 55 | 1746 | 0.573 | 114 |
| 30k | 400 | 3541 | 0.282 | 112 | 1798 | 0.556 | 222 |
| 40k | 10 | 2752 | 0.363 | 3 | 1602 | 0.624 | 6 |
| 40k | 25 | 2824 | 0.354 | 8 | 1584 | 0.631 | 15 |
| 40k | 50 | 2826 | 0.354 | 17 | 1492 | 0.670 | 33 |
| 40k | 100 | 2827 | 0.354 | 35 | 1551 | 0.645 | 64 |
| 40k | 200 | 2822 | 0.354 | 70 | 1538 | 0.650 | 130 |
| 40k | 400 | 2794 | 0.358 | 143 | 1372 | 0.728 | 291 |
| 50k | 10 | 2254 | 0.443 | 4 | 1401 | 0.713 | 7 |
| 50k | 25 | 2265 | 0.441 | 11 | 1379 | 0.725 | 18 |
| 50k | 50 | 2266 | 0.441 | 22 | 1368 | 0.731 | 36 |
| 50k | 100 | 2268 | 0.441 | 44 | 1360 | 0.735 | 73 |
| 50k | 200 | 2266 | 0.441 | 88 | 1230 | 0.813 | 162 |
| 50k | 400 | 2267 | 0.441 | 176 | 1216 | 0.822 | 328 |
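The three figures in each half of the table are not independent. ApacheBench derives both timing values from the same totals, so the sub-millisecond column is 1000 divided by requests per second, and the whole-millisecond column is approximately that value multiplied by the number of concurrent users. A short sketch of the arithmetic, with illustrative function names:

```python
def mean_across_all_ms(requests_per_second):
    """Mean time per request across all concurrent requests, in ms."""
    return 1000.0 / requests_per_second


def mean_per_request_ms(requests_per_second, concurrent_users):
    """Mean time one simulated user waits for a request, in ms."""
    return concurrent_users * 1000.0 / requests_per_second


# Varnish and Squid at 50k / 400 users, taken from the table above.
for name, rps in [("Varnish", 2267), ("Squid", 1216)]:
    across = mean_across_all_ms(rps)
    per_request = mean_per_request_ms(rps, 400)
    print(f"{name}: {across:.3f} ms across all, {per_request:.0f} ms per request")
```

Because of this, the per-request latency grows with concurrency even when throughput stays flat. Small differences against the whole-millisecond column in the table are to be expected from rounding.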
Here are the graphs of the above data for easier visualization:
Something is wrong here
These are simply benchmarks and are not meant to represent real world scenarios for a few reasons. Most importantly, this test takes place on a local network that goes through one router. Running this test on a local network does not take into consideration the typical network latency you would find across the internet.
Secondly, this test only illustrates the raw speed of serving up cached content which isn’t a typical real world scenario. To really test the overall performance of both of these, we need to simulate the three major steps of a reverse proxy:
- Forwarding a request to a backend server
- Physically caching it (memory or disk)
- Serving the cached data
Testing any one of these three steps in isolation is useful and shows the raw performance of that function, but it doesn’t give us a picture of the overall performance.
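The three steps can be illustrated with a toy in-memory model. This is only a sketch of the general idea and says nothing about how Varnish or Squid implement caching; `TinyCachingProxy` and `fake_backend` are hypothetical names.

```python
class TinyCachingProxy:
    """Toy in-memory model of the three reverse-proxy steps."""

    def __init__(self, fetch_from_backend):
        self._fetch = fetch_from_backend  # callable: url -> response body
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._cache:
            self.hits += 1
            return self._cache[url]   # step 3: serve the cached data
        self.misses += 1
        body = self._fetch(url)       # step 1: forward the request to the backend
        self._cache[url] = body       # step 2: cache the response in memory
        return body


def fake_backend(url):
    """Stand-in for the application server."""
    return f"<html>page for {url}</html>"


proxy = TinyCachingProxy(fake_backend)
for url in ["/a", "/b", "/a", "/a"]:
    proxy.get(url)
print("hits:", proxy.hits, "misses:", proxy.misses)
```

The benchmark above only exercises the hit path. A fuller test would also need to cover the miss path, where the backend and the cache write both contribute to response time.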
Next Steps
I need to come up with a way to generate load on the server such that it represents the typical flow of requests that we would normally see on a server. I am running this on a test server, not against production data, so if anyone has an idea of how I can do this, please do let me know. The results of this test will be Part 2 of this post.
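One possible starting point, offered only as a sketch, is to build a request list that mixes page types in proportion to how often they are requested. The paths and shares below are invented placeholders; real values would have to come from production access logs.

```python
import random

# Hypothetical traffic mix: page type -> (URL path, share of requests).
PAGE_MIX = {
    "home": ("/home", 0.40),
    "listing": ("/listing", 0.30),
    "article": ("/article", 0.20),
    "profile": ("/profile", 0.10),
}


def build_workload(mix, total_requests, seed=0):
    """Return a list of URL paths sampled according to each page's share."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    paths = [path for path, _ in mix.values()]
    weights = [share for _, share in mix.values()]
    return rng.choices(paths, weights=weights, k=total_requests)


workload = build_workload(PAGE_MIX, 1000)
for path, _ in PAGE_MIX.values():
    print(path, workload.count(path))
```

ApacheBench requests a single URL per run, so a list like this would need to be fed to a load tool that accepts multiple URLs, or split into several parallel `ab` runs.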
Additionally, please let me know if you spot inefficiencies in my testing methodology. I don’t claim to be a load testing expert so any advice you can offer is appreciated.
April 28th, 2010 at 6:08 am
Hi,
Great post! I’m trying to decide between these two tools, Squid and Varnish, to configure as an accelerator for an OpenCms web portal, and now I have more data, thanks.
For the tests, I would recommend The Grinder; it’s a wonderful tool for use-case tests.
Regards,