
I installed Varnish on bart, my “Lenny” testing machine. It’s a reverse proxy, which as I understand it means it acts a lot like a regular web server, but instead of reading content off a file system, it forwards requests to other web servers and caches the responses.
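To make the idea concrete, here’s a minimal Python sketch of what a caching reverse proxy does. It is a toy model, not how Varnish is implemented: the class name, the `fake_backend` function, and the hit/miss counters are all made up for illustration, and the backend is a plain function rather than a real HTTP request.

```python
from typing import Callable, Dict


class CachingReverseProxy:
    """Toy model of a caching reverse proxy (illustrative only)."""

    def __init__(self, fetch_from_backend: Callable[[str], bytes]) -> None:
        # fetch_from_backend is a hypothetical stand-in for an HTTP request
        # to the real web server sitting behind the proxy.
        self._fetch = fetch_from_backend
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def handle(self, path: str) -> bytes:
        # Serve from the cache when possible; otherwise ask the backend
        # and remember the answer for the next client.
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        self.misses += 1
        body = self._fetch(path)
        self._cache[path] = body
        return body


backend_calls = []


def fake_backend(path: str) -> bytes:
    # Records each call so we can see how often the backend is actually hit.
    backend_calls.append(path)
    return f"content of {path}".encode()


proxy = CachingReverseProxy(fake_backend)
first = proxy.handle("/index.html")   # miss: goes to the backend
second = proxy.handle("/index.html")  # hit: served from the cache
```

A real proxy would also have to honor cache headers and expire stale entries, which this sketch leaves out entirely.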
About Varnish
The creators of Varnish put a lot of thought into this program, and I imagine that they’ve come up with a nice result. I’m interested in trying out the ESI language (edge-side includes, similar to server-side includes, SSI), as well as VCL (the Varnish Configuration Language). VCL looks like Perl, but it is translated into C and compiled into machine code. Cool!
Configuring Varnish
Trying out Varnish is easy. I edited /etc/default/varnish to route requests to bart:80, then loaded up bart:6081/. Bingo, the same content as served from port 80. So what’s the big deal? For one thing, VCL can implement regexp URL rewriting.
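I haven’t written any real VCL yet, so here is the general idea of regexp URL rewriting sketched in Python instead. The rules and paths below are hypothetical examples, not anything from my Varnish setup, and this is not VCL syntax.

```python
import re

# Hypothetical rewrite rules: (pattern, replacement) pairs, tried in order.
REWRITE_RULES = [
    (re.compile(r"^/blog/(\d{4})/(\d{2})/(.*)$"), r"/archive/\1-\2/\3"),
    (re.compile(r"^/old-docs/(.*)$"), r"/docs/\1"),
]


def rewrite_url(path: str) -> str:
    """Return the rewritten path, or the original path if no rule matches."""
    for pattern, replacement in REWRITE_RULES:
        new_path, count = pattern.subn(replacement, path)
        if count:
            # First matching rule wins.
            return new_path
    return path
```

With these rules, a request for `/blog/2008/11/varnish` would be passed to the backend as `/archive/2008-11/varnish`, while `/index.html` goes through untouched.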
Other Reverse Proxies
* Nginx
* Lighttpd
* Apache w/ mod_proxy
* Squid
* Pound
* Perlbal
References and External Links
http://en.wikipedia.org/wiki/Varnish_cache
http://www.w3.org/TR/esi-lang
http://varnish.projects.linpro.no/
Perdition is an awesome IMAP(S) and POP3(S) proxy server. It can even route different domains to different back-end servers. I’ve only tested it out a little, but I’m planning to use it full time in the future.
I’ve been trying all the different configurations for mod_proxy the past few days - balancer, cache, and even the mod_proxy_http module. It’s all very, very cool.
One thing I’m definitely interested in figuring out is how to combine the conditional powers of mod_rewrite with the failover capabilities of balancer.
I set up a new kind of proxy today. It’s only meant for secure access to one host server, but the point was to enable the use of the VIA ACE “wicked-fast” encryption capabilities.
So far so good. I had some issues with the TCP stack, but all in all, I really like the idea of proxying this way.
To be more specific about how it’s set up: the proxy listens on my loopback device at the localhost address, where Stunnel accepts socket connections and relays them over an SSL connection to my webmail server. Very cool!
When investigating this, I also found this cool program:
http://tinyproxy.sourceforge.net/
This morning, I was able to set up mod_proxy as a simple and effective alternative to network address translation (NAT). I eventually want to use NAT, but to do so at this time would require some fairly extensive routing tables, which I don’t want to worry about.
In this case, the nice thing about mod_proxy is that I don’t have to worry about the routing tables! But there are other benefits. For example, I am using SSL to connect to the public proxy, and from there I am using plain HTTP, since the secondary network is private. I think this is how many SSL off-loaders work.
The Apache mailing lists have been mentioning mod_serf lately as an alternative to mod_proxy. It actually looks really interesting, and it’s even more interesting that it has been added to the trunk. It provides asynchronous communications, which sounds great.
Here’s the mod_serf homepage at Google Code, but I recall someone saying that since it’s now in the httpd trunk, its new home will be the Apache repository.
http://code.google.com/p/serf/
I’ve been running Apache as my caching proxy server and it seems to be running OK. I have a couple of questions though:
* Can it communicate with other Apache proxy caches to share their cache?
* Is there a management interface (command line even?) to find out the hit / miss ratio?
Squid isn’t just a proxy server, it’s also a cache server. That means that while Squid can handle HTTP or FTP requests for clients transparently, it can also save the response for other clients.
This is great - it can seriously speed up network access, but what if the Squid server crashes or becomes overloaded? To prevent a total outage, you can set up several Squid servers and load balance between them. That’s what I did using pfSense as a local network load balancer. So far it’s working really well, and I can take down one of the machines without any consequence. It’s pretty neat actually: the pfSense bridge pings each server occasionally to test whether it is alive or not.
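The health-check idea is simple enough to sketch in a few lines of Python. This is only my rough model of the behavior, not how pfSense actually implements it: the server names are made up, and the `is_alive` probe is a hypothetical function standing in for the ping.

```python
from typing import Callable, List


class HealthCheckedBalancer:
    """Round-robin over the servers that passed the last health check."""

    def __init__(self, servers: List[str], is_alive: Callable[[str], bool]) -> None:
        self._servers = list(servers)
        self._is_alive = is_alive       # hypothetical probe, e.g. a ping
        self._healthy = list(servers)   # assume everything is up at start
        self._next = 0

    def run_health_checks(self) -> None:
        # Called occasionally; drops dead servers and re-adds recovered ones.
        self._healthy = [s for s in self._servers if self._is_alive(s)]

    def pick(self) -> str:
        if not self._healthy:
            raise RuntimeError("no healthy cache servers available")
        server = self._healthy[self._next % len(self._healthy)]
        self._next += 1
        return server


down = set()  # servers we pretend have crashed
balancer = HealthCheckedBalancer(
    ["squid-a:3128", "squid-b:3128"],   # made-up host names
    lambda server: server not in down,
)

before = [balancer.pick() for _ in range(4)]  # alternates between both
down.add("squid-a:3128")                      # take one machine down
balancer.run_health_checks()
after = [balancer.pick() for _ in range(3)]   # only squid-b is used now
```

The point is that clients keep getting answers from the surviving server, which matches what I saw when I took one of the machines down.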
I finally set up my first reverse proxy with Apache 2.2 and mod_rewrite. Some might say that Apache is bloated, but I’m really pleased with it.
I followed the mod_rewrite URL rewriting guide, which explains that you’ll need mod_proxy and mod_proxy_http enabled. To activate the proxy capability for a RewriteRule, all you need to do is add [P] at the end. It’s actually very easy.
The danger is ending up with an open proxy, but you can prevent this by specifying ProxyRequests Off in your Apache configuration. That way, only the requests you specify will be handled by mod_proxy. You can specify them using ProxyPass or a RewriteRule.
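The difference between an open proxy and a locked-down reverse proxy can be sketched in Python as a simple lookup. This is only an illustration of the idea, not how mod_proxy works internally; the prefixes and private addresses in `PROXY_MAP` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical mapping of public path prefixes to private back-end servers,
# playing the role of the explicitly configured proxy rules.
PROXY_MAP: Dict[str, str] = {
    "/app/": "http://10.0.0.10:8080/",
    "/static/": "http://10.0.0.11/",
}


def resolve_backend(request_target: str) -> Optional[str]:
    """Return the back-end URL for a request, or None if it must be refused."""
    # An open (forward) proxy would accept absolute URLs such as
    # "http://example.com/" and fetch them for anyone. Refuse those.
    if not request_target.startswith("/"):
        return None
    for prefix, backend in PROXY_MAP.items():
        if request_target.startswith(prefix):
            return backend + request_target[len(prefix):]
    # Anything not explicitly mapped is not proxied at all.
    return None
```

So `/app/login` is forwarded to the first back end, while a request for an arbitrary outside URL simply gets no back end.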
I’m looking forward to working with mod_proxy a lot in the future. I plan to use it as a layer 7 load balancer. 
Proxy servers are common in these types of situations:
- Caching aggregated content and making it available for faster subsequent access.
- Examining connections and transactions for recording and playback, or routing through a proxy server for testing environments.
Squid is a popular choice as a proxy cache server.
Selenium RC and MaxQ are proxy servers for use with testing.