I've been overhauling the way that I host a number of virtual websites upon my main box, partly to increase security and partly for a cleaner separation of roles, ownership, and control. (In general everything on my box is "mine", but some things are "ours"...)
After a fair amount of experimentation I decided that I wasn't willing or able to rewrite all my Apache mod_rewrite rules just yet. So my interim plan was to update each existing virtual host:
- Add a dedicated user & group to run it under.
- Launch it via a minimal server listening upon the loopback adapter.
- Have Apache 2.x proxy through to it.
- Expand any mod_rewrite rules prior to the proxying.
To make it clear what the users were for I decided that every hosting-user would have an "s-" prefix. So the virtual host "static.steve.org.uk" was initially going to be served by the s-static user.
The thttpd configuration file would look like this, and would be located in /etc/thttpd/sites/static.steve.org.uk:
host=127.0.0.1
port=1008
dir=/home/www/static.steve.org.uk/htdocs/
chroot
user=s-static
throttles=/etc/thttpd/throttle.conf
logfile=/home/www/static.steve.org.uk/logs/thttpd.log
pidfile=/home/www/static.steve.org.uk/pid/file
(I wrote a trivial script to stop/start all the sites en masse, and removed the default thttpd init script, log rotation job, and similar things.)
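A stop/start script of that sort could be sketched roughly as follows. This is a minimal illustration in Python, not the script actually used: the function names are hypothetical, it assumes every file in /etc/thttpd/sites is one site's configuration, and it relies on thttpd's -C option to name the config file and on the pidfile= setting to find the process to stop.

```python
import os
import signal
import subprocess
from pathlib import Path

SITES_DIR = Path("/etc/thttpd/sites")  # directory named in the post


def parse_config(text):
    """Parse thttpd-style config text: whitespace-separated 'name' or
    'name=value' options, with '#' starting a comment."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]
        for token in line.split():
            name, _, value = token.partition("=")
            options[name] = value  # flags such as 'chroot' get an empty value
    return options


def start_command(config_path):
    """Command line that launches one thttpd instance from its config file."""
    return ["thttpd", "-C", str(config_path)]


def start_all(sites_dir=SITES_DIR):
    """Start one thttpd per config file (assumes one file per site)."""
    for cfg in sorted(Path(sites_dir).iterdir()):
        if cfg.is_file():
            subprocess.run(start_command(cfg), check=True)


def stop_all(sites_dir=SITES_DIR):
    """Send SIGTERM to every instance whose pidfile can be read."""
    for cfg in sorted(Path(sites_dir).iterdir()):
        if not cfg.is_file():
            continue
        pidfile = parse_config(cfg.read_text()).get("pidfile")
        if not pidfile:
            continue
        try:
            pid = int(Path(pidfile).read_text().strip())
            os.kill(pid, signal.SIGTERM)
        except (OSError, ValueError):
            pass  # missing or stale pidfile: nothing to stop
```

Calling start_all() or stop_all() as root would then bring every site up or down in one go.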
How did I decide which port to run this instance under? By taking the UID of the user:
steve@skx:~$ id s-static
uid=1008(s-static) gid=1009(s-static) groups=1009(s-static)
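The "s-" naming and the UID-as-port convention are mechanical enough that a per-site config could be generated. The sketch below is only an illustration: the function names are hypothetical, the rule of taking the first label of the hostname as the user name is an assumption drawn from the s-static example, and the paths simply mirror the layout shown above.

```python
import pwd


def site_user(vhost):
    """Assumed convention: 'static.steve.org.uk' is served by 's-static'."""
    return "s-" + vhost.split(".")[0]


def port_for(user, lookup=pwd.getpwnam):
    """The listening port is simply the UID of the hosting user."""
    return lookup(user).pw_uid


def render_thttpd_config(vhost, lookup=pwd.getpwnam):
    """Render a config in the same shape as the example above."""
    user = site_user(vhost)
    root = f"/home/www/{vhost}"
    lines = [
        "host=127.0.0.1",
        f"port={port_for(user, lookup)}",
        f"dir={root}/htdocs/",
        "chroot",
        f"user={user}",
        "throttles=/etc/thttpd/throttle.conf",
        f"logfile={root}/logs/thttpd.log",
        f"pidfile={root}/pid/file",
    ]
    return "\n".join(lines) + "\n"
```

The lookup parameter exists only so the function can be tried out without the user actually existing on the machine; by default it consults the real password database.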
With this in place I could then update the Apache configuration file from serving the site directly to merely proxying to the back-end server:
<VirtualHost *>
    ServerName static.steve.org.uk

    # Proxy ACL
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>

    # Proxy directives
    ProxyPass / http://localhost:1008/
    ProxyPassReverse / http://localhost:1008/
    ProxyPreserveHost on
</VirtualHost>
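Since every proxied site follows the same pattern, the stanza could also be produced from just the hostname and the port. A rough sketch, with a hypothetical function name, that reproduces the directives shown above and nothing more:

```python
def render_vhost(server_name, port):
    """Render a proxy-only Apache virtual host for one back-end site."""
    backend = f"http://localhost:{port}/"
    lines = [
        "<VirtualHost *>",
        f"    ServerName {server_name}",
        "    # Proxy ACL",
        "    <Proxy *>",
        "        Order allow,deny",
        "        Allow from all",
        "    </Proxy>",
        "    # Proxy directives",
        f"    ProxyPass / {backend}",
        f"    ProxyPassReverse / {backend}",
        "    ProxyPreserveHost on",
        "</VirtualHost>",
    ]
    return "\n".join(lines) + "\n"
```

Any site-specific mod_rewrite rules would still need adding by hand before the proxy directives.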
So was that all there was to it? Sadly not. There were a couple of minor issues, some of which were:
- cronjobs
I had various cronjobs in my main steve account which updated blog indexes, etc. (I use namazu2 to make my blog searchable.)
I had to change the ownership of the existing indexes and of the scripts themselves, and move the cronjob to the new s-blog user.
- cross-user dependencies
I run a couple of sites which pull in content from other locations, for example a couple of list summaries and archives. These are generally fed from a ~/.procmail snippet under my primary login.
Since my primary login no longer owns the web-tree it is no longer able to update things directly. Instead I had to duplicate a couple of subscriptions and move this work under the UID of the site-owner.
- I'm no longer running Apache on the back-end
For a day or two I'd forgotten that I was using Apache's server-side includes to pull snippets into my site, such as links to my wishlist.
Since I'm not using Apache in the back-end, server-parsed files no longer work. Happily I'm using a simple template-based setup for my main sites, so I updated the template-parser to understand "##include ./path/to/file". For example, this source file produces my donation page.
The upshot is my "static" site is even more static, which is a good thing.
- uploads are harder
Several of my domains host entirely static content which is generated on my main desktop machine, and then uploaded via rsync post-build.
I had to add some more accounts and configure SSH keys, then update the uploading routines/Makefiles appropriately. Not a major annoyance, but suddenly my sshd_config file has gone from "AllowUsers steve backup" to including many additional accounts.
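Going back to the "##include ./path/to/file" directive mentioned above: the expansion it performs at build time can be illustrated with a few lines of Python. This is a guess at the behaviour rather than the real template-parser; the function name, the rule that paths are resolved relative to a single base directory, and the nesting limit are all assumptions.

```python
import re
from pathlib import Path

# A directive must occupy a whole line: '##include ./path/to/file'
INCLUDE = re.compile(r"^##include\s+(\S+)\s*$")


def expand_includes(text, base_dir, depth=0, max_depth=8):
    """Replace each '##include PATH' line with the contents of PATH.

    Paths are resolved relative to base_dir (an assumption), and included
    files may themselves contain includes, up to max_depth levels.
    """
    if depth > max_depth:
        raise ValueError("includes nested too deeply (possible loop)")
    out = []
    for line in text.splitlines():
        match = INCLUDE.match(line)
        if match:
            included = (Path(base_dir) / match.group(1)).read_text()
            out.append(expand_includes(included, base_dir, depth + 1, max_depth))
        else:
            out.append(line)
    return "\n".join(out)
```

Because the expansion happens when the page is generated, the back-end server only ever sees plain files, which is what makes the "static" site even more static.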
The single biggest pain was handling my mercurial repositories - overhauling that took a bit of creativity to ensure that nothing broke for existing or new checkouts. I wish that a backport of mercurial-server was trivial because I'd love to be using that.
In general though watching the thttpd logs has been sufficient to spot problems. I had to tweak things a little to generate statistics properly, but otherwise all is good.
Why thttpd? Well, it is small and lightweight, and it has the ability to run CGI scripts, something missing from nginx, for example.
I'm still aiming to remove apache2 from the front-end - it is mostly just a dumb proxy, but it does perform some ACL operations and expand mod_rewrite rules. I could port those to another engine .. but not today.
The most likely candidates are nginx, perlbal, or lighttpd - each of these should be capable of doing simple ACL checks, and performing mod_rewrite-like rules.
ObFilm: Mallrats
Tags: apache, migration, thttpd