

This is the part where you tell me what matters is on the inside

24 November 2009 21:50

Some technology evolves very quickly. For example, the following things are probably used by 80% of the readers of this page:

  • A web browser.
  • A mail client.
  • A webserver.

But other technology is stuck in the past and only sees lacklustre updates and innovations (not that innovation is mandatory or automatically a good thing).

Right now I'm looking at my webserver logs, trying to see who is viewing my sites, where they came from, and what their favourite pie is.

In the free world we have the choice of awstats, webalizer, and visitors (possibly more that I'm unaware of). In the commercial world everybody and their dog uses Google's analytics.

On the face of it a web analysis package is trivial:

  • Read in some access.log files.
  • Process to some internal database representation.
  • Generate static/dynamic HTML output from your intermediate form, optionally including graphs, images, and pie-charts.
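
The three steps above can be sketched in a few lines of Python. This is only a rough illustration, assuming the logs are in the usual Apache/nginx "combined" format and using SQLite as the intermediate form. The table name `hits`, the function names, and the sample log lines are all made up for the example; this is not how awstats, webalizer, or visitors work internally.

```python
import html
import re
import sqlite3

# Assumption: "combined" log format. A custom LogFormat needs a different regex.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Invented sample data, standing in for the contents of an access.log file.
SAMPLE = [
    '192.0.2.1 - - [24/Nov/2009:21:50:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [24/Nov/2009:21:51:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [24/Nov/2009:21:52:00 +0000] "GET /about.html HTTP/1.1" 304 - "-" "Mozilla/5.0"',
    'this line is not a valid log entry',
]


def parse_line(line):
    """Step 1: turn one log line into a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def load(lines, db):
    """Step 2: store the parsed hits in an intermediate SQLite table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS hits "
        "(host TEXT, time TEXT, method TEXT, path TEXT, "
        "status INTEGER, size INTEGER, referer TEXT, agent TEXT)"
    )
    for line in lines:
        hit = parse_line(line)
        if hit is None:
            continue  # silently skip malformed lines
        hit["status"] = int(hit["status"])
        hit["size"] = 0 if hit["size"] == "-" else int(hit["size"])
        db.execute(
            "INSERT INTO hits VALUES "
            "(:host, :time, :method, :path, :status, :size, :referer, :agent)",
            hit,
        )
    db.commit()


def top_pages_html(db, limit=10):
    """Step 3: render the most requested paths as a static HTML table."""
    rows = db.execute(
        "SELECT path, COUNT(*) AS n FROM hits "
        "GROUP BY path ORDER BY n DESC, path LIMIT ?",
        (limit,),
    ).fetchall()
    body = "\n".join(
        f"<tr><td>{html.escape(path)}</td><td>{n}</td></tr>" for path, n in rows
    )
    return f"<table>\n<tr><th>Page</th><th>Hits</th></tr>\n{body}\n</table>"


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    load(SAMPLE, db)
    print(top_pages_html(db))
```

Everything interesting (sessions, referrers, graphs, pie-charts) would be further queries and rendering on top of the same table, which is presumably where the subtle issues start.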

If you add javascript-fu to each of your pages you can track page titles, exit links, screen resolutions, and other data too. (Though I guess that's a separate problem: trying to merge that data in with the data you have in your access log without making nasty links like "GET /trackin.gif?x_res=800;y_res=600". Anyway, I guess with cookies you could correlate reasonably carefully.)
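
For what it's worth, pulling the extra data back out of such a request is the easy part. A minimal sketch, assuming the tracking values arrive as `key=value` pairs in the query string separated by `;` (as in the example above) or `&`; the function name `parse_beacon` is made up for illustration:

```python
import re
from urllib.parse import unquote_plus, urlsplit


def parse_beacon(request_path):
    """Extract tracking fields from a request path such as
    /trackin.gif?x_res=800;y_res=600 and return them as a dict of strings."""
    query = urlsplit(request_path).query
    fields = {}
    # Accept both ';' and '&' as pair separators.
    for pair in re.split(r"[;&]", query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        fields[unquote_plus(key)] = unquote_plus(value)
    return fields


if __name__ == "__main__":
    print(parse_beacon("/trackin.gif?x_res=800;y_res=600"))
```

The harder part, correlating these fields with the ordinary page hits, would need some shared key such as a cookie value logged alongside both kinds of request, and that is not shown here.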

In conclusion, why are my web statistics so dull, boring, and less educational than I desire?

I'd be tempted to experiment, but I suspect this is a problem which has subtle issues I'm overlooking and requires an artistic slant to make pretty.

(ObLink: asql is my semi-solution to logfile analysis.)

ObFilm: Bound



Comments on this entry

Andrew Ruthven at 23:16 on 24 November 2009

jawstats - a nicer interface to the awstats data files
piwik - a Google Analytics clone - http://piwik.org/


Jon at 13:34 on 30 November 2009

If you're going to collect info on the resolution of your clients, I'd also suggest collecting info on the size of their browser windows. I'm fairly sure you can get both via JS.

In late '08 the BBC news website was redesigned and the new design required a larger minimum width than the old site (which, iirc, was fixed at a pretty small width). They came in for some criticism, including from me, for doing this. I think the minimum width was 1000 or so, leaving 24 pixels for window decoration assuming you browsed maximized and had a 1024×768 res display (which was not uncommon for ultralight laptops in '08). And therein lies the problem: they did assume people browse maximized.

You can test that assumption for your audience if you keep track of their browser window sizes too: the BBC didn't.

Steve Kemp at 07:51 on 29 November 2009

Siegfried: Piwik looks great (and was already mentioned above) but I'm not a PHP user. Ever.

Siegfried Gevatter at 16:03 on 28 November 2009

Give http://piwik.org/ (formerly phpMyVisites) a try.

Justin at 18:43 on 25 November 2009

Re: innovation, see Splunk.

It's not really geared towards web logs, but it can do some nifty things with them.

It's not open source though :-/

Steve Kemp at 14:06 on 25 November 2009

Thanks for the links both of you.

piwik was featured on Slashdot a couple of hours after this post went live; it looks lovely but, judging by the comments, it struggles under load, and it is in PHP (which I do not use).

jawstats looks pretty, but I suspect I'm going to want to do the evil javascript tracking for proper extras. I'm currently experimenting with that and sqlite to see what it will gain me.

Tobias at 12:12 on 25 November 2009

Also ModLogAn is an alternative to Webalizer with nicer looks, even if both projects seem to be mostly dead...