Its a lot like life

Assume for a moment that you have 148 hosts logging, via syslog-ng, to a central host. That host is recording all log entries into an MySQL database. Assume that each of these machines is producing a total of 4698816 lines per day.

(Crazy random numbers pulled from thin air; globviously).

Now the question: How do you process, read, or pay attention to those logs?

Here is what we've done so far:

syslog-ng

All the syslog-ng client machines are logging to a central machine, which inserts the records into a database.

This database may be queried using the php-syslog-ng script. Unfortunately this search is relatively slow, and also the user-interface is appallingly bad. Allowing only searches, not a view of most recent logs, auto-refreshing via AJAX etc.

rss feeds

To remedy the slowness, and poor usability of the PHP front-end to the database I wrote a quick hack which produces RSS feeds via queries, against that same database, accessed via URIs such as:

http://example.com/feeds/DriveReady
http://example.com/feeds/host/host1

The first query returns and RSS feed of log entries containing the given term. The second shows all recent entries from the machine host1.

That works nicely for a fixed set of patterns, but the problem with this approach, and that of php-syslog-ng in general, is that it will only show you things that you look for - it won't volunteer trends, patterns, or news.

The fundamental problem is a lack of notion in either system of "recent messages worth reading" (on a global or per-machine basis).

To put that into perspective given a logfile from one host containing, say, 3740 lines there are only approximately 814 unique lines if you ignore the date + timestamp.

Reducing logentries by that amount (78% decrease) is a significant saving, but even so you wouldn't want to read 22% of our original 4698816 lines of logs as that is still over a million log-entries.

I guess we could trim the results down further via a pipe through logcheck or similar, but I can't help thinking that still isn't going to give us enough interesting things to view.

To reiterate I would like to see:

per-machine anomolies.
global anomolies.

To that end I've been working on something, but I'm not too sure yet if it will go anywhere... In brief you take the logfiles and tokenize, then you record the token frequencies as groups within a given host's prior records. Unique pairings == logs you want to see.

(i.e. token frequency analysis on things like "<auth.info> yuling.example.com sshd[28427]: Did not receive identification string from 1.3.3.4"

What do other people do? There must be a huge market for this? Even amongst people who don't have more than 20 machines!

Tags: lazyweb, logfiles, mysql, php-syslog-ng | 17 comments

Its a lot like life

Comments on this entry

Recent Posts