So I've had a hectic few days, and I'm getting close to having caught up with the things that I've been sitting on whilst I've been away.
ObRandom: Several people, independently, have told me within the past few days that "whilst" is not a real word. It is. End of ..
Some interesting things I've been working upon recently include a fun little firewall tool. Once upon a time I wrote a firewall script which worked like this:
When you executed the magic firewall script it would scan the incoming.d directory and, for each file it found, look up the relevant port in /etc/services. These port numbers would then be opened. And at the end you'd just have a "-j DROP".
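For the curious, the old behaviour can be sketched roughly as below. This is not the original script, just a minimal illustration in Python: the function name build_rules is made up, the rules are assumed to be TCP-only iptables INPUT rules, and the commands are returned as strings rather than executed. The lookup uses socket.getservbyname, which consults the system services database (/etc/services on most machines).

```python
import os
import socket


def build_rules(directory, lookup=socket.getservbyname):
    """Return iptables commands opening one TCP port per file in `directory`.

    Hypothetical helper: each filename (e.g. "ssh") is treated as a service
    name and resolved to a port number via the services database.
    """
    rules = []
    for name in sorted(os.listdir(directory)):
        try:
            port = lookup(name, "tcp")  # e.g. "ssh" -> 22
        except OSError:
            continue  # filename is not a known service, so skip it
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    # Anything not explicitly opened above gets dropped.
    rules.append("iptables -A INPUT -j DROP")
    return rules
```

Calling build_rules("incoming.d") and printing the result would give a list of commands to review before running any of them as root.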
After a long phone conversation with a colleague on Thursday/Friday of last week I've now reworked this idea anew. There is still the notion of filenames referring to what is allowed for a pair of directories (incoming.d/ + outgoing.d/) but with even more flexibility and no hardwired use of /etc/services.
I guess some ideas are just too simple to give up ..?
Anyway there are a plethora of different firewall applications of varying sophistication and complexity in the world. I don't really want to go out of my way to promote this one - but at the same time it might be a useful idea for somebody?
The next (work) job I have is determining how to make a "kernel" + "kernel-dev" RPM package based on Debian sources. Joy. Actually the more I look around the more fiddly, annoying, and troublesome I suspect this is going to be. Sigh.
ObQuote: The Grudgy
Tags: bytemark, centos, firewalls, kernels, rpm
24 June 2008 21:50
I've spent a few hours recently looking at building RPM packages of GNU/Linux kernels, which has been a frustrating process.
There are many, many online guides which give the impression that this is actually a pretty complex process. For example, the "How To Compile A Kernel - The CentOS Way" guide. (Did I mention how bad most of the howtoforge guides are recently?)
So, after fiddling around for an afternoon and getting lost I decided to abandon the process.
Here is a tested process for building a binary RPM kernel package:
Yes, this works just fine upon a CentOS 5.x machine - I'm used to using make-kpkg to make a Debian kernel package, but it seems that if you just visit kernel.org and download the latest version you can build an RPM without any extra effort thanks to native support. Cool.
Now I need to work out how to create, host, and update a YUM repository. That looks fiddly and annoying too. XML. Eww. Any guides are most welcome - ultimately I need to package and host a "recent" kernel for CentOS 4.x, CentOS 5.x and Fedora Core 6-9 - each for i386 + amd64.
Tags: rpm, work, yum
29 June 2008 21:50
There exist many Bayesian/statistical spam filters, ranging from products such as spambayes and spamassassin to crm114. Each of them works in its own way. Having used and tested almost all of them I've noticed a common flaw.
The vast majority of spam-filters struggle to correctly classify "419 scam" mails, lottery fraud, and similar mails.
Why is that? In general, having read hundreds of these mails, I can see several things that are common to this kind of mail:
- Mention of currency in both numeric and word forms. ($1,000,000 + 1 million US dollars)
- Mention of a country / nationality (Sierra Leone, Nigerian)
- Mention of a reference/claim number and often "official address".
- Christian references.
- Greetings such as "dear friend", and mentions of discretion/secrecy.
- Size. (A scam mail is typically greater in length than an average spam mail).
Whilst none of these individually is indicative of a scam mail it is interesting to count their combined occurrence.
I've written a toy program to count these things, and so far the success rate is >60% which is a reasonable start - providing this kind of detection occurs after normal filtering.
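To make the counting idea concrete, here is a minimal sketch in Python. It is not the toy program mentioned above: the patterns, word lists, MIN_LENGTH and THRESHOLD values are all assumptions picked purely for illustration, and a real version would want far bigger lists and some tuning against actual mail.

```python
import re

# Hypothetical patterns, one per trait in the list above.
FEATURES = {
    "currency_numeric": re.compile(r"[$£€]\s?\d[\d,]*"),
    "currency_words": re.compile(
        r"\b(?:million|thousand)\s+(?:us\s+)?(?:dollars|pounds|euros)\b", re.I),
    "country": re.compile(r"\b(?:nigerian?|sierra leone)\b", re.I),
    "reference": re.compile(r"\b(?:reference|claim)\s+(?:number|no\.?)", re.I),
    "religious": re.compile(r"\b(?:god|christ|bless(?:ed|ings?)?)\b", re.I),
    "greeting": re.compile(r"\bdear\s+friend\b", re.I),
    "secrecy": re.compile(r"\b(?:confidential\w*|secre\w*|discre\w*)\b", re.I),
}
MIN_LENGTH = 1500  # assumed: scam mails tend to be longer than average spam
THRESHOLD = 4      # assumed: how many traits must co-occur before flagging


def scam_traits(text):
    """Return the set of trait names found in the message body."""
    hits = {name for name, pattern in FEATURES.items() if pattern.search(text)}
    if len(text) >= MIN_LENGTH:
        hits.add("length")
    return hits


def looks_like_scam(text, threshold=THRESHOLD):
    """Flag a message only when enough independent traits appear together."""
    return len(scam_traits(text)) >= threshold
```

No single pattern decides anything here; only the combined count does, which is why something like this only makes sense after the normal filters have had their turn.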
I may experiment further, but I figured a public query on scam detection might be appropriate.
Whilst detecting a scam mail is a subset of detecting a spam email there are probably simplifications that may be made, and exploring those wouldn't be a bad thing.