
 

Entries tagged perl

You're making me live

26 November 2007 21:50

Is there an existing system which will allow me to query Apache logfiles via an SQL string? (Without importing into a database first).

I've found the Perl library SQL::YASL - but that has a few omissions which mean it isn't ideal for my task:

  • It doesn't understand DISTINCT
  • It doesn't understand COUNT
  • It doesn't understand SUM

Still, it did allow me to write a simple shell which works nicely for simple cases:

SQL>LOAD /home/skx/hg/engaging/logs/access.log;
SQL>select path,size from requests where size > 10000;
path size 
/css/default.css 13813 
/js/prototype.js 71261 
/js/effects.js 37872 
/js/dragdrop.js 30645 
/js/controls.js 28980 
/js/slider.js 10403 
/view/messages 15447 
/view/messages 15447 
/recent/messages 25378 

It does mandate the use of a "WHERE" clause, but that was easily worked around with "WHERE 1=1". If I could just have support for COUNT I could do interesting things in near-realtime...

Then again, maybe I should just log straight into a database and not worry about it. I certainly don't want to create my own SQL engine .. it just seems surprising that Perl doesn't already have a suitable library - which is a bit of a shocker!
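
In the absence of such a library, the same kind of query is easy enough in plain Perl; a wee sketch (assuming the common/combined log format, and an invented logfile path):

#!/usr/bin/perl
# Rough equivalent of "select path,size from requests where size > 10000".
use strict;
use warnings;

open( my $log, '<', '/var/log/apache2/access.log' ) or die "open: $!";
while ( my $line = <$log> ) {

    # In the common/combined format the request path is the 7th
    # whitespace-separated field, and the response size the 10th
    # (which is "-" when no body was returned).
    my @fields = split( /\s+/, $line );
    my ( $path, $size ) = ( $fields[6], $fields[9] );

    next unless defined $size && $size =~ /^\d+$/;
    print "$path $size\n" if ( $size > 10000 );
}
close($log);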

| No comments

 

If you read the TV Guide, you don't need a TV

19 April 2008 21:50

So I've written a quick hack. A client-side filter/utility program for working against IMAP servers.

Consider it a general purpose system which is similar to Procmail, but applied after your remote machine has already done the sorting.

Here's a flavour:


<GMail>
  username somebody.like.me
  password yeah.right
</GMail>

<Folders>
  <livejournal>
        unread exec /usr/local/bin/notify "Livejournal Comment"
        mark read
  </livejournal>

  <inbox>
        mark read
  </inbox>

</Folders>

What does that do? It first of all logs into GMail with the given username and password, then selects two folders:

=livejournal/

For each unread message in the folder it runs the specified command with STDIN being the message body.

Then it marks each new message as "read".

=inbox/

This simple rule just marks all messages as read.

Why? Well, I have a bunch of folders on a bunch of GMail accounts and I don't pay attention to them - but some specific mails should result in an SMS being sent to me ... so I need to do something clever.

I'm sure with a bit of effort this could be made IMAP-server independent, and could have a more flexible matching system. The simplicity right now comes about primarily because I don't want to parse a config file.
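
For the curious, the core of the thing is only a handful of lines; here's a rough sketch of the livejournal-rule, assuming the Mail::IMAPClient module (the credentials and notify-command are obviously placeholders):

#!/usr/bin/perl
use strict;
use warnings;
use Mail::IMAPClient;

# Log into GMail over SSL; the credentials are placeholders.
my $imap = Mail::IMAPClient->new(
    Server   => "imap.gmail.com",
    User     => "somebody.like.me",
    Password => "yeah.right",
    Ssl      => 1,
) or die "IMAP connection failed: $@";

# For each unread message: pipe the body to the command, then mark it read.
$imap->select("livejournal") or die "select failed: " . $imap->LastError();
foreach my $msg ( $imap->unseen() ) {
    my $body = $imap->message_string($msg);

    open( my $cmd, "|-", "/usr/local/bin/notify", "Livejournal Comment" )
      or die "exec failed: $!";
    print $cmd $body;
    close($cmd);

    $imap->set_flag( "Seen", $msg );
}
$imap->logout();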

Anyway, suggestions for potential features are welcome. It does what I need as-is, even if it isn't pretty.

ObQuote: Lost Boys

| 2 comments

 

On the other side of the screen, it all looks so easy

20 April 2008 21:50

I've updated the IMAP utility that I mentioned previously, which has now been given the name sift. It will accept, and process, a much simpler configuration file format keeping state as it goes.

Here's my updated sample file:

username: blah.bah
password: pas.word

#
#  Comments are fine.
#
folder:livejournal status:new subject:temp mark:read exec:~/bin/notify
folder:foo status:new mark:read
folder:bar status:old exec:/usr/local/bin/record delete

Each line consists of a set of tokens, split by whitespace, which are "executed" in order.

So the first line selects the folder "livejournal", finds messages which are "new", then each message containing "temp" in the subject is marked as read, and the program "notify" is executed once for each match.

Essentially we keep a list of messages as "current" as we process each line; that list of messages is then refined as we move through the line. (When a folder is opened all messages are selected by default.)

As a simple example to delete all the messages contained in a folder we'd use this:

folder:foo delete

To refine that to only delete messages from "fred" we'd say:

folder:foo from:fred delete

(If there were no matches the "delete" action wouldn't occur.)

Consider each line of input a collection of filters each operating on the previous result. Simple to understand, simple to extend with more operations, and simple for me to code!
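
To make that concrete, here's a toy demonstration of the idea using in-memory messages rather than a real IMAP connection - this isn't sift's actual code, just the shape of the dispatch:

#!/usr/bin/perl
# Toy demonstration of the "each token refines the current list" idea,
# using in-memory messages rather than a real IMAP connection.
use strict;
use warnings;

my @messages = (
    { folder => "foo", from => "fred",  status => "new" },
    { folder => "foo", from => "wilma", status => "old" },
    { folder => "bar", from => "fred",  status => "new" },
);

# Each handler receives (argument, current-list) and returns a refined list.
my %handlers = (
    folder => sub { my ( $arg, $cur ) = @_; [ grep { $_->{folder} eq $arg } @messages ] },
    from   => sub { my ( $arg, $cur ) = @_; [ grep { $_->{from} eq $arg } @$cur ] },
    status => sub { my ( $arg, $cur ) = @_; [ grep { $_->{status} eq $arg } @$cur ] },
    delete => sub { my ( $arg, $cur ) = @_; print "deleting ", scalar @$cur, " message(s)\n"; $cur },
);

foreach my $line ( "folder:foo from:fred delete", "folder:bar status:old delete" ) {
    my $current = [];
    foreach my $token ( split( /\s+/, $line ) ) {
        my ( $verb, $arg ) = split( /:/, $token, 2 );
        $current = $handlers{$verb}->( $arg, $current );
    }
}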

TODO: Add a "move:xxx" to move a message to folder "xxx", and a bit more polish, then release.

ObQuote: Tron.

| No comments

 

I want reliable people, people who aren't going to be carried away

21 April 2008 21:50

OK I'm done with this now, the sift utility has been released.

I think there is a large overlap with imapfilter; but I win because I can write simple rules, rather than any actual code, to perform jobs.

 

In other news I flew my kite today, and I still like eating Pies: Thank God reading Debian Planet isn't mandatory.

ObQuote: The Godfather

| No comments

 

My ass is on fire! Spank my ass

15 August 2008 21:50

This has been a rather random few days.

  • I bought a steam-boat (not really; but close enough.)
  • I fell in love with mod_perl

For those of you who don't know it, mod_perl is an Apache module which embeds Perl into your webserver.

You can use this to write extensions, handlers, and all kinds of fun things in pure perl.

Me? I just changed the beefy CGI script that I use to power a couple of my sites from being plain-CGI to being mod_perl-CGI - that means:

  • The same perl engine & copy of my script stays in memory.
  • I don't need the fork()/exec() overhead for each incoming request.

The downside is that I have to "/etc/init.d/apache2 reload" if I change my script, or any of my custom modules it uses. (I suspect this is something I can fix; I just don't know how yet :)

All that was possible, with zero changes to my applications as I use the CGI::Application framework - lucky? or planned? I'll let you decide ...
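
For reference the Apache side is only a few lines of configuration; this is a sketch of the usual ModPerl::Registry approach (the alias and paths are placeholders, not my real setup):

# Load mod_perl, and pre-load the registry handler.
PerlModule ModPerl::Registry

# Run an existing CGI script persistently under mod_perl.
Alias /app/ /var/www/app/cgi-bin/
<Location /app/>
    SetHandler perl-script
    PerlResponseHandler ModPerl::Registry
    PerlOptions +ParseHeaders
    Options +ExecCGI
</Location>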

In terms of speedup I can now process about 100 requests a second, compared to 10, as reported by the Apache benchmark tool (ab). Cool.

ObFilm: The Mummy: Tomb of the Dragon Emperor

| 4 comments

 

Let me show you the way.

26 August 2008 21:50

So I got bored tonight and figured I'd write a game...

I'm genuinely not sure whether I've seen this concept before, or came up with it myself. I suspect the former. I know that I sat down with the intention of coding this game and knew how it would play and what the mechanics would be.

Having said that, though, I cannot think of a similar game I've played - though parts are obviously derivative.

Anyway the aim of the game:

  • A (single currently) ball bounces around the screen.
  • You draw lines upon the screen, using the mouse, to influence the movement of the ball.
  • The level (game) is over when the ball lands in the "exit box".

Thus far the game exists only in skeletal form with the minimum required functionality. There are two modes currently: "easy" & "hard". The hard mode was primarily added to prove to myself that the "leveling" system could work in a fun way.

Feedback welcome. Especially if it can tell me where I'm going wrong with the collision detection - but even if it is to critique my hacked-up SDL coding.

(The only other SDL coding I've done was in C, and was the mousetrap game.)
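
For what it's worth, the maths at the heart of the line/ball collision is just reflecting the velocity about the line's normal. A wee sketch in plain Perl (nothing SDL-specific, names invented):

#!/usr/bin/perl
# Reflect velocity (vx,vy) off a line segment from (x1,y1) to (x2,y2):
#   v' = v - 2 (v . n) n, where n is the unit normal of the line.
use strict;
use warnings;

sub reflect_velocity {
    my ( $vx, $vy, $x1, $y1, $x2, $y2 ) = @_;

    # Direction of the line, and its length.
    my ( $dx, $dy ) = ( $x2 - $x1, $y2 - $y1 );
    my $len = sqrt( $dx * $dx + $dy * $dy ) or return ( $vx, $vy );

    # Unit normal to the line.
    my ( $nx, $ny ) = ( -$dy / $len, $dx / $len );

    # v' = v - 2(v.n)n
    my $dot = $vx * $nx + $vy * $ny;
    return ( $vx - 2 * $dot * $nx, $vy - 2 * $dot * $ny );
}

my ( $vx, $vy ) = reflect_velocity( 3, 4, 0, 0, 10, 0 );
print "new velocity: $vx, $vy\n";    # a horizontal line flips the Y component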

Obviously the game is written in perl, and I admit nasty perl at that. To play it you'll only need:

apt-get install libsdl-perl

Code:

ObQuote: The Chronicles of Riddick

| 12 comments

 

We should just deal with nice people

16 November 2008 21:50

For various reasons I've recently been thinking about forums.

Many technical users dislike forums, because they are hard to follow. Even with RSS feeds etc. you need to keep a login and remember to return to see if your post(s) have been answered.

However non-technical users love forums, and from a community-building perspective they're very cheap and easy. Particularly if you manage to appoint moderators from within the community.

I currently find myself in a position where I'd kinda like to have a forum package. Something that I can integrate into an existing site easily.

Unfortunately most of the "best" forum packages are PHP-based, and have their own complex login, group, and admin facilities. That makes it hard to update them to authenticate against my existing MySQL table(s). (We'll leave my PHP-allergy in the background)

So, once more, I've been contemplating the bad route: creating my own forum software. I'm well aware that down that path lies badness and madness.

Let us recap. What is a forum?

  • A forum is an online site.
  • With a coarse list of topics.
  • Inside each topic is a list of threads.
  • Each thread is comprised of a number of (threaded) messages.

Sound familiar? It should if you use email:

  • ~/Maildir contains storage for a collection of mailboxes.
  • Each mailbox is a coarse list of topic-specific discussion.
  • Each topic is comprised of a number of (threaded) messages.

So, the unthinkable, could we convert (bi/uni-directionally?) from a Maildir hierarchy to an online forum?

Would that make sense? On the face of it. Yes.

There are implementation details - the forum index would be essentially a list of Maildir folders (perhaps "~/Maildir/topic1/.title" would be required to give it a pretty name).

Each thread topic would be a rendered display of the messages in the folder.
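
Building the index itself would be almost trivial; a crude sketch (paths hypothetical, and ignoring the .title files mentioned above):

#!/usr/bin/perl
# Build a crude "forum index" from a Maildir hierarchy: one topic per
# folder, with a message count from its new/ and cur/ subdirectories.
use strict;
use warnings;
use File::Basename qw( basename );

my $maildir = "$ENV{HOME}/Maildir";

foreach my $folder ( sort glob("$maildir/.*") ) {
    my $name = basename($folder);
    next if ( $name eq "." ) || ( $name eq ".." );
    next unless ( -d "$folder/cur" );    # only real Maildir folders

    my @msgs = ( glob("$folder/cur/*"), glob("$folder/new/*") );
    $name =~ s/^\.//;                    # ".livejournal" -> "livejournal"
    printf( "%-30s %d messages\n", $name, scalar @msgs );
}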

So, what are the drawbacks? Well, reading Maildir folders gives us threading, subjects, bodies, etc. But it does mean a fair bit of overhead parsing messages.

(Times like this I remember Hughe. Every time we've gotten together for beer & geekery the topic of an extensible perl-based IMAP server comes up. I'm sure it should be written ..)

I'll wrap this up now. I'm sure I've made the point. There are some details which have impact - Should the forum accept new posts online? Or only via gated email-delivery? Will it work? Should it be Maildir, or IMAP based? Still at least filtering your SPAM would be easy ;)

More questions. Some questions have no answers. Some answers we ignore because we don't like them.

I need to sleep.

ObFilm: 007: Quantum of Solace

Bad film. Don't waste your pennies.

| 13 comments

 

It is an army bred for a single purpose

9 February 2009 21:50

It is funny the way things work out when you're looking for help.

Recently I was working on a Ruby + FUSE based filesystem and as part of the development I added simple diagnostic output via trivial code such as this:

@debug && puts "called foo(#{param});"

That was adequate for minimal interactive use, but not so good for real live use. In real live use I started outputting messages to a dedicated logfile, but in practice I became overwhelmed by thousands of lines of output describing everything ever applied to the filesystem.

I figured the natural solution was to have a ring-buffer. (Everybody knows what a ringbuffer is, right?) It could keep the last 500 messages and newer debug information would just replace older entries. That'd be just enough to be useful if I had a problem, but not so overwhelming it would get ignored.

In Perl I found a nice ringbuffer library, but for Ruby nothing. Locking a region of shared memory via shmget/shmat and keeping an array of a few hundred strings would be simple, but it seems odd that I have to code this myself.

I started searching around and I accidentally stumbled upon the unrelated IPC::DirQueue perl module. Not useful for my ringbuffer logging problem, but beautifully useful.

There is no package for Debian but that was easily created:

dh-make-perl --build --cpan IPC::DirQueue

Already I have a million and one uses for it - not least to solve my problem of maintaining a centralised quarantine for all the spam mail rejected by N MX machines. (Which currently uses a combination of rsync and lockfiles.)
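
The usage really is pleasantly simple; a quick sketch of the quarantine idea (the queue directory and metadata are made up):

use strict;
use warnings;
use IPC::DirQueue;

my $dq = IPC::DirQueue->new( { dir => "/var/spool/quarantine-queue" } );

# On each MX: enqueue a rejected message (here just a placeholder string).
my $rejected_mail = "From: spammer\@example.com\n\nBuy stuff.\n";
$dq->enqueue_string( $rejected_mail, { mx => "mx1.example.com" } );

# On the central host: drain the queue.
while ( my $job = $dq->pickup_queued_job() ) {
    my $path = $job->get_data_path();    # file holding the queued data
    # ... archive the file into the central quarantine here ...
    $job->finish();
}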

This is the reason why sites like Perl Advent Calendar are useful - they introduce a useful module every day or two, and introduce you to things that you can use in the future.

Of course keeping a sustainable site like that up and running is hard, which is why sites like debaday struggle to attract contributors, for example.

Anyway, random happiness.

ObFilm: Lord of the rings: Two Towers

| 3 comments

 

Why do you keep torturing yourself?

9 July 2009 21:50

Recently I came to realise that my planning and memory skills weren't adequate to keeping track of what I want to do, and what I need to do.

For a while I've been fooling myself into thinking that "emacs ~/TODO" was a good way to keep track of tasks. It really isn't - especially if you work upon multiple machines throughout the week.

So I figured I needed something "always available", which these days mostly means an online application / website.

Over the years I've looked at many multi-user online "todo-list" applications, and inevitably they all suck. Mostly they suck because they're either too rigid or don't meet my particular way of working, living, and doing "stuff".

To my mind :

  • A todo-list should above all make it easy to add tasks.
    • If you cannot easily add tasks then you won't. So you'll never use it.
  • A task might be open or closed, but it will never be 23.55% complete.
  • A task might be "urgent" or not, but it will never be "urgent", "semi-urgent", "do soon", "do today".
  • A task might have many steps but they should either be added separately, or the steps noted in some notes.
    • I find the notion of making "task A" depend upon "task B" perilous.
  • A task belongs to one person. It cannot be moved, shared, or split.

Some of those things, such as subtasks and completion percentages, I guess are more applicable to project management software. Those and time-sheet applications really wind me up.

With my mini-constraints in mind I sketched out a couple of prototypes. The one I was expecting to use had a typical three-pane view:

[ Task Menu ]     |  Task1: Buy potatoes
                  |  Task2: Remember to check email
  All Tasks       |  Task3: Order more cake.
  Completed Tasks |------------------------------------
  Urgent Tasks    |
                  |  [Task Details here]
  Tags:           |
   * Work         |  [Urgent | Non-Urgent ]
   * Fun          |
   * Birthdays    |  [Close Task | Re-Open Task ]
   * Parties      |
   * Shopping     |  [Notes ..]
   ...

That turned out to be a pain to implement, and also a little unwieldy. I guess trying to treat a tasklist as a collection of email is a difficult proposition at best - but more on that in my next post.

So a quick rethink, and now I've come up with a simple but functional layout based upon the notions that:

  • Adding tasks must be almost too easy.
  • Most tasks only need a one-line description.
  • Adding tags is good. Because tasks cross boundaries.
  • Adding notes is good.
  • No task should ever be deleted - but chances are we don't actually wish to view tasks older than a month. We can, but they are hidden by default.
  • When a task is closed/completed it cannot be edited.
  • All tasks belong to their owner and are non-public.

So what I've got is a multi-user system which is essentially split into four behaviours: Adding a task, viewing open tasks, viewing closed tasks, and searching for tasks.

Tasks can be tagged, have notes added to them (but never deleted/edited) and over time closed tasks fade away (though they're never deleted).

Some of my constraints or assumptions might change over time, but so far I'm happy with them. (e.g. I can imagine tagging an entry "public" might make it appear visible to others.)

Anyway the code is (surprise) built using minimal perl & jquery and you can play with it:

The site contains a demo user which you can use. I don't much care if people wish to use it for real once it is more complete, but I expect that it will either be ignored completely or be the kind of thing you wish to self-host.

With that in mind the code is currently closed, but I'll add it to my mercurial repository soon. (Maybe even tonight!)

ObSubject: Dog Soldiers

| 8 comments

 

Thank you for coming back to me.

20 August 2009 21:50

I made a new release of the chronicle blog compiler today, and learned to hate the freshmeat.net website a little more.

The only real change is that now each compiled blog will receive a generated sitemap.xml file containing links to every output page. This will be useful for those folk that use real titles for their posts.
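
For anybody unfamiliar with the format, a sitemap is just a small XML file listing URLs; something along these lines (URLs invented):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://blog.example.com/archive/2009/08/</loc>
  </url>
  <url>
    <loc>http://blog.example.com/some_post_title.html</loc>
  </url>
</urlset>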

Nothing too much to report upon, although I noted with interest Antti-Juhani Kaijanaho's recent forum installation.

I love the idea of having a forum be a mere wrapper around a real transport system, which supports threading natively - but as I said almost a year ago I'd have done it using Mailing lists and/or Maildir folders....

ObFilm: Brief Encounter.

| No comments

 

But now that I have you in my custody, I may do with you what I please.

27 December 2009 21:50

I sketched out a quick prototype of a Kernel ChangeLog viewer:

Choose the kernel on the left, select the changelog summary at the top and the text is shown in the bottom pane.

I spend a fair amount of time reading kernel changelogs, and something like this (but with nice filtering and searching) would be useful. The only major problems I see are:

  • "Recent" changelog entries have one format, older ones have another.
  • You need to download a lot of changelog files locally for it to be useful.

Anyway if you follow kernels you might like the idea, if not the implementation. I look forward to seeing your improved version. (Doesn't free software rock? ;)

ObSubject: Aeon Flux

| 4 comments

 

That friend promises his undying friendship if you would do him a small favour.

17 June 2010 21:50

Perl & Apache?

Once upon a time, within the past year, I saw mention of a simpler version of mod_perl - an Apache module which lets you write code to run within the context of a persistent Perl process.

However my DuckDuckGofu is weak, and I'm struggling to find this project.

Did I dream it, or could somebody tell me where it lives?

Dynamic Picture Frames

So I've been taking pictures recently. Lots of pictures.

Many times many images have been printed and hung upon my walls, and the price of frames is starting to become onerous.

I'd love to see some kind of "dynamic" picture wall - but the two alternatives I considered fail:

Metal & Magnets

Place a huge sheet of metal upon your wall. Then put wee magnets inside your frames.

Corkboard

Imagine a full wall that was paneled with what is essentially a large notice-board..

Both of these would look ugly; the metal one perhaps less so.

But the idea of having a wall which could have pictures mounted upon it, without leaving big nail holes if you rearrange them, and which could cope with dynamic repositioning and sizes, is nice ..

Invent it for me? I'll buy one. Probably even two...

ObFilm: The Godfather

| 21 comments

 

I am the edge!

23 June 2010 21:50

Over the past few years I've amassed a collection of a few thousand images taken with a succession of digital cameras.

I'm pretty good at organising images, in a directory hierarchy which makes sense to me, in a few simple and broad categories:

skx@birthday:~$ tree -L 1 ~/Images/
/home/skx/Images/
|-- Misc
|-- Parties
|-- People
|-- Pets & Animals
`-- Travel

Beneath ~/Images/People, for example, I have subdirectories for specific individuals (or a "Debian/" folder for Debian-people who've been snapped but don't warrant their own folder.)

~/Images/Travel has things like Travel/Local/2010, Travel/Vienna/2008, etc.

In summary I have images of people, places, and things stored beneath what should be a reasonably discoverable directory hierarchy; however, this just doesn't work. I still struggle to find images - for example images of myself might be located in ~/Images/People/Self/*, but in practice I'm often included in ~/Images/Travel/* as well.

A few times I've looked at using f-spot, digikam, and similar tools to perform image-organisation (but not editing, or timelines, or anything else. Just organisation). I found I didn't like being locked into their formats, didn't want them to copy my images to a second location, and had other gripes. In the end I've forced myself to come up with a Steve-Specific-Solution. Not for the first time, but I think I have just cause...

I'm now using the User-Comment field in the image's EXIF data to store tags. (When it comes to EXIF data I keep camera-generated fields, but sometimes update/set "Copyright", "Comment", and "Title" fields. So UserComment is one I've never used until now, and thus I run no risk of trashing existing meta-data.)
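
Reading and writing that field from Perl is straightforward with Image::ExifTool; a sketch (the filename is invented, and whether itag really stores tags space-separated is a guess):

use strict;
use warnings;
use Image::ExifTool;

my $file = "Images/Travel/Local/2010/castle.jpg";
my $exif = Image::ExifTool->new();

# Read any existing tags from the UserComment field.
my $info = $exif->ImageInfo( $file, "UserComment" );
my @tags = split( /\s+/, $info->{UserComment} || "" );

# Append a tag and write it back.
push( @tags, "edinburgh" );
$exif->SetNewValue( "UserComment", join( " ", @tags ) );
$exif->WriteInfo($file);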

I've put together a simple perl script, called itag, which will:

  • Index the tag information from all images beneath ~/Images into a DBM file.
  • Show the filenames of all images matching a tag, or tags.
  • Allow me to add tag(s) to an image (which both updates the EXIF data and updates the DBM "cache").

This is enough for me to be able to see all images of "Edinburgh", via:

~$ qiv --fullscreen --slide --delay 5 $(itag --search=edinburgh)

Similarly I could find myself:

~$ itag --search=steve --search=people

I'm not sure it is useful to others, mostly on the basis that people probably fall into their own routine when it comes to filing, and I suspect that people with vast collections of images will just get annoyed by the obscenely slow indexing process I've got. (Hint: run "exiftool" on every /\.jpe?g$/i file..)

Still, it's a simple enough idea and I think it should scale in the future - I can even see myself writing a wee GUI to do tag exploration and similar. Just not today.

ObFilm: Aeon Flux

ObRandom: Apologies for people waiting on email - it's been that kind of week.

| 9 comments

 

Good morning, Bastian

25 June 2010 21:50

So previously I introduced the idea of my image-tagging system. There seemed to be at least a little interest. So here's a brief introduction and real update.

There is a command line tool, itag, which will index the UserComment field from a hierarchy of JPG files. (This field is compatible with digikam, by happy accident).

Additionally there are a pair of GUI tools, both very nasty in terms of code quality and extensibility:

itagview

This presents a list of all the tags which are found (by invoking "itag --tags"), and allows you to view thumbnails of all images with a single specific tag. Double-click to launch the image full-sized.

itagger

This is a GUI tool which will present thumbnails of all images beneath a given directory, recursively, and allow you to enter tags either on individual images, or on multiple ones.

This doesn't update the DBM cache file that itag uses though, so you'll want to re-run that afterward.

Anyway, enough pimping; if you like the sound of it visit the itag page. If you're optimistic, abhor reading, and just wanna play then there is an itag package for Lenny.

Patches welcome, especially to the nasty Gtk2 code...

ObFilm: The NeverEnding Story

| 1 comment

 

I'm a CPAN author.

23 July 2010 21:50

As of this morning I'm a published author on CPAN!

Thus far I have only a single module to my name, but that will most likely change in the future:

CGI::Session::Driver::redis

A module for storing (CGI) session data within a Redis database.

A while back I set up a dynamic website which was 100% Redis-backed, using my Redis backports for Lenny, and realised I needed somewhere to store the session data too. Hence this module.
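
Usage follows the standard CGI::Session driver-selection pattern; a sketch, with the caveat that I'm assuming the driver takes a connected Redis handle as its option - check the module's POD for the exact parameter name:

use strict;
use warnings;
use CGI;
use CGI::Session;
use Redis;

my $cgi   = CGI->new();
my $redis = Redis->new();    # localhost:6379 by default

# "driver:redis" resolves to CGI::Session::Driver::redis; the options
# hashref here is illustrative - see the module's documentation.
my $session = CGI::Session->new( "driver:redis", $cgi, { Redis => $redis } );

$session->param( "username", "steve" );
print $session->header();    # sends the session cookie for us
print "Hello, ", $session->param("username"), "\n";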

I'll create a .deb package of the module, and stick it alongside the redis server.

ObQuote: I like to keep this handy... for close encounters.

Aliens

| No comments

 

As promised a new blogspam.net

18 August 2010 21:50

A while back I mentioned that I was going to be updating and overhauling the blogspam.net service. That process is now almost complete. A couple of nights ago I overhauled the website, and today I've finally committed my last (planned) change to the repository for the purposes of migration. I started reworking the code a week or so ago, but as of this evening the code in the repository is the code the server is actually running.

The previous codebase was functional but a little hasty - and was implemented before I switched to per-UID server-hosting - so there was a need to clean things up and make sure permissions and similar niggles were checked.

The new, modular, codebase requires no root access, and will store all state (logs & transient caches) in a clean extensible fashion. The code is also much more flexible, making use of Module::Pluggable rather than Class::Pluggable. This allowed me to overhaul the API of the plugins (primarily to add an expire method such that each plugin has a well-defined means to expire any state it may maintain). Module::Pluggable is a great module - it allows me to treat plugins as first-class objects, which wasn't the case with C::P.
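
The plugin-loading side of that pattern is tiny; a sketch (the package names are made up, not the real blogspam ones):

package BlogSpam::Engine;

use strict;
use warnings;

# Find, and require, every module beneath BlogSpam::Plugin::*, and
# instantiate each one via its new() constructor.
use Module::Pluggable
  search_path => ['BlogSpam::Plugin'],
  require     => 1,
  instantiate => 'new';

sub new { return bless {}, shift }

# Ask every plugin to expire any state it holds.
sub expire_all {
    my ($self) = @_;
    foreach my $plugin ( $self->plugins() ) {
        $plugin->expire() if $plugin->can('expire');
    }
}

1;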

Since all the code behind the service is Perl it is also now available on CPAN in addition to the mercurial repository where it is developed..

I see that the server is getting pretty popular these days, used by the likes of embedders.org, publiclive.com, etc. It doesn't hurt that ikiwiki, identi.ca, and other people include support in their distributions these days. Me? I mostly use it on debian-administration.org where it does a great job.

ObQuote: What's the name of that thing that if I eat it real fast, it's free? - Whip It.

| 2 comments

 

Recently I've been working with flash

19 October 2010 21:50

Recently I've been producing simple Flash animations.

Mostly these are simple "Show an image, slide it around a bit, show another..". But I still feel vaguely unclean and non-free.

I started off using the SWF Perl binding, but soon realised that wasn't much fun. So I wrote a mini interpreter such that I can script creation:

#!/usr/bin/flash-scripter

# create the movie
create 640 480

# background == black
clear 0, 0, 0

# load image at 0,0
load 0, 0, foo.png

# move it about a bit
move 0, -1
move 0, -1
move 0, -1

# movie-time is over now.
stop

# finally save the movie
save foo.swf

# all done
exit

I see I'm not alone in doing such a thing, as swftools (not available for Debian) includes swfc, "a tool for creating SWF files from simple script files". Sadly swftools fails to build for me on Squeeze so I couldn't try it out.

My little tool is called SWF::Scripter and uses plugins to implement each "command" so I should probably upload it somewhere public. On the other hand it was a quick hack to produce a mini-story from a bunch of images with no complex transitions so I'm not sure it is worth the effort.

ObQuote: "I honestly think I'd give up smoking if he asked me." - Breakfast at Tiffany's

| 7 comments

 

How do you deploy applications?

30 May 2011 21:50

I've got a few projects which are hosted in mercurial repositories. To deploy them I manually check out the repository, create symlinks by hand, then update Apache/thttpd to make them work.

When I want to update my applications I manually become the correct user, find the repository and run "hg pull --update".

I think it is about time that I sat down and started doing things neatly. I made a start at this by writing a shell script for each site called .deploy, which I then drive like so:

#!/bin/sh
#
# ~/bin/deploy  execute the .deploy file associated with this project.
#
while true; do

    #
    #  If we're at the root directory we're done.
    #
    if [ $PWD = "/" ]; then
        echo "Reached /"
        exit
    fi

    # found our file?
    #
    if [ -x ".deploy" ]; then

       ./.deploy
       exit
    fi

    cd ..
done
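
An individual .deploy file then only needs the site-specific steps; something like this (paths, symlink, and service are entirely hypothetical):

#!/bin/sh
#
# .deploy - update this site from its mercurial repository.
#
set -e

cd /srv/www/example.com
hg pull --update

# Recreate the symlink and reload the webserver.
ln -sf /srv/www/example.com/htdocs /var/www/example.com
sudo /etc/init.d/apache2 reload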

It seems the main candidate is capistrano, which was previously very Ruby on Rails centric, but these days seems to be divorced from it.

Alternatively there is the Python-based fabric project, which has been stalled for two years; vlad the deployer (great name!), which is another Rake-based and thus Ruby-loving system; and finally whiskey disk, which is limited to Git-based projects as far as I can tell.

In short each of these projects is very similar, and each relies upon being able to do two things:

  • SSH to remote machine(s) and run a command.
  • Copy files to the remote command / pull a repository from a known location.

I've automated SSH before, and I've automated SCP/rsync. The hard part is doing both "copy" and "command" over one SSH channel - such that you don't get prompted for passwords multiple times - and handling the case of running sudo where appropriate. Still, most of the initial stages are trivial.

I wonder what project I should be using:

  • I like perl. Perl is good.
  • I use mercurial. Mercurial is good.
  • Rake is perhaps permissible, but too ruby-centric == not for me.

Anything I've missed? Or pointers to good documentation?

ObQuote: "We need to be a little more constructive here, okay? " - Terminator 2

| 11 comments

 

So I chose fabric and reported a bug..

6 June 2011 21:50

When soliciting opinions recently, I discovered that the Python-based fabric tool was not dead, and was in fact perfect for my needs.

During the process of getting acquainted with it I looked over the source code; it was mostly neat, but there was a trivial (low-risk) symlink attack present.

I reported that as #629003 & it is now identified more globally as CVE-2011-2185.

I guess this goes to show that getting into the habit of looking over source code when you install a new package is a worthwhile thing to do; and probably easier than organising a distribution-wide security audit </irony>.

In other news I'm struggling to diagnose a Perl segfault when running a search using the swish-e Perl modules. Could it be security-worthy? Possibly. Right now I just don't want my scripts to die when I attempt to search 20GB of syslog data. Meh.

ObQuote: "You're scared of mice and spiders, but oh-so-much greater is your fear that one day the two species will cross-breed to form an all-powerful race of mice-spiders who will immobilize human beings in giant webs in order to steal cheese. " - Spaced.

| No comments

 

Scriptable email clients

5 September 2011 21:50

This is just a quick post to remind myself in the morning, as soon as I've made it I intend to turn my computer off and leave it off until I can re-organize my office.

I've been using mutt for my email for the past few years. Nothing compares to the flexibility of procmail/sieve for organizing server-side mail, and then mutt is ideal for reading them.

With the addition of the mutt-patched sidebar mode you can even go for a few days before realizing you're not in a graphical environment. But one thing I do long for is the ability to execute scripts at various times.

Thus far I've not actually planned what I'd like to do, but as a starting point imagine being able to execute a hook when new mail arrives? Or when you send a message matching a pattern in some fashion?

There are some things out there, such as the various hacks which are designed to abort sending a message if you mention "See attachment" in a message body but fail to add one before sending the message. These hacks generally abuse the sendmail configuration such that they're extremely ad-hoc and hard to chain/nest.

I've mellowed out over the years and I have no interest in attempting to write a mail-client (though at the same time how hard can it be? Just restrict yourself to using inotify on ~/Maildir and offload delivery to exim and you're almost done? I guess the hard part is the UI, though I do like the mutt + sidebar layout. Write the whole thing in some scripty language?)
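
As a half-serious starting point, watching the Maildir for new deliveries is only a few lines with Linux::Inotify2; a sketch (the hook script is a placeholder):

#!/usr/bin/perl
use strict;
use warnings;
use Linux::Inotify2;

my $inotify = Linux::Inotify2->new()
  or die "Unable to create inotify object: $!";

# Fire a hook whenever a message is delivered to ~/Maildir/new.
$inotify->watch(
    "$ENV{HOME}/Maildir/new",
    IN_CREATE | IN_MOVED_TO,
    sub {
        my $event = shift;
        my $file  = $event->fullname;
        system( "$ENV{HOME}/bin/new-mail-hook", $file );    # placeholder hook
    }
);

# Block, dispatching callbacks as events arrive.
1 while $inotify->poll;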

I'll re-examine notmuch and gnus over the next week or two, but I suspect both will continue to disappoint in various ways.

Anyway, for the moment I'm just pondering. But threading is an obvious concern. Most current mutt hooks relate to the local folder, or the local message. If I were viewing a message in one directory and a new mail notification fired for a delivery to both ~/Maildir and ~/Maildir/.people.foo I'd need to either serialise them or thread them.

Ponder ponder.

In other news I've been doing more photography recently. Nothing cohesive except for my recent experiment with shooting a "street-girl" outdoors in falling light, but that was an interesting challenge and the results were sufficient to make me want to try shooting outdoors in an organized fashion again. (Some random images have been linked to from my wee twitter page.)

ObFilm: "She doesn't get eaten by the eels at this time " - The Princess Bride

| 5 comments

 

Slaughter is at the cross-roads

1 November 2011 21:50

There are many system administration and configuration management tools available, I've mentioned them in the past and we're probably all familiar with our pet favourites.

The "biggies" include CFEngine, Puppet, Chef, BFG2. The "minis" are largely in-house tools, or abuses of existing software such as fabric.

My own personal solution manages my home network, and three dedicated servers I pay for in various ways.

Currently I've been setting up some configuration "stuff" for a friend and I've elected to manage some of the setup with this system of my own, and I guess I need to decide what I'm going to do going forward.

slaughter is well maintained, largely by virtue of not doing too much. The things it does are genuinely useful and entirely sufficient to handle a lot of the common tasks - and because the server-side requirement is an HTTP server, and the only client-side requirement is cron, it is trivial to deploy.

In the past I've thought of three alternatives that would make it more complex:

  • Stop using HTTP and have a mini-daemon to both serve and schedule.
  • Stop using HTTP and use rsync instead.
  • Rewrite it in Javascript. (Yes, really).

Each approach has its appeal. I like the idea of only executing GPG-signed policies, and that would be trivial if there were a real server in place. It could also use SSL, because that's all you need for security (ha!).

On the other hand using rsync allows me to trivially implement the only missing primitive I actually miss at times - the ability to recursively download and install a remote directory tree. (I solve this problem by downloading a .tar file and unpacking it. Not good. Doesn't cope with template expansion and is fiddlier than I like).

In the lifetime of the project I think I've had 20-50 feature requests or comments, which suggests it might actually be used by 50-100 people. (Ha! Optimism)

In the meantime I'll keep a careful eye on the number of people who download the tarball & the binary packages...

ObQuote: "I have vermin to kill. " - Kill Bill

| 2 comments

 

Two minor toys ..

23 February 2014 21:50

Two minor things:

graphite_send

A simple shell-script to submit metrics to a graphite server, extensible via local plugins, but covers the obvious metrics by default.

Metrics are submitted via simple calls to netcat.
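
The carbon plaintext protocol really is that simple; this is roughly what the script boils down to (the hostname is a placeholder, 2003 is the usual plaintext port):

# carbon speaks "metric value unix-timestamp", one line per metric;
# -q1 makes netcat close the connection once stdin is drained.
echo "servers.$(hostname -s).load.1min $(cut -d' ' -f1 /proc/loadavg) $(date +%s)" \
    | nc -q1 graphite.example.com 2003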

Trivial, but much more lightweight than collectd and similar.

HTML::Emoji

A perl module for converting HTML like "<p>:smile:</p>" into something graphical.

This was written for my markdown sharing site, but is pretty fun.

The konami-code page demonstrates usage.

(This parses the HTML so it won't transform attributes, ids, or anything that isn't in the "text" part of any HTML input.)

The graphite sending script is perhaps the most useful, but at the same time it feels too small to be a package of its own. I'm tempted to bundle it up into my sysadmin-util collection, but I can't quite decide if it belongs there either.

| 2 comments

 

So I bought some new hardware, for audio purposes.

6 March 2014 21:50

This week I received a logitech squeezebox radio, which is basically an expensive toy that allows you to listen to either "internet radio", or music streamed from your own PC via a portable device that accesses the network wirelessly.

The main goal of this purchase was to allow us to listen to media stored on a local computer in the bedroom, or living-room.

The hardware scans your network looking for a media server, so the first step is to install that:

The media-server has a couple of open ports; one for streaming the media, and one for a user-browsable HTML interface. Interestingly the radio-device shows up in the web-interface, so you can mess around with the currently loaded playlist from your office, while your wife is casually listening to music in the bedroom. (I'm not sure if that's a feature or not yet ;)

Although I didn't find any alternative server-implementations I did find a software-client which you can use to play music from the central server - slimp3slave - and again you can push playlists, media, etc, to this.

My impressions are pretty positive; the device was too expensive, certainly I wouldn't buy two, but it is functional. The user-interface is decent, and the software being available and open is a big win.

Downsides? No remote-control for the player, because paying an additional £70 is never going to happen, but otherwise I can't think of anything.

(Shame the squeezebox product line seems to have been cancelled (?))

Procmail Alternatives?

Although I did start hacking a C & Lua alternative, it looks like there are enough implementations out there that I don't feel so strongly any more.

I'm working in a different way to most people: rather than sorting mails at delivery time I'm going to write a trivial daemon that will just watch ~/Maildir/.Incoming, and move mails out of there. That means that no errors will cause mail to be lost at SMTP/delivery time.

I'm going to base my work on Email::Filter since it offers 90% of the primitives I want. The only missing thing is the ability to filter mails via external commands which has now been reported as a bug/omission.
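
The rough shape of that daemon, as I imagine it; a sketch using Email::Filter (folders and rules invented, and a real version would use inotify rather than a sweep):

#!/usr/bin/perl
# Sweep ~/Maildir/.Incoming and refile each message with Email::Filter.
use strict;
use warnings;
use Email::Filter;

my $incoming = "$ENV{HOME}/Maildir/.Incoming";

foreach my $file ( glob("$incoming/new/*"), glob("$incoming/cur/*") ) {

    # Slurp the raw message.
    open( my $fh, '<', $file ) or next;
    my $text = do { local $/; <$fh> };
    close($fh);

    my $mail = Email::Filter->new( data => $text );
    $mail->exit(0);    # don't exit after delivery; we're in a loop

    if ( $mail->from() =~ /lists\.debian\.org/ ) {
        $mail->accept("$ENV{HOME}/Maildir/.debian/");
    }
    else {
        $mail->accept("$ENV{HOME}/Maildir/");
    }

    unlink($file);     # delivered elsewhere, so drop the original
}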

| 10 comments

 

Time to get back to my roots: Perl

7 March 2014 21:50

Today I wrote a perl Test::RemoteServer module:

#!/usr/bin/perl -w -I.

use strict;
use warnings;

use Test::More tests => 4;
use Test::RemoteServer;

#
#  Ping Tests
#
ping_ok( "192.168.0.1",       "Website host is up: IPv4" );
ping6_ok( "www.steve.org.uk", "Website host is up: IPv6" );

#
#  Socket tests
#
socket_open( "ipv4.steve.org.uk", "2222", "OpenSSH is running" );
socket_closed( "ipv4.steve.org.uk", "22", "OpenSSH is not available on :22" );

I can see a lot of value in defining tests that are carried out against remote hosts - even if they're more basic than the kind of comprehensive testing you'd get via Custodian, Nagios, etc.

Being able to run "make test" and remotely probe services is cool.

Unfortunately I suspect the new-hotness is to couple the testing with your Chef, Puppet, CFengine, Slaughter, Ansible, etc, policies. That way you have two things:

  • A consistent way to define system-state.
  • A consistent way to test that the damn thing worked.

It's coming to CPAN in the near future anyway, but I can throw it up on GitHub in advance if there is any interest.

| 4 comments

 

So I failed at writing some clustered code in Perl

24 March 2014 21:50

Until this time next month I'll be posting code-based discussions only.

Recently I've been wanting to explore creating clustered services, because clusters are definitely things I use professionally.

My initial attempt was to write an auto-clustering version of memcached, because that's a useful tool. Writing the core of the service took an hour or so:

  • Simple KeyVal.pm implementation.
  • Give it the obvious methods get, set, delete.
  • Make it more interesting by creating a read-only append-log.
  • The logfile will be replayed for clustering.

At the point I was done the following code worked:

use KeyVal;

# Create an object, and set some values
my $obj = KeyVal->new( logfile => "/tmp/foo.log" );
$obj->incr( "steve" );
$obj->incr( "steve" );

print $obj->get( "steve" );      # prints 2.

# Now replay the append-only log
my $replay = KeyVal->new( logfile => "/tmp/foo.log" );
$replay->replay();

print $replay->get( "steve" );   # prints 2.

In the first case we used the primitives to increment a value twice, and then fetch it. In the second case we used the logfile the first object created to replay all prior transactions, then output the value.

Neat. The next step was to make it work over a network. Trivial.

Finally I wanted to autodetect peers, and deploy replication. Each host would send out regular messages along the lines of "Do you have updates made since $time?". Any that did would replay the logfile from the given unixtime offset.

However here I ran into problems. Peer discovery was supposed to be basic, and I figured I'd write something that did leader election by magic. Unfortunately Perl's threading code is .. unpleasant:

  • I wanted to store all known-peers in a singleton.
  • Then I wanted to create threads that would announce and receive updates.

This failed. Majorly. Because you cannot launch the implementation of a class-method as a thread. Equally you cannot make a variable which is "complex" shared across threads.

I wrote some demo code which works without packages and a shared singleton:

The Ruby version, by contrast, is much more OO and neater. Meh.

I've now shelved the project.

My next big task was to make the network service utterly memcached-compatible. That would have been fiddly, but not impossible. Right now I just use a simple line-based network protocol.

I suspect I could have got what I wanted using EventMachine, or similar, but that's a path I've not yet explored, and I'm happy enough with that decision.

| 2 comments

 

I'm still not a developer, but ..

10 June 2014 21:50

Some coding updates:

My templer static site generator has now been uploaded to CPAN, and is available as App::Templer.

I've converted most of my Dockerfiles to work with docker 1.0.0, which is nice.

I also hacked up a fun DNS-server for sharing JSON-encoded data, within a LAN or other environment:

Finally I updated the blogspam-detecting site a little, on the back-end. The code is now running inside Docker containers which means I can redeploy more easily in the future.

My blog post about looking for a job received some attention via a Reddit advert I posted to /r/edinburgh + /r/sysadmin, but thus far has mostly resulted in people wanting me to write code for them .. which is frustrating.

For the moment I'm working on a fun challenge involving (email) spam-detection. That takes me back.

| 2 comments

 

Accidental data-store ..

18 May 2016 21:50

A few months back I was looking over a lot of different object-storage systems, giving them mini-reviews, and trying them out in turn.

While many were overly complex, some were simple. Simplicity is always appealing, providing it works.

My review of camlistore was generally positive, because I like the design. Unfortunately it also highlighted a lack of documentation about how to use it to scale, replicate, and rebalance.

How hard could it be to write something similar, while paying attention to keeping it as simple as possible? Well, perhaps it was too easy.

Blob-Storage

First of all we write a blob-storage system. We allow three operations to be carried out:

  • Retrieve a chunk of data, given an ID.
  • Store the given chunk of data, with the specified ID.
  • Return a list of all known IDs.

 

API Server

We write a second server that consumers actually use, though it is implemented in terms of the blob-storage server listed previously.

The public API is trivial:

  • Upload a new file, returning the ID which it was stored under.
  • Retrieve a previous upload, by ID.

 

Replication Support

The previous two services are sufficient to write an object storage system, but they don't necessarily provide replication. You could add immediate replication; an upload of a file could involve writing that data to N blob-servers, but in a perfect world servers don't crash, so why not replicate in the background? You save time if you only save uploaded-content to one blob-server.

Replication can be implemented purely in terms of the blob-servers (a rough sketch follows the list):

  • For each blob server, get the list of objects stored on it.
  • Look for that object on each of the other servers. If it is found on N of them we're good.
  • If there are fewer copies than we like, then download the data, and upload to another server.
  • Repeat until each object is stored on a sufficient number of blob-servers.
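
Here's that sketch - a toy, in-memory version of the replication pass, with the three blob-server operations simulated by a hash rather than real HTTP calls:

#!/usr/bin/perl
use strict;
use warnings;

# Simulated blob-servers: server-name -> { id -> data }.
my %STORE = (
    blob1 => { aaa => "data-a", bbb => "data-b" },
    blob2 => { aaa => "data-a" },
    blob3 => {},
);
my $wanted = 2;    # minimum number of copies of each object

sub blob_list  { my ($srv) = @_; keys %{ $STORE{$srv} } }
sub blob_fetch { my ( $srv, $id ) = @_; $STORE{$srv}{$id} }
sub blob_store { my ( $srv, $id, $data ) = @_; $STORE{$srv}{$id} = $data }

# Which servers hold which object?
my %locations;
foreach my $server ( keys %STORE ) {
    push( @{ $locations{$_} }, $server ) for blob_list($server);
}

# Copy under-replicated objects onto servers which lack them.
foreach my $id ( keys %locations ) {
    my %has    = map { $_ => 1 } @{ $locations{$id} };
    my $copies = scalar keys %has;
    my $data   = blob_fetch( $locations{$id}[0], $id );

    foreach my $target ( grep { !$has{$_} } keys %STORE ) {
        last if ( $copies >= $wanted );
        blob_store( $target, $id, $data );
        $copies++;
    }
}

print "$_ now holds: ", join( ",", sort keys %{ $STORE{$_} } ), "\n" for sort keys %STORE;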

 

My code is reliable, the implementation is almost painfully simple, and the only difference in my design is that rather than having an API-server which allows both "uploads" and "downloads" I split it into two - that means you can leave your "download" server open to the world, so that it can be useful, and your upload-server can be firewalled to only allow a few hosts to access it.

The code is perl-based, because Perl is good, and available here on github:

TODO: Rewrite the thing in #golang to be cool.

| 1 comment

 

A simple Perl alternative to storing data in Redis

16 December 2016 21:50

I continue to be a big user of Perl, and for many of my sites I avoid the use of MySQL which means that I largely store data in flat files, SQLite databases, or in memory via Redis.

One of my servers was recently struggling with RAM, and the surprising cause was "too much data" in Redis. (Surprising because I'd not been paying attention and hadn't noticed how popular it had become, and also because ASCII text compresses pretty well.)

Read/Write speed isn't a real concern, so I figured I'd move the data into an SQLite database, but that would require rewriting the application.

The client library for Perl is pretty awesome, and simple usage looks like this:

# Connect to localhost.
my $r = Redis->new();

# simple storage
$r->set( "key", "value" );

# Work with sets
$r->sadd( "fruits", "orange" );
$r->sadd( "fruits", "apple" );
$r->sadd( "fruits", "blueberry" );
$r->sadd( "fruits", "banannanananananarama" );

# Show the set-count
print "There are " . $r->scard( "fruits" ) . " known fruits";

# Pick a random one
print "Here is a random one " . $r->srandmember( "fruits" ) . "\n";

I figured, if I ignored the Lua support and the other more complex operations, creating a compatible API implementation wouldn't be too hard. So rather than porting my application to use SQLite directly I could just use a different client-library.

In short I change this:

use Redis;
my $r = Redis->new();

To this:

use Redis::SQLite;
my $r = Redis::SQLite->new();

And everything continues to work. I've implemented all the set-related functions except one, and a random smattering of the other simple operations.
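
To give a flavour - and this is not necessarily how Redis::SQLite does it internally, just the obvious mapping - a set can become a two-column table:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect( "dbi:SQLite:dbname=/tmp/sets.db", "", "",
    { RaiseError => 1 } );
$dbh->do("CREATE TABLE IF NOT EXISTS sets (k TEXT, member TEXT, UNIQUE(k, member))");

# sadd: insert the member unless it is already present.
sub sadd {
    my ( $key, $member ) = @_;
    $dbh->do( "INSERT OR IGNORE INTO sets (k, member) VALUES (?, ?)",
        undef, $key, $member );
}

# smembers: return every member of the named set.
sub smembers {
    my ($key) = @_;
    my $rows = $dbh->selectcol_arrayref( "SELECT member FROM sets WHERE k = ?",
        undef, $key );
    return @$rows;
}

sadd( "fruits", $_ ) for qw( orange apple blueberry );
print "There are ", scalar smembers("fruits"), " known fruits\n";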

The appropriate test-cases in the Redis client library (i.e. removing all references to things I didn't implement) pass, and my own new tests also make me confident.

It's obviously not a hard job, but it was a quick solution to a real problem and might be useful to others.

My image hosting site, and my markdown sharing site now both use this wrapper and seem to be performing well - but with more free RAM.

No doubt I'll add more of the simple primitives as time goes on, but so far I've done enough to be useful.

| No comments

 

BlogSpam.net repository cleanup, and email-changes.

3 December 2017 21:50

I've shuffled around all the repositories which are associated with the blogspam service, such that they're all in the same place and refer to each other correctly:

Otherwise I've done a bit of tidying up on virtual machines, and I'm just about to drop the use of qpsmtpd for handling my email. I've used the (perl-based) qpsmtpd project for many years, and documented how my system works in a "book":

I'll be switching to a pure exim4-based setup later today, and we'll see what that does. So far today I've received over five thousand spam emails:

  steve@ssh /spam/today $ find . -type f | wc -l
  5731

Looking more closely, though, over half of these rejections are "dictionary attacks", so they're not SPAM I'd see if I dropped the qpsmtpd layer. Here's a sample log entry (for a mail that was both rejected at SMTP-time by qpsmtpd and archived to disc in case of error):

   {"from":"<[email protected]>",
    "helo":"adrian-monk-v3.ics.uci.edu",
    "reason":"Mail for juha not accepted at steve.fi",
    "filename":"1512284907.P26574M119173Q0.ssh.steve.org.uk.steve.fi",
    "subject":"Viagra Professional. Beyond compare. Buy at our shop.",
    "ip":"2a00:6d40:60:814e::1",
    "message-id":"<[email protected]>",
    "recipient":"[email protected]",
    "host":"Unknown"}

I suspect that with procmail piping to crm114, and a beefed-up spam-checking configuration for exim4, I'll not see a significant difference, and I'll have removed something non-standard. For what it is worth, over 75% of the remaining junk which was rejected at SMTP-time was rejected via DNS-blacklists, so again exim4 will take care of that for me.
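
The exim4 side of that is a couple of lines in the RCPT ACL; a sketch (the blacklists shown are examples, not necessarily the ones I'll use):

# Inside the RCPT ACL (acl_check_rcpt in Debian's default layout):
deny
  message  = $sender_host_address is listed in $dnslist_domain
  dnslists = zen.spamhaus.org : bl.spamcop.net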

If it turns out that I'm getting inundated with junk-mail I'll revert this, but I suspect that it'll all be fine.

| 1 comment

 

Rewriting some services in golang

30 March 2018 10:00

The past couple of days I've been reworking a few of my existing projects, and converting them from Perl into Golang.

Bytemark had a great alerting system for routing alerts to different engineers, via email, SMS, and chat-messages. The system is called mauvealert and is available here on github.

The system is built around the notion of alerts which have different states (such as "pending", "raised", or "acknowledged"). Each alert is submitted via a UDP packet getting sent to the server with a bunch of fields:

  • Source IP of the submitter (this is implicit).
  • A human-readable ID such as "heartbeat", "disk-space-/", "disk-space-/root", etc.
  • A raise-field.
  • More fields here ..

Each incoming submission is stored in a database, and events are considered unique based upon the source+ID pair, such that if you see a second submission from the same IP, with the same ID, then any existing details are updated. This update-on-receive behaviour is pretty crucial to the way things work, especially when coupled with the "raise"-field.

A raise field might have values such as:

  • +5m
    • This alert will be raised in 5 minutes.
  • now
    • This alert will be raised immediately.
  • clear
    • This alert will be cleared immediately.

One simple way the system is used is to maintain heartbeat-alerts. Imagine a system sends the following message, every minute:

  • id:heartbeat raise:+5m [source:1.2.3.4]
    • The first time this is received by the server it will be recorded in the database.
    • The next time this is received the existing event will be updated, and crucially the time to raise an alert will be bumped (i.e. it will become current-time + 5m).
    • The next time the update is received the raise-time will also be bumped
    • ..

At some point the submitting system crashes, and five minutes after the last submission the alert moves from "pending" to "raised" - which will make it visible in the web-based user-interface, and also notify an engineer.

With this system you could easily write trivial and stateless ad-hoc monitoring scripts like this, which would raise or clear alerts:

 curl https://example.com && send-alert --id http-example.com --raise clear --detail "site ok" || \
  send-alert  --id http-example.com --raise now --detail "site down"

In short mauvealert allows aggregation of events, and centralises how/when engineers are notified. There's the flexibility to look at events and send them to different people at different times of the day, decide some are urgent and must trigger SMSs, and some are ignorable and just generate emails.

(In mauvealert this routing is done by having a configuration file containing Ruby which attempts to match events, so you could do things like say "if the event-id contains 'failed-disc' then notify a DC person, or if the event was raised from $important-system then notify everybody".)

I thought the design was pretty cool, and wanted something similar for myself. My version, which I set up a couple of years ago, was based around HTTP+JSON, rather than UDP-messages, and written in Perl:

The advantage of using HTTP+JSON is that writing clients to submit events to the central system could easily and cheaply be done in multiple environments for multiple platforms. I didn't see the need for the efficiency of using binary UDP-based messages for submission, given that I have ~20 servers at the most.

Anyway the point of this blog post is that I've now rewritten my simplified personal-clone as a golang project, which makes deployment much simpler. Events are stored in an SQLite database and when raised they get sent to me via pushover:

The main difference is that I don't allow you to route events to different people, or notify via different mechanisms. Every raised alert gets sent to me, and only me, regardless of time of day. (Albeit via a pluggable external process such that you could add your own local logic.)

I've written too much already, getting sidetracked by explaining how neat mauvealert (and by extension purple) is, but I also rewrote the Perl DNS-lookup service at https://dns-api.org/ in golang:

That had a couple of regressions which were soon reported and fixed by a kind contributor (lack of CORS headers, most obviously).

| 2 comments