
 

Entries tagged sysadmin

The plans you refer to will soon be back in our hands.

22 August 2009 21:50

Many of us use rsync to shuffle data around, either to maintain off-site backups, or to perform random tasks (e.g. uploading a static copy of your generated blog).

I use rsync in many ways myself, but the main thing I use it for is to copy backups across a number of hosts. (Either actual backups, or stores of Maildirs, or similar.)

Imagine you back up your MySQL database to a local system, and you keep five days of history in case of accidental error or deletion. Chances are that you'll have something like this:

/var/backups/mysql/0/
/var/backups/mysql/1/
/var/backups/mysql/2/
/var/backups/mysql/3/
/var/backups/mysql/4/

(Here I guess it is obvious that you back up to /mysql/0, after rotating the contents of 0->1, 1->2, 2->3, & 3->4.)
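
In shell that rotation is only a few lines; a minimal sketch, assuming the layout above (the dump command is an assumption too):

# Rotate: drop the oldest copy, shuffle the rest up, dump into 0/.
rm -rf /var/backups/mysql/4
for i in 3 2 1 0; do
    mv /var/backups/mysql/$i /var/backups/mysql/$((i+1))
done
mkdir -p /var/backups/mysql/0
mysqldump --all-databases > /var/backups/mysql/0/dump.sql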

Now consider what happens when that rotation happens and you rsync to an off-site location afterward: you're copying far more data around than you need to, because each directory will have different content every day.

To solve this I moved to storing my backups in directories such as this:

/var/backups/mysql/09-03-2009/
/var/backups/mysql/10-03-2009/
/var/backups/mysql/11-03-2009/
..

This probably simplifies the backup process a little too: just back up to $(date +%d-%m-%Y), after removing any directory older than four days.

Imagine you rsync now: the contents of previous days won't change at all, so you'll end up moving significantly less data around.
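
A minimal sketch of the date-based scheme, including the off-site sync (paths, dump command, and the remote host are all assumptions):

# Dump into a directory named for today, prune anything older
# than four days, then sync the whole tree off-site.
DIR=/var/backups/mysql/$(date +%d-%m-%Y)
mkdir -p "$DIR"
mysqldump --all-databases > "$DIR/dump.sql"
find /var/backups/mysql -mindepth 1 -maxdepth 1 -type d -mtime +4 -exec rm -rf {} +
rsync -az /var/backups/mysql/ offsite.example.com:/backups/mysql/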

This is a deliberately contrived and simple example, but it also applies to common everyday logfiles such as /var/log/syslog, syslog.1, syslog.2.gz etc.

For example on my systems qpsmtpd.log is huge, and my apache access.log files are also very large.

Perhaps food for thought? One of those things that is obvious when you think about it, but doesn't jump out at you unless you schedule rsync to run very frequently and notice that it doesn't work as well as it "should".

ObFilm: Star Wars. The Family Guy version ;)


 

Slaughter is at the cross-roads

1 November 2011 21:50

There are many system administration and configuration management tools available; I've mentioned them in the past, and we're probably all familiar with our pet favourites.

The "biggies" include CFEngine, Puppet, Chef, BFG2. The "minis" are largely in-house tools, or abuses of existing software such as fabric.

My own personal solution manages my home network, and three dedicated servers I pay for in various ways.

Recently I've been setting up some configuration "stuff" for a friend, and I've elected to manage some of the setup with this system of my own, so I guess I need to decide what I'm going to do going forward.

slaughter is well maintained, largely by virtue of not doing too much. The things it does are genuinely useful and entirely sufficient to handle a lot of the common tasks - and because the server-side requirement is an HTTP server, and the only client-side requirement is cron, it is trivial to deploy.
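
To illustrate how small that client-side footprint is: deployment can be a single cron entry. The path and schedule here are assumptions, as is the flag-free invocation (I'm assuming the server details live in slaughter's configuration file):

# /etc/cron.d/slaughter - a sketch, not the real deployment.
# Fetch and apply policies twice an hour.
*/30 * * * * root /usr/bin/slaughter >/dev/null 2>&1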

In the past I've thought of three alternatives that would make it more complex:

  • Stop using HTTP and have a mini-daemon to both serve and schedule.
  • Stop using HTTP and use rsync instead.
  • Rewrite it in JavaScript. (Yes, really.)

Each approach has its appeal. I like the idea of only executing GPG-signed policies, and that would be trivial if there was a real server in place. It could also use SSL, because that's all you need for security (ha!).

On the other hand, using rsync would let me trivially implement the one primitive I actually miss at times - the ability to recursively download and install a remote directory tree. (I currently solve this problem by downloading a .tar file and unpacking it. Not good. It doesn't cope with template expansion, and it is fiddlier than I like.)
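
For illustration, that primitive is a one-liner with rsync; the hostname, module, and paths here are invented:

# Mirror a remote policy tree into place: -a recurses and preserves
# permissions, -z compresses, --delete removes stale local files.
rsync -az --delete rsync://policy.example.com/slaughter/files/ /etc/slaughter/files/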

In the lifetime of the project I think I've had 20-50 feature requests or comments, which suggests it might actually be used by 50-100 people. (Ha! Optimism)

In the meantime I'll keep a careful eye on the number of people who download the tarball & the binary packages...

ObQuote: "I have vermin to kill." - Kill Bill


 

So I removed some more software from my host

3 December 2011 21:50

Today I was idly performing some maintenance upon one of my hosts, and it crossed my mind to look beneath /etc. In there I found:

/etc/python
/etc/python2.4
/etc/python2.5
/etc/python2.6

That made me look more closely at the contents of /etc - the following command output was surprising:

steve@steve:~$ ls /etc | wc -l
187

Is that average? Heavy? Light? I have no idea, but I purged a hell of a lot of software today. Now I have only Python 2.6, although for some reason I still have:

python
python-apt
python-apt-common
python-central
python-minimal
python-support
python2.6
python2.6-minimal

I suspect I could drop the python2.6-minimal package, but for the moment I'm done. I have to make pretty people look exceptional with my magical camera.

Anyway, as part of this cleanup I ran a quick sanity-check on which processes are running, and I think that, short of kernel processes, I'm as minimal as I can be. I understand the purpose and reason for every running service:

UID        PID  CMD
root         1  init [2]
pdnsd    14091  /usr/sbin/pdnsd --daemon -p /var/run/pdnsd.pid
root     14199  /usr/sbin/monit -c /etc/monit/monitrc -s /var/lib/monit/monit.state
root     14206  /usr/sbin/syslog-ng -p /var/run/syslog-ng.pid
root     14234  /usr/sbin/cron
102      14595  /usr/sbin/exim4 -bd -q30m
redis    14627  /usr/bin/redis-server /etc/redis/redis.conf
root     14637  /usr/sbin/sshd

These are basic services; I use monit to ensure those essential daemons keep running. The only oddity there is probably the local DNS cache, but it is useful if you run any kind of DNS blacklist-using service, for example.

root     14794  /sbin/getty -L ttyS0 9600 vt100

I need a serial console login for emergencies.

root     14796  runsv node-reverse-proxy
root     14797  /bin/sh ./run
root     14799  /opt/node/bin/node node-reverse-proxy.js --config ./rewrites.js

These three processes combine to run my reverse proxy which routes incoming HTTP requests to a number of local thttpd instances.
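
The run script that runsv supervises is tiny. A sketch consistent with the process listing above (the real file may differ slightly):

#!/bin/sh
# run - executed by runsv from the service directory; exec replaces
# the shell so runsv supervises the node process directly.
exec /opt/node/bin/node node-reverse-proxy.js --config ./rewrites.js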

qpsmtpd  27309    /usr/bin/perl -Tw /usr/bin/qpsmtpd-prefork --port 25 --user qpsmtpd --pid-file /var/run/qpsmtpd/qpsmtpd.pid --detach
..

The Perl SMTP daemon which handles my incoming mail, passing it to exim4, which listens upon 127.0.0.1:2525. You can read about my setup in the out-of-date writeup Chris & I put together.

 /usr/bin/memcached -m 64 -p 11211 -u root -l 127.0.0.1

Memory cache for transient items.

s-blog    thttpd -C /etc/thttpd/sites.enabled/blog.steve.org.uk
1030      thttpd -C /etc/thttpd/sites.enabled/edinburgh-portraits.com
s-hg      thttpd -C /etc/thttpd/sites.enabled/hg.steve.org.uk
s-ipv4    thttpd -C /etc/thttpd/sites.enabled/ipv4.steve.org.uk
s-ipv6    thttpd -C /etc/thttpd/sites.enabled/ipv6.steve.org.uk
s-kvm     thttpd -C /etc/thttpd/sites.enabled/kvm-hosting.org
...

One thttpd instance is launched for each distinct HTTP site my server runs. Each site runs under its own UID, with its own chrooted directory tree. This is important for security.

Each local instance listens upon 127.0.0.1 - and the reverse proxy previously mentioned rewrites connections to the appropriate one.
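
For the curious, a per-site configuration file might look like this sketch - the site, port, and directory are invented, and the keys mirror thttpd's command-line flags:

# /etc/thttpd/sites.enabled/example.com - a sketch, not a real file.
host=127.0.0.1
port=8001
dir=/srv/example.com
chroot
user=s-example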

1016     28812     /usr/bin/perl -I./lib/ -I./ /usr/local/bin/blogspam

My anti-spam filter for blog comments.

Here is my Christmas challenge: can you identify each service upon your host? Do you know why you're running what you're running?
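
If you want to try, a one-liner along these lines is a reasonable starting point (just one way of doing it):

# Show user, PID, and command for every userland process; kernel
# threads have bracketed names such as [kthreadd], so drop those.
ps -eo user,pid,args --no-headers | awk '$3 !~ /^\[/'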

Me? I had no idea I had a dbus daemon running. Now I've purged it. Ha!

ObQuote - "I owe everything to George Bailey. Help him, dear Father." - It's a Wonderful Life.


 

Some good, some bad

14 May 2013 21:50

Today my main machine was down for about 8 hours. Oops.

That meant when I got home, after a long and dull train journey, I received a bunch of mails from various hosts each saying:

  • Failed to fetch slaughter policies from rsync://www.steve.org.uk/slaughter

Slaughter is my sysadmin utility which pulls policies/recipes from a central location and applies them to the local host.

Slaughter has a bunch of different transports, which are the means by which policies and files are transferred from the remote "central host" to the local machine. Since git is supported, I've now switched my policies to be fetched from the master GitHub repository.

This means:

  • All my servers need git installed. Which was already the case.
  • I can run one less service on my main box.
  • We now have a contest: is my box more reliable than GitHub?

In other news I've fettled with lumail a bit this week, but I'm basically doing nothing until I've pondered my way out of the hole I've dug myself into.

Like mutt, lumail has the notion of "limiting" the display of things:

  • Show all maildirs.
  • Show all maildirs with new mail in them.
  • Show all maildirs that match a pattern.
  • Show all messages in the currently selected folder(s).
    • More than one folder may be selected :)
  • Show all unread messages in the currently selected folder(s).

Unfortunately the latter has caused an annoying, and anticipated, failure case. If you open a folder and cause it to show only unread messages, all looks good - until you read a message. At that point the message is no longer allowed to be displayed, so it disappears, and since you were reading a message the next one is opened instead. That one then becomes marked as read, and should no longer be displayed either, because we've said "show me new/unread-only messages please".

The net result is that if you show only unread messages and make the mistake of reading one, you quickly cycle through reading all of them, as each message in turn is opened, read, and marked as non-new, and you are left with an empty display.

There are solutions, one of which I documented on the issue. But that approach has the bad side-effect of complicating message navigation in ways that are annoying.

For the moment I'm mulling the problem over and I will only make trivial cleanup changes until I've got my head back in the game and a good solution that won't cause me more pain.


 

I'm a contractor, available for work

29 April 2020 14:00

For the foreseeable future I'm working for myself, as a contractor, here in sunny Helsinki, Finland.

My existing contract only requires me to work 1.5-2.0 days a week, meaning my week looks something like this:

  • Monday & Tuesday
    • I work for the client.
  • Wednesday - Friday
    • I act as a stay-at-home dad.

It does mean that I'm available for work Wednesday-Friday though, in the event I can find projects to work upon, or companies who would be willing to accept my invoices.

I think of myself as a sysadmin, but I also know all about pipelines, automation, and coding in C, C++, Perl, Ruby, Golang, Lua, etc.

On the off-chance anybody reading this has a need for small projects, services, daemons, or APIs to be implemented then don't hesitate to get in touch.

I did manage to fill a few days over the past few weeks completing an online course from Helsinki Open University, DevOps with Docker, and it is possible I'll find some more courses to complete in the future. (There is an upcoming course, DevOps with Kubernetes, which I'll definitely complete.)


 

Some brief sysbox highlights

17 May 2020 15:00

I started work on sysbox again recently, adding a couple of simple utilities. (The whole project is a small collection of utilities, distributed as a single binary to ease installation.)

Imagine you want to run a command for every line of STDIN; here's a good example:

$ cat input | sysbox exec-stdin "youtube-dl {}"

Here you see that for every (non-empty) line of input read from STDIN the command "youtube-dl" has been executed, with "{}" expanded to the complete line read. You can also access individual fields, kinda like awk.

(Yes youtube-dl can read a list of URLs from a file, this is an example!)

Another example, run groups for every local user:

$ cat /etc/passwd | sysbox exec-stdin --split=: groups {1}

Here you see we have split the input-lines read from STDIN by the : character, instead of by whitespace, and we've accessed the first field via "{1}". This is certainly easier for scripting than using a bash loop.
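
For comparison, here is the equivalent bash loop:

# Split each /etc/passwd line on ":" and run groups on the first field.
while IFS=: read -r user _; do
    groups "$user"
done < /etc/passwd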

On the topic of bash: command-completion for each subcommand, and their arguments, is now present:

$ source <(sysbox bash-completion)

And I've added a text-based UI for selecting files. You can also execute a command against the selected file:

$ sysbox choose-file -exec "xine {}" /srv/tv

This is what that looks like:

[Screenshot: /2020/05/17-choose-file.png]

You'll see:

  • A text-box for filtering the list.
  • A list which can be scrolled up/down/etc.
  • A brief bit of help information in the footer.

As well as choosing files, you can also select from lines read via STDIN, and you can supply the command in the same way as before (i.e. "{}" is the selected item).

Other commands received updates too; the calculator, for example, now allows storing results in variables:

$ sysbox calc
calc> let a = 3
3
calc> a / 9 * 3
1
calc> 1 + 2 * a
7
calc> 1.2 + 3.4
4.600000
