Entries tagged puppet

Related tags: automation, cfengine, chef, cvsrepository, git, health, ldap, openldap, puppet-dashboard, slaughter, work.

Interesting times debugging puppet

Saturday, 26 August 2017

I recently upgraded a bunch of systems from Jessie to Stretch, and as a result of that one of my hosts has started showing me a lot of noise in an hourly cron-email:

Command line is not complete. Try option "help"

I've been ignoring these emails for the past while, but today I sat down to track down the source. It was obviously coming from facter, the system that puppet uses to gather information about hosts.

Running facter -debug made that apparent:

 root@smaug ~ # facter --debug
 Found no suitable resolves of 1 for ec2_metadata
 value for ec2_metadata is still nil
 value for netmask_git is still nil
 value for ipaddress6_lo is still nil
 value for macaddress_lo is still nil
 value for ipaddress_master is still nil
 value for ipaddress6_master is still nil
 Command line is not complete. Try option "help"
 value for netmask_master is still nil
 value for ipaddress_skx_mail is still nil

There we see the issue, and it is obviously relating to our master interface.

To cut a long-story short /usr/lib/ruby/vendor_ruby/facter/util/ip.rb contains some code which eventually runs this:

 ip link show $interface

That works on all other interfaces I have:

  $ ip link show git
  6: git: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000

But not on master:

  $ ip link show master
  Command line is not complete. Try option "help"

I ninja-edited the code from this:

  ethbond = regex.match(%x{/sbin/ip link show '#{interface}'})


  ethbond = regex.match(%x{/sbin/ip link show dev '#{interface}'})

And suddenly puppet-runs without any errors. I'm not 100% sure if this is a bug bug, but it is something of a surprise anyway.

This host runs KVM guests, one of the guests is a puppet-master, with a local name master. Hence the name of the interface. Similarly the interface git is associated with the KVM guest behind git.steve.org.uk.



So I did a thing, then another thing.

Thursday, 3 August 2017

So I did start a project, to write a puppet-dashboard, it is functionally complete, but the next step is to allow me to raise alerts based on failing runs of puppet - in real-time.

(i.e. Now that I have a dashboard I wish to not use it. I want to be alerted to failures, without having to remember to go look for them. Something puppet-dashboard can't do ..)

In other news a while back I slipped in a casual note about having a brain scan done, here in sunny Helsinki.

One of the cool things about that experience, in addition to being told I wasn't going to drop dead that particular day, was that the radiologist told me that I could pay €25 to get a copy of my brain data in DICOM format.

I've not yet played with this very much, but I couldn't resist a brief animation:

  • See my brain.
    • Not the best quality, or the best detail, but damn. It is my brain.
    • I shall do better with more experimentation I think.
    • After I posted it my wife, a doctor, corrected me: That wasn't a gif of my brain, instead it was a gif of my skull. D'oh!

| No comments


So I'm considering a new project

Saturday, 29 July 2017

In the past there used to be a puppet-labs project called puppet-dashboard, which would let you see the state of your managed-nodes. Having even a very basic and simple "report user-interface" is pretty neat when you're pushing out a change, and you want to see it be applied across your fleet of hosts.

There are some other neat features, such as allowing you to identify failures easily, and see nodes that haven't reported-in recently.

This was spun out into a community-supported project which is largely stale:

Having a dashboard is nice, but the current state of the software is less good. It turns out that the implementation is pretty simple though:

  • Puppet runs on a node.
  • The node reports back to the puppet-master what happened.
  • The puppet-master can optionally HTTP-post that report to the reporting node.

The reporting node can thus receive real-time updates, and do what it wants with them. You can even sidestep the extra server if you wish:

  • The puppet-master can archive the reports locally.

For example on my puppet-master I have this:

  root@master /var/lib/puppet/reports # ls | tail -n4

Inside each directory is a bunch of YAML files which describe the state of the host, and the recipes that were applied. Parsing those is pretty simple, the hardest part would be making a useful/attractive GUI. But happily we have the existing one to "inspire" us.

I think I just need to write down a list of assumptions and see if they make sense. After all the existing installation(s) won't break, it's just a matter of deciding whether it is useful/worthwhile way to spend some time.

  • Assume you have 100+ hosts running puppet 4.x
  • Assume you want a broad overview:
    • All the nodes you're managing.
    • Whether their last run triggered a change, resulted in an error, or logged anything.
    • If so what changed/failed/was output?
  • For each individual run you want to see:
    • Rough overview.
  • Assume you don't want to keep history indefinitely, just the last 50 runs or so of each host.

Beyond that you might want to export data about the managed-nodes themselves. For example you might want a list of all the hosts which have "bash" installed on them. Or "All nodes with local user "steve"." I've written that stuff already, as it is very useful for auditing & etc.

The hard part about that is that to get the extra data you'll need to include a puppet module to collect it. I suspect a new dashboard would be broadly interesting/useful but unless you have that extra detail it might not be so useful. You can't point to a slightly more modern installation and say "Yes this is worth migrating to". But if you have extra meta-data you can say:

  • Give me a list of all hosts running wheezy.
  • Give me a list of all hosts running exim4 version 4.84.2-2+deb8u4.

And that facility is very useful when you have shellshock, or similar knocking at your door.

Anyway as a hacky start I wrote some code to parse reports, avoiding the magic object-fu that the YAML would usually invoke. The end result is this:

 root@master ~# dump-run www.steve.org.uk
    Puppet Version: 4.8.2
    Runtime: 2.16
    Time:2017-07-29 18:13:04 +0000
            total -> 176
            skipped -> 2
            failed -> 0
            changed -> 3
            out_of_sync -> 3
            scheduled -> 0
            corrective_change -> 3
    Changed Resources
            Ssh_authorized_key[skx@shelob-s-fi] /etc/puppet/code/environments/production/modules/ssh_keys/manifests/init.pp:17
            Ssh_authorized_key[skx@deagol-s-fi] /etc/puppet/code/environments/production/modules/ssh_keys/manifests/init.pp:22
            Ssh_authorized_key[steve@ssh.steve.org.uk-s-fi] /etc/puppet/code/environments/production/modules/ssh_keys/manifests/init.pp:27
    Skipped Resources
            Exec[clone sysadmin utils]
            Exec[update sysadmin utils]



Validating puppet manifests via git hooks.

Monday, 27 April 2015

It looks like I'll be spending a lot of time working with puppet over the coming weeks.

I've setup some toy deployments on virtual machines, and have converted several of my own hosts to using it, rather than my own slaughter system.

When it comes to puppet some things are good, and some things are bad, as exected, and as any similar tool (even my own). At the moment I'm just aiming for consistency and making sure I can control all the systems - BSD, Debian GNU/Linux, Ubuntu, Microsoft Windows, etc.

Little changes are making me happy though - rather than using a local git pre-commit hook to validate puppet manifests I'm now doing that checking on the server-side via a git pre-receive hook.

Doing it on the server-side means that I can never forget to add the local hook and future-colleagues can similarly never make this mistake, and commit malformed puppetry.

It is almost a shame there isn't a decent collection of example git-hooks, for doing things like this puppet-validation. Maybe there is and I've missed it.

It only crossed my mind because I've had to write several of these recently - a hook to rebuild a static website when the repository has a new markdown file pushed to it, a hook to validate syntax when pushes are attempted, and another hook to deny updates if the C-code fails to compile.



More competition for server management and automation is good

Saturday, 2 February 2013

It was interesting to read recently from Martin F. Krafft a botnet-like configuration management proposal.

Professionally I've used CFEngine, which in version 2.x, supported a bare minimum of primitives, along with a distribution systme to control access to a central server. Using thse minimal primitives you could do almost anything:

  • Copy files, and restart services that depend upon them.
  • Make minor edits to files. (Appending lines not present, replacing lines you no longer wanted, etc)
  • Installing / Removing packages.
  • More ..

Now I have my mini cluster (and even before that when I had 3-5 machines) it was time to look around for something for myself.

I didn't like the overhead of puppet, and many of the other systems. Similarly I didn't want to mess around with weird configuration systems. From CFEngine I'd learned that using only a few simple primitives would be sufficient to manage many machines provided you could wrap them in a real language - for control flow, loops, conditionals, etc. What more natural choice was there than perl, the sysadmin army-knife?

To that end slaughter was born:

  • Download polices (i.e. rules) to apply from a central machine using nothing more complex than HTTP.
  • Entirely client-driven, and scheduled via cron.

Over time it evolved so that HTTP wasn't the only transport. Now you can fetch your policies, and the files you might serve, via git, hg, rsync, http, and more.

Today I've added one final addition, and now it is possible to distribute "modules" alongside policies and files. Modules are nothing more than perl modules, so they can be as portable as you are careful.

I envisage writing a couple of sample modules; for example one allowing you to list available sites in Apache, disable the live ones, enable/disable mod_rewrite, etc.

These modules will be decoupled from the policies, and will thus be shareable.

Anyway , I'm always curious to learn about configuration management systems but I think that even though I've reinvented the wheel I've done so usefully. The DSL that other systems use can be fiddly and annoying - using a real language at the core of the system seems like a good win.

There are systems layered upon SSH, such as fabric, ansible, etc, and that was almost a route I went down - but ultimately I prefer the notion of client-pull to server-push, although it is possible in the future we'll launche a mini-daemon to allow a central host/hosts to initial a run.

| 1 comment.


I don't like this ending...

Saturday, 16 January 2010

I've talked before about the minimal way in which I've been using a lot of the available automation tools. I tend to use them to carry out only a few operations:

  • Fetch a file from a remote source.
    • If this has changed run some action.
  • Ensure a package is installed.
    • If this is carried out run some action.
  • Run a command on some simple criterion.
    • E.g. Every day at 11pm run a mirror.

In the pub I've had more than a few chats about how to parse a mini-language and carry these operations out, and what facilities other people use. It'd be almost trivial to come up with a mini-language, but the conclusion has always been that such mini-languages aren't expressive enough to give you the arbitrary flexibility some people would desire. (Nested conditionals and the ability to do things on a per-host, per-day, per-arch basis for example.)

It struck me last night that you could instead cheat. Why not run scripting langues directly on your client nodes? Assume you could write your automation in Ruby or Perl and all you need to do is define a few additional primitives.

For example:

#  /policies/default.policy - the file that all clients nodes poll.

#  Fetch the per-node policy if it exists.
FetchPolicy $hostname.policy ;

#  Ensure SSH is OK
FetchPolicy ssh-server.policy ;

#  Or explicitly specify the URL:
# FetchPolicy http://example.com/policies/ssh-server.policy ;

#  Finally a quick fetch of a remote file.
if ( FetchFile(
                Source => "/etc/motd",
                Dest => "/etc/motd",
                Owner => "root",
                Group => "root",
                Mode => "0644" ) )

    RunCommand( "id" );

This default policy attempts to include some other policies which are essentially perl files which have some additional "admin-esque" primitives. Such as "InstallPackage", "PurgePackage", and "FetchFile".

FetchFile is the only one I've fully implemented, but given a server it will fetch http://server/prefix/files/$FILENAME - into a local file, and will setup the owner/gid/mode. If the fetch succeeded and contents differ from the current contents of the named file (or the current file doesn't exist) it will be moved into place and the function will return true.

On the server side I just have a layout that makes sense:

|-- files
|   `-- etc
|       |-- motd
|       |-- motd.silver.my.flat
|       `-- motd.gold
`-- policies
    |-- default.policy
    |-- ssh-server.policy
    `-- steve.policy

Here FetchFile has been implemented to first request /files/etc/motd.gold.my.flat, then /files/etc/motd.gold, and finally the global file /files/etc/motd.

In short you don't want to be forced to write perl which would run things like this:

# install ssh
if ( -e "/etc/apt/sources.list" )
  # we're probably debian
  system( "apt-get update" );
  system( "apt-get install openssh-server" );

You just want to be able to say "Install Package foo", and rely upon the helper library / primitives being implemented correctly enough to be able to have that work.

I'll probably stop there, but it has given me a fair amount to think about. Not least of which : What are the minimum required primitives to usefully automate client nodes?

ObFilm: Moulin Rouge!



And if someone gets upset you say, "chill out"!

Friday, 25 December 2009

It was interesting to see Clint Adams describe love and dissatification with configuration management.

At work I've got control of 150(ish) machines which are managed via CFEngine. These machines are exclusively running Debian Lenny. In addition to these hosts we also have several machines running Solaris, OpenBSD, and various Ubuntu releases for different purposes.

Unfortunately I made a mistake when I setup the CFEngine infrastructure and when writing all the policies, files, etc, I essentially said "OK CFEngine controlled? Then it is Debian". (This has been slowly changing over time, but not very quickly.)

But in short this means that the machines running *BSD, Solaris, and non-Debian distributions haven't been managed as well via CFEngine as the rest, even though technically they could have been.

A while back I decided that it was time to deal with this situation. Looking around the various options it seemed Puppet was the way of the future and using that we could rewrite/port our policies and make sure they were both cleanly organised and made no assumptions.

So I setup a puppetmaster machine, then I installed the client on a range of client machines (openbsd, debian lenny, ubuntu, solaris) so that I could convince myself my approach was valid, and that the tool itself could do everything I wanted it to do.

Unfortunately using puppet soon became painful. It has primitives for doing various things such as maintaining local users, working with cronjobs, and similar. Unfortunately not all primitives work upon all platforms, which kinda makes me think "what's the point?". For example the puppet client running upon FreeBSD will let you add a local user, setup a ~/.ssh/authorized_keys file but will not let you setup a password. (Which means you can add users who can login, but then cannot use sudo. Subpar)

At this point I've taken a step back. As I think I've mentioned before I don't actually do too much with CFEngine. Just a few jobs:

  • Fetch a file from the master machine and copy into the local filesystem. (Making no changes.)
  • Fetch a file from the master machine, move it to the local system after applying a simple edit. (e.g "s/##HOSTNAME##/`hostname`/g")
  • Install a package.
  • Purge a package.
  • Setup local user accounts, with ~/.ssh handled properly.
  • Apply one-line sed-style edits to files. (e.g. "s/ENABLED=no/ENABLED=yes/" /etc/default/foo)

(i.e. I don't use cron facilities, I add files to cron directories. Similarly I don't use process monitoring, instead I install the monit package and drop /etc/monit/monitrc into place.)

There is a pretty big decision to make in the future with the alternatives being:

  • Look at Chef.
  • Stick with CFEngine but start again with a better layout, with more care and attention to portability things.
  • Replace the whole mess with in-house-fu.

If we ignore the handling of local users, and sudo setup, then the tasks that remain are almost trivial. Creating a simple parser for a "toy-language" which can let you define copies, edits, and package operations would be an afternoons work. Then add some openssl key authentication and you've got a cfengine-lite.

For the moment I'm punting the decision but I'm 90% certain that the choice is CFEngine vs. Chef vs. In-House-Fu - and that puppet is no longer under consideration.

Anyway despite having taken months to arrive at this point I'm going to continue to punt. Instead my plan is to move toward using LDAP for all user management, login stuff, and sudo management. That will be useful in its own right, and it will coincidentally mean that whatever management system we do end up using will have on less task to deal with. (Which can only be a good thing.)

ObFilm: Terminator II



A new and glorious moment

Friday, 18 May 2007

Puppet the system-administration tool, similar to CFengine is very nice.

What is less-nice is the lack of decent examples. The wiki has lots of "recipes", however these give no real explaination of how to install them.

I know that I need manifests/site.pp which will control which nodes get which actions applied to them, but beyond that I'm a little lost!

I'm hoping to replicate my cfengine setup. Most of it is pretty simple "append line to file if missing", and copying files from the central server. (The latter I've got working nicely.)

Hopefully somebody will now point me at a good tutorial that doesn't stop once it is installed (like this one.)

| No comments


Recent Posts

Recent Tags