Puppet masterless – my choice of operations

by stiron

Whether running Puppet master(s) with Open Source Puppet is good or bad, comfortable or uncomfortable, effective or ineffective is open to debate. People have tried to convince me of both. My opinion is simple: if your infrastructure and Ops team can afford it, use a Puppet master. If you want to be as flexible as you can, use Puppet in a nodeless / masterless configuration. That is how I run my own systems too. Let’s see what I am talking about.

  • Puppet is a configuration management tool: https://puppetlabs.com/
  • Puppet can be used in Master-Node design or masterless: puppet apply
  • Masterless design is as cool as DevOps
  • Git is the central configuration repository
  • Git hooks will trigger Puppet runs
  • We are cool, aren’t we?

What does it look like in practice?

  • Install the common Puppet package, e.g. on Ubuntu it looks like this
wget -O /tmp/puppetlabs.deb "http://apt.puppetlabs.com/puppetlabs-release-$(lsb_release -cs).deb"
dpkg -i /tmp/puppetlabs.deb
apt-get -q -y update
apt-get -q -y install puppet
  • The puppet.conf file must be set up
  • Initiate a Git repository, push the contents to a remote
  • Start adding modules to the configuration
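
The repository steps above can be sketched roughly as follows. This is only an illustration: the function name init_puppet_repo, the manifests/ and modules/ layout, and the remote URL in the usage line are assumptions, not part of my actual setup.

```shell
#!/bin/sh
# Sketch: turn a Puppet configuration directory into a Git working copy.
# The function name, directory layout and remote URL are illustrative
# assumptions.
init_puppet_repo() {
    repo_dir=$1
    remote_url=$2

    # Skeleton: manifests/ holds site.pp, modules/ will receive the modules.
    mkdir -p "$repo_dir/manifests" "$repo_dir/modules" || return 1

    # Create an empty site.pp only if none exists yet.
    [ -f "$repo_dir/manifests/site.pp" ] || : > "$repo_dir/manifests/site.pp"

    # Initialise the repository and register the remote it will be pushed to.
    git -C "$repo_dir" init -q || return 1
    git -C "$repo_dir" remote add origin "$remote_url"
}

# Example call (hypothetical remote):
#   init_puppet_repo /etc/puppet git@git.example.com:ops/puppet.git
```

After that, the usual git add, git commit and git push publish the first version of the configuration.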

Git hooks and Puppet runs

We are at the interesting part now. How will our Puppet “agent” run periodically and how will it download changes from Git? The answer is “Cron, Git hooks and puppet apply”.

  • Create a post-merge hook for git in the .git/hooks directory
/usr/bin/puppet apply /etc/puppet/manifests/site.pp --logdest /var/log/puppet/puppet.log
  • Then I use three Cron jobs to drive it (code on my github)
    • A “git pull” every 20 minutes pulls the changes, if there are any
    • An hourly “puppet apply” run repairs config drift
    • On every reboot, a “puppet apply” run makes sure the config state is enforced
  • From this point I can have as many copies of the Git repo as I need. My nodes point there to download their configuration, then Puppet simply runs and configures everything
  • If I work on a Puppet module, I do it on my personal notebook, where I can test my modifications and then push them to Git
  • The “git pull” Cron job pulls the changes, which triggers the “post-merge” hook. The hook runs “puppet apply” and logs to the /var/log/puppet/puppet.log file (the log can be used for monitoring the node, but I am sure every SysAdmin already has a good monitoring system in place).
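
To make the moving parts concrete, here is a minimal sketch of the hook installation and the three Cron entries. The puppet apply command line is the one shown above; the function names and the exact schedule expressions are assumptions for illustration, and they are not a copy of the jobs published on my github.

```shell
#!/bin/sh
# Sketch: install the post-merge hook and print matching crontab lines.
# The puppet command mirrors the one used in the text; the function names
# and the cron schedule expressions are illustrative assumptions.

PUPPET_APPLY='/usr/bin/puppet apply /etc/puppet/manifests/site.pp --logdest /var/log/puppet/puppet.log'

install_post_merge_hook() {
    repo_dir=$1
    hook="$repo_dir/.git/hooks/post-merge"

    mkdir -p "$repo_dir/.git/hooks" || return 1
    # Write the hook with an explicit interpreter line.
    printf '#!/bin/sh\n%s\n' "$PUPPET_APPLY" > "$hook" || return 1
    # Git skips hooks that are not executable.
    chmod +x "$hook"
}

print_cron_entries() {
    repo_dir=$1
    # Pull every 20 minutes; a merge that brings in changes fires the hook.
    printf '*/20 * * * * cd %s && git pull -q\n' "$repo_dir"
    # Hourly run to repair configuration drift.
    printf '0 * * * * %s\n' "$PUPPET_APPLY"
    # One run at boot to enforce the desired state.
    printf '@reboot %s\n' "$PUPPET_APPLY"
}

# Example:
#   install_post_merge_hook /etc/puppet
#   print_cron_entries /etc/puppet    # paste the output into root's crontab
```

The printed lines are in user crontab format (no user column), so they are meant for crontab -e as root rather than for a file under /etc/cron.d.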

What did I achieve?

Is it worth it? For me the answer is definitely yes. I have machines at home and Linux boxes at my workplace, and I want to fully automate their deployment and configuration without using too many resources. I achieved that. Deploying a new machine takes about 2-10 minutes (obviously depending on the software stack, and not counting the installation itself). The longest part of the work is thinking about what I want to do.

The “scariest” issue I have heard about this kind of configuration management is that every node gets every Puppet module and all the Hiera data from Git. Yes, OK. Now what? I use Hiera to store node-specific data, and there is eyaml to encrypt all of the secrets, and that’s it.

But honestly, I have worked with many companies on automation, and I have seen similar or even much worse designs and security problems in production. This masterless configuration definitely has its rough edges, but with branch-based environments and with profiles and roles it can be made more convenient. In any case, this is not the most important part of automation and configuration management.

What is more important then?

In my opinion the most important thing in a fully automated infrastructure is not the technical stack, but the people who operate it. As it can be very very dangerous (really!!!), it must be operated by exceptional senior system administrators with excellent coding abilities and advanced communication skills (a.k.a. DevOps). This is the most important piece of the whole puzzle. The staff behind all of this.