Example42 blog

Posted: 2011-07-27

Deploying applications and bringing Puppet information to the cli with Puppi

With Puppet we build infrastructures, piece by piece, manifest after manifest. We control how nodes are configured, what services they provide, how they are checked.
We manage where web applications live and sometimes how they are built, tested and deployed.
Puppet knows a lot about our systems: every catalog a node receives is a unique source of data, exactly the data that brings that server to its desired state.
The more information about a system we provide in our manifests, the more we can do with it.
Puppi tries to bring this knowledge to the command line.
 
PRESENT
Puppi was initially developed to standardize different web application deployment procedures, and it has evolved into a shell command that provides handy and quick actions useful for the system administrator.
It is now stable enough to have reached version 1.0 (currently in RC state): the features initially planned are present, it is used in production and no big changes are planned for this version.
See the presentation given at PuppetCamp Europe 2011 (http://example42.com/?q=Puppi_presentation_PuppetCamp_Europe_2011) for some videos and further details.
Puppi's most useful actions are:
- deploy, to manage the whole deploy workflow with a single command
- check, to verify the system's general health and run specific checks on the deployed applications
- log, to quickly tail some or all of the known logs
- info, to show the output of a custom set of preconfigured commands
- rollback, to quickly roll back a deployed application
 
Puppi is currently provided entirely as a Puppet module (http://github.com/example42/puppi). You include it and you have the whole puppi thing:
- the bash command /usr/sbin/puppi
- its configuration directory /etc/puppi, with plenty of files and dirs configured by Puppet defines like puppi::check, puppi::info etc
- a set of (customizable) defines that build deploy procedures, like puppi::project::maven that retrieves Java artifacts generated by Maven
- some general-purpose native scripts that are used to accomplish the different steps of a deployment
- a set of defines to populate the output of puppi actions
- some default content for puppi info, log and check actions to make puppi useful out of the box.
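As a minimal sketch of the "include it" step, assuming the puppi module is already in your modulepath (the node name below is hypothetical):

```puppet
# Hypothetical node definition: the hostname is only an illustration.
node 'web01.example42.com' {
    # Declaring the class brings in the puppi command, /etc/puppi and the
    # default info, log and check content listed above.
    include puppi
}
```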
 
There are no other module prerequisites, but for full functionality the systems where you install puppi need these commands: wget, mail, rsync and the most common Nagios plugins.
Note also that Puppi is entirely self-contained: it doesn't need external services to run. Once deployed, all its actions are based on local files and data: checks are local, info scripts are local, and deploy procedures only need the defined source to be available.
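If the required commands are not managed elsewhere, they can be installed with plain Puppet package resources. This is only a sketch, to be used if the module version you run does not already declare these packages (a duplicate declaration would break catalog compilation). wget and rsync are the usual package names. The Nagios plugins package names in the selector are assumptions to verify for your distribution. The package providing mail varies too much to guess, so it is left out.

```puppet
# Packages whose name matches the command on most distributions.
package { [ 'wget', 'rsync' ]:
    ensure => present,
}

# Assumed package names for the Nagios plugins: check them for your OS.
$nagios_plugins_package = $operatingsystem ? {
    debian  => 'nagios-plugins-basic',
    ubuntu  => 'nagios-plugins-basic',
    default => 'nagios-plugins',
}

package { $nagios_plugins_package:
    ensure => present,
}
```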
 
HOW TO USE
The puppi module provides some deploy procedures that cover many typical scenarios.
For example, to retrieve a war from $source and deploy it in $deploy_root, keeping a copy for rollback and notifying $report_email, you need this:
puppi::project::war { "myapp":
    source           => "http://repo.example42.com/deploy/prod/myapp.war",
    deploy_root      => "/store/tomcat/myapp/webapps",
    report_email     => "[email protected]",    
}
All the existing deploy procedures have a set of optional arguments that make them more flexible. Here is a more complex case:
puppi::project::maven { "supersite":
    source           => "http://nexus.example42.com/nexus/content/repositories/releases/it/example42/supersite/",
    deploy_root      => "/usr/local/tomcat/supersite/webapps",
    config_suffix    => "cfg",
    config_root      => "/srv/htdocs/supersite",
    document_suffix  => "css",
    document_root    => "/srv/htdocs/supersite",
    firewall_src_ip  => $site ? {
        dr      => "192.168.101.1/30",
        main    => "192.168.1.1/30",
    },
    backup_retention => "3",
    init_script      => "tomcat",
    report_email     => "[email protected]",
    enable           => "true",
}
This performs the following actions in sequence:
- Retrieves the maven-metadata.xml from $source
- Blocks access from a load balancer IP
- Backs up the existing data for rollback operations
- Deletes older backups (3 archives are kept, instead of the default 5)
- Deploys the release war in $deploy_root
- Unpacks a configurations tarball tagged with the Maven qualifier $config_suffix in $config_root
- Unpacks a static files tarball tagged with the Maven qualifier $document_suffix in $document_root
- Restarts tomcat and notifies via mail
All this can be triggered with the command "puppi deploy supersite" and, if something fails, you can run "puppi rollback supersite".
 
The above puppi::project::maven (or tar|war|list|dir|mysql..) defines build up the logic and the sequence of commands run in deployment and rollback operations using basic puppi defines like puppi::project, puppi::deploy, puppi::rollback, puppi::init.
You can use the existing puppi::project::* procedures or build your own to manage special cases. Even the bash scripts they use ("native scripts", stored in puppi/files/scripts/) can be replaced by custom scripts, in whatever language.
 
The other puppi actions require simpler constructs. For example, you can manage a single check (we use Nagios plugins, as they are so common) with:
puppi::check { "Port_Apache":
    command  => "check_tcp -H ${fqdn} -p 80" ,
}
or insert more elaborate checks in your defines (for example when you create virtual hosts, using data you may already provide):
puppi::check { "Url_$name":
    enable   => $enable,
    command  => "check_http -I '${target}' -p '${port}' -u '${url}' -s '${pattern}'" ,
}
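To make the second case concrete, here is a sketch of a define that carries its own check. Everything named mysite::vhost is hypothetical and not part of puppi. Only the puppi::check parameters (enable, command) come from the examples above, and check_http is the standard Nagios plugin.

```puppet
# Hypothetical define: mysite::vhost is NOT part of puppi.
define mysite::vhost (
    $port   = '80',
    $enable = 'true'
) {
    # ... the resources that actually configure the virtual host go here ...

    # Every declared virtual host registers its own local check,
    # reusing the data the define already receives.
    puppi::check { "Vhost_${name}":
        enable  => $enable,
        command => "check_http -H '${name}' -p '${port}'",
    }
}

# Usage: one declaration configures the site and its check.
mysite::vhost { 'www.example42.com': }
```

With this approach the list of checks grows together with the manifests, without a separate monitoring configuration to keep in sync.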
 
The logs to tail with puppi log are defined by the puppi::log define (note that with a simple selector you can adapt the log paths to the underlying OS):
puppi::log { "auth":
    description => "Users and authentication" ,
    log         => $operatingsystem ? {
        redhat => "/var/log/secure",
        darwin => "/var/log/secure.log",
        ubuntu => ["/var/log/user.log","/var/log/auth.log"],
    },
}
Also in this case you can insert a puppi::log inside an existing define, using the data it inherently has:
puppi::log { "tomcat-${instance_name}":
    log => "${tomcat::params::storedir}/${instance_name}/logs/catalina.out"
}
 
You can manage the output of "puppi info network" with something like:
puppi::info { "network":
    description => "Network settings and stats" ,
    run         => [ "ifconfig" , "route -n" , "cat /etc/resolv.conf" , "netstat -natup | grep LISTEN" ],
}
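The selector approach shown for puppi::log can be applied to puppi::info as well. A sketch, assuming you want a custom "puppi info installed_software" action: the resource name and the commands are illustrative choices, while description and run are the parameters used above.

```puppet
puppi::info { "installed_software":
    description => "Installed packages" ,
    run         => $operatingsystem ? {
        redhat  => [ "rpm -qa | sort" ],
        centos  => [ "rpm -qa | sort" ],
        debian  => [ "dpkg -l" ],
        ubuntu  => [ "dpkg -l" ],
        # Fallback so the catalog still compiles on other systems.
        default => [ "echo 'No package listing defined for this OS'" ],
    },
}
```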
or build more elaborate info defines using custom templates and specific data:
puppi::info::instance { "tomcat-${instance_name}":
    servicename => "tomcat-${instance_name}",
    processname => "${instance_name}",
    configdir   => "${tomcat::params::storedir}/${instance_name}/conf/",
    bindir      => "${tomcat::params::storedir}/${instance_name}/bin/",
    pidfile     => "${instance_rundir}/tomcat-${instance_name}.pid",
    datadir     => "${instance_path}/webapps",
    logdir      => "${instance_logdir}",
    httpport    => "${instance_httpport}",
    controlport => "${instance_controlport}",
    ajpport     => "${instance_ajpport}",
    description => "Info for ${instance_name} Tomcat instance" ,
}
Examples are endless: you can easily extend and customize the existing defines and integrate them in your modules according to your needs.
 
THE JOYS OF COLLECTIVISM
Puppi's purpose is not only to provide a command-line tool, based on Puppet data, that helps the sysadmin gather info, deploy applications, troubleshoot and check them.
It can be run manually from the local system, automatically via a cron job or triggered by a web interface; it can group a set of common actions to be run via sudo by non-privileged users; and it can be called by an agent of an orchestration tool.
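For the cron case, a sketch using Puppet's own cron type. The resource title, the schedule and the log path are assumptions, while /usr/sbin/puppi is the path installed by the module:

```puppet
# Run all the local checks every 15 minutes and keep the output.
cron { 'puppi_check':
    command => '/usr/sbin/puppi check >> /var/log/puppi_check.log 2>&1',
    user    => 'root',
    minute  => '*/15',
}
```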
Puppi reaches a new scale with the MCollective agent puppi and the command mc-puppi: whatever can be done locally with Puppi can be repeated on the whole infrastructure, with the same syntax, using the power of MCollective.
This becomes particularly interesting when your deploy procedures involve actions on different nodes, or when you need to quickly check the systems' health on a multitude of nodes.
A command like "mc-puppi check" runs and shows the equivalent of ALL your Nagios checks on your WHOLE MCollective domain.
 
There may be thousands of them: you get the results in a few seconds.
I like to consider this a real-time distributed instant infrastructure test.
Something like "mc-puppi info network", instead, provides an immediate overview of the network configuration and status of all your nodes, and if verbosity bothers you, just grep what you need.
 
A missing piece in the puppi world is a web frontend that gathers the reports of the deployments, collects information about nodes, shows the results of local checks and possibly lets users trigger deploy procedures via a central web console.
The development of a web interface to puppi, although planned from the beginning, has not yet started.
The main reason is that having a stable command and an MCollective agent was considered the priority. Another reason is that it's time to make decisions about puppi, and possibly these have to be shared.
 
FUTURE
Puppet development is moving quickly: at PuppetCamp Europe 2011 Luke presented Puppet Faces, and it's clear that version 2.7 introduces us to a new era in Puppet's evolution.
There are some common points between faces and puppi: they both bring parts of Puppet to the cli and they are both extensible with actions.
Actually it just seems natural that puppi's future is to become a Puppet face.
Today it's a bash script that executes bundles of bash scripts, based on data more or less elegantly provided by Puppet modules. It also works on older Puppet versions (at least 0.25), so that it can be widely adopted and integrated in current layouts.
The next version is probably going to be written in Ruby, use Puppet APIs directly and possibly be based on a more standardized module data model. And of course it is going to work only with Puppet 2.7 and later.
It could become something different, maybe even with a different name, but as far as I'm concerned it should keep the principles it's based upon:
- Be based on Puppet data: Puppet knows the infrastructure, and we want this knowledge and intelligence integrated in the commands we run on the shell. The way this data is currently provided is not optimized: every piece of puppi information is basically a Puppet resource (generally a file), and this is an overhead we should avoid. We could rely directly on the catalog, which seems the most natural source, but in order to do this I suspect some kind of standardization is needed at the module level.
- Provide a simple single line standard command to run an application deployment (one keystroke to deploy them all)
- Provide useful actions that can be used from the cli, an orchestrator agent or a web interface to show info, status and working details on systems and applications (I would keep the check/info/log actions, as I'm finding them quite useful)
- Be runnable manually or automatically, locally or from a central orchestrator and, possibly, also via a web interface.
A puppi webapp should let different users request, trigger and view the results of a deploy; gather info from the systems (via REST?), so as to become an inventory frontend on steroids with as much detail on the system as users want; receive and show checks; and possibly be able to search and correlate the large amount of data it could receive.
The truth is that developing the next puppi and its web frontend is something I would like to do in collaboration with PuppetLabs and anyone in the community who is interested.
Puppi 1.0 was written by me for a customer's needs, with the knowledge and the requirements I had at my disposal.
Puppi 2.0 (whatever its name and shape) has no immediate operational requirements: its design, how data is fed from modules and how it's integrated in Puppet should be discussed and shared.
Anyone interested?