I'm sure you've heard that all the cool kids are playing with orchestration automation these days. But do you know why? Well first, the resources consumed by modern microservices workloads are becoming much more complex and deploy to far more instances than ever before. And second, more and more of those resources are virtual rather than physical - so many of them will only exist for minutes or even seconds.
All of which means that even if you wanted to go around logging into each of your many servers, it just wouldn't make sense. In most cases in fact, it wouldn't even be possible. Instead, you're going to be running a lot of clever scripts. And the tools you use to run those kinds of scripts are generally called orchestrators.
I'm sure you've encountered at least one or two members of the orchestration club. Besides Ansible, there's Terraform, Chef, Puppet and others. But there are also lower-level tools that work as add-ons to core Linux tools like SSH. Although, seeing how it'll run natively on Windows and, of course, macOS, I'm not sure it's quite correct to call SSH a "Linux" tool any more.
One of those SSH add-ons is a tool set called pssh - which stands for Parallel SSH. That's what we're going to be learning about in this article - which is excerpted from my new Pluralsight course, Linux System Optimization.
For now, though, I'm going to tell you a bit about the lab I'm using so that you can more easily reproduce it and follow along at home. I've got three Ubuntu LXD containers running. The base for all of our operations will be the one with an IP address of 10.0.3.140, while the two host nodes we'll be remotely provisioning will use 10.0.3.93 and 10.0.3.43.
Everything we'll do assumes that we've got passwordless SSH access from my base container to each of the two nodes. If you're not sure how to do that, you can view the SSH module of my Protocol Deep Dive: SSH and Telnet course on Pluralsight. If you're in a hurry, this Red Hat tutorial will get you to the same place.
Installing pssh on Ubuntu is simple and quick:
sudo apt install pssh. It doesn't get any harder on CentOS.
I created a simple host inventory file called sshhosts.txt that contains nothing more than the IP addresses of my two nodes:
$ less sshhosts.txt 10.0.3.93 10.0.3.43
Now I'm going to run the pssh parallel-ssh command to execute a single command on my hosts.
$ parallel-ssh -i -h sshhosts.txt df -ht ext4
-i tells the program to run as interactive - otherwise we wouldn't be shown any command output. -h points to the hosts file that I called sshhosts.txt. And the command itself will be the old Unix utility df. That'll return a list of drives attached to the system along with their mount points and usage information. The -h here will display disk space in human readable units and the t will restrict access to only drives formatted as ext4.
Why do I care about that ext4 business? Because Ubuntu uses the snap package manager and each snap creates its own virtual device. So what? Well, I don't want to have to comb through a dozen or so virtual devices reporting 0 free space just to get to the real drives reporting actual usage.
$ parallel-ssh -i -h sshhosts.txt df -ht ext4  22:02:00 [SUCCESS] 10.0.3.43 Filesystem Size Used Avail Use% Mounted on /dev/sda2 457G 131G 304G 30% /  22:02:00 [SUCCESS] 10.0.3.93 Filesystem Size Used Avail Use% Mounted on /dev/sda2 457G 131G 304G 30% /
And there you go! Full disk space information about both of my nodes. I'm sure you noticed that the information is identical. That's because these are both containers running on my workstation, so as far as they know, they both have full access to my own drive.
For my next trick, I'll collect the /etc/group files from each of my nodes. This is the kind of operation that could be useful to quickly monitor the security status of your nodes. You could add a script that parses the incoming data and alerts you if there are any anomalies.
Before I begin, I'll create a directory locally called host-files. Then I'll use the
parallel-slurp command - whose name wonderfully describes its function. Again, -h points to the hosts file. The
-L sets the host-files directory as the target location for writing the data we're going to generate,
/etc/group is the remote file we want to slurp up, and
group is the name we'd like to assign the data locally.
mkdir host-files parallel-slurp -h sshhosts.txt -L host-files/ /etc/group group
When it's done, your host-files directory will contain sub-directories named after the IP address of each of your nodes. As you can see, there's a file called "group" that contains the /etc/group data from each node.
$ tree host-files/ host-files/ ├── 10.0.3.43 │ └── group └── 10.0.3.93 └── group
Does pssh come with any other treats? Yup. And running
apropos gives you the whole list.
$ apropos parallel parallel-nuke (1) - parallel process kill program parallel-rsync (1) - parallel process kill program parallel-scp (1) - parallel process kill program parallel-slurp (1) - parallel process kill program parallel-ssh (1) - parallel ssh program
This article is based on content in my Pluralsight course, "Linux System Optimization." There's much more administration goodness in the form of books, courses, and articles available at bootstrap-it.com.