Shell command pipelines are a feature of *nix shells. This feature lets you send the output of one command directly to the input of another command. You build a pipeline by typing a syntactically correct command, then the pipe character, “|”, then another syntactically correct command. You can repeat this as many times as required.
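As a minimal sketch of the idea, here is a pipeline of three commands joined by two pipe characters. The fruit names and the variable name first are made up for illustration; only printf, sort, and head are assumed to be available.

```shell
# printf emits three lines, sort orders them, and head keeps only the
# first line of the sorted output. Each "|" feeds one command's output
# into the next command's input.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "$first"
```

No temporary files are needed between the steps: each command reads what the previous one wrote.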
This is fabulously useful. It allows you to perform many tasks with a single command. Each individual command performs a relatively simple, straightforward function. When multiple commands are piped together, minutes' worth of typing, copying, and pasting are compressed into a single command.
If you’re already familiar with command pipelines, please let me know if I made any mistakes. If you aren’t familiar with them, adding them to your toolkit is a big upgrade to your capabilities.
A Real-World Example
Let’s consider a real-world example. My web server runs WordPress. WordPress has a feature called “xmlrpc” which allows statistics and other administrative functions to be performed. But using this feature is somewhat compute-intensive, and if run too often, can cause server performance issues.
Even in a properly configured website, black-hat actors can submit requests to xmlrpc, possibly degrading your server performance.
So I routinely filter my Apache server’s access log to look for IP addresses requesting xmlrpc. I do this with a command pipeline that not only flags the requesting IP addresses but also returns the results in a useful, actionable format.
To run this check, here is the pipeline I submit:
grep xmlrpc /path/to/httpd/access_log | cut -d " " -f1 | grep -v "^192.0" | sort | uniq -c | grep -v "^\s*1 " | sort -n
Whoa! What does all that mean? Let me explain each step in the pipeline.
grep xmlrpc /path/to/httpd/access_log
return every line from the access log which contains the string “xmlrpc”. These lines represent requests to the xmlrpc function.
cut -d " " -f1
using a space character as the field delimiter, return the first field. In my access log, this field is the IP address requesting the xmlrpc function.
grep -v "^192.0"
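ignore lines whose IP address begins with 192.0 (strictly speaking a /16-sized range rather than a class B subnet, since 192.x addresses fall in the old class C space). These are IP addresses owned by the administrators of WordPress, so I treat them as valid requests. The unescaped dot in the pattern matches any character; "^192\.0\." would be a stricter form.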
sort
sort the results so that identical IP addresses end up on adjacent lines. The next command in the pipeline, the uniq command, only merges duplicates that are adjacent, so it requires sorted input to work properly.
uniq -c
collapse each run of identical IP addresses into a single line, prefixed with a count of how many times that address occurs.
grep -v "^\s*1 "
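ignore lines whose count is exactly 1, meaning IP addresses which only tried accessing xmlrpc a single time. The leading "\s*" skips the padding that uniq -c puts before the count.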
sort -n
sort the final list of IP addresses numerically by the number of times each address requested xmlrpc, so the most frequent requesters appear last.
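To try the steps above without touching a real server log, the sketch below runs the same pipeline against a tiny, made-up access log. The log lines, the IP addresses (taken from documentation ranges), and the variable names log and result are all assumptions for illustration, and the "\s" in the pattern relies on a grep that supports it, such as GNU grep.

```shell
# Build a small, hypothetical access log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.7 - - [01/Jan/2024:00:00:01 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
203.0.113.7 - - [01/Jan/2024:00:00:02 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
203.0.113.7 - - [01/Jan/2024:00:00:03 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
198.51.100.4 - - [01/Jan/2024:00:00:04 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
198.51.100.4 - - [01/Jan/2024:00:00:05 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
198.51.100.9 - - [01/Jan/2024:00:00:06 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
192.0.2.10 - - [01/Jan/2024:00:00:07 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
192.0.2.10 - - [01/Jan/2024:00:00:08 +0000] "POST /xmlrpc.php HTTP/1.1" 200 370
203.0.113.7 - - [01/Jan/2024:00:00:09 +0000] "GET /index.php HTTP/1.1" 200 5120
EOF

# The same pipeline as in the article, pointed at the sample file.
result=$(grep xmlrpc "$log" | cut -d " " -f1 | grep -v "^192.0" | sort | uniq -c | grep -v "^\s*1 " | sort -n)
echo "$result"

# Clean up the temporary file.
rm -f "$log"
```

With this sample data, the output should list 198.51.100.4 with a count of 2, followed by 203.0.113.7 with a count of 3. The 192.0 address and the address that appears only once are filtered out, and the request for index.php never gets past the first grep.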
When I run this command, it instantly filters an access log file having hundreds of thousands of rows and returns a short list of IP addresses which have requested xmlrpc more than once. It’s then up to me to take action or not.
This pipeline does for me in less than a second something which would take several minutes of focused effort to do manually.
To create a pipeline, start by typing the first command you need. When that command produces the result you want, append a pipe character and another command, then run it again. Repeat this until you achieve the final output you desire.
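The build-it-one-stage-at-a-time approach might look like the following sketch. The input is a hypothetical list of IP addresses, one per line, and the variable names are invented for this example. It also shows what checking each stage buys you: running uniq before sort gives the wrong number of groups, which is easy to spot and fix before adding more stages.

```shell
# Hypothetical input: one IP address per line, like the output of the cut step.
ips=$(printf '%s\n' 10.0.0.2 10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.2 10.0.0.1)

# First attempt: count duplicates without sorting. uniq only merges adjacent
# lines, so no duplicates are merged and every line becomes its own group.
unsorted_groups=$(printf '%s\n' "$ips" | uniq -c | wc -l)

# Second attempt: the result looked wrong, so put sort ahead of uniq and rerun.
sorted_groups=$(printf '%s\n' "$ips" | sort | uniq -c | wc -l)

# Third attempt: the counts look right, so append more stages to rank the
# addresses and keep only the most frequent one.
top=$(printf '%s\n' "$ips" | sort | uniq -c | sort -n | tail -n 1 | awk '{print $2}')

echo "groups without sort: $unsorted_groups"
echo "groups with sort: $sorted_groups"
echo "most frequent: $top"
```

Each attempt reuses everything from the previous one and adds a single stage, so when the output changes unexpectedly you know which stage caused it.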