Perl has a reputation as a write-only language and one that is difficult to use. While you can write unreadable code in any language, aspects of Perl make it difficult for novices to read and make it easy to write code that is hard to follow. Still, there is one area where Perl remains the unquestioned champion: one-liners.
Perl was originally designed as an amalgamation of awk, sed, and sh. For those unaware, sh is the Bourne shell, which is a common command line interface to UNIX computers. The Bourne shell has variables, looping, and branching. awk processes input lines and generates output lines. (awk is named from the first letter of the last names of its authors: Aho, Weinberger, and Kernighan.) It has selection criteria (regex matching and string search) as well as arithmetic. sed stands for ‘stream editor’. As its name implies, it edits a stream of text as it passes through, one line at a time. It performs substitutions, mainly based on regular expressions.
With this heritage, plus some immensely useful command line options, Perl is my go-to language when I want to filter, format, or sort the output of one program as the input for another. I’m going to introduce the flags with examples taken from things that I’m actually doing at the time I write them.
Working with Git
My currently preferred version control software is git, which has a full command line interface. However, the output is not usually in a format suitable for further processing. For example, something I commonly do is list what files need to be checked in and what files need to be added. git status provides this, although not in the format needed to add new or modified files to a change set. Here is the output of git status:
$ git status
On branch pre_2015_changes
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)
        modified: .gitignore
        modified: views/welcome/companies_2012.php
        modified: views/welcome/companies_2014.php
        modified: views/welcome/intern_sponsors.php
        modified: views/welcome/sponsor.php
        modified: views/welcome/student_new.php
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        views/welcome/intern_sponsors.php.bak_2014-10-23
        www/images/program_sponsors_2015.png
no changes added to commit (use "git add" and/or "git commit -a")
$
This output has only been changed to remove path leaders that could identify my client.
With git, when you have files that you would like to add to the current changeset (the list of files to be committed), you supply them as arguments to git add. How can Perl help us do this?
Essential Flags for Perl One Liners
-e  Specifies that whatever follows is executed as a program. This allows you to pass code on the command line, quoted with single quotes on UNIX/Linux/OS X and double quotes on Windows. This flag is essential for a one liner. It almost defines what a one liner is.
-l  That is a dash el, which tells Perl to handle line endings for you: they are stripped from each input line and added back when you print. Unlike most programming languages, Perl doesn’t assume anything about line endings. You can handle them explicitly, but most one liners use this flag.
-n  This flag is interesting. It tells Perl to run a loop on the input and execute the code for every line in the input. Think of the n as ‘no print’: nothing is printed unless your code prints it. There is a corresponding and often used flag, -p , which means execute the code on every input line and then print the line.
-a  This flag says to split the input line on whitespace and put the results into an array @F . It is used together with -n or -p . The a stands for autosplit; as a mnemonic, it also resembles the @ sigil used to identify arrays.
Using the Flags
Using these flags, we will extract only the lines we want from the output of git status and format them as input to git add.
Let’s concentrate on the modified files. We can see that the lines for these files are all of the form:
        modified: views/welcome/sponsor.php
Each line has leading space (or tab, we cannot know for sure without using something like od -h ), is tagged with the string modified: , has another space, and then the path to the file that has been modified. We are going to look for lines that start with modified: and then print the last space-delimited ‘thing’ on that line. Like so:
$ git status | perl -alne 'print $F[-1] if $F[0] eq q(modified:)'
.gitignore
views/welcome/companies_2012.php
views/welcome/companies_2014.php
views/welcome/intern_sponsors.php
views/welcome/sponsor.php
views/welcome/student_new.php
$
How does this work?
We loop over all the input lines. If modified: is the first thing on the current line, then we print the last thing on the current line.
The code after the -e flag is run on every input line because of the -n flag:
print $F[-1] if $F[0] eq q(modified:)
Because of the -l flag, line endings are stripped from the input and added back to the output. The -a flag causes the input line to be split on whitespace, with leading whitespace ignored, and the results put into the array @F . Perl uses zero-based indexing, thus $F[0] is the first group of non-space characters on the line, which we check to see if it is equal to the modified: string that is on lines of interest. We use eq to test for string equality, which differs from == , the test for numeric equality. This strangeness was inherited from the Bourne shell. We use a peculiar quoting mechanism to effectively single-quote the string modified: by putting it within q() .
This single quoting construct, and its brother qq() , were added to Perl specifically to make one liners easier. Without it we would have to use the single quote ‘ and escape it, since we are already inside the shell’s single quotes. This is a real headache and very confusing. In fact, q() and qq() are two of the best reasons to use Perl one liners.
What’s with the $F[-1] ? With a negative index into an array, Perl counts backwards from the end of the array towards the beginning, so $F[-1] is the last non-white space group on the line.
Using the Output as Input
Now that we have the list of files to process, what do we do with them? We use a feature of the shell called command substitution: $() is replaced by the output of the enclosed command.
$ git add $(git status | perl -alne 'print $F[-1] if $F[0] eq q(modified:)')
$
If we run git status again, notice that the modified files are now marked as Changes to be committed:
[eif@mayorsinterns mayorsinterns.org]$ git status
On branch pre_2015_changes
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)
        modified: .gitignore
        modified: views/welcome/companies_2012.php
        modified: views/welcome/companies_2014.php
        modified: views/welcome/intern_sponsors.php
        modified: views/welcome/sponsor.php
        modified: views/welcome/student_new.php
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        views/welcome/intern_sponsors.php.bak_2014-10-23
        www/images/program_sponsors_2015.png
[eif@mayorsinterns mayorsinterns.org]$