Learn to Write Command Line Utilities in R - part 2

In yesterday’s post we took a look at command line utilities in general, some of the reasons why they’re useful, and also made our first bare-bones utility of our own.

Today, we’re going to extend our sortinghat.R example by allowing it to accept ‘arguments’.

Arguments are what we call the stuff that goes after a command when you run it on the command line.

If we take the humble ls command as an example, we can run it on its own, and it will just list the contents of the current working directory. However, ls also supports various arguments. For instance, running ls /tmp will list the contents of the ‘/tmp’ directory instead of the current working directory. Running ls -l lists the contents of a directory using a ‘long’ output format that has more detail than the normal output.

Clearly, arguments provide a great way to change the behaviour of our command line utility, so how do we go about reading them in an R script?

Fortunately, R has a built-in function called commandArgs() that can do just that, and we’re going to use it to let us use a name with our Sorting Hat utility.

Open up the sortinghat.R script from yesterday in RStudio and edit it to match the following:

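#!/usr/bin/env Rscript
# Collect only the arguments typed after the script name
args <- commandArgs(trailingOnly = TRUE)
houses <- c("Hufflepuff", "Gryffindor", "Ravenclaw", "Slytherin")
# Pick one house at random
house <- sample(houses, 1)
# Greet the person named in the first argument
cat(paste0("Hello ", args[1], ", you can join ", house, "\n"))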

If you’ve followed along to this point, you already know what most of this script does: it just selects a house at random. We’ve made a couple of small changes, though:

The first is the new line…

args <- commandArgs(trailingOnly = TRUE)

This takes any arguments we specify and assigns them to ‘args’. The ‘trailingOnly = TRUE’ option means that the first element of ‘args’ is the first argument we supplied, rather than the path to the R executable and the options Rscript passes to it.
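To make the effect of ‘trailingOnly’ a little more concrete, here is a minimal sketch. The ‘trailing_args’ helper is hypothetical (it is not part of R); it just mimics what ‘trailingOnly = TRUE’ does, which is to keep everything after the ‘--args’ marker that Rscript inserts. The ‘example_args’ vector is only an illustration of what the full commandArgs() output might look like; the exact path and flags vary by platform and R version.

```r
# Hypothetical helper: keeps only what follows the "--args" marker,
# mimicking commandArgs(trailingOnly = TRUE).
trailing_args <- function(all_args) {
  marker <- match("--args", all_args)
  # No marker means no user-supplied arguments
  if (is.na(marker)) {
    return(character(0))
  }
  all_args[-seq_len(marker)]
}

# Illustrative stand-in for the full commandArgs() output when running
# ./sortinghat.R sellorm (real values differ between systems)
example_args <- c(
  "/usr/lib/R/bin/exec/R", "--no-echo", "--no-restore",
  "--file=./sortinghat.R", "--args", "sellorm"
)

print(trailing_args(example_args))
# [1] "sellorm"
```

In the real script you don’t need any of this, since commandArgs(trailingOnly = TRUE) does the work for you.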

The second change is that we’ve modified the final line to include the first argument, args[1], in a short message, and we’re now outputting that instead of just the house name. This means that our final output will be along the lines of ‘Hello NAME, you can join HOUSE’.
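If you’d like to see how that message gets put together without running the whole script, here’s a small sketch. The ‘build_message’ function is a hypothetical wrapper added purely for illustration, and the house is fixed rather than random so the output is predictable. It also shows what happens if no name is supplied: args[1] is then NA, which paste0() turns into the text ‘NA’.

```r
# Hypothetical wrapper around the message-building line from the script
build_message <- function(args, house) {
  paste0("Hello ", args[1], ", you can join ", house, "\n")
}

# With a name supplied as the first argument
cat(build_message(c("sellorm"), "Hufflepuff"))
# Hello sellorm, you can join Hufflepuff

# With no arguments at all, args[1] is NA
cat(build_message(character(0), "Hufflepuff"))
# Hello NA, you can join Hufflepuff
```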

Now let’s go ahead and run our new and improved sorting hat command.

Linux/Mac/git-bash

Run the following command, including a name argument like this:

$ ./sortinghat.R sellorm
Hello sellorm, you can join Hufflepuff

Remember to type everything after the ‘$’ symbol and feel free to replace ‘sellorm’ with a name of your choosing. You should see output similar to that displayed above.

Windows

Don’t forget, if you’re using git-bash (see the first article for more info), you need to follow the instructions for Linux/MacOS.

In the first article, we wrote a batch file wrapper for our R script that ended with ‘%*’. That percent-asterisk combo takes any arguments we pass to our batch file and passes them straight through to our R script.

Run the following command, including a name argument like this:

sortinghat.bat sellorm
Hello sellorm, you can join Hufflepuff

Feel free to replace ‘sellorm’ with a name of your choosing. You should see output similar to that displayed above.

Wrapping up

This time around, we’ve extended our Sorting Hat to use an argument, in this case a name, and improved the output a little. Next time we’ll look at ways of improving how we handle arguments. In the meantime, have a think about other improvements you’d like to see, like perhaps getting rid of the random assignment of a house (we’ll fix that soon, I promise!), or something else.

