TL;DR: Logging is an extremely useful tool for understanding a running (and potentially failing!) application and an essential element of running any code in production. Adding logging to your long-running script, shiny app or plumber API is simple and can pay off enormously when things go wrong or when you want someone else to look after your code for you.
Log….
Rolls down stairs,
Alone or in pairs,
Rolls over your neighbour's dog.
What’s great for a snack,
and fits on your back?
It's log, log, log.
It’s log, log, it’s big it’s heavy it’s wood.
It’s log, log, it’s better than bad it’s good.
Everyone loves a log,
You’re gonna love it log,
Come on and get your log.
Everyone needs a log.
Log, log, log.
Log from Blammo!
“Log” commercial from The Ren & Stimpy Show (Nickelodeon, 1991) YouTube
What is logging and why do we need it?
I once spent two and a half hours debugging the wrong part of someone else’s code. I’d assumed a failure I was seeing was in a different part of the codebase to where it actually was. Simple logging of what part of the code was actually running would have prevented that wasted effort.
Logging is a way for us to record what’s happening with the programs we write. This can be useful for a couple of reasons:
First, if it’s the sort of code that you run interactively, it can tell you what your program is doing while it’s doing it, so you can see what it’s up to at any given moment.
This can help you understand how long different parts of your program take to run and also tell you where your program is in its flow. Is it near the beginning, middle or end? Is it taking a long time to do that data load today? That sort of thing.
The second is for when your programs run unattended, as in the case of scheduled scripts and shiny apps and so on, where you’re not generally watching over the program while it runs. This is the “production” mode of operation. In this case, writing out a log file can be a really useful record of what your program did and when it did it.
For instance, if you’ve written a shiny app and users report that it’s feeling sluggish today, you could check the log for any anomalous behaviour. Maybe that log file shows that API requests that you’re making are taking longer than you’d normally expect, or perhaps timing out altogether. It’s essentially a shortcut to resolving issues that may arise in the future.
Logging sounds hard!
Choosing how to implement logging in your application certainly can be a challenge.
- Where should you put messages?
- What should the messages contain?
- Are there simple patterns you can use?
- Should you roll your own or use a logging package?
But once you get into the habit of adding logging to your programs, it becomes a much easier task.
How do I get started?
The simplest method of providing any sort of logging in R is the humble cat() function. You’ve probably added a bunch of cat() statements to your programs in the past while you’ve been debugging.
if (isTRUE(thing_status)) {
do_something()
} else {
cat("Does this _ever_ happen??\n")
do_something_else()
}
This is a common pattern for most developers as it provides an extra level of observability while your program is running. Once that round of debugging is complete, you probably commented out or removed the cat() statements and moved on to other things.
When we implement logging in our apps we’re really just formalising this informal debugging practice and extending it into something more generally useful. It gives other people, who may not even understand how your program works, an idea of what’s going on inside it.
Say you’ve written a long running ETL program. This program will be run on a daily basis by your CI tool of choice. You wrote the code, but it’s essential that anyone on your team and the CI team can debug issues with it. It’s important to remember here that our code rarely exists in isolation. Code is just one part of a much broader ecosystem of tooling, of databases, file systems, operating systems, networks and so on. In larger organisations many of these things will be outside of your direct control, run by other teams with different priorities to your own.
It should be clear even to someone who doesn’t understand your code what is happening when the program runs. For instance, if your program fails to connect to the database, seeing a log message about that failure may prompt someone to check the database and discover that it’s down for some reason. Good logging can empower even non-programmers to solve problems that might otherwise require delving into your code.
Where should I start?
Those cat()s you’ve been using are a great first step, and R has a number of excellent logging packages to take your logging to the next level. One of my favourites is logger, as it’s extremely flexible and powerful. If you’re looking for simplicity though, I’d encourage you to check out rlog.
Note: The rest of this post uses rlog to demonstrate the concepts being presented. As much as I like rlog, I don’t mind if you want to use a different package. The concepts below should be easily transferable to other packages and even to languages other than R, since most have similar logging tools.
I wrote rlog (and its shell companion, shlog, the shell script log handler) to be highly opinionated and, as a consequence, extremely simple to use. I wanted cat()-like simplicity and no bells or whistles. As such, I think it’s also a great tool for those who are new to this sort of explicit logging.
rlog::log_info("Connecting to the database...")
rlog:
- Has exactly one (1) log message format
- timestamp [LOG LEVEL] “your message here”
- 2021-06-13 17:19:06 [INFO] Connecting to the database…
- Only writes messages to standard out and standard error
- These are Linux standards that make it easier to get your code into production
- This also helps external tools like RStudio Connect or CI/CD tools to surface your logs.
- Has a LOG_LEVEL controlled by an environment variable
- This makes it easy for other people (e.g. people who didn’t write the code) to change how much logging information your program spits out without changing your code. More on this later!
- This also comes with a sensible default value in case you don’t want to specify one just yet.
When first getting to grips with logging with rlog, the log_info() function should be your initial go-to. It’s pretty much the same as cat(), except it nicely formats the output for you. I generally start by adding an info message to each section of my code as it begins, to act as markers in the output.
rlog::log_info("Some info from my program")
Prints this:
2021-06-16 21:19:06 [INFO] Some info from my program
Think of it as a way for your program to tell you what it’s doing…
- I’m connecting to the database
- I’m extracting the data
- I’m running the first stage of the transformation
- Now I’m running the next stage
- Now I’m adding in data from this other place
- Now I’m fitting a model to that data
- Now I’m writing all my outputs to disk
- I’ve finished
So you could add the following lines to the appropriate places:
rlog::log_info("Connecting to the database")
rlog::log_info("Extracting data")
rlog::log_info("Running first stage of transform")
rlog::log_info("Running second stage of transform")
rlog::log_info("Combine with external data")
rlog::log_info("Fitting model")
rlog::log_info("Writing outputs to disk")
rlog::log_info("Finished")
Now if you see something like this in your logs:
2021-06-12 21:20:10 [INFO] Running second stage of transform
Error in df$wingspan : object of type 'closure' is not subsettable
You’ll know that the error occurred in the fourth section, before any attempt was made to combine the external data in step five.
Now that you have the basic idea of logging down, we can start to think about when to emit log messages and what they should be about.
When and what to log?
Some people might argue that you should log everything. That’s excessive in all but a few cases, but it’s certainly true that adding logging to all branches of your code can help guide you more swiftly to specific problems.
My preference, though, is to add logging to things which I already know have the potential to be tricky or fragile. Any time I rely on a connection to a database or API or anything else outside of my direct control, I’ll log what’s happening in case it fails for some reason. Naturally, this is no excuse to skip proper error handling in your programs, but remember, our aim is to help our future selves, as well as others, diagnose problems from the outside.
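For example, around an API call you might log when the request starts, how long it took, and a clear message if it fails. Here’s a sketch of that idea; the URL and the httr calls are just stand-ins for whatever external service your program actually depends on:
rlog::log_info("Requesting latest prices from the pricing API")
started <- Sys.time()

response <- tryCatch(
  httr::GET("https://example.com/api/prices", httr::timeout(30)),  # placeholder request
  error = function(e) {
    # A plain-English message for whoever is watching the logs...
    rlog::log_error("Pricing API request failed - is the service reachable?")
    # ...plus the underlying error for the developer.
    rlog::log_error(conditionMessage(e))
    stop(e)
  }
)

elapsed <- round(as.numeric(difftime(Sys.time(), started, units = "secs")), 1)
rlog::log_info(paste0("Pricing API responded in ", elapsed, " seconds"))
If the API ever slows down or goes away, the log now tells that story without anyone needing to open the code.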
Wouldn’t it be great though, if there was a way to show even more log messages when you’re debugging your code than you might see under ordinary operation? Maybe you want to add logging to every little nook and cranny of your code, but you don’t want that to show up when it’s running in production as it would add too much noise and not enough signal. What you need are called “log levels”.
What are log levels?
Now that you’re comfortable getting some logging info out of your programs with log_info(), we can turn our attention to controlling our logging output.
Often, when your program is running smoothly, you don’t need a lot of output. When you’re testing, rolling out a new program or debugging issues with an existing one, it’s nice to be able to get more log output. It’s even better, especially in production settings, if this can be done by someone who didn’t write the program and who has no knowledge of how the program was written. To do this, we use “log levels”.
Log levels give your logging output even more power. They allow you to choose the severity of the log messages that are emitted.
rlog has support for the following log levels, in decreasing order of severity:
- FATAL
- ERROR
- WARN
- INFO (the default log level)
- DEBUG
- TRACE
The default log level in rlog is “INFO”. This means that only messages at the INFO level and above will be emitted when the program runs.
Log levels are often specified in upper case and you’ll see them written like that a lot. The rlog package isn’t fussy though, and you can use either all upper case or all lower case names for the levels.
rlog relies on an environment variable called LOG_LEVEL to control which log messages are emitted.
If the environment variable is not set, it will default to “INFO”.
Another example:
- FATAL
- ERROR <= If we set the log level here, only messages of this level and above will be emitted.
- WARN
- INFO
- DEBUG
- TRACE
This is a really powerful way of only including the messages you want to see in the log files at any given time. For example, you may choose to set the LOG_LEVEL to “TRACE” while developing your application or if it runs into a problem in production, but you might choose to run the application with a LOG_LEVEL of “ERROR” under normal circumstances.
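If it helps to demystify what’s going on, the level check itself boils down to comparing ranks. This is just a conceptual sketch of the idea, not rlog’s actual implementation:
# Each level gets a rank, most severe first. A message is only emitted when
# its rank is at least as severe as the threshold set in LOG_LEVEL.
ranks <- c(FATAL = 1, ERROR = 2, WARN = 3, INFO = 4, DEBUG = 5, TRACE = 6)

should_emit <- function(msg_level) {
  threshold <- toupper(Sys.getenv("LOG_LEVEL", unset = "INFO"))
  ranks[[toupper(msg_level)]] <= ranks[[threshold]]
}

should_emit("info")   # TRUE with the default LOG_LEVEL of "INFO"
should_emit("debug")  # FALSE until LOG_LEVEL is set to "DEBUG" or "TRACE"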
Using rlog
The best way to use rlog is to call the functions directly from within your applications.
There is one function per log message level:
rlog::log_fatal("my fatal message")
rlog::log_error("my error message")
rlog::log_warn("my warn message")
rlog::log_info("my info message")
rlog::log_debug("my debug message")
rlog::log_trace("my trace message")
If you run this code as-is without setting a custom LOG_LEVEL with the environment variable, you’ll get output like this:
2021-06-16 21:25:06 [FATAL] my fatal message
2021-06-16 21:25:06 [ERROR] my error message
2021-06-16 21:25:06 [WARN] my warn message
2021-06-16 21:25:06 [INFO] my info message
The “DEBUG” and “TRACE” messages won’t be printed, since the default LOG_LEVEL is “INFO” and those messages have lower priority.
Play around with setting the LOG_LEVEL directly in R and seeing what messages are emitted:
Sys.setenv("LOG_LEVEL" = "ERROR")
rlog::log_fatal("my fatal message")
rlog::log_error("my error message")
rlog::log_warn("my warn message")
rlog::log_info("my info message")
rlog::log_debug("my debug message")
rlog::log_trace("my trace message")
Gives us this output:
2021-06-16 21:26:14 [FATAL] my fatal message
2021-06-16 21:26:14 [ERROR] my error message
Remember though, the best way to set the environment variable is outside of the app. That way it’s easier to change at run-time and can be changed by whoever ends up looking after the running of your program, without having to understand how the program was written.
In your terminal you can do something like this before you run your script:
$ export LOG_LEVEL=TRACE
$ ./my_script.R
Or you can call your script with the appropriate value at run-time:
$ LOG_LEVEL=ERROR ./my_script.R
RStudio Connect users can set an environment variable called LOG_LEVEL in the “Vars” tab of their app’s control panel. Other tools have similar ways to set environment variables; see your tool of choice’s documentation for details.
There’s a small shiny app to demonstrate log levels interactively over on GitHub.
Error masking
All programming languages and extension packages have errors of their own that they spit out from time to time. These can be more or less helpful depending on the quality of the message, especially to someone whose responsibility is the ongoing support of your app, rather than its development.
It’s possible to “mask” these errors by intercepting them and returning one of your own in its place. It can be tempting to mask an R or package error with a log message, but you should be very cautious about doing this as it can end up making things worse for you in the long run.
Say you’re analysing hat shop data. You want to know if Trilbys are in stock, and you have some code like this:
hats <- list(trilby = TRUE, fez = FALSE, bowler = TRUE)
trilby_stock <- hats$trilby
trilby_stock
But you know that, for reasons outside of your control, your data sometimes comes in broken, like this:
hats <- "list of hats"
Perhaps this isn’t a show-stopping error for your app, and it’s OK to proceed with the rest of your code, but it results in this error:
Error in hats$trilby: $ operator is invalid for atomic vectors
This error message is hopefully useful to you as a developer, but probably not that useful to an operations person who may end up supporting your work in production. So you might decide to mask the error with one of your own:
hats <- "list of hats"
trilby_stock <- tryCatch(
expr = hats$trilby,
error = function(e){
rlog::log_warn("Trilby hat stock status not defined - check input data")
FALSE
}
)
trilby_stock
Now when that snippet of code runs it will output something like this:
2021-06-16 22:38:39 [WARN] Trilby hat stock status not defined - check input data
In one sense this is an improvement. It’s better for the person who ends up supporting your work in production, since they no longer need to understand what R might be telling them and can instead rely on what you’re telling them. However, it’s much worse for you, as it’s now not possible to see the underlying error without changing the code.
It’s worth remembering that this code indiscriminately catches any and all errors. What if something else goes wrong and produces a different error from R or a package and it was being masked in this way? You wouldn’t have a way to see what the underlying error was without going in and changing your code, which is not something an ops team should be expected to do.
A better pattern would be something like this:
# hats <- list(trilby = TRUE, fez = FALSE, bowler = TRUE)
hats <- "list of hats"
trilby_stock <- tryCatch(
expr = hats$trilby,
error = function(e){
rlog::log_warn("Trilby hat stock status not defined - check input data")
rlog::log_debug(e)
FALSE
}
)
trilby_stock
This version keeps the original error message, but hides it behind a lower-priority log level, “DEBUG” in this case.
This means that in normal operation, with the LOG_LEVEL set to INFO, the log output looks like this:
2021-06-16 22:39:28 [WARN] Trilby hat stock status not defined - check input data
But when you need to, you can set the LOG_LEVEL
to DEBUG
and you’ll get this instead:
2021-06-16 22:42:12 [WARN] Trilby hat stock status not defined - check input data
2021-06-16 22:42:12 [DEBUG] Error in hats$trilby: $ operator is invalid for atomic vectors
This way, you’ve preserved the original error message in case you need it at some point, but set it to only be emitted if the appropriate LOG_LEVEL is set. Your ops team won’t see it unless they explicitly set the DEBUG level, which they’re only likely to do if there’s a serious problem and they’re working with you to resolve it.
Of course, this is just a toy example, but it hopefully illustrates this very real problem.
Here’s the full code in case you want to play with it:
# Remember that the LOG_LEVEL should never be set inside your
# code under normal circumstances.
Sys.setenv("LOG_LEVEL" = "DEBUG")
# Switch which `hats` is set by un/commenting these lines
hats <- list(trilby = TRUE, fez = FALSE, bowler = TRUE)
# hats <- "list of hats"
# Here's our stock checking and error masking code.
trilby_stock <- tryCatch(
expr = hats$trilby,
error = function(e){
rlog::log_warn("Trilby hat stock status not defined - check input data")
rlog::log_debug(e)
FALSE
}
)
# Do we have stock of Trilbys?
trilby_stock
The big take-away here is that while it’s possible to mask R and package errors with your own, doing so without carefully considering the ramifications can make your code less maintainable in the future. Be kind to your future self and always provide a fall-back option that exposes the underlying error.
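If you find yourself repeating that pattern, you could wrap it up in a small helper. This is a hypothetical function of my own (it isn’t part of rlog) that logs a friendly message at WARN, keeps the original error at DEBUG and returns a fallback value:
# Hypothetical helper: run an expression and, if it errors, log a friendly
# message at WARN, keep the original error at DEBUG and return a fallback.
with_logged_errors <- function(expr, friendly_msg, fallback = NULL) {
  tryCatch(
    expr,
    error = function(e) {
      rlog::log_warn(friendly_msg)
      rlog::log_debug(conditionMessage(e))
      fallback
    }
  )
}

# The hat shop example from above, using the helper.
hats <- "list of hats"
trilby_stock <- with_logged_errors(
  hats$trilby,
  "Trilby hat stock status not defined - check input data",
  fallback = FALSE
)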
Logging to files
rlog outputs its log messages to the console in the same way that most Unix/Linux tools do, using standard streams. Ordinary messages go to “standard output” (stdout) and errors go to “standard error” (stderr).
When you’re using Linux interactively, both stdout and stderr are directed to the console by default, though depending on your dev setup you may get the stderr messages in a more attention-grabbing colour! This is so that you can see the output of your program as it’s running.
When you want to save that output somewhere else, however, it must be “redirected”. There is some fairly common Linux syntax for this, so if your script is run in a terminal, its output can be redirected like this:
./my_script.R > /path/to/file.log 2>&1
This command (when read from left to right) runs the script called my_script.R, uses the > symbol to redirect stdout to the file /path/to/file.log, and then redirects stderr to stdout with 2>&1. Linux uses the shorthand 1 for stdout and 2 for stderr, so this last part is really saying “send 2 (stderr) to the same place as 1 (stdout)”.
You can also keep the “normal” and error outputs separate:
./my_script.R > /path/to/file.log 2> /path/to/errors.log
This example would redirect all stdout to file.log and all stderr to errors.log.
In rlog, WARN, ERROR and FATAL messages are all sent to stderr and the rest to stdout.
Don’t worry if that all seems a bit too much right now. Many enterprise software products, such as Jenkins and RStudio Connect, will capture this output automatically and save it for you.
Where do we go from here?
Remember, even though much of this post focused on code using rlog, pretty much all the concepts discussed are transferable to other logging packages and even to other languages. The specific language or logging package you’re using is effectively an implementation detail.
Hopefully you now have some good ideas for how to start adding some logging to your own programs. The next step is to go forth and give your own programs production superpowers. Happy logging!