Running a shiny app in a docker container

Shiny! Photo by Tim Mossholder on Unsplash

I recently went looking for a tutorial on hosting a shiny app inside a docker container for a friend. There are loads of tutorials available, but this one from Juan Orduz is my favourite. It’s short, to the point, and covers exactly what you need to get started at the perfect level of detail.

It’s a couple of years old now, though, and there are a few ways we can tweak it to make it a little more robust.

There’s also a GitHub repo for the demo project below if you’d like to play around with it.

Project structure

First, let’s revisit the example project structure. If you’ve been a long-time reader of this blog, you’ll know what I think about keeping your data prep with your shiny app, but keeping it in a separate folder is a step in the right direction. We end up with something that looks like this:

+-- project.Rproj
+-- Dockerfile
+-- shiny-app
|   +-- app.R
|   +-- data-df.rds
+-- data-prep
|   +-- data-prep-script.R
|   +-- raw-data.csv

Our new Dockerfile (more on that later) is at the top level of our project and we have directories for our data prep as well as our shiny app. Remember, friends don’t let friends use spaces in file names!
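If you’re starting from scratch, the layout above can be scaffolded from a terminal. This is only a sketch: the top-level directory name my-shiny-project is made up for the example, the files are created empty as placeholders, and project.Rproj is left out on the assumption that RStudio will create it for you.

```shell
#!/bin/sh
# Scaffold the example layout with empty placeholder files.
# "my-shiny-project" is a made-up name; use whatever suits you.
set -e

project_dir="my-shiny-project"

# One directory for the app, one for the data prep.
mkdir -p "$project_dir/shiny-app" "$project_dir/data-prep"

# Empty placeholders; replace them with your real files.
touch "$project_dir/Dockerfile" \
      "$project_dir/shiny-app/app.R" \
      "$project_dir/data-prep/data-prep-script.R"

# List what was created, relative to the project directory.
(cd "$project_dir" && find . -type f | sort)
```

Note that there are no spaces in any of the names.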

Automating some of this with a Makefile would be a great addition for the enthusiastic, but let’s keep this post on-topic.

Dockerfile with improvements

There are four main areas where Juan’s original Dockerfile can be modified, mostly to improve reproducibility.

Let’s use the rocker project’s base shiny image so we don’t need to make our own. This image contains Shiny Server, an open source shiny runtime tool from RStudio. We’re not going to use the “latest” version of the image, though, since that tag moves over time and we can never guarantee which versions of R and other dependencies we’d get. Instead, we’ll pin it to a specific version of R, in our case R 4.0.5.

FROM rocker/shiny:4.0.5

Install all your system dependencies in a single command, and do the same for your R packages. This reduces the number of intermediate ‘layers’ in your image and helps keep the final image size as small as possible. The syntax for doing this can seem weird initially, but the main thing to remember is that if it would work as a single-line command on the system, you can use \ characters to split it over multiple lines in the Dockerfile. What we’re aiming for here is readability, and avoiding lines that are crazy long.

RUN apt-get update && apt-get install -y \
    libcurl4-gnutls-dev \
    libssl-dev

It’s also worth remembering, for both system dependencies and R packages, to only install the bare minimum that your app requires to run. Don’t install “tidyverse”, for instance, unless you’re using every single one of the packages that it installs. Just install the packages you actually use. Again, this helps to keep your final image size as small as it can be.

Installing from a standard CRAN mirror means that packages will be installed from source code and compiled inside the image as it’s being built. This can be a lengthy process and is unnecessary now that RStudio’s public Package Manager is available. Package Manager allows Linux users to install pre-built binaries of the packages, which makes installation quicker.

RUN R -e 'install.packages(c(\
              "shiny", \
              "shinydashboard", \
              "ggplot2" \
            ), \
            repos="https://packagemanager.rstudio.com/cran/__linux__/focal/2021-04-23"\
          )'

Each time you build the image from the original article, you’ll get the latest versions of the packages, and from a reproducibility perspective this is not ideal. It means a package update could introduce a change that negatively impacts your shiny app. Package Manager has an option to pin a CRAN repo mirror to a certain date, meaning you’ll only ever be able to install packages as they were on that date. This is great for reproducibility as it prevents accidental upgrades. Notice the date in the repo URL above.
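To make it obvious which parts of that URL you’d change when you decide to upgrade, here’s a small sketch that assembles it from its pieces. The variable names are made up for illustration; the URL pattern itself is the one used in the install.packages() call above. It only prints the result, it doesn’t install anything.

```shell
#!/bin/sh
# Build the dated Package Manager URL from its parts.
# Variable names are illustrative; the URL pattern is the one used above.
distro="focal"              # Ubuntu 20.04 codename
snapshot_date="2021-04-23"  # packages are frozen as they were on this date

repo_url="https://packagemanager.rstudio.com/cran/__linux__/${distro}/${snapshot_date}"

echo "Pinned repo: ${repo_url}"
echo "R call:      install.packages(\"shiny\", repos = \"${repo_url}\")"
```

Moving to newer packages then means changing snapshot_date, rebuilding and testing, which keeps the upgrade a deliberate step.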

The final benefit of Package Manager is that the listing for each package tells you what system dependencies it requires on each operating system it supports (the rocker/shiny docker image uses Ubuntu 20.04). For instance, if we look at the Package Manager entry for the ‘httr’ package and check the “Install system prerequisites…” section, we’ll see that it needs the libcurl4-openssl-dev and libssl-dev packages to work. This means that to use httr, we’d need to add those packages to the system dependencies section of the Dockerfile.

We’ve talked a fair bit above about reproducibility and pinning things to specific versions or dates. None of this is to say that you should never upgrade your app, the packages it uses or even the base docker image. Of course you should! But you should be in complete control of how, and more importantly when, that happens, so you can test any upgrades thoroughly.

Your new docker file

So our new Dockerfile now looks something like this:

# Example shiny app docker file
# https://blog.sellorm.com/2021/04/25/shiny-app-in-docker/

# get shiny server and R from the rocker project
FROM rocker/shiny:4.0.5

# system libraries
# Try to only install system libraries you actually need
# Package Manager is a good resource to help discover system deps
RUN apt-get update && apt-get install -y \
    libcurl4-gnutls-dev \
    libssl-dev
  

# install R packages required 
# Change the packages list to suit your needs
RUN R -e 'install.packages(c(\
              "shiny", \
              "shinydashboard", \
              "ggplot2" \
            ), \
            repos="https://packagemanager.rstudio.com/cran/__linux__/focal/2021-04-23"\
          )'


# copy the app directory into the image
COPY ./shiny-app/* /srv/shiny-server/

# run app
CMD ["/usr/bin/shiny-server"]

That just leaves the last two commands: COPY and CMD.

The COPY instruction copies the contents of the shiny-app directory into /srv/shiny-server/, the directory inside our container image that Shiny Server serves apps from.

Finally, the CMD instruction defines what command should be run by default when a container is started from this image.

Secret bonus: The RStudio IDE has syntax highlighting for Dockerfiles!

Building and running the image

Make sure you have Docker installed before you begin.

To build the docker image, open your terminal application in the directory where your project is located and run:

docker build -t my-shiny-app .

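Notice the . at the end of this command. It’s important! It tells docker to use the current directory as the build context, which is where it looks for the Dockerfile and for the files to copy into the image.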

The “my-shiny-app” part is the name of our docker image. You can call it whatever you like.
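In keeping with the pinning theme, you can also give the image an explicit tag after a colon. Without one, docker tags the image as “latest”. Here’s a minimal sketch; the date-based tag is just a scheme I’ve picked for the example, and the script only runs the build if docker and a Dockerfile are actually present.

```shell
#!/bin/sh
# Tag the image with an explicit version instead of relying on "latest".
image_name="my-shiny-app"
image_tag="2021-04-25"   # hypothetical tag; any scheme you like works
image_ref="${image_name}:${image_tag}"

# Only build when docker is installed and we're in a project directory.
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  docker build -t "$image_ref" .
else
  echo "Would run: docker build -t ${image_ref} ."
fi
```

If you do tag your image like this, remember to use the same name and tag wherever you refer to the image later on.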

To create a local container from your image run:

docker run --rm -p 3838:3838 my-shiny-app

The --rm flag removes the container after it’s stopped, and -p 3838:3838 maps port 3838 on your machine to the same port inside the container, where Shiny Server is listening.

To access the locally running app, open a web browser at http://localhost:3838.

To stop the container, head back to the terminal where docker is running and press Ctrl+C.
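If port 3838 is already in use on your machine, only the left-hand side of the -p mapping needs to change, since the format is host port first, container port second. The sketch below assumes you want port 8080 locally (an arbitrary choice for the example) and only starts the container if docker is installed and the my-shiny-app image has been built.

```shell
#!/bin/sh
# -p takes HOST_PORT:CONTAINER_PORT. Shiny Server listens on 3838 inside
# the container; the host port can be any free port on your machine.
host_port=8080        # hypothetical choice for this example
container_port=3838
port_mapping="${host_port}:${container_port}"

echo "App URL: http://localhost:${host_port}"

# Only start the container when docker and the image are available.
if command -v docker >/dev/null 2>&1 \
   && docker image inspect my-shiny-app >/dev/null 2>&1; then
  docker run --rm -p "$port_mapping" my-shiny-app
else
  echo "Would run: docker run --rm -p ${port_mapping} my-shiny-app"
fi
```

With this mapping, the app would be at http://localhost:8080 rather than on port 3838.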

Hosting the app

Hosting your application can be a little trickier than building it and is beyond the scope of this post. At a high level, it would usually involve pushing the image to a container registry like Docker Hub or one run by your cloud provider. Then the image could be pulled into a service like AWS Fargate or GCP’s Cloud Run, or alternatively you could self-host on a server you maintain yourself.

Wrapping up

And that’s it. Hopefully there’s enough info here for you to start to play around with containerising your own shiny apps. It’s a great low-budget solution to publishing an app, even if it’s not as straightforward as shinyapps.io or as enterprise-ready as RStudio Connect.

Sunday, April 25, 2021