5 Tips for Using pins with R

pins in a pincushion

It’s no secret that I’m a big fan of the pins package for R (and now there’s Python pins too!).

In this post, we’ll take a look at my top 5 best-practice tips for using pins effectively. Finally, there’s a bonus tip on debugging problems when using pins with RStudio Connect.

1. Use a good title

A good title is an essential part of discovery for your pins. It should be short, but informative. It should help someone who finds the pin quickly understand what it is. In this sense, it’s a lot like naming a variable.

For instance, the following are all terrible titles:

  • pin 76
  • tuesdays data
  • untitled pin
  • water data

A good title is more descriptive, so try to use things like:

  • daily sales extract
  • iot sensor log
  • water flow last 7 days
  • pipeline maintenance model

board %>% pin_write(mtcars,
                    name  = "motor_trend_cars",
                    title = "Motor Trend Car Road Tests")

Note that it’s a good idea to apply a similar philosophy to the pin name as well, but remember not to include any spaces or special characters in it. Keep names simple.
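One way to keep names simple is to derive them from the title. The sketch below is only an illustration: make_pin_name() is a hypothetical helper written for this post, not part of the pins package, and the lowercase-with-underscores convention is just one reasonable choice.

```r
# Hypothetical helper (not part of pins): derive a pin name from a title
make_pin_name <- function(title) {
  name <- tolower(title)
  # Replace every run of characters that isn't a-z or 0-9 with one underscore
  name <- gsub("[^a-z0-9]+", "_", name)
  # Trim any leading or trailing underscores left behind
  gsub("^_+|_+$", "", name)
}

make_pin_name("Motor Trend Car Road Tests")
# "motor_trend_car_road_tests"
```

You may still prefer a shorter, hand-picked name, as in the example above, but a helper like this keeps names consistent across a team.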

2. Always use a description

This is one of my pet peeves. I see so many pins with no description and it always baffles me.

How are people supposed to know what the pin is? Once you have more than about 3 pins on a given board, not providing descriptions is basically a crime. What is this data? Where did it come from? What’s it being used for? Who’s using it?

With no description, you’re essentially condemning your potential users to guessing what your pin contains from its title, and most titles are far from helpful (see item 1!). Throw your users a bone and let them know what they’re looking at!

description <- "The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models)."

board %>% pin_write(mtcars,
                    name        = "motor_trend_cars",
                    title       = "Motor Trend Car Road Tests",
                    description = description)

3. Use metadata

For many use cases, you should also consider adding metadata. Supplying user-defined metadata feels like a power-user move, but it really helps to level up your pins experience, especially in enterprise or other multi-user settings.

You can supply pretty much anything you’d like as metadata as long as you supply it as a list.

Some examples include:

  • Source of the data
  • URL to the documentation for the project you’re working on
  • URL to a data dictionary
  • Project contact details
  • Business department
  • Business owner for the pin

It can also be useful to define a schema for metadata in your organisation, so that there’s consistency across all your pinned data and other consumers know what to expect from it.

metadata <- list(owner       = "sellorm",
                 department  = "R&D",
                 URL         = "https://blog.sellorm.com")

description <- "The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models)."

board %>% pin_write(mtcars,
                    name        = "motor_trend_cars",
                    title       = "Motor Trend Car Road Tests",
                    description = description,
                    metadata    = metadata)

Users can read a pin’s metadata using the pin_meta() function.
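As a rough sketch of what reading metadata back looks like, the example below writes a pin to a temporary local board and then inspects it. The temporary board and the shortened description are assumptions made so the example is self-contained; on a real board you would only need the pin_meta() call.

```r
library(pins)

# A temporary local board, used here only so the sketch is self-contained
board <- board_temp()

pin_write(board, mtcars,
          name        = "motor_trend_cars",
          type        = "rds",
          title       = "Motor Trend Car Road Tests",
          description = "Fuel consumption and design data for 32 cars.",
          metadata    = list(owner = "sellorm", department = "R&D"))

meta <- pin_meta(board, "motor_trend_cars")

meta$title        # the title supplied above
meta$description  # the description supplied above
meta$user$owner   # user-defined metadata is returned under $user
```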

4. Use programmatic metadata

Metadata is just data, so we can improve our metadata game with some automation. Why type your metadata when you can generate it?

Using this technique, you could write a metadata-generating function that includes things like:

  • R version
  • package versions
  • the hostname of the computer used to generate the pin
  • the status of the data, e.g. validated, not-validated
  • info about the collected data, e.g. location or demographic
  • project related info

Bonus points for maintaining a project metadata pin for all your team’s projects. Your auto-metadata-generating function can then read all the project-related metadata it needs from there. A metadata pin for auto-populating your pin’s metadata? Very meta.
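Here’s a minimal sketch of what such a function might look like, using only base R. Both the name generate_metadata() and the field names are hypothetical choices for illustration, and the default package list is a placeholder for whichever packages matter to your project.

```r
# Hypothetical helper: build a metadata list automatically
generate_metadata <- function(status = "not-validated",
                              packages = c("stats", "utils")) {
  # Look up the installed version of each package as a string
  pkg_versions <- vapply(packages,
                         function(p) as.character(utils::packageVersion(p)),
                         character(1))
  list(
    r_version        = R.version.string,
    package_versions = as.list(pkg_versions),
    hostname         = Sys.info()[["nodename"]],
    status           = status
  )
}

metadata <- generate_metadata(status = "validated")
str(metadata)
```

The resulting list can be passed straight to the metadata argument of pin_write(), in the same way as the hand-written list in tip 3.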

5. Don’t use pins for everything

I love pins, but if what you really need is a database, you should use a database. If your data changes very frequently, is very large, or you only ever need subsets of it, you should probably consider using a database instead.

Of course, “very large” is ambiguous and what that actually means will depend a lot on your environment. A 4 GB pin in local storage can be very different to the same sized pin being pushed over the internet to cloud storage. You should be prepared to do a little performance testing as your pin sizes grow to see how your particular board performs with your pinned data.
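A simple way to start that performance testing is to time a write and a read with system.time(). This is only a sketch: the temporary board, the pin name and the randomly generated data frame are stand-ins, so swap in your real board and a realistic dataset to get meaningful numbers.

```r
library(pins)

# Stand-in board; replace with the board you actually use
board <- board_temp()

# Illustrative data: 100,000 rows of random numbers
big <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

# Time how long it takes to write the pin...
write_time <- system.time(
  pin_write(board, big, name = "perf_test", type = "rds")
)

# ...and how long it takes to read it back
read_time <- system.time(
  big_again <- pin_read(board, "perf_test")
)

write_time[["elapsed"]]  # seconds taken to write
read_time[["elapsed"]]   # seconds taken to read
```

Repeating this as the data grows should give you a feel for where your particular board starts to struggle.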

Frequent or small updates can also be inefficient. For instance, if you’re writing an app that records transactional data in a pin, it’s likely that you’ll suffer from performance issues as the number of transactions grows. As with file sizes above, exactly where the break point occurs is going to depend entirely on things outside of the pins package’s control, such as the board you use and how your app is updating data. This is essentially a solved problem, and the solution is a database. Databases were built for this sort of use case and, as with everything, you should endeavour to use the right tool for the job.

Remember, though, that using pins alongside a database can be a great way to provide easier access to extracts, subsets, and augmented datasets that would otherwise be difficult or impossible to share.
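As an illustration of that pattern, the sketch below runs a summary query in a database and pins only the small result. Everything specific here is an assumption: an in-memory SQLite database (via the DBI and RSQLite packages) stands in for your real database, and the table, query and pin name are made up for the example.

```r
library(pins)
library(DBI)

# An in-memory SQLite database stands in for a real database here
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", mtcars)

# Let the database do the heavy lifting, then pin only the small extract
mpg_by_cyl <- dbGetQuery(
  con,
  "SELECT cyl, AVG(mpg) AS mean_mpg, COUNT(*) AS n FROM cars GROUP BY cyl"
)
dbDisconnect(con)

board <- board_temp()
pin_write(board, mpg_by_cyl,
          name        = "mpg_by_cylinder",
          type        = "rds",
          title       = "Mean MPG by cylinder count",
          description = "Summary extract from the cars table.")
```

Consumers of the pin then get the extract without needing database credentials or any knowledge of the query behind it.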

Bonus debugging tip for pins with R

One of the most powerful ways to use pins is with RStudio Connect.

The errors that pins emits are very helpful, but you sometimes need more information that you can use, perhaps together with your IT support team, to resolve the issue. While it is rare, if you ever need to debug a pin publishing problem you can use this technique to get more information on the API calls pins is making.

To do this, we can use httr’s with_verbose() to emit additional connection details:

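library(pins)
library(httr)

# Replace SERVER_URL and API_KEY with your own RStudio Connect details
board <- board_rsconnect(server = "SERVER_URL", key = "API_KEY")

# with_verbose() prints details of the HTTP requests made while publishing
with_verbose(pin_write(board = board, x = mtcars, name = "verbose_test"))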

That’s all of them. What are your favourite pins tips? Let me know over on Twitter.

Photo by Sarah Dao on Unsplash