The plots thicken

Every story needs a good plot

One could think of data science as “art, grounded in facts”. It tells a story through visualisation. Both story and visualisation rely on a good plot. And an abundance of those has evolved over time. Many have their own dedicated Wikipedia page!

Which generate the most interest? How is the interest in each trending over time? Try this app to find out.

The app may take a moment to load:

View full-width here.

Note the utility of selecting the right scaling. The combination of “fixed” and “normal” reveals what must have been “world histogram day” on July 27th 2015, but little else.
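To illustrate the scaling point, here is a minimal sketch using made-up numbers rather than the app's real page-view data. It assumes that “fixed” refers to shared facet scales and “normal” to a linear y-axis; the spike value and the two article names are purely illustrative.

```r
library(ggplot2)

# Synthetic daily views for two chart types (illustrative values only)
set.seed(1)
dates <- seq(as.Date("2015-07-01"), as.Date("2015-08-31"), by = "day")
df <- data.frame(
  date    = rep(dates, 2),
  article = rep(c("Histogram", "Box plot"), each = length(dates)),
  views   = c(rpois(length(dates), 3000), rpois(length(dates), 800))
)

# Inject a single-day spike, mimicking the kind of outlier described above
df$views[df$article == "Histogram" & df$date == as.Date("2015-07-27")] <- 60000

base <- ggplot(df, aes(date, views)) + geom_line()

# Shared linear axis: the spike dominates and flattens everything else
p_fixed <- base + facet_wrap(~article, scales = "fixed")

# Each facet gets its own y-axis, so the smaller series becomes readable
p_free <- base + facet_wrap(~article, scales = "free_y")

# Shared axis on a log10 scale: the spike is compressed but still visible
p_log <- p_fixed + scale_y_log10()
```

Printing `p_fixed`, `p_free` and `p_log` in turn shows how much of the day-to-day variation each choice hides or reveals.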

The need for speed

Turning non-interactive code into an app sharpens the mind’s focus on performance. And profvis, integrated into RStudio via the profile menu option, is a wonderful “tool for helping you understand how R spends its time”.
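For anyone who prefers the console to the menu, profvis can also be called directly. The sketch below profiles a deliberately inefficient function; `slow_summary` is a hypothetical stand-in and not code from the app.

```r
library(profvis)

# Hypothetical slow step: growing a vector one element at a time
slow_summary <- function(n) {
  x <- numeric(0)
  for (i in seq_len(n)) {
    x <- c(x, sqrt(i))  # re-allocates the whole vector on every iteration
  }
  sum(x)
}

# Wrap the code of interest in profvis(); the result is an interactive
# flame graph that opens in the RStudio viewer or a browser when printed
p <- profvis({
  slow_summary(3e4)
})
```

Printing `p` shows time spent per line, which makes it easy to see which call is responsible for most of the delay.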

My first version of the app was finger-tappingly slow.

Profvis revealed the main culprit to be the pre-loading of a dataframe with the page-view data for all chart types (there are more than 100). Profiling prompted the more efficient “reactive” approach of loading the data only for the user’s selection (maximum of 8).
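The reactive pattern might look something like the sketch below. It is not the app's actual code: `fetch_views`, the input and output ids, the date range and the short list of chart types are all assumptions made for illustration.

```r
library(shiny)
library(purrr)
library(ggplot2)

# Hypothetical helper: daily page views for a single Wikipedia article
fetch_views <- function(article) {
  pageviews::article_pageviews(
    project = "en.wikipedia",
    article = article,
    start   = "2016010100",
    end     = "2016123100"
  )
}

# Illustrative subset; the real app offers more than 100 chart types
chart_types <- c("Histogram", "Box plot", "Scatter plot", "Pie chart")

ui <- fluidPage(
  selectizeInput(
    "charts", "Chart types",
    choices  = chart_types,
    selected = chart_types[1],
    multiple = TRUE,
    options  = list(maxItems = 8)  # cap the selection at 8
  ),
  plotOutput("trend")
)

server <- function(input, output) {
  # Data is fetched only for the current selection, and only when it changes
  views <- reactive({
    req(input$charts)
    map_df(input$charts, fetch_views)
  })

  output$trend <- renderPlot({
    ggplot(views(), aes(date, views, colour = article)) +
      geom_line()
  })
}

app <- shinyApp(ui, server)
```

Because `views()` is a reactive expression, nothing is downloaded until a plot asks for it, and the result is cached until the selection changes.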

Profiling also showed that rounding the corners of the plot.background with additional grid-package code was expensive. App efficiency felt more important than a minor cosmetic detail (rounding the main panel to match the theme’s side panel). And most users would probably barely have noticed (had I not drawn attention to it here).

R toolkit

Packages | Functions
purrr | map_df
profvis | profvis
pageviews | article_pageviews
rvest | read_html; html_nodes; html_text
dplyr | mutate; select
stringr | str_replace_all
lubridate | ymd
tibble | data_frame
ggplot2 | geom_line; geom_smooth; facet_wrap
ggthemes | theme_economist; economist_pal
shiny | fluidPage; reactive; renderPlot; shinyApp; selectInput; wellPanel; helpText; selectizeInput; titlePanel; mainPanel; plotOutput
shinythemes | shinytheme

View the code here.

Citations / Attributions

R Development Core Team (2008). R: A language and environment for
statistical computing. R Foundation for Statistical Computing,
Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

 

2 Replies to “The plots thicken”

  1. Hi Carl

    You must go to quite a lot of trouble to craft your illustrations. One that is beautiful and useful immediately caught my eye: the table showing the tools used from the R toolkit.

    Do you build the table by hand, or do you have a tool to help?
    I would like to suggest to my colleagues that they include similar tables in their documentation.

    Thanks

    1. Thanks for the comment Michael.

      I create the table container using the WordPress plugin TablePress. The icons are adapted from Font Awesome; I stylised them in Adobe Creative Cloud. So I have the table with stylised icons as a template I can reuse each time.

      Identifying the packages and functions, though, is manual. I find it worthwhile as it helps me to check consistency, e.g. data_frame versus data.frame. I can also see if I’ve brought something new into any given article, and it’s a nice way to acknowledge the work of some great package authors out there. Automating it all would make a great candidate package.
