Modelling and visualizing data

You will learn how to:

1 Interactive visualization: The core of Shiny

  • Shiny is well suited to interactive visualization
    • Plots can be modified using UI inputs
    • Seamless integration of interactivity elements (e.g. pan, zoom)
    • Dashboards support storytelling by providing context for plots

1.1 Good practice examples

  • Many Shiny apps put these concepts into practice; one example is Edward Parker’s COVID-19 tracker
Question

Explore the COVID-19 tracker. Do you think this is a good Shiny app? If so, why? If not, why not?

COVID-19 Tracker

1.2 Plain plotting vs. Shiny

Feature | Plain R | Shiny | Examples
Reactivity | Changes to the visualization have to be made in the code | Visualizations can be modified on the fly using widgets such as drop-down menus | ExPanD
Interactivity | Plots are static raster or vector images | Plots can be dynamic and can be interacted with | COVID-19 tracker
Narrativity | Sense-making happens through manual annotation, e.g. in an article or a presentation | Plots are embedded in a compilation of narrative elements that can tell a coherent story | Freedom of Press Shiny app, GRETA Analytics

Medium Reactivity Interactivity Narrativity
Plain image
Paper / report
Dashboard (e.g. Tableau) ☑️
Quarto / RMarkdown ☑️
Traditional website ☑️
Shiny

1.3 Current app state

  • In the last sections, we added a table and a plot and linked them to a number of inputs
  • The code chunk below contains the current app state
  • In this section, we will:
    • Augment the violin plot
    • Add an interactive map
Full code for the current app state
library(dplyr)
library(tidyr)
library(shiny)
library(plotly)
library(leaflet)
library(haven)

ess <- readRDS("ess_trust.rds")
ess_geo <- readRDS("ess_trust_geo.rds")

# UI ----
ui <- fluidPage(
  titlePanel("European Social Survey - round 10"),
  
  ## Sidebar ----
  sidebarLayout(
    sidebarPanel(
      ### select dependent variable ----
      selectInput(
        "xvar",
        label = "Select a dependent variable",
        choices = c(
          "Trust in country's parliament" = "trust_parliament",
          "Trust in the legal system" = "trust_legal",
          "Trust in the police" = "trust_police",
          "Trust in politicians" = "trust_politicians",
          "Trust in political parties" = "trust_parties",
          "Trust in the European Parliament" = "trust_eu",
          "Trust in the United Nations" = "trust_un"
        )
      ),
      
      ### select independent variable ----
      selectInput(
        "yvar",
        label = "Select an independent variable",
        choices = c(
          "Placement on the left-right scale" = "left_right",
          "Age" = "age",
          "Feeling about household's income" = "income_feeling",
          "How often do you use the internet?" = "internet_use",
          "How happy are you?" = "happiness"
        )
      ),
      
      ### select a country ----
      selectizeInput(
        "countries",
        label = "Filter by country",
        choices = unique(ess$country),
        selected = "FR",
        multiple = TRUE
      ),
      
      ### filter values ----
      sliderInput(
        "range",
        label = "Set a value range",
        min = min(ess$trust_parliament, na.rm = TRUE),
        max = max(ess$trust_parliament, na.rm = TRUE),
        value = range(ess$trust_parliament, na.rm = TRUE),
        step = 1
      )
    ),
    
    ## Main panel ----
    mainPanel(
      tabsetPanel(
        type = "tabs",
        
        ### Table tab ----
        tabPanel(
          title = "Table",
          div(
            style = "height: 600px; overflow-y: auto;",
            tableOutput("table")
          )
        ),
        
        ### Plot tab ----
        tabPanel(
          title = "Plot",
          plotOutput("plot", height = 600)
        ),
        
        ### Map tab ----
        tabPanel(
          title = "Map",
          leafletOutput("map", height = 600)
        )
      )
    )
  )
)


# Server ----
server <- function(input, output, session) {
  # update slider ----
  observe({
    var <- na.omit(ess[[input$xvar]])
    is_ordered <- is.ordered(var)
    var <- as.numeric(var)
    updateSliderInput(
      inputId = "range",
      min = min(var),
      max = max(var),
      value = range(var),
      step = if (is_ordered) 1
    )
  }) %>%
    bindEvent(input$xvar)
  
  # filter data ----
  filtered <- reactive({
    req(input$countries, cancelOutput = TRUE)
    
    xvar <- input$xvar
    yvar <- input$yvar
    range <- input$range
    
    # select country
    ess <- ess[ess$country %in% input$countries, ]
    
    # select variable
    ess <- ess[c("idno", "country", xvar, yvar)]
    
    # apply range
    # keep the slider endpoints themselves (inclusive range)
    ess <- ess[ess[[xvar]] >= range[1] & ess[[xvar]] <= range[2], ]
    ess
  })
  
  # render table ----
  output$table <- renderTable({
    filtered()
  }, height = 400)
  
  # render plot ----
  output$plot <- renderPlot({
    xvar <- input$xvar
    yvar <- input$yvar
    plot_data <- filtered() %>%
      drop_na() %>%
      mutate(across(where(is.numeric), .fns = as.ordered))
    
    ggplot(plot_data) +
      aes(x = .data[[xvar]], y = .data[[yvar]], group = .data[[xvar]]) +
      geom_violin(fill = "lightblue", show.legend = FALSE) +
      theme_classic()
  })
}

shinyApp(ui = ui, server = server)

1.4 Recap: Plotting in Shiny

  • Inserting plots into a Shiny app works just like inserting any other UI component
  • You need two things: plotOutput() (or similar) in the UI and renderPlot() (or similar) in the server function
    • plotOutput() creates the empty element in the UI where the plot will go
    • renderPlot() renders the plot and updates the UI element every time a reactive dependency is invalidated
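
The pairing described above can be sketched as a minimal app. This is only an illustration, not part of the workshop app: the IDs "bins" and "hist" are made up, and the built-in faithful dataset stands in for the ESS data.

```r
library(shiny)

# UI: one input and an empty placeholder where the plot will go
ui <- fluidPage(
  sliderInput("bins", label = "Number of bins", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: renderPlot() re-runs whenever input$bins is invalidated
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(
      faithful$waiting,
      breaks = input$bins,
      main = NULL,
      xlab = "Waiting time (min)"
    )
  })
}

# Build the app object; launch it only in an interactive session
app <- shinyApp(ui = ui, server = server)
if (interactive()) runApp(app)
```

Moving the slider changes input$bins, which invalidates the renderPlot() expression and redraws the histogram in the plotOutput() placeholder.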

2 Data masking

  • Data masking means that function arguments are not evaluated in the usual way, but captured (“defused”) and evaluated later in the context of the data
  • Many functions for plotting or creating tables employ this strategy, including those of the tidyverse, where it is called “tidy evaluation”
  • In practical terms, this means you can write column names bare, as if they were variables, instead of passing them as strings
  • To learn more about data masking in Shiny, see chapter 20 of Advanced R and chapter 12 of Mastering Shiny
# Non-standard evaluation (NSE) as "tidy evaluation"
ess %>%
  summarize(mean = mean(trust_eu))

# NSE in base R
subset(ess, select = trust_eu)
with(ess, sum(trust_eu))

2.1 Why is data masking a problem?

  • Data masks are a little tricky to handle at higher levels of abstraction, i.e. inside functions or reactive expressions
  • In such cases, we do not refer to one specific column, but to a column whose name changes dynamically and is stored as a string (e.g. input$xvar)
plot_df <- function(df, var) {
  ggplot(df) +
    aes(x = var) +
    geom_histogram()
}

plot_df(ess, "trust_eu")
Error in `geom_histogram()`:
! Problem while computing stat.
ℹ Error occurred in the 1st layer.
Caused by error in `setup_params()`:
! `stat_bin()` requires a continuous x aesthetic.
✖ the x aesthetic is discrete.
ℹ Perhaps you want `stat="count"`?
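
The error arises because the data mask does not look inside var. A small sketch with base R's with() shows the same mechanism. The data frame here is a toy stand-in for the ESS data, with made-up values.

```r
# Toy data standing in for the ESS data (hypothetical values)
df <- data.frame(trust_eu = c(3, 5, 8))
var <- "trust_eu"

# The data mask looks for a column called `var`, finds none, and falls
# back to the variable `var`, which is just a string
masked <- with(df, var)
masked
#> [1] "trust_eu"

# Standard subsetting uses the string to look up the column
df[[var]]
#> [1] 3 5 8
```

In the failing plot_df() call, ggplot2 likewise receives the constant string "trust_eu" as the x aesthetic, which is why it complains about a discrete variable.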

2.2 Strategy 1: Use tidy pronouns

  • Tidyverse functions that feature tidy evaluation support the .data and .env pronouns
  • The .data pronoun is a representation of the original data which can be used in a masked environment
  • See also the reference of rlang
plot_df <- function(df, var) {
  ggplot(df) +
    aes(x = .data[[var]]) +
    geom_histogram()
}

plot_df(ess, "trust_eu")
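
The two pronouns can also be combined. The following sketch uses dplyr::filter() on toy data (the values and the threshold column are invented for illustration): .data[[var]] picks the column named by a string, while .env$threshold refers to the variable outside the data even though a column of the same name exists.

```r
library(dplyr)

# Toy data; the column `threshold` deliberately clashes with a variable name
df <- data.frame(trust_eu = c(2, 5, 9), threshold = c(10, 10, 10))
var <- "trust_eu"
threshold <- 4

# .data[[var]] looks up the column named by the string stored in `var`
# .env$threshold takes `threshold` from the calling environment, not from df
res <- df %>%
  filter(.data[[var]] > .env$threshold)

res
```

Without .env, the comparison would use the threshold column (all 10s) and return no rows.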

2.3 Strategy 2: Convert strings to expressions

  • Sometimes, masked expressions can simply be constructed as strings
  • Formulas are one example (e.g. in lm(y ~ x1 + x2))
  • The as.formula() function converts such a string into a formula object
linreg <- function(df, y, x) {
  fm <- paste(y, "~", paste(x, collapse = " + "))
  fm <- as.formula(fm)
  lm(fm, data = df)
}

linreg(ess, y = "trust_eu", x = c("age", "left_right"))
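
As an alternative to pasting strings by hand, base R's reformulate() builds the same kind of formula from character vectors. A sketch, using the built-in mtcars data instead of the ESS data (linreg2 is a hypothetical name chosen for this example):

```r
# Variant of linreg() that lets reformulate() assemble the formula
linreg2 <- function(df, y, x) {
  # termlabels are joined with "+", response becomes the left-hand side
  fm <- reformulate(termlabels = x, response = y)
  lm(fm, data = df)
}

fit <- linreg2(mtcars, y = "mpg", x = c("wt", "hp"))
coef(fit)
```

Both approaches produce an ordinary formula object, so the choice is mostly a matter of taste.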

2.4 Strategy 3: Change names

  • Where data masking is poorly implemented, there may be no tools available to inject variables
  • One workaround is to rename the column to a fixed name that the masked code can refer to
plot_df <- function(df, var) {
  # df[var] always returns a data frame, even for a single column
  df <- df[var]
  names(df) <- "x"

  ggplot(df) +
    aes(x = x) +
    geom_histogram()
}

plot_df(ess, "trust_eu")

3 Interactivity

  • Base R graphics are static and offer very little interactivity
  • Shiny supports some basic interactivity (e.g. clicking, hovering, brushing) through plotOutput()
    • Not covered in this workshop! For a primer, check out chapter 7.1 of Mastering Shiny
  • All of today’s cool kids use interactivity through JavaScript interfaces
  • Shiny can generally handle all kinds of JavaScript-based widgets because Shiny apps are HTML documents

4 Plotly

4.1 Plotly’s grammar of graphics

  • Similar to ggplot2, R plotly defines its own grammar of graphics
  • A plotly canvas is created with plot_ly()
  • Additional plot elements can be added through pipes %>% or |>
ess_geo <- readRDS("data/ess_trust_geo.rds")
ess_geo <- mutate(
  ess_geo,
  region = case_match(
    country,
    c("AT", "BE", "CH", "DE", "NL", "PL", "CZ") ~ "Central",
    c("BG", "EE", "HR", "HU", "LT", "LV", "SI", "SK") ~ "Eastern",
    c("ES", "IT", "PT", "RS", "ME") ~ "Southern",
    c("IS", "SE", "FI", "GB", "IE", "DK") ~ "Northern"
  )
)

plot_ly(
  sf::st_drop_geometry(ess_geo),
  x = ~trust_eu,
  y = ~left_right,
  z = ~age,
  color = ~region,
  text = ~country
) %>%
  add_markers() %>%
  layout(scene = list(
    xaxis = list(title = 'Trust in the EU'),
    yaxis = list(title = 'Left-right placement'),
    zaxis = list(title = 'Age')
  ))
1. Variables such as x, y, z and color are defined as formulas in a call to plot_ly. This is comparable to calling ggplot(aes(x, y, z, color)).
2. The plot type is added through a pipe. This is comparable to ggplot2 functions such as geom_point or geom_bar.
3. Visual sugar is then added by calling layout and manually editing the axis titles.

4.2 Quick and dirty interactivity

  • One important advantage of plotly is that you do not necessarily need to learn its grammar
  • ggplot2 plots can very easily be converted to an interactive plotly plot:
p <- ggplot(iris) +
  geom_point(aes(Sepal.Width, Sepal.Length))
p

ggplotly(p)
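
Inside a Shiny app, the same conversion works once the output and render functions are swapped for plotly's counterparts, plotlyOutput() and renderPlotly(). A minimal sketch, assuming made-up IDs ("xvar", "scatter") and the iris data rather than the workshop app:

```r
library(shiny)
library(ggplot2)
library(plotly)

ui <- fluidPage(
  selectInput("xvar", "x variable", choices = c("Sepal.Width", "Petal.Width")),
  # plotlyOutput() replaces plotOutput()
  plotlyOutput("scatter")
)

server <- function(input, output, session) {
  # renderPlotly() replaces renderPlot()
  output$scatter <- renderPlotly({
    p <- ggplot(iris) +
      geom_point(aes(x = .data[[input$xvar]], y = Sepal.Length)) +
      labs(x = input$xvar)
    ggplotly(p)
  })
}

# Build the app object; launch it only in an interactive session
app <- shinyApp(ui = ui, server = server)
if (interactive()) runApp(app)
```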

4.3 Extending plotly

4.3.1 Customization

  • We can extend Plotly objects using three functions:
    • layout() changes the plot organisation (think ggplot2::theme()), e.g.:
      • colors, sizes, fonts, positions, titles, ratios and alignment of all kinds of plot elements
      • updatemenus adds buttons or drop down menus that can change the plot style or layout (see here for examples)
      • sliders adds sliders that can be useful for time series (see here for examples)
    • config() changes interactivity configurations, e.g.:
      • The modeBarButtons options and displaylogo control the buttons in the mode bar
      • toImageButtonOptions controls the format of plot downloads
      • scrollZoom enables or disables zooming by scrolling
    • style() changes data-level attributes (think ggplot2::scale_*()), e.g.:
      • hoverinfo controls whether tooltips are shown on hover
      • mode controls whether to show points, lines and/or text in a scatter plot
      • hovertext modifies the tooltip texts shown on hover
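
The three functions can be chained on a single plot. The following is a sketch using the built-in mtcars data. The titles and option values are arbitrary choices for illustration, not recommendations.

```r
library(plotly)

# A basic scatter plot to customize
p <- plot_ly(mtcars, x = ~wt, y = ~mpg, type = "scatter", mode = "markers")

p <- p %>%
  # layout(): titles and other plot organisation
  layout(
    title = "Fuel economy by weight",
    xaxis = list(title = "Weight (1000 lbs)"),
    yaxis = list(title = "Miles per gallon")
  ) %>%
  # config(): mode bar and interaction settings
  config(
    displaylogo = FALSE,
    scrollZoom = TRUE,
    toImageButtonOptions = list(format = "svg")
  ) %>%
  # style(): data-level attributes, here the tooltip content
  style(hoverinfo = "text", hovertext = rownames(mtcars))

p
```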

4.3.2 Schema

  • The actual number of options is immense!
  • You can explore all options by calling plotly::schema()
schema()
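
Called interactively, schema() opens a browsable viewer. As a sketch for non-interactive use, and assuming the schema is returned as a nested list with traces and layout entries (which is how current plotly versions appear to structure it), the options can also be inspected programmatically:

```r
library(plotly)

# jsonedit = FALSE skips the interactive viewer and only returns the schema
s <- schema(jsonedit = FALSE)

# Top-level sections of the schema
names(s)

# A few of the available trace types and layout attributes
head(names(s$traces))
head(names(s$layout$layoutAttributes))
```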