Create a Shiny app to search Twitter with rtweet and R

See how to create an interactive Shiny web application to search, sort, and track tweets with a specific hashtag


Do you want to track a conference hashtag — or any hashtag — on Twitter? This bonus Do More With R tutorial shows you step by step how to create an interactive Shiny web application to search, sort, and filter tweets with a specific hashtag (or any keyword).

If you don’t want to go through the whole Twitter R tutorial, you can download the Shiny and R app code plus this article in R Markdown and PDF format as a free Insider extra. 

Original R code to search tweets

Do More With R episode #41 demonstrated how to search for tweets with the rtweet package and create a sortable, filterable table with reactable. This first code block from that article searches for tweets and formats results to display in a table.

# Configure variables: number of tweets to download and hashtag search query
num_tweets_to_download <- 200
hashtag_to_search <- "#rstudioconf"
# Make sure to install any packages listed below that you don't already have on your system:
library("rtweet")
library("reactable")
library("glue")
library("stringr")
library("httpuv")
library("dplyr")
library("purrr")
# Code to actually search for tweets
tweet_df <- search_tweets(hashtag_to_search, n = num_tweets_to_download, include_rts = FALSE)
# select a few desired columns and add a clickable link to tweet text for table data
tweet_table_data <- tweet_df %>%
  select(user_id, status_id, created_at, screen_name, text, favorite_count, retweet_count, urls_expanded_url) %>%
  mutate(
    Tweet = glue::glue("{text} <a href='https://twitter.com/{screen_name}/status/{status_id}'>>> </a>")
  )%>%
  select(DateTime = created_at, User = screen_name, Tweet, Likes = favorite_count, RTs = retweet_count, URLs = urls_expanded_url)

The next code block is a function to make URLs in the tweet URL column clickable. As I mentioned in the main episode, this is a bit more complicated than creating a link to a tweet because there can be more than one URL per tweet. There’s probably a more elegant way to generate clickable URLs, but the function below works.

make_url_html <- function(url) {
  if(length(url) < 2) {
    if(!is.na(url)) {
      as.character(glue("<a title = '{url}' target = '_new' href = '{url}'>{url}</a>") )
    } else {
      ""
    }
  } else {
    paste0(purrr::map_chr(url, ~ paste0("<a title = '", .x, "' target = '_new' href = '", .x, "'>", .x, "</a>", collapse = ", ")), collapse = ", ")
  }
}
tweet_table_data$URLs <- purrr::map_chr(tweet_table_data$URLs, make_url_html)
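As one possible simplification of the function above, here is a minimal sketch that relies on paste0() being vectorized. The name make_url_html_compact and the example.com URLs are made up for illustration, and base R’s vapply() stands in for purrr::map_chr() so the snippet runs without extra packages.

```r
# Hypothetical alternative to make_url_html(): handles NA, one URL, or several URLs
make_url_html_compact <- function(url) {
  url <- url[!is.na(url)]               # drop missing values
  if (length(url) == 0) return("")      # no URLs in this tweet
  # paste0() builds one link per URL; collapse joins them into a single string
  paste0("<a title = '", url, "' target = '_new' href = '", url, "'>", url, "</a>",
         collapse = ", ")
}

# Toy list column shaped like urls_expanded_url: no URL, one URL, two URLs
urls <- list(
  NA_character_,
  "https://example.com/a",
  c("https://example.com/a", "https://example.com/b")
)

url_html <- vapply(urls, make_url_html_compact, character(1))
url_html
```

The first element comes back as an empty string, the second as a single link, and the third as two links separated by a comma.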

The third and final code block creates an interactive reactable table. Most of the code in this block tweaks the default styling and table behavior.

reactable::reactable(tweet_table_data, 
          filterable = TRUE, searchable = TRUE, bordered = TRUE, striped = TRUE, highlight = TRUE,
          showSortable = TRUE, defaultSortOrder = "desc", defaultPageSize = 25, showPageSizeOptions = TRUE, pageSizeOptions = c(25, 50, 75, 100, 200),
          columns = list(
            DateTime = colDef(defaultSortOrder = "asc"),
            User = colDef(defaultSortOrder = "asc"),
            Tweet = colDef(html = TRUE, minWidth = 190, resizable = TRUE),
            Likes = colDef(filterable = FALSE, format = colFormat(separators = TRUE)),
            RTs = colDef(filterable =  FALSE, format = colFormat(separators = TRUE)),
            URLs = colDef(html = TRUE)
          )
)

Create a basic Shiny app to search tweets

Now let’s turn this data table into a local interactive Shiny app with even more filtering possibilities.


Figure 1

First, go to RStudio and create a new Shiny app in a new subdirectory within your project by going to File > New File > New Shiny Web Application. I’ll call my app HashtagSearch but you can call yours anything.

The default app.R file created by RStudio looks like this:

library(shiny)
# Define UI for application that draws a histogram
ui <- fluidPage(

    # Application title
    titlePanel("Old Faithful Geyser Data"),

    # Sidebar with a slider input for number of bins
    sidebarLayout(
        sidebarPanel(
            sliderInput("bins",
                        "Number of bins:",
                        min = 1,
                        max = 50,
                        value = 30)
        ),

        # Show a plot of the generated distribution
        mainPanel(
           plotOutput("distPlot")
        )
    )
)

# Define server logic required to draw a histogram
server <- function(input, output) {

    output$distPlot <- renderPlot({
        # generate bins based on input$bins from ui.R
        x    <- faithful[, 2]
        bins <- seq(min(x), max(x), length.out = input$bins + 1)

        # draw the histogram with the specified number of bins
        hist(x, breaks = bins, col = 'darkgray', border = 'white')
    })
}

# Run the application
shinyApp(ui = ui, server = server)

Create the UI for your Shiny app

The UI portion of a Shiny app is the user interface, or front-end HTML page that the user (in this case, you) interacts with. The server portion is the back-end R code that does all of the computing to make a user’s request actually work.

I typically first delete most of the comments in the default app.R file, change the title, and load all of the packages I need. The top of my app.R now looks like this:

library(shiny)
library(rtweet)
library(dplyr)
library(glue)
library(reactable)
library(purrr)

Next, I’ll add the code that searches for tweets, imports them to R, and wrangles a version to display in a table — but I’m going to Shiny-fy it a bit.

I know I’m going to want my URL HTML-formatting function. I’ll put that at the top of the app, before all the interactive code. (Alternatively, I could put it in a separate helper file and source that file.) The top of my app now looks like this:

library(shiny)
library(rtweet)
library(dplyr)
library(glue)
library(reactable)
library(purrr)

make_url_html <- function(url) {
    if(length(url) < 2) {
        if(!is.na(url)) {
            as.character(glue("<a title = '{url}' target = '_new' href = '{url}'>{url}</a>") )
        } else {
            ""
        }
    } else {
        paste0(purrr::map_chr(url, ~ paste0("<a title = '", .x, "' target = '_new' href = '", .x, "'>", .x, "</a>", collapse = ", ")), collapse = ", ")
    }
}

ui <- fluidPage(

Next, I’ll add code to create the initial Twitter data frame from my search query. That goes in the app.R’s server section. Since that data frame will change and update depending on what the user requests, I need to make it a reactive value.

So far, there are two items that the user can change: the hashtag search query and the number of tweets to return. Instead of hard coding the values at the top of the file, I want to add those as inputs users can change.

That means adding two user input fields in the app.R UI section: hashtag_to_search and num_tweets_to_download. I’ll create a text box for entering a hashtag and numerical input box or slider for entering the number of tweets to request.

For number of tweets, I’ll start off by keeping the slider that came in the default app.R. However, I’ll change its ID to num_tweets_to_download, the label that the user sees to “Number of tweets to download”, the min value to 100, the max value to 18000, the default value to 200, and the step (how much is added or subtracted with each slight move of the slider) to 100.

I’m also going to delete the plotting code in the server section, that is, everything inside output$distPlot <- renderPlot({ ### DELETE EVERYTHING IN HERE ### }), since the default app won’t work anymore now that I’m changing inputs.

The UI and server sections of the app now look like this:

ui <- fluidPage(

    # Application title
    titlePanel("Search tweets"),

    # Sidebar
    sidebarLayout(
        sidebarPanel(
            sliderInput("num_tweets_to_download",
                        "Number of tweets to download:",
                        min = 100,
                        max = 18000,
                        value = 200,
                        step = 100)
        ),

        # Show results
        mainPanel(
           plotOutput("distPlot")
        )
    )
)

# Define server logic
server <- function(input, output) {
    output$distPlot <- renderPlot({

    })
}

Figure 2

If I run the app by clicking the Run App button at the top right of my RStudio pane (Figure 2), I can see my slider (although it doesn’t do anything yet).

If you find it a little difficult to get the exact value you want when moving the slider, you can change that slider to a numericInput() box:

numericInput("num_tweets_to_download",
                        "Number of tweets to download:",
                        min = 100,
                        max = 18000,
                        value = 200,
                        step = 100)

That will give you a box you can type into instead of a slider (Figure 3).


Figure 3

Next I’m going to add a text input box for the hashtag query with Shiny’s textInput() function. I’ll set the box’s inputId to be hashtag_to_search, the label as “Hashtag to search”, and the default value as “#rstudioconf”. My sidebar panel code now looks like this:

# Sidebar
sidebarLayout(
    sidebarPanel(
        numericInput("num_tweets_to_download",
            "Number of tweets to download:",
            min = 100,
            max = 18000,
            value = 200,
            step = 100), # <- Don’t forget comma here
        textInput("hashtag_to_search",
            "Hashtag to search:",
            value = "#rstudioconf")
    ),

Note that since I’m adding a second input argument to the sidebar panel, I need a comma after the first one as shown by the commented arrow above. Getting parentheses and commas correct can be a challenge when coding Shiny apps. Creating inputs and outputs one by one and running the code after each addition can help make it easier to spot problems.


Figure 4

Click to run the newest code and you should see something like Figure 4. 

The basic sidebar is finished, but I need one more item in my UI: an output placeholder that tells Shiny where I want my tweet table to display. I’ll put that placeholder in the main panel. The default app’s main panel has plotOutput("distPlot"), which tells Shiny to save space for a plot in the main part of the page. I want a reactable table there, so I’ll change plotOutput() to reactableOutput(). More specifically, I’ll set the table’s outputId to “tweet_table” and swap in reactableOutput("tweet_table"). If I click to run my app now, nothing has changed. But we’re almost there.
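For reference, a minimal sketch of the main panel after that swap might look like the following. It assumes the shiny and reactable packages are installed; the object name main_panel is only there so the snippet can run on its own.

```r
library(shiny)
library(reactable)

# The main panel reserves space for the table; "tweet_table" is its outputId
main_panel <- mainPanel(
  reactableOutput("tweet_table")
)
```

In the app itself, the mainPanel() call sits inside sidebarLayout() rather than being stored in its own variable.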

Code the Shiny server logic

It’s finally time to actually get and display the tweets. Just as with the original code, I’ll use rtweet’s search_tweets() function to search for tweets and store results in tweet_df (which will contain 90 columns). I’ll then create a tweet_table_data data frame from my results, formatted to display in a table, and code the table. All this server logic must go inside the server <- function(input, output) {  } portion of the app.R file.

I can re-use my original code with a few modifications. The most important points:

  1. Any variable that changes based on user input needs to be a reactive value, not a conventional R object. I have to create and otherwise handle them in slightly different ways than a conventional variable, but happily, the tweaks are minor.
  2. Any time I need the value of a user input, I have to refer to it as input$my_variable_name and not just my_variable_name.

Let’s see how that works.

Here’s my non-Shiny code to search for tweets and create a version of the results for a table:

tweet_df <- search_tweets(hashtag_to_search, n = num_tweets_to_download, include_rts = FALSE)

# select a few desired columns, add clickable links for tweets and URLs, rename some columns
tweet_table_data <- tweet_df %>%
  select(user_id, status_id, created_at, screen_name, text, favorite_count, retweet_count, urls_expanded_url) %>%
  mutate(
    Tweet = glue::glue("{text} <a href='https://twitter.com/{screen_name}/status/{status_id}'>>> </a>"),
    URLs = purrr::map_chr(urls_expanded_url, make_url_html)
  )%>%
  select(DateTime = created_at, User = screen_name, Tweet, Likes = favorite_count, RTs = retweet_count, URLs)

To turn a “regular” value into a reactive one, use Shiny’s reactive({}) function. And remember: To refer to the variables hashtag_to_search and num_tweets_to_download that are set by a user’s interaction with the app, use input$var_name and not just var_name. That turns this non-Shiny code:

tweet_df <- search_tweets(hashtag_to_search, n = num_tweets_to_download, include_rts = FALSE)

Into this Shiny code:

tweet_df <- reactive({
  search_tweets(input$hashtag_to_search, n = input$num_tweets_to_download, include_rts = FALSE)
})

and the tweet_table_data code into this:

tweet_table_data <- reactive({
  req(tweet_df())
  tweet_df() %>%
  select(user_id, status_id, created_at, screen_name, text, favorite_count, retweet_count, urls_expanded_url) %>%
  mutate(
    Tweet = glue::glue("{text} <a href='https://twitter.com/{screen_name}/status/{status_id}'>>> </a>"),
    URLs = purrr::map_chr(urls_expanded_url, make_url_html)
  )%>%
  select(DateTime = created_at, User = screen_name, Tweet, Likes = favorite_count, RTs = retweet_count, URLs)
})

Only the first three lines, which create the tweet_table_data data frame, have changed. The first line, tweet_table_data <- reactive({ , creates an object that can change based on user input. The second line, req(tweet_df()), tells Shiny not to start any calculations for tweet_table_data until a tweet_df() data frame exists. Without that line, my app might throw an error if it tries to run calculations on a data frame that doesn’t exist. Note, too, that it’s tweet_df() and not tweet_df since that data frame is reactive.

The rest of the code is the same as before, except for the closing }) at the end.
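To see the reactive chain update without calling the Twitter API, here is a small sketch using Shiny’s testServer() function (available in Shiny 1.5.0 and later). The fake_search() function is a made-up stand-in for search_tweets(), and the table step is reduced to head() to keep the example short.

```r
library(shiny)

# Hypothetical stand-in for rtweet::search_tweets() so the sketch runs offline
fake_search <- function(q, n) {
  data.frame(text = paste(q, "tweet", seq_len(n)), stringsAsFactors = FALSE)
}

server <- function(input, output, session) {
  # Re-runs whenever either input changes
  tweet_df <- reactive({
    fake_search(input$hashtag_to_search, n = input$num_tweets_to_download)
  })

  # Depends on tweet_df(), so it updates whenever tweet_df() does
  tweet_table_data <- reactive({
    req(tweet_df())
    head(tweet_df(), 3)
  })
}

testServer(server, {
  session$setInputs(hashtag_to_search = "#rstats", num_tweets_to_download = 5)
  print(nrow(tweet_df()))          # rows returned by the fake search
  print(nrow(tweet_table_data()))  # rows kept for the table

  # Changing an input invalidates both reactives
  session$setInputs(num_tweets_to_download = 2)
  print(nrow(tweet_table_data()))
})
```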

Finally, the table code needs three minor changes:

  1. Since the table is a visualization, Shiny needs to know where to put it and what it is. I’ll store the table in a variable called output$tweet_table. That connects it with the reactableOutput("tweet_table") placeholder in the user interface. I’ll also use the renderReactable() function so Shiny knows what kind of visualization it’s creating. That makes the code output$tweet_table <- renderReactable({ ### R code to create the reactable table here ### }). Note: Most of the time when you want to use a visualization in Shiny, you add something like myfunctionOutput("my_dataviz_id") in the UI and output$my_dataviz_id <- renderMyfunction({ ### R code to create viz here ### }) in the server. 
  2. The code should require the presence of the tweet_table_data data frame, or the app might throw an error trying to generate a table from data that doesn’t exist.
  3. Code also needs to refer to tweet_table_data as tweet_table_data() since it’s reactive.

The Shiny-fied table code:

output$tweet_table <- renderReactable({
  req(tweet_table_data())  # wait until the table data exists
  reactable::reactable(tweet_table_data(),
          filterable = TRUE, searchable = TRUE, bordered = TRUE, striped = TRUE, highlight = TRUE,
          showSortable = TRUE, defaultSortOrder = "desc", defaultPageSize = 25, showPageSizeOptions = TRUE, pageSizeOptions = c(25, 50, 75, 100, 200),
          columns = list(
            DateTime = colDef(defaultSortOrder = "asc"),
            User = colDef(defaultSortOrder = "asc"),
            Tweet = colDef(html = TRUE, minWidth = 190, resizable = TRUE),
            Likes = colDef(filterable = FALSE, format = colFormat(separators = TRUE)),
            RTs = colDef(filterable =  FALSE, format = colFormat(separators = TRUE)),
            URLs = colDef(html = TRUE)
          )
)
})
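The same Output/render pairing applies to other output types. As a minimal sketch, here it is with Shiny’s built-in tableOutput() and renderTable(), using the mtcars data set that ships with R; the ID my_table is arbitrary.

```r
library(shiny)

# UI: reserve a spot for a table with the ID "my_table"
ui <- fluidPage(
  tableOutput("my_table")
)

# Server: fill that spot using the matching render function
server <- function(input, output, session) {
  output$my_table <- renderTable({
    head(mtcars, 3)
  })
}

# Build the app object; runApp(app) would launch it in an interactive session
app <- shinyApp(ui = ui, server = server)
```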

Here is the code for the entire app.R Shiny app:

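library(shiny)
library(rtweet)
library(dplyr)
library(glue)
library(reactable)
library(purrr)

# Helper that turns one or more URLs into clickable HTML links
make_url_html <- function(url) {
  if(length(url) < 2) {
    if(!is.na(url)) {
      as.character(glue("<a title = '{url}' target = '_new' href = '{url}'>{url}</a>") )
    } else {
      ""
    }
  } else {
    paste0(purrr::map_chr(url, ~ paste0("<a title = '", .x, "' target = '_new' href = '", .x, "'>", .x, "</a>", collapse = ", ")), collapse = ", ")
  }
}
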
ui <- fluidPage(
 
  # Application title
  titlePanel("Search tweets"),
 
  # Sidebar
  sidebarLayout(
    sidebarPanel(
      numericInput("num_tweets_to_download",
                   "Number of tweets to download:",
                   min = 100,
                   max = 18000,
                   value = 200,
                   step = 100),
      textInput("hashtag_to_search",
                "Hashtag to search:",
                value = "#rstudioconf")
    ),
   
    # Show results
    mainPanel(
      reactableOutput("tweet_table")
    )
  )
)

# Define server logic
server <- function(input, output) {
 
  tweet_df <- reactive({
    search_tweets(input$hashtag_to_search, n = input$num_tweets_to_download, include_rts = FALSE)
  })
 
  tweet_table_data <- reactive({
    req(tweet_df())
    tweet_df() %>%
      select(user_id, status_id, created_at, screen_name, text, favorite_count, retweet_count, urls_expanded_url) %>%
      mutate(
        Tweet = glue::glue("{text} <a href='https://twitter.com/{screen_name}/status/{status_id}'>>> </a>"),
        URLs = purrr::map_chr(urls_expanded_url, make_url_html)
      )%>%
      select(DateTime = created_at, User = screen_name, Tweet, Likes = favorite_count, RTs = retweet_count, URLs)
  })
 
  output$tweet_table <- renderReactable({
    req(tweet_table_data())  # wait until the table data exists
    reactable::reactable(tweet_table_data(),
                         filterable = TRUE, searchable = TRUE, bordered = TRUE, striped = TRUE, highlight = TRUE,
                         showSortable = TRUE, defaultSortOrder = "desc", defaultPageSize = 25, showPageSizeOptions = TRUE, pageSizeOptions = c(25, 50, 75, 100, 200),
                         columns = list(
                           DateTime = colDef(defaultSortOrder = "asc"),
                           User = colDef(defaultSortOrder = "asc"),
                           Tweet = colDef(html = TRUE, minWidth = 190, resizable = TRUE),
                           Likes = colDef(filterable = FALSE, format = colFormat(separators = TRUE)),
                           RTs = colDef(filterable =  FALSE, format = colFormat(separators = TRUE)),
                           URLs = colDef(html = TRUE)
                         )
    )
  })
 
}

# Run the application
shinyApp(ui = ui, server = server)

Run the app and voila! It should look something like this:


Figure 5

A few practical refinements

There are some refinements I want to make to this app to make it more practical. First is how the app reacts when a user changes the default hashtag or number of tweets to request. By default, Shiny will try to update the data as you’re typing. That’s often cool and useful behavior. But in this case, I don’t want the app to make a call to the Twitter API every time I type a letter or number! Otherwise, it may try to return a search for a partial request after every keystroke — and I risk bumping into my 18,000-tweets-in-15-minutes limit.

The other feature I’d like to add is saving some of the data my app returns.

For the first issue, I don’t want my app making a request to the Twitter API until I click a “Get data” button. RStudio recommends an “Action button” for this functionality. As with most things in Shiny, that involves two parts: the UI and the server.

For the UI, I can add an action button with the actionButton() function. The first argument is the button’s inputId; second is the button label a user sees. There are optional arguments, such as built-in classes to change the button’s appearance. I like the btn-primary CSS class, so I’ll use this for my button: actionButton("get_data", "Get data", class = "btn-primary").

To make that button do something — or, more accurately, stop the app from doing something until all the input information is ready — I need to change the tweet_df() object from a reactive value to a reactive event. That means it doesn’t simply change its value based on a value a user enters but by an action a user takes (such as a button click). The syntax for that in the server is as follows: 

my_reactive_object <- eventReactive(input$my_button_id, {
  ### R code here ###
  })

Note the comma after input$my_button_id! I’ve spent a fair amount of time trying to track down Shiny errors because I forgot a comma after the first argument of a reactive function.

Next add the following line to the UI sidebarPanel: 

actionButton("get_data", "Get data", class = "btn-primary")

The whole panel should now look like this:

sidebarPanel(
            numericInput("num_tweets_to_download",
                         "Number of tweets to download:",
                         min = 100,
                         max = 18000,
                         value = 200,
                         step = 100),
            textInput("hashtag_to_search",
                      "Hashtag to search:",
                      value = "#rstudioconf"),
            actionButton("get_data", "Get data", class = "btn-primary")
        ),

And change the tweet_df definition in your server code from this:

    tweet_df <- reactive({
        search_tweets(input$hashtag_to_search, n = input$num_tweets_to_download, include_rts = FALSE)
    })

To this: 

    tweet_df <- eventReactive(input$get_data, {
        search_tweets(input$hashtag_to_search, n = input$num_tweets_to_download, include_rts = FALSE)
    })

If you run the app now, nothing should happen until you click the “Get data” button.
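One way to check this behavior without touching the Twitter API is Shiny’s testServer() function (Shiny 1.5.0 and later). In this sketch, the reactive event simply returns the search text instead of calling search_tweets(), and the name query is made up for the example.

```r
library(shiny)

server <- function(input, output, session) {
  # Re-runs only when input$get_data changes, not when the text changes
  query <- eventReactive(input$get_data, {
    input$hashtag_to_search
  })
}

testServer(server, {
  # Enter a hashtag and simulate a first button click
  session$setInputs(hashtag_to_search = "#rstudioconf", get_data = 1)
  stopifnot(query() == "#rstudioconf")

  # Typing a new hashtag alone does not update the reactive event
  session$setInputs(hashtag_to_search = "#rstats")
  stopifnot(query() == "#rstudioconf")

  # Simulating another button click picks up the new value
  session$setInputs(get_data = 2)
  stopifnot(query() == "#rstats")
})
```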


Figure 6

To save data to your local machine, you can add a download data button. Once again, we need code in the UI and in the server.

For a download button to appear in the app, there’s a special downloadButton() function with the syntax downloadButton("id", "Label"). My button code, including a couple of br() line breaks before the button to separate it from the item above, is:

br(),br(),
downloadButton("download_data", "Download data")

And the full sidebar panel code is now:

        sidebarPanel(
            numericInput("num_tweets_to_download",
                         "Number of tweets to download:",
                         min = 100,
                         max = 18000,
                         value = 200,
                         step = 100),
            textInput("hashtag_to_search",
                      "Hashtag to search:",
                      value = "#rstudioconf"),
            actionButton("get_data", "Get data", class = "btn-primary"),
            br(),br(),
            downloadButton("download_data", "Download data")
        ),

If you run the app.R code now and click the new button, it will download the HTML page, which of course isn’t the behavior you want. For server logic to make the button actually download data, you need to use Shiny’s downloadHandler() function. This function needs two pieces of information: the name you want for the downloaded file and the data you want to download. It uses the format below. Note the need to have functions for both the file name and the content that’s downloaded:

output$download_data <- downloadHandler(
    filename = function() {
      paste(input$hashtag_to_search, "_", Sys.Date(), ".csv", sep = "")
    },
    content = function(file) {
      write.csv(tweet_table_data(), file, row.names = FALSE)
    }
  )

Add that someplace in the app.R server code and give it a try.
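Because the file name is built from the search text, characters such as # end up in it. If you would rather keep file names plain, one option is a small helper like the sketch below; make_csv_name() is a hypothetical function, not part of Shiny or rtweet.

```r
# Hypothetical helper: build a plain CSV file name from the search text
make_csv_name <- function(query, date = Sys.Date()) {
  # Keep only letters, digits, underscores, and hyphens
  clean <- gsub("[^A-Za-z0-9_-]", "", query)
  paste0(clean, "_", format(date), ".csv")
}

make_csv_name("#rstudioconf", as.Date("2020-01-30"))
# "rstudioconf_2020-01-30.csv"
```

Inside downloadHandler(), the filename function could then return make_csv_name(input$hashtag_to_search).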

Note that any filtering you do within the table won’t be reflected in the data you download. Also, you may want to download the full data set, in this case tweet_df() with its 90 columns of data instead of tweet_table_data(), since you never know what sorts of analysis you might want to do in the future with conference tweets. In that case, change the line write.csv(tweet_table_data(), file, row.names = FALSE) to write.csv(tweet_df(), file, row.names = FALSE). Be aware that tweet_df() includes list columns such as urls_expanded_url, which write.csv() cannot write as-is, so those columns need to be flattened to character strings (or dropped) first.
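Data frames with list columns, such as urls_expanded_url, need some preparation before write.csv() will accept them. The following is a base R sketch of one way to do that on a toy data frame; flatten_list_cols() is a made-up helper name, and joining multiple values with a space is just one possible choice.

```r
# Toy stand-in for tweet_df(): one ordinary column and one list column
toy_df <- data.frame(text = c("first tweet", "second tweet"), stringsAsFactors = FALSE)
toy_df$urls_expanded_url <- list(
  NA_character_,
  c("https://example.com/a", "https://example.com/b")
)

# Hypothetical helper: collapse every list column into one string per row
flatten_list_cols <- function(df, sep = " ") {
  is_list <- vapply(df, is.list, logical(1))
  df[is_list] <- lapply(df[is_list], function(col) {
    vapply(col, function(x) paste(x[!is.na(x)], collapse = sep), character(1))
  })
  df
}

flat_df <- flatten_list_cols(toy_df)

# With the list column flattened, write.csv() works as usual
csv_file <- tempfile(fileext = ".csv")
write.csv(flat_df, csv_file, row.names = FALSE)
read.csv(csv_file, stringsAsFactors = FALSE)
```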

Additional filtering

There’s a lot more filtering you can do on the Twitter data once it’s loaded. The Shiny input widget gallery shows many of the user input options you can add to a Shiny app.


Figure 7

For example, you could use the date range filter to only look at tweets during conference dates, screening out pre-conference and post-conference chatter.

To do that, the UI code uses a format such as dateRangeInput("my_date_picker_id", label = "Select dates:", start = "2020-01-27", end = "2020-01-30") where start and end are the default values.

Try adding that code somewhere in your UI sidebar panel code (picking the dates you want and always remembering proper commas), along with br() line breaks as needed. Note that your default dates can also be something like Sys.Date() - 7 to start “seven days ago.”
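As a minimal sketch of relative default dates, the picker below starts seven days ago and ends today. It reuses the date_picker ID from this tutorial; storing the widget in a variable is only so the snippet can run by itself.

```r
library(shiny)

# Default range: seven days ago through today, computed when the UI code runs
date_picker <- dateRangeInput(
  "date_picker",
  label = "Select dates:",
  start = Sys.Date() - 7,
  end = Sys.Date()
)
```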

My side panel code now looks like this:

sidebarPanel(
            numericInput("num_tweets_to_download",
                         "Number of tweets to download:",
                         min = 100,
                         max = 18000,
                         value = 200,
                         step = 100),
            textInput("hashtag_to_search",
                      "Hashtag to search:",
                      value = "#rstudioconf"),
            dateRangeInput("date_picker", label = "Select dates:", start = "2020-01-27", end = "2020-01-30"),
            actionButton("get_data", "Get data", class = "btn-primary"),
            br(),br(),
            downloadButton("download_data", "Download data")
        ),

Which produces the sidebar panel shown in Figure 8. 


Figure 8

To make that date-range picker actually filter, go into the tweet_table_data() definition and add code to filter rows based on when each tweet was created. Because the filter runs before created_at is renamed to DateTime, it uses the created_at column. In this case, since the date selector has an inputId of “date_picker”, the start value is input$date_picker[1] and the end value is input$date_picker[2].

I’ll add a line in the tweet_table_data() definition to filter for tweets that were created between the user-requested start and end dates:

filter(between(as.Date(created_at), input$date_picker[1], input$date_picker[2]))

The code that creates the reactive tweet_table_data() data frame now looks like this:

tweet_table_data <- reactive({
        req(tweet_df())
        tweet_df() %>%
            select(user_id, status_id, created_at, screen_name, text, favorite_count, retweet_count, urls_expanded_url) %>%
            filter(between(as.Date(created_at), input$date_picker[1], input$date_picker[2]) ) %>%
            mutate(
                Tweet = glue::glue("{text} <a href='https://twitter.com/{screen_name}/status/{status_id}'>>> </a>"),
                URLs = purrr::map_chr(urls_expanded_url, make_url_html)
            )%>%
            select(DateTime = created_at, User = screen_name, Tweet, Likes = favorite_count, RTs = retweet_count, URLs)
    })
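Here is a small offline sketch of what that filter line does, using a toy data frame in place of tweet_df() and a plain Date vector in place of input$date_picker. One detail worth knowing: as.Date() converts date-times using UTC by default, so a tweet sent close to midnight in your local time zone may land on a neighboring date.

```r
library(dplyr)

# Toy stand-in for tweet_df(); created_at is a date-time in UTC
toy_tweets <- data.frame(
  created_at = as.POSIXct(
    c("2020-01-26 23:30:00", "2020-01-28 15:00:00", "2020-01-31 01:00:00"),
    tz = "UTC"
  ),
  text = c("pre-conference", "during conference", "post-conference"),
  stringsAsFactors = FALSE
)

# Stand-in for input$date_picker, which holds a start date and an end date
date_picker <- as.Date(c("2020-01-27", "2020-01-30"))

# Keep only tweets whose date falls within the selected range (inclusive)
filtered_tweets <- toy_tweets %>%
  filter(between(as.Date(created_at), date_picker[1], date_picker[2]))

filtered_tweets
```

Only the “during conference” row remains after filtering.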

More enhancements

There is so much more you could do to add capabilities to this app. You might add sliders to set a minimum number of retweets or likes, and then do additional table filtering on that subset of tweets. Or, you could add a check box to filter for tweets that contain URLs — potentially helpful if you’re looking for conference tweets that might include links to presentations and other resources. You could keep other columns from the initial data set to filter for variables such as language and geography.
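As a rough sketch of the first two ideas, the filter below keeps tweets with a minimum number of likes and, optionally, only those with at least one URL. The variables min_likes and only_urls stand in for hypothetical inputs (for example, input$min_likes from a slider and input$only_urls from a check box), and the toy data frame mimics the columns of tweet_table_data(), where tweets without links have an empty URLs string.

```r
library(dplyr)

# Toy stand-in for tweet_table_data()
toy_table <- data.frame(
  User  = c("user_a", "user_b", "user_c"),
  Likes = c(0, 12, 40),
  RTs   = c(0, 3, 10),
  URLs  = c("", "<a href = 'https://example.com/slides'>slides</a>", ""),
  stringsAsFactors = FALSE
)

min_likes <- 10    # stand-in for a hypothetical input$min_likes slider
only_urls <- TRUE  # stand-in for a hypothetical input$only_urls check box

filtered_table <- toy_table %>%
  filter(
    Likes >= min_likes,          # enough likes
    !only_urls | URLs != ""      # if the box is checked, require a URL
  )

filtered_table
```

In the app, these conditions would go inside the tweet_table_data() reactive, with the stand-in variables replaced by the matching input$ values.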

To learn more about Shiny, RStudio has some written and video tutorials at shiny.rstudio.com. Other useful resources: Zev Ross created an easy-to-follow Shiny tutorial with examples and a cheat sheet, and Hadley Wickham has a Mastering Shiny book in progress that’s available free online.

For more on rtweet, go to rtweet.info.

And for more tips on using R, head to the Do More With R page at InfoWorld!

This story, "Create a Shiny app to search Twitter with rtweet and R" was originally published by InfoWorld.
