Regression discontinuity experiments with Google Trends data

Jonathan-Pearce/RDD_trends

RDD_trends


Example - 'GST tax' search in Canada

Between November 2024 and February 2025, the federal government of Canada proposed, passed and implemented legislation (Bill C-78, the Tax Break for All Canadians Act) that made essentially all food and many holiday essentials free of GST/HST for two months, from December 14, 2024 to February 15, 2025. In this example, we focus on the date the legislation was first proposed (November 21, 2024) and examine how the news affected search volume for tax-related terms in Canada, specifically the search term 'GST tax'.

Code:

# Install gtrendsR from GitHub (only needed once)
devtools::install_github("PMassicotte/gtrendsR")
library(gtrendsR)
library(ggplot2)

# Read helper functions from the current working directory
source(file.path(getwd(), "get_gtrends_data.R"))
source(file.path(getwd(), "get_rdd_graph.R"))

# Get Google Trends data
gtrends_data <- get_gtrends_data("GST tax", search_geo = "CA", search_time = "2024-10-15 2024-12-08")

# Write data for future reference
write.csv(gtrends_data, "gst_tax_example.csv", row.names = FALSE)

# Get regression discontinuity graph
# (the time zone is passed via `tz` rather than inside the date string)
rdd_graph <- get_rdd_graph(gtrends_data, "Tax break proposed", geo_label = "Canada",
                           disc_datetime = as.POSIXct("2024-11-21", tz = "GMT"))
plot(rdd_graph)

# Save plot
ggsave("gst_tax_example.png")

Output: the regression discontinuity plot, saved as gst_tax_example.png

Implementation Notes

  • Google Trends reports search volume on a relative scale (interest over time), which Google defines as follows: "Numbers represent search interest relative to the highest point on the chart for the given region and time. A value of 100 is the peak popularity for the term. A value of 50 means that the term is half as popular. A score of 0 means there was not enough data for this term."
  • Depending on the start and end points of your Google Trends query, the time scale of the returned data can vary (hour, day, week, etc.).
  • Some queries return values of "<1" for certain times. Currently we impute a value of 0.5 for these times.
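
The notes above can be illustrated with a minimal sketch. It is not the code in get_gtrends_data.R; the vectors `raw_hits` and `dates` and the helper `impute_hits` are hypothetical names with made-up values, chosen only to show the "<1" imputation, the relative scale, and one way to check the time step of a returned series.

```r
# Hypothetical raw values, shaped like a character column that contains
# "<1" for periods with very low search volume (made-up data).
raw_hits <- c("12", "<1", "100", "50", "<1", "0")

# Impute 0.5 for "<1" (the convention described in the notes), then
# convert everything to numeric.
impute_hits <- function(hits, below_one = 0.5) {
  hits <- as.character(hits)
  hits[hits == "<1"] <- as.character(below_one)
  as.numeric(hits)
}

hits <- impute_hits(raw_hits)

# Interest over time is relative to the peak of the queried window:
# the peak is 100 and a value of 50 is half as popular as the peak.
peak_index <- which.max(hits)
share_of_peak <- hits / max(hits)

# The time step depends on the query window, so it is worth checking.
# Hypothetical daily timestamps (made-up data).
dates <- as.POSIXct(c("2024-11-19", "2024-11-20", "2024-11-21"), tz = "GMT")
step_hours <- as.numeric(diff(dates), units = "hours")
```

Because values are rescaled to the peak of each query, numbers from two separate queries are not directly comparable unless the queries share the same window and region.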

Upcoming Features:

  • RDD, allow user to specify discontinuity date
  • RDD, add in percent change option
  • RDD, automatically calculate optimal polynomial order
  • RDD, calculate CIs for discontinuity estimate
  • RDD, add fuzzy RDD
  • RDD, add local polynomial regression (bandwidths, kernels, etc.)
  • RDD, regression kink design (Tools of the Trade: The Regression Kink Design)
  • Implement difference in differences (will require >1 time series)
  • Basic time series methods (e.g. seasonality)
    • example search term: 'flu shot'
