|
| 1 | +--- |
| 2 | +title: "Using vimpact for estimating vaccine impact - internal" |
| 3 | +author: "Rob Ashton" |
| 4 | +date: "2021-10-11" |
| 5 | +output: rmarkdown::html_vignette |
| 6 | +vignette: > |
| 7 | + %\VignetteIndexEntry{Using vimpact for estimating vaccine impact - internal} |
| 8 | + %\VignetteEngine{knitr::rmarkdown} |
| 9 | + %\VignetteEncoding{UTF-8} |
| 10 | +--- |
| 11 | + |
| 12 | +<!-- DO NOT EDIT THIS FILE - see vignettes_src and make changes there --> |
| 13 | + |
| 14 | + |
| 15 | + |
| 16 | +This vignette describes how to use vimpact to calculate impact as a member of VIMC. This requires a connection to the montagu database so can only be used internally. Note that this is all in development and the interface is likely to change. |
| 17 | + |
| 18 | +## Impact by calendar year & impact by birth year |
| 19 | + |
| 20 | +### Function interface |
| 21 | + |
| 22 | + |
| 23 | +```r |
| 24 | +impact <- vimpact::calculate_impact( |
| 25 | + con, method = "calendar_year", touchstone = "201710gavi-5", |
| 26 | + modelling_group = "CDA-Razavi", disease = "HepB", |
| 27 | + focal_scenario_type = "default", focal_vaccine_delivery = list( |
| 28 | + list(vaccine = "HepB_BD", activity_type = "routine"), |
| 29 | + list(vaccine = "HepB", activity_type = "routine") |
| 30 | + ), |
| 31 | + baseline_scenario_type = "novac", |
| 32 | + burden_outcomes = c("hepb_deaths_acute", "hepb_deaths_dec_cirrh", |
| 33 | + "hepb_deaths_hcc")) |
| 34 | +str(impact) |
| 35 | +#> tibble [9,292 × 4] (S3: tbl_df/tbl/data.frame) |
| 36 | +#> $ country : chr [1:9292] "AFG" "AFG" "AFG" "AFG" ... |
| 37 | +#> $ burden_outcome: chr [1:9292] "deaths" "deaths" "deaths" "deaths" ... |
| 38 | +#> $ year : int [1:9292] 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ... |
| 39 | +#> $ impact : num [1:9292] 0 0 0.01 0 0 ... |
| 40 | +``` |
| 41 | + |
| 42 | +Or for impact by birth year, use `method = "birth_year"`. This will get `burden_estimate_set` ids for the baseline and focal scenarios for this particular touchstone, modelling group, disease. Then uses those to pull the `burden_estimate` data for specified `burden_outcomes`. Optionally filtering on country via `countries` arg, year via `vaccination_years` and under 5 age groups if `is_under5 = TRUE`. We then use raw impact from `burden_estimate` table and call relevant public facing impact method. For `method = "calendar_year"` `impact_by_calendar_year`. For `method = "birth_year"` `impact_by_birth_year`. |
| 43 | + |
| 44 | + |
| 45 | + |
| 46 | +### Recipe interface |
| 47 | + |
| 48 | +Define a recipe either as a csv or using `recipe_template` |
| 49 | + |
| 50 | + |
| 51 | +```r |
| 52 | +recipe <- data.frame( |
| 53 | + touchstone = "201710gavi-5", |
| 54 | + modelling_group = "CDA-Razavi", disease = "HepB", |
| 55 | + focal = "default:HepB_BD-routine;HepB-routine", |
| 56 | + baseline = "novac", |
| 57 | + burden_outcome = "hepb_deaths_acute,hepb_deaths_dec_cirrh,hepb_deaths_hcc;hepb_cases_acute_severe,hepb_cases_dec_cirrh,hepb_cases_hcc") |
| 58 | +t <- tempfile(fileext = ".csv") |
| 59 | +write.csv(recipe, t, row.names = FALSE) |
| 60 | +``` |
| 61 | + |
| 62 | +This is a set of properties defining what data we want to extract from the db e.g. `touchstone`, `modelling_group`, `disease`, focal and baseline scenarios and `burden_outcome`. It captures the same info as the args to `calculate_impact` above. Then use the recipe to define meta data frame |
| 63 | + |
| 64 | + |
| 65 | +```r |
| 66 | +meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t, con = con) |
| 67 | +``` |
| 68 | + |
| 69 | +And use this to calculate impact |
| 70 | + |
| 71 | + |
| 72 | +```r |
| 73 | +old_impact <- vimpact:::get_raw_impact_details(con, meta, "deaths") |
| 74 | +str(old_impact) |
| 75 | +#> 'data.frame': 9292 obs. of 7 variables: |
| 76 | +#> $ country : int 104 104 104 104 104 104 104 104 104 104 ... |
| 77 | +#> $ time : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ... |
| 78 | +#> $ baseline_value: num 12592 12763 12922 13107 13297 ... |
| 79 | +#> $ focal_value : num 12592 12763 12922 13106 13293 ... |
| 80 | +#> $ value : num 0 0 0 0.7 3.8 ... |
| 81 | +#> $ index : int 1 1 1 1 1 1 1 1 1 1 ... |
| 82 | +#> $ burden_outcome: chr "deaths" "deaths" "deaths" "deaths" ... |
| 83 | +``` |
| 84 | + |
| 85 | +## Impact by year of vaccination |
| 86 | + |
| 87 | +### Function interface |
| 88 | + |
| 89 | +Very similar to examples for calendar year and birth year, to calculate impact by year of vaccination stratified by activity type run |
| 90 | + |
| 91 | + |
| 92 | +```r |
| 93 | +impact <- vimpact::calculate_impact( |
| 94 | + con, method = "yov_activity_type", touchstone = "201710gavi-5", |
| 95 | + modelling_group = "CDA-Razavi", disease = "HepB", |
| 96 | + focal_scenario_type = "default", focal_vaccine_delivery = list( |
| 97 | + list(vaccine = "HepB_BD", activity_type = "routine"), |
| 98 | + list(vaccine = "HepB", activity_type = "routine") |
| 99 | + ), |
| 100 | + baseline_scenario_type = "novac", |
| 101 | + burden_outcomes = "dalys") |
| 102 | +str(impact) |
| 103 | +#> tibble [4,077 × 6] (S3: tbl_df/tbl/data.frame) |
| 104 | +#> $ country : chr [1:4077] "AFG" "AFG" "AFG" "AFG" ... |
| 105 | +#> $ vaccine : chr [1:4077] "HepB" "HepB" "HepB" "HepB" ... |
| 106 | +#> $ activity_type : chr [1:4077] "routine" "routine" "routine" "routine" ... |
| 107 | +#> $ year : int [1:4077] 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 ... |
| 108 | +#> $ burden_outcome: chr [1:4077] "dalys" "dalys" "dalys" "dalys" ... |
| 109 | +#> $ impact : num [1:4077] 116637 119565 118335 124335 126887 ... |
| 110 | +``` |
| 111 | + |
| 112 | +or use `method = "yov_birth_cohort` for impact by year of vaccination stratified by birth cohort. This uses the same logic as other impact methods but before calling the public facing impact function it will also extract FVP data from the data base for this `touchstone` and vaccine delivery. It then calls either `impact_by_year_of_vaccination_activity_type` or `impact_by_year_of_vaccination_birth_cohort` to get impact. |
| 113 | + |
| 114 | +### Recipe interface |
| 115 | + |
| 116 | +We can re-use the recipe from above and get meta table for this method |
| 117 | + |
| 118 | + |
| 119 | +```r |
| 120 | +meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t, |
| 121 | + method = "method2a", con = con) |
| 122 | +``` |
| 123 | + |
| 124 | +Then use this to get raw impact |
| 125 | + |
| 126 | + |
| 127 | +```r |
| 128 | +old_raw_impact <- vimpact:::get_raw_impact_details(con, meta, "dalys") |
| 129 | +``` |
| 130 | + |
| 131 | +Extract the fvp data and map output to required interface |
| 132 | + |
| 133 | + |
| 134 | +```r |
| 135 | +fvps <- vimpact::extract_vaccination_history(con, "201710gavi-5", year_min = 2000, |
| 136 | + year_max = 2030, |
| 137 | + disease_to_extract = "HepB") |
| 138 | +#> User defined touchstone version is used. |
| 139 | +#> Converting input coverage data...... |
| 140 | +#> Extracted interpolated population. |
| 141 | +#> Extracted raw coverage data... |
| 142 | +#> Transformed coverage data. |
| 143 | +fvps$fvps <- fvps$fvps_adjusted |
| 144 | +fvps$country <- fvps$country_nid |
| 145 | +``` |
| 146 | + |
| 147 | +Then can use `meta`, `raw_impact` and `fvps` to calculate impact by year of vaccination |
| 148 | + |
| 149 | + |
| 150 | +```r |
| 151 | +old_impact <- vimpact:::impact_by_year_of_vaccination( |
| 152 | + meta, old_raw_impact, fvps, vaccination_years = 2000:2030) |
| 153 | +str(old_impact) |
| 154 | +#> 'data.frame': 4256 obs. of 27 variables: |
| 155 | +#> $ country : int 4 4 4 4 4 4 4 4 4 4 ... |
| 156 | +#> $ vaccine : chr "HepB_BD" "HepB" "HepB" "HepB_BD" ... |
| 157 | +#> $ activity_type : chr "routine" "routine" "routine" "routine" ... |
| 158 | +#> $ scenario_description: chr "best-estimates" "best-estimates" "best-estimates" "best-estimates" ... |
| 159 | +#> $ coverage_set : int 620 619 619 620 619 620 620 620 620 620 ... |
| 160 | +#> $ delivery_id : int 3666 1138 1910 3665 1909 3652 3663 3664 3653 3661 ... |
| 161 | +#> $ disease : chr "HepB" "HepB" "HepB" "HepB" ... |
| 162 | +#> $ scenario_type : chr "default" "default" "default" "default" ... |
| 163 | +#> $ gavi_support_level : chr "with" "with" "with" "with" ... |
| 164 | +#> $ year : int 2028 2009 2030 2027 2029 2014 2025 2026 2015 2023 ... |
| 165 | +#> $ gavi_support : logi TRUE TRUE TRUE TRUE TRUE FALSE ... |
| 166 | +#> $ gender : chr "Both" "Both" "Both" "Both" ... |
| 167 | +#> $ age : num 0 0 0 0 0 0 0 0 0 0 ... |
| 168 | +#> $ target_source : num 1123235 1065815 1137944 1113858 1132169 ... |
| 169 | +#> $ coverage_source : num 0.644 0.63 0.779 0.614 0.769 ... |
| 170 | +#> $ cohort_size : num 1123235 1065815 1137944 1113858 1132169 ... |
| 171 | +#> $ delivery_population : num 1123235 1065815 1137944 1113858 1132169 ... |
| 172 | +#> $ fvps_source : num 723476 671463 886576 684020 870755 ... |
| 173 | +#> $ fvps_adjusted : num 723476 671463 886576 684020 870755 ... |
| 174 | +#> $ coverage_adjusted : num 0.644 0.63 0.779 0.614 0.769 ... |
| 175 | +#> $ country_nid : int 4 4 4 4 4 4 4 4 4 4 ... |
| 176 | +#> $ fvps : num 723476 671463 886576 684020 870755 ... |
| 177 | +#> $ time : num 2028 2009 2030 2027 2029 ... |
| 178 | +#> $ burden_outcome : chr "dalys" "dalys" "dalys" "dalys" ... |
| 179 | +#> $ impact_ratio : num 0.176 0.176 0.176 0.176 0.176 ... |
| 180 | +#> $ impact : num 127502 118335 156245 120548 153457 ... |
| 181 | +#> $ index : int 1 1 1 1 1 1 1 1 1 1 ... |
| 182 | +``` |
| 183 | + |
| 184 | +## Comparison |
| 185 | + |
| 186 | +`calculate_impact` has a very similar interface as a single row in an impact recipe. The output is slightly different format but output from previous method can be transformed into same format. |
| 187 | + |
| 188 | + |
| 189 | +```r |
| 190 | +country <- dplyr::tbl(con, "country") |
| 191 | +old_impact <- old_impact %>% |
| 192 | + dplyr::left_join(country, by = c("country" = "nid"), copy = TRUE) %>% |
| 193 | + dplyr::select(country = id, vaccine, activity_type, year = time, |
| 194 | + burden_outcome, impact) %>% |
| 195 | + dplyr::filter(!is.na(impact)) %>% |
| 196 | + dplyr::arrange(activity_type, country, year, vaccine) |
| 197 | +str(old_impact) |
| 198 | +#> 'data.frame': 4077 obs. of 6 variables: |
| 199 | +#> $ country : chr "AFG" "AFG" "AFG" "AFG" ... |
| 200 | +#> $ vaccine : chr "HepB" "HepB" "HepB" "HepB" ... |
| 201 | +#> $ activity_type : chr "routine" "routine" "routine" "routine" ... |
| 202 | +#> $ year : num 2007 2008 2009 2010 2011 ... |
| 203 | +#> $ burden_outcome: chr "dalys" "dalys" "dalys" "dalys" ... |
| 204 | +#> $ impact : num 116637 119565 118335 124336 126887 ... |
| 205 | +``` |
| 206 | + |
| 207 | +and we can see from a plot that the output is very similar |
| 208 | + |
| 209 | + |
| 210 | +```r |
| 211 | +impact_plot <- merge(old_impact, impact, all.x = TRUE, all.y = TRUE, |
| 212 | + by = c("country", "vaccine", "activity_type", "year", |
| 213 | + "burden_outcome")) |
| 214 | +countries <- unique(impact_plot$country) |
| 215 | +impact_plot <- impact_plot %>% |
| 216 | + dplyr::rename(recipe_impact = impact.x, function_impact = impact.y) %>% |
| 217 | + tidyr::pivot_wider(names_from = country, |
| 218 | + values_from = c(recipe_impact, function_impact)) |
| 219 | +impact_plot <- as.data.frame(impact_plot) |
| 220 | +countries_filter <- lapply(unique(countries), function(country) { |
| 221 | + button <- list( |
| 222 | + method = "restyle", |
| 223 | + args = list("y", c( |
| 224 | + list(impact_plot[, paste0("recipe_impact_", country)]), |
| 225 | + list(impact_plot[, paste0("function_impact_", country)]))), |
| 226 | + label = country |
| 227 | + ) |
| 228 | +}) |
| 229 | +plot <- plotly::plot_ly(data = impact_plot) %>% |
| 230 | + plotly::add_trace(x = ~year, y = ~recipe_impact_AFG, name = "recipe", |
| 231 | + type = "scatter", mode = "markers", text = ~vaccine, |
| 232 | + marker = list( |
| 233 | + color = "rgb(235, 204, 42)", |
| 234 | + size = 10 |
| 235 | + )) %>% |
| 236 | + plotly::add_trace(x = ~year, y = ~function_impact_AFG, name = "function", |
| 237 | + type = "scatter", mode = "markers", text = ~vaccine, |
| 238 | + marker = list( |
| 239 | + color = "rgb(60, 154, 178)", |
| 240 | + symbol = "circle-open", |
| 241 | + size = 10, |
| 242 | + line = list( |
| 243 | + color = "rgb(60, 154, 178)", |
| 244 | + width = 2 |
| 245 | + ) |
| 246 | + )) %>% |
| 247 | + plotly::layout( |
| 248 | + yaxis = list(title = "impact"), |
| 249 | + updatemenus = list( |
| 250 | + list( |
| 251 | + y = 0.7, |
| 252 | + buttons = countries_filter |
| 253 | + ) |
| 254 | + )) |
| 255 | +``` |
| 256 | + |
| 257 | + |
| 258 | + |
| 259 | + |
| 260 | +```{r} |
| 261 | +htmltools::tags$iframe( |
| 262 | + src = "impact_comparisons.html", |
| 263 | + width = "100%", |
| 264 | + height = "400", |
| 265 | + scrolling = "no", |
| 266 | + seamless = "seamless", |
| 267 | + frameBorder = "0", |
| 268 | + `data-external` = "1" |
| 269 | +) |
| 270 | +``` |
| 271 | + |
| 272 | +There are small differences in impact because of differences in precision. `calculate_impact` tries to do large parts of the aggregation on the database where `burden_estimate` values are stored as Postgres `real` type which has 4 bytes of storage size and 6 decimal digits of precision. Whereas if we pull this before aggregation the values are stored in R as doubles which are much higher precision leading to small differences when aggregated. |
| 273 | + |
| 274 | +`calculate_impact` at the moment won't be able to generate impact for you for more than 1 set of scenario comparisons. Next steps will be to add a wrapper which can take some data like the impact recipe and call `calculate_imapct` for multiple scenarios. |
0 commit comments