Skip to content

Commit d56c8be

Browse files
committed
Add vignette to compare impact functions
1 parent 8f12dac commit d56c8be

File tree

6 files changed

+332
-4
lines changed

6 files changed

+332
-4
lines changed

.gitignore

+2
Original file line numberDiff line numberDiff line change
@@ -3,3 +3,5 @@ docs
33
.Renviron
44
.Rhistory
55
.DS_Store
6+
vignettes_src/internal-impact.md
7+
vignettes_src/impact_comparisons.html

Makefile

+8
Original file line numberDiff line numberDiff line change
@@ -36,4 +36,12 @@ pkgdown:
3636
website: pkgdown
3737
./scripts/update_web.sh
3838

39+
vignettes/internal-impact.Rmd: vignettes_src/internal-impact.Rmd
40+
./scripts/build_impact_vignette
41+
42+
vignettes: vignettes/using-vimpact.Rmd vignettes/vignette.Rmd
43+
${RSCRIPT} -e 'tools::buildVignettes(dir = ".")'
44+
mkdir -p inst/doc
45+
cp vignettes/*.html vignettes/*.Rmd inst/doc
46+
3947
.PHONY: all test document install vignettes

scripts/build_impact_vignette

+24
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
#!/usr/bin/env bash
2+
# We can't run the internal-impact vignette on a computer that doesn't
3+
# have access to montagu DB (including CI) so we pre-compile it
4+
# ocassinally here. The version in vignettes_src should be edited,
5+
# then this script run to create the version in vignettes
6+
(cd vignettes_src && Rscript -e 'knitr::knit("internal-impact.Rmd")')
7+
header="DO NOT EDIT THIS FILE - see vignettes_src and make changes there"
8+
widget=' \
9+
```{r} \
10+
htmltools::tags$iframe( \
11+
src = "impact_comparisons.html", \
12+
width = "100%", \
13+
height = "400", \
14+
scrolling = "no", \
15+
seamless = "seamless", \
16+
frameBorder = "0", \
17+
`data-external` = "1" \
18+
)\
19+
```'
20+
sed -s 's/[[:space:]]*$//' vignettes_src/internal-impact.md |
21+
sed 's/\r//g' |
22+
sed "s/HEADER/$header/" |
23+
sed "s/<!-- WIDGET -->/$widget/" > vignettes/internal-impact.Rmd
24+
cp vignettes_src/impact_comparisons.html vignettes/impact_comparisons.html

vignettes/internal-impact.Rmd

+274
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,274 @@
1+
---
2+
title: "Using vimpact for estimating vaccine impact - internal"
3+
author: "Rob Ashton"
4+
date: "2021-10-11"
5+
output: rmarkdown::html_vignette
6+
vignette: >
7+
%\VignetteIndexEntry{Using vimpact for estimating vaccine impact - internal}
8+
%\VignetteEngine{knitr::rmarkdown}
9+
%\VignetteEncoding{UTF-8}
10+
---
11+
12+
<!-- DO NOT EDIT THIS FILE - see vignettes_src and make changes there -->
13+
14+
15+
16+
This vignette describes how to use vimpact to calculate impact as a member of VIMC. This requires a connection to the montagu database so can only be used internally. Note that this is all in development and the interface is likely to change.
17+
18+
## Impact by calendar year & impact by birth year
19+
20+
### Function interface
21+
22+
23+
```r
24+
impact <- vimpact::calculate_impact(
25+
con, method = "calendar_year", touchstone = "201710gavi-5",
26+
modelling_group = "CDA-Razavi", disease = "HepB",
27+
focal_scenario_type = "default", focal_vaccine_delivery = list(
28+
list(vaccine = "HepB_BD", activity_type = "routine"),
29+
list(vaccine = "HepB", activity_type = "routine")
30+
),
31+
baseline_scenario_type = "novac",
32+
burden_outcomes = c("hepb_deaths_acute", "hepb_deaths_dec_cirrh",
33+
"hepb_deaths_hcc"))
34+
str(impact)
35+
#> tibble [9,292 × 4] (S3: tbl_df/tbl/data.frame)
36+
#> $ country : chr [1:9292] "AFG" "AFG" "AFG" "AFG" ...
37+
#> $ burden_outcome: chr [1:9292] "deaths" "deaths" "deaths" "deaths" ...
38+
#> $ year : int [1:9292] 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
39+
#> $ impact : num [1:9292] 0 0 0.01 0 0 ...
40+
```
41+
42+
Or for impact by birth year, use `method = "birth_year"`. This will get `burden_estimate_set` ids for the baseline and focal scenarios for this particular touchstone, modelling group, disease. Then uses those to pull the `burden_estimate` data for specified `burden_outcomes`. Optionally filtering on country via `countries` arg, year via `vaccination_years` and under 5 age groups if `is_under5 = TRUE`. We then use raw impact from `burden_estimate` table and call relevant public facing impact method. For `method = "calendar_year"` `impact_by_calendar_year`. For `method = "birth_year"` `impact_by_birth_year`.
43+
44+
45+
46+
### Recipe interface
47+
48+
Define a recipe either as a csv or using `recipe_template`
49+
50+
51+
```r
52+
recipe <- data.frame(
53+
touchstone = "201710gavi-5",
54+
modelling_group = "CDA-Razavi", disease = "HepB",
55+
focal = "default:HepB_BD-routine;HepB-routine",
56+
baseline = "novac",
57+
burden_outcome = "hepb_deaths_acute,hepb_deaths_dec_cirrh,hepb_deaths_hcc;hepb_cases_acute_severe,hepb_cases_dec_cirrh,hepb_cases_hcc")
58+
t <- tempfile(fileext = ".csv")
59+
write.csv(recipe, t, row.names = FALSE)
60+
```
61+
62+
This is a set of properties defining what data we want to extract from the db e.g. `touchstone`, `modelling_group`, `disease`, focal and baseline scenarios and `burden_outcome`. It captures the same info as the args to `calculate_impact` above. Then use the recipe to define meta data frame
63+
64+
65+
```r
66+
meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t, con = con)
67+
```
68+
69+
And use this to calculate impact
70+
71+
72+
```r
73+
old_impact <- vimpact:::get_raw_impact_details(con, meta, "deaths")
74+
str(old_impact)
75+
#> 'data.frame': 9292 obs. of 7 variables:
76+
#> $ country : int 104 104 104 104 104 104 104 104 104 104 ...
77+
#> $ time : int 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
78+
#> $ baseline_value: num 12592 12763 12922 13107 13297 ...
79+
#> $ focal_value : num 12592 12763 12922 13106 13293 ...
80+
#> $ value : num 0 0 0 0.7 3.8 ...
81+
#> $ index : int 1 1 1 1 1 1 1 1 1 1 ...
82+
#> $ burden_outcome: chr "deaths" "deaths" "deaths" "deaths" ...
83+
```
84+
85+
## Impact by year of vaccination
86+
87+
### Function interface
88+
89+
Very similar to examples for calendar year and birth year, to calculate impact by year of vaccination stratified by activity type run
90+
91+
92+
```r
93+
impact <- vimpact::calculate_impact(
94+
con, method = "yov_activity_type", touchstone = "201710gavi-5",
95+
modelling_group = "CDA-Razavi", disease = "HepB",
96+
focal_scenario_type = "default", focal_vaccine_delivery = list(
97+
list(vaccine = "HepB_BD", activity_type = "routine"),
98+
list(vaccine = "HepB", activity_type = "routine")
99+
),
100+
baseline_scenario_type = "novac",
101+
burden_outcomes = "dalys")
102+
str(impact)
103+
#> tibble [4,077 × 6] (S3: tbl_df/tbl/data.frame)
104+
#> $ country : chr [1:4077] "AFG" "AFG" "AFG" "AFG" ...
105+
#> $ vaccine : chr [1:4077] "HepB" "HepB" "HepB" "HepB" ...
106+
#> $ activity_type : chr [1:4077] "routine" "routine" "routine" "routine" ...
107+
#> $ year : int [1:4077] 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 ...
108+
#> $ burden_outcome: chr [1:4077] "dalys" "dalys" "dalys" "dalys" ...
109+
#> $ impact : num [1:4077] 116637 119565 118335 124335 126887 ...
110+
```
111+
112+
or use `method = "yov_birth_cohort` for impact by year of vaccination stratified by birth cohort. This uses the same logic as other impact methods but before calling the public facing impact function it will also extract FVP data from the data base for this `touchstone` and vaccine delivery. It then calls either `impact_by_year_of_vaccination_activity_type` or `impact_by_year_of_vaccination_birth_cohort` to get impact.
113+
114+
### Recipe interface
115+
116+
We can re-use the recipe from above and get meta table for this method
117+
118+
119+
```r
120+
meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t,
121+
method = "method2a", con = con)
122+
```
123+
124+
Then use this to get raw impact
125+
126+
127+
```r
128+
old_raw_impact <- vimpact:::get_raw_impact_details(con, meta, "dalys")
129+
```
130+
131+
Extract the fvp data and map output to required interface
132+
133+
134+
```r
135+
fvps <- vimpact::extract_vaccination_history(con, "201710gavi-5", year_min = 2000,
136+
year_max = 2030,
137+
disease_to_extract = "HepB")
138+
#> User defined touchstone version is used.
139+
#> Converting input coverage data......
140+
#> Extracted interpolated population.
141+
#> Extracted raw coverage data...
142+
#> Transformed coverage data.
143+
fvps$fvps <- fvps$fvps_adjusted
144+
fvps$country <- fvps$country_nid
145+
```
146+
147+
Then can use `meta`, `raw_impact` and `fvps` to calculate impact by year of vaccination
148+
149+
150+
```r
151+
old_impact <- vimpact:::impact_by_year_of_vaccination(
152+
meta, old_raw_impact, fvps, vaccination_years = 2000:2030)
153+
str(old_impact)
154+
#> 'data.frame': 4256 obs. of 27 variables:
155+
#> $ country : int 4 4 4 4 4 4 4 4 4 4 ...
156+
#> $ vaccine : chr "HepB_BD" "HepB" "HepB" "HepB_BD" ...
157+
#> $ activity_type : chr "routine" "routine" "routine" "routine" ...
158+
#> $ scenario_description: chr "best-estimates" "best-estimates" "best-estimates" "best-estimates" ...
159+
#> $ coverage_set : int 620 619 619 620 619 620 620 620 620 620 ...
160+
#> $ delivery_id : int 3666 1138 1910 3665 1909 3652 3663 3664 3653 3661 ...
161+
#> $ disease : chr "HepB" "HepB" "HepB" "HepB" ...
162+
#> $ scenario_type : chr "default" "default" "default" "default" ...
163+
#> $ gavi_support_level : chr "with" "with" "with" "with" ...
164+
#> $ year : int 2028 2009 2030 2027 2029 2014 2025 2026 2015 2023 ...
165+
#> $ gavi_support : logi TRUE TRUE TRUE TRUE TRUE FALSE ...
166+
#> $ gender : chr "Both" "Both" "Both" "Both" ...
167+
#> $ age : num 0 0 0 0 0 0 0 0 0 0 ...
168+
#> $ target_source : num 1123235 1065815 1137944 1113858 1132169 ...
169+
#> $ coverage_source : num 0.644 0.63 0.779 0.614 0.769 ...
170+
#> $ cohort_size : num 1123235 1065815 1137944 1113858 1132169 ...
171+
#> $ delivery_population : num 1123235 1065815 1137944 1113858 1132169 ...
172+
#> $ fvps_source : num 723476 671463 886576 684020 870755 ...
173+
#> $ fvps_adjusted : num 723476 671463 886576 684020 870755 ...
174+
#> $ coverage_adjusted : num 0.644 0.63 0.779 0.614 0.769 ...
175+
#> $ country_nid : int 4 4 4 4 4 4 4 4 4 4 ...
176+
#> $ fvps : num 723476 671463 886576 684020 870755 ...
177+
#> $ time : num 2028 2009 2030 2027 2029 ...
178+
#> $ burden_outcome : chr "dalys" "dalys" "dalys" "dalys" ...
179+
#> $ impact_ratio : num 0.176 0.176 0.176 0.176 0.176 ...
180+
#> $ impact : num 127502 118335 156245 120548 153457 ...
181+
#> $ index : int 1 1 1 1 1 1 1 1 1 1 ...
182+
```
183+
184+
## Comparison
185+
186+
`calculate_impact` has a very similar interface as a single row in an impact recipe. The output is slightly different format but output from previous method can be transformed into same format.
187+
188+
189+
```r
190+
country <- dplyr::tbl(con, "country")
191+
old_impact <- old_impact %>%
192+
dplyr::left_join(country, by = c("country" = "nid"), copy = TRUE) %>%
193+
dplyr::select(country = id, vaccine, activity_type, year = time,
194+
burden_outcome, impact) %>%
195+
dplyr::filter(!is.na(impact)) %>%
196+
dplyr::arrange(activity_type, country, year, vaccine)
197+
str(old_impact)
198+
#> 'data.frame': 4077 obs. of 6 variables:
199+
#> $ country : chr "AFG" "AFG" "AFG" "AFG" ...
200+
#> $ vaccine : chr "HepB" "HepB" "HepB" "HepB" ...
201+
#> $ activity_type : chr "routine" "routine" "routine" "routine" ...
202+
#> $ year : num 2007 2008 2009 2010 2011 ...
203+
#> $ burden_outcome: chr "dalys" "dalys" "dalys" "dalys" ...
204+
#> $ impact : num 116637 119565 118335 124336 126887 ...
205+
```
206+
207+
and we can see from a plot that the output is very similar
208+
209+
210+
```r
211+
impact_plot <- merge(old_impact, impact, all.x = TRUE, all.y = TRUE,
212+
by = c("country", "vaccine", "activity_type", "year",
213+
"burden_outcome"))
214+
countries <- unique(impact_plot$country)
215+
impact_plot <- impact_plot %>%
216+
dplyr::rename(recipe_impact = impact.x, function_impact = impact.y) %>%
217+
tidyr::pivot_wider(names_from = country,
218+
values_from = c(recipe_impact, function_impact))
219+
impact_plot <- as.data.frame(impact_plot)
220+
countries_filter <- lapply(unique(countries), function(country) {
221+
button <- list(
222+
method = "restyle",
223+
args = list("y", c(
224+
list(impact_plot[, paste0("recipe_impact_", country)]),
225+
list(impact_plot[, paste0("function_impact_", country)]))),
226+
label = country
227+
)
228+
})
229+
plot <- plotly::plot_ly(data = impact_plot) %>%
230+
plotly::add_trace(x = ~year, y = ~recipe_impact_AFG, name = "recipe",
231+
type = "scatter", mode = "markers", text = ~vaccine,
232+
marker = list(
233+
color = "rgb(235, 204, 42)",
234+
size = 10
235+
)) %>%
236+
plotly::add_trace(x = ~year, y = ~function_impact_AFG, name = "function",
237+
type = "scatter", mode = "markers", text = ~vaccine,
238+
marker = list(
239+
color = "rgb(60, 154, 178)",
240+
symbol = "circle-open",
241+
size = 10,
242+
line = list(
243+
color = "rgb(60, 154, 178)",
244+
width = 2
245+
)
246+
)) %>%
247+
plotly::layout(
248+
yaxis = list(title = "impact"),
249+
updatemenus = list(
250+
list(
251+
y = 0.7,
252+
buttons = countries_filter
253+
)
254+
))
255+
```
256+
257+
258+
259+
260+
```{r}
261+
htmltools::tags$iframe(
262+
src = "impact_comparisons.html",
263+
width = "100%",
264+
height = "400",
265+
scrolling = "no",
266+
seamless = "seamless",
267+
frameBorder = "0",
268+
`data-external` = "1"
269+
)
270+
```
271+
272+
There are small differences in impact because of differences in precision. `calculate_impact` tries to do large parts of the aggregation on the database where `burden_estimate` values are stored as Postgres `real` type which has 4 bytes of storage size and 6 decimal digits of precision. Whereas if we pull this before aggregation the values are stored in R as doubles which are much higher precision leading to small differences when aggregated.
273+
274+
`calculate_impact` at the moment won't be able to generate impact for you for more than 1 set of scenario comparisons. Next steps will be to add a wrapper which can take some data like the impact recipe and call `calculate_imapct` for multiple scenarios.

vignettes/vignette.Rmd

+1-1
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ author: "Xiang Li"
44
date: "`r Sys.Date()`"
55
output: rmarkdown::html_vignette
66
vignette: >
7-
%\VignetteIndexEntry{vignette}
7+
%\VignetteIndexEntry{Using vimpact for estimating vaccine impact - VIMC members}
88
%\VignetteEngine{knitr::rmarkdown}
99
%\VignetteEncoding{UTF-8}
1010
---

vignettes_src/internal-impact.Rmd

+23-3
Original file line numberDiff line numberDiff line change
@@ -168,11 +168,24 @@ countries_filter <- lapply(unique(countries), function(country) {
168168
label = country
169169
)
170170
})
171-
plotly::plot_ly(data = impact_plot) %>%
171+
plot <- plotly::plot_ly(data = impact_plot) %>%
172172
plotly::add_trace(x = ~year, y = ~recipe_impact_AFG, name = "recipe",
173-
type = "scatter", mode = "markers", text = ~vaccine) %>%
173+
type = "scatter", mode = "markers", text = ~vaccine,
174+
marker = list(
175+
color = "rgb(235, 204, 42)",
176+
size = 10
177+
)) %>%
174178
plotly::add_trace(x = ~year, y = ~function_impact_AFG, name = "function",
175-
type = "scatter", mode = "markers", text = ~vaccine) %>%
179+
type = "scatter", mode = "markers", text = ~vaccine,
180+
marker = list(
181+
color = "rgb(60, 154, 178)",
182+
symbol = "circle-open",
183+
size = 10,
184+
line = list(
185+
color = "rgb(60, 154, 178)",
186+
width = 2
187+
)
188+
)) %>%
176189
plotly::layout(
177190
yaxis = list(title = "impact"),
178191
updatemenus = list(
@@ -183,6 +196,13 @@ plotly::plot_ly(data = impact_plot) %>%
183196
))
184197
```
185198

199+
```{r include = FALSE}
200+
htmlwidgets::saveWidget(plotly::partial_bundle(plot),
201+
"impact_comparisons.html")
202+
```
203+
204+
<!-- WIDGET -->
205+
186206
There are small differences in impact because of differences in precision. `calculate_impact` tries to do large parts of the aggregation on the database where `burden_estimate` values are stored as Postgres `real` type which has 4 bytes of storage size and 6 decimal digits of precision. Whereas if we pull this before aggregation the values are stored in R as doubles which are much higher precision leading to small differences when aggregated.
187207

188208
`calculate_impact` at the moment won't be able to generate impact for you for more than 1 set of scenario comparisons. Next steps will be to add a wrapper which can take some data like the impact recipe and call `calculate_imapct` for multiple scenarios.

0 commit comments

Comments
 (0)