Commit 900a301

Standardize the last step: disconnect & close Docker
1 parent 4650a3b commit 900a301

20 files changed: +196 -52 lines changed

050-docker-setup-postgres-with-adventureworks.Rmd

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ dbDisconnect(con)
 
 ```
 
-## Cleaning up
+## Cleaning up: diconnect from the database and stop Docker
 
 Always have R disconnect from the database when you're done.
 ```{r}

060-secure-and-use-dbms-credentials-in-r.Rmd

Lines changed: 1 addition & 7 deletions
@@ -73,15 +73,9 @@ Once the connection object has been created, you can list all of the tables in o
 dbExecute(con, "set search_path to humanresources, public;") # watch for duplicates!
 dbListTables(con)
 ```
+## Disconnect from the database and stop Docker
 
-## Clean up
-
-Afterwards, always disconnect from the dbms:
 ```{r}
 dbDisconnect(con)
-```
-Tell Docker to stop the `adventureworks` container:
-```{r}
 sp_docker_stop("adventureworks")
 ```
-

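The two-step ending this commit standardizes (disconnect, then stop the container) could also be wrapped in one helper. The sketch below is not part of the commit: `disconnect_and_stop` is a hypothetical name, and it assumes `DBI::dbDisconnect` and `sqlpetr::sp_docker_stop` are the functions the chapters call. Using `on.exit()` means the container is stopped even if the disconnect step fails.

```r
# Hypothetical helper (not part of the commit): disconnect from the database
# and stop the Docker container in one call.
# The default functions are only looked up when the helper is actually called,
# so other functions can be passed in for testing.
disconnect_and_stop <- function(con,
                                container = "adventureworks",
                                disconnect = DBI::dbDisconnect,
                                stop_container = sqlpetr::sp_docker_stop) {
  # Register the container stop first so it runs even if disconnecting fails.
  on.exit(stop_container(container), add = TRUE)
  disconnect(con)
  invisible(container)
}
```

With an open connection `con`, each chapter's last chunk would then reduce to `disconnect_and_stop(con)`.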
065-connect-to-adventureworks-with-dplyr.Rmd

Lines changed: 1 addition & 13 deletions
@@ -134,21 +134,9 @@ salesorderheader_table$ops$x$vars
 ```
 `salesorderheader_table` holds information needed to get the data from the 'salesorderheader' table, but `salesorderheader_table` does not hold the data itself. In the following sections, we will examine more closely this relationship between the `salesorderheader_table` object and the data in the database's 'salesorderheader' table.
 
-Disconnect from the database:
-```{r }
-dbDisconnect(con)
+## Disconnect from the database and stop Docker
 
-```
-## Cleaning up
-
-Always have R disconnect from the database when you're done.
 ```{r}
-
 dbDisconnect(con)
-
-```
-
-Stop the `adventureworks` container:
-```{r}
 sp_docker_stop("adventureworks")
 ```

080-elementary-queries.Rmd

Lines changed: 3 additions & 4 deletions
@@ -283,12 +283,11 @@ skimr::skim_to_wide(salesorderheader_tibble) #skimr doesn't like certain kinds o
 
 ```
 
-### Close the connection and shut down adventureworks
+## Disconnect from the database and stop Docker
 
-Where you place the `collect` function matters.
 ```{r}
-DBI::dbDisconnect(con)
-sqlpetr::sp_docker_stop("adventureworks")
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
 ```
 
 ## Additional reading

083-exploring-a-single-table.Rmd

Lines changed: 88 additions & 8 deletions
@@ -324,15 +324,17 @@ ggplot(data = annual_sales, aes(x = orderdate, y = avg_total_soh_dollars)) +
 
 Look at number of orders by the the average sales per order for the four years:
 ```{r number of orders by the the average sales per order 2}
-annual_sales %>% arrange(orderdate) %>%
-ggplot(aes(x = avg_total_soh_dollars, y = as.numeric(soh_count))) +
+sales <- annual_sales %>% arrange(orderdate) %>%
+ungroup() %>%
+mutate(year = year(orderdate), year = as.factor(year))
+
+sales %>% ggplot(aes(x = avg_total_soh_dollars, y = as.numeric(soh_count), color = year)) +
 geom_point() +
-geom_text(aes(label = orderdate, hjust = .5, vjust = 0)) +
+geom_text(aes(label = year, hjust = .5, vjust = 0)) +
 geom_path() +
 xlab("Average dollars per order") +
 ylab("Total number of orders") +
-facet_wrap("onlineorderflag", scales = "free") +
-scale_y_continuous(labels = scales::dollar_format()) +
+facet_wrap("onlineorderflag", scales = "free") +
 scale_x_continuous(labels = scales::dollar_format()) +
 ggtitle(paste("Number of Orders by Average Order Amount\n",
 min_soh_dt, " - ", max_soh_dt))
@@ -399,6 +401,86 @@ A couple of things jump out from the graph.
 2. 2011 has more variation than 2012 and 2013 and peaks every two months.
 3. 2014 has the most variation and also peaks every two months. Both the number of sales, 939, and the average sales order size, $52.19 plummet in June 2014.
 
+## define the one-off function
+
+```{r}
+monthly_sales <- tbl(con, in_schema("sales", "salesorderheader")) %>%
+select(orderdate, subtotal, onlineorderflag) %>%
+# mutate(day = day(as.Date(orderdate))) %>%
+mutate(
+orderdate = as.Date(orderdate),
+day = day(orderdate)) %>%
+show_query() %>%
+collect() %>% # From here on we're in R
+mutate(
+correct_orderdate = case_when(
+# onlineorderflag == FALSE & day == 1 ~ NA,
+onlineorderflag == FALSE & day == 1L ~ orderdate - 1 ,
+TRUE ~ orderdate
+))
+
+```
+
+Here's the magic:
+
+```{r}
+dbExecute(con,
+"CREATE OR REPLACE FUNCTION so_adj_date(so_date timestamp, ONLINE_ORDER boolean) RETURNS timestamp AS $$
+BEGIN
+IF (ONLINE_ORDER) THEN
+RETURN (SELECT so_date);
+ELSE
+RETURN(SELECT CASE WHEN EXTRACT(DAY FROM so_date) = 1
+THEN so_date - '1 day'::interval
+ELSE so_date
+END
+);
+END IF;
+END; $$
+LANGUAGE PLPGSQL;
+")
+```
+
+this is a big lesson! Using the magic.
+
+```{r}
+monthly_sales <- tbl(con, in_schema("sales", "salesorderheader")) %>%
+select(orderdate, subtotal, onlineorderflag) %>%
+# mutate(day = day(as.Date(orderdate))) %>%
+mutate(
+# orderdate = as.Date(orderdate),
+adjusted_date = so_adj_date(orderdate, onlineorderflag),
+day = day(orderdate)) %>%
+show_query() %>%
+# collect() %>% # From here on we're in R
+collect()
+
+```
+
+```{r}
+monthly_sales %>% filter(day==1 & onlineorderflag == FALSE) %>%
+count(orderdate) %>% as.data.frame()
+```
+
+```{r}
+mutate(orderdate = date(orderdate),
+orderdate = round_date(orderdate, "month"),
+onlineorderflag = if_else(onlineorderflag == FALSE,
+"Sales Rep", "Online"),) %>% #
+group_by(orderdate, onlineorderflag) %>%
+summarize(
+min_soh_orderdate = min(orderdate, na.rm = TRUE),
+max_soh_orderdate = max(orderdate, na.rm = TRUE),
+total_soh_dollars = round(sum(subtotal, na.rm = TRUE), 2),
+avg_total_soh_dollars = round(mean(subtotal, na.rm = TRUE), 2),
+soh_count = n()
+)
+
+```
+
+
+
+
 ### Looking at lagged data
 
 ```{r}
@@ -759,11 +841,9 @@ The gather command throws the following warning:
 they will be dropped
 
 
-## Close and clean up
-
+## Disconnect from the database and stop Docker
 
 ```{r}
 dbDisconnect(con)
 sp_docker_stop("adventureworks")
 ```
-

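The `so_adj_date` function added to 083-exploring-a-single-table.Rmd moves a sales-rep (non-online) order dated the 1st of a month back by one day and leaves every other order date unchanged. As a rough illustration of that rule only, here is a base-R sketch that runs without a database. The name `so_adj_date_r` and the example dates are made up for this illustration, and it works on `Date` values, whereas the PL/pgSQL function takes and returns a `timestamp`.

```r
# Base-R sketch of the date adjustment; `so_adj_date_r` is a hypothetical name
# and is not defined anywhere in the commit.
so_adj_date_r <- function(so_date, online_order) {
  so_date <- as.Date(so_date)
  day_of_month <- as.integer(format(so_date, "%d"))
  # Sales-rep (non-online) orders dated the 1st move back one day;
  # which() drops NA comparisons, so those rows are left unchanged.
  shift <- which(!online_order & day_of_month == 1L)
  so_date[shift] <- so_date[shift] - 1
  so_date
}

# Illustrative dates only: a sales-rep order on the 1st, an online order on
# the 1st, and a sales-rep order mid-month.
so_adj_date_r(as.Date(c("2014-06-01", "2014-06-01", "2014-06-15")),
              c(FALSE, TRUE, FALSE))
```

Only the first date changes (to 2014-05-31). Defining the function in PostgreSQL, as the commit does, keeps the adjustment on the database side so it can run before `collect()`.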
085-sales-forecasting.Rmd

Lines changed: 21 additions & 0 deletions
@@ -82,6 +82,20 @@ First, let's look at orders for online and sales representative sales:
 
 ```{r}
 monthly_tsibble %>% autoplot(orders)
+Disconnect from the database:
+```{r }
+dbDisconnect(con)
+
+```
+
+## Cleaning up
+
+Always have R disconnect from the database when you're done.
+```{r}
+
+dbDisconnect(con)
+
+```
 
 ```
 
@@ -103,3 +117,10 @@ monthly_tsibble %>% autoplot(total_revenue / orders)
 ```
 
 For the sales representatives, there's still a month-to-month variation but the revenue per order appears to be bounded both below and above. However, the online revenue per order is decreasing. Note that this decline appears to be in steps between May and June each year; that could mean it's an artifact of the database creation process and not a "natural" phenomenon.
+
+## Disconnect from the database and stop Docker
+
+```{r}
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
+```

090-lazy-evalutation-intro.Rmd

Lines changed: 4 additions & 3 deletions
@@ -235,10 +235,11 @@ Q %>% dplyr::show_query()
 
 ```
 Hand-written SQL code to do the same job will probably look a lot nicer and could be more efficient, but functionally `dplyr` does the job.
+## Disconnect from the database and stop Docker
 
-```{r disconnect and close down}
-DBI::dbDisconnect(con)
-sqlpetr::sp_docker_stop("adventureworks")
+```{r}
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
 ```
 
 

095-lazy-evalutation-and-timing.Rmd

Lines changed: 4 additions & 3 deletions
@@ -145,12 +145,13 @@ try(sales_person_table %>% collect() %>%
 
 See more examples of lazy execution [here](https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html).
 
+## Disconnect from the database and stop Docker
+
 ```{r}
-DBI::dbDisconnect(con)
-sqlpetr::sp_docker_stop("adventureworks")
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
 ```
 
-
 ## Other resources
 
 * Benjamin S. Baumer. 2017. A Grammar for Reproducible and Painless Extract-Transform-Load Operations on Medium Data. [https://arxiv.org/abs/1708.07073](https://arxiv.org/abs/1708.07073)

100-dbi-and-sql.Rmd

Lines changed: 5 additions & 2 deletions
@@ -261,7 +261,10 @@ GQ <- dbGetQuery(
 But because `Q` hasn't been executed, we can add to it. This behavior is the basis for a useful debugging and development process where queries are built up incrementally.
 
 Where you place the `collect` function matters.
+
+## Disconnect from the database and stop Docker
+
 ```{r}
-DBI::dbDisconnect(con)
-sqlpetr::sp_docker_stop("sql-pet")
+dbDisconnect(con)
+sp_docker_stop("sql-pet")
 ```

130-sql-dplyr-joins-data.Rmd

Lines changed: 2 additions & 1 deletion
@@ -488,8 +488,9 @@ smy_cust_dupes <- dbGetQuery(con,'select *
 sp_print_df(smy_cust_dupes)
 ```
 -->
+## Disconnect from the database and stop Docker
+
 
-Diconnect from the db:
 ```{r}
 
 dbDisconnect(con)

140-sql-dplyr-joins.Rmd

Lines changed: 4 additions & 4 deletions
@@ -928,11 +928,11 @@ The explanation below is from [here](https://stackoverflow.com/questions/4748577
 
 In `by = c("col1" = "col2")`, `=` is _not_ the equality operator, but an assignment operator (the equality operator in R is `==`). The expression inside `c(...)` creates a _named character vector_ (name: col1 value: col2) that `dplyr` uses for the join. Nowhere do you define the kind of comparison that is made during the join. The comparison is hard-coded in `dplyr`. I don't think `dplyr` supports non-equi joins (yet).
 
-```{r}
-# disconnect from the db
-dbDisconnect(con)
+## Disconnect from the database and stop Docker
 
-sp_docker_stop("sql-pet")
+```{r}
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
 ```
 
 ```{r}

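The context paragraph in the 140-sql-dplyr-joins.Rmd hunk describes `c("col1" = "col2")` as a named character vector. A minimal base-R check of that claim, with no `dplyr` or database needed, might look like this; the variable name `by_spec` is only for illustration.

```r
# Build the same object that would be passed as `by = ...` in a join.
by_spec <- c("col1" = "col2")

is.character(by_spec)  # TRUE: it is a character vector
names(by_spec)         # "col1": the name (the "left-hand" column)
unname(by_spec)        # "col2": the value (the "right-hand" column)

# `==` is the comparison operator; it plays no part in building the vector.
"col1" == "col2"       # FALSE
```

Inside `c(...)`, `=` only attaches a name to an element, which is why the paragraph says no comparison is being defined at that point.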
150-sql-joins-exercises.Rmd

Lines changed: 6 additions & 0 deletions
@@ -318,3 +318,9 @@ In the next code block, change the function parameter to different table names t
 #dbListTables(con)
 sp_tbl_pk_fk_sql('customer')
 ```
+## Disconnect from the database and stop Docker
+
+```{r}
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
+```

153-sql-joins-exercises.Rmd

Lines changed: 1 addition & 1 deletion
@@ -1683,7 +1683,7 @@ cust_id <- 601
 sp_print_df(customer_details_fn_dplyr(cust_id))
 ```
 
-## Diconnect from the db and clean up
+## Diconnect from the database and stop Docker
 ```{r}
 dbDisconnect(con)
 

155-sql-joins-exercises-with-answers.Rmd

Lines changed: 3 additions & 3 deletions
@@ -2191,11 +2191,11 @@ compute(smy_customer_details_dplyr,name='smy_compute_exercise',temporary = FALSE
 
 ```
 
+## Disconnect from the database and stop Docker
+
 ```{r}
-# diconnect from the db
 dbDisconnect(con)
-
-sp_docker_stop("sql-pet")
+sp_docker_stop("adventureworks")
 ```
 
 ```{r}

160-anti-join-cost-comparisons.Rmd

Lines changed: 9 additions & 0 deletions
@@ -173,3 +173,12 @@ select c.customer_id,count(*) lojs
 print(glue("sql_aj1 loj-null costs=", sql_aj1[1, 1]))
 print(glue("sql_aj3 not exist costs=", sql_aj3[1, 1]))
 ```
+## Cleaning up
+
+Always have R disconnect from the database when you're done and stop the Adventureworks Container
+```{r}
+
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
+
+```

170-sql-quick-start-simple-retrieval.Rmd

Lines changed: 11 additions & 0 deletions
@@ -509,3 +509,14 @@ The SEQ column shows the standard order of SQL execution. One take away for thi
 
 6. SQL for View table dependencies.
 7. Add cartesian join exercise.
+
+```
+## Cleaning up
+
+Always have R disconnect from the database when you're done and stop the Adventureworks Container
+```{r}
+
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
+
+```

180-r-query-postgres-metadata.Rmd

Lines changed: 10 additions & 0 deletions
@@ -343,3 +343,13 @@ ls()
 ```
 
 
+```
+## Cleaning up
+
+Always have R disconnect from the database when you're done and stop the Adventureworks Container
+```{r}
+
+dbDisconnect(con)
+sp_docker_stop("adventureworks")
+
+```

190-drilling-into-db-environment.Rmd

Lines changed: 6 additions & 1 deletion
@@ -393,8 +393,13 @@ create_graph(
 ) %>%
 render_graph()
 ```
+```
+## Cleaning up
 
+Always have R disconnect from the database when you're done and stop the Adventureworks Container
 ```{r}
+
 dbDisconnect(con)
-# system2('docker','stop sql-pet')
+sp_docker_stop("adventureworks")
+
 ```

210-sql-query-steps.Rmd

Lines changed: 5 additions & 1 deletion
@@ -75,8 +75,12 @@ dbClearResult(rs)
 # dbRemoveTable(con, "cars")
 dbRemoveTable(con, "mtcars")
 # dbRemoveTable(con, "cust_movies")
+```
+
+
+```{r}
+## Disconnect from the database and stop Docker
 
-# diconnect from the db
 dbDisconnect(con)
 
 sp_docker_stop("sql-pet")
