Commit 6038af4

First commit
0 parents  commit 6038af4

14 files changed

+2763
-0
lines changed

README.md

+174
@@ -0,0 +1,174 @@
1+
---
2+
editor_options:
3+
markdown:
4+
wrap: 72
5+
---
6+
7+
# D-Lab R SQL Fundamentals Workshop
8+
9+
[![DataHub](https://img.shields.io/badge/launch-datahub-blue)](DATAHUB_LINK_HERE)
10+
[![Binder](https://mybinder.org/badge_logo.svg)](BINDER_LINK_HERE)
11+
[![License: CC BY
12+
4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
13+
14+
This repository contains the materials for the D-Lab R SQL Fundamentals
15+
workshop.
16+
17+
### Prerequisites
18+
19+
Prior experience with [R
20+
Fundamentals](https://github.com/dlab-berkeley/R-Fundamentals), [R Data
21+
Wrangling](https://github.com/dlab-berkeley/R-Data-Wrangling), and [R
22+
Data
23+
Visualization](https://github.com/dlab-berkeley/R-Data-Visualization) is
24+
assumed.
25+
26+
Check D-Lab's [Learning
27+
Pathways](https://dlab-berkeley.github.io/dlab-workshops/python_path.html)
28+
to figure out which of our workshops to take!
29+
30+
## Workshop Goals
31+
32+
In this workshop, we provide an introduction to using SQL to query and
33+
retrieve data from relational databases in R. First, we’ll cover what
34+
relational databases and SQL are. Then, we’ll use different packages in
35+
R to navigate relational databases using SQL.
36+
37+
If you are not familiar with material in [R
38+
Fundamentals](https://github.com/dlab-berkeley/R-Fundamentals), [R Data
39+
Wrangling](https://github.com/dlab-berkeley/R-Data-Wrangling), and [R
40+
Data
41+
Visualization](https://github.com/dlab-berkeley/R-Data-Visualization),
42+
we recommend attending those workshops first.
43+
44+
## Learning Objectives
45+
46+
After this workshop, you will be able to:
47+
48+
1. Explain what a relational database is and why we would want to use
49+
it.
50+
51+
2. Access and query a database using SQL.
52+
53+
3. Use `DBI` and `dbplyr` to query data from a relational database.
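
As a rough illustration of the third objective, the sketch below runs the same query twice: once as plain SQL through `DBI`, and once with `dplyr` verbs that `dbplyr` translates to SQL. It rests on assumptions that are not part of this README: an in-memory SQLite database created with the `RSQLite` package, and a hypothetical table name `acs_state` holding a few rows copied from `data/ACS_state.csv`. The workshop materials may set things up differently.

```r
library(DBI)
library(dplyr)  # tbl() on a DBI connection needs the dbplyr package installed

# Assumption: a throwaway in-memory SQLite database via RSQLite.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# A few rows copied from data/ACS_state.csv (only three of its columns).
acs_state <- data.frame(
  state       = c("Alabama", "Alaska", "California", "Hawaii", "Wyoming"),
  region      = c("South", "West", "West", "West", "West"),
  median_rent = c(925, 1345, 1856, 1868, 933)
)
dbWriteTable(con, "acs_state", acs_state)  # "acs_state" is a hypothetical name

# 1. Plain SQL sent through DBI.
west_sql <- dbGetQuery(con, "
  SELECT state, median_rent
  FROM acs_state
  WHERE region = 'West'
  ORDER BY median_rent DESC
")

# 2. The same query written with dplyr verbs; dbplyr generates the SQL,
#    and collect() pulls the result into R as a data frame.
west_tbl <- tbl(con, "acs_state") %>%
  filter(region == "West") %>%
  select(state, median_rent) %>%
  arrange(desc(median_rent)) %>%
  collect()

dbDisconnect(con)
```

Both `west_sql` and `west_tbl` should hold the four Western states in the sample, ordered from highest to lowest median rent. Calling `show_query()` on the `tbl()` pipeline before `collect()` prints the SQL that `dbplyr` generated.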
54+
55+
This workshop does not cover the following:
56+
57+
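- Reading data and conducting basic data operations in R. These are covered in [R
58+
Fundamentals](https://github.com/dlab-berkeley/R-Fundamentals).
59+
- Wrangling data for basic data analysis in R. This is covered in
60+
[R Data Wrangling](https://github.com/dlab-berkeley/R-Data-Wrangling).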
61+
62+
## Installation Instructions
63+
64+
We will use RStudio to go through the workshop materials, which requires
65+
installation of both the R language and the RStudio software. Complete
66+
the following steps:
67+
68+
1. [Download R](https://cloud.r-project.org/): Follow the links
69+
according to the operating system that you are running. Download the
70+
package, and install R onto your computer. You should install the
71+
most recent version (at least version 4.0).
72+
2. [Download
73+
RStudio](https://rstudio.com/products/rstudio/download/#download):
74+
Install RStudio Desktop. This should be free. Do this after you have
75+
already installed R. The D-Lab strongly recommends an RStudio
76+
edition of 2022.02.0+443 "Prairie Trillium" or higher.
77+
3. [Download these workshop
78+
materials](https://github.com/dlab-berkeley/R-Data-Visualization):
79+
1. Click the green "Code" button in the top right of the repository
80+
information.
81+
2. Click "Download ZIP".
82+
3. Extract this file to a folder on your computer where you can
83+
easily access it (we recommend Desktop).
84+
4. Optional: if you’re familiar with git, you can instead clone this
85+
repository by opening a terminal and entering [GitCloneCommand].
86+
87+
## Is R not working on your laptop?
88+
89+
If you do not have R installed and the materials loaded on your computer
90+
by the time the workshop starts, we *strongly* recommend using the UC Berkeley
91+
DataHub to run the materials for these lessons. You can access the
92+
DataHub by clicking [this
93+
link](https://datahub.berkeley.edu/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fdlab-berkeley%2FR-Data-Visualization&urlpath=rstudio%2F&branch=main).
94+
95+
The DataHub downloads this repository, along with any necessary
96+
packages, and allows you to run the materials in an RStudio instance on
97+
UC Berkeley's servers. No installation is necessary from your end - you
98+
only need an internet browser and a CalNet ID to log in. By using the
99+
DataHub, you can save your work and come back to it at any time. When
100+
you want to return to your saved work, just go straight to
101+
[DataHub](https://datahub.berkeley.edu), sign in, and click on the
102+
`R-SQL-Fundamentals` folder.
103+
104+
## Run the Code
105+
106+
Now that you have all the required software and materials, you need to
107+
run the code:
108+
109+
Provide instructions on running the code, including how to load relevant
110+
software (RStudio, Jupyter Notebooks, etc.) and which file to open up.
111+
See other repositories for examples.
112+
113+
Additionally, provide instructions on how to run code once it’s open
114+
(running Jupyter cells, RMarkdown cells, etc.).
115+
116+
# Additional Resources
117+
118+
Check out the following resources to learn more about using SQL in R:
119+
120+
- Errickson, J. 2024. STATS 506. [Computational Methods and Tools in
121+
Statistics](https://dept.stat.lsa.umich.edu/~jerrick/courses/stat506_f24/07-sql.html)
122+
123+
- [Data Analysis and Visualization in R for
124+
Ecologists](https://datacarpentry.github.io/R-ecology-lesson/instructor/05-r-and-databases.html)
125+
126+
# About the UC Berkeley D-Lab
127+
128+
D-Lab works with Berkeley faculty, research staff, and students to
129+
advance data-intensive social science and humanities research. Our goal
130+
at D-Lab is to provide practical training, staff support, resources, and
131+
space to enable you to use R for your own research applications. Our
132+
services cater to all skill levels and no programming, statistical, or
133+
computer science backgrounds are necessary. We offer these services in
134+
the form of workshops, one-to-one consulting, and working groups that
135+
cover a variety of research topics, digital tools, and programming
136+
languages.
137+
138+
Visit the [D-Lab homepage](https://dlab.berkeley.edu/) to learn more
139+
about us. You can view our
140+
[calendar](https://dlab.berkeley.edu/events/calendar) for upcoming
141+
events, learn about how to utilize our
142+
[consulting](https://dlab.berkeley.edu/consulting) and
143+
[data](https://dlab.berkeley.edu/data) services, and check out upcoming
144+
[workshops](https://dlab.berkeley.edu/events/workshops).
145+
146+
# Here are other R workshops offered by the D-Lab:
147+
148+
## Basic Competency
149+
150+
- [Fast-R](https://github.com/dlab-berkeley/Fast-R)
151+
- [R Data Wrangling](https://github.com/dlab-berkeley/R-wrang)
152+
- [R Functional
153+
Programming](https://github.com/dlab-berkeley/R-functional-programming)
154+
- [Geospatial Fundamentals in R with
155+
sf](https://github.com/dlab-berkeley/Geospatial-Fundamentals-in-R-with-sf)
156+
- [Census Data in
157+
R](https://github.com/dlab-berkeley/Census-Data-in-R)
158+
159+
## Intermediate/Advanced Competency
160+
161+
- [Advanced Data Wrangling in
162+
R](https://github.com/dlab-berkeley/advanced-data-wrangling-in-R)
163+
- [Unsupervised Learning in
164+
R](https://github.com/dlab-berkeley/Unsupervised-Learning-in-R)
165+
- [R Machine Learning with
166+
tidymodels](https://github.com/dlab-berkeley/Machine-Learning-with-tidymodels)
167+
- [Introduction to Deep Learning in
168+
R](https://github.com/dlab-berkeley/Deep-Learning-in-R)
169+
- [R Package
170+
Development](https://github.com/dlab-berkeley/R-package-development)
171+
172+
# Contributors
173+
174+
- [Taesoo Song](https://taesoosong.github.io/)

data/.DS_Store

6 KB
Binary file not shown.

data/ACS_age_income.csv.gz

852 KB
Binary file not shown.

data/ACS_state.csv

+52
@@ -0,0 +1,52 @@
1+
"state","region","population","median_income","median_rent","poverty_rate","immigrant_share"
2+
"Alabama","South",5028092,59609,925,0.210627531252712,0.0352433885457943
3+
"Alaska","West",734821,86370,1345,0.121749464895995,0.0790314920232274
4+
"Arizona","West",7172282,72581,1308,0.169381648494447,0.130496820956008
5+
"Arkansas","South",3018669,56335,868,0.214889701348985,0.0502807694384512
6+
"California","West",39356104,91905,1856,0.15080918300121,0.265343337846653
7+
"Colorado","West",5770790,87598,1594,0.106675752977669,0.0948920338463191
8+
"Connecticut","Northeast",3611317,90213,1374,0.12712434714844,0.150350689236088
9+
"Delaware","South",993635,79325,1286,0.164121606658186,0.0978457884434425
10+
"District of Columbia","South",670587,101722,1817,0.211741351427667,0.134353931704611
11+
"Florida","South",21634529,67917,1444,0.169355489947195,0.211441256705889
12+
"Georgia","South",10722325,71355,1221,0.180140261288291,0.104395828330143
13+
"Hawaii","West",1450589,94814,1868,0.118265338449083,0.180181981250375
14+
"Idaho","West",1854109,70214,1061,0.120639554964057,0.0573919872024784
15+
"Illinois","Midwest",12757634,78433,1179,0.150571111485779,0.141458910014192
16+
"Indiana","Midwest",6784403,67173,967,0.153426478315972,0.0557063016451116
17+
"Iowa","Midwest",3188836,70571,914,0.123868114527156,0.0559881411273581
18+
"Kansas","Midwest",2935922,69747,986,0.133124087853195,0.0706047367743421
19+
"Kentucky","South",4502935,60183,902,0.1993576337823,0.0408864440637051
20+
"Louisiana","South",4640546,57852,996,0.250035987040722,0.0423810043042349
21+
"Maine","Northeast",1366949,68251,1009,0.12908900682591,0.0379728870645503
22+
"Maryland","South",6161707,98461,1598,0.113367712696242,0.156715987955935
23+
"Massachusetts","Northeast",6984205,96505,1588,0.112456236811943,0.17580325892496
24+
"Michigan","Midwest",10057921,68505,1037,0.172065274370222,0.0690858478606066
25+
"Minnesota","Midwest",5695292,84313,1178,0.102634483779497,0.0846176104754594
26+
"Mississippi","South",2958846,52985,896,0.261200580153675,0.0230687910083864
27+
"Missouri","Midwest",6154422,65920,957,0.158639485074933,0.0419538991638857
28+
"Montana","West",1091840,66341,974,0.140426691043736,0.0226342687573271
29+
"Nebraska","Midwest",1958939,71722,987,0.113126537281521,0.0750074402520956
30+
"Nevada","West",3104817,71646,1382,0.162277393077234,0.190759390972157
31+
"New Hampshire","Northeast",1379610,90845,1336,0.077681647465579,0.060428671871036
32+
"New Jersey","Northeast",9249063,97126,1577,0.127899673933075,0.231942954653893
33+
"New Mexico","West",2112463,58722,966,0.234865601958349,0.0919874099570028
34+
"New York","Northeast",19994379,81386,1507,0.175133051444859,0.225510179635987
35+
"North Carolina","South",10470214,66186,1093,0.178155791877651,0.0828966819589361
36+
"North Dakota","Midwest",776874,73959,912,0.105383412835211,0.0463691666859748
37+
"Ohio","Midwest",11774683,66990,945,0.17527926908441,0.0486097162870542
38+
"Oklahoma","South",3970497,61364,934,0.196242871142204,0.0614613233557411
39+
"Oregon","West",4229374,76632,1373,0.127857808268195,0.0976414003585401
40+
"Pennsylvania","Northeast",12989208,73170,1110,0.154475236377078,0.0725067302024881
41+
"Rhode Island","Northeast",1094250,81370,1195,0.136164114940795,0.144172721041809
42+
"South Carolina","South",5142750,63623,1065,0.195185238715585,0.0525341500170142
43+
"South Dakota","Midwest",890342,69457,878,0.139200853792906,0.0382920271086841
44+
"Tennessee","South",6923772,64035,1047,0.183407941636799,0.0542626186997492
45+
"Texas","South",29243342,73035,1251,0.187910968897132,0.170563781663532
46+
"Utah","West",3283809,86833,1302,0.0843418639790693,0.0840520870732737
47+
"Vermont","Northeast",643816,74014,1149,0.104958811468038,0.0438494849460094
48+
"Virginia","South",8624511,87249,1440,0.120914355942008,0.126338177318111
49+
"Washington","West",7688549,90325,1592,0.113841281868454,0.148718308226949
50+
"West Virginia","South",1792967,55217,831,0.215511414496319,0.0161408436407363
51+
"Wisconsin","Midwest",5882128,72458,992,0.126570951058706,0.0502143101952219
52+
"Wyoming","West",577929,72495,933,0.117718814280931,0.0341841298844668

images/R_session_aborted.png

8.07 KB