3 changes: 2 additions & 1 deletion R/calculate_custom_geographies.R
@@ -433,7 +433,8 @@ calculate_custom_geographies = function(
paste0(definition, " [Percentage recalculated from summed components.]"),
aggregation_strategy == "weighted_average" ~
paste0(definition, " [Aggregated via population-weighted average using ", weight_variable, ".]"),
-TRUE ~ definition))
+TRUE ~ definition)) %>%
+dplyr::select(calculated_variable, variable_type, definition, dplyr::everything())

attr(result, "codebook") = updated_codebook

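Reviewer note: the added `dplyr::select()` call only reorders columns. `dplyr::everything()` keeps every remaining column after the three named ones, so nothing is dropped. A minimal sketch on a hypothetical codebook (the `extra_note` column and all values are invented for illustration, not taken from the package):

```r
library(dplyr)

# Hypothetical stand-in for the codebook; only the three leading column
# names come from the diff, everything else is illustrative.
codebook = data.frame(
  definition = c("Illustrative definition A.", "Illustrative definition B."),
  extra_note = c(NA, "illustrative"),
  variable_type = c("Count", "Percent"),
  calculated_variable = c("total_population_universe", "snap_received_percent"))

# Move the three key columns to the front; keep all others in their original order.
reordered = codebook %>%
  dplyr::select(calculated_variable, variable_type, definition, dplyr::everything())

names(reordered)
```

The same pattern appears in both R files touched by this PR, so the codebook should lead with the same three columns regardless of which function produced it.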
3 changes: 2 additions & 1 deletion R/compile_acs_data.R
@@ -332,7 +332,8 @@ this function returns.")}
})})

## attach the codebook and resolved tables as attributes to the returned dataset
-attr(df_cvs, "codebook") = codebook
+attr(df_cvs, "codebook") = codebook %>%
+dplyr::select(calculated_variable, variable_type, definition, dplyr::everything())
attr(df_cvs, "resolved_tables") = resolved_tables

if (isTRUE(spatial)) { df_cvs = sf::st_as_sf(df_cvs) }
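Reviewer note: the codebook travels with the returned data as an attribute, and callers read it back with `attr()`, as the codebook vignette in this PR does. A minimal base-R sketch of that round trip, using hypothetical stand-ins for `df_cvs` and `codebook` (the values are illustrative only):

```r
# Hypothetical stand-ins for the objects named in the diff.
df_cvs = data.frame(
  GEOID = c("34001", "34003"),
  total_population_universe = c(276270, 962316))

codebook = data.frame(
  calculated_variable = "total_population_universe",
  variable_type = "Count",
  definition = "Illustrative definition.")

# Attach the codebook to the dataset, as the function does before returning.
attr(df_cvs, "codebook") = codebook

# A caller retrieves it with the same attribute name.
retrieved = attr(df_cvs, "codebook")
identical(retrieved, codebook)
```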
44 changes: 26 additions & 18 deletions README.Rmd
@@ -37,8 +37,11 @@ the American Community Survey (ACS).

With a single function call, you get:

-- Access to hundreds of standardized variables, such as percentages, in addition
-to the raw count variables used to produce them.
+- Access to hundreds of standardized variables, such as percentages and
+the raw count variables used to produce them.

+- Margins of error and coefficients of variation for all
+variables--those direct from the API as well as derived variables.

- Meaningful, consistent variable names.

@@ -50,23 +53,23 @@ With a single function call, you get:
- Supplemental measures, such as population density, that aren't
available from the ACS.

-- Built-in quality checks to help ensure that calculated variables are
-accurate. Plus some good, old-fashioned manual QC.
+- Built-in quality checks to help ensure that calculated variables
+and measures of error are accurate. Plus some good, old-fashioned manual QC.
+That said--use at your own risk. We cannot and do not guarantee there aren't bugs.


-- Margins of error and coefficients of variation for all
-variables--those direct from the API as well as derived variables.

# Installation

Install the development version of `urbnindicators` from
[GitHub](https://github.com/) with:

```r
-install.packages("renv")
+# install.packages("renv")
renv::install("UI-Research/urbnindicators")
```

-You'll need a Census API key
+You'll want a Census API key
([request one here](https://api.census.gov/data/key_signup.html)).
Set it once with:

@@ -95,8 +98,11 @@ library(urbnindicators)
## Discover Available Data

```{r, warning = FALSE, message = FALSE}
-list_tables()
-list_variables()
+list_tables() |> head(10)
+```
+
+```{r, warning = FALSE, message = FALSE}
+list_variables() |> head(10)
```

## Obtain Data
@@ -112,7 +118,7 @@ df = compile_acs_data(
geography = "county",
states = "NJ")

-glimpse(df)
+glimpse(df) |> head(10)
```

## Visualize Data
@@ -136,8 +142,9 @@ plot_data = df %>%
data_source_year = factor(data_source_year))

state_averages = plot_data %>%
-group_by(data_source_year) %>%
-summarize(mean_pct = mean(race_personofcolor_percent)) %>%
+summarize(
+.by = data_source_year,
+mean_pct = mean(race_personofcolor_percent)) %>%
arrange(data_source_year) %>%
pull(mean_pct)

@@ -156,9 +163,7 @@ dumbbell_data = plot_data %>%
names_from = data_source_year,
values_from = race_personofcolor_percent,
names_prefix = "year_")
-```

-```{r, warning = FALSE, message = FALSE}
ggplot() +
geom_segment(
data = dumbbell_data,
@@ -210,7 +215,7 @@ ggplot() +
title = "All NJ Counties Experienced Racial Diversification from 2019 to 2024",
subtitle = paste0("Share of population who are people of color, by county, 2019-2024
Confidence intervals are presented around each point but are extremely small"),
-x = "County",
+x = "",
y = "Share of population who are people of color") +
scale_x_discrete(expand = expansion(mult = c(.03, .04))) +
scale_y_continuous(
@@ -223,8 +228,7 @@ Confidence intervals are presented around each point but are extremely small"),

# Learn More

-A growing number of vignettes aim to support users in effectively using
-this package. These vignettes include:
+Check out the vignettes for additional details:

- A package overview to help users [**Get
Started**](articles/urbnindicators.html).
@@ -241,6 +245,10 @@ this package. These vignettes include:
Error**](articles/quantified-survey-error.html) can improve
inference making.

+- You can re-create your indicators and their measures of error
+for [**Custom Geographies**](articles/custom-geographies.html).
+Neighborhoods? Unincorporated counties? Start here.

# Credits

This package is built on top of and enormously indebted to
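Reviewer note on the `summarize(.by = ...)` change in the Visualize Data chunk: per-operation grouping with `.by` should give the same values as `group_by()` followed by `summarize()`, assuming dplyr 1.1.0 or later. One difference is that `.by` keeps groups in order of first appearance rather than sorting them, so the existing `arrange(data_source_year)` still matters. A minimal sketch with hypothetical values (column names come from the diff, the numbers are invented):

```r
library(dplyr)

# Hypothetical data; rows deliberately start with 2024 to show group ordering.
plot_data = data.frame(
  data_source_year = factor(c(2024, 2019, 2024, 2019)),
  race_personofcolor_percent = c(0.45, 0.40, 0.55, 0.50))

# Old form: persistent grouping, then summarize.
grouped = plot_data %>%
  group_by(data_source_year) %>%
  summarize(mean_pct = mean(race_personofcolor_percent)) %>%
  arrange(data_source_year) %>%
  pull(mean_pct)

# New form: grouping scoped to this one summarize() call.
by_arg = plot_data %>%
  summarize(
    .by = data_source_year,
    mean_pct = mean(race_personofcolor_percent)) %>%
  arrange(data_source_year) %>%
  pull(mean_pct)

all.equal(grouped, by_arg)
```

The `.by` form also returns an ungrouped result, so no trailing `ungroup()` is needed downstream.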
91 changes: 52 additions & 39 deletions README.md
@@ -21,8 +21,11 @@ the American Community Survey (ACS).

With a single function call, you get:

-- Access to hundreds of standardized variables, such as percentages, in
-addition to the raw count variables used to produce them.
+- Access to hundreds of standardized variables, such as percentages and
+the raw count variables used to produce them.

+- Margins of error and coefficients of variation for all variables–those
+direct from the API as well as derived variables.

- Meaningful, consistent variable names.

@@ -34,23 +37,22 @@ With a single function call, you get:
- Supplemental measures, such as population density, that aren’t
available from the ACS.

-- Built-in quality checks to help ensure that calculated variables are
-accurate. Plus some good, old-fashioned manual QC.
-
-- Margins of error and coefficients of variation for all variables–those
-direct from the API as well as derived variables.
+- Built-in quality checks to help ensure that calculated variables and
+measures of error are accurate. Plus some good, old-fashioned manual
+QC. That said–use at your own risk. We cannot and do not guarantee
+there aren’t bugs.

# Installation

Install the development version of `urbnindicators` from
[GitHub](https://github.com/) with:

``` r
-install.packages("renv")
+# install.packages("renv")
renv::install("UI-Research/urbnindicators")
```

-You’ll need a Census API key ([request one
+You’ll want a Census API key ([request one
here](https://api.census.gov/data/key_signup.html)). Set it once with:

``` r
@@ -65,26 +67,16 @@ updates–check to ensure you have the most recent version installed!
## Discover Available Data

``` r
-list_tables()
-#> [1] "age" "computing_devices"
-#> [3] "cost_burden" "disability"
-#> [5] "educational_attainment" "employment"
-#> [7] "gini" "health_insurance"
-#> [9] "household_size" "income_quintiles"
-#> [11] "internet" "language"
-#> [13] "median_household_income" "median_housing_cost"
-#> [15] "median_income_by_tenure" "mortgage_status"
-#> [17] "nativity" "occupants_per_room"
-#> [19] "population_density" "poverty"
-#> [21] "public_assistance" "race"
-#> [23] "school_enrollment" "sex"
-#> [25] "snap" "tenure"
-#> [27] "tenure_by_housing_costs" "tenure_by_units_in_structure"
-#> [29] "total_population" "transportation_to_work"
-#> [31] "travel_time_to_work" "units_in_structure"
-#> [33] "vehicles_available" "year_structure_built"
-list_variables()
-#> # A tibble: 861 × 2
+list_tables() |> head(10)
+#> [1] "age" "computing_devices" "cost_burden"
+#> [4] "disability" "educational_attainment" "employment"
+#> [7] "gini" "health_insurance" "household_size"
+#> [10] "income_quintiles"
+```
+
+``` r
+list_variables() |> head(10)
+#> # A tibble: 10 × 2
#> variable table
#> <chr> <chr>
#> 1 total_population_universe total_population
@@ -96,8 +88,7 @@ list_variables()
#> 7 snap_received_percent snap
#> 8 household_income_quintile_upper_limit_1 income_quintiles
#> 9 household_income_quintile_upper_limit_2 income_quintiles
-#> 10 household_income_quintile_upper_limit_3 income_quintiles
-#> # ℹ 851 more rows
+#> 10 household_income_quintile_upper_limit_3 income_quintiles
```

## Obtain Data
@@ -113,7 +104,7 @@ df = compile_acs_data(
geography = "county",
states = "NJ")

-glimpse(df)
+glimpse(df) |> head(10)
#> Rows: 21
#> Columns: 175
#> $ data_source_year <dbl> 2024, 2024, 2…
@@ -291,6 +282,26 @@ glimpse(df)
#> $ race_hispanic_twoormore_includingotherrace_percent_M <dbl> 0.0073, 0.003…
#> $ race_hispanic_twoormore_excludingotherrace_percent_M <dbl> 0.0017, 0.001…
#> $ race_personofcolor_percent_M <dbl> 0.0023, 0.001…
+#> # A tibble: 10 × 175
+#> data_source_year GEOID NAME total_population_uni…¹ race_universe
+#> <dbl> <chr> <chr> <dbl> <dbl>
+#> 1 2024 34001 Atlantic County,… 276270 276270
+#> 2 2024 34003 Bergen County, N… 962316 962316
+#> 3 2024 34005 Burlington Count… 467805 467805
+#> 4 2024 34007 Camden County, N… 527257 527257
+#> 5 2024 34009 Cape May County,… 94941 94941
+#> 6 2024 34011 Cumberland Count… 153305 153305
+#> 7 2024 34013 Essex County, Ne… 863002 863002
+#> 8 2024 34015 Gloucester Count… 306954 306954
+#> 9 2024 34017 Hudson County, N… 718323 718323
+#> 10 2024 34019 Hunterdon County… 130160 130160
+#> # ℹ abbreviated name: ¹total_population_universe
+#> # ℹ 170 more variables: race_nonhispanic_allraces <dbl>,
+#> # race_nonhispanic_white_alone <dbl>, race_nonhispanic_black_alone <dbl>,
+#> # race_nonhispanic_aian_alone <dbl>, race_nonhispanic_asian_alone <dbl>,
+#> # race_nonhispanic_nhpi_alone <dbl>, race_nonhispanic_otherrace_alone <dbl>,
+#> # race_nonhispanic_twoormore <dbl>,
+#> # race_nonhispanic_twoormore_includingotherrace <dbl>, …
```
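Reviewer note: the rendered output above shows the full `glimpse()` listing followed by a 10-row tibble. That is consistent with `glimpse()` printing a summary of every column and then returning its input invisibly, so `head(10)` trims the returned data rather than the printed listing. A minimal sketch on a hypothetical data frame (assumes R 4.1+ for the native pipe):

```r
library(dplyr)

# Hypothetical 20-row data frame standing in for the ACS result.
df = data.frame(id = 1:20, value = seq(0.5, 10, by = 0.5))

# glimpse() prints its summary of all columns, then passes df through unchanged;
# head(10) is applied to that returned data frame.
piped = glimpse(df) |> head(10)

nrow(piped)
```

If the goal is a shorter README listing, limiting the columns before calling `glimpse()` (for example with `dplyr::select()`) may be closer to the intent; that is a suggestion, not something this PR does.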

## Visualize Data
@@ -314,8 +325,9 @@ plot_data = df %>%
data_source_year = factor(data_source_year))

state_averages = plot_data %>%
-group_by(data_source_year) %>%
-summarize(mean_pct = mean(race_personofcolor_percent)) %>%
+summarize(
+.by = data_source_year,
+mean_pct = mean(race_personofcolor_percent)) %>%
arrange(data_source_year) %>%
pull(mean_pct)

@@ -334,9 +346,7 @@ dumbbell_data = plot_data %>%
names_from = data_source_year,
values_from = race_personofcolor_percent,
names_prefix = "year_")
-```

-``` r
ggplot() +
geom_segment(
data = dumbbell_data,
@@ -388,7 +398,7 @@ ggplot() +
title = "All NJ Counties Experienced Racial Diversification from 2019 to 2024",
subtitle = paste0("Share of population who are people of color, by county, 2019-2024
Confidence intervals are presented around each point but are extremely small"),
-x = "County",
+x = "",
y = "Share of population who are people of color") +
scale_x_discrete(expand = expansion(mult = c(.03, .04))) +
scale_y_continuous(
@@ -403,8 +413,7 @@ Confidence intervals are presented around each point but are extremely small"),

# Learn More

-A growing number of vignettes aim to support users in effectively using
-this package. These vignettes include:
+Check out the vignettes for additional details:

- A package overview to help users [**Get
Started**](articles/urbnindicators.html).
@@ -421,6 +430,10 @@ this package. These vignettes include:
Error**](articles/quantified-survey-error.html) can improve inference
making.

+- You can re-create your indicators and their measures of error for
+[**Custom Geographies**](articles/custom-geographies.html).
+Neighborhoods? Unincorporated counties? Start here.

# Credits

This package is built on top of and enormously indebted to
Binary file modified man/figures/README-unnamed-chunk-6-1.png
3 changes: 3 additions & 0 deletions vignettes/.gitignore
@@ -1,2 +1,5 @@
*.html
*.R

+/.quarto/
+**/*.quarto_ipynb
14 changes: 8 additions & 6 deletions vignettes/codebook.Rmd
@@ -17,19 +17,21 @@ knitr::opts_chunk$set(

```{r setup, echo = FALSE}
library(urbnindicators)
+library(dplyr)
+library(reactable)

-df = compile_acs_data(year = 2023, geography = "us")
+df = compile_acs_data(year = 2024, geography = "us")
codebook = attr(df, "codebook")

-reactable::reactable(
+reactable(
codebook %>%
-dplyr::rename(
+transmute(
Variable = calculated_variable,
Type = variable_type,
Definition = definition),
columns = list(
-Variable = reactable::colDef(minWidth = 200),
-Type = reactable::colDef(minWidth = 50),
-Type = reactable::colDef(minWidth = 100)),
+Variable = colDef(minWidth = 200),
+Type = colDef(minWidth = 50),
+Definition = colDef(minWidth = 100)),
filterable = TRUE)
```
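Reviewer note: switching from `rename()` to `transmute()` changes which columns reach the table. `rename()` keeps every column, while `transmute()` keeps only the ones it names. A minimal sketch on a hypothetical codebook (the `extra_column` field and all values are invented to make the difference visible):

```r
library(dplyr)

# Hypothetical codebook with one column beyond the three shown in the vignette.
codebook = data.frame(
  calculated_variable = "snap_received_percent",
  variable_type = "Percent",
  definition = "Illustrative definition.",
  extra_column = "illustrative")

# rename(): relabels the three columns and leaves extra_column in place.
renamed = codebook %>%
  rename(
    Variable = calculated_variable,
    Type = variable_type,
    Definition = definition)

# transmute(): returns only the three columns it creates.
transmuted = codebook %>%
  transmute(
    Variable = calculated_variable,
    Type = variable_type,
    Definition = definition)

names(renamed)
names(transmuted)
```

Recent dplyr releases mark `transmute()` as superseded. `select(Variable = calculated_variable, ...)` selects and renames in one step and would be an alternative here if the maintainers prefer it.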