|
6 | 6 | "description": "Some additional details about the website",
|
7 | 7 | "author": [],
|
8 | 8 | "contents": "\r\n\r\n\r\n\r\n",
|
9 |
| - "last_modified": "2023-06-02T16:14:51-07:00" |
| 9 | + "last_modified": "2023-06-20T15:15:59-07:00" |
10 | 10 | },
|
11 | 11 | {
|
12 | 12 | "path": "functions.html",
|
13 | 13 | "title": "Functions",
|
14 | 14 | "description": "Repeating things? Let's functionalize it!\n",
|
15 | 15 | "author": [],
|
16 | 16 | "contents": "\r\n\r\nContents\r\nWhat’s in a function?\r\nLet’s create a custom function!\r\nCall the function\r\n\r\nBenefits of creating functions\r\n\r\nWhen we see code being repeated more than once, functions are a great way to reduce duplication. Even if we call a function only once, they can be a nice way to break up large complicated processes.\r\nWhat’s in a function?\r\nThe Formals\r\nThe Body\r\nThe Environment\r\nTo define a function here’s the basic skeleton\r\n\r\n\r\nmy_function_name <- function() {\r\n \r\n}\r\n\r\n\r\nLet’s create a custom function!\r\nHere’s a CHAS table. Each csv will look similar to this:\r\n\r\n\r\nfile_01 <- read_csv(here('data', '050', 'Table9.csv'))\r\nhead(file_01, 10)\r\n\r\n# A tibble: 10 × 152\r\n source sumlevel geoid name st cnty T9_est1 T9_est2 T9_est3\r\n <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>\r\n 1 2015thru2… 050 0500… Auta… 01 001 21395 15680 12835\r\n 2 2015thru2… 050 0500… Bald… 01 003 80930 60895 54425\r\n 3 2015thru2… 050 0500… Barb… 01 005 9345 5690 3460\r\n 4 2015thru2… 050 0500… Bibb… 01 007 6890 5130 4330\r\n 5 2015thru2… 050 0500… Blou… 01 009 20845 16425 15090\r\n 6 2015thru2… 050 0500… Bull… 01 011 3520 2505 750\r\n 7 2015thru2… 050 0500… Butl… 01 013 6505 4550 2925\r\n 8 2015thru2… 050 0500… Calh… 01 015 44605 31255 25110\r\n 9 2015thru2… 050 0500… Cham… 01 017 13450 9070 5745\r\n10 2015thru2… 050 0500… Cher… 01 019 10735 8305 7730\r\n# ℹ 143 more variables: T9_est4 <dbl>, T9_est5 <dbl>, T9_est6 <dbl>,\r\n# T9_est7 <dbl>, T9_est8 <dbl>, T9_est9 <dbl>, T9_est10 <dbl>,\r\n# T9_est11 <dbl>, T9_est12 <dbl>, T9_est13 <dbl>, T9_est14 <dbl>,\r\n# T9_est15 <dbl>, T9_est16 <dbl>, T9_est17 <dbl>, T9_est18 <dbl>,\r\n# T9_est19 <dbl>, T9_est20 <dbl>, T9_est21 <dbl>, T9_est22 <dbl>,\r\n# T9_est23 <dbl>, T9_est24 <dbl>, T9_est25 <dbl>, T9_est26 <dbl>,\r\n# T9_est27 <dbl>, T9_est28 <dbl>, T9_est29 <dbl>, T9_est30 <dbl>, …\r\n\r\nSuppose we’d like to do some cleaning to each CHAS table in the same manner. Let’s create one that does the following:\r\nfilter for WA state and PSRC counties\r\npivot longer (so columns that start with ‘T’ are not across the table)\r\ncreate 3 more columns that dissect the column containing the former ‘T…’ headers:\r\ncreate ‘table’ field extracting T and the numbers before the underscore\r\ncreate a ‘type’ field to identify whether values are ‘est’ or ‘moe’\r\ncreate a ‘sort’ field extracting the numeric digits at the end\r\n\r\n\r\n\r\n# define the skeleton of our function\r\n# add table as a parameter\r\nclean_table <- function(table) {\r\n \r\n # fill it in!\r\n \r\n}\r\n\r\n\r\nFill in the body with the argument to clean\r\n\r\n\r\nclean_table <- function(table) {\r\n table %>% \r\n filter(st == 53 & cnty %in% c('033', '035', '053', '061')) %>% \r\n pivot_longer(cols = str_subset(colnames(table), \"^T.*\"), \r\n names_to = 'header', \r\n values_to = 'value') %>% \r\n mutate(table = str_extract(header, \"^T\\\\d*(?=_)\"), \r\n type = str_extract(header, \"(?<=_)\\\\w{3}\"), \r\n sort = str_extract(header, \"\\\\d+$\")) \r\n}\r\n\r\n# Regex used:\r\n# table: \"^T\\\\d*(?=_)\" string starting with T and numeric digits followed by _\r\n# type: \"(?<=_)\\\\w{3}\" 3 letters preceded by _\r\n# sort: \"\\\\d+$\" last numeric digits at the end of the string\r\n\r\n\r\n\r\nFunctions will generally return the last evaluated expression. With the piping (%>%) in dplyr, our example is essentially a one liner expression. 
You can always add return(<name of object>) to explicitly return a specific object whenever your function is called.\r\nCall the function\r\n\r\n\r\nt9 <- clean_table(file_01)\r\n\r\n\r\nTry with other files\r\n\r\n\r\nfile_02 <- read_csv(here('data', '050', 'Table10.csv'))\r\nfile_03 <- read_csv(here('data', '050', 'Table11.csv'))\r\n\r\nt10 <- clean_table(file_02)\r\nt11 <- clean_table(file_03)\r\n\r\n\r\nIf we forgot a step in the cleaning process, we can always edit the function and re-run our script\r\n\r\n\r\n# Let's make this edit to our function that will convert the sort column from string to numeric\r\nsort = as.numeric(str_extract(header, \"\\\\d+$\"))\r\n\r\n\r\nBenefits of creating functions\r\nEasier editing of code\r\nReduce redundancy\r\nBreak long processes into chunks\r\n\r\n\r\n\r\n",
|
17 |
| - "last_modified": "2023-06-02T16:14:55-07:00" |
| 17 | + "last_modified": "2023-06-20T15:16:02-07:00" |
18 | 18 | },
|
19 | 19 | {
|
20 | 20 | "path": "index.html",
|
21 | 21 | "title": "Code Organization",
|
22 | 22 | "description": "Functions, Lists, and Loops\n",
|
23 | 23 | "author": [],
|
24 | 24 | "contents": "\r\nIt’s never too late to start organizing code! It’s a step towards cleaner scripts which not only benefit the machines that execute it but the humans who are reading and understanding the logic. We’ll go over three essentials of programming (in R) that can help us remove redundant code and efficiently process and store multiple objects. With these tools: functions, lists, and loops, your scripts can become more concise and easier to navigate.\r\n\r\n\r\n\r\n\r\n\r\nShortcut Keys\r\nf: full-screen\r\nEsc: exit full-screen\r\no: tile view\r\nleft or right arrow: advance slide\r\n\r\n\r\n\r\n",
|
25 |
| - "last_modified": "2023-06-02T16:14:59-07:00" |
| 25 | + "last_modified": "2023-06-20T15:16:03-07:00" |
26 | 26 | },
|
27 | 27 | {
|
28 | 28 | "path": "lists.html",
|
29 | 29 | "title": "Lists",
|
30 | 30 | "description": "Lists are our friend!\n",
|
31 | 31 | "author": [],
|
32 | 32 | "contents": "\r\nLists can not only store a mix of data types, but also more complex data (e.g. data frames, even lists themselves!). It’s an ideal option for a group of similar complex data, and decently sets us up for iteration! So in preparation for for loops, let’s examine lists and how to extract data from them.\r\nCreate a list\r\n\r\n\r\n# an empty list\r\nl <- list() \r\n\r\n# populated with objects\r\nl <- list(file_01, file_02, file_03) \r\n\r\n\r\nAnatomy\r\nThere’s several layers to a list, like a container within a container. To extract a specific element’s data, use double brackets instead one.\r\n\r\n\r\nl # the whole list and all its elements\r\n\r\nl[1] # the first element in its container; contains the name/index of element and the data\r\n\r\nl[[1]] # the data of the first element\r\n\r\n\r\nHadley Wickham’s pepper analogy\r\n\r\nNames\r\nLike vectors, lists can also be named\r\n\r\n\r\nnames(l) <- c('table9', 'table10', 'table11')\r\n\r\n\r\nNow you can also access the data of a specific element by name using $ or [[]]\r\n\r\n\r\nl$table9\r\n\r\nl[['table9']]\r\n\r\n# saving processes back into the list\r\nl$table9 <- l$table9 %>% filter(cnty == '033')\r\n\r\n\r\nNow if you extract an element with only one [] like l[1], you can see that it’s a container holding the name of the element and the data itself.\r\n\r\n\r\n\r\n\r\n",
|
33 |
| - "last_modified": "2023-06-02T16:15:00-07:00" |
| 33 | + "last_modified": "2023-06-20T15:16:04-07:00" |
34 | 34 | },
|
35 | 35 | {
|
36 | 36 | "path": "loops.html",
|
37 | 37 | "title": "Loops",
|
38 | 38 | "description": "Over and Over again...\n",
|
39 | 39 | "author": [],
|
40 | 40 | "contents": "\r\n\r\nContents\r\nAnatomy\r\nSequence\r\nOutput\r\n\r\nPut it all together\r\nSequencing alternatives\r\n\r\nAlter the Flow\r\nBreak\r\nNext\r\n\r\n\r\nWe could explicitly call the function as many times need be to clean the tables we’re interested in.\r\n\r\n\r\ntable9 <- clean_table(file_01)\r\n\r\ntable10 <- clean_table(file_02)\r\n\r\ntable11 <- clean_table(file_03)\r\n\r\n\r\nBut that would be redundant! Imagine if we were reading in 15 of those csvs! There would be extra lines of code and as many extra variable names in your global environment to keep track of. Let’s see how loops paired with lists can help us!\r\nAnatomy\r\nA for loop has three parts:\r\nThe Output: where the stuff will be stored\r\nThe Sequence: code within () that shows what to loop over\r\nThe Body: code within {} that does the work\r\nSequence\r\nThe sequence lies within the (). It will follow this structure:\r\n([variable name of your choice] in [list or vector])\r\nThe sequence tells the machine what to loop over. The variable name of your choice will represent a single element within the list or vector in a loop.\r\nIn the example below, with every iteration, df will be a counter and represent a different data frame in l\r\n\r\n\r\nl <- list(file_01, file_02, file_03)\r\n\r\n# for every data frame (df) in list (l)...\r\nfor(df in l) {\r\n \r\n}\r\n\r\n\r\nTry printing the head() of each data frame in our list\r\n\r\n\r\nl <- list(file_01, file_02, file_03) \r\n\r\nfor(df in l) {\r\n print(head(df))\r\n}\r\n\r\n\r\nTry printing a version of each data frame with clean_table()\r\n\r\n\r\nfor(df in l) {\r\n print(clean_table(df))\r\n}\r\n\r\n\r\nOutput\r\nNow instead of printing stuff, let’s store stuff in a list!\r\nA way to use both lists and loops is to read data. Let’s create a loop to read-in csvs 1 through 11 and store them into a list.\r\nInitiate the Output (dfs) to store the end result (data frames from csvs)\r\nConstruct the sequence you’ll be looping over (csv) (file names of csvs)\r\nRead csv (t)\r\n\r\n\r\ndfs <- list()\r\n\r\ncsv <- paste0('Table', 1:11, '.csv')\r\n\r\nfor(c in csv) {\r\n t <- read_csv(here('data', '050', c))\r\n dfs[[c]] <- t\r\n}\r\n\r\n\r\nRename the elements in the list\r\n\r\n\r\nnames(dfs) <- paste0('Table', 1:11)\r\n\r\n\r\nPut it all together\r\nEdit the loop so that we clean the tables as we’re reading in the csvs.\r\n\r\n\r\nfor(c in csv) {\r\n t <- read_csv(here('data', '050', c))\r\n ct <- clean_table(t)\r\n dfs[[c]] <- ct\r\n}\r\n\r\n\r\nWith loops and a list to store the output, we’ve removed code redundancy. Instead of calling clean_table() for every table in our list, it just required editing a couple lines within the loop to make that adjustment.\r\nSequencing alternatives\r\n\r\n\r\nfor(df in 1:length(l)) {\r\n \r\n}\r\n\r\n\r\nAlter the Flow\r\nBreak\r\nNext\r\n\r\n\r\n\r\n",
|
41 |
| - "last_modified": "2023-06-02T16:15:01-07:00" |
| 41 | + "last_modified": "2023-06-20T15:16:05-07:00" |
42 | 42 | },
|
43 | 43 | {
|
44 | 44 | "path": "prework.html",
|
45 | 45 | "title": "Pre-work",
|
46 | 46 | "author": [],
|
47 |
| - "contents": "\r\n\r\nContents\r\nSession Files on Local Drive\r\nOpen .Rproj\r\n\r\nInstall\r\nTest\r\n\r\nWe’ll be going over Functions, Lists, and Loops in this session. Some familiarity with using R and opening RStudio is helpful if you are actively participating, otherwise all are welcome to watch and learn.\r\nR and the RStudio IDE are required. See the first module on R Basics for guidance.\r\nSession Files on Local Drive\r\nClone the repo https://github.com/psrc/intro-code-org onto your local drive.\r\nDownload the CHAS zip file of the 2015-2019 ACS 5-year average data for Census places from here.\r\nExtract the zipfile in the data sub-directory of the cloned repo.\r\nAll data files will be found in data/050.\r\n\r\nMake sure that the repo and other files for the session are on your local drive. If they are on PSRC’s network, you may experience extreme sluggishness when using a .Rproj file.\r\nOpen .Rproj\r\nIn the RStudio IDE, open the .Rproj file. Project files will automatically set our working directory–in this case, the root of the directory. No need for setwd() and dealing with file paths!\r\n\r\nAfter opening the .Rproj file, you’ll see some changes to your IDE. Your console and Files pane will reflect the new working directory, and the project name is listed in the top right corner.\r\n\r\nTo close out of the project, click the project name at the top right corner and select Close Project. On the day of the session, you can access the dropdown in that area of the IDE and select ``.\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\nInstall\r\nIf the following packages have not been installed, install the following by running the following code snippet in the console of your RStudio IDE. Ignore any warnings regarding Rtools and if you are asked to install from sources which needs compilation, click ‘No’.\r\n\r\nSome may have already been installed from previous modules. You can adapt the code snippet accordingly.\r\n\r\n\r\ninstall.packages(c('tidyverse', 'here'))\r\n\r\n\r\nTest\r\nTest to make sure you can read csvs from the CHAS dataset that was downloaded. Adjust the line below according to how you’ve stored your files in the data sub-directory.\r\n\r\n\r\nfile_01 <- read_csv(here('data', '050', 'Table9.csv'))\r\n\r\n\r\n\r\n\r\n\r\n", |
48 |
| - "last_modified": "2023-06-02T16:15:02-07:00" |
| 47 | + "contents": "\r\n\r\nContents\r\nSession Files on Local Drive\r\nOpen .Rproj\r\n\r\nInstall\r\nTest\r\n\r\nWe’ll be going over Functions, Lists, and Loops in this session. Some familiarity with using R and opening RStudio is helpful if you are actively participating, otherwise all are welcome to watch and learn.\r\nR and the RStudio IDE are required. See the first module on R Basics for guidance.\r\nSession Files on Local Drive\r\nClone the repo https://github.com/psrc/intro-code-org onto your local drive.\r\nDownload the CHAS zip file of the 2015-2019 ACS 5-year average data for Census counties from here.\r\nExtract the zipfile in the data sub-directory of the cloned repo.\r\nAll data files will be found in data/050.\r\n\r\nMake sure that the repo and other files for the session are on your local drive. If they are on PSRC’s network, you may experience extreme sluggishness when using a .Rproj file.\r\nOpen .Rproj\r\nIn the RStudio IDE, open the .Rproj file in the repo. Project files will automatically set our working directory–in this case, the root of the directory. No need for setwd() and dealing with file paths!\r\nAfter opening the .Rproj file, you’ll see some changes to your IDE. Your console and Files pane will reflect the new working directory, and the project name is listed in the top right corner.\r\nTo close out of the project, click the project name at the top right corner and select Close Project. On the day of the session, you can access the dropdown in that area of the IDE and select intro-code-org.\r\nInstall\r\nIf the following packages have not been installed, install the following by running the following code snippet in the console of your RStudio IDE. Ignore any warnings regarding Rtools and if you are asked to install from sources which needs compilation, click ‘No’.\r\n\r\nSome may have already been installed from previous modules. You can adapt the code snippet accordingly.\r\n\r\n\r\ninstall.packages(c('tidyverse', 'here'))\r\n\r\n\r\nTest\r\nTest to make sure you can read csvs from the CHAS dataset that was downloaded. Adjust the line below according to how you’ve stored your files in the data sub-directory.\r\n\r\n\r\nfile_01 <- read_csv(here('data', '050', 'Table9.csv'))\r\n\r\n\r\n\r\n\r\n\r\n", |
| 48 | + "last_modified": "2023-06-20T15:16:05-07:00" |
49 | 49 | }
|
50 | 50 | ],
|
51 | 51 | "collections": []
|
|