Preface |
|
xiii | |
|
1 The US Census and the R programming language |
|
|
1 | (16) |
|
1.1 Census data: an overview |
|
|
1 | (1) |
|
|
2 | (1) |
|
1.3 How to find US Census data |
|
|
3 | (5) |
|
1.3.1 Data downloads from the Census Bureau |
|
|
5 | (1) |
|
|
5 | (2) |
|
1.3.3 Third-party data distributors |
|
|
7 | (1) |
|
|
8 | (3) |
|
1.4.1 Getting started with R |
|
|
8 | (1) |
|
1.4.2 Basic data structures in R |
|
|
8 | (1) |
|
1.4.3 Functions and packages |
|
|
9 | (1) |
|
1.4.4 Package ecosystems in R |
|
|
10 | (1) |
|
1.5 Analyses using R and US Census data |
|
|
11 | (6) |
|
1.5.1 Census data packages in R: a brief summary |
|
|
11 | (1) |
|
1.5.2 Health resource access |
|
|
12 | (1) |
|
1.5.3 COVID-19 and pandemic response |
|
|
12 | (1) |
|
1.5.4 Politics and gerrymandering |
|
|
12 | (3) |
|
1.5.5 Social equity research |
|
|
15 | (1) |
|
1.5.6 Census data visualization |
|
|
15 | (2) |
|
2 An introduction to tidycensus |
|
|
17 | (22) |
|
2.1 Getting started with tidycensus |
|
|
17 | (5) |
|
|
18 | (2) |
|
2.1.2 American Community Survey |
|
|
20 | (2) |
|
2.2 Geography and variables in tidycensus |
|
|
22 | (5) |
|
|
24 | (3) |
|
2.3 Searching for variables in tidycensus |
|
|
27 | (2) |
|
2.4 Data structure in tidycensus |
|
|
29 | (4) |
|
2.4.1 Understanding GEOIDs |
|
|
30 | (2) |
|
2.4.2 Renaming variable IDs |
|
|
32 | (1) |
|
2.5 Other Census Bureau datasets in tidycensus |
|
|
33 | (3) |
|
2.5.1 Using get_estimates() |
|
|
33 | (2) |
|
|
35 | (1) |
|
2.6 Debugging tidycensus errors |
|
|
36 | (1) |
|
|
37 | (2) |
|
3 Wrangling Census data with tidyverse tools |
|
|
39 | (22) |
|
|
39 | (1) |
|
3.2 Exploring Census data with tidyverse tools |
|
|
40 | (5) |
|
3.2.1 Sorting and filtering data |
|
|
40 | (3) |
|
3.2.2 Using summary variables and calculating new columns |
|
|
43 | (2) |
|
3.3 Group-wise Census data analysis |
|
|
45 | (4) |
|
3.3.1 Making group-wise comparisons |
|
|
46 | (1) |
|
3.3.2 Tabulating new groups |
|
|
47 | (2) |
|
3.4 Comparing ACS estimates over time |
|
|
49 | (7) |
|
3.4.1 Time-series analysis: some cautions |
|
|
50 | (2) |
|
3.4.2 Preparing time-series ACS estimates |
|
|
52 | (4) |
|
3.5 Handling margins of error in the American Community Survey with tidycensus |
|
|
56 | (4) |
|
3.5.1 Calculating derived margins of error in tidycensus |
|
|
57 | (2) |
|
3.5.2 Calculating group-wise margins of error |
|
|
59 | (1) |
|
|
60 | (1) |
|
4 Exploring US Census data with visualization |
|
|
61 | (32) |
|
4.1 Basic Census visualization with ggplot2 |
|
|
61 | (5) |
|
4.1.1 Getting started with ggplot2 |
|
|
62 | (2) |
|
4.1.2 Visualizing multivariate relationships with scatter plots |
|
|
64 | (2) |
|
4.2 Customizing ggplot2 visualizations |
|
|
66 | (6) |
|
4.2.1 Improving plot legibility |
|
|
67 | (2) |
|
4.2.2 Custom styling of ggplot2 charts |
|
|
69 | (2) |
|
4.2.3 Exporting data visualizations from R |
|
|
71 | (1) |
|
4.3 Visualizing margins of error |
|
|
72 | (4) |
|
|
72 | (2) |
|
4.3.2 Using error bars for margins of error |
|
|
74 | (2) |
|
4.4 Visualizing ACS estimates over time |
|
|
76 | (2) |
|
4.5 Exploring age and sex structure with population pyramids |
|
|
78 | (4) |
|
4.5.1 Preparing data from the Population Estimates API |
|
|
78 | (2) |
|
4.5.2 Designing and styling the population pyramid |
|
|
80 | (2) |
|
4.6 Visualizing group-wise comparisons |
|
|
82 | (4) |
|
4.7 Advanced visualization with ggplot2 extensions |
|
|
86 | (6) |
|
|
86 | (1) |
|
|
87 | (1) |
|
|
88 | (3) |
|
4.7.4 Interactive visualization with plotly |
|
|
91 | (1) |
|
4.8 Learning more about visualization |
|
|
92 | (1) |
|
|
92 | (1) |
|
5 Census geographic data and applications in R |
|
|
93 | (30) |
|
5.1 Basic usage of tigris |
|
|
93 | (8) |
|
5.1.1 Understanding tigris and simple features |
|
|
97 | (3) |
|
5.1.2 Data availability in tigris |
|
|
100 | (1) |
|
5.2 Plotting geographic data |
|
|
101 | (4) |
|
5.2.1 ggplot2 and geom_sf() |
|
|
101 | (2) |
|
5.2.2 Interactive viewing with mapview |
|
|
103 | (2) |
|
|
105 | (4) |
|
5.3.1 TIGER/Line and cartographic boundary shapefiles |
|
|
105 | (1) |
|
5.3.2 Caching tigris data |
|
|
106 | (1) |
|
5.3.3 Understanding yearly differences in TIGER/Line files |
|
|
107 | (1) |
|
5.3.4 Combining tigris datasets |
|
|
108 | (1) |
|
5.4 Coordinate reference systems |
|
|
109 | (6) |
|
5.4.1 Using the crsuggest package |
|
|
110 | (3) |
|
5.4.2 Plotting with coord_sf() |
|
|
113 | (2) |
|
5.5 Working with geometries |
|
|
115 | (7) |
|
5.5.1 Shifting and rescaling geometry for national US mapping |
|
|
115 | (2) |
|
5.5.2 Converting polygons to points |
|
|
117 | (2) |
|
5.5.3 Exploding multipolygon geometries to single parts |
|
|
119 | (3) |
|
|
122 | (1) |
|
6 Mapping Census data with R |
|
|
123 | (44) |
|
6.1 Using geometry in tidycensus |
|
|
123 | (3) |
|
6.1.1 Basic mapping of sf objects with plot() |
|
|
125 | (1) |
|
6.2 Map-making with ggplot2 and geom_sf |
|
|
126 | (2) |
|
|
126 | (1) |
|
6.2.2 Customizing ggplot2 maps |
|
|
127 | (1) |
|
|
128 | (13) |
|
6.3.1 Choropleth maps with tmap |
|
|
129 | (4) |
|
6.3.2 Adding reference elements to a map |
|
|
133 | (3) |
|
6.3.3 Choosing a color palette |
|
|
136 | (1) |
|
6.3.4 Alternative map types with tmap |
|
|
137 | (4) |
|
6.4 Cartographic workflows with non-Census data |
|
|
141 | (6) |
|
6.4.1 National election mapping with tigris shapes |
|
|
142 | (1) |
|
6.4.2 Understanding and working with ZCTAs |
|
|
143 | (4) |
|
|
147 | (7) |
|
6.5.1 Interactive mapping with Leaflet |
|
|
147 | (4) |
|
6.5.2 Alternative approaches to interactive mapping |
|
|
151 | (3) |
|
|
154 | (7) |
|
6.6.1 Mapping migration flows |
|
|
155 | (1) |
|
6.6.2 Linking maps and charts |
|
|
156 | (2) |
|
6.6.3 Reactive mapping with Shiny |
|
|
158 | (3) |
|
6.7 Working with software outside of R for cartographic projects |
|
|
161 | (4) |
|
6.7.1 Exporting maps from R |
|
|
162 | (1) |
|
6.7.2 Interoperability with other visualization software |
|
|
163 | (2) |
|
|
165 | (2) |
|
7 Spatial analysis with US Census data |
|
|
167 | (46) |
|
|
167 | (4) |
|
7.1.1 Note: aligning coordinate reference systems |
|
|
168 | (1) |
|
7.1.2 Identifying geometries within a metropolitan area |
|
|
169 | (1) |
|
7.1.3 Spatial subsets and spatial predicates |
|
|
170 | (1) |
|
|
171 | (9) |
|
7.2.1 Point-in-polygon spatial joins |
|
|
172 | (4) |
|
7.2.2 Spatial joins and group-wise spatial analysis |
|
|
176 | (4) |
|
7.3 Small area time-series analysis |
|
|
180 | (7) |
|
7.3.1 Area-weighted areal interpolation |
|
|
182 | (1) |
|
7.3.2 Population-weighted areal interpolation |
|
|
183 | (2) |
|
7.3.3 Making small-area comparisons |
|
|
185 | (2) |
|
7.4 Distance and proximity analysis |
|
|
187 | (9) |
|
7.4.1 Calculating distances |
|
|
188 | (2) |
|
7.4.2 Calculating travel times |
|
|
190 | (2) |
|
7.4.3 Catchment areas with buffers and isochrones |
|
|
192 | (2) |
|
7.4.4 Computing demographic estimates for zones with areal interpolation |
|
|
194 | (2) |
|
7.5 Better cartography with spatial overlay |
|
|
196 | (2) |
|
7.5.1 "Erasing" areas from Census polygons |
|
|
196 | (2) |
|
7.6 Spatial neighborhoods and spatial weights matrices |
|
|
198 | (4) |
|
7.6.1 Understanding spatial neighborhoods |
|
|
199 | (2) |
|
7.6.2 Generating the spatial weights matrix |
|
|
201 | (1) |
|
7.7 Global and local spatial autocorrelation |
|
|
202 | (9) |
|
7.7.1 Spatial lags and Moran's I |
|
|
203 | (1) |
|
7.7.2 Local spatial autocorrelation |
|
|
204 | (2) |
|
7.7.3 Identifying clusters and spatial outliers with local indicators of spatial association (LISA) |
|
|
206 | (5) |
|
|
211 | (2) |
|
8 Modeling US Census data |
|
|
213 | (42) |
|
8.1 Indices of segregation and diversity |
|
|
213 | (7) |
|
8.1.1 Data setup with spatial analysis |
|
|
213 | (2) |
|
8.1.2 The Dissimilarity Index |
|
|
215 | (2) |
|
8.1.3 Multi-group segregation indices |
|
|
217 | (1) |
|
8.1.4 Visualizing the diversity gradient |
|
|
218 | (2) |
|
8.2 Regression modeling with US Census data |
|
|
220 | (14) |
|
8.2.1 Data setup and exploratory data analysis |
|
|
222 | (1) |
|
8.2.2 Inspecting the outcome variable with visualization |
|
|
223 | (2) |
|
8.2.3 "Feature engineering" |
|
|
225 | (1) |
|
8.2.4 A first regression model |
|
|
225 | (5) |
|
8.2.5 Dimension reduction with principal components analysis |
|
|
230 | (4) |
|
|
234 | (7) |
|
8.3.1 Methods for spatial regression |
|
|
236 | (3) |
|
8.3.2 Choosing between spatial lag and spatial error models |
|
|
239 | (2) |
|
8.4 Geographically weighted regression |
|
|
241 | (6) |
|
8.4.1 Choosing a bandwidth for GWR |
|
|
242 | (1) |
|
8.4.2 Fitting and evaluating the GWR model |
|
|
243 | (3) |
|
|
246 | (1) |
|
8.5 Classification and clustering of ACS data |
|
|
247 | (6) |
|
8.5.1 Geodemographic classification |
|
|
248 | (2) |
|
8.5.2 Spatial clustering & regionalization |
|
|
250 | (3) |
|
|
253 | (2) |
|
9 Introduction to Census microdata |
|
|
255 | (14) |
|
|
255 | (2) |
|
9.1.1 Microdata resources: IPUMS |
|
|
256 | (1) |
|
9.1.2 Microdata and the Census API |
|
|
256 | (1) |
|
9.2 Using microdata in tidycensus |
|
|
257 | (3) |
|
9.2.1 Basic usage of get_pums() |
|
|
257 | (1) |
|
9.2.2 Understanding default data from get_pums() |
|
|
258 | (2) |
|
9.3 Working with PUMS variables |
|
|
260 | (3) |
|
9.3.1 Variables available in the ACS PUMS |
|
|
261 | (1) |
|
9.3.2 Recoding PUMS variables |
|
|
261 | (1) |
|
9.3.3 Using variables filters |
|
|
262 | (1) |
|
9.4 Public Use Microdata Areas (PUMAs) |
|
|
263 | (4) |
|
|
263 | (2) |
|
9.4.2 Working with PUMAs in PUMS data |
|
|
265 | (2) |
|
|
267 | (2) |
|
10 Analyzing Census microdata |
|
|
269 | (16) |
|
10.1 PUMS data and the tidyverse |
|
|
269 | (5) |
|
10.1.1 Basic tabulation of weights with tidyverse tools |
|
|
269 | (3) |
|
10.1.2 Group-wise data tabulation |
|
|
272 | (2) |
|
|
274 | (1) |
|
10.3 Survey design and the ACS PUMS |
|
|
275 | (5) |
|
10.3.1 Getting replicate weights |
|
|
275 | (2) |
|
10.3.2 Creating a survey object |
|
|
277 | (1) |
|
10.3.3 Calculating estimates and errors with srvyr |
|
|
278 | (1) |
|
10.3.4 Converting standard errors to margins of error |
|
|
279 | (1) |
|
10.4 Modeling with PUMS data |
|
|
280 | (4) |
|
|
281 | (1) |
|
10.4.2 Fitting and evaluating the model |
|
|
282 | (2) |
|
|
284 | (1) |
|
11 Other Census and government data resources |
|
|
285 | (32) |
|
11.1 Mapping historical geographies of New York City with NHGIS |
|
|
285 | (7) |
|
11.1.1 Getting started with NHGIS |
|
|
286 | (1) |
|
11.1.2 Working with NHGIS data in R |
|
|
287 | (2) |
|
11.1.3 Mapping NHGIS data in R |
|
|
289 | (3) |
|
11.2 Analyzing complete-count historical microdata with IPUMS and R |
|
|
292 | (10) |
|
11.2.1 Getting microdata from IPUMS |
|
|
294 | (2) |
|
11.2.2 Loading microdata into a database |
|
|
296 | (1) |
|
11.2.3 Accessing your microdata database with R |
|
|
297 | (3) |
|
11.2.4 Analyzing big Census microdata in R |
|
|
300 | (2) |
|
11.3 Other US government datasets |
|
|
302 | (10) |
|
11.3.1 Accessing Census data resources with censusapi |
|
|
302 | (4) |
|
11.3.2 Analyzing labor markets with lehdr |
|
|
306 | (2) |
|
11.3.3 Bureau of Labor Statistics data with blscrapeR |
|
|
308 | (2) |
|
11.3.4 Working with agricultural data with tidyUSDA |
|
|
310 | (2) |
|
11.4 Getting government data without R packages |
|
|
312 | (3) |
|
11.4.1 Making requests to APIs with httr |
|
|
312 | (1) |
|
11.4.2 Writing your own data access functions |
|
|
313 | (2) |
|
|
315 | (2) |
|
12 Working with Census data outside the United States |
|
|
317 | (26) |
|
12.1 The International Data Base and the idbr R package |
|
|
317 | (8) |
|
12.1.1 Visualizing IDB data |
|
|
319 | (3) |
|
12.1.2 Interactive and animated visualization of global demographic data |
|
|
322 | (3) |
|
12.2 Country-specific Census data packages |
|
|
325 | (16) |
|
|
325 | (2) |
|
12.2.2 Kenya: rKenyaCensus |
|
|
327 | (4) |
|
12.2.3 Mexico: combining mxmaps and inegiR |
|
|
331 | (2) |
|
12.2.4 Brazil: aligning the geobr R package with raw Census data files for spatial analysis |
|
|
333 | (8) |
|
12.3 Other international data resources |
|
|
341 | (1) |
|
|
341 | (2) |
Conclusion |
|
343 | (2) |
Bibliography |
|
345 | (8) |
Index |
|
353 | |