Frequently asked questions
Source:vignettes/articles/frequently-asked-questions.Rmd
frequently-asked-questions.Rmd
How do I learn more about a particular dataset?
Read the online documentation for your survey. Some information is not included in the developer metadata or documentation pages and is only available in PDFs on the Census Bureau website.
How can I see the underlying API call sent to the Census Bureau?
You can see the underlying API call sent to the Census Bureau servers
by setting getCensus(show_call = TRUE)
when running your
code. If your getCensus()
call results in an error, it will
automatically print this underlying API call in your R console. You can
copy and paste this URL in your web browser to view it directly.
What does “There was an error while running your query” mean?
Occasionally you might get the general error message
“There was an error while running your query. We've logged the error and we'll correct it ASAP. Sorry for the inconvenience.”
This comes from the Census Bureau and could be caused by any number of
problems, including server issues. Try rerunning your API call. If that
doesn’t work and you are requesting a large amount of data, try reducing
the amount that you’re requesting. If you’re still having trouble, see
below for ways to get help.
My getCensus()
call worked last year but now it gives
an error. Why?
The Census Bureau makes frequent changes to the APIs. For annual
datasets, like the American Community Survey, variable names and
available geographies can change year to year. Options for timeseries
datasets sometimes change with new releases. Check the Census Bureau’s
online documentation for your dataset and use
listCensusMetadata()
to make sure you’re using the current
syntax.
Are the Census APIs case sensitive?
Yes. Run
listCensusMetadata(type = "variables")
on your dataset to
see what variables are available. If the variable name you want is
uppercase you’ll need to write it uppercase in your
getCensus()
request. Most of the APIs use uppercase, but
some use lowercase and some even use sentence case variable names.
How do I know what geographies are available for my dataset? What is a FIPS code?
Run listCensusMetadata(type = "geographies")
on your
dataset to check which geographies you can use. Each API has its own
list of valid geographies and they occasionally change as the Census
Bureau makes updates.
Most geographies in the Census APIs are specified using FIPS (Federal
Information Processing Standards) codes. For example, Autauga County,
Alabama is assigned state code 01
and county code
001
. Its combined GEOID, which uniquely identifies counties
nationally, is 01001
.
See the Census Bureau FIPS reference for valid codes and the geographic glossary for more information. You can also download more geographic identifying information from the Census gazetteer files, including full GEOID, name, and centroid coordinates.
FIPS codes are characters, not numbers. For example, state-level FIPS
codes are two characters long. region = state:01
will work
but region =``state:1
will usually not.
How do I get data for every state, county, or metro area?
Most Census datasets allow you to use a wildcard
— the
*
symbol — to get data for all of a geography class. For
example, you can get data for every state in the American Community
Survey by using region = state:*
.
state_data <- getCensus(
name = "acs/acs1",
vintage = 2022,
vars = c("NAME", "B19013_001E"),
region = "state:*")
head(state_data)
state | NAME | B19013_001E |
---|---|---|
01 | Alabama | 59674 |
02 | Alaska | 88121 |
04 | Arizona | 74568 |
05 | Arkansas | 55432 |
06 | California | 91551 |
08 | Colorado | 89302 |
Data for some small geographies, like Census tract or block group,
need to be nested using the regionin
argument. Run
listCensusMetadata(type = "geographies")
to see those
options.
Here’s an example getting block group population data within a specific Census tract using the 2020 Decennial Census.
block_group <- getCensus(
name = "dec/dhc",
vintage = 2020,
vars = c("NAME", "P1_001N"),
region = "block group:*",
regionin = "state:36+county:027+tract:220300")
block_group
state | county | tract | block_group | NAME | P1_001N |
---|---|---|---|---|---|
36 | 027 | 220300 | 1 | Block Group 1; Census Tract 2203; Dutchess County; New York | 1467 |
36 | 027 | 220300 | 2 | Block Group 2; Census Tract 2203; Dutchess County; New York | 1394 |
36 | 027 | 220300 | 3 | Block Group 3; Census Tract 2203; Dutchess County; New York | 1192 |
36 | 027 | 220300 | 4 | Block Group 4; Census Tract 2203; Dutchess County; New York | 885 |
I’m still stuck or got an unexpected result. How can I get help?
Use
listCensusMetadata()
to make sure you’re using the right variable names and/or geography names.Join the Census Bureau’s public Slack channel and ask your question in the R or API rooms. Census Bureau staff and other
censusapi
users (and thecensusapi
package developer!) check this Slack regularly. This is the fastest way to get help.Open a Github issue for bugs or issues that you suspect are caused by this R package.
For questions about a specific survey, you can email the contact listed in the dataset metadata found in
listCensusApis()
.
Why is my data -666666666 or another weird value? What is an annotation?
Some Census datasets, including the American Community Survey, use annotated values. These values use numbers or symbols to indicate that the data is unavailable, has been top coded, has an insufficient sample size, or other noteworthy characteristics. Read more from the Census Bureau on ACS annotation meanings and ACS variable types.
The censusapi
package is intended to return the data
as-is so that you can receive those unaltered annotations. If you are
using data for a small geography like Census tract or block group make
sure to check for values like -666666666
or check the
annotation columns for non-empty values to exclude as needed.
As an example, we’ll get the total number of households (B11012_001E) and median household income with associated annotations and margins of error (B19013 group) for three census tracts in Washington, DC.
The value for one tract is available, one is top coded, and one is unavailable. Notice that income is top coded at $250,000 — meaning any tract’s income that is above that threshold is listed as $250,001. You can see a value has a special meaning in the “EA” (estimate annotation) and “MA” (margin of error annotation) columns.
annotations_example <- getCensus(
name = "acs/acs5",
vintage = 2022,
vars = c("B11012_001E", "group(B19013)"),
region = "tract:006804,007703,000903",
regionin = "county:001&state:11")
annotations_example
state | county | tract | B11012_001E | B19013_001E | B19013_001EA | B19013_001M | B19013_001MA | GEO_ID | NAME |
---|---|---|---|---|---|---|---|---|---|
11 | 001 | 000903 | 358 | 250001 | 250,000+ | -333333333 | *** | 1400000US11001000903 | Census Tract 9.03; District of Columbia; District of Columbia |
11 | 001 | 006804 | 18 | -666666666 | - | -222222222 | ** | 1400000US11001006804 | Census Tract 68.04; District of Columbia; District of Columbia |
11 | 001 | 007703 | 2340 | 59369 | NA | 24077 | NA | 1400000US11001007703 | Census Tract 77.03; District of Columbia; District of Columbia |