nhanesdata

The National Health and Nutrition Examination Survey (NHANES) is one of the most comprehensive public health datasets available, spanning over two decades of U.S. health data. But working with it has been frustrating. If you've tried using NHANES before, you've likely hit two major problems: (1) CDC server reliability issues that break reproducible research, and (2) cycle suffix confusion, where tracking down DEMO, DEMO_B, DEMO_C, and so on through DEMO_L turns data discovery into a scavenger hunt.

nhanesdata solves both problems. All datasets are hosted on reliable cloud storage with fast access, and all survey cycles are already merged. Just use read_nhanes("demo") and you get demographics data from 1999-2023 with a year column tracking which cycle each observation belongs to. No CDC server timeouts, no suffix confusion.

All processed datasets are publicly available at https://nhanes.kylegrealis.com/ with no authentication required.

Acknowledgments

This package builds on the nhanesA package, which provides the foundation for accessing NHANES data through R.

Installation

# From CRAN (submitted for approval Feb. 18, 2026)
install.packages("nhanesdata")

# Development version from GitHub
pak::pak("kyleGrealis/nhanesdata")

Quick Start

library(nhanesdata)

# Load any dataset (case-insensitive)
demo   <- read_nhanes("demo")    # Demographics
bpx    <- read_nhanes("BPX")     # Blood pressure
trigly <- read_nhanes("TRIGLY")  # Triglycerides

# Search for variables
term_search("diabetes") # By keyword
var_search("RIDAGEYR")  # By variable name

# Get CDC documentation
get_url("DEMO_J")

All datasets include a year column (survey cycle start year) and seqn (participant ID). Join datasets on both columns:

library(dplyr)

analysis <- read_nhanes("demo") |>
  inner_join(read_nhanes("bpx"), by = c("seqn", "year"))
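
To make the effect of the two-column join concrete, here is a minimal sketch on invented toy data. The data frames stand in for what read_nhanes() returns; the age_years and systolic columns are hypothetical placeholders, not actual NHANES variable names. Base R merge() is used so the example runs without extra packages; it matches rows the same way as the inner_join() call above.

```r
# Toy stand-ins for read_nhanes("demo") and read_nhanes("bpx").
# All values and the non-key column names are invented for illustration.
demo_toy <- data.frame(
  seqn      = c(1L, 2L, 3L),
  year      = c(1999L, 1999L, 2017L),
  age_years = c(34, 51, 47)
)
bpx_toy <- data.frame(
  seqn     = c(1L, 3L, 4L),
  year     = c(1999L, 2017L, 2017L),
  systolic = c(118, 132, 125)
)

# Keep only participants present in both tables for the same cycle.
joined <- merge(demo_toy, bpx_toy, by = c("seqn", "year"))

# The year column then makes it easy to restrict to specific cycles.
recent <- joined[joined$year >= 2017, ]

print(joined)
print(recent)
```

Participants 2 and 4 drop out because each appears in only one of the two toy tables.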

Functions

| Function | Purpose |
|----------|---------|
| `read_nhanes()` | Load a pre-merged NHANES dataset from cloud storage |
| `create_design()` | Create survey design objects with proper weighting for multiple cycles |
| `term_search()` | Search variables by keyword or phrase |
| `var_search()` | Search variables by exact name |
| `get_url()` | Get CDC codebook URL for a specific table |

All functions accept dataset and variable names case-insensitively.
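
This README does not show the arguments of create_design(), so the sketch below does not call it. Instead it illustrates, on invented toy data, the kind of manual setup that such a helper is meant to spare you: pooling two 2-year cycles by dividing the 2-year exam weight by the number of cycles (the usual CDC guidance for cycles from 2001-2002 onward), then handing strata, PSU, and weights to survey::svydesign(). The lowercase column names (sdmvstra, sdmvpsu, wtmec2yr) and the sbp outcome are assumptions made for illustration.

```r
# Toy stand-in for merged NHANES rows from two 2-year cycles (invented values).
toy <- data.frame(
  year     = rep(c(2015L, 2017L), each = 4),
  sdmvstra = rep(c(1L, 1L, 2L, 2L), times = 2),  # masked variance stratum
  sdmvpsu  = rep(c(1L, 2L), times = 4),          # masked variance PSU
  wtmec2yr = c(1000, 1200, 900, 1100, 1500, 1300, 800, 1000),
  sbp      = c(118, 126, 131, 122, 115, 140, 128, 119)  # hypothetical outcome
)

# Divide each 2-year weight by the number of cycles being pooled.
n_cycles <- length(unique(toy$year))
toy$wt_pooled <- toy$wtmec2yr / n_cycles

# Build the design only if the survey package is installed.
if (requireNamespace("survey", quietly = TRUE)) {
  design <- survey::svydesign(
    ids     = ~sdmvpsu,
    strata  = ~sdmvstra,
    weights = ~wt_pooled,
    nest    = TRUE,   # PSU ids repeat across strata
    data    = toy
  )
  print(survey::svymean(~sbp, design))
}
```

Check the CDC analytic guidelines for the cycles you combine before relying on this rule; the 1999-2002 cycles, for example, have their own 4-year weights.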

Available Datasets

All standard NHANES datasets are included, except:

Surplus samples (require special access)
Pooled samples (different analysis requirements)
Special samples (limited availability)
2019-2020 cycle data (COVID-19 disruption)

Categories include:

Questionnaire/Interview: Demographics, health conditions, lifestyle factors, dietary data
Examination: Physical measurements, body composition, cardiovascular fitness
Laboratory: Biomarkers, environmental chemicals, infectious disease serology, nutritional status
Dietary: Dietary recall, supplement use, food frequency questionnaires

See the dataset catalog for the complete list, or browse inst/extdata/datasets.yml in the source.

Important Notes

The 2019-2020 survey cycle (suffix K) is excluded due to COVID-19 data collection disruptions. See vignette("covid-data-exclusion") for details.
Variable names match CDC documentation. Always verify definitions with get_url() since variable usage may differ across cycles.
Data types are automatically harmonized across cycles, so a column stored as integer in one cycle and double in another (or factor vs. character) comes back with a single consistent type.
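
To show what type harmonization means in practice, here is a minimal base R sketch on invented toy data. It is not the package's implementation; the harmonize() helper and the score and group columns are hypothetical. It only demonstrates the general idea of coercing mismatched columns to a common type before stacking cycles.

```r
# Two toy "cycles" whose columns disagree on type (invented values).
cycle_a <- data.frame(
  seqn  = 1:2,
  year  = 1999L,
  score = c(3L, 5L),               # integer in this cycle
  group = factor(c("a", "b"))      # factor in this cycle
)
cycle_b <- data.frame(
  seqn  = 3:4,
  year  = 2001L,
  score = c(2.5, 4),               # double in this cycle
  group = c("b", "c"),             # character in this cycle
  stringsAsFactors = FALSE
)

# Hypothetical helper: coerce to the wider type (integer -> double,
# factor -> character) so every cycle shares one schema.
harmonize <- function(df) {
  df$score <- as.numeric(df$score)
  df$group <- as.character(df$group)
  df
}

merged <- rbind(harmonize(cycle_a), harmonize(cycle_b))
str(merged)
```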

Direct Access (Without the Package)

library(arrow)
demo <- arrow::read_parquet("https://nhanes.kylegrealis.com/demo.parquet")

This works from any language with Arrow support. Dataset names in URLs are lowercase.
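
Because URLs use lowercase dataset names, a small helper can build them from any spelling. This is a sketch: nhanes_parquet_url() is a hypothetical function, not part of nhanesdata, and it assumes every dataset follows the same URL pattern as the demo example above.

```r
# Hypothetical helper (not part of nhanesdata): build the Parquet URL
# for a dataset, assuming the pattern <base>/<lowercase name>.parquet.
nhanes_parquet_url <- function(name) {
  stopifnot(is.character(name), length(name) == 1L, nzchar(name))
  paste0("https://nhanes.kylegrealis.com/", tolower(name), ".parquet")
}

url <- nhanes_parquet_url("BPX")
print(url)

# With the arrow package installed and network access available:
# bpx <- arrow::read_parquet(url)
```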

Getting Help

Documentation: ?read_nhanes, browseVignettes("nhanesdata")
Bug reports: GitHub Issues
CDC NHANES: nhanes.cdc.gov

Related Packages

nhanesA: Direct interface to the NHANES API
survey: Complex survey analysis with proper weighting
srvyr: Tidy survey analysis using dplyr syntax
gtsummary: Publication-ready summary tables
sumExtras: Extended summary statistics and helpers

License

NHANES data is public domain (U.S. government). This processing code is MIT licensed.
