Michigan PFAS Analysis

An R-based data analysis and interactive Shiny dashboard for exploring PFAS hazard patterns across Michigan.

This project combines PFAS contamination site data with drinking water sample data to analyze spatial and temporal trends in hazard index values, identify potential hotspots, and present the results through an interactive dashboard.

Live dashboard

Open the dashboard

Dashboard preview

Project overview

PFAS (per- and polyfluoroalkyl substances) contamination has become an important environmental and public health issue in Michigan. This project analyzes Michigan PFAS data to better understand:

how hazard index values are distributed across sites and counties
where contamination hotspots may exist
how sampling effort varies across space and time
whether selected sites show statistically meaningful differences in hazard index values

The repository includes both the analytical report and a Shiny dashboard built on top of the cleaned and merged data.

What this project does

Data preparation

The analysis starts from multiple CSV files in the data/ directory, including site-level data, hazard index sample data, and accompanying data dictionaries. During preprocessing, the project:

inspects variable types and missingness patterns
removes columns with heavy missingness that are not essential to the analysis
assigns missing GEOID values using the nearest county by spatial proximity
links samples to nearby facilities using a derived nearest_facility field
creates a merged dataset used for downstream analysis and the dashboard
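
The nearest-county step can be sketched with sf as follows. This is a minimal illustration, not the report's actual code: the two square "counties", their GEOID strings, the sample coordinates, and the column names lon and lat are all made up for the example.

```r
library(sf)

# Helper that builds a 1-degree square polygon (toy stand-in for a county)
make_square <- function(xmin, ymin, size = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin),
    c(xmin + size, ymin + size), c(xmin, ymin + size),
    c(xmin, ymin)
  )))
}

# Hypothetical county layer; the GEOID values are placeholders
counties <- st_sf(
  GEOID = c("00001", "00002"),
  geometry = st_sfc(make_square(-86, 43), make_square(-84, 43), crs = 4326)
)

# Hypothetical samples; two of them are missing their GEOID
samples <- st_as_sf(
  data.frame(
    sample_id = 1:3,
    GEOID = c("00001", NA, NA),
    lon = c(-85.5, -84.8, -84.2),
    lat = c(43.5, 43.5, 43.5)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# For each sample with a missing GEOID, find the index of the closest county
missing <- is.na(samples$GEOID)
nearest_idx <- st_nearest_feature(samples[missing, ], counties)

# Fill in the missing identifiers from the matched counties
samples$GEOID[missing] <- counties$GEOID[nearest_idx]
```

The same `st_nearest_feature()` call, pointed at a facility layer instead of counties, is one plausible way to derive a field like nearest_facility.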

Exploratory analysis

The report explores:

the overall distribution of hazard index values
counties with high maximum hazard index values
sampling effort by county and by month
site-level comparisons between sample count and maximum hazard index
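
As a rough sketch of how summaries like these can be computed with dplyr, the example below uses a small made-up data frame. The column names county, sample_date, and hazard_index are assumptions and may not match the project's files.

```r
library(dplyr)

# Made-up sample records for illustration only
samples <- data.frame(
  county = c("Kent", "Kent", "Kent", "Emmet", "Emmet", "Manistee"),
  sample_date = as.Date(c(
    "2021-06-03", "2021-07-15", "2021-07-20",
    "2021-01-11", "2021-07-02", "2021-08-09"
  )),
  hazard_index = c(0.1, 0.4, 2.3, 0.0, 0.9, 0.2)
)

# Sampling effort by county: number of samples per county
effort_by_county <- samples %>%
  count(county, name = "n_samples", sort = TRUE)

# Sampling effort by calendar month, to look for seasonality
effort_by_month <- samples %>%
  mutate(month = format(sample_date, "%m")) %>%
  count(month, name = "n_samples")

# Sample count versus maximum hazard index per county
county_summary <- samples %>%
  group_by(county) %>%
  summarise(
    n_samples = n(),
    max_hi = max(hazard_index),
    .groups = "drop"
  )
```

Grouping by a site identifier instead of county would give the site-level comparison in the same way.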

Statistical testing

The project also includes a permutation test comparing hazard index values between Pellston Regional Airport and Manistee Blacker Airport.
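
For readers unfamiliar with the technique, here is a minimal base R sketch of a two-sided permutation test on a difference in means. The hazard index values are invented, and the number of permutations, the seed, and the choice of test statistic are assumptions; the report's implementation may differ.

```r
set.seed(42)  # fixed seed so the shuffles are reproducible

# Invented hazard index values for the two sites (not real data)
pellston <- c(0.8, 1.4, 0.3, 2.1, 0.9)
manistee <- c(0.5, 0.7, 0.2, 1.0, 0.4, 0.6)

# Observed test statistic: difference in group means
observed <- mean(pellston) - mean(manistee)

# Under the null hypothesis the site labels are exchangeable,
# so pool the values and reassign labels at random many times
pooled <- c(pellston, manistee)
n_first <- length(pellston)
n_perm <- 5000

perm_diffs <- replicate(n_perm, {
  idx <- sample(length(pooled), n_first)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Two-sided p-value: share of shuffled differences at least as extreme
# as the observed one (with a +1 correction so it is never exactly zero)
p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
```

A large p-value here would mean that a difference as big as the observed one arises often from label shuffling alone, which is the situation described for the two airports.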

Key findings

Most hazard index values are low, falling below the EPA threshold of concern (a hazard index of 1).
A small number of counties and sites show much higher maximum hazard index values, suggesting localized hotspots.
Sampling effort is uneven across counties: some areas are monitored much more heavily than others.
Sampling activity shows seasonality, with higher activity during summer months.
Pellston Regional Airport had a higher mean hazard index than Manistee Blacker Airport, but the permutation test did not find the difference to be statistically significant.

Dashboard features

The Shiny dashboard provides an interactive way to explore the processed data.

Main features

interactive Michigan map with county boundaries
site markers colored by hazard severity
click-to-zoom behavior for viewing samples around a selected site
connecting lines from a selected site to its related samples
site summary panel with sample count and descriptive statistics
default global plots for hazard index distribution and sampling effort over time
site-specific plots after selection, including:
- hazard index distribution for the selected site
- stacked view of hazard index composition by sample for non-zero cases
embedded analysis report viewer inside the app
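
The severity-colored markers could be built along the following lines with leaflet. This is only a sketch under stated assumptions: the site records, the column names, and the bin cut points (other than the hazard index threshold of 1) are hypothetical, and app.R may do this differently.

```r
library(leaflet)

# Made-up site records; real column names in the data may differ
sites <- data.frame(
  site_name = c("Site A", "Site B", "Site C"),
  lon = c(-84.8, -86.2, -83.7),
  lat = c(45.6, 44.3, 42.9),
  max_hi = c(0.2, 0.8, 3.5)
)

# Assumed severity bins: [0, 0.5), [0.5, 1), and 1 or above
severity_pal <- colorBin(
  palette = c("green", "orange", "red"),
  domain = sites$max_hi,
  bins = c(0, 0.5, 1, Inf)
)

# Base map with one circle marker per site, colored by its severity bin
site_map <- leaflet(sites) %>%
  addTiles() %>%
  addCircleMarkers(
    lng = ~lon, lat = ~lat,
    color = ~severity_pal(max_hi),
    label = ~site_name,
    radius = 6, fillOpacity = 0.8
  ) %>%
  addLegend(pal = severity_pal, values = ~max_hi, title = "Max hazard index")
```

Inside a Shiny app, an object like `site_map` would typically be returned from `renderLeaflet()` and displayed with `leafletOutput()`.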

Repository structure

pfas-data-analytics/
├── app.R
├── project_report.qmd
├── _quarto.yml
├── README.md
├── data/
│   ├── data_dict_hazard.csv
│   ├── data_dict_sites.csv
│   ├── pfas_hazard_index.csv
│   ├── pfas_public_water_long.csv
│   ├── pfas_sites.csv
│   ├── pfas_surface_water_long.csv
│   └── samples_site.csv
└── www/

Tools and packages

This project is built in R and uses packages from the tidyverse ecosystem together with geospatial, reporting, and interactive visualization tools.

Core packages used across the analysis and dashboard include:

tidyverse
dplyr
knitr
skimr
flextable
naniar
purrr
sf
plotly
tigris
leaflet
shiny

Running the project locally

1. Clone the repository

git clone https://github.com/nishanKhanal/pfas-data-analytics.git
cd pfas-data-analytics

2. Install required packages

Open R or RStudio and install the required packages:

install.packages(c(
  "tidyverse", "shiny", "leaflet", "plotly", "tigris",
  "knitr", "skimr", "flextable", "naniar", "purrr", "sf", "dplyr"
))

3. Run the dashboard

shiny::runApp("app.R")

4. Render the report

The Quarto configuration writes rendered output to www/. To regenerate the report, run:

quarto render project_report.qmd

Data notes

The project relies on Michigan PFAS data files stored locally in the repository under data/. The dashboard reads from the processed file data/samples_site.csv, while the analytical report starts from the raw site and hazard datasets and performs cleaning, transformation, and merging steps.

Because some spatial joins are based on nearest-county and nearest-facility assumptions, results should be interpreted carefully, especially for samples with missing location identifiers in the original data.

Authors

Nishan
Kabin
Udita

Acknowledgment

This repository was created as part of an R-based PFAS data analysis project focused on understanding hazard index patterns in Michigan and communicating the results through both a reproducible report and an interactive dashboard.
