-
Notifications
You must be signed in to change notification settings - Fork 28
Expand file tree
/
Copy pathWQXValidationService.Rmd
More file actions
169 lines (126 loc) · 7.59 KB
/
WQXValidationService.Rmd
File metadata and controls
169 lines (126 loc) · 7.59 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
---
title: "WQX QAQC Service User Guide"
format: html
editor: visual
author: "TADA Team"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\usepackage[utf8]{inputenc}
%\VignetteIndexEntry{WQX QAQC Service User Guide}
%\VignetteEngine{knitr::rmarkdown}
editor_options:
chunk_output_type: console
markdown:
wrap: 72
---
## TADA Leverages the Water Quality eXchange (WQX) QAQC Service
This is a overview of the the WQX Quality Assurance and Quality Control
(QAQC) data submission service, and how TADA leverages that service to
flag potentially suspect data in the Water Quality Portal (WQP). It will
cover: 1) an overview of all available WQX QAQC tests for data
submissions, 2) which of these QAQC tests are also available in TADA for
flagging potentially suspect WQP data, and 3) how to interpret and
provide feedback on the validation reference tables referenced by WQX
and TADA for this QAQC service.
## Background
The WQX expectation for submissions is that users submit only QAQC'd
data and utilize WQX elements to ensure the data is of "documented
quality". The WQX team has historically hosted data quality working
groups aimed at creating best practices and required data elements for
WQX 3.0 for specific parameter groups such as nutrients, metals and
biological data. These resources have supported users to submit data of
documented quality. This approach has been broadly successful, but
because it is not an enforceable approach it has allowed for data of
varying quality to be shared. In response, the WQX team recently
released a QAQC service in 2022 to test and flag data submissions for
potential errors, and provide the data submitter with a chance to make
edits to improve the quality of their data, before it goes into the WQP.
However the new WQX services will not flag or fix data historical that
is already in the WQP. Therefore, the TADA R package leverages the WQX
QAQC Validation Reference tables in the R package to make these same
tests available on WQP outbound data using R.
## Available Tests
Tests available in both WQX Web for data submissions & the TADA R
Package for use with data retrieved from the WQP:
- Tests that Leverage the WQX Validation Table
- Flag suspect characteristic and unit combinations (e.g., flag
data if mg/L or °C was used for pH; flag if units were left
blank, NA, None, etc. but are required for the characteristic)
- [TADA_FlagResultUnit()](https://usepa.github.io/EPATADA/reference/TADA_FlagResultUnit.html)
- Flag data without methods, with uncommon methods, or with
suspect methods for the characteristic. This is contingent on
the activity type since some activity types don't require
methods.
- [TADA_FlagMethod()](https://usepa.github.io/EPATADA/reference/TADA_FlagMethod.html)
- Flag data with suspect characteristic and fraction combinations
- [TADA_FlagFraction()](https://usepa.github.io/EPATADA/reference/TADA_FlagFraction.html)
- Flag data with suspect characteristic and speciation combination
- [TADA_FlagSpeciation()](https://usepa.github.io/EPATADA/reference/TADA_FlagSpeciation.html)
- Flag monitoring locations with imprecise coordinates (accuracy
of lat/long is less than three three decimal degrees (e.g.,
38.88°, -77.00°)] and/or coordinates outside of the US.
- [TADA_FlagCoordinates()](https://usepa.github.io/EPATADA/reference/TADA_FlagCoordinates.html)
- Flag results that do not have an approved QAPP and/or do not
include a QAPP attachment
- [TADA_FindQAPPDoc()](https://usepa.github.io/EPATADA/reference/TADA_FindQAPPDoc.html)
- [TADA_FindQAPPApproval()](https://usepa.github.io/EPATADA/reference/TADA_FindQAPPApproval.html)
- Flag results above or below national thresholds using the
Interquartile Range (IQR) method. The IQR method is defined as
the difference between the 75% percentile and the 25% percentile
of your dataset.
- Upper Threshold= 75th Percentile + 1.5 \* (75th percentile -
25th percentile)
- [TADA_FlagAboveThreshold()](https://usepa.github.io/EPATADA/reference/TADA_FlagAboveThreshold.html)
- Lower Threshold= 25th Percentile - 1.5 \* (75th percentile -
25th percentile)
- [TADA_FlagBelowThreshold()](https://usepa.github.io/EPATADA/reference/TADA_FlagBelowThreshold.html)
[ with an
interquartile range and normal
[probability](https://en.wikipedia.org/wiki/Probability_density_function).
Attribution: [Jhguch](https://creativecommons.org/licenses/by-sa/2.5/)
via Wikimedia
Commons.](images/IQR.png)](https://commons.wikimedia.org/wiki/File:Boxplot_vs_PDF.svg)
Additional tests only available in WQX Web and Node Submissions:
- Location
- Flag data where the lat/long assigned to monitoring sites is NOT
in the state and country included in metadata.
## Providing Feedback on Validation Tables
All WQX Domain Tables are available
[HERE](https://www.epa.gov/waterdata/storage-and-retrieval-and-water-quality-exchange-domain-services-and-downloads).
TADA leverages many of the WQX domain tables.
- [QAQCCharacteristicValidation
(ZIP)](https://cdx.epa.gov/wqx/download/DomainValues/QAQCCharacteristicValidation_CSV.zip) \|
([XML](https://cdx.epa.gov/wqx/download/DomainValues/QAQCCharacteristicValidation.zip))
\|
([CSV)](https://cdx.epa.gov/wqx/download/DomainValues/QAQCCharacteristicValidation.CSV)
- Both WQX and TADA leverage the table above to flag suspect and
uncommon results. This reference table is used in the following
TADA functions:
- [TADA_FlagAboveThreshold()](https://usepa.github.io/EPATADA/reference/TADA_FlagAboveThreshold.html)
- [TADA_FlagBelowThreshold()](https://usepa.github.io/EPATADA/reference/TADA_FlagBelowThreshold.html)
- [TADA_FlagResultUnit()](https://usepa.github.io/EPATADA/reference/TADA_FlagResultUnit.html)
- [TADA_FlagMethod()](https://usepa.github.io/EPATADA/reference/TADA_FlagMethod.html)
- [TADA_FlagFraction()](https://usepa.github.io/EPATADA/reference/TADA_FlagFraction.html)
- [TADA_FlagSpeciation()](https://usepa.github.io/EPATADA/reference/TADA_FlagSpeciation.html)
- [MeasureUnit
(ZIP)](https://cdx.epa.gov/wqx/download/DomainValues/MeasureUnit_CSV.zip) \|
([XML](https://cdx.epa.gov/wqx/download/DomainValues/MeasureUnit.zip))\| [(CSV)](https://cdx.epa.gov/wqx/download/DomainValues/MeasureUnit.CSV)
- Both WQX and TADA leverage the table above to convert all data
for each unique characteristic to a consistent unit, so that
results can then we assessed against the WQX upper and lower
thresholds in the validation table. Target units for each
characteristic are included in the MeasureUnit domain table.
TADA leverages this table to convert all data for each unique
characteristic to a consistent target unit. See relevant TADA
functions:
- See TADA Function:
[TADA_ConvertResultUnits()](https://usepa.github.io/EPATADA/reference/TADA_ConvertResultUnits.html)
All TADA Reference and Validation Tables are also available in the R
Package
[HERE](https://github.com/USEPA/EPATADA/tree/develop/inst/extdata). TADA
pulls the WQX Validation Table and other domain tables into TADA and
updates them automatically whenever changes are made. WQX and TADA users
can review and provide feedback on the validation table and/or target
units assigned, and provide feedback by emailing the WQX helpdesk
(WQX\@epa.gov).