SDTM Data Inconsistencies and Cleaning Guide
| Domain | Common Inconsistencies | How to Fix (SAS Logic) |
|---|---|---|
| DM (Demographics) | Incorrect AGE, missing RFSTDTC/RFENDTC | Validate RFSTDTC ≤ RFENDTC; AGE = intck('year', input(BRTHDTC,yymmdd10.), input(RFSTDTC,yy… |
| AE (Adverse Events) | AEENDTC < AESTDTC, missing AEOUT, AESEV | Check AEOUT, derive AETRTEM; if input(AEENDTC,yymmdd10.) < input(AESTDTC,yymmdd10.) then … |
| LB (Lab Results) | LBSTRESN missing, unit inconsistency, abnormal flags incorrect | Use formats to normalize units; LBSTRESN = LBORRES * conversion_factor; |
| VS (Vital Signs) | Mixed units (°F/°C), implausible values, missing VSSTAT | Convert all to °C, validate values: if SYSBP > 250 then flag='Check' |
| CM (Concomitant Meds) | CMENDTC < CMSTDTC, duplicates, missing indication | Standardize CMTRT; if CMENDTC < CMSTDTC then CMENDTC = C… |
| SU (Substance Use) | Duplicate SUTRT, inconsistent frequency | Remove duplicates, standardize SUFREQ using upcase/compress |
| MH (Medical History) | Uncoded MHTERM, date outside study period | Flag records outside RFSTDTC–RFENDTC, apply MedDRA dictionary… |
| CE (Clinical Events) | Missing CEDECOD, incorrect date order | Apply dictionary coding, ensure CESTDTC ≤ CEENDTC |
| DS (Disposition) | Invalid DSTERM or DSDECOD, DSSTDTC > RFENDTC | Map DSDECOD to standard terms, validate dates |
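
The date-order check listed for AE (and, by analogy, CM and CE) could be sketched as a single DATA step. This is only a minimal sketch under stated assumptions: the input dataset work.ae_raw, the output dataset work.ae_checked, and the flag variable DATEFLAG are hypothetical names, not SDTM variables, and only complete ISO 8601 dates (YYYY-MM-DD) are compared. Partial dates are left unflagged.

```sas
/* Hypothetical input: work.ae_raw with character AESTDTC and AEENDTC */
data work.ae_checked;
    set work.ae_raw;
    length dateflag $5;   /* hypothetical review flag, not an SDTM variable */

    /* Convert only values that carry a full YYYY-MM-DD date part.      */
    /* The ?? modifier suppresses log notes for values that do not parse. */
    if length(strip(aestdtc)) >= 10 then
        _st = input(substr(strip(aestdtc), 1, 10), ?? yymmdd10.);
    if length(strip(aeendtc)) >= 10 then
        _en = input(substr(strip(aeendtc), 1, 10), ?? yymmdd10.);

    /* Flag records whose end date precedes the start date for review */
    if not missing(_st) and not missing(_en) and _en < _st then
        dateflag = 'Check';

    drop _st _en;
run;
```

Flagging rather than overwriting the end date keeps the source value intact, so the discrepancy can be queried with the site before any correction is made.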
Use this guide to identify and fix real-time data issues in your SDTM mapping. Combine it with Pinnacle 21 (P21)
validation and protocol-level cross-checks for an accurate submission.