Lab 1 Data Preprocessing

This lab covers pre-processing practices for exploring and cleaning a dataset. It starts by creating a project, library, data source, and diagram in SAS, then explores the data with StatExplore and Graph Explore to identify issues such as missing values, outliers, and noisy data. The Impute, Filter, Transform, and feature selection tools are then used to address those issues, for example by imputing missing values, transforming skewed variables, and removing noisy data. A process flow is created and the findings are documented in a table.

CT119-3-2-DMPM Data Pre-Processing

Data Pre-Processing


1. Create a New Project ‘Exploratory Data Analysis’
2. Create a New Library ‘AAEM’
3. Create a New Data Source ‘PVA97NK’ from the SAS Table
4. Create a New Diagram ‘Pre-processing practice’

• Perform data exploration and identify data quality issues such as missing values, outliers, and noisy data. [USE STATEXPLORE and GRAPHEXPLORE]
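The checks named in this step can also be reproduced outside the tool to see what they measure. The following is a minimal Python sketch, not the StatExplore implementation: the `summarize` helper, the 1.5 × IQR outlier rule, and the sample ages are all assumptions made for illustration, and none of the values come from PVA97NK.

```python
import statistics

def summarize(values):
    """Return missing count, skewness, and IQR-rule outliers for one column.

    `values` is a list of numbers in which None marks a missing entry.
    """
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)

    mean = statistics.fmean(present)
    sd = statistics.pstdev(present)
    # Population skewness: mean of cubed z-scores (0 for symmetric data).
    skew = sum(((v - mean) / sd) ** 3 for v in present) / len(present) if sd else 0.0

    # Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {"missing": n_missing, "skewness": skew, "outliers": outliers}

# Made-up ages for illustration only; not taken from PVA97NK.
dem_age = [34, 51, None, 47, 62, None, 58, 45, 39, 120]
report = summarize(dem_age)
print(report)
```

A positive skewness value suggests a long right tail, which is the kind of variable the table below lists under "High skewness".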

• Use suitable pre-processing methods from the Sample, Explore and Modify tabs. [USE IMPUTE, FILTER, TRANSFORM, FEATURE SELECTION]
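To make two of these methods concrete, here is a small Python sketch of imputation followed by a transformation. It is an illustration under stated assumptions rather than what the Impute and Transform nodes do internally: median imputation and a log(x + 1) transform are choices made for this example, the helper names are hypothetical, and the gift amounts are invented.

```python
import math
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

def log_transform(values):
    """Apply log(x + 1), which compresses a long right tail; needs x >= 0."""
    return [math.log1p(v) for v in values]

# Made-up gift averages for illustration only; not taken from PVA97NK.
gift_avg = [5.0, 7.5, None, 10.0, 12.0, 150.0]
filled = impute_median(gift_avg)
transformed = log_transform(filled)
print(filled)
print([round(v, 3) for v in transformed])
```

Imputing before transforming matters here because the logarithm cannot be taken of a missing value. The median is used in this sketch because it is less affected by extreme values than the mean.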

Fill in the table below with your findings:

Data exploration - Findings              Pre-processing Techniques Used
---------------------------------------  ------------------------------
Missing value                            Impute
  1) DemAge
  2) GiftAvgcard36
High skewness                            Transform
  1) GiftAvgAll
  2) GiftAvgcard36
Noisy data                               Graph explore
  1) TargetGiftAmount has noisy data
Data reduction                           Correlation plot: Pearson
  GiftTimeFirst                          Stat explore
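The data reduction row relies on Pearson correlation to spot variables that carry nearly the same information. A minimal Python sketch of that idea follows, assuming a hypothetical 0.9 cutoff and made-up columns. The column names, the values, and the `redundant_pairs` helper are illustrative and do not come from the lab dataset or from the tool.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def redundant_pairs(columns, threshold=0.9):
    """List column pairs whose absolute correlation meets the threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs

# Made-up columns; names and values are illustrative only.
columns = {
    "months_since_first_gift": [12, 24, 36, 48, 60],
    "months_since_last_gift": [11, 25, 35, 49, 59],
    "age": [70, 31, 55, 42, 64],
}
flagged = redundant_pairs(columns)
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.3f}")
```

When a pair is flagged, one of the two variables is a candidate for removal, since keeping both adds little new information.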

• Share the process flow for this lab activity.

Level X Asia Pacific University of Technology & Innovation
