ST402 – Statistical Data Mining
Assignment 01
S/15/809
1. Define what Data Mining is.
Data mining is the process of turning raw data into useful information. It is usually carried out with
special-purpose software. Data mining depends on effective data collection, warehousing, and computer processing.
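As a minimal illustration of turning raw data into useful information, the Python sketch below counts which pairs of items are bought together most often. The transaction data and the function name `frequent_pairs` are made up for this example and are not part of the assignment; real data mining software uses far more scalable methods.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_count=2):
    """Return item pairs that occur together in at least min_count baskets."""
    counts = Counter()
    for basket in transactions:
        # Sort and de-duplicate so (a, b) and (b, a) count as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical raw data: each list is one customer's basket.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

print(frequent_pairs(transactions))
```

The raw baskets on their own say little; the summary shows that bread and milk are the only pair bought together more than once, which is the kind of pattern a shop could act on.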
2. What are the different names used for Data Mining?
Data mining is also known as Knowledge Discovery in Data (KDD). Other names used for it are:
Data retrieval
Data analytics
Data extraction
Data analysis
3. What kind of job/research opportunities are available in the field of Data Mining, and what is the salary
range for these opportunities?
Job                  Salary range
Data Manager         $111,250 – $186,000
Data Architect       $119,750 – $193,550
Big Data Engineer    $130,000 – $222,000
Data Scientist       $105,750 – $180,250
Data Analyst         $83,750 – $142,250
4. What are the skills expected from a competent person for a Data Mining job/research opportunity?
Separately list down knowledge expected in theoretical areas and practical knowledge in tools.
1. Computer Science Skills
Programming/statistics languages: R, Python, C++, Java, MATLAB, SQL, SAS, shell, etc.
Big data processing frameworks: Hadoop, Storm, Samza, Spark, Flink
Operating System: Linux
Database knowledge: Relational Databases & Non-Relational Databases
2. Statistics & Algorithm Skills
Basic Statistics Knowledge: Probability, Probability Distributions, Correlation, Regression, Linear
Algebra, Stochastic Processes…
Data Structures & Algorithms
Machine Learning/Deep Learning Algorithms
Natural Language Processing
3. Others
Project Experience
Communication & Presentation Skills
6. Give 3 examples of data mining applications.
1. Data Mining Applications in Finance
Loan Payment Prediction
Targeted Marketing
Detection of Financial Crimes
2. Data Mining Applications in Marketing
Market Forecasting
Anomaly Detection
System Security
3. Data Mining Applications in Healthcare
Healthcare Management
Effective Treatments
Detection of Fraud and Abuse
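Anomaly detection, listed among the applications above, can be illustrated with a very simple rule: flag any value that lies far from the mean, measured in standard deviations (a z-score). The sketch below is only one basic approach under that assumption; the function name `find_anomalies`, the threshold of 2.0, and the sample amounts are all hypothetical.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one unusually large payment.
amounts = [20, 22, 19, 21, 23, 20, 22, 500]

print(find_anomalies(amounts))
```

A single extreme value also inflates the mean and standard deviation it is compared against, so practical systems often prefer more robust measures such as the median.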
7. Briefly explain two data mining processes and why they are important in data mining application
development?
Data Mining Process in Oracle DBMS
An RDBMS represents data in the form of tables with rows and columns. Data is
accessed by writing database queries.
Relational database management systems such as Oracle support data mining using CRISP-DM.
The facilities of the Oracle database are helpful in data preparation and
understanding. Oracle supports data mining through a Java interface, a PL/SQL interface,
automated data mining, SQL functions, and graphical user interfaces.
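The idea that a relational database holds data in tables of rows and columns, and that the data is reached by writing queries, can be sketched in a few lines. The example uses Python's built-in `sqlite3` module as a stand-in, not Oracle, and the `loans` table, its columns, and its rows are invented for illustration.

```python
import sqlite3


def summarize_loans():
    """Build a small in-memory table and summarise it with a SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        # A table is a set of rows (records) with named columns (attributes).
        conn.execute(
            "CREATE TABLE loans ("
            "id INTEGER PRIMARY KEY, amount REAL, repaid INTEGER)"
        )
        conn.executemany(
            "INSERT INTO loans (amount, repaid) VALUES (?, ?)",
            [(1000.0, 1), (2500.0, 0), (1800.0, 1)],
        )
        # The query groups loans by repayment status and aggregates each group.
        cursor = conn.execute(
            "SELECT repaid, COUNT(*), AVG(amount) "
            "FROM loans GROUP BY repaid ORDER BY repaid"
        )
        return cursor.fetchall()
    finally:
        conn.close()


print(summarize_loans())
```

Grouping and aggregating in SQL like this is a typical data preparation and understanding step before any mining algorithm is applied.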