
Data Analytics & Visualization (Theory and Lab Syllabus)

The document outlines the course structure for 'Foundation of Data Science' and 'Data Analytics & Visualization Lab', detailing course codes, titles, credits, and outcomes. It covers topics such as data analytics lifecycle, advanced data analysis techniques, mining data streams, and visualization methods, along with relevant textbooks and online resources. Additionally, it specifies the examination format and permissible materials during assessments.

Course Code    Course Title                 L  T  P  Credits
BT-AIDS-301A   Foundation of Data Science   3  -  -  2

CIE  SEE  Total
40   60   100
Course Outcomes
CO1: Understand the comprehensive lifecycle phases of Data Analytics, from initial discovery and planning to model building and effective communication of results.
CO2: Comprehend and apply various advanced Data Analysis Techniques for different analytical problems and data types.
CO3: Implement methods for mining and processing data from diverse streams, including real-time analytics applications.
CO4: Analyze frequent item sets, apply various clustering techniques, and effectively visualize data to extract meaningful patterns and communicate insights.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: Low, 2: Medium, 3: High)
      PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
CO1   2    3    2    2    3    -    -    -    1    2     2     3     3     3
CO2   2    2    3    2    3    -    -    -    1    2     2     2     2     3
CO3   2    3    3    2    3    -    -    -    2    3     2     3     3     3
CO4   3    3    3    3    3    -    -    -    2    2     2     3     3     2

COURSE CURRICULUM

Course Outline
Unit: I Introduction to Data Analytics Fundamentals Contact Hours:10
Introduction to Data Analytics: Sources and nature of data, classification of data (structured,
semi-structured, unstructured), characteristics of data. Introduction to Big Data platform, need of data
analytics, evolution of analytic scalability, analytic process and tools, analysis vs reporting, modern data
analytic tools, applications of data analytics.
Data Analytics Lifecycle: Need, key roles for successful analytic projects, various phases of data analytics
lifecycle – discovery, data preparation, model planning, model building, communicating results,
operationalization.
Unit: II Advanced Data Analysis Techniques Contact Hours:08
Data Analysis: Regression modeling, multivariate analysis, Bayesian modeling, inference and Bayesian
networks, support vector and kernel methods, analysis of time series: linear systems analysis & nonlinear
dynamics. Rule induction, neural networks: learning and generalization, competitive learning, principal
component analysis and neural networks, fuzzy logic: extracting fuzzy models from data, fuzzy decision
trees, stochastic search methods.
Unit: III Mining Data Streams Contact Hours:08
Mining Data Streams: Introduction to streams concepts, stream data model and architecture, stream
computing, sampling data in a stream, filtering streams, counting distinct elements in a stream, estimating
moments, counting ones in a window, decaying window.
Real-Time Analytics Platform (RTAP) applications, Case studies – real time sentiment analysis, stock
market predictions.
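As an illustration of the "sampling data in a stream" topic above, the following is a minimal sketch of reservoir sampling, which keeps a uniform random sample of fixed size from a stream whose length is not known in advance. The function name reservoir_sample and its parameters are illustrative assumptions, not part of the syllabus or of any prescribed library.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a one-pass stream."""
    rng = random.Random(seed)  # private generator so a seed gives repeatable runs
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 readings from a simulated stream of 10,000 values.
    print(reservoir_sample(range(10_000), k=5, seed=42))
```

Only k items are held in memory at any time, which is the property that makes this approach suitable for unbounded streams.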
Unit: IV Frequent Item sets, Clustering & Visualization Contact Hours:12
Frequent Item sets and Clustering: Mining frequent item sets, market basket modelling, Apriori algorithm,
handling large data sets in main memory, limited pass algorithm, counting frequent item sets in a stream,
clustering techniques: hierarchical, K-means, clustering high dimensional data, CLIQUE and ProClus,
frequent pattern-based clustering methods, clustering in non-Euclidean space, clustering for streams and
parallelism.
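As an illustration of the Apriori algorithm listed above, the following is a minimal in-memory sketch. It counts single items first, then repeatedly builds size-k candidates from the frequent (k-1)-item sets and keeps only those meeting the support threshold. The function name apriori, the min_support parameter (an absolute count), and the example baskets are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Iterable, List


def apriori(transactions: Iterable[Iterable[Hashable]],
            min_support: int) -> Dict[FrozenSet, int]:
    """Return every item set appearing in at least min_support baskets."""
    baskets: List[FrozenSet] = [frozenset(t) for t in transactions]

    # Pass 1: count individual items.
    counts: Dict[FrozenSet, int] = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Candidate generation: join frequent (k-1)-item sets into k-item sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        # Pass k: count how many baskets contain each surviving candidate.
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if c <= basket:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [{"milk", "bread"}, {"milk", "bread", "butter"},
               {"bread", "butter"}, {"milk", "butter"}]
    for itemset, support in sorted(apriori(baskets, 2).items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

The pruning step relies on the monotonicity property: an item set can only be frequent if all of its subsets are frequent, so most candidates are discarded before any counting pass.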
Introduction to Visualization and Stages – Computational Support - Issues - Different Types of Tasks -
Data representation – Limitation: Display Space- Rendering Time – Navigation Links.
Human Vision – Space Limitation - Time Limitations - Design - Exploration of Complex Information Space
- Figure Caption in Visual Interface - Visual Objects and Data Objects - Space Perception and Data in Space
- Images, Narrative and Gestures for Explanation.

Text books:
1. Michael Berthold, David J. Hand, “Intelligent Data Analysis”, Springer
2. Anand Rajaraman and Jeffrey David Ullman, “Mining of Massive Datasets”, Cambridge University Press
3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, John Wiley & Sons
4. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, “Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses”, Wiley
5. David Dietrich, Barry Heller, Beibei Yang, “Data Science and Big Data Analytics”, EMC Education Series, John Wiley
6. Frank J Ohlhorst, “Big Data Analytics: Turning Big Data into Big Money”, Wiley and SAS Business Series
7. Colleen McCue, “Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis”, Elsevier
8. Anil Maheshwari, “Data Analytics”, McGraw Hill Education
9. Paul Zikopoulos, Chris Eaton, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw Hill
10. Trevor Hastie, Robert Tibshirani, Jerome Friedman, “The Elements of Statistical Learning”, Springer
11. Mark Gardener, “Beginning R: The Statistical Programming Language”, Wrox Publication
12. Pete Warden, “Big Data Glossary”, O’Reilly

Reference Books:
1. Claus O. Wilke, “Fundamentals of Data Visualization”, O’Reilly Media, Sebastopol, 2024

2. Alexander Loth, “Visual Analytics with Tableau”, Wiley, Hoboken, 2024

3. Joshua N. Milligan, “Learning Tableau 2022”, Packt Publishing, Birmingham, 2025

Online Learning Resources/URLs:


1. Coursera – Courses like “Data Visualization with Tableau” by University of California, Davis and “Data Analysis with Python” by IBM. Website: https://www.coursera.org
2. edX – Courses such as “Data Science: Visualization” by Harvard University. Website: https://www.edx.org
3. Udemy – Courses like “Tableau 2023 A-Z: Hands-On Tableau Training for Data Science” and “Microsoft Power BI Desktop for Business Intelligence”. Website: https://www.udemy.com
4. Kaggle – Hands-on learning through real datasets, notebooks, and competitions. Website: https://www.kaggle.com/learn

NOTE: 1. For the semester examination, nine questions are to be set by the examiner. Question no. 1, containing 5-7 short-answer questions, is compulsory and based on the entire syllabus. The remaining eight questions are to be set by taking two questions from each of the four units of the syllabus. Candidates are required to attempt five questions in all: Question no. 1 and one question from each unit. All questions carry equal marks.

2. Students are allowed to use a non-programmable scientific calculator. However, sharing or exchange of calculators or any other items is prohibited in the examination. No programmable calculators, mobile phones or other electrical/electronic items are allowed in the examination.

Course Code     Course Title                         L  T  P  Credits
BT-AIDS-373 A   Data Analytics & Visualization Lab   0  0  2  2

CIE  SEE  Total
50   50   100
Course Outcomes
CO1: Perform fundamental data handling and numerical operations in R, including data import/export and matrix manipulations.
CO2: Apply statistical analysis and data preprocessing techniques, such as handling missing data, normalization, and dimensionality reduction, using R.
CO3: Implement common machine learning algorithms like linear regression and clustering, and conduct specialized analyses such as association rules and time-series modeling in R.
CO4: Collect data from diverse sources using web-scraping/APIs, perform text mining, and create various advanced visualizations, including cartographic representations, in R.
Course Outcomes (CO) to Programme Outcomes (PO) mapping (scale 1: Low, 2: Medium, 3: High)
      PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
CO1   2    3    2    2    3    -    -    -    1    3     2     3     3     3
CO2   2    2    3    2    3    -    -    -    2    2     3     2     2     3
CO3   2    3    3    2    3    -    -    -    2    3     2     3     3     3
CO4   3    3    3    3    3    -    -    -    2    2     2     3     3     2

Sr. No.  Name of Experiments
1. To get input from the user and perform numerical operations (MAX, MIN, AVG, SUM, SQRT, ROUND) in R.
2. To perform data import/export (.CSV, .XLS, .TXT) operations using data frames in R.
3. To get an input matrix from the user and perform matrix addition, subtraction, multiplication, inverse, transpose and division operations using the vector concept in R.
4. To perform statistical operations (mean, median, mode and standard deviation) using R.
5. To perform data pre-processing operations: i) handling missing data, ii) min-max normalization.
6. To perform dimensionality reduction using PCA for the Houses data set.
7. To perform simple linear regression with R.
8. To perform K-means clustering and visualize the result for the iris data set.
9. Learn how to collect data via web-scraping, APIs and data connectors from suitable sources as specified by the instructor.
10. Perform association analysis on a given dataset and evaluate its accuracy.
11. Build a recommendation system on a given dataset and evaluate its accuracy.
12. Build a time-series model on a given dataset and evaluate its accuracy.
13. Build cartographic visualizations for multiple datasets involving various countries of the world, states and districts in India, etc.
14. Perform text mining on a set of documents and visualize the most important words in a visualization such as a word cloud.
15. Perform text classification (e.g., sentiment analysis) on a given text dataset using a basic machine learning model in R.
