Introduction to
Big Data Analytics
B.Bhuvaneswaran
Assistant Professor (SS)
Department of Computer Science & Engineering
Rajalakshmi Engineering College
Thandalam
Chennai 602 105
[email protected]
Big Data Defined
There are multiple characteristics of big
data, but three stand out as defining
characteristics:
Huge volume of data (for instance, tools that
can manage billions of rows and billions of
columns)
Complexity of data types and structures,
with an increasing volume of unstructured data
(80-90% of the data in existence is
unstructured), much of it part of the Digital
Shadow or Data Exhaust
Speed or velocity of new data creation
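To make the link between velocity and volume concrete, the following is a minimal sketch that converts a steady arrival rate into a daily total. The function name daily_volume and the figures used (50,000 events per minute, 200 bytes per event) are hypothetical values chosen only for illustration; they do not come from the course material.

```python
def daily_volume(events_per_minute: int, bytes_per_event: int) -> tuple[int, int]:
    """Return (events per day, bytes per day) for a steady arrival rate."""
    minutes_per_day = 24 * 60
    events = events_per_minute * minutes_per_day
    return events, events * bytes_per_event


# Hypothetical figures, assumed for illustration only.
events, size = daily_volume(events_per_minute=50_000, bytes_per_event=200)
print(f"{events:,} events/day, about {size / 1e9:.1f} GB/day")
```

Even a modest per-event size adds up quickly at a high arrival rate, which is why velocity and volume tend to be discussed together.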
Question? (Ref. Module-1,
Page-5)
What would be considered "Big Data"?
A. An OLAP Cube containing customer
demographic information about 100,000,000
customers
B. Daily Log files from a web server that
receives 100,000 hits per minute
C. Aggregated statistical data stored in a
relational database table
D. Spreadsheets containing monthly sales
data for a Global 100 corporation
Question? (Ref. Module-1,
Page-6)
What are the characteristics of Big Data?
A. Data volume, processing complexity, and
data structure variety.
B. Data volume, business importance, and
data structure variety.
C. Data type, processing complexity, and data
structure variety.
D. Data volume, processing complexity, and
business importance.
Question? (Ref. Module-1,
Page-7)
Which data asset is an example of quasi-structured data?
A. Web server log
B. XML data file
C. Database table
D. News article
Question? (Ref. Module-1,
Page-7)
Which word or phrase completes the
statement?
Structured data is to OLAP data as quasi-structured data is to __________
A. Clickstream data
B. XML data
C. Text documents
D. Image files
Question? (Ref. Module-1,
Page-7)
Which data asset is an example of semi-structured data?
A. XML data file
B. Database table
C. Web server log
D. News article
Question? (Ref. Module-1,
Page-9)
Which word or phrase completes the
statement?
A spreadsheet is to a data island as a
centralized database for reporting is to a
__________?
A. Data Warehouse
B. Data Repository
C. Analytic Sandbox
D. Data Mart
Question? (Ref. Module-1,
Page-9)
Which word or phrase completes the
statement?
A data warehouse is to a centralized
database for reporting as an analytic
sandbox is to a __________?
A. Collection of data assets for modeling
B. Collection of low-volume databases
C. Centralized database of KPIs
D. Collection of data assets for ETL
Question? (Ref. Module-1,
Page-15)
Which word or phrase completes the
statement?
Business Intelligence is to monitoring
trends as Data Science is to __________
trends.
A. Predicting
B. Discarding
C. Driving
D. Optimizing
Question? (Ref. Module-1,
Page-30)
Which word or phrase completes the
statement?
Theater actor is to "Artistic and
Expressive" as Data Scientist is to
__________
A. "Communicative and Collaborative"
B. "Introverted and Technical"
C. "Logical and Steadfast"
D. "Independent and Intelligent"
References
Data Science and Big Data Analytics
(DSBDA), EMC.