Chapter 1

Uploaded by

Amanuel Desalegn

Big Data Analytics(BDA)

GTU #3170722

Unit-1
Introduction to Big Data
Prof. Maulik D. Trivedi
Computer Engineering Department
Darshan Institute of Engineering & Technology, Rajkot
[email protected]
9998 265 805
Outline
• Introduction to Big Data
• Big Data Characteristics
• Challenges of Conventional System
• Types of Big Data
• Intelligent Data Analysis
• Traditional vs. Big Data business Approach
• Case Study of Big Data Solutions
Introduction to Big Data
Introduction
• First, we need to know: what is data?
• Data are the quantities, characters, or symbols on which a computer
performs operations. Data may be stored and transmitted as electrical
signals and recorded on magnetic, optical, or mechanical recording media.
[Figures: "Data Comes From" and "Types of Data"]

#3170722 (BDA)  Unit:1 – Introduction to Big

Prof. Maulik D Trivedi 4
Computer Data as Information
• Computer data is information processed or stored by a computer.
• This information may be in the form of text documents, images, audio
clips, software programs, or other types of data.
• Computer data may be processed by the computer's CPU and is stored
in files and folders on the computer's hard disk.
Definition – Big Data
• Big Data is a massive collection of data that continues to grow
dramatically over time.
• It is a data set so large and complex that typical data management
technologies cannot store or process it effectively.
• Big Data is like regular data, only much larger.
• We normally work with data measured in megabytes (Word documents, Excel
sheets) or at most gigabytes (movies, code); data measured in petabytes,
i.e. 10^15 bytes, is called Big Data.
• It is often stated that almost 90% of today's data has been generated in
the past 3 years.
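To make the jump from megabytes to petabytes concrete, the sizes above can be compared with a little arithmetic. The following is a minimal sketch that assumes decimal (SI) units, so that one petabyte is 10^15 bytes as stated on the slide; the names UNITS, to_bytes and ratio are illustrative and not part of the original material.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of 1000).
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def to_bytes(value, unit):
    """Convert a size such as (2, 'GB') to a number of bytes."""
    return value * UNITS[unit]


def ratio(big_unit, small_unit):
    """Return how many of small_unit fit into one big_unit."""
    return UNITS[big_unit] // UNITS[small_unit]


if __name__ == "__main__":
    print(to_bytes(1, "PB"))   # bytes in one petabyte
    print(ratio("PB", "GB"))   # gigabytes in one petabyte
    print(ratio("PB", "MB"))   # megabytes in one petabyte
```

Under these assumptions one petabyte holds a million gigabytes, which is why tools designed for MB- or GB-sized files stop being practical at this scale.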
Sources of Big Data
• Posts, photos, videos, likes and comments on social media
• Traffic data and GPS signals
• Emails, blogs and e-news
• Software logs, camera and microphone
• Huge data from weather stations and satellites that is stored and
manipulated for forecasting
• Digital pictures and videos
Big Data Characteristics
• Volume is the amount of data, which is growing at a high rate, i.e. data
volume in petabytes.
• Value refers to turning data into value. By turning accessed big data into
value, businesses may generate revenue.
• Veracity refers to the uncertainty of available data. It arises because
the high volume of data brings incompleteness and inconsistency.
• Visualization is the process of displaying data in charts, graphs, maps,
and other visual forms.
• Variety refers to the different data types, i.e. various data formats like
text, audio, video, etc.
• Velocity is the rate at which data grows. Social media plays a major role
in the velocity of growing data.
• Virality describes how quickly information spreads across
people-to-people (P2P) networks.
Volume
• As the name suggests, big data refers to enormous amounts of information.
• We are talking about not gigabytes but terabytes and petabytes of data.
• The IoT (Internet of Things) is creating exponential growth in data.
• The volume of data is projected to grow significantly in the coming years.
• Hence, 'Volume' is one characteristic that needs to be considered when
dealing with Big Data.
Volume [ Data at Rest ]
• Terabytes, Petabytes
• Records/Arch
• Table/Files
• Distributed
Variety
• Variety refers to heterogeneous sources and the nature of data, both
structured and unstructured.
• Data comes in different formats: from structured, numeric data in
traditional databases to unstructured text documents, emails, videos,
audio, stock ticker data and financial transactions.
• This variety of unstructured data poses certain issues for storing, mining
and analysing data.
• Organizing the data in a meaningful way is no simple task, especially when
the data itself changes rapidly.
• Another challenge of Big Data processing goes beyond the massive volumes
and increasing velocities of data: manipulating the enormous variety of
these data.
Variety [ Data in many Forms ]
• Structured
• Unstructured
• Text
• Multimedia
Veracity
• Veracity describes whether the data can be trusted.
• Veracity refers to the uncertainty of available data.
• This uncertainty arises because the high volume of data brings
incompleteness and inconsistency.
• Hygiene of data in analytics is important because otherwise you cannot
guarantee the accuracy of your results.
• Because data comes from so many different sources, it is difficult to
link, match, cleanse and transform data across systems.
• Analysis is useless if the data being analysed are inaccurate or
incomplete.
• Veracity is all about making sure the data is accurate, which requires
processes to keep bad data from accumulating in your systems.
Veracity [ Data in Doubt ]
• Trustworthiness
• Authenticity
• Accurate
• Availability
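The idea of keeping bad data from accumulating can be illustrated with a small hygiene check. This is only a sketch under stated assumptions: records are plain dictionaries, a record counts as incomplete when a required field is missing or empty, and an exact repeat counts as a duplicate. The helper names quality_report and clean and the sample readings are hypothetical.

```python
def _key(record, required_fields):
    """Build a comparable key from the required fields of a record."""
    return tuple(record.get(field) for field in required_fields)


def _is_incomplete(key):
    """A record is incomplete if any required value is None or an empty string."""
    return any(value is None or value == "" for value in key)


def quality_report(records, required_fields):
    """Count incomplete and duplicated records (illustrative helper)."""
    seen = set()
    incomplete = 0
    duplicates = 0
    for record in records:
        key = _key(record, required_fields)
        if _is_incomplete(key):
            incomplete += 1
        elif key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"total": len(records), "incomplete": incomplete, "duplicates": duplicates}


def clean(records, required_fields):
    """Keep only complete records, dropping exact repeats."""
    seen = set()
    kept = []
    for record in records:
        key = _key(record, required_fields)
        if _is_incomplete(key) or key in seen:
            continue
        seen.add(key)
        kept.append(record)
    return kept


if __name__ == "__main__":
    # Hypothetical sensor readings with one empty field, one missing value
    # and one exact duplicate.
    readings = [
        {"id": 1, "city": "Rajkot", "temp_c": 31.5},
        {"id": 2, "city": "", "temp_c": 29.0},
        {"id": 3, "city": "Surat", "temp_c": None},
        {"id": 1, "city": "Rajkot", "temp_c": 31.5},
    ]
    fields = ["id", "city", "temp_c"]
    print(quality_report(readings, fields))
    print(clean(readings, fields))
```

Real pipelines usually add source-specific rules (valid ranges, reference lookups), but the principle is the same: measure the bad records first, then filter them before analysis.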
Velocity
• Velocity is the speed at which data grows, is processed and becomes
accessible.
• Data flows in from sources like business processes, application logs,
networks, social media sites, sensors, mobile devices, etc.
• The flow of data is massive and continuous.
• Most data are warehoused before analysis, but there is an increasing need
for real-time processing of these enormous volumes.
• Real-time processing reduces storage requirements while providing more
responsive, accurate and profitable responses.
• Data should be processed fast, in batches or in a stream-like manner,
because it just keeps growing every year.
Velocity [ Data in Motion ]
• Streaming
• Batch
• Real / Near Time
• Processes
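The difference between batch and stream-like processing can be sketched with a running average. The batch version needs every value stored before it can compute anything, while the streaming version keeps only a count and the current mean, which is one way real-time processing can reduce storage requirements. The class name RunningMean and the sample readings are assumptions made for illustration.

```python
class RunningMean:
    """Incrementally maintained mean: stores two numbers, not the whole stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        """Fold one new value into the mean and return the updated mean."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def batch_mean(values):
    """Batch approach: requires all values to be available at once."""
    return sum(values) / len(values)


if __name__ == "__main__":
    sensor_readings = [20.0, 22.0, 21.0, 25.0]  # hypothetical temperatures

    stream = RunningMean()
    for reading in sensor_readings:
        # A result is available after every single reading.
        print("mean so far:", stream.update(reading))

    print("batch mean:", batch_mean(sensor_readings))
```

Both approaches agree on the final answer; the streaming one simply makes an up-to-date result available while data is still arriving.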
Value
• Value refers to turning data into value. By turning accessed big data into
value, businesses may generate revenue.
• Value is the end game. After addressing volume, velocity, variety,
variability, veracity, and visualization, which takes a lot of time,
effort and resources, you want to be sure your organization is getting
value from the data.
• For example, data that can be used to analyze consumer behavior is
valuable for your company because you can use the research results to make
individualized offers.
Value [ Data into Money ]
• Statistical
• Events
• Correlations
Visualization
• Big data visualization is the process of displaying data in charts,
graphs, maps, and other visual forms.
• It helps people understand and interpret their data at a glance, and
clearly shows trends and patterns that arise from this data.
• Raw data comes in different formats, so creating data visualizations is a
process of gathering, managing, and transforming data into a format that
is most usable and meaningful.
• Big data visualization makes your data as accessible as possible to
everyone within your organization, whether they have technical data skills
or not.
Visualization [ Data Readable ]
• Readable
• Accessible
• Presentation
• Visual Forms
Virality
• Virality describes how quickly information spreads across
people-to-people (P2P) networks.
• It measures how quickly data is spread and shared to each unique node.
• Time is a determining factor, along with the rate of spread.
Virality [ Data Spread ]
• P2P
• Shared
• Rate of Spread
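One simple way to picture "how quickly data is shared to each unique node" is to count the rounds it takes for an item to reach every node in a small network. The sketch below assumes a simplified model that is not from the slides: in each round, every node that already has the item shares it with all of its direct contacts. The function name spread_rounds and the sample network are hypothetical.

```python
from collections import deque


def spread_rounds(graph, source):
    """Return {node: round in which it first receives the item}.

    Uses breadth-first search, so each node is assigned the earliest
    round in which any of its contacts could have shared the item.
    """
    reached = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in reached:
                reached[neighbour] = reached[node] + 1
                queue.append(neighbour)
    return reached


if __name__ == "__main__":
    # Hypothetical sharing network: who forwards content to whom.
    network = {
        "A": ["B", "C"],
        "B": ["D"],
        "C": ["D"],
        "D": ["E"],
        "E": [],
    }
    rounds = spread_rounds(network, "A")
    print(rounds)
    print("rounds to reach everyone:", max(rounds.values()))
```

The number of nodes reached per round gives a rough rate of spread, and the largest round number is the time needed to reach the whole network.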
Challenges of Conventional System
• Conventional systems face three main challenges:
1. Volume of Data
2. Processing and Analyzing
3. Management of Data
Volume of Data
• The volume of data is increasing day by day, especially data generated by
machines, telecommunication services, airline services, sensors, etc.
• Data grows rapidly every year as new sources of data emerge.
• According to a survey, the volume of data is growing so rapidly that IBM
expected around 35 zettabytes of data to be stored in the world by 2020.
Processing & Analyzing
• Processing such a large volume of data is a major challenge.
• Organizations analyze such large volumes of data in order to achieve
their business goals.
• Extracting insights from such a large amount of data is time-consuming
and takes a lot of effort.
• Processing and analyzing data is also costly, since the data comes in
different formats and is complex.
Management of Data
• Because the data gathered comes in different formats (structured,
semi-structured and unstructured), it is very challenging to manage such a
variety of data.
Types of Big Data
1. Unstructured
2. Semi-structured
3. Structured
Unstructured
• Any data whose form or structure is unknown is classified as unstructured
data.
• In addition to its huge size, unstructured data poses multiple challenges
in processing it to derive value.
• A typical example of unstructured data is a heterogeneous data source
containing a combination of simple text files, images and videos, such as
the results of a Google search.
• Nowadays organizations have a wealth of data available to them, but
unfortunately they do not know how to derive value from it, since this
data is in its raw form or unstructured format.
[Figure labels: Machine Generated Data; Human Generated Data]
Unstructured - Example
• The output returned by 'Google Search'
Structured
• Any data that can be stored, accessed and processed in a fixed format is
termed "structured" data.
• Over time, computer scientists have achieved great success in developing
techniques for working with this kind of data (where the format is well
known in advance) and for deriving value from it.
• When the size of such data grows to a huge extent, typical sizes are in
the range of multiple zettabytes.
• Data stored in a relational database management system is one example of
structured data.
Structured - Example
• Employee_Table
Employee_ID  Employee_Name  Gender  Department  Salary_In_lacs
1            XYX            MALE    FINANCE     850000
2            ABC            MALE    ADMIN       250000
3            PQR            FEMALE  SALES       350000
4            MNR            FEMALE  FINANCE     600000
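What "fixed format" means in practice can be shown by loading the Employee_Table rows into a relational database. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the column types and the helper names build_employee_table and finance_salaries are assumptions added for illustration, since the slide gives only column names and values.

```python
import sqlite3

# Rows copied from the Employee_Table example.
ROWS = [
    (1, "XYX", "MALE", "FINANCE", 850000),
    (2, "ABC", "MALE", "ADMIN", 250000),
    (3, "PQR", "FEMALE", "SALES", 350000),
    (4, "MNR", "FEMALE", "FINANCE", 600000),
]


def build_employee_table():
    """Create an in-memory database whose schema is fixed before any data arrives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Employee_Table (
            Employee_ID    INTEGER PRIMARY KEY,
            Employee_Name  TEXT    NOT NULL,
            Gender         TEXT    NOT NULL,
            Department     TEXT    NOT NULL,
            Salary_In_lacs INTEGER NOT NULL
        )
        """
    )
    conn.executemany("INSERT INTO Employee_Table VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def finance_salaries(conn):
    """Because the format is known in advance, a query can address columns by name."""
    cursor = conn.execute(
        "SELECT Employee_Name, Salary_In_lacs FROM Employee_Table "
        "WHERE Department = ? ORDER BY Employee_ID",
        ("FINANCE",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = build_employee_table()
    print(finance_salaries(connection))
    connection.close()
```

Every row must fit the declared columns, which is exactly what makes structured data easy to query and comparatively rigid.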
Semi-structured
• Semi-structured data is the third type of big data.
• Semi-structured data can contain both forms of data, that is, structured
and unstructured.
• To be precise, it refers to data that has not been classified under a
particular repository (database), yet contains vital information or tags
that segregate individual elements within the data.
• Web application data, which is unstructured, consists of log files,
transaction history files, etc.
• Online transaction processing systems are built to work with structured
data, wherein data is stored in relations (tables).
Semi-structured - Example
• Semi-structured data looks structured in form, but it is not actually
defined by, e.g., a table definition in a relational DBMS.
• Personal data stored in an XML file:
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
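The tags in the XML records are what allow a program to pull out individual elements even though no table definition exists. The following sketch parses the three records with Python's standard xml.etree.ElementTree module. The enclosing <people> element is an assumption added here because an XML document needs a single root; the function name parse_people is also illustrative.

```python
import xml.etree.ElementTree as ET

# The three <rec> elements from the example, wrapped in an assumed <people> root.
XML_TEXT = """<people>
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
</people>"""


def parse_people(xml_text):
    """Turn each <rec> element into a dictionary, using the tags as field names."""
    root = ET.fromstring(xml_text)
    people = []
    for rec in root.findall("rec"):
        people.append(
            {
                "name": rec.findtext("name"),
                "sex": rec.findtext("sex"),
                "age": int(rec.findtext("age")),
            }
        )
    return people


if __name__ == "__main__":
    for person in parse_people(XML_TEXT):
        print(person)
```

Nothing forces every record to carry the same tags, so a later record could add or omit a field without breaking the file; that flexibility is what separates this from a relational table.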
Difference
Factor: Flexibility
• Structured data: it is schema-dependent and less flexible.
• Semi-structured data: it is more flexible than structured data but less
flexible than unstructured data.
• Unstructured data: it is flexible in nature and there is an absence of a
schema.
Factor: Transaction Management
• Structured data: mature transaction management and various concurrency
techniques.
• Semi-structured data: transaction management is adapted from the DBMS
and is not mature.
• Unstructured data: no transaction management and no concurrency.
Factor: Query performance
• Structured data: structured queries allow complex joins.
• Semi-structured data: queries over anonymous nodes are possible.
• Unstructured data: only textual queries are possible.
Factor: Technology
• Structured data: it is based on the relational database table.
• Semi-structured data: it is based on RDF and XML.
• Unstructured data: it is based on character and library data.
Intelligent Data Analysis
• Intelligent Data Analysis (IDA) is one of the major issues in the field of
artificial intelligence and information.
• Intelligent data analysis reveals implicit, previously unknown and
potentially valuable information or knowledge from large amounts of data.
• It also helps in making decisions.
• It covers all areas of data visualization, data pre-processing (combining,
editing, transforming, filtering, sampling), data engineering, database
mining techniques, tools and applications, use of domain knowledge in data
analysis, big data applications, evolutionary algorithms, etc.
• It includes three major steps:
1. Data preparation
2. Rule finding or data mining
3. Result validation and explanation
Intelligent Data Analysis – Cont.
• Data preparation:
  • It includes extracting or collecting relevant data from the source and
  then creating a data set.
• Rule finding or data mining:
  • It works out the rules contained in the data set by means of certain
  methods or algorithms.
• Result validation and explanation:
  • Result validation means examining these rules.
  • Result explanation means giving an intuitive, reasonable and
  understandable description using logical reasoning.
• Because IDA aims to extract useful knowledge, the process demands a
combination of extraction, analysis, conversion, classification,
organization, reasoning, and so on.
• We can apply machine learning and deep learning concepts to IDA.
• It helps in many areas:
  • Banking & Securities, Communications, Media & Entertainment
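The three IDA steps can be sketched end to end on a toy data set. The example below is an illustration only: it assumes shopping baskets as the data, simple "if A then B" association rules as the rules to be found, and minimum support and confidence thresholds as the validation criterion. None of these choices come from the slides, and the function names prepare, mine_rules, validate and explain are hypothetical.

```python
from collections import Counter
from itertools import permutations


def prepare(raw_baskets):
    """Step 1, data preparation: normalise item names and drop empty baskets."""
    baskets = []
    for basket in raw_baskets:
        items = {item.strip().lower() for item in basket if item and item.strip()}
        if items:
            baskets.append(items)
    return baskets


def mine_rules(baskets):
    """Step 2, rule finding: score every 'if A then B' rule seen in the data."""
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        item_count.update(basket)
        pair_count.update(permutations(sorted(basket), 2))
    total = len(baskets)
    return [
        {
            "if": a,
            "then": b,
            "support": count / total,           # share of baskets with both items
            "confidence": count / item_count[a],  # share of A-baskets that also have B
        }
        for (a, b), count in pair_count.items()
    ]


def validate(rules, min_support, min_confidence):
    """Step 3a, result validation: keep only rules that pass both thresholds."""
    kept = [
        rule
        for rule in rules
        if rule["support"] >= min_support and rule["confidence"] >= min_confidence
    ]
    return sorted(kept, key=lambda r: (-r["confidence"], r["if"], r["then"]))


def explain(rule):
    """Step 3b, result explanation: describe a rule in plain words."""
    return (
        f"{rule['confidence']:.0%} of baskets with {rule['if']} "
        f"also contain {rule['then']}"
    )


if __name__ == "__main__":
    raw = [["Bread", "butter "], ["bread", "Butter", "jam"], ["bread"], ["milk", ""], []]
    data = prepare(raw)
    for rule in validate(mine_rules(data), min_support=0.5, min_confidence=0.6):
        print(explain(rule))
```

Production systems would replace step 2 with a proper mining or machine learning algorithm, but the shape of the pipeline (prepare, find rules, then validate and explain) stays the same.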
Traditional vs. Big Data Business Approach
Importance of Big Data
• Complex or massive data sets that are impractical to manage with
traditional database systems and software tools are referred to as big
data.
• Organizations utilize big data in one way or another. It is the
technology that makes it possible to realize big data's value.
• It is a voluminous amount of both multi-structured and unstructured data.
Traditional vs. Big Data
• Confidentiality & Data Accuracy
• Data Relationship
• Data Storage Size
• Different types of data
• Flexibility
• Real-time Analytics
• Distributed Architecture
Major Differences between Traditional Data & Big Data
TRADITIONAL DATA | BIG DATA
• Traditional data is generated at the enterprise level. | Big data is generated both outside and at the enterprise level.
• Its volume ranges from gigabytes to terabytes. | Its volume ranges from petabytes to exabytes or zettabytes.
• A traditional database system deals with structured data. | A big data system deals with structured, semi-structured and unstructured data.
• Traditional data is generated per hour, per day or less often. | Big data is generated much more frequently, mainly every second.
• The traditional data source is centralized and is managed in centralized form. | The big data source is distributed and is managed in distributed form.
Major Differences between Traditional Data & Big Data – Cont.
TRADITIONAL DATA | BIG DATA
• Data integration is very easy. | Data integration is very difficult.
• A normal system configuration is capable of processing traditional data. | A high-end system configuration is required to process big data.
• The size of the data is very small. | The size is larger than the traditional data size.
• Traditional database tools are sufficient to perform any database operation. | Special kinds of database tools are required to perform any database operation.
• Normal functions can manipulate the data. | Special kinds of functions are needed to manipulate the data.
Major Differences between Traditional Data & Big Data – Cont.
TRADITIONAL DATA | BIG DATA
• Its data model is strict-schema based and static. | Its data model is flat-schema based and dynamic.
• Traditional data is stable, with known interrelationships. | Big data is not stable, and its relationships are unknown.
• Traditional data comes in manageable volume. | Big data comes in huge volume, which becomes unmanageable.
• It is easy to manage and manipulate the data. | It is difficult to manage and manipulate the data.
• Its data sources include ERP transaction data, CRM transaction data, financial data, organizational data, web transaction data, etc. | Its data sources include social media, device data, sensor data, video, images, audio, etc.
Case Study of Big Data Solutions
• Undoubtedly, Big Data has become a major game changer in most
cutting-edge industries over the last few years.
• As Big Data keeps growing day by day, the number of organizations
adopting Big Data keeps expanding.
• Let's discuss an example:
  • An e-commerce site XYZ (with 100 million users) wants to offer a gift
  voucher of $100 to its top 10 customers who spent the most in the
  previous year.
  • Moreover, it wants to find the buying trends of these customers so that
  the company can suggest more items related to them.
  • Issues: a huge amount of unstructured data needs to be stored,
  processed and analyzed.
  • Solution:
    • Storage: to store this huge amount of data, Hadoop uses HDFS (Hadoop
    Distributed File System), which uses commodity hardware to form
    clusters and stores data in a distributed fashion. It works on the
    write-once, read-many-times principle.
    • Processing: the MapReduce paradigm is applied to data distributed
    over the network to find the required output.
    • Analysis: Pig and Hive can be used to analyze the data.
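The processing step of the case study can be sketched as a map, shuffle and reduce over order records to find the customers who spent the most. This is a single-process illustration of the MapReduce idea in plain Python, not Hadoop's API; on a real cluster the same three phases would run in parallel over data stored in HDFS. The record layout (customer_id, amount), the function names and the sample orders are all assumptions.

```python
import heapq
from collections import defaultdict


def map_phase(orders):
    """Map: emit one (customer_id, amount) pair per order record."""
    for order in orders:
        yield order["customer_id"], order["amount"]


def shuffle(pairs):
    """Shuffle: group all emitted amounts by customer_id."""
    groups = defaultdict(list)
    for customer_id, amount in pairs:
        groups[customer_id].append(amount)
    return groups


def reduce_phase(groups):
    """Reduce: sum the amounts of each customer into one total."""
    return {customer_id: sum(amounts) for customer_id, amounts in groups.items()}


def top_customers(orders, n=10):
    """Return the n customers with the highest total spend, largest first."""
    totals = reduce_phase(shuffle(map_phase(orders)))
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical order records from the previous year.
    last_year_orders = [
        {"customer_id": "c1", "amount": 120.0},
        {"customer_id": "c2", "amount": 80.0},
        {"customer_id": "c1", "amount": 30.0},
        {"customer_id": "c3", "amount": 200.0},
    ]
    for customer_id, total in top_customers(last_year_orders, n=2):
        print(customer_id, total)
```

The appeal of the paradigm is that the map and reduce functions never need to see the whole data set at once, so the work can be split across many machines.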
Where are businesses finding uses for Big Data?
Walmart
• The biggest retailer in the world and the world's biggest organization by
revenue.
• Approx. 2 million workers and 20,000 stores in 28+ nations.
• It started to use Big Data concepts at an early stage.
• It used data mining to find patterns that can be used to give product
suggestions to customers, depending on which products were bought
together.
• Based on the data mining results, it has increased its customer
conversion rate.
• Walmart's main target is to retain customers and enhance their
experience.
• Hadoop and NoSQL technologies are used to provide these customers with
real-time data gathered from various sources, for effective and valuable
use.
Uber
• It is a leading option for people around the globe for moving people and
making deliveries.
• It uses users' personal data to closely monitor which features of the
service are used,
• to analyze usage patterns and to determine where the services should be
more focused.
• It focuses on the supply and demand of the services, because of which
the prices of the services provided change.
• One use of the data is surge pricing, which influences the rate of
demand.
Netflix
• It is a very popular entertainment company that provides online,
on-demand video streaming to its customers.
• It has been determined to predict precisely what its customers will enjoy
watching, using Big Data.
• Recently, Netflix began positioning itself as a content creator, not
simply a distribution medium, a move firmly based on data analytics.
• Data points such as what customers watch, how often playback is stopped,
ratings, and so on feed its recommendation engines.
• It has incorporated Hadoop, Hive and Pig along with other traditional
business intelligence tools.
More Case Studies of Big Data
• https://www.scnsoft.com/blog/big-data-use-cases-stats-and-examples
• https://www.tableau.com/learn/articles/big-data-examples-use-cases
Big Data Analytics(BDA)
GTU #3170722

Thank
You
Prof. Maulik D Trivedi
Computer Engineering Department
Darshan Institute of Engineering & Technology, Rajkot
[email protected]
9998 265 805
