
Big Data


PARALLEL AND DISTRIBUTED COMPUTING

IN TODAY’S LECTURE WE WILL LEARN ABOUT

• Parallel and Distributed Computing
• BIG DATA
• SOURCES OF BIG DATA
• REAL WORLD EXAMPLES OF BIG DATA
• CHARACTERISTICS OF BIG DATA
• CHALLENGES OF BIG DATA
• WHY WE NEED BIG DATA

M.Imran | PDC 2
BIG DATA
By one widely quoted estimate, we now create as much information every two days as was created in total up to 2003:
about five exabytes of data every two days.
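
To put that figure in perspective, the short calculation below converts five exabytes every two days into an average per-second rate. It is only a sketch: it assumes decimal units (1 EB = 10^18 bytes), which the slide does not specify, and the function name is illustrative.

```python
# Assumption: decimal (SI) units, 1 EB = 10**18 bytes and 1 TB = 10**12 bytes.
EXABYTE = 10**18
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(total_bytes: float, days: float) -> float:
    """Average data rate, in bytes per second, over a period of `days`."""
    return total_bytes / (days * SECONDS_PER_DAY)


rate = bytes_per_second(5 * EXABYTE, days=2)
print(f"{rate / TERABYTE:.1f} TB per second")  # about 28.9 TB per second
```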

Challenges
At this scale, both storing the data and analyzing it become challenging.

BIG DATA
Big Data refers to vast and complex collections of both structured and unstructured data that traditional data
management systems cannot handle efficiently.

It is characterized by:

• Volume: Massive amounts of data generated every second.

• Velocity: The speed at which new data is created and processed.

• Variety: Different types of data, such as text, images, videos, and more.

SOURCES OF BIG DATA
Big data comes from a wide array of sources, including:

Social Media: Platforms like Facebook and Twitter generate massive amounts of user-generated content.

Sensors: Devices that collect data from the environment, such as temperature or motion sensors.

Machine-Generated Data: Data produced by machines, such as logs from servers or industrial equipment.

Traditional Databases: Structured data stored in conventional database systems.

Mobile Devices: Smartphones and tablets continuously generate data through various applications.

M2M (Machine-to-Machine) Communication:

• Definition: M2M refers to communication between devices without human intervention. It lets connected devices (e.g., IoT sensors, smart devices) exchange data.
• Examples: Smart meters, industrial sensors, connected vehicles, and wearable devices are examples of M2M applications.

"Without M2M Data":
• This refers to scenarios where M2M communication is excluded, so data traffic comes only from traditional, human-driven internet use (e.g., streaming, web browsing).

Global mobile data traffic forecast by the ITU (International Telecommunication Union). Overall mobile data
traffic is estimated to grow at an annual rate of around 55% over 2020-2030, reaching 607 exabytes (EB)
in 2025 and 5,016 EB in 2030. (Source: Cisco)
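
As a rough check on the forecast, the compound annual growth rate implied by the two quoted data points can be computed directly. The sketch below uses only the 607 EB (2025) and 5,016 EB (2030) figures from the caption; the helper name is hypothetical.

```python
def implied_annual_growth(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that takes start_value to end_value in `years`."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the forecast above, in exabytes (EB).
traffic_2025 = 607
traffic_2030 = 5016

growth = implied_annual_growth(traffic_2025, traffic_2030, years=2030 - 2025)
print(f"Implied growth 2025-2030: {growth:.1%} per year")
```

For these two points the implied rate comes out a little above 50% per year, which is broadly in line with the "around 55%" stated for the whole decade.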
REAL WORLD EXAMPLES OF BIG DATA

Google
Google utilizes big data to analyze user behavior and sell data analytics to
companies needing insights.
For example:

• Search Data: Analyzing what users search for to improve search algorithms.
• Advertising: Targeting ads based on user data to increase relevance and effectiveness.

Mobile Data
Mobile devices are a significant source of data generation:

• Maps: Apps like Google Maps collect data on users' travel patterns.
• Daily Activities: Mobile applications track and record users' daily
activities, identifying the areas where users are most active.

E-commerce Sites
E-commerce platforms gather data on consumer preferences and behaviors to:

• Personalize Shopping Experience: Recommend products based on browsing and purchase history.

• Inventory Management: Forecast demand to manage stock levels efficiently.

EXPONENTIAL DATA PRODUCTION FACTS


• Connected Devices by 2030: Expected to reach 150 billion, making data production 44 times greater than in
2009.
• Global Internet Traffic: In 2020, estimated to reach 260 exabytes per month.
• Facebook Data: Stores, accesses, and analyzes over 30 petabytes (PB) of user-generated data.
• Google Data Processing: In 2018, processed over 20,000 terabytes (TB) daily.
• Walmart Data: Processes more than 1 million customer transactions daily, generating an estimated 2.5 PB of
data.
• Mobile Activity: Over 5 billion people worldwide use mobile devices for calling, texting, tweeting, and browsing.
Email Growth:
• 2012: 3.3 billion email accounts.
• 2016: Exceeded 4.3 billion accounts.
• 2030: Projected to reach over 500 billion email accounts.
Daily Emails:
• 2012: 89 billion emails sent and received daily.
• 2016: Exceeded 143 billion daily.
• 2021: Approximately 1,507 billion emails sent daily.

DATA STORAGE UNITS


Number   Binary Equivalent   Unit        Symbol
1        1 bit               Bit         b
4        4 bits              Nibble      nibble
8        8 bits              Byte        B
1,024    1,024 bytes         Kilobyte    KB
1,024    1,024 kilobytes     Megabyte    MB
1,024    1,024 megabytes     Gigabyte    GB
1,024    1,024 gigabytes     Terabyte    TB
1,024    1,024 terabytes     Petabyte    PB
1,024    1,024 petabytes     Exabyte     EB
1,024    1,024 exabytes      Zettabyte   ZB
1,024    1,024 zettabytes    Yottabyte   YB
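
The unit ladder above can be turned into a small conversion helper. This is a minimal sketch that follows the table's binary convention (each unit is 1,024 of the previous one); the function names are illustrative, and the 20,000 TB input is the daily Google figure quoted earlier in the lecture.

```python
# Binary convention from the table above: each unit is 1,024 of the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def to_bytes(value: float, unit: str) -> float:
    """Convert a quantity such as (2.5, "PB") into bytes."""
    return value * 1024 ** UNITS.index(unit)


def humanize(num_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps the value below 1,024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(to_bytes(1, "KB"))                  # 1024
print(humanize(to_bytes(20_000, "TB")))   # 19.53 PB
```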

KEY CHARACTERISTICS OF BIG DATA
1. Volume
Definition: Refers to the enormous amount of data generated every second.

Example: Every day, the world creates about 2.5 quintillion bytes (roughly 2.5 exabytes) of data from sources like
sensors, videos, social media, and digital interactions.

2. Velocity
Definition: The speed at which data is generated and processed.

Example: Real-time data processing is essential for applications like live traffic updates on Google Maps or instant
financial transactions on stock exchanges.


3. Variety
Definition: The different types of data being generated and collected.

Example: Data comes in various forms, including text (emails, social media posts), images and video, audio, GPS
data, sensor data, and traditional structured databases.

4. Value
Definition: The potential benefits and insights that can be derived from analyzing big data.

Example: Companies use big data to personalize marketing strategies, optimize supply chains, and improve customer
experiences.


5. Veracity
Definition: The trustworthiness and accuracy of the data.

Example: Ensuring that data collected from various sources, like social media or sensors, is reliable and free from
biases.

6. Variability
Definition: The inconsistency of data flows and formats.

Example: Handling sudden spikes in data volume during events like product launches or viral social media trends.
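
Velocity and variability can be illustrated together with a small streaming example. The sketch below reads values one at a time, as a real-time system would, and flags sudden spikes relative to a moving average. The window size, the threshold factor, and the sample counts are illustrative assumptions, not values from the lecture.

```python
from collections import deque


def detect_spikes(stream, window=5, factor=3.0):
    """Yield (index, value) for readings far above the recent moving average.

    Processes one reading at a time and keeps only the last `window` values,
    so memory use stays constant however long the stream runs.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(stream):
        # Only compare once a full window of history is available.
        if len(recent) == window:
            average = sum(recent) / window
            if value > factor * average:
                yield index, value
        recent.append(value)


# Hypothetical events-per-second counts with a burst at positions 6 and 7.
events_per_second = [100, 110, 95, 105, 98, 102, 900, 950, 101, 99]
print(list(detect_spikes(events_per_second)))  # [(6, 900), (7, 950)]
```

Because the function is a generator, it can consume an unbounded source (for example, a socket or a message queue reader) without loading the whole data set first.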

CHALLENGES IN HANDLING BIG DATA
• Storage: Managing and storing the continuously growing volume of data.

• Processing: Analyzing data quickly to extract meaningful insights.

• Data Types: Handling the wide variety of data from different sources.

• Scalability: Ensuring systems can scale out (adding more nodes) to handle increasing data.

• Real-Time Processing: Moving from batch processing to real-time analytics to reduce delays.

• Privacy Concerns: Ensuring data privacy and addressing legal challenges, especially with practices like email
scanning.
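
The scale-out idea in the list above (adding more nodes rather than a bigger machine) usually relies on partitioning records across nodes. Below is a minimal sketch of hash partitioning; the key format, node counts, and function names are hypothetical, and real systems add replication and rebalancing on top of this.

```python
import hashlib


def node_for(key: str, num_nodes: int) -> int:
    """Map a record key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(keys, num_nodes: int):
    """Group keys by the node that would store them."""
    nodes = {i: [] for i in range(num_nodes)}
    for key in keys:
        nodes[node_for(key, num_nodes)].append(key)
    return nodes


keys = [f"user-{i}" for i in range(1000)]  # hypothetical record keys
for n in (4, 8):  # scaling out from 4 nodes to 8
    sizes = [len(v) for v in partition(keys, n).values()]
    print(f"{n} nodes -> largest share: {max(sizes)} keys")
```

Simple modulo placement like this moves many keys whenever the node count changes; production systems often use consistent hashing to limit that movement.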

• Data can have errors, missing values, or inconsistencies, making it difficult to analyze or use effectively.

• There are many tools and technologies for handling big data, making it confusing to choose the right one.

• As the size of the data grows, managing and processing it becomes harder.

• Combining data from different sources (like websites, mobile apps, and sensors) can be tricky if formats or structures don’t match.

• Setting up systems for collecting, storing, and analyzing big data can be very costly.

• Ensuring that the data is accurate and reliable before analyzing it is a big task.

• Handling data that updates constantly in real time is challenging.

• Employees or management might resist adopting new data-driven methods due to fear of change or lack of understanding.

• Analyzing and managing big data requires skilled people, like data scientists and engineers, but finding them is not easy.

• Storing and analyzing data safely without leaks or hacks is critical, especially for sensitive information.
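
The data-quality challenge (errors, missing values, inconsistencies) is often handled with a validation pass before analysis. The sketch below separates usable records from rejected ones; the field names, the valid range, and the sample readings are all hypothetical.

```python
def split_clean_and_rejected(records, required_fields, value_field, valid_range):
    """Separate usable records from ones with missing or out-of-range values."""
    low, high = valid_range
    clean, rejected = [], []
    for record in records:
        # A field counts as missing if it is absent or explicitly None.
        missing = [f for f in required_fields if record.get(f) is None]
        value = record.get(value_field)
        out_of_range = value is not None and not (low <= value <= high)
        if missing or out_of_range:
            rejected.append(record)
        else:
            clean.append(record)
    return clean, rejected


# Hypothetical sensor readings illustrating each kind of problem.
readings = [
    {"sensor_id": "a1", "temperature_c": 21.5},
    {"sensor_id": "a2", "temperature_c": None},    # missing value
    {"sensor_id": None, "temperature_c": 19.0},    # missing identifier
    {"sensor_id": "a3", "temperature_c": 950.0},   # implausible reading
]
clean, rejected = split_clean_and_rejected(
    readings, ["sensor_id", "temperature_c"], "temperature_c", (-50.0, 60.0)
)
print(len(clean), "clean,", len(rejected), "rejected")  # 1 clean, 3 rejected
```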

WHY BIG DATA IS IMPORTANT

1. Understanding Customers
What It Means: Companies collect lots of information about their customers to understand what they like and need.
Example: A store like Walmart uses data to predict which products will sell best in different locations.

2. Improving Business Operations
What It Means: Businesses use data to make their processes more efficient.
Example: Retailers use social media trends and weather forecasts to decide how much stock to keep in stores.

3. Personal Growth and Health
What It Means: Individuals can use data from devices to improve their lifestyles and health.
Example: Fitness trackers monitor your activity and sleep patterns to help you stay healthy.

4. Enhancing Health Care
What It Means: Data helps doctors predict and treat diseases more effectively.
Example: Hospitals use data from wearable devices to monitor patients’ heart rates and detect infections early.

5. Boosting Sports Performance
What It Means: Teams and athletes use data to improve their game.
Example: Tennis players use tools like IBM SlamTracker to analyze their matches and enhance their skills.

6. Advancing Science and Research
What It Means: Scientists use big data to make new discoveries and solve complex problems.
Example: CERN uses massive amounts of data from the Large Hadron Collider to study particle physics.

7. Smarter Devices and Technology
What It Means: Devices become smarter and can make decisions on their own by using data.
Example: Self-driving cars use data from cameras and sensors to navigate roads safely without human help.

8. Improving Security
What It Means: Big data helps keep people and information safe by detecting threats.
Example: Police use data to identify and catch online fraudsters.

9. Optimizing Cities and Countries
What It Means: Governments use data to make cities run more smoothly.
Example: Smart cities use data to manage traffic flow and reduce pollution.

10. Financial Trading
What It Means: Traders use data to make better investment decisions quickly.
Example: Stock traders use social media data to decide when to buy or sell stocks.


"Data is the new oil. It’s valuable, but if unrefined, it cannot really
be used. It has to be changed into gas, plastic, chemicals, etc., to
create a valuable entity that drives profitable activity; so must data
be broken down, analyzed for it to have value."
— Clive Humby

ANY QUESTIONS?
