WHAT IS BIG DATA?
A. Definition
The term "Big Data" refers to the evolution and use of
technologies that provide the right user at the right time with
the right information from a mass of data that has been
growing exponentially for a long time in our society. The
challenge is not only to deal with rapidly increasing volumes
of data but also to manage increasingly heterogeneous
formats and increasingly complex, interconnected data.
Being a complex, polymorphic object, Big Data is defined
differently by each community that takes an interest in it,
whether as a user or as a provider of services. Coined by the
giants of the web, Big Data presents itself as a solution
designed to give everyone real-time access to giant databases.
Big Data is a very difficult concept to define precisely, since
the very notion of "big" in terms of data volume varies from
one area to another. It is not defined by a set of technologies;
on the contrary, it defines a category of techniques and
technologies. This is an emerging field, and as we seek to
learn how to implement this new paradigm and harness its
value, the definition keeps changing. [2]
1) Characteristics of Big Data
The term Big Data refers to datasets that are larger (volume),
more diversified, including structured, semi-structured, and
unstructured data (variety), and arriving faster (velocity) than
before. These are the three Vs.
-Volume: represents the amount of data generated, stored, and
processed within the system. The increase in volume is
explained by the growth in the amount of data generated and
stored, but also by the need to exploit it.
-Variety: represents the multiplication of the types of data
managed by an information system. This multiplication leads
to complexity in the links, and in the types of links, between
these data. Variety also relates to the possible uses associated
with raw data.
-Velocity: represents the frequency at which data is
generated, captured, and shared. The data arrive as streams
and must be analyzed in real time.
To this classical characterization, two other "V"s are
important:
-Veracity: the level of quality, accuracy, and uncertainty of
the data and its sources.
-Value: the value and potential derived from the data.
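The velocity property can be sketched in code: data arriving as a stream must be summarized incrementally rather than stored whole and analyzed later. The following is a minimal, illustrative Python sketch; the `running_mean` helper and the sample readings are invented for this example, not taken from any particular Big Data system.

```python
# Illustrative sketch of "velocity": fold each record into a running
# aggregate as it arrives, instead of materializing the whole dataset.

def running_mean(stream):
    """Yield the mean of all values seen so far, one per incoming record."""
    total = 0.0
    count = 0
    for value in stream:
        total += value
        count += 1
        yield total / count

# Each reading is absorbed into the aggregate the moment it arrives.
readings = [10, 20, 30, 40]
means = list(running_mean(readings))
print(means)  # [10.0, 15.0, 20.0, 25.0]
```

A real stream would be unbounded, but the per-record update cost here is constant, which is the property that makes real-time analysis feasible.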
WHAT IS BIG DATA ANALYTICS?
Big Data generally refers to data that exceeds the typical
storage, processing, and computing capacity of conventional
databases and data analysis techniques. As a resource, Big
Data requires tools and methods that can be applied to
analyze and extract patterns from large-scale data. [3]
The analysis of structured data is evolving due to the variety
and velocity of the data manipulated. It is therefore no longer
enough to analyze data and produce reports: the wide variety
of data means that the systems in place must be capable of
assisting in the analysis. The analysis consists of
automatically determining, within a variety of rapidly
changing data, the correlations between the data in order to
help exploit them.
Big Data Analytics refers to the process of collecting,
organizing, and analyzing large data sets to discover patterns
and other useful information. Big data analytics is a set of
technologies and techniques that require new forms of
integration to disclose large hidden values from datasets that
differ from the usual ones: more complex, and of an
enormous scale. It mainly focuses on solving new problems,
or solving old problems in better and more effective ways.
A. Types of Big Data Analytics
a) Descriptive Analytics
It consists of asking the question: What happened?
It is a preliminary stage of data processing that summarizes a
set of historical data. Data mining methods organize the data
and help uncover patterns that offer insight into what has
already happened, providing the foundation for the other
types of analytics.
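At its simplest, descriptive analytics can be a handful of summary statistics over historical records. The sketch below is illustrative only; the `monthly_sales` figures are invented for the example.

```python
# Illustrative sketch of descriptive analytics: reduce historical
# records (invented sales figures) to summary numbers stating
# what happened.
import statistics

monthly_sales = [120, 135, 150, 160, 155, 170]  # assumed historical data

summary = {
    "total": sum(monthly_sales),                 # 890 units overall
    "mean": statistics.mean(monthly_sales),      # average per month
    "stdev": statistics.stdev(monthly_sales),    # month-to-month spread
}
print(summary)
```

The output describes the past only; it makes no claim about the future.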
b) Diagnostic Analytics
It consists of asking the question: Why did it happen?
Diagnostic analytics looks for the root cause of a problem: it
attempts to find and understand the causes of events and
behaviors.
c) Predictive Analytics
It consists of asking the question: What is likely to happen?
It uses past data in order to predict the future. It is all about
forecasting: predictive analytics uses techniques such as data
mining and artificial intelligence to analyze current data and
build scenarios of what might happen.
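A minimal sketch of the predictive idea, fitting a least-squares trend line to past observations and extrapolating one step ahead; the `fit_line` helper and the `history` values are invented for illustration, and real systems use far richer models.

```python
# Illustrative sketch of predictive analytics: fit a model to past
# data (invented values) and extrapolate it to forecast the future.

def fit_line(ys):
    """Return slope and intercept of the least-squares line over x = 0..n-1."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

history = [120, 135, 150, 160, 155, 170]  # assumed past observations
slope, intercept = fit_line(history)
forecast = slope * len(history) + intercept  # predicted next value
print(round(forecast, 1))  # 180.3
```

Unlike the descriptive summary, the output here is a statement about what is likely to happen next, not about what already happened.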
d) Prescriptive Analytics
It consists of asking the question: What should be done?
It is dedicated to finding the right action to be taken.
Descriptive analytics provides historical data, and
predictive analytics helps forecast what might happen.
Prescriptive analytics uses these parameters to find the best
solution.
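The step from prediction to prescription can be sketched as turning a forecast into a recommended action. Everything below is hypothetical: the `recommend` function, the stock scenario, and the simple rule stand in for the optimization or business-rule engines a real prescriptive system would use.

```python
# Hypothetical sketch of prescriptive analytics: convert a predicted
# demand figure into a concrete recommended action. The rule and the
# numbers are invented for illustration.

def recommend(forecast_demand, stock_on_hand):
    """Pick a restocking action from a predicted demand figure."""
    if forecast_demand > stock_on_hand:
        return "reorder %d units" % (forecast_demand - stock_on_hand)
    return "hold current stock"

print(recommend(180, 150))  # reorder 30 units
print(recommend(120, 150))  # hold current stock
```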
CONCLUSION
Big Data refers to the set of digital data produced by the use
of new technologies for personal or professional purposes.
Big Data analytics is the process of examining these data in
order to uncover hidden patterns, market trends, customer
preferences, and other useful information that supports the
right decisions. Big Data analytics is a fast-growing
technology: it has been adopted by the most unexpected
industries and has become an industry in its own right.
However, analyzing these data within the Big Data
framework can sometimes seem quite intrusive.
Analytics is a branch of data science. Business intelligence
(BI) takes care of the decision-making part, while data
analytics is the process of asking questions. Analytics tools
are used when a company needs to do forecasting and wants
to know what will happen in the future.