BDA Notes-1

The document discusses key aspects of big data and analytics. It defines traits of big data as capturing low-level data that is transformed and modeled into higher-level insights and kept for long periods of time. Big data is characterized as huge, unstructured datasets requiring new tools like Hadoop. Traditional data is smaller, structured, and easier to query. The lifecycle of big data analytics includes identifying data sources, filtering data, extracting compatible data, aggregating common fields, analyzing data, visualizing insights, and delivering final results. Reporting organizes performance data while analysis explores deeper insights by interpreting data to identify answers and recommendations. Modern analytic tools include Hadoop, Hive, and Spark for processing large datasets.
Copyright
© All Rights Reserved

Scanned by CamScanner

TRAITS OF BIG DATA:

a) Captures data at the lowest level of detail available.
b) Turns that data into higher-level information by transforming, modelling, or analyzing it.
c) Keeps the original raw data for a long time, so that it is available when a new question
arises or a new solution is needed.
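
The three traits can be illustrated with a minimal Python sketch. The page-view records, field names, and both summary functions are hypothetical and serve only as an illustration: raw events are stored at the lowest level, a higher-level summary is derived from them, and a question asked later is still answerable because the raw events were kept.

```python
from collections import Counter

# Hypothetical raw, lowest-level records: one entry per page view (trait a).
raw_events = [
    {"user": "u1", "page": "/home", "hour": 9},
    {"user": "u2", "page": "/cart", "hour": 9},
    {"user": "u1", "page": "/cart", "hour": 14},
    {"user": "u3", "page": "/home", "hour": 14},
]


def views_per_page(events):
    """Higher-level information derived from the raw events (trait b)."""
    return Counter(event["page"] for event in events)


def views_per_hour(events):
    """A question asked later; answerable because raw events were kept (trait c)."""
    return Counter(event["hour"] for event in events)


page_summary = views_per_page(raw_events)
hour_summary = views_per_hour(raw_events)
print(dict(page_summary))
print(dict(hour_summary))
```

Had only the per-page summary been stored, the per-hour question could not be answered afterwards.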

BIG DATA VS TRADITIONAL DATA:

Big Data:

-Huge datasets
-Unstructured data
-Hard to query and analyze
-Needs a new methodology for analysis
-Needs tools such as Hadoop, Hive, HBase, Sqoop
-Raw transaction data
-Used for reporting, basic analysis, and text mining; advanced analysis is still at a starting stage
-Petabytes/exabytes of data
-Billions/trillions of transactions
-Millions/billions of accounts

Traditional Data:

-Datasets of a manageable size
-Structured data
-Relatively easy to query and analyze
-Conventional methods
-Tools such as SQL, SAS, R, Excel
-Aggregated, sampled, or filtered data
-Used for reporting, advanced analysis, and predictive modelling
-Megabytes/gigabytes of data
-Millions of transactions
-Thousands/millions of accounts

The Lifecycle of Big Data Analytics

Stage 1 - Business case evaluation - The Big Data analytics lifecycle begins with a business case, which defines the reason and goal behind the analysis.
Stage 2 - Identification of data - Here, a broad variety of data sources are identified.
Stage 3 - Data filtering - All of the identified data from the previous stage is filtered here to remove corrupt data.
Stage 4 - Data extraction - Data that is not compatible with the tool is extracted and then transformed into a compatible form.
Stage 5 - Data aggregation - In this stage, data with the same fields across different datasets are integrated.
Stage 6 - Data analysis - Data is evaluated using analytical and statistical tools to discover useful information.
Stage 7 - Visualization of data - With tools like Tableau, Power BI, and QlikView, Big Data analysts can produce graphic visualizations of the analysis.
Stage 8 - Final analysis result - This is the last step of the Big Data analytics lifecycle, where the final results of the analysis are made available to business stakeholders who will take action.
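
Stages 3 to 6 can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how a Hadoop or Spark job is written: the two in-memory datasets, the field names (customer_id, amount, region), and all four helper functions are hypothetical.

```python
from statistics import mean

# Stage 2: two hypothetical data sources that share the field "customer_id".
orders = [
    {"customer_id": "c1", "amount": "120.50"},
    {"customer_id": "c2", "amount": "n/a"},     # corrupt amount
    {"customer_id": "c1", "amount": "79.50"},
    {"customer_id": None, "amount": "10.00"},   # missing customer id
]
customers = [
    {"customer_id": "c1", "region": "North"},
    {"customer_id": "c2", "region": "South"},
]


def is_valid(record):
    """Stage 3: reject records with a missing id or a non-numeric amount."""
    if not record["customer_id"]:
        return False
    try:
        float(record["amount"])
    except ValueError:
        return False
    return True


def extract(record):
    """Stage 4: transform the text amount into a number the analysis can use."""
    return {"customer_id": record["customer_id"], "amount": float(record["amount"])}


def aggregate(order_rows, customer_rows):
    """Stage 5: integrate the two datasets on their common field."""
    region_by_id = {c["customer_id"]: c["region"] for c in customer_rows}
    return [
        {**row, "region": region_by_id.get(row["customer_id"], "unknown")}
        for row in order_rows
    ]


def analyze(rows):
    """Stage 6: compute the average order amount per region."""
    amounts = {}
    for row in rows:
        amounts.setdefault(row["region"], []).append(row["amount"])
    return {region: mean(values) for region, values in amounts.items()}


clean = [extract(r) for r in orders if is_valid(r)]
joined = aggregate(clean, customers)
result = analyze(joined)
print(result)
```

Each function maps to one stage, so a stage can be replaced (for example, a stricter filter) without touching the others.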

REPORTING VS ANALYSIS:

1. Reporting: The process of organizing data into informational summaries in order to monitor how different areas of a business are performing.
   Analysis: The process of exploring data and reports in order to extract meaningful insights, which can be used to better understand and improve business performance.
2. Reporting: Translates raw data into information.
   Analysis: Transforms data and information into insights.
3. Reporting: Helps companies to monitor their online business and be alerted when data falls outside of expected ranges.
   Analysis: The goal is to answer questions by interpreting the data at a deeper level and providing actionable recommendations.
4. Reporting: Good reporting should raise questions about the business from its end users.
   Analysis: Performing analysis may raise additional questions, but the goal is to identify answers, or at least potential answers that can be tested.
5. Reporting: Shows you "what is happening".
   Analysis: Focuses on explaining "why it is happening and what you can do about it".
6. Reporting: Pushes information to the organization.
   Analysis: Pulls insights from the reports and data.
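
The contrast can be made concrete with a small Python sketch. Everything in it is assumed for illustration: the daily order counts, the site error counts, the expected range, and the error threshold are invented values. The report function summarizes the data and raises an alert; the analyze function takes the alert and looks for a candidate explanation that can be tested.

```python
from statistics import mean

# Hypothetical daily figures for an online business.
daily_orders = {"Mon": 102, "Tue": 98, "Wed": 41, "Thu": 105}
site_errors = {"Mon": 3, "Tue": 2, "Wed": 57, "Thu": 4}
EXPECTED_RANGE = (80, 130)  # assumed normal range for daily orders


def report(orders, expected):
    """Reporting: summarize the data and flag days outside the expected range."""
    low, high = expected
    alerts = [day for day, count in orders.items() if not low <= count <= high]
    return {
        "total": sum(orders.values()),
        "average": mean(orders.values()),
        "alerts": alerts,
    }


def analyze(alert_days, errors, error_threshold=20):
    """Analysis: for each flagged day, look for a testable explanation."""
    return {
        day: "possible cause: site errors"
        if errors[day] > error_threshold
        else "no candidate cause found"
        for day in alert_days
    }


summary = report(daily_orders, EXPECTED_RANGE)    # what is happening
findings = analyze(summary["alerts"], site_errors)  # why it may be happening
print(summary)
print(findings)
```

The report only states that Wednesday is out of range; the analysis step is what proposes a reason for it.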

MODERN ANALYTIC TOOLS:
