Lecture : Data Analytics in Business
Today’s Highlights
Data Analytics Overview
Databases and Different Types of Analytics
Data Analytics Methodology
Dealing with Different Types of Data
Data Visualization
Data Analytics Tools
Data Analytics and Prompt Engineering
Utilize AI for Data Analytics
What you will get
Analyze the triggers that led to the evolution of analytics
Develop an analytical approach to a business problem
Compare data science, data analytics and machine learning and understand their business
applications
Explain the significance of data visualization in analytical modeling to drive meaningful business
decisions
Identify business use cases that can leverage data analytics
Understand how prompt engineering can play an important role in data analytics
What is data?
Data is numbers, characters, images or other methods of recording, in a form which can be assessed
to make a determination or decision about a specific action.
Many believe that data on its own has no meaning, only when interpreted does it take on
meaning and become information.
By closely examining data we can find patterns to perceive information, and then information
can be used to enhance knowledge (Source: The Free Online Dictionary of Computing, 1993-2005,
Denis Howe)
Data vs Information
“The number 1,099” is one example of data.
“The number of children who were determined to have a disability prior to enrollment in Migrant and
Seasonal Head Start for the 2004 enrollment year is 1,099” is information.
Data Analytics?
Data analytics is the science of extracting trends, patterns and relevant information from raw
data to draw conclusions.
It means processing data and making decisions based on data, not on feelings alone.
It has multiple approaches, multiple dimensions and diverse techniques.
In addition to making business decisions, it is used by data scientists and researchers to verify
scientific models and theories.
Why Data Analytics?
It helps in decision making and effective business operations.
It supports analyzing data, increasing profits, making better use of resources and improving
managerial operations.
Big Data
Big data refers to data sets too large and complex to be processed by traditional methods. Consider how
much happens in a single minute: Instagram stories posted, YouTube videos watched, Google searches,
texts sent and emails sent.
Big data is part of our daily lives.
The 3 V’s of Big Data – Plus 2 : These are the defining properties or dimensions of big data
Volume : the amount of data (very large)
Velocity : the speed of data (data is captured very quickly)
Variety : the types of data
Variability : the variety of and changes in data flows
Veracity : the quality of data
Source: Dataversity
The 5V’s of Big Data
Volume : This refers to the sheer volume of data being generated every second.
Variety : Can use structured as well as unstructured data.
Veracity : Data reliability and trust. Verifying and validating the data.
Value : Having access to big data is all well and good, but that is only useful if we can turn it into
value.
Velocity : Speed at which data is emanating and changes are occurring between the diverse data
sets.
Source: BINUS Uni
Big Data Value Chain
The process of turning big data into data you can consume, and then into insights.
Source: Dataversity
Example: Tokopedia
Application -> Data Sources (transactions, files) -> Data Warehouse (business intelligence) -> Visualization
(business users, management, data analysts)
Types of Analytics
Descriptive : What happened?
- Descriptive analytics is designed to access information about the past.
- Its purpose is to summarize findings, focusing on a summarized view of facts.
- Techniques: Data Mining, Data Aggregation
- Tools: Excel, SPSS, Matlab
Diagnostic : Why did this happen?
- Diagnostic analytics helps you identify why something happened in the past.
- It takes a deeper look at data to understand the root cause of events, including an understanding of
causal relationships and sequences.
- It has a limited ability to provide actionable insights.
- Techniques: Data Mining, Data Discovery, Correlation, Drilldown.
- Example: Identify why a sales representative has sold fewer items than usual.
Predictive : What will happen?
- Predictive analytics estimates future outcomes in terms of the probability of an event occurring.
- Tools: Machine Learning Algorithms (random forest, SVM), Python, R.
- Example: Analyzing the sentiment of opinions posted on social media, identifying the target
audience for a promotional campaign, forecasting weather, plan-failure prediction and travel
product recommender systems.
Prescriptive : How can we make it happen?
- Prescriptive analytics provides the solution for a prediction in the future.
- It creates and updates the relationship between action and outcome using a feedback
system.
- It is the final frontier of advanced analytics.
- Example: Allows marketers and sales staff to become more precise with their campaigns and
customer outreach.
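As a minimal sketch of descriptive analytics ("What happened?"), the Python snippet below aggregates a few sales records by region. The records, the field names and the summarize_sales helper are hypothetical and only illustrate the data aggregation technique named above.

```python
from collections import defaultdict

# Hypothetical sales records (illustrative data, not from the lecture).
sales = [
    {"region": "West", "amount": 120.0},
    {"region": "East", "amount": 80.0},
    {"region": "West", "amount": 60.0},
    {"region": "East", "amount": 100.0},
    {"region": "North", "amount": 40.0},
]


def summarize_sales(records):
    """Return total, count and average amount per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in records:
        totals[row["region"]] += row["amount"]
        counts[row["region"]] += 1
    return {
        region: {
            "total": totals[region],
            "count": counts[region],
            "average": totals[region] / counts[region],
        }
        for region in totals
    }


summary = summarize_sales(sales)
for region, stats in sorted(summary.items()):
    print(region, stats)
```

The same summary could be produced with a pivot table in Excel. The point is only that descriptive analytics condenses past records into a summarized view.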
What is a Database?
A database is often conceived of as a repository of information needed for running certain functions in a
corporation or organization.
Such a database would permit not only the retrieval of data but also the continuous modification of data
needed for control of operations. It may be possible to search the database to obtain answers to queries
or information for planning purposes (Source: W3Schools)
Data Warehousing Concepts
Primary Key and Relational Database
PK: Something that uniquely identifies each record in a table.
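To make the primary key idea concrete, here is a small sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns and the sample rows are assumptions made up for this example. It shows that a duplicate primary key is rejected and that the key lets another table refer to exactly one record.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           amount REAL NOT NULL
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Ani')")
conn.execute("INSERT INTO customers VALUES (2, 'Budi')")
conn.execute("INSERT INTO orders VALUES (10, 1, 50.0)")
conn.execute("INSERT INTO orders VALUES (11, 1, 25.0)")

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Citra')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The key relates each order to exactly one customer.
rows = conn.execute(
    """SELECT c.name, o.order_id, o.amount
       FROM customers AS c
       JOIN orders AS o ON o.customer_id = c.customer_id
       ORDER BY o.order_id"""
).fetchall()
print(duplicate_rejected, rows)
```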
Data Analytics Cycle
1. Discovery : Learn about business domain and assess available resources.
2. Data Preparation : Execute ELT (extract, load and transform).
3. Model Planning : Identify techniques and data to understand variable relationships.
4. Model Building : Develop data sets for testing, training and production.
5. Communicate Results : Identify key findings, business values and develop narratives for
stakeholders.
6. Operationalize : Deliver final reports, briefs, codes and technical documents.
Data Analytics Process
Start by identifying the business problem first. This is not a linear process, so it can be repeated
continuously.
Business Question -> Get Data -> Explore Data -> Prepare Data -> Analyze Data -> Present Findings
1. Business Question
Business understanding is the first stage of the data science methodology and lays the
foundation for a successful end result.
It includes defining the problem, project objectives and solution requirements from a business
perspective.
Business Question -> Data Question -> Data Answer -> Business Answer
2. Get Data
Here, we need to determine the analytic approach, business requirements as well as data
requirements.
It identifies the analytic methods, hardware and software, data content, formats and
representations to be used.
This stage has multiple sub-stages including data acquisition, data wrangling, data analysis and
data modeling.
Qualitative and Quantitative Data
a. Quantitative : Data that can be measured with numbers, such as duration or speed.
i. Discrete : Whole numbers that can’t be broken down, such as a number of terms.
ii. Continuous : Numbers that can be broken down, such as height or weight.
- Interval : Numbers with known differences between variables, such as time.
- Ratio : Numbers with measurable intervals where differences can be determined,
such as height or weight.
b. Qualitative: Non-numerical data that is categorical, such as yes/no responses or eye colour.
i. Nominal : Data used for naming variables, such as hair colour.
ii. Ordinal : Data used to describe the order of values, such as 1 = happy, 2 = neutral, 3
= unhappy.
3. Prepare Data/Data Cleaning
Perform data mining
Work with structured and unstructured data
Use various tools and software to transform data
Integrate data from various sources
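A minimal sketch of what data cleaning can look like in practice, using only the Python standard library. The raw rows, the field names and the clean_rows helper are hypothetical. The rules applied (trim and normalize names, convert amounts to numbers, drop rows that cannot be converted, remove duplicates) are one possible choice, not a fixed recipe.

```python
# Hypothetical raw records, as they might arrive from a spreadsheet export.
raw_rows = [
    {"customer": " Ani ", "amount": "120.50"},
    {"customer": "Budi", "amount": ""},        # missing amount
    {"customer": "ani", "amount": "120.50"},   # duplicate after normalizing
    {"customer": "Citra", "amount": "abc"},    # not a number
    {"customer": "Dewi", "amount": "75"},
]


def clean_rows(rows):
    """Normalize names, convert amounts to float, drop bad and duplicate rows."""
    cleaned = []
    seen = set()
    for row in rows:
        name = row["customer"].strip().title()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows with a missing or invalid amount
        key = (name, amount)
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cleaned.append({"customer": name, "amount": amount})
    return cleaned


cleaned = clean_rows(raw_rows)
print(cleaned)
```

Whether to drop or repair a bad row depends on the business question. Here the rows are dropped only to keep the sketch short.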
Basic Statistical Parameters
1) Measures of Frequency
(Use this when you want to show how often a response is given)
a. Count, Percent, Frequency
b. Shows how often something occurs
2) Measures of Central Tendency
(Use this when you want to show an average or most commonly indicated response)
a. Mean, Median, and Mode
b. Locates the distribution by various points
3) Measures of Dispersion or Variation
(Use this when you want to show how “spread out” the data are. It is helpful to know when
your data are so spread out that it affects the mean)
a. Identifies the spread of scores by stating intervals
b. Range = Max-Min points
c. Variance or Standard Deviation = how far observed scores typically differ from the mean
4) Measures of Position
(Use this when you need to compare scores to a normalized score)
a. Percentile Ranks, Quartile Ranks (Q1, Q2, Q3)
b. Describes how scores fall in relation to one another. Relies on standardized scores
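The four groups of measures above can be sketched with Python's built-in statistics module. The scores list is made-up sample data. Population variance and standard deviation (pvariance, pstdev) are used here as an assumption. For a sample you would use variance and stdev instead.

```python
import statistics
from collections import Counter

# Hypothetical survey scores (illustrative only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# 1) Measures of frequency: how often each value occurs.
frequency = Counter(scores)
percent_of_fours = 100 * frequency[4] / len(scores)

# 2) Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# 3) Measures of dispersion.
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # mean squared distance from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

# 4) Measures of position: quartiles Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(frequency.most_common(1), mean, median, mode)
print(value_range, variance, std_dev)
print(q1, q2, q3)
```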
4. Analyze Data
Using exploratory data analysis, data analysts analyze performance data, and data scientists
try multiple algorithms to find the best model for the available data set.
Correlation Analysis
Correlation is a statistical measure that indicates the extent to which two or more variables
fluctuate together.
A positive correlation indicates the extent to which those variables increase or decrease in
parallel; a negative correlation indicates the extent to which one variable increases as the
other decreases.
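As an illustration, the sketch below computes the Pearson correlation coefficient, one common way to measure correlation. The notes do not name a specific coefficient, so this choice is an assumption. The ad_spend, sales and returns figures are invented. A result near +1 means the variables move in parallel, and a result near -1 means one rises as the other falls.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical figures: sales rise with ad spend, returns fall with it.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]
returns = [50, 40, 30, 20, 10]

positive = pearson_correlation(ad_spend, sales)
negative = pearson_correlation(ad_spend, returns)
print(positive, negative)
```

Keep in mind that a strong correlation does not by itself show that one variable causes the other.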
Clustering Analysis
Clustering is the task of dividing the population or data points into a number of groups such that
data points in the same group are more similar to other data points in the same group than
those in other groups. In simple words, the aim is to segregate groups with similar traits and
assign them into clusters.
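One widely used clustering technique is k-means. The notes do not prescribe a specific algorithm, so treat the following as an illustrative sketch only. It is a deliberately tiny one-dimensional version with hand-picked starting centroids. The k_means_1d function, the monthly_spend values and the starting centroids are all hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means for one-dimensional data with given starting centroids.

    Returns the final centroids and the clusters from the last assignment pass.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value into the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(value - centroids[i])
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical monthly spend per customer: low, medium and high spenders.
monthly_spend = [10, 12, 11, 50, 52, 49, 95, 100, 98]
centroids, clusters = k_means_1d(monthly_spend, centroids=[0, 50, 100])
print(centroids)
print(clusters)
```

Real projects would normally rely on a tested library implementation and on more than one feature per customer.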
Exploratory Data Analysis (EDA)
Exploratory data analysis is an approach to analyzing data sets to summarize their main
characteristics.
Data visualization in exploratory data analysis is the first step towards modeling.
EDA primarily helps analyze data beyond the formal modeling.
Be Careful With: Personal Bias
Bias, in general, is prejudice in favor of or against one thing, person, or group compared with
another, usually in a way considered to be unfair. Some types of bias in data analytics are:
Confirmation Bias
Availability Bias
Selection Bias
Confounding Variables
5. Data Visualization
Data visualization is the graphical representation of data using charts, graphs and maps.
Our eyes are drawn to colours and patterns.
Data visualization is a form of visual art that grabs our interest and keeps our eyes on the
message.
Why do we need it? The human brain processes images 60,000 times faster than text and 80% of
information transmitted to the brain is visual.
Types of Data Visualization
https://codecrucks.com/what-is-data-visualization/
https://ischool.syracuse.edu/what-is-data-visualization/
4 Key Questions for Successful Data Visualization
What is the story your data is trying to tell?
What type of data do you want to explain?
What chart type will be most efficient?
Who is your audience that will hear your story?
Data Visualization Tools and Languages
Some common tools for data visualization:
Tableau
FusionCharts
QlikView
PowerBI
Google Data Studio
Plotly
Sisense
Looker
A few languages and libraries leveraged for data visualization:
Scala
R
Python
JavaScript
Java
Dashboard Visualization
A dashboard is a tool, a combination of graphs from multiple sources, that can help business
owners monitor business health by visually tracking, analyzing and displaying key data
points. It usually shows a real-time representation of data.
How to make it?
1) Analyze your target audience
2) Identify key business parameters
3) Identify the end goal of the dashboard
4) Get hands-on in developing the dashboard
5) Continuous process of improvement
Common Tools for Data #1: SQL
SQL, or Structured Query Language, is a standard language for relational database management systems.
(Exploring data, Preparing data, Analyzing data)
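A short sketch of SQL used for exploring and analyzing data, run through Python's built-in sqlite3 module so that it is self-contained. The sales table, its columns and the rows are invented for this example.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO sales (product, quantity, price) VALUES (?, ?, ?)",
    [
        ("notebook", 3, 2.0),
        ("pen", 10, 0.5),
        ("notebook", 1, 2.0),
        ("bag", 1, 20.0),
    ],
)

# Exploring data: how many rows are there?
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]

# Analyzing data: revenue per product, highest first.
revenue_by_product = conn.execute(
    """SELECT product, SUM(quantity * price) AS revenue
       FROM sales
       GROUP BY product
       ORDER BY revenue DESC"""
).fetchall()

print(row_count)
print(revenue_by_product)
```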
Common Tools for Data #2: Sheets/Excel
(Exploring data, Preparing data, Analyzing data, Visualizing data)
Common Tools for Data #3: Python
Python is a general-purpose programming language. It is easy to learn, with simple syntax and good readability.
We can use Python in data science, machine learning, artificial intelligence, web development and software
development (Exploring data, Preparing data, Analyzing data, Visualizing data)
Common Tools for Data #4: Tableau
(Visualisation tool, VizQL, Easy, Beautiful, Interactive, Accessible and Complete, Flexible, Quick
prototyping)
However, the job landscape is evolving with AI
Top AI Tools for Work
Chatbots : ChatGPT, Claude, Bing Chat
Content Generation : Jasper, Writer, Notion AI
Spreadsheets : Numerous
Meeting Recording : Vowel, Fireflies
Personal Productivity : Rewind, Mem
Audio Editing : Descript, Adobe Podcast
Image Generation : Midjourney, Adobe Firefly
Slide Decks : Gamma, Tome
Chat with PDF : ChatPDF
Synthetic Voices : ElevenLabs, Play.ht
Prompt engineering involves using natural language instructions to guide AI models like ChatGPT to
generate desired outputs.
Prompt engineering offers several advantages for data analysts:
Accessibility and Efficiency
Rapid Prototyping and Exploration
Enhanced Collaboration
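A hedged sketch of how a data analyst might structure a prompt: state a role, the task, the data context and the expected answer format. The build_analysis_prompt helper, its parameters and the sample rows are hypothetical. The snippet only builds the prompt text and does not call any AI service.

```python
def build_analysis_prompt(task, columns, sample_rows, output_format):
    """Assemble a structured prompt: role, task, data context and output format."""
    header = ", ".join(columns)
    rows = "\n".join(
        ", ".join(str(value) for value in row) for row in sample_rows
    )
    return (
        "You are a data analyst.\n"
        f"Task: {task}\n"
        f"Columns: {header}\n"
        f"Sample rows:\n{rows}\n"
        f"Answer format: {output_format}\n"
        "If the data is insufficient to answer, say so instead of guessing."
    )


prompt = build_analysis_prompt(
    task="Identify which region had the largest drop in monthly sales.",
    columns=["region", "month", "sales"],
    sample_rows=[("West", "2024-01", 120), ("West", "2024-02", 90)],
    output_format="A short bullet list followed by one recommended next step.",
)
print(prompt)
```

Spelling out the columns and the answer format tends to make the output easier to check against the data, which matters because generated answers still need verification.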
“Data analysts are the architects of insights, and prompt engineering is the cornerstone of their
data-driven success. Just as a well-crafted key unlocks the secrets of a treasure chest, skillful prompt
engineering unlocks the power of data, revealing the hidden gems that drive informed decisions.”