
FACULTY OF BUSINESS SCIENCES

DEPARTMENT OF BUSINESS MANAGEMENT

MODULE TITLE: DATA ANALYSIS

(BM 441)

LECTURER: MR G MADZIVANYIKA

E-mail: [email protected]
DATA ANALYSIS NOTES

Introduction

The potential of data analysis lies in its ability to solve business problems and
provide new opportunities.

Data Analysis

Definitions

 Data refers to raw, unprocessed numbers, measurements, or text.


 Information refers to data that are processed, organized, structured, or
presented in a specific context.
 Therefore, the process of transforming data into information is data analysis.
OR
 Is the process of inspecting, cleansing, transforming, and modeling data with
the goal of discovering useful information, informing conclusions, and
supporting decision-making using various statistical and logical methods and
techniques.
 Analysis does not simply mean using a computer software package; it mainly
means looking at the data in light of the questions you need to answer. E.g.,
to determine "Is my program meeting its objectives?" or whether it is on
track, you would look at your program targets and compare them to the
actual program performance. This is analysis.
 Later, we will take this one step further and talk about interpretation (e.g.,
through analysis, you find that your program achieved only 10% of its
target; now you have to figure out why).

Other definitions of DA (Data Analysis)


• Data analysis is the process of systematically applying statistical or logical
techniques to describe and illustrate, condense, recap and evaluate data.

• Analysis is the procedure of making broad generalizations by identifying
trends and situations (phenomena) in the available information (RCDR,
2019).

• The process of scrutinizing raw data with the purpose of drawing
conclusions about that information (Bhatia, 2017).

• The main aim of Data Analysis is to convert the available cluttered data into
a format which is easy to understand, more legible, conclusive and which
supports the mechanism of decision-making.

• The whole process of data analysis begins with the question "What is to be
measured?"
• The answers to this question give the researcher a clear idea of the main
issue that the analysis should address
Why Is Data Analysis Important?
1. Informed decision-making:
From a management perspective, you can benefit from analyzing your data
as it helps you make decisions based on facts and not simple intuition. For
instance, you can understand where to invest your capital, detect growth
opportunities, predict your income, or tackle uncommon situations before
they become problems. Through this, you can extract relevant insights from
all areas in your organization, and with the help of dashboard software,
present the data in a professional and interactive way to different
stakeholders.
2. Reduce costs: Another great benefit is reduced costs. Over time, analysis
will help you save money and resources by avoiding the wrong strategies. And
not just that: by predicting different scenarios such as sales and demand you
can also plan production and supply in advance.
3. Target customers better: Customers are arguably the most crucial element
in any business. By using analytics to get a 360° vision of all aspects related
to your customers, you can understand which channels they use to
communicate with you, their demographics, interests, habits, purchasing
behaviors, and more. In the long run, it will drive success to your marketing
strategies, allow you to identify new potential customers, and avoid wasting
resources on targeting the wrong people or sending the wrong message. You
can also track customer satisfaction by analyzing your clients' reviews or
your customer service interactions.
4. Performance Problems: It helps any business or organization identify
performance problems that require some sort of action.
5. Sophisticated analysis of data can substantially improve decision making,
minimize risks, and unearth valuable insights that would otherwise remain
hidden.
6. Data analysis unlocks significant values by making certain facts and
information transparent and recognizable.

BENEFITS OF DATA ANALYSIS IN VARIOUS SECTORS


 International performance improvement guru and businessman H. James
Harrington has rightly noted that "Measurement is the first step that leads
to control and eventually to improvement. If you can't measure
something, you can't understand it. If you can't understand it, you can't
control it. If you can't control it, you can't improve it."
 Thus, measuring any flaws or strategies that impede a company from
reaching its full potential becomes a priority task. It is therefore essential
for the governing body of any organization or business to pay close
attention to the analyzed data, as it depicts the path of least resistance to
success.
 It allows the identification of important and often critical trends within the
organisation.
 Financial institutions can quickly find that data analysis is skilled at
identifying fraud before it becomes widespread, preventing further damage.
 Governments have turned to data analysis to increase their security and
combat outside cyber threats.
 The healthcare industry uses data analysis to improve patient care and
discover better ways to manage resources and personnel.
 The data analysis software available today is of major benefit to the
healthcare sector. As information becomes increasingly available,
comparable and unambiguous, patients will also be empowered and more
involved in their own treatment through online health applications, which
can integrate patient information with their health records and make it
available to clinicians.
 Data analysis of student behaviour can provide the relevant authorities
with important insights, such as whether a student requires more attention,
whether the class's understanding of a topic is unclear, or whether the course
has to be modified.
 Telecommunications companies and other organisations utilize data analysis
to prevent customer churn, which also assists in planning the best ways to
optimize new and existing wireless networks.
 Marketers have quite a few easy ways to utilize data. One involves sentiment
analysis, where marketers can collect data on how customers feel about
certain products and services by analyzing what consumers post on social
media.
 Data analytics allows you to personalize the content or look and feel of your
website in real time to suit each consumer entering your website, depending
on, for instance, their sex, nationality, or the site from which they arrived at
yours.
 The world is becoming data driven. Each and every decision is now based
on the data available. The concept of data analysis - that large pools of data
can be brought together and analyzed to discover patterns and make better
decisions - will soon become the basis of competition and growth for
individual firms, enhancing productivity and increasing the quality of
products and services.

 All these various methods and techniques used in data analysis are largely
based on two core areas: quantitative and qualitative research.
 Apart from qualitative and quantitative categories, there are also other types
of data that you should be aware of before diving into complex data
analysis processes. These types include:
I. Big data: Refers to massive data sets that need to be analyzed using
advanced software to reveal patterns and trends. It is considered to be one
of the best analytical assets as it provides larger volumes of data at a
faster rate.
II. Metadata: Putting it simply, metadata is data that provides insights about
other data. It summarizes key information about specific data that makes
it easier to find and reuse for later purposes.
III. Real time data: As its name suggests, real time data is presented as soon
as it is acquired. From an organizational perspective, this is the most
valuable data as it can help you make important decisions based on the
latest developments. Our guide on real time analytics will tell you more
about the topic.
IV. Machine data: This is more complex data that is generated solely by a
machine such as phones, computers, or even websites and embedded
systems, without previous human interaction.

Techniques Used In Data Analysis

 Data mining: a particular data analysis technique that focuses on
statistical modeling and knowledge discovery for predictive (projecting)
rather than purely descriptive purposes.
 Data integration is a precursor to data analysis, and data analysis is closely
linked to data visualization and data dissemination.

Key Concepts: Types/Varieties of Data Analysis

 In statistical applications, data analysis can be divided into:

i) Descriptive Analysis - Describes the sample/target population
(demographic characteristics). Does not establish causality - it tells you
what, not why.
ii) Exploratory data analysis (EDA)- EDA focuses on discovering new
features in the data.
iii) Confirmatory data analysis (CDA)- CDA focuses on confirming or
falsifying existing hypotheses.
iv) Predictive analytics- Predictive analytics focuses on the application of
statistical models for predictive forecasting or classification,
while text analytics applies statistical, linguistic, and structural
techniques to extract and classify information from textual sources.

The Data Analysis Process

When we talk about analyzing data there is an order to follow in order to extract
the needed conclusions. The analysis process consists of 5 key stages. We will
cover each of them in more detail later in these notes, but to provide the context
needed to understand what is coming next, here is a rundown of the 5 essential
steps of data analysis.

 Identify: Before you get your hands dirty with data, you first need to
identify why you need it in the first place. The identification is the stage in
which you establish the questions you will need to answer. For example,
what is the customer's perception of our brand? Or what type of packaging is
more engaging to our potential customers? Once the questions are outlined
you are ready for the next step.

 Collect: As its name suggests, this is the stage where you start collecting the
needed data. Here, you define which sources of data you will use and how
you will use them. The collection of data can come in different forms such
as internal or external sources, surveys, interviews, questionnaires, and focus
groups, among others. An important note here is that the way you collect the
data will be different in a quantitative and qualitative scenario.

 Clean: Once you have the necessary data it is time to clean it and leave it
ready for analysis. Not all the data you collect will be useful; when
collecting big amounts of data in different formats it is very likely that you
will find yourself with duplicate or badly formatted data. To avoid this,
before you start working with your data you need to make sure to erase any
white spaces, duplicate records, or formatting errors. This way you avoid
hurting your analysis with bad-quality data.

 Analyze: With the help of various techniques such as statistical analysis,
regressions, neural networks, text analysis, and more, you can start
analyzing and manipulating your data to extract relevant conclusions. At this
stage, you find trends, correlations, variations, and patterns that can help you
answer the questions you first thought of in the identify stage. Various
technologies in the market assist researchers and average users with the
management of their data. Some of them include business intelligence and
visualization software, predictive analytics, and data mining, among others.

 Interpret
This stage is where the researcher comes up with courses of action based on
the findings. For example, here you would understand if your clients prefer
packaging that is red or green, plastic or paper, etc. Additionally, at this
stage, you can also find some limitations and work on them.

Basic Statistical terminology and concepts used in Data analysis

 Statistical terms

 Ratio

 Proportion

 Percentage

 Rate

 Mean

 Median

Ratio

 Comparison of two numbers expressed as:

 a to b, a per b, a:b

 Used to express such comparisons as clinicians to patients or beds to clients

 Calculation a/b
 Example – In district X, there are 600 nurses and 200 clinics. What is the
ratio of nurses to clinics?

600 / 200 = 3 nurses per clinic, a ratio of 3:1

A ratio is a comparison of two numbers and is expressed as "a to b" or "a
per b." In the health sector, we commonly use ratios to look at the number
of clinicians to patients, or beds to clients.

To calculate a ratio, divide the first item you are looking at by the second.
So, if you were to say that there are 3 staff per clinic, the ratio is expressed
numerically as 3:1. It is not the same as saying 1 to 3 or 1:3. The order of the
numbers matters.

Note the example here, where we see in district X that there are 600 nurses
and 200 clinics. To find the ratio of nurses to clinics we divide 600 by 200
and come up with 3, or 3 nurses per clinic.

Calculating ratios

Now let’s try one together. Let’s say that there are 160 nurses and 40
clinics in the Zvishavane district. Who wants to volunteer to calculate
the nurse-to-clinic ratio?

 In Zvishavane district, there are 160 nurses and 40 clinics

 What is the nurse-to-clinic ratio?

 160 / 40 = 4
4:1 or 4 nurses to 1 clinic

Proportion

A proportion is a ratio in which all individuals included in the numerator
must also be included in the denominator.

We frequently use a proportion to compare part of the whole, such as the
proportion of all clients who stop taking their drugs.

For example: If 20 of 100 clients on treatment stop taking their drugs, what
is the proportion of treatment failures to all treated?

20/100 = 1/5

 A ratio in which all individuals in the numerator are also in the denominator.

 Used to compare part of the whole, such as proportion of all clients who are
less than 15 years old.

 Example: If 20 of 100 clients on treatment are less than 15 years of age,
what is the proportion of young clients in the clinic?

 20/100 = 1/5

Calculating proportions

Let’s try one together. Who wants to volunteer to answer this? If a clinic has
12 female clients and 8 male clients, what is the proportion of male clients?

Add males to females to get the total number of clients. That is, 12+8 = 20,
so you have eight-twentieths that are male. You then reduce this proportion
(dividing top and bottom by 4) to two-fifths. Two out of five clients are male.
 Example: If a clinic has 12 female clients and 8 male clients, then the
proportion of male clients is 8/20, or 2/5

 12+8 = 20

 8/20

 Reduce this (divide top and bottom by 4): 2/5 of clients are male

Percentage

 A way to express a proportion (proportion multiplied by 100)

 Expresses a number in relation to the whole

 Example: Males comprise 2/5 of the clients, or 40% of the clients are male
(0.40 x 100)

 Allows us to express a quantity relative to another quantity. Can compare
different groups, facilities, or countries that may have different denominators
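
To make these three measures concrete, here is a minimal Python sketch (an illustration added to these notes, using only the standard library) that reproduces the worked examples above:

    from fractions import Fraction

    # Ratio: 600 nurses to 200 clinics -> 3 nurses per clinic, i.e. 3:1
    nurses, clinics = 600, 200
    print(nurses / clinics)        # 3.0

    # Proportion: 20 of 100 clients stopped treatment -> 1/5
    print(Fraction(20, 100))       # 1/5 (Fraction reduces automatically)

    # Percentage: 8 of 20 clients are male -> 2/5 -> 40%
    males, total = 8, 20
    print(Fraction(males, total))  # 2/5
    print(males / total * 100)     # 40.0, i.e. 40% of clients are male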

Rate

 Measured with respect to another measured quantity during the same time
period

 Used to express the frequency of specific events in a certain time period
(fertility rate, mortality rate)

 Numerator and denominator must be from same time period

 Often expressed as a ratio (per 1,000)

Infant Mortality Rate


Let’s look specifically at infant mortality rate. The calculation for a mortality
rate is the number of deaths in the population at risk, divided by the
population at risk in the same time period, and then multiplied by 1,000.
Mortality rate is always expressed in units of death per 1,000 individuals
(except for maternal mortality, which is expressed per 100,000 live births).

Example: In 2010, 4,000 infants were born. Of these infants, 75 died during
that year.

So, to calculate this, divide 75 by 4,000 to get 0.01875; multiplying by 1,000 gives 18.75.

The infant mortality rate is nearly 19.

• Calculation

• # of deaths ÷ population at risk in same time period x 1,000

• Example – 75 infants (less than one year) died out of 4,000 infants born that
year

• 75/4,000 = 0.01875; 0.01875 x 1,000 = 18.75

Nearly 19 infants died per 1,000 live births

Calculating mortality rate

Let’s try one together. In 2009, Mabasa clinic had 31,155 patients on ART.
During that same time period 1,536 ART clients died. How many clients (per
1,000 clients on ART) died?

In 2009, Mabasa clinic had 31,155 patients on ART. During that same time
period, 1,536 ART clients died.

1,536 / 31,155 = 0.049; 0.049 × 1,000 = 49

49 clients died (mortality rate) per 1,000 clients on ART
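
As a quick check of the two examples above, here is a minimal Python sketch (an illustration added to these notes) of the mortality-rate calculation:

    def mortality_rate_per_1000(deaths, population_at_risk):
        # Numerator and denominator must come from the same time period
        return deaths / population_at_risk * 1000

    print(mortality_rate_per_1000(75, 4000))            # 18.75 -> infant mortality of nearly 19
    print(round(mortality_rate_per_1000(1536, 31155)))  # 49 deaths per 1,000 clients on ART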

Rate of increase

Now let’s look at the rate of increase. Calculating the rate of increase in
health service delivery can be a helpful way to assess progress. You can look
at the rate of increase for many things, such as the increase in new clients to
your service or the increase in commodities distributed.

For example, Mabasa clinic distributed 200 condoms in January and by
June, they had distributed 1,100. The rate of increase is 1,100 - 200 = 900,
divided by 6 (the number of months) = 150; that is, 150 more condoms were
distributed per month.

 Calculation

 Total amount of increase ÷ time of increase

 Used to calculate monthly, quarterly, or yearly increases in health service
delivery. Example: increase in # of new clients, commodities distributed

 Example: Condom distribution in Jan. = 200; as of June = 1,100. What is the
rate of increase?

 1,100 - 200 = 900/6 = 150 (150 condoms per mo)

Calculating rate of increase


In Mondello family planning clinic, there were 50 new FP users in quarter 1
(January through March) and 75 in quarter 2 (April through June). What was
the rate of increase?

In Q1, there were 50 new FP users, and in Q2 there were 75. What was the
rate of increase from Q1 to Q2?

Example: 75 - 50 = 25; 25/3 = 8.33 new clients per month
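
The same calculation expressed as a small Python helper (an illustrative sketch, not part of the original notes):

    def rate_of_increase(start, end, months):
        # Total increase divided by the time over which it occurred
        return (end - start) / months

    print(rate_of_increase(200, 1100, 6))         # 150.0 condoms per month
    print(round(rate_of_increase(50, 75, 3), 2))  # 8.33 new FP clients per month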

Central tendency

The most commonly investigated characteristic of a collection of data (or
dataset) is its center, or the point around which the observations tend to
cluster. Measures of central tendency measure the middle or center of a
distribution of data.

We will discuss the mean and the median.

Measures of the location of the middle or the center of a distribution of data

 Mean

 Median

Mean

The mean is the most frequently used measure to look at the central values
of a dataset.

The mean takes into consideration the magnitude of every value, which
makes it sensitive to extreme values. If there are data in the dataset with
extreme values – extremely low or high compared to most other values in
the dataset – the mean may not be the most accurate method to use in
assessing the point around which the observations tend to cluster.

Use the mean when the data are normally distributed (symmetric).

To calculate the mean, you add up all your figures and divide by the total
number of figures. Like in the example here.

 Example: (22+18+30+19+37+33) = 159; 159 ÷ 6 = 26.5

 The mean is sensitive to extreme values

Calculating the mean

Below, you see the total number of clients counseled per month from
Jan to June.

You add them together and get 231; then divide by 6 (the number of months)
and you get 38.5 (231÷ 6). So, the average number of clients counseled per
month is 38.5.

 Average number of clients counseled per month

 January: 30

 February: 45

 March: 38

 April: 41

 May: 37

 June: 40
 (30+45+38+41+37+40) = 231÷ 6 = 38.5

 Mean or average = 38.5

Median

The median is another measurement of central tendency but it is not as
sensitive to extreme values as the mean because it takes into consideration
the ordering and relative magnitude of the values. We therefore use the
median when data are not symmetric, or skewed.

If a list of values is ranked from smallest to largest, then half of the values
are greater than or equal to the median and the other half are less than or
equal to it.

When there is an odd number of values, the median is the middle value.

For example, for the first list on the slide (2, 4, 7), the median is 4.

When there is an even number of values, the median is the average of the
two mid-point values.

For example, for the 2nd list (2, 4, 7, 12), you add 4+7 to get 11, and then
divide that by 2 to get 5.5. The median for this list is 5.5.

Remember: with the median, you have to rank (or order) the figures before
you can calculate it.

 The middle of a distribution (when numbers are in order: half of the numbers
are above the median and half are below the median)

 The median is not as sensitive to extreme values as the mean

 Odd number of numbers, median = the middle number


 Median of 2, 4, 7 = 4

 Even number of numbers, median = mean of the two middle numbers

 Median of 2, 4, 7, 12 = (4+7) /2 = 5.5

Calculating the median

Here we have an odd number of clients, so we re-order the numbers
(smallest to largest) and select the middle number = 67.

How about if we have an even number? Suppose Client 1 (value 2) drops off
the list. In this case, we re-order the remaining numbers from smallest to
largest, add the 2 middle figures (67+134), and divide by 2 to get 100.5.

 Client 1 – 2

 Client 2 – 134

 Client 3 – 67

 Client 4 – 10

 Client 5 – 221

 = 67

 = 67+134 = 201/2 = 100.5

Use the mean or median?

We can see that there are a few outliers that may skew the data, so we want
to use the median.
If we rank the values in the table, we get: 9, 11, 92, 92, 95, 100, 100,
101, 104, 206

Since there is an even number of observations, the median is calculated as:
(95+100)/2 = 195/2 = 97.5

We are choosing the 2 middle numbers (95 and 100), adding them together
to get 195, and then dividing by 2.

Client       CD4 count
Client 1         9
Client 2        11
Client 3       100
Client 4        95
Client 5        92
Client 6       206
Client 7       104
Client 8       100
Client 9       101
Client 10       92
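
These results are easy to verify with Python's built-in statistics module. The sketch below (an illustration added to these notes) also shows how the outliers pull the mean away from the median, which is why the median is preferred here:

    import statistics

    cd4 = [9, 11, 100, 95, 92, 206, 104, 100, 101, 92]

    print(statistics.median(cd4))  # 97.5 -> (95 + 100) / 2, robust to outliers
    print(statistics.mean(cd4))    # 91 -> pulled down by the extreme low values 9 and 11
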
Data Analysis Methods and Tools

QUANTITATIVE DATA ANALYSIS

 Quantitative data analysis can be understood as explaining situations by
means of numerical data. In quantitative analysis, we collect numerical data
and analyse it using mathematical methods (in particular statistics). In order
to be able to use mathematical methods, our data has to be in a numerical form.

 As quantitative analysis is about collecting numerical data, the following
four specific phenomena are best suited to quantitative analysis:

 Questions that demand a quantitative answer, such as: 'How many students
choose to study social science at higher education?' Or: 'How many maths
teachers do we need?' Or: 'How many have we got in our school/district?'

 Comparisons of numerical values, for example change before, during, or after
a time period, or numerical characteristics of individuals/social groups, such
as:

a) Are the numbers of students in our university rising or falling?


b) ‘Is learning achievement going up or down?’

 Understanding the state of something, or identifying the factors behind a
situation or a change, e.g., factors which predict the recruitment of maths
teachers. What factors are related to changes in student achievement over
time?

 Studies which need testing of hypotheses - e.g. whether there is a
relationship between a pupil's achievement and their self-esteem and social
background. Looking at the theory, a possible hypothesis to test would be
that a lower social class background leads to low self-esteem, which would
in turn be related to low achievement. Quantitative analysis can test this kind
of model.

 The essence of quantitative analysis is to confirm an assumption or
hypothesis by identifying patterns among a larger sample from a population;
this approach is useful in policymaking and planning.

QUALITATIVE ANALYSIS

 There are as many definitions of qualitative analysis as there are books on the
subject. The key word differentiating qualitative analysis from quantitative
analysis is 'exploration'. Qualitative analysers are interested in exploring an
observed phenomenon to understand the meaning that people have
constructed; that is, how people make sense of their world and the
experiences they have in the world.

 As Johnson and Christensen (2004) state, qualitative analysis involves
working with data that is non-numerical in nature and does not indicate an
order, hierarchy or rank; symbols are an example. Social scientists apply a
form of observational and interpretive sociology; that is, they adopt a point of
view from the individual's perspective to understand beliefs, values or
behaviours by means of participant observation or case studies, which result
in a narrative, descriptive account of a setting or practice.

QUANTITATIVE                         QUALITATIVE
Cluster Analysis                     Text Analysis
Cohort Analysis                      Content Analysis
Regression Analysis                  Thematic Analysis
Neural Networks                      Narrative Analysis
Factor Analysis                      Discourse Analysis
Data Mining                          Grounded Theory Analysis
Time Series Analysis
Decision Trees
Conjoint Analysis
Correspondence Analysis
Multidimensional Scaling (MDS)
(add notes)

Data Analysis Techniques

This stage helps the researcher to dig deeper into how to perform the
analysis, using the following techniques:

1. Collaborate your needs


 This involves sitting down collaboratively with all key stakeholders
within your organization, deciding on your primary campaign or
strategic goals, and gaining a fundamental understanding of the types of
insights that will best benefit your progress or provide you with
the level of vision you need to evolve your organization.

2. Establish your questions

 Once you’ve outlined your core objectives, you should consider which
questions will need answering to help you achieve your mission.
 This is one of the most important techniques as it will shape the very
foundations of your success.
 To help you ask the right things and ensure your data works for you,
you have to ask the right data analysis questions (giving direction to
your data analysis methodology).

3. Data democratization

 After giving your data analytics methodology some real direction, and
knowing which questions need answering to extract optimum value
from the information available to your organization, you should
continue with democratization.
 Data democratization is an action that aims to connect data from
various sources efficiently and quickly so that anyone in your
organization can access it at any given moment.
 You can extract data in text, images, videos, numbers, or any other
format. And then perform cross-database analysis to achieve more
advanced insights to share with the rest of the company interactively.
 Once you have decided on your most valuable sources, you need to
take all of this into a structured format to start collecting your insights.
(How?)
 For example, datapine offers an easy all-in-one data connectors
feature to integrate all your internal and external sources and manage
them at will. Additionally, datapine's end-to-end solution
automatically updates your data, allowing you to save time and focus
on performing the right analysis to grow your company.

4. Think of governance

 When collecting data in a business or research context you always
need to think about security and privacy.
 With data breaches becoming a topic of concern for businesses, the
need to protect your clients' or subjects' sensitive information
becomes critical.
 To ensure that all this is taken care of, you need to think of a data
governance strategy. According to Gartner, this concept refers to “the
specification of decision rights and an accountability framework to
ensure the appropriate behavior in the valuation, creation,
consumption, and control of data and analytics.”
 Thus, data governance is a collection of processes, roles, and policies,
that ensure the efficient use of data while still achieving the main
company goals.
 It ensures that clear roles are in place for who can access the
information and how they can access it. In time, this not only ensures
that sensitive information is protected but also allows for an efficient
analysis as a whole.

5. Clean your data

 After harvesting data from so many sources you will be left with a
vast amount of information that can be overwhelming to deal with.
 This process involves the cleaning of unimportant and incorrect data
that can be misleading to your analysis.
 It will ensure that the insights you extract from it are correct.
 There are many things that you need to look for in the cleaning
process. The most important one is to eliminate any duplicate
observations; these usually appear when using multiple internal and
external sources of information. You can also add any missing codes,
fix empty fields, and eliminate incorrectly formatted data.
 Another usual form of cleaning is done with text data. As we
mentioned earlier, most companies today analyze customer reviews,
social media comments, questionnaires, and several other text inputs.
In order for algorithms to detect patterns, text data needs to be revised
to avoid invalid characters or any syntax or spelling errors.
 Most importantly, the aim of cleaning is to prevent you from arriving
at false conclusions that can damage your company in the long run.
 By using clean data, you will also help Business Intelligence solutions
to interact better with your information and create better reports for
your organization.
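
As an illustration of the cleaning operations described above, here is a minimal sketch using pandas (a tooling choice assumed for this example; the notes do not prescribe a package). The column names and values are hypothetical:

    import pandas as pd

    df = pd.DataFrame({
        "client": ["Ann ", "Ann ", " Ben", "Carla", None],
        "visits": [3, 3, 5, None, 2],
    })

    df["client"] = df["client"].str.strip()  # erase stray white spaces
    df = df.drop_duplicates()                # eliminate duplicate observations
    df["visits"] = df["visits"].fillna(0)    # fix empty fields with a default code
    df = df.dropna(subset=["client"])        # drop records with no usable identifier

    print(df)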

6. Set your Key Performance Indicators (KPIs)

 Once you’ve set your sources, cleaned your data, and established
clear-cut questions you want your insights to answer, you need to set a
host of key performance indicators (KPIs) that will help you track,
measure, and shape your progress in a number of key areas.
 KPIs are critical to both qualitative and quantitative analysis research.
 As an example, the KPIs of a logistics company would look at the
performance indicators of transportation-related costs.
7. Omit useless data

 Explore the raw data you've collected from all sources and use
your KPIs as a reference for chopping out any information you deem
to be useless.
 This will allow you to focus your analytical efforts and squeeze every
drop of value from the remaining ‘lean’ information.
 Any stats, facts, figures, or metrics that don’t align with your business
goals or fit with your KPI management strategies should be eliminated
from the equation.

8. Build a data management roadmap (optional step)


 This is the stage of creating a data governance roadmap that will help
your data analysis methods and techniques become successful on a
more sustainable basis.
 These roadmaps, if developed properly, are also built so they can be
tweaked and scaled over time.

9. Integrate technology

 There are many ways to analyze data, but one of the most vital aspects
of analytical success in a business context is integrating the
right decision support software and technology.
 Robust analysis platforms will not only allow you to pull critical data
from your most valuable sources while working with dynamic KPIs
that will offer you actionable insights.
 They will also present those insights in a digestible, visual, interactive
format from one central, live dashboard. A data methodology you can
count on.
 Integrating the right technology within your data analysis
methodology will help you avoid fragmenting your insights, saving
you time and effort while allowing you to enjoy the maximum value
from your business’s most valuable insights.

10. Answer your questions

 By considering each of the above efforts, you will swiftly start to
answer your most burning business questions.
11. Visualize your data

 Online data visualization is a powerful tool as it lets you tell a story
with your metrics, allowing users across the organization to extract
meaningful insights that aid business evolution - and it covers all the
different ways to analyze data.
 The purpose of analyzing is to make your entire organization more
informed and intelligent.

12. Be careful with the interpretation

 Data interpretation is a fundamental part of the process of data
analysis.
 It gives meaning to the analytical information and aims to draw a
concise conclusion from the analysis results.
 Since most of the time companies are dealing with data from many
different sources, the interpretation stage needs to be done carefully
and properly in order to avoid misinterpretations.
 NB- there are three common practices that you need to avoid at all
costs when looking at your data:
i. Correlation vs. causation: The human brain is formatted to find
patterns. This behavior leads to one of the most common mistakes
when performing interpretation: confusing correlation with causation.
Although these two aspects can exist simultaneously, it is not
correct to assume that because two things happened together, one
provoked the other. A piece of advice to avoid falling into this
mistake is never to trust just intuition, trust the data. If there is no
objective evidence of causation, then always stick to correlation.
ii. Confirmation bias: This phenomenon describes the tendency to
select and interpret only the data necessary to prove one hypothesis,
often ignoring the elements that might disprove it. Even if it's not
done on purpose, confirmation bias can represent a real problem, as
excluding relevant information can lead to false conclusions and,
therefore, bad business decisions. To avoid it, always try to disprove
your hypothesis instead of proving it, share your analysis with other
team members, and avoid drawing any conclusions before the entire
analytical project is finalized.
iii. Statistical significance: To put it in short words, statistical
significance helps analysts understand if a result is actually accurate
or if it happened because of a sampling error or pure chance. The level
of statistical significance needed might depend on the sample size and
the industry being analyzed. In any case, ignoring the significance of a
result when it might influence decision-making can be a huge mistake.
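
To make the idea of statistical significance concrete, the following sketch (an illustration using SciPy, which the notes do not mention) runs a two-sample t-test on hypothetical monthly client counts and inspects the p-value:

    from scipy import stats

    # Hypothetical monthly clients counseled at two clinics
    clinic_a = [30, 45, 38, 41, 37, 40]
    clinic_b = [28, 33, 31, 35, 30, 32]

    t_stat, p_value = stats.ttest_ind(clinic_a, clinic_b)
    print(p_value)

    # 0.05 is a common (context-dependent) threshold: below it, the observed
    # difference is unlikely to be due to sampling error or pure chance alone
    print("significant" if p_value < 0.05 else "not significant")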

13. Build a narrative

 This is the stage where you look at how you can bring all of the discussed
elements together in a way that will benefit your business - starting
with a little something called data storytelling.
 The human brain responds incredibly well to strong stories or
narratives. Once you’ve cleansed, shaped, and visualized your most
invaluable data using various Business Intelligence dashboard tools,
you should strive to tell a story - one with a clear-cut beginning,
middle, and end.
 By doing so, you will make your analytical efforts more accessible,
digestible, and universal, empowering more people within your
organization to use your discoveries to their actionable advantage.

14. Consider autonomous technology

 Autonomous technologies, such as artificial intelligence (AI) and
machine learning (ML), play a significant role in the advancement of
understanding how to analyze data more effectively.
 Gartner (2017) predicts that by the end of 2023 80% of emerging
technologies will be developed with AI foundations.
 These technologies are revolutionizing the analysis industry. Some
examples that we mentioned earlier are neural networks, intelligent
alarms, and sentiment analysis.

15. Share the load

 This is the stage where you need to present your metrics in a digestible,
value-driven format, allowing almost everyone in the organization to
connect with and use relevant data to their advantage.
 Modern dashboards consolidate data from various sources, providing
access to a wealth of insights in one centralized location, no matter if
you need to monitor recruitment metrics or generate reports that need
to be sent across numerous departments.
 These cutting-edge tools offer access to dashboards from a multitude
of devices, meaning that everyone within the business can connect
with practical insights remotely - and share the load.
 Once everyone is able to work with a data-driven mindset, you will
catalyze the success of your business in ways you never thought
possible. And when it comes to knowing how to analyze data, this
kind of collaborative approach is essential.

16. Data analysis tools

 In order to perform high-quality analysis of data, it is fundamental to
use tools and software that will ensure the best results. E.g.:
i. Business Intelligence: BI tools allow you to process significant
amounts of data from several sources in any format. Through this, you
can not only analyze and monitor your data to extract relevant insights
but also create interactive reports and dashboards to visualize your
KPIs and use them for your company's good.
Datapine is an amazing online BI software that is focused on
delivering powerful online analysis features that are accessible to
beginner and advanced users. As such, it offers a full-service solution
that includes cutting-edge analysis of data, KPI visualization, live
dashboards, reporting, and artificial intelligence technologies to
predict trends and minimize risk.
ii. Statistical analysis: These tools are usually designed for scientists,
statisticians, market researchers, and mathematicians, as they allow
them to perform complex statistical analyses with methods like
regression analysis, predictive analysis, and statistical modeling.
A good tool to perform this type of analysis is RStudio, as it offers a
powerful data modeling and hypothesis testing feature that can cover
both academic and general data analysis. This tool is one of the
industry favorites due to its capabilities for data cleaning, data
reduction, and advanced analysis with several statistical methods.
iii. SPSS: Another relevant tool to mention, from IBM. The software
offers advanced statistical analysis for users of all skill levels. Thanks to a
vast library of machine learning algorithms, text analysis, and a hypothesis
testing approach, it can help your company find relevant insights to
drive better decisions. SPSS also works as a cloud service that
enables you to run it anywhere.

iv. SQL Consoles: SQL is a programming language often used to handle
structured data in relational databases. Tools like these are popular
among data scientists as they are extremely effective in unlocking
these databases' value. Undoubtedly, one of the most used SQL
tools in the market is MySQL Workbench. This tool offers several
features such as a visual tool for database modeling and monitoring,
complete SQL optimization, administration tools, and visual
performance dashboards to keep track of KPIs. (A short sketch of
running SQL from Python appears after this list.)
v. Data Visualization: These tools are used to represent your data
through charts, graphs, and maps that allow you to find patterns and
trends in the data.
Datapine's already mentioned BI platform also offers a wealth of
powerful online data visualization tools with several benefits. Some of
them include: delivering compelling data-driven presentations to share
with your entire company, the ability to see your data online with any
device wherever you are, an interactive dashboard design feature that
enables you to showcase your results in an interactive and
understandable way, and to perform online self-service reports that
can be used simultaneously with several other people to enhance team
productivity.
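
As flagged under the SQL Consoles item above, here is a minimal self-contained sketch of handling structured data with SQL. It uses Python's built-in sqlite3 module purely for illustration (not MySQL Workbench), and the table and values are hypothetical:

    import sqlite3

    conn = sqlite3.connect(":memory:")  # throw-away in-memory database
    conn.execute("CREATE TABLE clients (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO clients VALUES (?, ?)",
                     [("Ann", 34), ("Ben", 12), ("Carla", 51)])

    # A simple aggregate query: how many clients are under 15?
    cur = conn.execute("SELECT COUNT(*) FROM clients WHERE age < 15")
    print(cur.fetchone()[0])  # 1
    conn.close()
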
17. Refine your process constantly

 Last is a step that might seem obvious to some people, but it can be
easily ignored if you think you are done. Once you have extracted the
needed results, you should always take a retrospective look at your
project and think about what you can improve. As you saw throughout
this long list of techniques, data analysis is a complex process that
requires constant refinement. For this reason, you should always go
one step further and keep improving.

TOPIC: SAMPLING

SAMPLING

 A sample is “a smaller (but hopefully representative) collection of units from


a population used to determine truths about that population” (Field, 2005)

Why sample?

 Resources (time, money) and workload

 Gives results with known accuracy that can be calculated mathematically


 The sampling frame is the list from which the potential respondents are
drawn

 E.g.: Registrar's office, class rosters

 Must assess sampling frame errors

 Sampling frame errors: university versus personal email addresses; changing
class rosters; are all students in your population of interest represented?

 How do we determine our population of interest?

 Administrators can tell us

 We notice through qualitative research that a particular subgroup of students
is experiencing higher risk

3 factors that influence sample representativeness

 Sampling procedure

 Sample size

 Participation (response)

When might you sample the entire population?

 When your population is very small

 When you have extensive resources

 When you don’t expect a very high response

What is your population of interest?

 To whom do you want to generalize your results? E.g.:


 All doctors

 School children

 Indians

 Women aged 15-45 years

 Other

Sampling simplified
Types of Samples

 Two general approaches to sampling are used in social science research, i.e.:

1) Probability sampling

2) Non-probability sampling

 With probability sampling, all elements (e.g., persons, households) in the
population have some opportunity of being included in the sample, and the
mathematical probability that any one of them will be selected can be
calculated.

 With nonprobability sampling, in contrast, population elements are selected
on the basis of their availability (e.g., because they volunteered) or because
of the researcher's personal judgment that they are representative. The
consequence is that an unknown portion of the population is excluded (e.g.,
those who did not volunteer). One of the most common types of
nonprobability sample is called a convenience sample – not because such
samples are necessarily easy to recruit, but because the researcher uses
whatever individuals are available rather than selecting from the entire
population.

Probability (Random) Samples

 Simple random sample

 Systematic random sample

 Stratified random sample

 Multistage sample

 Multiphase sample

 Cluster sample

Non-Probability Samples

 Convenience sample

 Purposive sample

 Quota

Sampling Process

 The sampling process comprises several stages:

 Defining the population of concern

 Specifying a sampling frame, a set of items or events possible to measure


 Specifying a sampling method for selecting items or events from the frame

 Determining the sample size

 Implementing the sampling plan

 Sampling and data collecting

 Reviewing the sampling process

Population definition

 A population can be defined as including all people or items with the
characteristic one wishes to understand.

 Because there is very rarely enough time or money to gather information
from everyone or everything in a population, the goal becomes finding a
representative sample (or subset) of that population.

 Note also that the population from which the sample is drawn may not be the
same as the population about which we actually want information. Often
there is large but not complete overlap between these two groups due to
frame issues, etc.
 Sometimes they may be entirely separate - for instance, we might study rats
in order to get a better understanding of human health, or we might study
records from people born in 2008 in order to make predictions about people
born in 2009.

SAMPLING FRAME

 In the most straightforward case, such as the sentencing of a batch of
material from production (acceptance sampling by lots), it is possible to
identify and measure every single item in the population and to include any
one of them in our sample. However, in the more general case this is not
possible. There is no way to identify all rats in the set of all rats. Where
voting is not compulsory, there is no way to identify which people will
actually vote at a forthcoming election (in advance of the election).

 As a remedy, we seek a sampling frame which has the property that we can
identify every single element and include any in our sample.

 The sampling frame must be representative of the population

PROBABILITY SAMPLING

 A probability sampling scheme is one in which every unit in the population
has a chance (greater than zero) of being selected in the sample, and this
probability can be accurately determined.

 When every element in the population does have the same probability of
selection, this is known as an 'equal probability of selection' (EPS) design.
Such designs are also referred to as 'self-weighting' because all sampled
units are given the same weight.

Probability sampling includes:

 Simple Random Sampling,

 Systematic Sampling,

 Stratified Random Sampling,

 Cluster Sampling

 Multistage Sampling.

 Multiphase sampling

NON PROBABILITY SAMPLING

 Any sampling method where some elements of the population have no chance of
selection (these are sometimes referred to as 'out of
coverage'/'undercovered'), or where the probability of selection can't be
accurately determined. It involves the selection of elements based on
assumptions regarding the population of interest, which forms the criteria for
selection. Hence, because the selection of elements is nonrandom,
nonprobability sampling does not allow the estimation of sampling errors.

 Example: We visit every household in a given street, and interview the first
person to answer the door. In any household with more than one occupant,
this is a nonprobability sample, because some people are more likely to
answer the door (e.g. an unemployed person who spends most of their time
at home is more likely to answer than an employed housemate who might be
at work when the interviewer calls) and it's not practical to calculate these
probabilities.

 Nonprobability sampling includes: Accidental Sampling, Quota Sampling
and Purposive Sampling. In addition, nonresponse effects may turn any
probability design into a nonprobability design if the characteristics of
nonresponse are not well understood, since nonresponse effectively modifies
each element's probability of being sampled.

SIMPLE RANDOM SAMPLING

• Applicable when population is small, homogeneous & readily available

• All subsets of the frame are given an equal probability. Each element of the
frame thus has an equal probability of selection.

• It provides for the greatest number of possible samples. This is done by
assigning a number to each unit in the sampling frame.

• A table of random numbers or a lottery system is used to determine which
units are to be selected.

 Estimates are easy to calculate.

 Simple random sampling is always an EPS design, but not all EPS designs
are simple random sampling.

 Disadvantages

 If the sampling frame is large, this method is impracticable.


 Minority subgroups of interest in population may not be present in sample in
sufficient numbers for study.
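
A minimal Python sketch of simple random sampling (an illustration added to these notes; a table of random numbers or a lottery system achieves the same thing). Every unit in the frame has an equal chance of selection:

    import random

    frame = list(range(1, 101))        # sampling frame: units numbered 1-100
    sample = random.sample(frame, 10)  # 10 units drawn without replacement
    print(sorted(sample))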

SYSTEMATIC SAMPLING

 Systematic sampling relies on arranging the target population according to
some ordering scheme and then selecting elements at regular intervals
through that ordered list.

 Systematic sampling involves a random start and then proceeds with the
selection of every kth element from then onwards. In this case,
k=(population size/sample size).

 It is important that the starting point is not automatically the first in the list,
but is instead randomly chosen from within the first to the kth element in the
list.

 A simple example would be to select every 10th name from the telephone
directory (an 'every 10th' sample, also referred to as 'sampling with a skip of
10').

 ADVANTAGES:

 Sample easy to select

 Suitable sampling frame can be identified easily

 Sample evenly spread over entire reference population

 DISADVANTAGES:
 Sample may be biased if hidden periodicity in population coincides with that
of selection.

 Difficult to assess precision of estimate from one survey.
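
A minimal Python sketch of systematic sampling (an illustration added to these notes), with a random start followed by every kth element:

    import random

    def systematic_sample(frame, sample_size):
        k = len(frame) // sample_size         # k = population size / sample size
        start = random.randint(0, k - 1)      # random start within the first k elements
        return frame[start::k][:sample_size]  # then every kth element onwards

    frame = list(range(1, 101))               # units numbered 1-100
    print(systematic_sample(frame, 10))       # a 'sampling with a skip of 10'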

STRATIFIED SAMPLING

 Where the population embraces a number of distinct categories, the frame can
be organized into separate "strata." Each stratum is then sampled as an
independent sub-population, out of which individual elements can be
randomly selected.

 Every unit in a stratum has same chance of being selected.

 Using the same sampling fraction for all strata ensures proportionate
representation in the sample.

 Adequate representation of minority subgroups of interest can be ensured by
stratification and varying the sampling fraction between strata as required.
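
A minimal Python sketch of stratified sampling (an illustration; the strata and sampling fraction are hypothetical). Each stratum is sampled independently with the same fraction, giving proportionate representation:

    import random

    strata = {
        "female": [f"F{i}" for i in range(1, 61)],  # 60 units
        "male":   [f"M{i}" for i in range(1, 41)],  # 40 units
    }

    fraction = 0.10  # same sampling fraction in every stratum
    sample = []
    for name, units in strata.items():
        n = round(len(units) * fraction)        # 6 females, 4 males
        sample.extend(random.sample(units, n))  # independent SRS within each stratum

    print(sample)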

CLUSTER SAMPLING

 Cluster sampling is an example of 'two-stage sampling'.

 First stage a sample of areas is chosen;

 Second stage a sample of respondents within those areas is selected.

 Population divided into clusters of homogeneous units, usually based on
geographical contiguity.
 Sampling units are groups rather than individuals.

 A sample of such clusters is then selected.

 All units from the selected clusters are studied.

Advantages:

 Cuts down on the cost of preparing a sampling frame.

 This can reduce travel and other administrative costs.

 Disadvantages: sampling error is higher than for a simple random sample of
the same size.

 Often used to evaluate vaccination coverage in the EPI (Expanded
Programme on Immunization)
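
A minimal Python sketch of the two-stage cluster design described above (an illustration; the clusters and households are hypothetical):

    import random

    # Population divided into geographically contiguous clusters (villages)
    clusters = {v: [f"{v}-hh{i}" for i in range(1, 6)]
                for v in ["A", "B", "C", "D", "E", "F"]}

    chosen = random.sample(list(clusters), 2)  # first stage: a sample of areas

    # Second stage (simplest form): study ALL units in the selected clusters
    sample = [unit for area in chosen for unit in clusters[area]]
    print(chosen, sample)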

Difference between Strata and Clusters

 Although strata and clusters are both non-overlapping subsets of the
population, they differ in several ways.

 All strata are represented in the sample; but only a subset of clusters are in
the sample.

 With stratified sampling, the best survey results occur when elements within
strata are internally homogeneous. However, with cluster sampling, the best
results occur when elements within clusters are internally heterogeneous

QUOTA SAMPLING
 The population is first segmented into mutually exclusive sub-groups, just as
in stratified sampling.

 Then judgment used to select subjects or units from each segment based on a
specified proportion.

 For example, an interviewer may be told to sample 200 females and 300
males between the age of 45 and 60.

 It is this second step which makes the technique one of non-probability
sampling.

 In quota sampling the selection of the sample is non-random.

 For example, interviewers might be tempted to interview those who look
most helpful. The problem is that these samples may be biased because not
everyone gets a chance of selection. This non-random element is its greatest
weakness, and quota versus probability sampling has been a matter of
controversy for many years

CONVENIENCE SAMPLING

 Sometimes known as grab or opportunity sampling, or accidental or
haphazard sampling.

 A type of nonprobability sampling which involves the sample being drawn
from that part of the population which is close to hand; that is, readily
available and convenient.
 The researcher using such a sample cannot scientifically make
generalizations about the total population from this sample because it would
not be representative enough.

 For example, if the interviewer were to conduct a survey at a shopping center
early in the morning on a given day, the people that he/she could interview
would be limited to those present there at that given time. This would not
represent the views of other members of society in such an area, as it would
if the survey were conducted at different times of day and several times per
week.

 This type of sampling is most useful for pilot testing.

 In social science research, snowball sampling is a similar technique, where
existing study subjects are used to recruit more subjects into the sample.

Judgmental sampling or Purposive sampling

 The researcher chooses the sample based on who they think would be
appropriate for the study. This is used primarily when there is a limited
number of people that have expertise in the area being researched.
Quality Criteria for Data Analysis (Quantitative Research)

 Focuses on measuring the quality and validity of your results.


 This is done with the help of the following scientific quality criteria:
i. Internal validity
- The results of a survey are internally valid if they measure what
they are supposed to measure and thus provide credible results.
- Internal validity measures the trustworthiness of the results and
how they can be affected by factors such as the research design,
operational definitions, how the variables are measured, and more.
E.g., imagine you are conducting an interview to ask people if
they brush their teeth twice a day. While most of them will
answer yes, you may still notice that their answers correspond to what
is socially acceptable, which is to brush your teeth at least twice a
day. In this case, you can't be 100% sure if respondents actually
brush their teeth twice a day or if they just say that they do; therefore,
the internal validity of this interview is very low.

ii. External validity:

- Essentially, external validity refers to the extent to which the
results of your research can be applied to a broader context.
- It basically aims to prove that the findings of a study can be
applied in the real world.
- If the research can be applied to other settings, individuals, and
times, then the external validity is high.

iii. Reliability:
- If your research is reliable, it means that it can be reproduced.
- If your measurement were repeated under the same conditions, it
would produce similar results. This means that your measuring
instrument consistently produces reliable results.
E.g., imagine a doctor building a symptoms
questionnaire to detect a specific disease in a patient. Then, various
other doctors use this questionnaire but end up diagnosing the
same patient with a different condition. This means the
questionnaire is not reliable in detecting the initial disease.
- Another important note here is that in order for your research to be
reliable, it also needs to be objective. If the results of a study are
the same, independent of who assesses them or interprets them,
the study can be considered reliable.

iv. Objectivity:

- In data science, objectivity means that the researcher needs to stay
fully objective when it comes to the analysis.
- The results of a study need to be determined by objective criteria and
not by the beliefs, personality, or values of the researcher.
- Objectivity needs to be ensured when you are gathering the data.
E.g., when interviewing individuals, the questions need to be asked
in a way that doesn't influence the results. Paired with this,
objectivity also needs to be thought of when interpreting the data.
If different researchers reach the same conclusions, then the study
is objective.
NB The discussed quality criteria cover mostly potential influences in a
quantitative context. Analysis in qualitative research has by
default additional subjective influences that must be controlled in
a different way. Therefore, there are other quality criteria for this
kind of research such as credibility, transferability, dependability, and
confirmability.
Quality criteria for measuring qualitative research
i) Credibility - confidence in the truth of the findings (the qualitative counterpart of internal validity).
ii) Transferability - the extent to which the findings can be applied in other contexts (counterpart of external validity).
iii) Confirmability - the degree to which the findings are shaped by the respondents rather than by researcher bias (counterpart of objectivity).
iv) Coherence - the fit between the research question, the methods used, and the conclusions drawn.
v) Transparency - clear documentation of how the data were collected, coded, and interpreted.
vi) Saturation - data collection and analysis continue until no new themes or insights emerge.
vii) Dependability - the consistency and repeatability of the findings (counterpart of reliability).
Data Analysis Limitations & Barriers
 Analyzing data is not an easy task, since there are many steps and techniques that you need to apply in order to extract useful information from your research. While a well-performed analysis can bring various benefits to your organization, it does not come without limitations.
Limitations
 Lack of clear goals:
- No matter how good your data or analysis might be, if you don't have clear goals or a hypothesis the process might be worthless. While we mentioned some methods that don't require a predefined hypothesis, it is always better to enter the analytical process with clear guidelines about what you expect to get out of it, especially in a business context in which data is used to support important strategic decisions.
 Objectivity:
- Arguably one of the biggest barriers when it comes to data analysis in research is staying objective. When trying to prove a hypothesis, researchers might find themselves, intentionally or unintentionally, directing the results toward an outcome that they want. To avoid this, always question your assumptions and avoid confusing facts with opinions. You can also show your findings to a research partner or external person to confirm that your results are objective.
 Data representation:
- A fundamental part of the analytical procedure is the way you represent your data. You can use various graphs and charts to represent your findings, but not all of them will work for all purposes. Choosing the wrong visual can not only damage your analysis but also mislead your audience; therefore, it is important to understand which chart type to use for each analytical goal, as the sketch below illustrates.
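As a small illustration of matching the chart to the purpose, the Python sketch below (using matplotlib, with figures invented purely for illustration) draws a line chart for a trend over time and a bar chart for comparing categories:

    import matplotlib.pyplot as plt

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
    revenue = [12, 14, 13, 17, 19, 22]                        # a trend -> line chart
    channels = {"Retail": 45, "Online": 35, "Wholesale": 20}  # categories -> bar chart

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))
    ax1.plot(months, revenue, marker="o")      # lines emphasise change over time
    ax1.set_title("Revenue trend (line)")
    ax2.bar(list(channels.keys()), list(channels.values()))  # bars compare categories
    ax2.set_title("Sales by channel (bar)")
    plt.tight_layout()
    plt.show()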
 Flawed correlation:
- Misleading statistics can significantly damage your research. Flawed correlations occur when two variables appear related to each other but are not. Confusing correlation with causation can lead to a wrong interpretation of results, which in turn can lead to building the wrong strategies and wasting resources; therefore, it is very important to identify and avoid these interpretation mistakes. A small demonstration follows below.
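The Python sketch below is a minimal demonstration of a flawed correlation: both simulated variables are driven by a third factor (temperature), so they correlate strongly even though neither causes the other. All numbers are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(42)
    temperature = np.linspace(10, 35, 100)             # the hidden common driver
    ice_cream_sales = 20 * temperature + rng.normal(0, 30, size=100)
    drownings = 0.5 * temperature + rng.normal(0, 2, size=100)

    r = np.corrcoef(ice_cream_sales, drownings)[0, 1]
    print(f"correlation: {r:.2f}")
    # Strongly positive, yet ice cream does not cause drownings: correlation != causation.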
 Sample size:
- A very common barrier to a reliable and efficient analysis process is the sample size. In order for the results to be trustworthy, the sample needs to be representative of what you are analyzing.
Eg: Imagine you have a company of 1,000 employees and you ask 40 of them "Do you like working here?", of which 38 say yes, which means 95%. Now imagine you ask the same question to all 1,000 employees and 950 say yes, which also means 95%. Saying that 95% of employees like working in the company when the sample size was only 40 is not a representative or trustworthy conclusion. The results are far more accurate when a bigger sample is surveyed, as the margin-of-error sketch below shows.
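A rough way to see why the smaller sample is less trustworthy is to compute the margin of error around the observed proportion. The Python sketch below uses the standard normal-approximation formula and, for simplicity, ignores the finite-population correction; the numbers mirror the example above.

    import math

    def margin_of_error(p, n, z=1.96):
        """Approximate 95% margin of error for a proportion p from a sample of size n."""
        return z * math.sqrt(p * (1 - p) / n)

    for n in (40, 1000):
        moe = margin_of_error(0.95, n)
        print(f"n = {n:4d}: 95% of employees, +/- {moe * 100:.1f} percentage points")
    # n = 40 gives roughly +/- 6.8 points; n = 1000 gives roughly +/- 1.4 points.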
 Privacy concerns:
- In some cases, data collection is subject to privacy regulations. Businesses gather all kinds of information from their customers, from purchasing behaviors to addresses and phone numbers. If this falls into the wrong hands due to a breach, it can affect the security and confidentiality of your clients. To avoid this issue, collect only the data that is needed for your research and, if you are using sensitive facts, anonymize them so customers are protected (a simple approach is sketched below). The misuse of customer data can severely damage a business's reputation, so it is important to keep an eye on privacy.
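One simple way to anonymize direct identifiers before analysis is to replace them with a salted one-way hash, as in the minimal Python sketch below. The salt value and e-mail addresses are invented; a real project would also need to follow the applicable privacy regulations.

    import hashlib

    def pseudonymise(identifier, salt):
        """Replace a direct identifier with a salted, irreversible hash."""
        return hashlib.sha256((salt + identifier).encode()).hexdigest()[:12]

    salt = "keep-this-secret"   # hypothetical salt, stored apart from the shared dataset
    for email in ["[email protected]", "[email protected]"]:
        print(pseudonymise(email, salt))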
 Lack of communication between teams:
- When it comes to performing data analysis at a business level, it is very likely that each department and team will have different goals and strategies. However, they are all working toward the same common goal of helping the business run smoothly and keep growing. When teams are not connected and communicating with each other, it can directly affect the way general strategies are built. To avoid these issues, tools such as data dashboards enable teams to stay connected through data in a visually appealing way.
 Innumeracy:
- Businesses are working with data more and more every day. While there are many BI tools available to perform effective analysis, data literacy is still a constant barrier: not all employees know how to apply analysis techniques or extract insights from them. To prevent this, you can implement training opportunities that prepare every relevant user to deal with data.
Data Analysis Skills
 As you have learned throughout these notes, analyzing data is a complex task that requires a lot of knowledge and skills. Below are some key skills that are valuable to have when working with data.
i. Critical and statistical thinking:
- To successfully analyze data, one needs to be creative and think outside the box.
- A great level of critical thinking is required to uncover connections, come up with a valuable hypothesis, and extract conclusions that go a step beyond the surface. This, of course, needs to be complemented by statistical thinking and an understanding of numbers.
ii. Data cleaning:
- The cleaning and preparation process accounts for up to 80% of a data analyst's work.
- Not cleaning the data adequately can significantly damage the analysis, which can lead to poor decision-making in a business scenario. A minimal cleaning sketch follows below.
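A minimal cleaning sketch in Python with pandas is shown below; the survey records and cleaning rules are invented purely for illustration.

    import pandas as pd

    # Hypothetical raw survey export with typical quality problems
    raw = pd.DataFrame({
        "respondent": [1, 2, 2, 3, 4],
        "age": [34.0, 29.0, 29.0, None, 151.0],   # one missing value, one impossible age
        "city": [" Harare", "Bulawayo", "Bulawayo", "harare", "Gweru"],
    })

    clean = raw.drop_duplicates(subset="respondent").copy()    # remove duplicate submissions
    clean["city"] = clean["city"].str.strip().str.title()      # standardise text values
    clean = clean[clean["age"].between(0, 120) | clean["age"].isna()].copy()  # drop impossible ages
    clean["age"] = clean["age"].fillna(clean["age"].median())  # impute the missing age
    print(clean)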
iii. Data visualization:
- Visuals make information easier to understand and analyze, not only for professional users but especially for non-technical ones. Having the skills not only to choose the right chart type but also to know when to apply it correctly is key (see the chart-type sketch under Data representation above). This also means being able to design visually compelling charts that make the data exploration process more efficient.
iv. SQL (Structured Query Language):
- SQL is a programming language used to communicate with databases. It is fundamental knowledge, as it enables you to query, update, and organize data in relational databases, which are the most common databases used by companies. It is fairly easy to learn and one of the most valuable skills when it comes to data analysis; a small example follows below.
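As a small self-contained illustration, the sketch below uses Python's built-in sqlite3 module to run a typical analyst's SQL query (aggregate, filter groups, sort) against an in-memory table; the table and figures are invented for illustration.

    import sqlite3

    # An in-memory SQLite database stands in for a company's relational database
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    con.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("North", 120.0), ("South", 80.0), ("North", 200.0), ("East", 50.0)],
    )

    query = """
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
        HAVING SUM(amount) > 60
        ORDER BY total DESC
    """
    for region, total in con.execute(query):
        print(region, total)    # North 320.0, then South 80.0
    con.close()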
v. Communication skills:
- This is a skill that is especially valuable in a business
environment. Being able to clearly communicate analytical
outcomes to colleagues is incredibly important, especially when
the information you are trying to convey is complex for non-
technical people. This applies to in-person communication as well
as written format, for example, when generating a dashboard or
report. While this might be considered a “soft” skill compared to
the other ones we mentioned, it should not be ignored as you most
likely will need to share analytical findings with others no matter
the context.
ETHICS IN DATA ANALYSIS IN BUSINESS RESEARCH
 The analysis and reporting of findings must always avoid exaggeration or plain misrepresentation of data. Analysis and reporting must be executed with honesty and rigour, as otherwise wrong information leads to counterproductive planning.
 Reports should always draw attention to the limitations of the analysed results with regard to their reliability and applicability.
 The significance of results must not be exaggerated, nor
misrepresented.
 Data must not be fabricated, falsified nor intentionally misrepresented
to allow for concluding with desired recommendations.
 Analysis must be reported fully without omission of significant data,
disclosing details of undergirding analytical methods which might
bear upon interpretations of any findings.
 Concepts, procedures and results must be presented in sufficient detail
to allow others to understand and interpret the present information
equally.
 The misuse of findings and misunderstanding of their scope and
limitations must be acted upon.
Communicating the Data Analysis Results
 All the effort of data collection and analysis is meant to support management and leadership in making good decisions on planning, policymaking and resource allocation. Openly structured and well-communicated data makes authorities and stakeholders accountable for effective management of their sector. It is important that the data produced is understood by different audiences at different levels. Therefore, data communication is always a very important step following the data analysis process.
THE END
