
Detailed Data Analytics Notes

The document provides an overview of Data Analytics, detailing its importance in extracting insights from both big and small data, and categorizing analytics into descriptive, diagnostic, predictive, and prescriptive types. It also covers descriptive statistics, including scales, univariate analysis, and data visualization techniques, as well as bivariate analysis methods for exploring relationships between variables. Case studies and historical methodologies highlight the evolution and application of data analytics in various fields.

Uploaded by

ayesha
Copyright
© All Rights Reserved

Detailed Notes on Data Analytics and Descriptive Statistics

1. Introduction to Data Analytics


Data Analytics refers to the systematic process of examining datasets to draw meaningful
insights, identify patterns, and support decision-making. It is a foundation of Data Science,
which integrates statistics, computer science, and domain knowledge.

1.1 Big Data and Data Science


Big Data: Refers to datasets so large and complex that traditional data processing tools
cannot handle them efficiently. They are often described by the 5 V’s:
- Volume: Massive amounts of data.
- Velocity: Speed at which data is generated and processed.
- Variety: Different forms (structured, unstructured, semi-structured).
- Veracity: Data reliability and accuracy.
- Value: The potential benefit derived from the data.

Example: Social media platforms generating billions of user interactions daily.

Data Science: A multidisciplinary field combining mathematics, statistics, machine learning,
and programming to extract insights from both big and small data.
Example: Netflix uses Data Science to recommend movies to users based on past viewing
behavior.

1.2 Small Data


Small Data refers to datasets that are small enough to be analyzed by a single computer
using traditional statistical methods. Although smaller in scale, they can be highly
meaningful.

Example: A teacher analyzing the exam scores of 50 students to find class performance
trends.

1.3 Taxonomy of Data Analytics


- Descriptive Analytics: Summarizes historical data. Example: Monthly sales reports.
- Diagnostic Analytics: Explains why something happened. Example: Identifying the causes of
a drop in sales.
- Predictive Analytics: Uses historical data and models to predict outcomes. Example:
Predicting customer churn.
- Prescriptive Analytics: Recommends optimal actions. Example: Suggesting the best pricing
strategy.

1.4 Examples of Data Use


- Healthcare: Predicting disease risks based on medical history.
- Finance: Detecting fraudulent transactions.
- Retail: Personalizing shopping experiences through recommendation systems.
- Manufacturing: Predictive maintenance of machinery.

1.5 Case Studies


Breast Cancer in Wisconsin: A dataset containing features of cell nuclei, used in machine
learning classification tasks to distinguish between benign and malignant tumors.

Polish Company Insolvency Data: Contains financial ratios used to predict whether
companies will go bankrupt.

1.6 A Little History on Methodologies


- 1960s–1980s: Focus on descriptive statistics and hypothesis testing.
- 1990s: Growth of data mining and knowledge discovery.
- 2000s: Emergence of Big Data and machine learning applications.
- Today: Integration of AI and deep learning in Data Analytics.

2. Descriptive Statistics

2.1 Scale Types


- Nominal: Categories without any order. Example: Blood type (A, B, AB, O).
- Ordinal: Ordered categories. Example: Customer satisfaction ratings (Poor, Fair, Good,
Excellent).
- Interval: Numeric data with equal spacing between values but no true zero, so differences
are meaningful and ratios are not. Example: Temperature in Celsius.
- Ratio: Numeric data with a true zero. Example: Height, weight, income.
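
The scale type determines which summaries are meaningful. A minimal sketch of that idea follows, using only the Python standard library. All sample values and variable names are hypothetical and chosen for illustration.

```python
from statistics import mean, mode

# Hypothetical sample values, one list per scale type.
blood_types = ["A", "O", "B", "O", "AB", "O"]             # nominal
ratings = ["Poor", "Good", "Fair", "Excellent", "Good"]    # ordinal
temps_c = [18.0, 21.5, 19.0, 23.5]                         # interval
heights_cm = [160.0, 172.0, 168.0, 180.0]                  # ratio

# Nominal: categories can only be counted, so the mode is the usable summary.
nominal_mode = mode(blood_types)

# Ordinal: map categories to their position in the order, then take the middle one.
order = ["Poor", "Fair", "Good", "Excellent"]
positions = sorted(order.index(r) for r in ratings)
ordinal_median = order[positions[len(positions) // 2]]  # odd-length sample

# Interval: differences and means are meaningful, ratios are not.
interval_mean = mean(temps_c)

# Ratio: a true zero makes "x times as large" a valid statement.
ratio_max_to_min = max(heights_cm) / min(heights_cm)

print(nominal_mode, ordinal_median, interval_mean, ratio_max_to_min)
```

Each summary is also valid for the scales listed after it: a mode can be taken for any scale, while a ratio of two values only makes sense on the ratio scale.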

2.2 Descriptive Univariate Analysis


Analyzes one variable at a time. Includes:
- Central Tendency: Mean, Median, Mode.
- Dispersion: Range, Variance, Standard Deviation.
- Distribution Shape: Skewness, Kurtosis.

Example: Analyzing the average salary of employees in a company.
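
The measures listed above can be computed as in the following sketch. The salary figures are hypothetical, and the population formulas (dividing by n) are an assumption made here; a sample-based analysis would divide by n - 1 instead.

```python
from statistics import mean, median, mode, pstdev, pvariance

# Hypothetical salaries in thousands; not taken from any real company.
salaries = [30, 35, 35, 40, 60]
n = len(salaries)

# Central tendency
avg = mean(salaries)
mid = median(salaries)
most_common = mode(salaries)

# Dispersion (population formulas, dividing by n)
value_range = max(salaries) - min(salaries)
variance = pvariance(salaries)
std_dev = pstdev(salaries)

# Distribution shape from central moments: m_k = mean((x - avg) ** k)
m2 = sum((x - avg) ** 2 for x in salaries) / n
m3 = sum((x - avg) ** 3 for x in salaries) / n
m4 = sum((x - avg) ** 4 for x in salaries) / n
skewness = m3 / m2 ** 1.5   # positive values indicate a longer right tail
kurtosis = m4 / m2 ** 2     # equals 3 for a normal distribution

print(f"mean={avg} median={mid} mode={most_common}")
print(f"range={value_range} variance={variance} std={std_dev:.2f}")
print(f"skewness={skewness:.2f} kurtosis={kurtosis:.2f}")
```

In this sample the single high salary pulls the mean (40) above the median (35), which shows up as positive skewness.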

2.3 Univariate Frequencies


Frequency analysis shows how often each value or category appears, either as counts
(absolute frequencies) or as shares of the total (relative frequencies).
Example: In a survey of favorite fruits, if 40% choose Mango, 30% Apple, 20% Banana, and
10% Grapes, these percentages form the relative frequency distribution.
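
A frequency table for the fruit survey can be sketched as follows. The ten responses are hypothetical and constructed only so that the shares match the percentages in the example.

```python
from collections import Counter

# Hypothetical responses built to reproduce the 40/30/20/10 split.
responses = ["Mango"] * 4 + ["Apple"] * 3 + ["Banana"] * 2 + ["Grapes"] * 1

counts = Counter(responses)            # absolute frequencies
total = sum(counts.values())
relative = {fruit: c / total for fruit, c in counts.items()}  # relative frequencies

# Print the table from most to least frequent.
for fruit, c in counts.most_common():
    print(f"{fruit:<8}{c:>3}{relative[fruit]:>7.0%}")
```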

2.4 Common Univariate Probability Distributions


- Normal Distribution: Bell-shaped curve, common in natural phenomena.
- Binomial Distribution: Number of successes in a fixed number of independent binary
(success/failure) trials.
- Poisson Distribution: Counts of events in a fixed interval. Example: Number of customer
arrivals per hour.
- Exponential Distribution: Time between consecutive events that occur at a constant
average rate.
- Uniform Distribution: All outcomes equally likely.
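
The standard formulas behind four of these distributions can be written as small functions. This is a sketch using only the `math` module; the function names and the example parameters are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval with an average of lam events)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def exponential_cdf(t, rate):
    """P(waiting time until the next event <= t) for events arriving at `rate`."""
    return 1 - math.exp(-rate * t)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and std deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical questions these functions can answer:
print(binomial_pmf(2, 4, 0.5))   # two heads in four fair coin flips
print(poisson_pmf(0, 3))         # no customers in an hour that averages three
print(exponential_cdf(0.5, 3))   # next customer arrives within half an hour
print(normal_pdf(0))             # height of the standard bell curve at its centre
```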

3. Data Visualization

Data visualization helps represent data graphically for better understanding.
Examples:
- Histogram: Shows how often values fall into each interval (bin).
- Box Plot: Shows distribution and outliers.
- Scatter Plot: Shows relationships between two quantitative attributes.
- Heatmap: Visualizes correlations or intensity of data.

Example: A scatter plot showing the relationship between advertising spend and sales
revenue.
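
To make the box plot concrete, the sketch below computes the numbers such a plot draws: the quartiles, the whisker ends, and the outliers. The data are hypothetical, and the 1.5 × IQR outlier rule and the "inclusive" quartile method are common conventions assumed here, not the only possible choices.

```python
from statistics import quantiles

# Hypothetical measurements with one unusually large value.
data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 30])

# Quartiles define the box: Q1, the median, and Q3.
q1, med, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Points more than 1.5 * IQR outside the box are drawn as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Whiskers extend to the most extreme points that are not outliers.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))

print(f"Q1={q1} median={med} Q3={q3} whiskers={whiskers} outliers={outliers}")
```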

4. Descriptive Bivariate Analysis


Bivariate analysis explores the relationship between two variables.

4.1 Two Quantitative Attributes


Methods: Scatter plots, correlation coefficients, regression analysis.
Example: Relationship between hours studied and exam scores.
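
The correlation coefficient and a simple regression line for this kind of data can be sketched as follows. The hours and scores are hypothetical, and Pearson's r with an ordinary least-squares line is assumed as the specific choice of method.

```python
import math

# Hypothetical data: hours studied and the resulting exam scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sums of squared deviations and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))

# Pearson correlation: strength and direction of the linear relationship.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line: score = intercept + slope * hours.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"r={r:.3f} slope={slope:.2f} intercept={intercept:.2f}")
```

Here the slope reads as the expected change in score per additional hour studied, within the range of the observed data.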

4.2 Two Qualitative Attributes


Methods: Cross-tabulation (contingency tables), Chi-square test of independence.
Example: Relationship between gender (male/female) and preference for a product
(yes/no).
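
A cross-tabulation and the chi-square statistic computed from it might look like the sketch below. The 100 observations are hypothetical, and only the test statistic and degrees of freedom are computed; turning them into a p-value needs a chi-square table or a statistics library.

```python
from collections import Counter

# Hypothetical survey: (gender, prefers product) for 100 respondents.
pairs = (
    [("male", "yes")] * 30 + [("male", "no")] * 20
    + [("female", "yes")] * 20 + [("female", "no")] * 30
)

# Cross-tabulation: count of each (row, column) combination.
table = Counter(pairs)
rows = sorted({g for g, _ in table})
cols = sorted({p for _, p in table})
n = sum(table.values())

row_total = {r: sum(table[(r, c)] for c in cols) for r in rows}
col_total = {c: sum(table[(r, c)] for r in rows) for c in cols}

# Chi-square: compare observed counts with those expected under independence.
chi2 = 0.0
for r in rows:
    for c in cols:
        expected = row_total[r] * col_total[c] / n
        chi2 += (table[(r, c)] - expected) ** 2 / expected

dof = (len(rows) - 1) * (len(cols) - 1)
print(f"chi2={chi2:.2f} dof={dof}")
```

With these made-up counts the statistic is 4.0 on 1 degree of freedom, slightly above the 5% critical value of 3.841, so independence would be rejected at that level.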

4.3 At Least One Nominal Attribute


Methods: Bar charts, stacked bar charts, ANOVA for comparing group means when the other
attribute is quantitative.
Example: Comparing average income levels across different job categories.
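
The group comparison can be sketched as a one-way ANOVA F statistic computed by hand. The job categories and incomes are hypothetical, and the sketch stops at the F value without computing a p-value.

```python
# Hypothetical incomes (in thousands) grouped by job category.
groups = {
    "clerical": [30, 32, 34],
    "technical": [40, 42, 44],
    "managerial": [50, 52, 54],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)
group_means = {name: sum(v) / len(v) for name, v in groups.items()}

# Between-group variation: how far each group mean is from the grand mean.
ss_between = sum(
    len(v) * (group_means[name] - grand_mean) ** 2 for name, v in groups.items()
)
# Within-group variation: spread of the values around their own group mean.
ss_within = sum(
    (x - group_means[name]) ** 2 for name, v in groups.items() for x in v
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of the two mean squares; large values suggest the means differ.
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(group_means, f"F={f_stat:.1f}")
```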

4.4 Two Ordinal Attributes


Methods: Spearman’s rank correlation, Kendall’s tau.
Example: Relationship between education level (high school, graduate, postgraduate) and
job satisfaction rating (low, medium, high).
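
Both rank-based measures can be sketched in plain Python. The six respondents are hypothetical. Because ordinal data usually contain ties, the sketch assumes average ranks for Spearman's coefficient and the tau-b variant of Kendall's tau.

```python
import math
from itertools import combinations

edu_order = ["high school", "graduate", "postgraduate"]
sat_order = ["low", "medium", "high"]

# Hypothetical respondents: (education level, job satisfaction).
people = [
    ("high school", "low"), ("high school", "medium"),
    ("graduate", "medium"), ("graduate", "high"),
    ("postgraduate", "high"), ("postgraduate", "medium"),
]
x = [edu_order.index(e) for e, _ in people]
y = [sat_order.index(s) for _, s in people]

def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of their positions."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

# Spearman's rho: Pearson correlation applied to the ranks.
spearman_rho = pearson(average_ranks(x), average_ranks(y))

# Kendall's tau-b: compare every pair of respondents.
concordant = discordant = ties_x = ties_y = 0
for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
    ties_x += x1 == x2
    ties_y += y1 == y2
    product = (x1 - x2) * (y1 - y2)
    concordant += product > 0
    discordant += product < 0

n_pairs = len(x) * (len(x) - 1) // 2
tau_b = (concordant - discordant) / math.sqrt(
    (n_pairs - ties_x) * (n_pairs - ties_y)
)
print(f"rho={spearman_rho:.3f} tau_b={tau_b:.3f}")
```

Both coefficients come out positive for this made-up sample, meaning higher education levels tend to go with higher satisfaction ratings.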
