Unit 1
INTRODUCTION TO DATA ANALYTICS
EVOLUTION OF DATA ANALYTICS
Data analytics has developed significantly over time, driven by the growth of technology, computing
power, and the increasing importance of data in decision-making. Its evolution can be understood in
different phases:
1. Early Stages: Manual Data Analysis (Before 1960s)
• Data was collected and analyzed manually using paper records, charts, and calculators.
• Statistics and mathematical models formed the foundation.
• Mainly descriptive analysis: summarizing and reporting historical data.
• Example: Census records, sales reports, bookkeeping.
2. The Database Era (1960s – 1980s)
• Introduction of computers and databases revolutionized data storage and retrieval.
• Development of Relational Database Management Systems (RDBMS) (e.g., Oracle, IBM
DB2).
• SQL (Structured Query Language) enabled structured data queries.
• Focus on structured data from business transactions.
• Analytics was mostly reporting and basic trend analysis.
3. Business Intelligence (1990s – 2000s)
• Emergence of Business Intelligence (BI) tools (e.g., Cognos, Tableau, SAP BI).
• Enabled dashboards, KPIs, and visualization for better business decision-making.
• Data warehouses integrated data from multiple sources.
• Shift from plain reporting to historical trend analysis and visualization.
• Focus on descriptive (“What happened?”) and diagnostic (“Why did it happen?”) analytics.
4. Big Data Era (2000s – 2010s)
• Growth of the internet, social media, e-commerce, and IoT created massive amounts of
unstructured and semi-structured data.
• Traditional relational databases could not handle this scale, which led to technologies such as
Hadoop, Spark, and NoSQL databases.
• Data analytics expanded beyond structured data to include text, images, video, and sensor
data.
• Rise of data lakes for storing raw data.
• Shift toward predictive analytics using machine learning models.
5. Advanced Analytics & AI (2010s – Present)
• Artificial Intelligence (AI) and Machine Learning (ML) integrated into analytics.
• Emphasis on predictive (“What will happen?”) and prescriptive (“What should we do?”) analytics.
• Cloud platforms (AWS, Azure, Google Cloud) made data analytics scalable and accessible.
• Use of real-time analytics for fraud detection, recommendation engines, and personalized
marketing.
• Growth of self-service analytics tools for non-technical users.
6. Future of Data Analytics (Emerging Trends)
• Augmented Analytics: Use of AI to automate insights and analysis.
• Edge Analytics: Processing data close to the source (IoT devices, sensors).
• Explainable AI (XAI): Making machine learning models more transparent.
• Increasing focus on data privacy, ethics, and governance.
• Integration with quantum computing in the future for faster processing.
OVERVIEW OF DATA ANALYTICS
Data analytics is the process of examining raw data to discover useful patterns, insights, and trends
that help in decision-making. It combines statistical techniques, data management, and technology
to transform data into actionable knowledge. In today’s digital world, data analytics plays a vital role
across industries such as business, healthcare, finance, retail, government, and technology.
More formally, data analytics is the systematic analysis of data using statistical, mathematical,
and computational techniques to extract meaningful insights, identify patterns, and support
decision-making.
Importance of Data Analytics
1. Helps businesses make data-driven decisions.
2. Improves efficiency and reduces costs.
3. Enhances customer experience through personalization.
4. Detects fraud, risks, and anomalies.
5. Supports innovation and competitiveness.
Key Components of Data Analytics
1. Data Collection → Gathering raw data from multiple sources.
2. Data Cleaning & Preparation → Removing errors and duplicates, and standardizing data formats.
3. Data Storage → Using databases, warehouses, and data lakes.
4. Data Analysis → Applying statistical, ML, and computational techniques.
5. Data Visualization → Presenting insights using charts, dashboards, and reports.
6. Decision-Making → Using insights for strategy, operations, and innovation.
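The components above can be sketched end to end on a tiny example. This is only an illustrative
sketch: the records, field names, and the clean() helper are hypothetical, and real pipelines
usually rely on databases and dedicated libraries rather than plain Python lists.

```python
# Collection step: hypothetical raw records, as if gathered from two sources.
raw_records = [
    {"customer": " Asha ", "city": "pune", "amount": "1200"},
    {"customer": "Asha", "city": "Pune", "amount": "1200"},   # duplicate once cleaned
    {"customer": "Ravi", "city": "Delhi", "amount": ""},      # missing amount
    {"customer": "Meera", "city": "delhi", "amount": "800"},
]


def clean(records):
    """Trim whitespace, standardize case, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        name = row["customer"].strip()
        city = row["city"].strip().title()
        amount = row["amount"].strip()
        if not (name and city and amount):
            continue  # drop rows with missing values
        key = (name, city, amount)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cleaned.append({"customer": name, "city": city, "amount": float(amount)})
    return cleaned


# Cleaning and preparation step.
cleaned_records = clean(raw_records)

# Analysis step: total amount per city.
totals_by_city = {}
for row in cleaned_records:
    totals_by_city[row["city"]] = totals_by_city.get(row["city"], 0.0) + row["amount"]

# Presentation step: a plain-text report that could support a decision.
for city, total in sorted(totals_by_city.items()):
    print(f"{city}: {total:.2f}")
```

Dropping incomplete rows is only one possible policy; depending on the use case, missing values
may instead be filled in or flagged for review.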
Tools & Technologies
• Data Storage: SQL, NoSQL, Hadoop, Data Lakes
• Analytics & Processing: Python, R, Spark, SAS, Excel
• Visualization: Tableau, Power BI, Matplotlib, Google Data Studio (now Looker Studio)
• Machine Learning & AI: Scikit-learn, TensorFlow, PyTorch
Applications of Data Analytics
• Business: Market research, customer insights, performance tracking
• Healthcare: Disease prediction, patient care optimization
• Finance: Fraud detection, risk management, algorithmic trading
• Retail: Customer behavior analysis, inventory optimization
• Government: Smart cities, crime prediction, policy-making
TYPES OF DATA ANALYTICS
1. Descriptive Analytics
• Descriptive analytics is the simplest type of data analysis. It helps us understand what has
happened in the past using basic statistics, charts, and reports. It answers “What happened?”
• It summarizes historical data clearly so that anyone can see trends, patterns, or totals.
• The main goal is to make sense of past data so we can learn from it.
• Primary objective: finding innovative ways to summarize data.
Features of Descriptive Analytics
• Uses historical data.
• Involves data aggregation, summarization, and visualization.
• Provides reports, charts, and dashboards.
• Does not explain why events happened (diagnostic) or predict future outcomes (predictive).
• Foundation for other advanced analytics techniques.
Techniques Used
1. Data Aggregation → Collecting and combining data from different sources.
2. Data Mining → Identifying patterns in large datasets.
3. Statistical Analysis → Mean, median, mode, variance, correlation.
4. Data Visualization → Graphs, dashboards, scorecards.
5. Reporting Tools → Business Intelligence (BI) dashboards, KPI reports.
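As a small illustration of the aggregation and statistical measures listed above, the sketch
below summarizes a set of monthly sales figures using Python's standard statistics module. The
sales numbers are made up for the example, and population variance is an arbitrary choice here;
sample variance (statistics.variance) may be more appropriate for sampled data.

```python
import statistics

# Hypothetical monthly sales (units sold); values are invented for illustration.
monthly_sales = {"Jan": 120, "Feb": 135, "Mar": 120, "Apr": 150, "May": 175, "Jun": 140}

values = list(monthly_sales.values())

# Descriptive summary: totals and central tendency / spread of past data.
summary = {
    "total": sum(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "variance": statistics.pvariance(values),  # population variance
    "best_month": max(monthly_sales, key=monthly_sales.get),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary describes what happened (totals, typical values, spread) but says nothing about why
it happened or what will happen next.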
Examples of Descriptive Analytics
1. A retail company generating monthly sales reports.
2. A hospital analyzing patient admission trends over the past year.
3. A website using Google Analytics to view traffic reports and user behavior.
4. Banks preparing quarterly performance dashboards.
Advantages
• Easy to understand and implement.
• Provides a clear picture of past performance.
• Useful for monitoring KPIs and trends.
• Forms the foundation for more advanced analytics (predictive, prescriptive).
Limitations
• Focuses only on past data, not the future.
• Cannot explain why events occurred (needs diagnostic analytics).
• Limited in helping organizations make forward-looking decisions.
2. Diagnostic Analytics
Diagnostic Analytics is the process of examining historical data to determine the reasons for past
outcomes by identifying relationships, patterns, and anomalies. It answers “Why did it happen?”
Features of Diagnostic Analytics
• Works on historical data but focuses on causation.
• Uses comparisons, correlations, and drill-down analysis.
• Helps uncover dependencies and contributing factors.
• Bridges the gap between descriptive (“what”) and predictive (“what next”).
Techniques Used
1. Drill-down Analysis → Breaking data into smaller parts to find root causes.
2. Correlation Analysis → Identifying relationships between variables.
3. Regression Analysis → Determining how independent variables affect outcomes.
4. Data Mining → Detecting patterns and anomalies.
5. Cause-and-Effect Analysis → Exploring factors leading to outcomes.
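Two of the techniques above, drill-down and correlation analysis, can be sketched in a few lines.
The weekly records, the field names, and the helper functions drill_down() and pearson() are
hypothetical and written only for this illustration.

```python
from math import sqrt

# Hypothetical weekly records for a month in which overall sales dropped.
weeks = [
    {"region": "North", "ad_spend": 50, "sales": 500},
    {"region": "North", "ad_spend": 45, "sales": 470},
    {"region": "South", "ad_spend": 20, "sales": 260},
    {"region": "South", "ad_spend": 10, "sales": 180},
]


def drill_down(records, dimension, measure):
    """Break a total into its parts along one dimension (e.g., sales by region)."""
    parts = {}
    for row in records:
        parts[row[dimension]] = parts.get(row[dimension], 0) + row[measure]
    return parts


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


sales_by_region = drill_down(weeks, "region", "sales")
r = pearson([w["ad_spend"] for w in weeks], [w["sales"] for w in weeks])

print(sales_by_region)
print(f"correlation(ad_spend, sales) = {r:.3f}")
```

A strong correlation between advertising spend and sales points to a candidate cause worth
investigating; on its own it does not prove that lower spend caused the drop.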
Examples of Diagnostic Analytics
1. A retail company analyzing why sales dropped in a specific month (e.g., low marketing,
seasonal trends, supply issues).
2. A hospital studying why patient waiting times increased (e.g., staff shortage, equipment
downtime).
3. A bank investigating why loan defaults increased (e.g., economic downturn, risky
customers).
4. A website identifying why user traffic decreased (e.g., poor SEO, server downtime, increased
competition).
Advantages
• Provides root cause insights for better decision-making.
• Helps identify patterns, relationships, and anomalies.
• Reduces guesswork in problem-solving.
• Improves the accuracy of predictive analytics.
Limitations
• Requires clean and detailed data for accurate results.
• More complex than descriptive analytics.
• Can reveal correlation, which does not always imply causation.
• Time and resource-intensive.
3. Predictive Analytics
Predictive Analytics is the process of using historical data, statistical algorithms, and machine
learning techniques to predict future events or behaviors. It answers “What will happen?”
Features of Predictive Analytics
• Uses historical and current data for forecasting.
• Employs statistical models, data mining, and machine learning.
• Focuses on probabilities and likelihoods (not absolute certainty).
• Supports proactive decision-making.
Techniques Used
1. Regression Analysis → Forecasting numerical outcomes (e.g., sales, prices).
2. Classification Models → Categorizing outcomes (e.g., spam vs. not spam).
3. Time Series Analysis → Predicting trends over time (e.g., stock prices).
4. Machine Learning Algorithms → Random Forests, Neural Networks, Gradient Boosting.
5. Clustering & Pattern Recognition → Finding groups of similar behavior (e.g., customer
segmentation).
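As a minimal sketch of regression-based forecasting, the example below fits a straight line to
past quarterly demand with ordinary least squares and extrapolates one quarter ahead. The demand
figures and the fit_line() helper are hypothetical; practical forecasting would normally use more
data, more predictors, and a validated model.

```python
# Hypothetical quarterly demand (units); the quarter index is the only predictor.
quarters = [1, 2, 3, 4, 5, 6]
demand = [100, 110, 125, 130, 145, 150]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(quarters, demand)

# Extrapolate the fitted trend to the next (unseen) quarter.
forecast_q7 = intercept + slope * 7

print(f"demand = {intercept:.2f} + {slope:.2f} * quarter (fitted)")
print(f"forecast for quarter 7: {forecast_q7:.1f}")
```

The forecast is an estimate based on the assumption that the past trend continues, which is why
predictive results are treated as likelihoods rather than certainties.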
Examples of Predictive Analytics
1. Business → Forecasting product demand for the next quarter.
2. Healthcare → Predicting the likelihood of a patient developing a disease.
3. Finance → Credit scoring and fraud detection.
4. Retail & E-commerce → Recommender systems (e.g., “You may also like…”).
5. Weather Forecasting → Predicting rainfall, temperature, or storms.
Advantages
• Enables proactive planning and reduces risks.
• Improves efficiency and resource allocation.
• Enhances customer personalization (e.g., recommendations, targeted ads).
• Provides competitive advantage.
Limitations
• Predictions are probabilistic, not guaranteed.
• Requires large, high-quality datasets.
• Models can be biased if data is biased.
• Computationally intensive.
IMPORTANCE AND BENEFITS OF DATA ANALYTICS
Data Analytics plays a crucial role in converting raw data into actionable insights that support
decision-making, efficiency, and innovation.
Importance of Data Analytics
1. Supports Data-Driven Decision Making
o Moves decisions from intuition-based to fact-based.
o Enables organizations to choose the best strategies backed by evidence.
2. Helps Understand Customers
o Provides insights into customer needs, preferences, and behavior.
o Supports personalized services and improved customer satisfaction.
3. Improves Efficiency and Productivity
o Identifies bottlenecks in operations.
o Optimizes processes and resource allocation.
4. Detects Risks and Fraud
o Monitors unusual patterns to identify fraud, security threats, and risks.
5. Drives Innovation and Competitiveness
o Helps develop new products and services based on market trends.
o Provides a competitive edge by spotting opportunities early.
6. Enables Real-Time Insights
o With modern tools, organizations can analyze data as it is generated.
o Useful for stock markets, healthcare monitoring, and fraud detection.
Benefits of Data Analytics
1. Better Decision Making
o Managers and leaders can make informed, evidence-based decisions.
2. Enhanced Customer Experience
o Personalization in marketing, product recommendations, and customer support.
3. Cost Reduction
o Identifies wasteful spending, improves supply chain efficiency, and reduces
operational costs.
4. Revenue Growth
o Supports cross-selling, upselling, and targeted marketing campaigns.
5. Risk Management
o Predicts potential business risks and prepares preventive strategies.
6. Competitive Advantage
o Businesses using data analytics stay ahead by quickly adapting to market changes.
7. Improved Quality of Products and Services
o Customer feedback and data analysis help refine offerings.
TEXT ANALYTICS AND WEB ANALYTICS
Text Analytics is the process of transforming unstructured textual data into meaningful insights
through techniques such as classification, sentiment analysis, keyword extraction, and topic
modeling.
It answers questions like:
• What are people talking about?
• What is the sentiment (positive/negative/neutral)?
• Which topics or keywords are most frequent?
Features of Text Analytics
• Works on unstructured or semi-structured text data.
• Uses linguistic rules, statistical methods, and machine learning.
• Converts text into structured data for analysis.
• Supports multiple languages and domains.
Techniques Used
1. Tokenization → Splitting text into words or sentences.
2. Stemming & Lemmatization → Reducing words to their root or base form (e.g., “running” becomes “run”).
3. Sentiment Analysis → Detecting emotions (positive, negative, neutral).
4. Keyword Extraction → Identifying important words/phrases.
5. Topic Modeling → Grouping text into topics (e.g., LDA – Latent Dirichlet Allocation).
6. Text Classification → Categorizing text into predefined labels (e.g., spam vs. non-spam).
7. Named Entity Recognition (NER) → Identifying names, places, organizations.
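Three of these techniques (tokenization, keyword extraction, and sentiment analysis) can be
sketched with the standard library alone. The reviews, the stopword list, and the tiny
positive/negative word lists are hypothetical; real systems generally use much larger lexicons
or trained machine learning models.

```python
import re
from collections import Counter

# Hypothetical reviews and hand-made word lists, for illustration only.
reviews = [
    "Great phone, great battery and fast delivery!",
    "Terrible battery. The screen is bad.",
    "The phone arrived on time.",
]
STOPWORDS = {"the", "and", "is", "on", "a"}
POSITIVE = {"great", "fast", "good"}
NEGATIVE = {"terrible", "bad", "slow"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(tokens):
    """Label text by counting lexicon hits: positive, negative, or neutral."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokenized = [tokenize(review) for review in reviews]
labels = [sentiment(tokens) for tokens in tokenized]

# Keyword extraction by raw frequency, ignoring stopwords.
keyword_counts = Counter(t for tokens in tokenized for t in tokens if t not in STOPWORDS)
top_keywords = keyword_counts.most_common(3)

print(labels)
print(top_keywords)
```

A word-counting approach like this cannot handle sarcasm, negation ("not good"), or context,
which is one reason natural language is considered hard to analyze.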
Applications of Text Analytics
1. Customer Feedback Analysis → Analyzing product reviews, surveys.
2. Social Media Monitoring → Tracking brand reputation and public opinion.
3. Fraud & Risk Detection → Detecting suspicious transactions or messages.
4. Healthcare → Extracting insights from medical records and research papers.
5. Legal & Compliance → Reviewing contracts and legal documents.
6. Business Intelligence → Identifying market trends from news and blogs.
Advantages
• Unlocks value from unstructured text data.
• Improves decision-making with deeper insights.
• Enhances customer experience via sentiment analysis.
• Automates document classification and summarization.
Limitations
• Complexity of natural language (ambiguity, sarcasm, context).
• Requires large, clean datasets for accuracy.
• May struggle with multilingual text without specialized models.
• Dependent on quality of algorithms and preprocessing.
Web Analytics
Web Analytics is the process of collecting, analyzing, and reporting web data to understand and
optimize web usage.
It answers questions like:
• How many people visited the website?
• Where did the visitors come from?
• What actions did they take (clicks, purchases, downloads)?
• Which pages performed best or worst?
Objectives of Web Analytics
1. Track and measure website traffic.
2. Understand user behavior and navigation patterns.
3. Monitor the performance of digital marketing campaigns.
4. Identify conversion rates and sales performance.
5. Improve website design, speed, and usability.
Key Metrics in Web Analytics
• Page Views → Total number of pages viewed.
• Unique Visitors → Count of distinct users.
• Bounce Rate → Percentage of visits in which the user leaves without any further interaction
(typically after viewing a single page).
• Session Duration → Average time spent on site.
• Traffic Sources → Direct, referral, social, or search engine.
• Click-Through Rate (CTR) → Percentage of users who click a link out of all those who saw it.
• Conversion Rate → Percentage of visitors completing a desired action (purchase, sign-up).
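The metrics above are simple ratios and counts, as the sketch below shows. The session log, the
campaign figures, and the pct() helper are hypothetical. The sketch also assumes that a
single-page session counts as a bounce and that rates are computed per session; analytics tools
differ in their exact definitions.

```python
# Hypothetical session log: pages viewed, time on site, and purchase outcome.
sessions = [
    {"visitor": "u1", "pages": 1, "seconds": 10, "purchased": False},
    {"visitor": "u2", "pages": 4, "seconds": 180, "purchased": True},
    {"visitor": "u1", "pages": 3, "seconds": 90, "purchased": False},
    {"visitor": "u3", "pages": 1, "seconds": 5, "purchased": False},
    {"visitor": "u4", "pages": 6, "seconds": 315, "purchased": True},
]
ad_impressions, ad_clicks = 2000, 50  # hypothetical campaign figures


def pct(part, whole):
    """Return part/whole as a percentage, or 0.0 when whole is zero."""
    return 100.0 * part / whole if whole else 0.0


total = len(sessions)
metrics = {
    "page_views": sum(s["pages"] for s in sessions),
    "unique_visitors": len({s["visitor"] for s in sessions}),
    # Assumption: a session with exactly one page view is a bounce.
    "bounce_rate_pct": pct(sum(s["pages"] == 1 for s in sessions), total),
    "avg_session_seconds": sum(s["seconds"] for s in sessions) / total,
    # Assumption: conversion is measured per session, not per unique visitor.
    "conversion_rate_pct": pct(sum(s["purchased"] for s in sessions), total),
    "ctr_pct": pct(ad_clicks, ad_impressions),
}

for name, value in metrics.items():
    print(f"{name}: {value}")
```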
Tools for Web Analytics
• Google Analytics
• Adobe Analytics
• Matomo (formerly Piwik)
• Mixpanel
• Kissmetrics
• Hotjar (heatmaps & session recordings)
Applications of Web Analytics
1. Digital Marketing → Tracking ad performance, SEO, and campaigns.
2. E-Commerce → Measuring sales, abandoned carts, and customer behavior.
3. Content Optimization → Identifying popular and underperforming content.
4. User Experience (UX) → Improving navigation, layout, and design.
5. Business Intelligence → Aligning online strategy with organizational goals.
Advantages
• Provides data-driven insights into website performance.
• Helps in personalized marketing and customer targeting.
• Identifies areas for cost reduction and ROI improvement.
• Supports continuous website improvement.
Skills for Business Analytics
1. Analytical & Critical Thinking Skills
• Ability to interpret complex data
• Problem-solving mindset
• Logical reasoning and decision-making
2. Statistical & Mathematical Skills
• Descriptive and inferential statistics
• Probability concepts
• Regression, correlation, and hypothesis testing
3. Technical & Data Handling Skills
• SQL for data extraction
• Data visualization tools (Tableau, Power BI)
• Excel (pivot tables, formulas, dashboards)
• Programming (Python, R, SAS)
4. Data Management & Preprocessing Skills
• Data cleaning and transformation
• Handling missing values and outliers
• Knowledge of databases and data warehouses
5. Business & Domain Knowledge
• Understanding business processes and KPIs
• Translating data insights into actionable strategies
• Industry-specific knowledge (finance, marketing, operations, etc.)
6. Communication & Storytelling Skills
• Data storytelling with visuals and narratives
• Preparing reports and dashboards for stakeholders
• Presenting insights clearly to non-technical audiences
7. Soft Skills
• Teamwork and collaboration
• Adaptability to new tools and methods
• Time management and project management