Data Analytics and Modelling
Amynah Reimoo
Session 1: Introduction to Data Analytics
What is…?
Data Analytics:
Data Analysis:
The Data Journey
• DATA COLLECTION
• DATA CLEANING
• DATA TRANSFORMATION
• DATA ANALYSIS
• DATA VISUALIZATION
• DATA COMMUNICATION
Data ecosystem
A data ecosystem is a web of interconnected components, including
people, processes, technologies, and data sources, that work together
to produce, collect, manage, analyze, and distribute data. It facilitates
the seamless flow and integration of data across different platforms
and stakeholders, enabling informed decision-making and fostering
innovation through collaborative data utilization and sharing.
Modern Data Ecosystem
A modern data ecosystem is a complex system of technologies and processes that
organizations use to collect, store, process, and analyze data.
Data Sources
• Databases: Relational, NoSQL
• Flat Files: CSV, TSV, XLS, XLSX, TXT
• APIs / Web Services
• Webhooks
• Web Scraping
• IoT
• Logs
• Network Files: FTP / SFTP, Local Files
Data Ingestion
• Batch Process
• Scheduled
• Stream
ETL / ELT
• Cleaning
• Converting
• Formatting
• Filtering
• Aggregating
• Normalizing
• Enriching
Data Warehousing
• Landing
• Staging
• Warehouse
• Mart
Data Lake
• Landing Area
• Raw Zone
• Processed Zone
• Curated Zone
BI
• Reporting
• Dashboards
• BI
• Data Visualization
• Insights
• Realtime Alerts
• Search / Query
Data Governance
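Several of the ETL / ELT steps named above (cleaning, formatting, converting, filtering, aggregating, normalizing) can be illustrated with a minimal Python sketch. The records, field names, and helper functions below are hypothetical and chosen only for illustration; a real pipeline would read from the data sources listed above and would typically use dedicated tooling.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from a CSV flat file.
raw_rows = [
    {"category": " Roads ", "amount": "120.0"},
    {"category": "roads", "amount": "80"},
    {"category": "Health", "amount": ""},      # missing amount
    {"category": "Health", "amount": "50"},
    {"category": "Energy", "amount": "-10"},   # invalid negative amount
]

def clean(rows):
    """Cleaning and formatting: trim whitespace, unify case, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append({"category": row["category"].strip().title(), "amount": amount})
    return cleaned

def convert(rows):
    """Converting: turn the amount from text into a number."""
    return [{**row, "amount": float(row["amount"])} for row in rows]

def keep_valid(rows):
    """Filtering: keep only rows with a non-negative amount."""
    return [row for row in rows if row["amount"] >= 0]

def aggregate(rows):
    """Aggregating: total amount per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)

def normalize(totals):
    """Normalizing: min-max scale the totals into the range 0 to 1."""
    lo, hi = min(totals.values()), max(totals.values())
    span = hi - lo
    return {k: ((v - lo) / span if span else 0.0) for k, v in totals.items()}

totals = aggregate(keep_valid(convert(clean(raw_rows))))
scaled = normalize(totals)
print(totals)
print(scaled)
```

Each step is a small function so the order of the pipeline stays visible in the final call chain. In an ELT setup the same transformations would run after loading, inside the warehouse.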
Data Science, Machine Learning & Artificial Intelligence
Artificial Intelligence
Intelligent systems that perform tasks smartly.
Machine Learning
Systems that learn from data and make predictions.
Data Science
Extracting knowledge and insights from data.
Data Analysis
Mini Case Studies
• Netflix: discount coupons for loyal customers, email marketing for
inactive users, recommendation algorithm
• Amazon: AI matchmaking, inventory robotics with live data,
delivery machines, birthday discounts
• Spotify: recommendation algorithm, partnerships, trending/flop lists
• Walmart: sales data for future predictions, partnerships, loyalty
programs, marketing/promotion campaigns
DPR Metrics
• Amount – Overstated or understated?
• Completion date – mean, median
• Absolute day difference – modulus
• Charts?
• Early or late?
• Budget in relation to date
• Category-wise analysis
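The metrics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not define the dataset, so the records, the field names (reported_amount, actual_amount, planned_end, actual_end), and the reading of "overstated" as "reported amount higher than actual amount" are all hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, median

# Hypothetical records; field names and values are illustrative only.
projects = [
    {"category": "A", "reported_amount": 100.0, "actual_amount": 90.0,
     "planned_end": date(2024, 1, 10), "actual_end": date(2024, 1, 15)},
    {"category": "A", "reported_amount": 200.0, "actual_amount": 230.0,
     "planned_end": date(2024, 2, 1), "actual_end": date(2024, 1, 29)},
    {"category": "B", "reported_amount": 50.0, "actual_amount": 50.0,
     "planned_end": date(2024, 3, 1), "actual_end": date(2024, 3, 11)},
]

def amount_status(project):
    """Amount: overstated if reported exceeds actual, understated if below (assumed definition)."""
    diff = project["reported_amount"] - project["actual_amount"]
    if diff > 0:
        return "overstated"
    if diff < 0:
        return "understated"
    return "accurate"

def day_difference(project):
    """Signed completion gap in days: positive means late, negative means early."""
    return (project["actual_end"] - project["planned_end"]).days

def timing(project):
    """Early or late, based on the sign of the day difference."""
    diff = day_difference(project)
    return "late" if diff > 0 else "early" if diff < 0 else "on time"

statuses = [amount_status(p) for p in projects]
signed_days = [day_difference(p) for p in projects]
absolute_days = [abs(d) for d in signed_days]   # modulus of the day difference
timings = [timing(p) for p in projects]

# Completion date summary: mean and median of the gaps.
mean_signed, median_signed = mean(signed_days), median(signed_days)
mean_absolute, median_absolute = mean(absolute_days), median(absolute_days)

# Category-wise analysis: mean absolute day difference per category.
by_category = defaultdict(list)
for p in projects:
    by_category[p["category"]].append(abs(day_difference(p)))
category_mean_abs = {cat: mean(days) for cat, days in by_category.items()}

print(statuses, timings)
print(mean_signed, median_signed, mean_absolute, median_absolute)
print(category_mean_abs)
```

Keeping both the signed and the absolute day difference is deliberate: the signed value answers "early or late?", while the absolute value measures how far off the plan was regardless of direction, so early and late projects do not cancel each other out in the mean.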