Data Science in Climate Change
Topic
2
Today’s Overview
SECTION I
Basics of Data & Information
SECTION II
Data Science
• Data Background & History
• Need of Data Science
• Tools of Data Science
• Data Science Components
• Applications of Data Science
SECTION III
Data Science in Climate Change
• Role of Data Science in Climate Change
• Data Science Tools and Techniques Used in Climate Change
• Data Science Key areas/solutions in Climate Change
• Data Science Solutions in Climate Change (Real World Examples)
Basic Questions:
What is Data
Data
Data are raw facts and numbers used to analyze something or make decisions.
Information is data that has been processed into a meaningful form.
Data Processing
Data Processing
Any operation or set of operations performed on data to convert it into a
meaningful form (information) is called data processing.
Processing: The average test score is calculated.
Information: 67%. This shows a student’s score in an assessment.
Data Processing
Data Processing Cycle
Once data is collected, it is processed to convert it into useful
information. The data is processed again and again until an accurate
result is achieved. This is called the data processing cycle.
Usually, data processing involves four basic steps:
Input
Processing
Output
Storage
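The four steps above can be sketched in a few lines of Python. This is only an illustration: the scores are hypothetical values chosen so that the average matches the 67% example, and the helper name process_scores is an assumption, not part of the original slides.

```python
import json

def process_scores(raw_scores, max_score=100):
    """Processing step: turn raw scores (data) into an average percentage (information)."""
    average = sum(raw_scores) / len(raw_scores)
    return round(100 * average / max_score)

# Input: raw test scores (hypothetical values for illustration)
raw_scores = [60, 70, 71]

# Processing: compute the average test score
average_percent = process_scores(raw_scores)

# Output: present the information in a readable form
report = f"Average test score: {average_percent}%"
print(report)

# Storage: serialize the data and the result so they can be reused later
stored = json.dumps({"scores": raw_scores, "average_percent": average_percent})
```

Each variable corresponds to one step of the cycle, so the same structure applies if the input comes from a file or a sensor instead of a hard-coded list.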
Data Processing
Data Processing Cycle
The process can be a manipulation of data, e.g., arithmetic operations,
comparing, sorting, searching, etc.
Flow: Input → Data → Processing → Information → Output
Input: Data is put through input devices, such as a keyboard. Data is meaningless.
Output: Information is produced on an output device, such as a screen. Information is meaningful.
Type of Digital Data
Type of Digital Data
Digital data is classified into the following three categories:
Structured Data
Semi-Structured Data
Unstructured Data
Type of Digital Data
Structured Data
This data has a well-defined structure.
JSON: JavaScript Object Notation (JSON) is a text format for storing data.
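As a small illustration of storing data in JSON format, the sketch below parses a JSON record with Python's standard json module and writes it back out. The weather-station record and its field names are hypothetical.

```python
import json

# A hypothetical weather-station reading stored as JSON text
record_text = '{"station": "A1", "temperature_c": 21.5, "tags": ["urban", "rooftop"]}'

# Parse the text into a Python dictionary
record = json.loads(record_text)

# The field names travel with the data, so they can be inspected at run time
fields = sorted(record.keys())

# Serialize the dictionary back to JSON text
round_trip = json.dumps(record, sort_keys=True)
print(fields)
print(round_trip)
```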
Qualitative & Quantitative Data
Quantitative Data
Quantitative data is anything that can be counted or
measured; it refers to numerical data.
Qualitative Data
Qualitative data is descriptive, referring to things that can
be observed but not measured. Examples include colors,
emotions, responses to a survey, notes from observations,
and transcripts from interviews.
Qualitative Data vs Quantitative Data
Qualitative Data
• Deals with descriptions.
• Results are particular to the objects being examined.
• Indefinite questions, observations, and interviews are conducted by researchers.
• Methodology of qualitative analysis is investigative.
Quantitative Data
• Deals with numbers.
• Results can be applied to the general population.
• Measurements, surveys, observations, and experiments are made by researchers.
• Methodology of quantitative analysis is conclusive.
Data Analysis and Data Analytics
Data Analysis and Data Analytics
Data Analysis
Data Analysis is the process of systematically applying
statistical and/or logical techniques, such as cleaning,
transforming and modeling data to achieve the results or
get useful information.
For example, in the healthcare industry, data analysis lets
providers predict disease outbreaks, improve patient
care, and make informed decisions about treatment
strategies.
Data Analysis and Data Analytics
Data Analytics
Data analytics is the science of analyzing raw data to
draw conclusions from that information.
Data Mining
What is Data Mining
Data mining is the process of searching and analyzing a big dataset of raw data in order
to identify patterns and extract useful information. It looks for anomalies, patterns or
correlations among millions of records to predict results.
Example
In Marketing: Data mining is used to explore increasingly large databases and to
improve market segmentation by analyzing the relationships between parameters such
as customer age, gender, and tastes. It can also predict which users are likely to
unsubscribe from a service.
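One simple way to look for anomalies in a set of records is a z-score check: flag any value that lies far from the mean, measured in standard deviations. The sketch below is a minimal illustration, not a full data mining pipeline. The order counts and the threshold of 2.0 are assumptions.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily order counts with one unusual day
daily_orders = [102, 98, 101, 97, 103, 99, 100, 250]
anomalies = find_anomalies(daily_orders)
print(anomalies)
```

Real datasets with millions of records usually need more robust methods, because a single extreme value also inflates the mean and standard deviation used to detect it.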
Big Data
What is BIG DATA?
● A collection of large and complex datasets that are difficult to store
and process using traditional database and data processing tools is
considered big data. Big data is collected from traditional and digital
sources and, when refined properly, can be used for research and
analysis.
Where does BIG DATA come from?
Social data comes from the Likes, Tweets & Retweets, Comments, Video Uploads,
and general media that are uploaded and shared via the world’s favorite social
media platforms.
Transactional data is generated from all the daily transactions that take place
both online and offline. Invoices, payment orders, storage records, and delivery
receipts are all characterized as transactional data.
Machine data is information which is generated by industrial equipment,
sensors that are installed in machinery, and even web logs which track user
behavior.
Data Science
Introduction to Data Science
Introduction to Data Science
The simplest definition of data science is the extraction of actionable insights from raw data.
Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science
(empirical, theoretical, computational and now data-driven) and asserted that
"everything about science is changing because of the impact of information
technology" and the data deluge.
Introduction to Data Science
Data science is the in-depth study of massive amounts of data. It involves
extracting meaningful insights from raw, structured, and unstructured data that is
processed using the scientific method, different technologies, and algorithms.
Data science uses powerful hardware, programming systems, and efficient
algorithms to solve data-related problems. It is closely tied to the future of
artificial intelligence.
Introduction to Data Science
Data science is an interdisciplinary field that uses scientific methods, processes,
algorithms and systems to extract knowledge and insights from structured and
unstructured data.
Introduction to Data Science
Data science is a field focused on extracting knowledge and insights from data by
using scientific methods.
Introduction to Data Science
It uses techniques and theories drawn from many fields within the context of
mathematics, statistics, computer science, domain knowledge and information
science.
Data science (DS) is a multidisciplinary field of study whose goal is to address the
challenges of big data.
Data science principles apply to all data, both big and small.
Data Science History
Data Science History
The term “data science” has been traced back to 1974, when Peter Naur proposed it as
an alternative name for computer science.
In 1997, C.F. Jeff Wu suggested that statistics should be renamed data science.
In 2002, the Committee on Data for Science and Technology launched the Data Science
Journal.
Uses of Data Science
We can say that data science is all about:
Asking the correct questions and analyzing the raw data.
Understanding the data to make better decisions and finding the final result.
Uses of Data Science
Example:
Suppose we want to travel from station A to station B by car. We need to make
some decisions: which route will get us there fastest, which route will have no
traffic jams, and which will be the most cost-effective.
All these decision factors act as input data, and analyzing them gives us an
appropriate answer. This analysis of data is called data analysis, which is a
part of data science.
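The route decision can be written as a tiny data analysis. In the sketch below, the three routes, their travel times and costs, and the scoring rule (minutes plus a weighted cost) are all hypothetical choices made for illustration.

```python
# Hypothetical input data: one record per candidate route from A to B
routes = [
    {"name": "highway", "minutes": 35, "cost": 6.0, "traffic_jam": False},
    {"name": "city", "minutes": 30, "cost": 3.0, "traffic_jam": True},
    {"name": "ring road", "minutes": 40, "cost": 4.0, "traffic_jam": False},
]

def best_route(routes, cost_weight=1.0):
    """Pick the jam-free route with the lowest combined time and cost score."""
    # Prefer routes without a traffic jam; fall back to all routes if none qualify
    candidates = [r for r in routes if not r["traffic_jam"]] or routes
    return min(candidates, key=lambda r: r["minutes"] + cost_weight * r["cost"])

choice = best_route(routes)
print(choice["name"])
```

Raising cost_weight makes cost matter more than time, which shows how the same input data can lead to a different decision when the question changes.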
Need for Data Science
Need for Data Science
Some years ago, there was less data, and most of it was available in a structured form
that could be easily stored in Excel sheets and processed using BI (business intelligence) tools.
Today the volume of data is vast: by one estimate, approximately 328.77 million
terabytes of data are generated every day.
Handling such a huge amount of data is a challenging task for every
organization. To handle, process, and analyze it, we need complex,
powerful, and efficient algorithms and technology, and that technology came into
existence as data science.
Need for Data Science
Following are some of the main reasons for using data science technology:
With the help of data science technology, we can convert massive amounts of raw
and unstructured data into meaningful insights.
Data science can help with different kinds of predictions, such as surveys, elections,
flight ticket confirmation, weather forecasts, etc.
Data Science Components
Data Science Components
The main components of Data Science are given below
Domain Expertise: In data science, domain expertise binds data science together.
Domain expertise means specialized knowledge or skills of a particular area. In data
science, there are various areas for which we need domain experts.
Data Science Process
1. Understand the Business Problem
• Classification/Recognition
• Prediction/Regression
• Association
• Pattern detection/clustering
• Scoring and ranking
• Optimization
Data Science Process
2. Data Acquisition
Data Sources
• Questionnaires
• Web servers
• Web services (API)
• Database
• Logs
• Online repositories
Data Science Process
3. Data Preparation
Spider-Man (Peter Parker) | Secret | Good | Hazel Eyes | Brown Hair | Male | Living | 4043 | Aug-62 | marvel
Captain America | Public | Good | Blue Eyes | White Hair | Male | Living | 3360 | Mar-41 | marvel
Natalia Romanova (Earth-616) | Public | Good | Green Eyes | Red Hair | Female | Living | 1050 | Apr-64 | marvel
Training Set
Model
Test Set
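Splitting prepared data into a training set and a test set can be sketched as follows. The function name train_test_split, the 25% test fraction, and the fixed seed are assumptions made for this illustration. Libraries such as scikit-learn provide their own version of this utility.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (training set, test set)."""
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = rows[:]             # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: 20 row identifiers
rows = list(range(20))
train_set, test_set = train_test_split(rows)
print(len(train_set), len(test_set))
```

The model is fitted on the training set only, and the test set is held back to check how well it performs on data it has not seen.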
Data Science Process
Model: Test in pre-production environment, then deploy in production environment
Running Model: Monitoring and Real-time Analytics
Data Science Process
Not enough data for analyzing
Exploratory Data Analysis
Deploy & Maintenance
Tools for Data Science
Tools for Data Science
Following are some tools required for data science:
Data Analysis tools: R, Python, Statistics, SAS, Jupyter, RStudio, MATLAB, Excel,
RapidMiner
Application of Data Science
Following are some common application areas of data science:
Classification (in an email server, this could mean classifying emails as “important”
or “junk”).
Internet Search: When we want to search for something on the internet, we use
search engines such as Google, Yahoo, etc. All these search engines
use data science technology to improve the search experience, so that you get
search results within a fraction of a second.
Application of Data Science
Transport: The transport industry is also using data science technology to create
self-driving cars. Self-driving cars are expected to help reduce the number of road
accidents.
Healthcare: In the healthcare sector, data science is providing lots of benefits. Data
science is being used for tumor detection, drug discovery, medical image analysis,
virtual medical bots, etc.
Application of Data Science
Recommendation systems: Many companies, such as Amazon, Netflix, and Google
Play, use data science technology to create a better user experience with
personalized recommendations. For example, when you search for something on Amazon
and start getting suggestions for similar products, that is data science
technology at work.
Risk detection: The finance industry has always had issues with fraud and risk of losses,
but data science can help reduce them. Many finance companies
are looking for data scientists to avoid risk and losses while increasing
customer satisfaction.
Machine Learning
Machine Learning
Machine learning (ML) is a branch of Artificial Intelligence (AI) that uses
algorithms trained on data sets to create models that enable machines to perform
tasks like humans do, such as categorizing images, analyzing data, or predicting price
fluctuations.
Traditional programming: Data + Program → Computer → Output
Machine learning: Data + Output → Computer → Program
When Do We Use Machine Learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)
• Supervised Learning
– Given: training data + desired outputs (labels)
• Unsupervised Learning
– Given: training data (without desired outputs)
• Semi-supervised Learning
– Given: training data + a few desired outputs
• Reinforcement Learning
– Rewards from sequence of actions
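The difference between supervised and unsupervised learning can be illustrated with two very small functions. This is a sketch under stated assumptions: the temperature values and labels are synthetic, the supervised example is a one-nearest-neighbor classifier, and the unsupervised example is k-means with two clusters on one-dimensional data.

```python
def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: predict the label of the closest labelled training example."""
    distances = [abs(tx - x) for tx in train_x]
    return train_y[distances.index(min(distances))]

def two_means(values, iterations=10):
    """Unsupervised: split unlabelled 1-D values into two clusters (k-means, k=2)."""
    low, high = min(values), max(values)
    if low == high:
        return sorted(values), []  # all values identical: only one cluster
    for _ in range(iterations):
        # Assign each value to the nearer of the two cluster centers
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of its group
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)

# Synthetic temperatures in degrees Celsius
temps = [12.0, 14.0, 13.0, 30.0, 32.0, 31.0]
labels = ["cool", "cool", "cool", "hot", "hot", "hot"]  # desired outputs

prediction = nearest_neighbor_predict(temps, labels, 29.0)  # uses the labels
clusters = two_means(temps)                                 # ignores the labels
print(prediction, clusters)
```

The supervised function needs the labels to make a prediction, while the unsupervised function finds the two groups from the values alone.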
Machine Learning Example
Classification Example
Climate Change
Climate Change
What is Climate Change?
Climate change refers to long-term shifts in temperatures and weather patterns.
Natural events and human activities are contributing to an increase in average global
temperatures, primarily through increases in “greenhouse” gases such as carbon dioxide (CO2).
Artificial Intelligence
Artificial Intelligence
What is Artificial Intelligence (AI)
Artificial Intelligence (AI) is the science and engineering of making intelligent
machines, especially intelligent computer programs. It is related to the similar task
of using computers to understand human intelligence
AI BRANCHES
1. Machine Learning
2. Fuzzy Logic
3. Expert Systems
4. Robotics
5. Neural Networks
6. Computer Vision
7. Natural Language Processing (Speech Recognition, Image Recognition, Pattern
Recognition, etc.)
Role of AI in Climate Change
Role of AI in Climate Change
AI for Climate Action
The Technology Mechanism supports transformational climate solutions. Some AI-powered
solutions for climate action are already underway.
How Machine Learning Combats Climate Change
Machine learning can produce sustainability insights and help plan effective climate
action. It can be used for tasks such as:
Predictive maintenance, such as detecting and patching methane leaks in natural gas
infrastructure.
Data Science Role
Data science contributes by:
Helping develop adaptation strategies and offering predictive models for mitigation
efforts.
Energy efficiency: Machine learning (data science) models can reduce the amount of
energy used in industrial, transportation, and building processes.
How Data Science is used to Combat
Climate Changes
How Data Science is used to Combat Climate Changes
Data Collection
A robust system for monitoring and data collection captures data from various sources:
• Satellites,
• Weather stations,
• Ocean buoys, and
• Sensors
These data provide a wealth of information about temperature, precipitation,
greenhouse gas concentrations, etc.
How Data Science is used to Combat Climate Changes
Data Processing
The raw data is preprocessed to clean and structure it for analysis. It involves
handling missing data, quality control, and converting data into standardized formats.
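The preprocessing steps named above (handling missing data, quality control, and conversion to a standard format) might look like the following sketch. The record layout, the choice of Celsius as the standard unit, and the plausibility range of -90 to 60 degrees Celsius are assumptions made for illustration.

```python
def clean_temperatures(readings):
    """Return plausible temperatures in degrees Celsius from raw readings."""
    cleaned = []
    for reading in readings:
        value = reading["value"]
        if value is None:
            continue                      # handle missing data: skip the record
        if reading["unit"] == "F":
            value = (value - 32) * 5 / 9  # standardize: Fahrenheit to Celsius
        if not -90 <= value <= 60:
            continue                      # quality control: drop implausible values
        cleaned.append(round(value, 1))
    return cleaned

# Hypothetical raw readings from different sources
raw = [
    {"value": 68.0, "unit": "F"},
    {"value": None, "unit": "C"},
    {"value": 21.5, "unit": "C"},
    {"value": 999.0, "unit": "C"},   # a typical sensor error code
]
cleaned = clean_temperatures(raw)
print(cleaned)
```

Dropping records is the simplest way to handle missing values. Depending on the analysis, interpolating from neighboring readings may be a better choice.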
Data Analysis
Data scientists utilize statistical and machine learning techniques to uncover patterns
and relationships within the data. This analysis helps identify trends, anomalies, and
potential correlations between climate variables.
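A common first step in identifying a trend is to fit a straight line to a time series with ordinary least squares. The sketch below does this by hand on synthetic temperature anomalies. The data values and the function name linear_trend are assumptions for illustration only.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data: temperature anomaly (degrees Celsius) per year
years = [2000, 2001, 2002, 2003, 2004]
anomaly_c = [0.40, 0.42, 0.44, 0.46, 0.48]

slope, intercept = linear_trend(years, anomaly_c)
print(f"Trend: {slope:.3f} degrees Celsius per year")
```

The slope is the estimated change per year. Real climate series are noisy, so a trend estimate should be reported together with its uncertainty.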
How Data Science is used to Combat Climate Changes
Climate Models
Advanced climate models are developed and refined using data science techniques to
simulate the Earth's climate system. These models help scientists predict future
climate scenarios and understand the potential impacts of climate change.
How Data Science is used to Combat Climate Changes
Forecasting
Climate models, driven by data science, provide short-term and long-term climate
forecasts. It enables real-time monitoring of climate variables, such as temperature,
sea levels, and weather patterns. These predictions help governments, industries, and
communities prepare for extreme weather events and plan for climate adaptation
strategies.
How Data Science is used to Combat Climate Changes
Policy Analysis
Data science models assess the potential impact of climate policies on emissions,
energy consumption, and other relevant metrics. This analysis informs the
development of policy strategies.
How Data Science is used to Combat Climate Changes
Scenario Modeling
Data science enables the modeling of different climate scenarios, allowing
policymakers to evaluate the consequences of different courses of action and make
data-driven decisions.
Emissions Monitoring
Data science is used to monitor and verify greenhouse gas emissions, ensuring
compliance with emissions reduction targets.
How Data Science is used to Combat Climate Changes
Data Visualization
Data scientists use visualization techniques to present climate data in a
comprehensible and engaging manner, making the information accessible to a
broader audience.
How Data Science is used to Combat Climate Changes
Climate Education
Data science supports the development of educational tools and platforms that teach
students and the public about climate science and sustainable practices.
How Data Science is used to Combat Climate Changes
Satellite Technology
Data from remote sensing satellites is analyzed using data science to monitor
deforestation, land use changes, and carbon emissions. This information is crucial
for tracking compliance with international climate agreements.
How Data Science is used to Combat Climate Changes
7. Citizen Engagement and Crowdsourced Data
Engaging citizens in data collection and climate monitoring can enhance the data science
system. Functions in this area include:
Crowdsourcing
Citizens can contribute data on weather conditions, air quality, and other
environmental parameters through mobile apps and online platforms. Data science is
employed to process and integrate this crowdsourced data into climate models.
Community Resilience
Communities use data science and crowdsourced information to build resilience
against climate change impacts, such as local flooding and heatwaves.
How Data Science is used to Combat Climate Changes
8. Climate Finance and Investment
Data science is integral to climate finance by supporting the allocation of resources to
projects that reduce emissions and promote sustainability. Functions in this realm
include:
Investment Analysis
Data science helps assess the financial viability and environmental impact of
climate-related projects, facilitating investment decisions by governments,
organizations, and individuals.
How Data Science is used to Combat Climate Changes
Carbon Markets
Data science is used to develop and optimize carbon markets, ensuring accurate
measurement and verification of emissions reductions.
Impact Measurement
Data-driven impact assessment determines the effectiveness of climate finance
initiatives, enabling adjustments and improvements in resource allocation.
Data Science Tools and Techniques Used in
Climate Change
Data Science Tools and Techniques Used in Climate Change
To address the challenges of climate change, data scientists use a wide range of tools and
techniques:
2. Big Data Analytics: Climate data is massive and ever-growing. Big data analytics
tools help manage and process this data efficiently, making it easier to extract insights
and respond to climate events in real-time.
Data Science Tools and Techniques Use in Climate Change
4. Geospatial Analysis: Geospatial data and geographic information systems (GIS) are
used to understand the spatial distribution of climate-related phenomena, such as
temperature changes, sea-level rise, and the impact on local ecosystems.
Data Science Key Areas / Solutions in
Climate Change
Data Science Solutions in Climate Change
Some key areas where Data Science/Data-Driven solutions are making
a difference
1. Renewable Energy Optimization: Data science helps optimize the use of renewable
energy sources like wind and solar power. By analyzing historical weather data and
energy production, we can better predict when and where to deploy these resources for
maximum efficiency.
Electricity Systems
Enabling Low-Carbon Electricity
Enabling Low-Carbon Electricity Example
Enabling Low-Carbon Electricity
The power industry is trying to use AI and ML to introduce the smart grid.
1. IPCC. 2014. Climate Change 2014: Mitigation of Climate Change. Contribution of Working Group III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. O. Edenhofer et al. (Eds.). Intergovernmental Panel on Climate Change.
Enabling Low-Carbon Electricity Example
Enabling low carbon electricity
Low-carbon electricity sources are essential to tackling climate change.
These sources include solar panels, wind turbines, and other variable electricity
generators. Because their output fluctuates, they need natural gas plants, storage, or
other controllable sources ready to buffer changes in their output. Today this backup
is largely provided by coal and natural gas plants, which emit large amounts of CO2.
Enabling Low-Carbon Electricity
Enabling Low-Carbon Electricity Example
Role of ML in enabling low carbon electricity
ML can contribute to the research, deployment, and operation of electricity system
technologies.
ML can both reduce emissions from today’s standby generators and enable the
transition to carbon-free systems by helping improve necessary technologies
(namely forecasting, scheduling, and control) and by helping create advanced
electricity markets that accommodate both variable electricity and flexible demand.
Example
Forecasting supply and demand
ML methods can be used to forecast electricity supply and demand using historical data,
physical model outputs, images, and video data.
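Before training an ML model on historical data, forecasters usually set up a simple baseline to compare against. The sketch below is such a baseline, a moving-average forecast of the next hour's demand. It is not an ML method itself, and the demand values and window size are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("history is shorter than the averaging window")
    return sum(history[-window:]) / window

# Hypothetical hourly electricity demand in megawatts
hourly_demand_mw = [310.0, 320.0, 335.0, 330.0, 340.0, 350.0]

next_hour = moving_average_forecast(hourly_demand_mw)
print(f"Forecast for next hour: {next_hour:.1f} MW")
```

An ML model that also uses weather, calendar, or image data is only worth deploying if it clearly beats a baseline like this on held-out data.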
Enabling Low-Carbon Electricity Example
Improving scheduling and flexible demand
ML can help improve the existing (centralized) process of scheduling and dispatch by
speeding up power system optimization problems and improving the quality of
optimization solutions.
Transportation
Transportation Example
Role of ML in enabling low-carbon transportation
Alternative Fuels
Much of the transportation sector depends on liquid fossil fuels. Alternatives include
electrofuels, solar fuels, hydrogen, and natural gas.
• ML techniques can be used to identify the fuel options that emit the least CO2.
Transportation Example
Transport Modal Selection
Optimizing Buildings
Optimizing Buildings and Cities Example
When designing new buildings and improving existing ones, there are numerous
technologies that can reduce GHG emissions.
ML can be used for:
Modeling data on energy consumption, and
Optimizing energy use (in smart buildings).
Optimizing Buildings and Cities Example
Modeling data on energy consumption
• ML can be used to forecast the energy demand of specific buildings.
• Energy demand depends on the physical design and structure of the building.
• ML can be used to produce forecasts even without detailed knowledge of the
building's physical design and structure.
• ML can be used to transfer the knowledge gained from the design of one building
to the design of another.
Optimizing Buildings and Cities Example
Smart Buildings
In smart buildings, intelligent control systems can be used to decrease carbon
emissions.
• ML can be used to reduce energy usage.
• ML can be used to forecast which temperatures are needed throughout the system.
• ML can be used for automatic building diagnostics and maintenance through
fault detection.
• ML can be used to derive high-level patterns when designing strategies such as
district heating and cooling, integrating new technology within buildings, etc.
• ML can be used to model energy use across buildings.
• ML can be used for gathering infrastructure data.
Optimizing Buildings and Cities Example
Future of Cities
To develop smart cities, city governments try to regulate transportation, buildings,
and economic activity. To do this, they handle diverse issues, including energy, water,
waste, crime, health, etc.
Selected opportunities to reduce GHG emissions in industry using ML
Farms & Forests
Selected opportunities to reduce GHG emissions from land use using ML
Best Climate Datasets for
Machine Learning
Best Climate Datasets for Machine Learning
1. World Bank Climate Change Data
2. Climate Change: Earth Surface Temperature Data
3. International Greenhouse Gas Emissions
4. Daily Sea Ice Extent Data
5. Temperature Change Dataset
6. Air Quality Annual Summary
7. VEMAP 2: Annual Ecosystem Model Responses to U.S. Climate Change,
1994 - 2100
8. Climate Change Tweets Ids
9. EU emission trading system
Data Science (Machine Learning) Research
in Climate Change
The Data Science Research in Climate Change
The growth in the publications on applications of machine learning (ML) and deep learning (DL) in climate
change mitigation and adaptation (left) and the dominant subject areas (right).
The most frequent machine learning and deep learning methods applied for
climate change adaptation and mitigation.
Thank You
Questions / Answers