
VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY

HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY


FACULTY OF APPLIED SCIENCE
—***—

PROJECT REPORT
CLASS: CALCULUS 1 – CC10

TOPIC 19: INTRODUCTION TO AI


AI IN WEATHER FORECASTING AND AN OVERVIEW OF CLIMODE

Supervisor: Assoc. Prof. Phan Thành An

Student Name ID E-mail Task


Trần Phương Quỳnh Anh 2452094 [Link]@[Link] Content
Đào Gia Huy 2452373 huy.daovn06@[Link] Content
Vũ Lê Hoàng 2452365 [Link]@[Link] Research + Present
Lê Tuấn Hy 2452441 hy.le1so7@[Link] Experiment
Nguyễn Đỗ Bảo Khanh 2452486 khanh.nguyen2452486@[Link] Latex

HO CHI MINH CITY, DECEMBER 2024

ACKNOWLEDGEMENTS

First and foremost, we would like to express our deepest gratitude to Associate Professor Phan
Thành An, our supervisor, for his tremendous assistance and guidance during the development
of our project. Without his insights and feedback, completing this project would have been
impossible.

Furthermore, we would also like to extend our appreciation to the Faculty of Applied Science
of Ho Chi Minh University of Technology – Vietnam National University Ho Chi Minh City for
providing us the opportunities to attend the Calculus 1 study course. This program equipped us
with necessary knowledge and experience to continue our studies in future years as students of
this university.

Throughout this project, we have learned valuable lessons and grown in both knowledge and
skills. Nevertheless, we acknowledge that we are not yet professionals and that, despite our
best efforts, our abilities inevitably remain limited. We sincerely hope to receive constructive
and helpful advice from the professors so as to refine and improve our report.

Finally, we would like to thank everyone who has supported us throughout this project. Your
encouragement and assistance have been invaluable, and we are profoundly grateful.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS 2

ABSTRACT 3

A INTRODUCTION 5
1 Rationale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1 Current Situation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Objective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3 Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

B MAIN BODY 6
1 Theoretical Basis of ClimODE . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Key Concepts – Physics Integration . . . . . . . . . . . . . . . . . . . 6
1.2.1 Representation of Weather as a Flux . . . . . . . . . . . . . 6
1.2.2 The Lagrangian Derivative in a Flux . . . . . . . . . . . . . 7
1.2.3 The Advection Equation . . . . . . . . . . . . . . . . . . . . 7
1.2.4 Neural ODEs . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Neural Transport Model . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 Advection Equation . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Flow Velocity . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.3 Second-order PDE as a System of First-order ODEs . . . . . 10
1.3.4 Modeling Local and Global Effects . . . . . . . . . . . . . . 10
1.3.5 Spatiotemporal Embedding . . . . . . . . . . . . . . . . . . 11
1.3.6 Initial Velocity Inference . . . . . . . . . . . . . . . . . . . 11
1.3.7 System Sources and Uncertainty Estimation . . . . . . . . . 11
1.3.8 Loss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.9 Neural Network Architecture . . . . . . . . . . . . . . . . . 12
2 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Data Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Training and Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1 Strengths and Weaknesses . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.1 Strengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.2 Weaknesses . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Potential Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

C KEY TAKEAWAYS 19

REFERENCES 20

ABSTRACT

AI (Artificial Intelligence) has become a critical component in various domains, including


weather forecasting. A notable example is ClimODE, a model that integrates physics-based
methodologies with AI, employing neural ordinary differential equations (ODEs) to simulate
complex weather systems. By combining mathematical rigor with AI principles such as neu-
ral networks, ClimODE exemplifies the interdisciplinary potential of modern analytical ap-
proaches. Utilizing the key findings of the research “ClimODE: Climate and Weather Forecasting
with Physics-informed Neural ODEs”, our report provides our understanding and a detailed
examination of the model’s fundamental mechanisms and serves as an introduction to the
foundational concepts underlying AI training methodologies.

INTRODUCTION

1. Rationale
1.1. Current Situation
In recent years, a lingering problem that the majority of weather prediction systems suffer from
is insensitivity to sudden changes in climate and atmospheric data, many of which can be signs
of unexpected natural disasters. Highly advanced models, though more accurate, require
intensely precise numerical simulation, and access to them is often highly restricted for the
public.

1.2. Solution
These limitations of current weather forecasting systems are addressed with ClimODE, a spa-
tiotemporal continuous-time process that implements a key principle of advection from statis-
tical mechanics, namely, weather changes due to a spatial movement of quantities over time.
ClimODE models precise weather evolution with value-conserving dynamics, learning global
weather transport as a neural flow, which also enables estimating the uncertainty in predictions.
The approach outperforms existing data-driven methods in global and regional forecasting with
an order of magnitude smaller parameterization, establishing a new state of the art.

2. Objective
By analyzing the fundamental principles behind ClimODE, particularly its use of neural ordi-
nary differential equations (ODEs), the report aims to demonstrate how mathematical frame-
works and AI-driven approaches can be related in such models. A key focus will be on under-
standing the specific AI techniques employed in ClimODE, such as neural networks and ma-
chine learning algorithms, and evaluating their role in enhancing weather forecasting accuracy.
Furthermore, the analysis will use ClimODE as an entry point to introduce core AI training con-
cepts, such as data optimization, uncertainty estimation, and hybrid modeling. Lastly, we also
address the accessibility, limitations, and future implications of models like ClimODE, empha-
sizing their potential to democratize advanced forecasting tools while addressing the pressing
need for more accurate and inclusive weather prediction systems.

3. Significance
By choosing ClimODE, we hope to highlight how the model combines AI with traditional
physics-based methods to enhance weather forecasting accuracy. We also aim to emphasize the
significance of AI in continuous estimation, which is an important part of current climate and
weather prediction. The involvement of AI can make models like ClimODE more accessible to
the general public, especially when it comes to predicting natural disasters.

MAIN BODY

1. Theoretical Basis of ClimODE


1.1. Overview
The ClimODE model is designed for weather forecasting by representing weather as the inter-
action of multiple spatiotemporal quantities (such as temperature, humidity, etc.). It employs a
neural ordinary differential equation (ODE) approach, where the change in weather conditions
is modeled over time and space under the influence of a fluid with a velocity that is calculated
using a combination of neural networks. Specifically:

• The model utilizes a convolutional neural network (CNN) to capture local spatial depen-
dencies between weather variables.
• It incorporates an attention mechanism to account for global interactions across regions.
• The velocity field of the weather system is represented using a second-order neural ODE,
which helps in accurately modeling the transport and compression effects of the system.
• The weather embeddings also incorporate additional information about the day-night cycle
and more into the neural network.
• The model introduces uncertainty by using a Gaussian emission model to estimate the bias
(mean) and uncertainty (variance) of the weather forecasts.
• It handles large-scale spatiotemporal data by leveraging both local convolutional opera-
tions and global spatiotemporal embeddings to represent periodic effects like daily and
seasonal cycles.

1.2. Key Concepts – Physics Integration


1.2.1. Representation of Weather as a Flux

In statistical mechanics, weather can be described as a flux, a spatial movement of quantities


over time represented by the partial differential continuity equation

\[
\underbrace{\frac{du}{dt}}_{\text{time evolution } \dot{u}}
+ \underbrace{\underbrace{v \cdot \nabla u}_{\text{transport}} + \underbrace{u\, \nabla \cdot v}_{\text{compression}}}_{\text{advection}}
= \underbrace{s}_{\text{sources}},
\]

where u(x, t) is a any quantity (e.g. temperature) evolving over space x ∈ Ω and time t ∈ R
driven by a flow’s velocity v(x, t) ∈ Ω and sources s(x,t). The advection moves and redis-
tributes existing weather mass, while sources add or remove quantities. Crucially, the dynamics
need to be time-continuous and modeling them with autoregressive jumps will incur approx-
imation errors, meaning for pure Deep Learning models that rely on discrete data points i.e.
radar images, they will not be continuously timed and will incur approximation errors.

1.2.2. The Lagrangian Derivative in a Flux

To understand how weather can be described as a flux, we need to understand each term of the
equation. First, we explore the term
\[
\frac{du}{dt},
\]

which is known as the material (or Lagrangian) derivative and represents the total derivative of
the quantity u with respect to time. In fluid systems, this term is normally expanded as

\[
\frac{du}{dt} = \frac{\partial u}{\partial t} + v \cdot \nabla u,
\]

because ∂u/∂t represents the local rate of change of u with respect to time at a fixed point in
space, while in a fluid system the quantity is also affected by the flow of the medium transporting
it, hence the second term v · ∇u. However, in the continuity equation above the transport term is
already written explicitly, leaving du/dt to represent only the local rate of change of u with
respect to time:

\[
\frac{du}{dt} = \frac{\partial u}{\partial t}.
\]

1.2.3. The Advection Equation

Next, we explore the advection equation


v · ∇u + u∇ · v
by breaking it into 2 smaller components: the transport and compression terms and examine
how these 2 terms combined can represent the way a fluid affects how u changes over space
and time.
We see the transport term written as a function of the flow velocity v and the quantity u(x, t),

\[
v \cdot \nabla u,
\]

which is the dot product of two vectors and therefore a scalar. In an n-dimensional space, v is a
vector with n elements representing the velocity of the flow in the n directions. Similarly, ∇u is
the gradient of u, also a vector of n elements, pointing in the direction of the greatest rate of
change of u:

\[
\nabla u = \left( \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n} \right).
\]

So, when we take the dot product of v and ∇u, the resulting scalar can take three kinds of values:

• Positive: the flow is moving with the direction of increasing u.
• Negative: the flow is moving against the direction of increasing u, i.e. it is carrying u from
higher values toward lower values.
• Zero: the flow is moving perpendicular to ∇u and does not change u through transport.

We now review the compression term, which describes how the divergence of the fluid’s velocity
affects u over time and space:

\[
u\, \nabla \cdot v,
\]

which multiplies u by the divergence of v. The divergence of the velocity,

\[
\nabla \cdot v,
\]

measures how much the flow expands or compresses at a given point in space. For a velocity
field on an n-dimensional domain it can be expressed as

\[
\nabla \cdot v = \sum_{i=1}^{n} \frac{\partial v_i}{\partial x_i};
\]

for example, in ordinary 3D space this becomes

\[
\nabla \cdot v = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z},
\]

and the expression results in a scalar with three possible outcomes:

• Positive: the fluid is diverging (spreading out from a point).
• Negative: the fluid is converging (being compressed toward a point).
• Zero: the fluid is incompressible at that point, meaning there is no expansion or compression.

Finally, taking the product of u and this scalar describes how u is affected by the compression of
the fluid. If the fluid is being compressed, the divergence is negative and the compression term
acts to increase u, because the quantity is concentrated by the converging flow; the opposite
holds where the divergence is positive and the flow spreads the quantity out.
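To make the transport and compression terms concrete, the short sketch below is our own illustration (not code from the paper): it evaluates both terms of the continuity equation on a small 2D grid with NumPy finite differences, with the grid size, spacing, and random fields chosen purely for demonstration.

import numpy as np

H, W = 64, 128                      # latitude x longitude grid (assumed)
dy = dx = 1.0                       # assumed uniform grid spacing

u = np.random.rand(H, W)            # a scalar quantity, e.g. temperature
v = np.random.rand(2, H, W) - 0.5   # flow velocity components (v_y, v_x)

# Spatial gradient of u: central differences along each axis.
du_dy, du_dx = np.gradient(u, dy, dx)

# Transport term v . grad(u): dot product of velocity and gradient.
transport = v[0] * du_dy + v[1] * du_dx

# Compression term u * div(v): divergence is the sum of the partial
# derivatives of each velocity component along its own axis.
dvy_dy = np.gradient(v[0], dy, axis=0)
dvx_dx = np.gradient(v[1], dx, axis=1)
compression = u * (dvy_dy + dvx_dx)

# Local time evolution implied by the advection-only continuity equation
# (no sources): du/dt = -(transport + compression).
du_dt = -(transport + compression)
print(du_dt.shape)  # (64, 128)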

1.2.4. Neural ODEs

In normal neural networks, the hidden state of the network in forward propagation is updated
with a discrete number of layers (a finite number of layers) and can be represented as
\[
h_{t+1} = f(h_t, \theta_t),
\]

where ht+1 is the hidden state of the next time step, ht , θt are the hidden state and the learned
parameter of the current time step and f is a transformation function (convolution, linear, acti-
vation...). In neural ODEs, the hidden state is governed continuously over time according to an
ODE (ordinary differential equation) and can be expressed as:

\[
\frac{dh(t)}{dt} = f(h(t), t; \theta),
\]

where f is the entire neural network and θ are the learned parameters of the neural network. To
solve for h(t), an ODE solver is used to integrate the function.
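As a concrete illustration of this idea, the following minimal sketch (an assumed setup, not the paper's implementation) defines a small network f(h(t), t; θ) and integrates the hidden state continuously with torchdiffeq's odeint:

import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq


class ODEFunc(nn.Module):
    """The network f(h(t), t; theta) that defines dh/dt."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, dim))

    def forward(self, t, h):
        # torchdiffeq passes the current time t and state h at every solver step.
        return self.net(h)


func = ODEFunc(dim=8)
h0 = torch.randn(4, 8)                    # initial hidden state (batch of 4)
t = torch.linspace(0.0, 1.0, steps=5)     # times at which to report h(t)

# The ODE solver integrates dh/dt = f(h, t) forward in time; gradients flow
# through the solve, so the function can be trained with standard autodiff.
h_t = odeint(func, h0, t, method="euler")
print(h_t.shape)  # torch.Size([5, 4, 8])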

Figure B.1: Whole prediction pipeline for ClimODE

1.3. Neural Transport Model


1.3.1. Advection Equation

The research models weather as a spatiotemporal process of combined quantities (e.g. heat,
humidity) and mathematically expresses them as a set of functions over continuous time t and
space x
u(x, t) = (u1 (x, t), ..., uK (x, t)) ∈ RK

with
uk (x, t) ∈ R and t ∈ R

and latitude-longitude locations x as


x = (h, w) ∈ Ω = [−90◦ , 90◦ ] × [−180◦ , 180◦ ] ⊂ R2

The weather is a flux as discussed above and thus follows the advection PDE
\[
\dot{u}_k(x, t) = \underbrace{-\, v_k(x, t) \cdot \nabla u_k(x, t)}_{\text{transport}} \; \underbrace{-\, u_k(x, t)\, \nabla \cdot v_k(x, t)}_{\text{compression}},
\]

where quantity change u̇k (x, t) is caused by the flow, whose velocity vk ∈ Ω transports and
concentrates air mass (which in turn affects the quantity).

1.3.2. Flow Velocity

The research modeled the flow velocity v with a second-order flow by parameterizing the
change of velocity with a neural network fθ ,
v̇k (x, t) = fθ (u(t), ∇u(t), v(t), ψ),
meaning the rate of change of the velocity, v̇k(x, t), is controlled by a neural network that takes
several inputs:

1. u(t): the current state (the hidden state of the neural ODE), u(t) = {u(x, t) : x ∈ Ω} ∈ R^(K×H×W).
2. ∇u(t): the gradient of u with respect to space at a fixed time, ∇u(t) = (∂u/∂h, ∂u/∂w) ∈ R^(2K×H×W).
3. v(t): the current flow velocity at time t, v(t) = {v(x, t) : x ∈ Ω} ∈ R^(2K×H×W).
4. ψ: the spatiotemporal embeddings of the neural network, which can contain additional
information about the weather, with shape ψ ∈ R^(C×H×W).

These inputs denote global frames at time t discretized to a resolution (H, W), with a total of
5K quantity channels and C embedding channels.
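For intuition on the shapes involved, the sketch below (our own illustration; the channel counts and the use of torch.gradient for the spatial derivatives are assumptions) assembles the 5K + C input channels that fθ receives at one time step:

import torch

K, C, H, W = 5, 8, 32, 64                  # quantities, embedding channels, grid (assumed)

u = torch.randn(K, H, W)                   # current state u(t)
v = torch.randn(2 * K, H, W)               # current velocity v(t), two components per quantity
psi = torch.randn(C, H, W)                 # spatiotemporal embeddings

# Spatial gradient of u along the latitude (h) and longitude (w) axes.
du_dh = torch.gradient(u, dim=1)[0]
du_dw = torch.gradient(u, dim=2)[0]
grad_u = torch.cat([du_dh, du_dw], dim=0)  # shape (2K, H, W)

# Channel-wise concatenation gives the 5K + C input frame for f_theta.
x = torch.cat([u, grad_u, v, psi], dim=0)
print(x.shape)  # torch.Size([33, 32, 64]) = (5*K + C, H, W)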

1.3.3. Second-order PDE as a System of First-order ODEs

The paper uses the method of lines (MOL) to discretize the PDE into a grid of location-specific
ODEs. Additionally, a second-order differential equation can be transformed into a pair of first-
order differential equations. Combining these techniques yields a system of first-order ODEs
(uki (t), vki (t)) of quantities k at location xi :
\[
\begin{bmatrix} u(t) \\ v(t) \end{bmatrix}
= \begin{bmatrix} u(t_0) \\ v(t_0) \end{bmatrix}
+ \int_{t_0}^{t} \begin{bmatrix} \dot{u}(\tau) \\ \dot{v}(\tau) \end{bmatrix} d\tau
= \begin{bmatrix} \{u_k(t_0)\}_k \\ \{v_k(t_0)\}_k \end{bmatrix}
+ \int_{t_0}^{t} \begin{bmatrix} \{-\nabla \cdot (u_k(\tau) v_k(\tau))\}_k \\ \{f_\theta(u(\tau), \nabla u(\tau), v(\tau), \psi)_k\}_k \end{bmatrix} d\tau,
\]

where τ ∈ R is an integration time. Backpropagation through the ODE is compatible with
standard autodiff in neural network training. The forward solution u(t) can be accurately
approximated with numerical solvers such as Runge-Kutta at low computational cost.
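The following sketch is a simplified stand-in for this construction (our own code, with an assumed toy fθ and a crude wrap-around finite-difference divergence): the solver state stacks u and v along the channel axis, and the ODE function returns their time derivatives, so a single first-order solve advances the second-order dynamics.

import torch
import torch.nn as nn
from torchdiffeq import odeint


class SecondOrderFlow(nn.Module):
    def __init__(self, k: int):
        super().__init__()
        self.k = k
        # Stand-in for f_theta: maps (u, v) channels to an acceleration field dv/dt.
        self.f_theta = nn.Conv2d(3 * k, 2 * k, kernel_size=3, padding=1)

    def forward(self, t, state):
        # state stacks u (K channels) and v (2K channels) along the channel axis.
        u, v = state[: self.k], state[self.k:]
        # du/dt from the advection equation: -div(u * v), using a simple
        # forward difference with periodic wrap-around.
        flux_h, flux_w = u * v[: self.k], u * v[self.k:]
        div = (torch.roll(flux_h, -1, dims=1) - flux_h) + (torch.roll(flux_w, -1, dims=2) - flux_w)
        du_dt = -div
        # dv/dt from the neural network.
        dv_dt = self.f_theta(state.unsqueeze(0)).squeeze(0)
        return torch.cat([du_dt, dv_dt], dim=0)


K, H, W = 5, 32, 64
flow = SecondOrderFlow(K)
state0 = torch.cat([torch.randn(K, H, W), torch.randn(2 * K, H, W)], dim=0)
t = torch.linspace(0.0, 1.0, steps=4)
traj = odeint(flow, state0, t, method="rk4")   # shape (4, 3K, H, W)
u_t, v_t = traj[:, :K], traj[:, K:]
print(u_t.shape, v_t.shape)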

1.3.4. Modeling Local and Global Effects

With the model above, the acceleration v̇ is computed only from the current state and its gradient
at the same location x and time t, ruling out long-range connections in which accelerations at
many points and many time frames affect each other.
The research paper gives an example of this: Atlantic weather conditions, u(x1, t1), v(x1, t1),
can affect future weather patterns in Europe and Africa, u(x2, t2), v(x2, t2).
To capture these long-range dependencies, the research proposes a hybrid network that accounts
for both local transport and global effects (the name sounds fancy, but it is simply a neural
network combined with an attention mechanism),

\[
f_\theta(u(t), \nabla u(t), v(t), \psi) = \underbrace{f_{\text{conv}}(u(t), \nabla u(t), v(t), \psi)}_{\text{convolution network}} + \gamma\, \underbrace{f_{\text{att}}(u(t), \nabla u(t), v(t), \psi)}_{\text{attention network}}.
\]

Local Convolutions
To capture local effects, the model uses a local convolutional network, denoted fconv. This
network is parameterized using ResNets with 3x3 convolution layers. The local receptive field
of convolution layers allows the model to aggregate weather information up to a distance of
L pixels from the location x, where L is the depth of the network.
Attention Convolutional Network
To capture global information, the model employs an attention convolutional network fatt that
considers states across the entire Earth, enabling long-distance connections. This attention
network is structured as dot-product attention, with Key, Query and Value parameterized by
CNNs and γ as a learnable weight.
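A minimal sketch of this hybrid design is given below; it is our own simplified stand-in (the layer sizes, 1x1-convolution key/query/value maps, and zero-initialized γ are assumptions), not the paper's architecture.

import torch
import torch.nn as nn


class HybridVelocityNet(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, attn_dim: int = 16):
        super().__init__()
        # Local branch: small conv stack (stand-in for the ResNet used in the paper).
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.LeakyReLU(),
            nn.Conv2d(64, out_ch, 3, padding=1),
        )
        # Global branch: key / query / value maps parameterized with 1x1 convolutions.
        self.to_q = nn.Conv2d(in_ch, attn_dim, 1)
        self.to_k = nn.Conv2d(in_ch, attn_dim, 1)
        self.to_v = nn.Conv2d(in_ch, out_ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable mixing weight

    def forward(self, x):                                   # x: (B, in_ch, H, W)
        local = self.conv(x)
        B, _, H, W = x.shape
        q = self.to_q(x).flatten(2).transpose(1, 2)          # (B, HW, d)
        k = self.to_k(x).flatten(2)                          # (B, d, HW)
        v = self.to_v(x).flatten(2).transpose(1, 2)          # (B, HW, out_ch)
        attn = torch.softmax(q @ k / q.shape[-1] ** 0.5, dim=-1)   # global pairwise weights
        global_out = (attn @ v).transpose(1, 2).reshape(B, -1, H, W)
        return local + self.gamma * global_out


net = HybridVelocityNet(in_ch=33, out_ch=10)   # e.g. 5K + C channels in, 2K channels out
out = net(torch.randn(1, 33, 32, 64))
print(out.shape)  # torch.Size([1, 10, 32, 64])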

1.3.5. Spatiotemporal Embedding

Embeddings are additional information about the input given to the neural network so that it can
make better predictions. In the case of weather forecasting, the research adds weather-related
embeddings built from several encodings.

Day and Season The research encodes the daily and seasonal periodicity of time t with
trigonometric time embeddings

\[
\psi(t) = \left[ \sin 2\pi t, \; \cos 2\pi t, \; \sin \frac{2\pi t}{365}, \; \cos \frac{2\pi t}{365} \right].
\]

Location The research also encodes latitude h and longitude w with trigonometric and
spherical-position encodings

\[
\psi(x) = \left[ \{\sin, \cos\} \times \{h, w\}, \; \sin(h)\cos(w), \; \sin(h)\sin(w) \right].
\]

Joint time-location embedding A joint time-location embedding is created by


combining position and time encodings (ψ(t)×ψ(x)), capturing the cyclical patterns of day and
season across different locations on the map. Additionally, the research incorporates constant
spatial and time features, with ψ(h) and ψ(w) representing 2D latitude and longitude maps, and
lsm and oro denoting static variables in the data,
\[
\psi(x, t) = [\psi(t), \psi(x), \psi(t) \times \psi(x), \psi(c)], \qquad \psi(c) = [\psi(h), \psi(w), \text{lsm}, \text{oro}].
\]
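The sketch below illustrates how such embeddings might be computed and tiled into per-pixel channels; the grid size, the time convention (t measured in days), and the exact channel ordering are our own assumptions rather than the paper's.

import torch


def time_embedding(t_days: float) -> torch.Tensor:
    """psi(t): daily and yearly periodicity, with t measured in days."""
    t = torch.tensor(t_days)
    return torch.stack([
        torch.sin(2 * torch.pi * t), torch.cos(2 * torch.pi * t),              # daily cycle
        torch.sin(2 * torch.pi * t / 365), torch.cos(2 * torch.pi * t / 365),  # seasonal cycle
    ])


def location_embedding(H: int = 32, W: int = 64) -> torch.Tensor:
    """psi(x): trigonometric and spherical-position encodings per grid cell."""
    lat = torch.deg2rad(torch.linspace(-90, 90, H)).view(H, 1).expand(H, W)
    lon = torch.deg2rad(torch.linspace(-180, 180, W)).view(1, W).expand(H, W)
    return torch.stack([
        torch.sin(lat), torch.cos(lat), torch.sin(lon), torch.cos(lon),
        torch.sin(lat) * torch.cos(lon), torch.sin(lat) * torch.sin(lon),
    ])  # shape (6, H, W)


psi_t = time_embedding(t_days=12.25)     # shape (4,)
psi_x = location_embedding()             # shape (6, 32, 64)
# Broadcast the time features over the grid and concatenate along channels.
psi = torch.cat([psi_t.view(-1, 1, 1).expand(-1, 32, 64), psi_x], dim=0)
print(psi.shape)  # torch.Size([10, 32, 64])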

1.3.6. Initial Velocity Inference

The neural transport model requires an initial velocity estimate, v̂k (x0 , t0 ), to start the ODE so-
lution. In traditional dynamic systems, estimating velocity poses a challenging inverse problem.
However, by representing weather as a flux, we can use the established identity u̇+∇·(uv) = 0
to directly solve for the missing velocity when observing state u. The velocity is optimized for
location x, quantity k and time t with penalized-least squares

\[
\hat{v}_k(t) = \arg\min_{v_k(t)} \; \big\| \tilde{\dot{u}}_k(t) + v_k(t) \cdot \tilde{\nabla} u_k(t) + u_k(t)\, \tilde{\nabla} \cdot v_k(t) \big\|_2^2 + \alpha \big\| v_k(t) \big\|_K.
\]
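As an illustration of this inverse problem, the sketch below recovers a velocity field by minimizing a penalized least-squares objective of this form with gradient descent; the wrap-around finite differences and the plain L2 penalty (standing in for the kernel norm ‖·‖K) are our own simplifications.

import torch

def ddx(f, dim):
    # Simple forward difference with periodic wrap-around (differentiable).
    return torch.roll(f, shifts=-1, dims=dim) - f

K, H, W = 2, 32, 64
u = torch.randn(K, H, W)        # state u_k at t0
u_dot = torch.randn(K, H, W)    # empirical time derivative of u_k (assumed given)
alpha = 1e-2

v = torch.zeros(2 * K, H, W, requires_grad=True)   # unknown velocity (v_h, v_w) per quantity
opt = torch.optim.Adam([v], lr=0.05)

du_dh, du_dw = ddx(u, 1), ddx(u, 2)
for _ in range(200):
    opt.zero_grad()
    v_h, v_w = v[:K], v[K:]
    div_v = ddx(v_h, 1) + ddx(v_w, 2)
    # Residual of the continuity identity  u_dot + v . grad(u) + u div(v) = 0.
    residual = u_dot + v_h * du_dh + v_w * du_dw + u * div_v
    loss = (residual ** 2).mean() + alpha * (v ** 2).mean()
    loss.backward()
    opt.step()

print(float(loss))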

1.3.7. System Sources and Uncertainty Estimation

The system described so far has two limitations:

1. The system is deterministic (it outputs a single value from an input) and thus carries no
uncertainty, which is harmful for a prediction task because there will always be some
amount of uncertainty in predicting the future.
2. The system is closed and does not allow values to be lost or gained, e.g. through the
day-night cycle.

The research tackles both issues by adding an emission g outputting a bias µk (x, t) and variance
σk2 (x, t) of uk (x, t) as a Gaussian distribution,

\[
u_k^{\mathrm{obs}}(x, t) \sim \mathcal{N}\!\left( u_k(x, t) + \mu_k(x, t),\; \sigma_k^2(x, t) \right),
\qquad \mu_k(x, t), \sigma_k(x, t) = g_k\big(u(x, t), \psi\big).
\]

The observed value u_k^obs is modeled as coming from a Gaussian distribution with:
• Mean: uk(x, t) + µk(x, t), where µk(x, t) represents the bias over time (value gained or lost
over time).
• Variance: σk2 (x, t), which captures the uncertainty of the estimate for that quantity, refer-
ring to how much the actual value might fluctuate. This allows the model to capture both
aleatoric uncertainty and epistemic uncertainty.
• Both the bias and the variance are functions of the current state of the system, parame-
terized by an emission network gk which takes u(x, t), the current state of the modeled
quantities, and ψ, the spatiotemporal embeddings.

1.3.8. Loss

The model expects a full-Earth dataset D = (y1, . . . , yN) of N timepoints of observed frames
yi ∈ R^(K×H×W) at times ti, organized on a dense and regular spatial grid (H, W). Training
minimizes the negative log-likelihood of the observations yi:

\[
\mathcal{L}(\theta; D) = -\frac{1}{NKHW} \sum_{i=1}^{N} \left( \log \mathcal{N}\!\big( y_i \,\big|\, u(t_i) + \mu(t_i),\, \mathrm{diag}\,\sigma^2(t_i) \big) + \log \mathcal{N}_{+}\!\big( \sigma(t_i) \,\big|\, 0,\, \lambda_\sigma^2 I \big) \right),
\]
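A simplified version of this objective can be written as follows; using a plain Gaussian log-density on σ (standing in for the positive-support term N+) and the value of λσ are our own assumptions.

import torch

def climode_style_nll(y, u, mu, sigma, lam_sigma: float = 1.0):
    """y, u, mu, sigma: tensors of shape (N, K, H, W); sigma > 0."""
    # Gaussian log-likelihood of observations around the biased prediction.
    pred = torch.distributions.Normal(u + mu, sigma)
    log_lik = pred.log_prob(y)
    # Penalty keeping sigma small (a zero-mean Gaussian log-density as a stand-in
    # for the positive-support prior in the second term of the loss).
    sigma_prior = torch.distributions.Normal(torch.zeros_like(sigma), lam_sigma)
    log_prior = sigma_prior.log_prob(sigma)
    return -(log_lik + log_prior).mean()   # mean over N*K*H*W entries

N, K, H, W = 2, 5, 32, 64
y, u, mu = torch.randn(N, K, H, W), torch.randn(N, K, H, W), torch.zeros(N, K, H, W)
sigma = torch.rand(N, K, H, W) + 0.1
print(climode_style_nll(y, u, mu, sigma).item())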

1.3.9. Neural Network Architecture

The neural network includes features of base ResNets:

• Leaky ReLU activation function


• Batch Normalization
• 3x3 Convolution
• Dropout layer
• Skip connections

2. Experiments
2.1. Data Processing
The model is trained on the ERA5 dataset from WeatherBench, which includes historical weather
data. The data covers variables such as ground temperature, atmospheric temperature, geopo-
tential, and wind speeds.
The data is preprocessed by normalizing the weather variables to the range [0, 1] using min-max
scaling. Spatiotemporal embeddings (time of day, season, latitude, longitude) are also created
and added as additional input channels to the neural network.
Data organization for training and validation The training data spans ten years
(2006-2015), validation data is from 2016, and testing data is from 2017-2018. The spatial
data is discretized into latitude-longitude grids, and weather data is fed into the model in 6-hour
increments.
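For concreteness, a minimal sketch of the min-max normalization step (with synthetic data and assumed array shapes) is:

import numpy as np

def minmax_normalize(x: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Scale one variable's values to [0, 1]; keep min/max to invert later."""
    lo, hi = float(x.min()), float(x.max())
    return (x - lo) / (hi - lo), lo, hi

# e.g. ground temperature over (time, lat, lon), synthetic values in Kelvin
t2m = 230.0 + 60.0 * np.random.rand(1000, 32, 64)
t2m_norm, t_lo, t_hi = minmax_normalize(t2m)
print(t2m_norm.min(), t2m_norm.max())        # ~0.0  ~1.0

# Predictions are mapped back to physical units with the stored statistics.
t2m_restored = t2m_norm * (t_hi - t_lo) + t_lo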

Figure B.2: Quantities available in the ERA5 dataset [2]

2.2. Training and Evaluation


Experimental environment (hardware, operating system, programming language, packages, ...)
The model is implemented using PyTorch and torchdiffeq for solving the ODEs. It was trained
on a 32GB NVIDIA V100 GPU using the Euler solver for ODEs.

The metrics to evaluate this model


The model is evaluated using:

1. Latitude-weighted RMSE (Root Mean Square Error): To measure the difference between
predicted and true weather values.

\[
\mathrm{RMSE} = \frac{1}{N} \sum_{t=1}^{N} \sqrt{ \frac{1}{HW} \sum_{h=1}^{H} \sum_{w=1}^{W} \alpha(h) \left( y_{thw} - u_{thw} \right)^2 },
\]

where:

• N is the number of time points


• H × W are the spatial grid dimensions, where H is the number of latitude points and W
is the number of longitude points.

• ythw and uthw represent the observed and predicted values at time t, latitude h, and longitude
w, respectively.
• \( \alpha(h) = \frac{\cos(h)}{\frac{1}{H}\sum_{h'} \cos(h')} \) is the latitude weight, accounting for the Earth’s curvature and
normalizing the weight of different latitudes.

2. ACC (Anomaly Correlation Coefficient): To assess the model’s ability to capture deviations
from normal weather patterns.

\[
\mathrm{ACC} = \frac{\sum_{t,h,w} \alpha(h)\, \tilde{y}_{thw}\, \tilde{u}_{thw}}{\sqrt{ \sum_{t,h,w} \alpha(h)\, \tilde{y}_{thw}^2 \; \sum_{t,h,w} \alpha(h)\, \tilde{u}_{thw}^2 }},
\]

where \(\tilde{y}_{thw} = y_{thw} - C\) and \(\tilde{u}_{thw} = u_{thw} - C\) are the anomalies of the observed and predicted
values, calculated as the difference from the empirical mean \( C = \frac{1}{N} \sum_{t,h,w} y_{thw} \).
3. CRPS (Continuous Ranked Probability Score): Used to assess both the accuracy and uncer-
tainty of predictions.
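The first two metrics can be implemented directly from the formulas above; the sketch below is our own implementation with assumed shapes and with the empirical mean used as the climatology.

import numpy as np

def lat_weights(lats_deg: np.ndarray) -> np.ndarray:
    """alpha(h) = cos(h) / mean over latitudes of cos(h')."""
    w = np.cos(np.deg2rad(lats_deg))
    return w / w.mean()

def lat_weighted_rmse(y, u, alpha):
    # y, u: (N, H, W); alpha: (H,)
    se = alpha[None, :, None] * (y - u) ** 2
    return np.mean(np.sqrt(se.mean(axis=(1, 2))))   # sqrt inside the average over time

def lat_weighted_acc(y, u, alpha):
    c = y.mean()                                     # empirical mean as climatology
    ya, ua = y - c, u - c
    a = alpha[None, :, None]
    num = (a * ya * ua).sum()
    den = np.sqrt((a * ya ** 2).sum() * (a * ua ** 2).sum())
    return num / den

N, H, W = 10, 32, 64
lats = np.linspace(-90, 90, H)
y, u = np.random.rand(N, H, W), np.random.rand(N, H, W)
alpha = lat_weights(lats)
print(lat_weighted_rmse(y, u, alpha), lat_weighted_acc(y, u, alpha))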

How this model fares against competing models


ClimODE outperforms other neural methods like ClimaX, FourCastNet, and GraphCast in terms
of forecast accuracy (RMSE, ACC), and also provides uncertainty estimates, which many com-
peting models lack. It delivers comparable performance to traditional Numerical Weather Pre-
diction (NWP) models but at a fraction of the computational cost.

Figure B.3: RMSE(↓) and ACC(↑) comparison with baselines. ClimODE outperforms competitive neural methods
across different metrics and variables.

3. Conclusion
3.1. Strengths and Weaknesses
3.1.1. Strengths

• This model has the advantage of operating in continuous time, which overcomes the weak-
ness of predicting in discrete steps typical of conventional deep learning networks.
• It also incorporates physics into the neural network, so it does not blindly predict the
next state: predictions are grounded in factual information through the physics equation.
• Neural ODEs also make ClimODE more computationally accessible while keeping the
results accurate compared to pure numerical weather simulations.
• Uncertainty estimation and the emission model help ClimODE adapt to realistic weather
circumstances where values are constantly added and removed.

3.1.2. Weaknesses

While the ClimODE model represents a significant advancement in the combination of AI and
physics-based modeling for weather forecasting, it has potential weaknesses that could limit its
application and effectiveness:

• While ClimODE improves overall accuracy and uncertainty estimation, its performance in
capturing rare or extreme events is not explicitly highlighted, and this remains a persistent
challenge in climate modeling.
• ClimODE is effective for medium-range forecasts but may face difficulties with long-term
climate changes due to limitations in the training dataset’s historical scope and assumptions
of stationarity in climate processes.
• Modeling complex feedback loops in the Earth system, such as ocean-atmosphere inter-
actions, requires additional enhancements beyond ClimODE’s current capabilities.
• Despite incorporating physical principles, the neural network’s intermediate representa-
tions may still lack physical interpretability compared to traditional physics-based models.

3.2. Potential Growth


There are many areas in which the ClimODE model can improve to further expand its capa-
bilities and accuracy, and to make it more accessible for practical use in weather forecasting
and climate science. Some key avenues for growth include:

• Focus on loss functions or sampling techniques tailored for extreme events to improve
reliability under rare scenarios.
• Incorporate dynamic or evolving physical processes to account for non-stationary climate
changes.

• Combine ClimODE with traditional numerical weather prediction (NWP) models to lever-
age strengths from both approaches.
• Extend ClimODE to include multi-physics systems where atmospheric, oceanic, and land-
surface dynamics are simultaneously modeled. [1]
• Use techniques like saliency maps, feature attribution, or symbolic regression to map
ClimODE’s outputs to interpretable physical variables.

Aspect-by-aspect comparison of ClimODE, Numerical Weather Prediction (NWP) [4], and Earth System Models (ESM) [1]:

Primary Focus
  - ClimODE: Neural-based modeling of atmospheric dynamics with physical constraints (e.g., conservation laws).
  - NWP: High-resolution, physics-driven simulations for short- to medium-term weather forecasting.
  - ESM: Comprehensive long-term climate modeling, incorporating atmosphere, ocean, cryosphere, and biosphere.

Mathematical Framework
  - ClimODE: Neural ODEs solving advection equations with embedded physical biases (e.g., mass conservation).
  - NWP: Solves systems of partial differential equations (PDEs) representing atmospheric physics and dynamics.
  - ESM: Integrates PDEs for multiple Earth subsystems, simulating interactions among them.

Spatiotemporal Scope
  - ClimODE: Global and regional forecasting, focused on short- to medium-range predictions (hours to days).
  - NWP: Regional or global weather patterns on timescales ranging from hours to about 10 days.
  - ESM: Long-term predictions (months to centuries), focusing on global climate dynamics.

Data Requirements
  - ClimODE: Requires preprocessed reanalysis datasets (e.g., ERA5), primarily for atmospheric variables.
  - NWP: Relies on real-time observational data (e.g., satellite, ground stations) and initial-condition assimilation.
  - ESM: Historical and geological data, as well as coupled initial conditions for the various subsystems.

Computational Cost
  - ClimODE: Low to medium; trainable on a single GPU, with efficiency due to neural flow modeling.
  - NWP: High; requires significant computational power for real-time assimilation and simulations on HPCs.
  - ESM: Extremely high; long integration times and complex subsystem interactions demand supercomputing resources.

Physical Interpretability
  - ClimODE: Moderate; enforces physical principles (e.g., advection, continuity) but lacks a direct mapping to physical phenomena.
  - NWP: High; directly solves physically interpretable equations with clear connections to real-world phenomena.
  - ESM: High; simulates subsystem interactions explicitly, providing detailed physical insights.

Flexibility
  - ClimODE: Limited to atmospheric dynamics, but can include other subsystems with additional neural components.
  - NWP: Focused on atmospheric physics; less flexible for broader Earth-system modeling.
  - ESM: Very flexible; explicitly incorporates multiple Earth subsystems and their interactions.

Key Strengths
  - ClimODE: Efficient and scalable; includes uncertainty quantification; competitive in medium-range forecasting.
  - NWP: Highly accurate for short-term forecasts; real-time data assimilation enhances short-term prediction accuracy.
  - ESM: Comprehensive representation of the Earth system; captures long-term trends and feedbacks; long-range climate projections.

Key Limitations
  - ClimODE: Limited capability to model complex Earth-system feedbacks (e.g., ocean-atmosphere interactions); neural parameters are difficult to interpret physically.
  - NWP: Computationally expensive for global coverage; accuracy decreases for longer-term forecasts.
  - ESM: Prone to biases from long-term parameterization and assumptions; cannot resolve fine-scale features (e.g., storms) without downscaling.

Uncertainty Quantification
  - ClimODE: Yes; uses probabilistic emission models to estimate uncertainty in predictions.
  - NWP: Limited; ensemble forecasting provides probabilistic insights but is computationally intensive.
  - ESM: Yes; captures uncertainties through ensemble modeling and scenario-based projections.

Table B.1: Comparison between ClimODE, NWP, and ESM models.

KEY TAKEAWAYS
The process of exploring ClimODE brings valuable lessons across various fields, especially
the intersection of artificial intelligence (AI), physics, and climate science. Here are the key
takeaways:
The Potential of AI in Climate Science
• AI in weather forecasting: The ClimODE research shows how AI can complement or
replace traditional methods in weather forecasting and disaster prediction.
• Wide-ranging applications: Models like ClimODE can help democratize weather fore-
casting, reducing reliance on proprietary numerical models.

Integration of AI and Physics


• Learning to combine physics theory with AI models: ClimODE not only uses data but
also integrates physical principles like the advection equation to enhance accuracy and
practicality.
• The benefits of interdisciplinary integration: Merging AI with physics creates a more
robust model than relying on just one discipline.

Managing Uncertainty
• Understanding uncertainty in forecasting: ClimODE illustrates how uncertainty esti-
mation improves model usability and reliability:
• Aleatoric uncertainty: Related to random factors in the weather.
• Epistemic uncertainty: Due to limited knowledge or insufficient data.
• Practical application: Forecasting requires not only accuracy but also the ability to ex-
plain the reliability of results.

Developing Research and Teamwork Skills


• Enhancing interdisciplinary research skills: Understanding ClimODE requires knowl-
edge of AI, physics, mathematics, and climatology.
• Presentation and explanation skills: Conveying complex concepts about modeling and
uncertainty to diverse audiences is a valuable takeaway.

The importance of AI:


From daily conveniences to addressing global challenges, ClimODE underscores AI’s vital role
in modern life. It highlights AI as not just a tool for efficiency but a transformative force that en-
hances resilience, democratizes access to technology, and empowers societies to tackle critical
issues effectively.
In conclusion, through ClimODE, we learn how to maximize the power of modern AI tech-
nology when combined with traditional knowledge to solve practical problems. At the same
time, there are valuable lessons about how to develop interdisciplinary scientific models. The
research process also helps improve our necessary soft skills and teaches us about effective
teamwork.

REFERENCES
[1] Gregory Flato. “Earth system models: An overview”. In: Wiley Interdisciplinary Reviews: Climate Change
2 (Nov. 2011), pp. 783–800. DOI: 10.1002/wcc.148.
[2] Stephan Rasp et al. “WeatherBench: A Benchmark Data Set for Data-Driven Weather Forecasting”. In:
Journal of Advances in Modeling Earth Systems 12.11 (Nov. 2020). ISSN: 1942-2466. DOI: 10.1029/2020ms002203. URL: [Link]
[3] Yogesh Verma, Markus Heinonen, and Vikas Garg. “ClimODE: Climate and Weather Forecasting with Physics-
informed Neural ODEs”. In: The Twelfth International Conference on Learning Representations. 2024. URL:
[Link]
[4] Wikipedia contributors. Numerical weather prediction — Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Numerical_weather_prediction&oldid=1241874927. [Online; accessed 8-December-2024]. 2024.
