AI & ML: A Technical Overview
1 author:
Ideen Sadrehaghighi
CFD Open Series
All content following this page was uploaded by Ideen Sadrehaghighi on 30 November 2023.
Artificial Intelligence (AI) & Machine Learning (ML)
[Cover figure: Artificial Intelligence, Machine Learning, Artificial Neural Networks (ANNs)]
ANNAPOLIS, MD
Contents
List of Tables
Table 2.3.1 Data Considered
Table 2.6.1 Machine learning algorithms may be categorized into supervised, unsupervised, and semi-supervised, depending on the extent and type of information available
Table 3.2.1 Results of Different Methods
List of Figures
Figure 1.1.1 Scope of Artificial Intelligence - Courtesy of Hackerearth Blog
Figure 1.2.1 Research in artificial intelligence [1]
Figure 1.2.2 Schematics of Deep Learning
Figure 2.3.1 Machine Learning Programming
Figure 2.3.2 Decision Tree Classifier
Figure 2.4.1 Schematics of AI, Machine Learning and Deep Learning
Figure 2.5.1 A Learning Machine Uses Inputs From a Sample Generator and Observations from a System to Generate an Approximation of its Output (Credit: Cherkassky & Mulier (2007))
Figure 2.6.1 Linear Regression
Figure 2.6.2 Decision Tree
Figure 3.1.1 Artificial Neural Network (ANN)
Figure 3.2.1 Perceptron
Figure 3.2.2 Multi-Layer Perceptron Architecture
Figure 3.2.3 Radial Basis Function
1 Artificial Intelligence
1.1 Definitions
Artificial Intelligence (AI) can be outlined as the analysis of mental and psychological abilities by means of computational patterns and sequences [1,2]. The term “intelligence” in this field can be very deceptive. We usually apply the word to someone who displays unusual inventiveness and remarkable skill, which gives the impression that artificial intelligence is a reliable method for generating clever ideas and insights. In reality, it revolves around the basic idea of duplicating the physiological and mental abilities of “ordinary” people. It can also be defined as the science of creating sophisticated machines, devices, and computer programs that analyze human intelligence [3] in order to solve the practical problems the world presents us with. The ultimate aim of artificial intelligence is to create devices that have human-level intelligence, although some consider this practice immoral [2].
Figure 1.1.1 Scope of Artificial Intelligence: (a) Machine Learning, (b) Neural Networks, (c) Deep Learning - Courtesy of Hackerearth Blog
In the broadest sense, Artificial Intelligence (AI) can be thought of as advanced computer intelligence. In 1956, at the Dartmouth Artificial Intelligence Conference, the technology was described as follows: "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it." AI can refer to anything from a computer program playing a game of chess to a voice-recognition system like Amazon's Alexa interpreting and responding to speech. IBM's Deep Blue, which won a game against chess grandmaster Garry Kasparov in 1996 and defeated him in a full match in 1997, and Google DeepMind's AlphaGo are examples of AI. The term is also used to classify machines that mimic human intelligence and human cognitive functions, like problem-solving and learning. AI uses predictions and automation to optimize and solve complex tasks that humans have historically done, such as facial and speech recognition, decision making, and translation (IBM Blog, 2023).
1.2 Categories of AI
Three main categories of AI are:
• Artificial Narrow Intelligence (ANI)
• Artificial General Intelligence (AGI)
• Artificial Super Intelligence (ASI)
ANI is considered “weak” AI, whereas the other two types are classified as “strong” AI. We define
weak AI by its ability to complete a specific task, like winning a chess game or identifying a particular
individual in a series of photos. Natural language processing (NLP) and computer vision, which let
companies automate tasks and underpin chatbots and virtual assistants such as Siri and Alexa, are
examples of ANI. Computer vision is a factor in the development of self-driving cars. Stronger forms
of AI, like AGI and ASI, incorporate human behaviors more prominently, such as the ability to
interpret tone and emotion. Strong AI is defined by its ability compared to humans. Artificial General
Intelligence (AGI) would perform on par with another human, while Artificial Super Intelligence
(ASI), also known as superintelligence, would surpass a human’s intelligence and ability. Neither
form of Strong AI exists yet, but research in this field is ongoing (IBM newsletter). According to
HackerEarth Blog, AI can be classified into the following (see Figure 1.1.1):
• Machine Learning (ML)
• Deep Learning (DL)
• Neural Networks (NNs)
Kontos [4] offers another view: rather than defining AI as a single, consolidated discipline, it may be better to consider it as a set of different technologies that are easier to define individually. This set can include data mining, question answering, self-aware systems, pattern recognition, knowledge representation, automatic reasoning, deep learning, expert systems, information extraction, text mining, natural language processing, problem solving, intelligent agents, logic programming, machine learning, artificial neural networks, artificial vision, computational discovery, and computational creativity. Artificial "self-aware" or "conscious" systems are therefore the products of one of these technologies. Figure 1.2.1 shows the various areas of Artificial Intelligence, with the subjects of interest outlined by a red ellipse.
Although artificial intelligence (AI), machine learning (ML), deep learning (DL), and neural networks (NN) are related technologies, the terms are often used interchangeably, which frequently leads to confusion about their differences. Vargas et al. [5] describe Deep Learning (DL) as an emerging area of Machine Learning (ML) research (Figure 1.2.2). It comprises multiple hidden layers of Artificial Neural Networks (ANNs). The deep learning methodology applies nonlinear transformations and high-level model abstractions to large databases.
Figure 1.2.2 Schematics of Deep Learning
1.2.1 References
[1] Charniak, E. (1985). Introduction to artificial intelligence. Pearson Education India
[2] Ananya Priyadarshini, “Artificial Intelligence: The Inescapable”, B.Tech. Computer Science, 2022.
[3] McCarthy, J. (2007). What is artificial intelligence?
[4] John Kontos, “Artificial Intelligence, Machine Consciousness and Explanation”, Academia Letters
preprint, 2012.
[5] Vargas, R., Mosavi, A., & Ruiz, R. (2017). Deep learning: a review.
multiplies inputs in order to make guesses as to the inputs' nature. Different outputs/guesses are the
product of the inputs and the algorithm. Usually, the initial guesses are quite wrong, and if you are
lucky enough to have ground-truth labels pertaining to the input, you can measure how wrong your
guesses are by contrasting them with the truth, and then use that error to modify your algorithm.
That's what Artificial Neural Networks (ANN) do. They keep on measuring the error and modifying
their parameters until they can't achieve any less error. They are, in short, an optimization algorithm.
If you tune them right, they minimize their error by guessing and guessing and guessing again.
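The measure-the-error-then-adjust loop described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular network: the toy data (generated by the hypothetical rule y = 3x), the single weight, the learning rate, and the step count are all assumptions made for the example.

```python
# Toy ground-truth data: labels follow the hypothetical rule y = 3 * x.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [3.0 * x for x in inputs]


def mean_squared_error(weight):
    """Measure how wrong the guesses weight * x are, compared with the labels."""
    return sum((weight * x - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


def train(weight=0.0, learning_rate=0.01, steps=500):
    """Repeatedly nudge the weight against the gradient of the error."""
    for _ in range(steps):
        gradient = sum(2.0 * (weight * x - y) * x
                       for x, y in zip(inputs, targets)) / len(inputs)
        weight -= learning_rate * gradient
    return weight


initial_error = mean_squared_error(0.0)    # error of the first (bad) guess
learned_weight = train()                   # guess, measure, adjust, repeat
final_error = mean_squared_error(learned_weight)
print(learned_weight, initial_error, final_error)
```

A real ANN applies the same idea to many weights at once, with the gradient obtained by backpropagation.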
Another point of view expressed by (Pandey, Schumacher, & Sreenivasan, 2020) [4] is that while ML
is sometimes regarded as a subset of AI, there are some differences in usage. AI mimics
natural intelligence to solve complex problems and enables decision making; efficiency is not
its main driver, and it is an intelligence capability which we want to build into all machines.
Machine learning, on the other hand, is about improving and maximizing performance by
means of self-learning algorithms. Both require large databases from which to learn: the more high-quality data that becomes available, the better the results, hence the close connection
of AI and ML to Big Data.
2.2.1 Reference
[1] Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, vol. 1. MIT Press, Cambridge
(1998)
[2] Mnih, V., Kavukcuoglu, K., Silver, D., et al.: Human-level control through deep reinforcement
learning. Nature 518, 529 (2015)
[3] arXiv:2110.02083 [physics. flu-dyn]
[4] Pandey, S., Schumacher, J., & Sreenivasan, K. R. (2020). A perspective on machine learning in
turbulent flows. Journal of Turbulence.
2.3 Creating Your First Machine Learning Model (Apples & Oranges)
Source : Newark.com
In ML, instead of defining the rules and expressing them in a programming language, the answers (typically called labels) are provided together with the data (see Figure 2.3.1). The machine infers the rules that determine the relationship between the labels and the data. The data and labels are used to create ML algorithms, typically called models. Given new data, the model predicts a label for it. If we train the model to distinguish between apples and oranges, it can predict whether a new sample is an apple or an orange.
Figure 2.3.1 Machine Learning Programming
The problem sounds easy, but it is tedious to solve without ML: you would need to write a large number of rules to tell apples from oranges, and with every new problem you would have to start over. There are many aspects of the fruit that we can collect data on, including color, weight, texture, and shape. For our purposes, we pick only two simple ones as data: weight and texture. In this section, we explain how to create a simple ML algorithm that discerns between an apple and an orange. To do so, we create an algorithm that can figure out the rules so that we don't have to write them by hand. For that, we train what is called a classifier. You can think of a classifier as a function: it takes some data as input and assigns a label to it as output. The technique of automatically writing the classifier is called supervised learning.
2.3.1 Supervised Learning
In supervised learning, the training data will have expert labels that should be predicted or modeled
with the machine learning algorithm (Brunton, 2021). These output labels may be discrete, such as
a categorical label of a “dog” or a “cat” given an input image, in which case the task is one of
classification. If the labels are continuous, such as the average value of lift or drag given a specified
airfoil geometry, then the task is one of regression. To use supervised learning, we follow a simple
procedure with a few standard steps. The first step is to collect training data. These are essentially
examples of the problem we want to solve. Step two is to use these examples to train a classifier.
Once we have a trained classifier, the next step is to make predictions and classify a new fruit.
2.3.2 Collect Training Data
To collect training data, assume we head out to an orchard and look at different apples and oranges, writing down their descriptive measurements in a table (see Table 2.3.1). In ML, these measurements are called features. To keep things simple, we use only two features: how much each fruit weighs in grams and its texture, which can be rough (bumpy) or smooth. Each row in our training data is an example; it describes one piece of fruit. The last column is known as the label. It identifies the type of fruit in each row, and in this case there are only two possibilities: apple or orange. In general, the more training data you have, the better the classifier you can create.
Table 2.3.1 Data Considered
Weight (g)   Texture   Label
155          rough     orange
180          rough     orange
135          smooth    apple
110          smooth    apple
2.3.3 Training the Classifier
With the dataset prepared, the next step is to express the training data in code. Before doing so, make sure the scikit-learn package is installed. Scikit-learn provides a range of supervised and unsupervised learning algorithms through a consistent Python interface. We will use two variables, features and labels:
features = [[155, "rough"], [180, "rough"], [135, "smooth"], [110, "smooth"]]
labels = ["orange", "orange", "apple", "apple"]
In the preceding code, features contains the first two columns of Table 2.3.1 and labels contains the last. Since scikit-learn estimators require numeric features, we encode the texture as an integer instead of a string, using 0 for rough and 1 for smooth. We do the same for the labels, using 0 for apple and 1 for orange. The next step is to use these examples to train a classifier. The type of classifier we will use is called a decision tree (Figure 2.3.2). There are many different classifiers, but for simplicity you can think of a classifier as a box of rules. Before using the classifier, we must import the decision tree into the environment. On the next line of the script we create the classifier, and then we train it on the encoded features and labels.
Figure 2.3.2 Decision Tree Classifier
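The import, encoding, and training steps described above might look as follows. This is a sketch under the stated encoding (0 = rough, 1 = smooth; 0 = apple, 1 = orange); the fixed random_state is our addition for reproducibility and is not part of the original walkthrough.

```python
from sklearn import tree

# Encoded training data from Table 2.3.1.
# Texture: 0 = rough, 1 = smooth. Label: 0 = apple, 1 = orange.
features = [[155, 0], [180, 0], [135, 1], [110, 1]]
labels = [1, 1, 0, 0]

# Create the decision tree classifier, then train it on the examples.
# random_state is an assumption added only to make the run reproducible.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(features, labels)
```

After this runs, clf holds the learned rules and can be asked to label new fruit.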
2.3.4 Make Predictions
We now have a trained classifier. Let's test it by classifying a new fruit. The input to the classifier is the feature vector of the new example. Say the fruit we want to classify weighs 150 grams and is rough (bumpy), which is encoded as 0. Let's see what our ML algorithm predicts:
print(clf.predict([[150, 0]]))
[1]
It works! The output is what we expected: 1 (orange). If everything worked for you, then congratulations! You have completed your first ML project in Python. You can create a new classifier for a new problem just by changing the training data. With the abundance of open-source libraries and resources available today, programming with ML has become accessible to a growing number of users. Once you have a basic understanding of ML software and algorithms, you can scale your project using AI-based development boards. Decide on a hardware platform based on your application, and you are ready for real-world deployment.
2.3.5 Warming Up: Quadratic Equation
Consider the prototypical problem of finding the roots of the quadratic equation $ax^2 + bx + c = 0$,
$$ r_L,\; r_R = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$
Eq. 2.3.1
We would like to learn the relation
$$ (a, b, c) \rightarrow (r_L, r_R) $$
Eq. 2.3.2
without relying on knowledge of the underlying processes (Gyrya, Shashkov, Skurikhin, & Tokareva, 2019) [4]. For example, the relationship Eq. 2.3.2 may represent a physical process for which some observations are available but the analytical relation Eq. 2.3.1 has not yet been established. The prototypical problem of finding the roots of a quadratic equation was selected as a proxy for the following reasons, which are relevant to many complex practical problems:
• It is a fairly simple problem that is familiar to everyone who would be reading this paper. Yet it is a good representative of a wide class of approximation problems in scientific computing.
• Finding the solution involves different arithmetic operations, some of which can be difficult to model with machine learning techniques. For example, division and taking a square root are challenging for neural networks to capture exactly using activation functions.
• There are situations in which a particular form of an analytic expression or algorithm may exhibit loss of accuracy. For example, the analytic expression Eq. 2.3.1 is numerically inaccurate for the root in which −b and the square root have opposite signs (the larger root when b > 0) if b² is much larger than |4ac|, because two nearly equal numbers are subtracted.
• Under certain conditions the roots of the quadratic equation exhibit non-trivial behavior. There are several branches in the solution: if a = 0, the quadratic equation becomes a linear equation, which has one root, and this is a qualitative change from one regime to a different one; depending on the discriminant, the number of roots as well as the nature of the roots changes (real vs. complex).
• Probably the most significant challenge from the standpoint of ML is that there is a small range of input parameters for which the output values become increasingly large (corresponding to small values of a).
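The loss of accuracy mentioned in the list above can be reproduced directly. The sketch below compares Eq. 2.3.1 as written with a standard cancellation-free rearrangement; the function names and the coefficients a = 1, b = 1e8, c = 1 are our own illustrative choices, and both functions assume real roots and a ≠ 0.

```python
import math


def roots_naive(a, b, c):
    """Eq. 2.3.1 evaluated literally (assumes real roots and a != 0)."""
    d = math.sqrt(b * b - 4.0 * a * c)
    return (-b - d) / (2.0 * a), (-b + d) / (2.0 * a)


def roots_stable(a, b, c):
    """Rearranged form that never subtracts nearly equal numbers.

    q carries the sign of -b, so b and the square root are added in magnitude;
    the second root follows from the product of roots, r1 * r2 = c / a.
    """
    d = math.sqrt(b * b - 4.0 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q


# Here b*b is vastly larger than 4*a*c; the exact small root is close to -1e-8.
a, b, c = 1.0, 1.0e8, 1.0
naive_small = roots_naive(a, b, c)[1]
stable_small = roots_stable(a, b, c)[1]
print(naive_small, stable_small)
```

The literal formula loses most of the significant digits of the small root, while the rearranged form recovers it to roughly machine precision.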
We will now explain what we mean by learning the relation Eq. 2.3.2. Assume we are provided a number of observations (the training set):
$$ (a^j, b^j, c^j) \rightarrow (r_L^j, r_R^j), \quad j = 1, \ldots, N $$
Eq. 2.3.3
Based on the training set, we attempt to predict the roots for a testing set:
$$ (a^j, b^j, c^j) \rightarrow (\bar{r}_L^j, \bar{r}_R^j) \approx (r_L^j, r_R^j), \quad j = N+1, \ldots, N+K $$
Eq. 2.3.4
The goal is to minimize the mismatch between the estimates $(\bar{r}_L^j, \bar{r}_R^j)$ and the testing data $(r_L^j, r_R^j)$:
$$ \text{Cost} = \sum_j \left(r_L^j - \bar{r}_L^j\right)^2 + \sum_j \left(r_R^j - \bar{r}_R^j\right)^2 $$
Eq. 2.3.5
Since the testing data is not available during the training process, the minimization is performed on the training set, with the idea that the training and testing sets are drawn from the same pool. The above setup is the typical ML setup. In this work the goal was to compare the performance of several existing ML approaches for the case of a quadratic equation.
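To make the training/testing setup concrete, the sketch below builds both sets and evaluates the cost Eq. 2.3.5 for a deliberately simple stand-in learner. The 1-nearest-neighbor predictor, the sampling ranges, and the set sizes N and K are assumptions chosen for illustration; they are not the approaches compared in the cited study.

```python
import random


def exact_roots(a, b, c):
    """Ground truth (r_L, r_R) from Eq. 2.3.1; assumes a > 0 and real roots."""
    d = (b * b - 4.0 * a * c) ** 0.5
    return (-b - d) / (2.0 * a), (-b + d) / (2.0 * a)


def sample(rng):
    """Draw one observation; the ranges guarantee b*b > 4*a*c (real roots)."""
    a = rng.uniform(0.5, 1.5)
    b = rng.uniform(3.0, 5.0)
    c = rng.uniform(0.1, 1.0)
    return (a, b, c), exact_roots(a, b, c)


def predict(training_set, x):
    """1-nearest-neighbor estimate: reuse the roots of the closest training input."""
    nearest = min(training_set,
                  key=lambda pair: sum((p - q) ** 2 for p, q in zip(pair[0], x)))
    return nearest[1]


def cost(training_set, evaluation_set):
    """Sum of squared mismatches between true and estimated roots (Eq. 2.3.5)."""
    total = 0.0
    for x, (r_left, r_right) in evaluation_set:
        est_left, est_right = predict(training_set, x)
        total += (r_left - est_left) ** 2 + (r_right - est_right) ** 2
    return total


rng = random.Random(0)
N, K = 200, 50
training_set = [sample(rng) for _ in range(N)]   # j = 1, ..., N
testing_set = [sample(rng) for _ in range(K)]    # j = N+1, ..., N+K
training_cost = cost(training_set, training_set)
testing_cost = cost(training_set, testing_set)
print(training_cost, testing_cost)
```

The training cost is zero by construction for this learner, which is exactly why performance has to be judged on the held-out testing set.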
2.3.6 References
[1] Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, vol. 1. MIT Press, Cambridge
(1998)
[2] Mnih, V., Kavukcuoglu, K., Silver, D., et al.: Human-level control through deep reinforcement
learning. Nature 518, 529 (2015)
[3] arXiv:2110.02083 [physics. flu-dyn]
[4] Gyrya, V., Shashkov, M., Skurikhin, A., & Tokareva, S. (2019). Machine learning approaches for the
solution of the Riemann problem in fluid dynamics: a case study. Journal of Computational Physics.
[5] Pandey, S., Schumacher, J., & Sreenivasan, K. R. (2020). A perspective on machine learning in
turbulent flows. Journal of Turbulence.
identified the best fit line having the linear equation y = 0.2811x + 13.9 (see Figure 2.6.1). Using this equation, we can estimate the weight of a person from their height. Linear regression is mainly of two types: Simple Linear Regression and Multiple Linear Regression. Simple Linear Regression is characterized by one independent variable, whereas Multiple Linear Regression (as the name suggests) is characterized by multiple (more than one) independent variables. When a straight line does not fit the data well, you can instead fit a polynomial or another curve; this is known as polynomial or curvilinear regression [1].
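For simple linear regression, the best fit line can be computed in closed form by ordinary least squares. The sketch below is illustrative only: the heights are hypothetical values, and the weights are generated from the line quoted above so that the fit can be checked against it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical heights; weights are placed exactly on y = 0.2811 x + 13.9.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
weights = [0.2811 * h + 13.9 for h in heights]

slope, intercept = fit_line(heights, weights)
predicted_weight = slope * 175.0 + intercept   # estimate for a new height
print(slope, intercept, predicted_weight)
```

With noisy real measurements the points would scatter around the line, and the same formulas would return the line that minimizes the sum of squared vertical distances.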
2.6.2 Logistic Regression
Don’t get confused by its name! It is a classification algorithm, not a regression algorithm. It is used to estimate discrete values (binary values like 0/1, yes/no, true/false) based on a given set of independent variable(s). In simple words, it predicts the probability of occurrence of an event by fitting data to a logit function. Hence, it is also known as logit regression. Since it predicts a probability, its output values lie between 0 and 1 (as expected). Again, let us try to understand this through a simple example. Let’s say your friend gives you a puzzle to solve. There are only two outcome scenarios: either you solve it or you don’t. Now imagine that you are given a wide range of puzzles and quizzes in an attempt to understand which subjects you are good at. The outcome of this study would be something like this: if you are given a trigonometry-based tenth-grade problem, you are 70% likely to solve it. On the other hand, if it is a fifth-grade history question, the probability of getting an answer is only 30%. This is what Logistic Regression provides you. Coming to the math, the log odds of the outcome is modeled as a linear combination of the predictor variables:
odds = p / (1 - p) = probability of the event occurring / probability of the event not occurring
ln(odds) = ln(p / (1 - p))
logit(p) = ln(p / (1 - p)) = b0 + b1 x1 + b2 x2 + ... + bk xk
Above, p is the probability of presence of the characteristic of interest. The method chooses parameters that maximize the likelihood of observing the sample values rather than parameters that minimize the sum of squared errors (as in ordinary regression). Now, you may ask, why take a log? For the sake of simplicity, let’s just say that this is one of the best mathematical ways to replicate a step function. We could go into more detail, but that would defeat the purpose of this overview.
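The relationship between probability, odds, and the logit can be checked numerically. The sketch below uses only the standard library; the coefficient values passed to predict_probability are hypothetical and are not fitted to any data.

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of the logit: maps any real log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(intercept, coefficients, features):
    """Model the log odds as a linear combination, then convert to a probability."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return sigmoid(log_odds)


# A 70% chance of solving the puzzle corresponds to odds of 7 to 3.
log_odds_70 = logit(0.7)
recovered = sigmoid(log_odds_70)

# Hypothetical coefficients, for illustration only.
probability = predict_probability(-1.0, [0.8, 0.5], [1.0, 2.0])
print(log_odds_70, recovered, probability)
```

Because the sigmoid squashes any real number into the interval (0, 1), the linear combination can range freely while the output remains a valid probability.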
2.6.3 Decision Tree
This is a popular algorithm that is used quite frequently. It is a type of supervised learning algorithm that is mostly used for classification problems [2]. Surprisingly, it works for both categorical and continuous dependent variables. In this algorithm, we split the population into two or more homogeneous sets. The split is based on the most significant attributes (independent variables), so as to make the groups as distinct as possible. In Figure 2.6.2, you can see that the population is classified into four different groups based on multiple attributes to identify ‘if they will play or not’. To split the population into different heterogeneous groups, the algorithm uses various techniques.
Figure 2.6.2 Decision Tree
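One widely used splitting criterion is the Gini impurity, which is zero for a perfectly homogeneous group. The sketch below scores two candidate splits; choosing Gini here is our assumption (the text does not name a specific technique), and the data are borrowed from Table 2.3.1 with texture encoded as 0 = rough, 1 = smooth.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for more mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, labels, feature, threshold):
    """Size-weighted impurity of the two groups produced by a threshold split."""
    left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


rows = [[155, 0], [180, 0], [135, 1], [110, 1]]   # [weight in g, texture]
labels = ["orange", "orange", "apple", "apple"]

good_split = split_impurity(rows, labels, feature=0, threshold=145)
poor_split = split_impurity(rows, labels, feature=0, threshold=120)
print(gini(labels), good_split, poor_split)
```

A tree-building algorithm would evaluate many such candidate splits and keep the one with the lowest weighted impurity, then repeat the process within each resulting group.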
2.6.4 References
[1] Sunil, Ray, “Essentials of Machine learning Algorithms (with Python and R codes)”, August 2015.
[2] Same as Above
an input layer, a layer with RBF neurons, and an output. The RBF neurons store the actual classes for each of the training data instances. RBF networks differ from the usual multilayer perceptron in that a radial function is used as the activation function. When new data is fed into the neural network, the RBF neurons compare the Euclidean distance of the feature values with the instances stored in the neurons. This is similar to finding the cluster to which a particular instance belongs. The class for which the distance is minimal is assigned as the predicted class. RBF networks are used mostly in function approximation applications such as power restoration systems (Figure 3.2.3).
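The distance-based classification described above can be sketched as follows. The Gaussian form of the radial function, the width parameter beta, and the stored prototypes are assumptions made for illustration; a trained RBF network would also learn output weights, which are omitted here.

```python
import math


def rbf_activation(x, center, beta=1.0):
    """Gaussian radial function: 1.0 at the center, decaying with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-beta * squared_distance)


def classify(x, prototypes, beta=1.0):
    """Return the class of the stored instance with the strongest activation,
    which is the one at the smallest Euclidean distance from x."""
    best = max(prototypes, key=lambda proto: rbf_activation(x, proto[0], beta))
    return best[1]


# Hypothetical stored training instances: (feature vector, class).
prototypes = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((4.0, 4.0), "B"),
    ((4.2, 3.9), "B"),
]

predicted = classify((3.8, 4.1), prototypes)
print(predicted)
```

Because the Gaussian decreases monotonically with distance, picking the largest activation is equivalent to picking the nearest stored instance.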
Figure 3.2.3 Radial Basis Function
3.2.5 Convolutional Neural Networks
When it comes to image classification, the most widely used neural networks are Convolutional Neural Networks (CNNs). A CNN contains multiple convolution layers, which are responsible for extracting important features from the image (Figure 3.2.4). The earlier layers are responsible for low-level details and the later layers for higher-level features. The convolution operation slides small matrices, called filters, over the input image to produce feature maps. These filters are initialized randomly and are then updated via backpropagation. A classical hand-designed counterpart of such a filter is an edge-detection kernel such as the Sobel operator, which is typically used as one stage of the Canny edge detector to find the edges in an image.
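The sliding-filter operation can be illustrated with a fixed 3 x 3 kernel. This is a minimal sketch in plain Python: the Sobel kernel stands in for a learned filter, the tiny images are made up for the example, and (as in most CNN libraries) the kernel is applied without flipping, which is strictly a cross-correlation.

```python
def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the resulting feature map as a list of rows."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


# Sobel kernel that responds to horizontal changes in intensity.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

edge_image = [[0, 0, 1, 1] for _ in range(4)]   # dark left half, bright right half
flat_image = [[1, 1, 1, 1] for _ in range(4)]   # uniform brightness, no edges

edge_map = apply_filter(edge_image, SOBEL_X)
flat_map = apply_filter(flat_image, SOBEL_X)
print(edge_map, flat_map)
```

The filter responds strongly where the intensity changes and gives zero on the uniform image; a CNN learns the kernel values instead of fixing them in advance.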
3.2.8 Case Study - Prediction & Comparison of the Maximal Wall Shear Stress (MWSS) for Carotid
Artery Bifurcation
Steady-state simulations were carried out for 1886 geometries, and the MWSS value was calculated for each of them. This dataset was used for training and testing the following data mining algorithms: k-nearest neighbors, linear regression, neural network (multilayer perceptron), random forest, and support vector machine. The results are reported in terms of the Relative Root Mean Square Error (RMSE):
Figure 3.2.7 Maximal Wall Shear Stress (MWSS) Value for Carotid Artery Bifurcation
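The text does not define the relative RMSE it reports. One common convention, assumed here purely for illustration, divides the root mean square error by the mean of the observed values; the numbers in the example are made up and are not results from the case study.

```python
import math


def relative_rmse(predicted, observed):
    """RMSE divided by the mean observed value (one convention among several;
    this normalization is an assumption, not taken from the case study)."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
    return rmse / (sum(observed) / n)


# Made-up values for illustration only.
observed = [1.0, 3.0]
predicted = [2.0, 2.0]
score = relative_rmse(predicted, observed)
print(score)
```

Normalizing the error makes scores comparable across quantities of different magnitude, which is useful when comparing several regression algorithms on the same target.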
3.4 Field Inversion and Machine Learning in Support of Data Driven Environment
A machine learning technique such as an Artificial Neural Network (ANN) can be combined with field inversion in a data-driven context. The calibration cases (offline data) are the few configurations for which data (DNS or experimental) are available, such as the one shown in Figure 3.4.1. The prediction cases (machine learning with no data) have a similar configuration but differ in (1) twist, (2) sweep angle, and (3) airfoil shape⁴. The challenge in predictive modeling, however, is to extract an optimal model form that is sufficiently accurate. Constructing such a model and demonstrating its predictive capabilities for a class of problems is the objective.
Figure 3.4.1 Calibration Cases for off Line Data
Figure 3.5.1 Network Diagram for a feed-forward NN with three inputs and one output
4 Heng Xiao, “Physics-Informed Machine Learning for Predictive Turbulence Modeling: Status, Perspectives,
and Case Studies”, Machine Learning Technologies and Their Applications to Scientific and Engineering
Domains Workshop, August 17, 2016.
below, elements of the feature vector $\eta$ are chosen to be locally non-dimensional quantities. The standard NN algorithm operates by constructing linear combinations of inputs and transforming them through nonlinear activation functions⁵. The process is repeated once for each hidden layer (marked blue in Figure 3.5.1) in the network, until the output layer is reached. Figure 3.5.1 presents a sample ANN: a feed-forward NN with three inputs, two hidden layers, and one output. For this sample network, the values of the hidden nodes $z_{1,1}$ through $z_{1,H_1}$ would be constructed as
$$ z_{1,j} = a_1\left( \sum_{i=1}^{3} w^{1}_{i,j}\, \eta_i \right) $$
Eq. 3.5.1
where $a_1$ and $w^{1}_{i,j}$ are the activation function and weights associated with the first hidden layer, respectively. Similarly, the second layer of hidden nodes is constructed as
$$ z_{2,j} = a_2\left( \sum_{i=1}^{H_1} w^{2}_{i,j}\, z_{1,i} \right) $$
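The layer-by-layer construction above translates directly into a forward pass. The sketch below follows the three-input, two-hidden-layer, one-output layout of Figure 3.5.1; the tanh activation, the linear output, the layer widths H1 = H2 = 4, the random weights, and the absence of bias terms are all assumptions made for illustration.

```python
import math
import random


def dense(inputs, weights, activation):
    """One layer: node j receives activation(sum_i weights[i][j] * inputs[i])."""
    n_out = len(weights[0])
    return [
        activation(sum(weights[i][j] * inputs[i] for i in range(len(inputs))))
        for j in range(n_out)
    ]


def identity(value):
    return value


def forward(eta, w1, w2, w3):
    """Feed the feature vector eta through two hidden layers to a single output."""
    z1 = dense(eta, w1, math.tanh)       # first hidden layer (Eq. 3.5.1)
    z2 = dense(z1, w2, math.tanh)        # second hidden layer
    return dense(z2, w3, identity)[0]    # linear output node


def random_weights(n_in, n_out, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]


rng = random.Random(0)
H1, H2 = 4, 4                            # hypothetical hidden-layer widths
w1 = random_weights(3, H1, rng)
w2 = random_weights(H1, H2, rng)
w3 = random_weights(H2, 1, rng)

output = forward([0.1, -0.2, 0.3], w1, w2, w3)
print(output)
```

Training would then adjust w1, w2, and w3 so that the output matches the target quantity over the training data.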
5 Singh, A. P., Medida, S., & Duraisamy, K. (2016). Machine Learning-augmented Predictive Modeling of
Turbulent Separated. arXiv:1608.03990v3 [cs.CE].
6 Zhang, Z. J. and Duraisamy, K., “Machine Learning Methods for Data-Driven Turbulence Modeling,” 22nd AIAA
Computational Fluid Dynamics Conference, AIAA Aviation, (AIAA 2015-2460), Dallas, TX, Jun 2015.
7 S. Muller , M. Milano and P. Koumoutsakos, “Application of machine learning algorithms to flow modeling and
$$ v = V + \sum_{i=1}^{n} a_i(t)\, \varphi_i(x) $$
Eq. 3.5.4
where $V$ is the time-averaged flow and $\varphi_i$, $i = 1, \ldots, n$, are the first n eigenvectors of the covariance matrix $C = E[(v_i - V)(v_j - V)]$. When this representation of $v$ is substituted into the Navier-Stokes equations, the original PDE model is transformed into an ODE model composed of n equations. The POD can be expressed as a multi-layer feed-forward neural network. Such a network is defined by the number of layers, the specification of the output function for the neurons in each layer, and the weight matrices for each layer. [Baldi and Hornik]⁹ have shown that training a linear neural network structure to perform an identity mapping on a set of vectors is equivalent to obtaining the POD of this set of vectors. A neural network performing the linear POD can be specified as a two-layer linear network:
$$ x = W_1 v, \qquad \hat{v} = W_2 x $$
Eq. 3.5.5
where $\hat{v}$ is the reconstructed field, $v$ is the original flow field with N components, $x$ is the reduced-order representation of the field with n components, and $W_1$ and $W_2$ are the network weight matrices, of sizes n x N and N x n respectively. Non-linearity can be introduced by a simple extension to this basic network:
$$ x = W_2 \tanh(W_1 v), \qquad \hat{v} = W_4 \tanh(W_3 x) $$
Eq. 3.5.6
This corresponds to a neural network
model with 4 layers: the first one,
with an m x N weight matrix W1,
nonlinear; the second one, with an n x
m weight matrix W2, linear; the third
one, also nonlinear, with an m x n
weight matrix W3, and the last one,
linear with an N x m weight matrix
W4. However, the resulting system of
ODEs is more involved as compared
to the one resulting from the
application of the linear POD.
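The two network structures can be written down directly from Eq. 3.5.5 and Eq. 3.5.6. The sketch below only demonstrates the shapes of the encode/decode mappings: the sizes N, m, and n and the random, untrained weights are assumptions for illustration, and a matrix of size r x c is stored as r rows so that it maps c inputs to r outputs.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def tanh_vec(vector):
    return [math.tanh(v) for v in vector]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linear_autoencoder(v, W1, W2):
    """Eq. 3.5.5: x = W1 v, v_hat = W2 x."""
    x = matvec(W1, v)
    return x, matvec(W2, x)


def nonlinear_autoencoder(v, W1, W2, W3, W4):
    """Eq. 3.5.6: x = W2 tanh(W1 v), v_hat = W4 tanh(W3 x)."""
    x = matvec(W2, tanh_vec(matvec(W1, v)))
    return x, matvec(W4, tanh_vec(matvec(W3, x)))


rng = random.Random(0)
N, m, n = 8, 5, 2                        # hypothetical field, hidden, reduced sizes
v = [rng.uniform(-1.0, 1.0) for _ in range(N)]

x_lin, v_hat_lin = linear_autoencoder(
    v, random_matrix(n, N, rng), random_matrix(N, n, rng))
x_non, v_hat_non = nonlinear_autoencoder(
    v,
    random_matrix(m, N, rng),            # W1: m x N, nonlinear layer
    random_matrix(n, m, rng),            # W2: n x m, linear layer
    random_matrix(m, n, rng),            # W3: m x n, nonlinear layer
    random_matrix(N, m, rng),            # W4: N x m, linear layer
)
print(len(x_lin), len(v_hat_lin), len(x_non), len(v_hat_non))
```

In both cases the field is compressed to n components and expanded back to N; training the weights to reproduce the input (the identity mapping) is what turns the structure into a reduced-order model.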
3.5.1.1 POD and Nonlinear ANN
A simple comparison of POD and a nonlinear ANN is provided by the reconstruction of the velocity field in the stochastically forced Burgers' equation, a classical 1D model for turbulent flow [Chambers]¹⁰. The linear POD was used to obtain a set of 256 linear eigenfunctions using 10000 snapshots extracted from a simulation. Using the first 7 eigenfunctions, it is possible to reconstruct the original flow field while retaining 90 percent of the energy. A nonlinear neural network was trained on the same data set to perform the identity mapping: this network is composed of 256 inputs and 4 layers having, respectively, 64 nonlinear neurons, 7 linear neurons, 64 nonlinear neurons, and 256 linear neurons. For validation purposes, a data set of 1000 snapshots, not used in the training phase, was used. Figure 3.5.2 shows the reconstruction performance of both approaches for a velocity field in Burgers' equation; the proposed nonlinear ANN (bottom) clearly outperforms the linear POD (top).
Figure 3.5.2 Comparison of linear POD (top) and Neural Networks (bottom)
9 Baldi, P. & Hornik, K., “Neural networks and principal component analysis: Learning from examples without local minima”. Neural Networks. 2, 53-58, 1989.
10 Chambers, D. H., Adrian R. J., Moin, P. & Stewart, S., “Karhunen-Loeve expansion of Burgers model of turbulence”.
Other researchers, such as Romit Maulik et al.¹³, have used an open-source module (TensorFlow) within OpenFOAM. They outline the development of a data science module within OpenFOAM which allows for the in-situ deployment of trained deep learning architectures for general-purpose predictive tasks. This is constructed with the TensorFlow C API and is integrated into OpenFOAM as an application that may be linked at run time. In this experiment, the different geometries are all backward-facing steps with varying step heights (h). Once trained, the steady-state eddy-viscosity emulator may be used at the start of the simulation (by observing the initial conditions), following which solely the pressure and velocity equations need to be solved to convergence. Results from one such experiment (backward-facing steps), where the geometry is ‘unseen’, are shown in Figure 3.6.2.
Figure 3.6.2 Contour plots for a backward facing step. Note that the training of the ML surrogate did not include data for the shown step height.
11 Ling, J., Kurzawski, A. & Templeton, J. “Reynolds averaged turbulence modelling using deep neural networks with embedded invariance”, J. Fluid Mech 807, 155–166, 2016.
12 Karthik Duraisamy, “A Framework for Turbulence Modeling using Big Data”, NASA Aeronautics Research