Review
1. Introduction
The ever increasing amount of big data in science, engineering, and society, including meteorological,
hydrological, ecological, environmental, as well as various kinds of biomedical, manufacturing, e-commerce, and
government management data, has fueled enormous optimism among researchers, entrepreneurs, government
officials, the media, and the general public [1,2]. It is now hoped that by recording and analyzing the errors of all
the components of a sophisticated machine, one can quickly diagnose and then fix its malfunctioning. When
one is sick, one hopes that in the near future, with all the increasingly detailed data about oneself, including
genomic, cellular, clinical, and psychological data, one's illness can be quickly diagnosed and effectively treated.
Appl. Sci. 2021, 11, 5736 2 of 67
Such optimism is not entirely unfounded, as big data indeed has brought
some pleasant surprises to science and society. For example, a good online
shopping system can quickly and fairly accurately infer what an online shopper
is looking for by analyzing the shopper’s online behavior in real time. By
analyzing the tweets about major natural disasters, key information about the disasters
can be accurately obtained [3]. Google Flu Trends did an impressive job in
predicting the 2008 influenza outbreak [4].
While the big data showcase does not stop at the above successful
examples, it is important that one is not carried away by those successes. In
fact, many more not so successful cases also exist. For example, right after
2008, Google Flu Trends over-predicted influenza outbreaks, and by 2012, the
error was by as much as a factor of two [5], which then prompted Google to
give up the predictor. The box office revenue of the film “Golden Times”, which
was first released in China during the National Holiday on 1 October 2014, was
only slightly more than 40 million, while Baidu, the leading Chinese web
services company, predicted it to be about 200–230 million. The poor
prediction by Baidu led a reviewer of the film to lament that big data may not
be dependable [6]. Of course, we have to add the failed prediction of the
Trump presidency in 2016 by many predictors, whose implications for America,
and even the world's politics, are almost unfathomable.
Among the most important values of big data analysis are to detect
changes in system dynamics (e.g., to detect and understand abnormal behaviors)
and to help science to extend its reach (and most desirably, to possibly uncover
new fundamental laws). This includes timely diagnosis and treatment of various
kinds of diseases in health care, proper prediction of regime changes in weather
and climate patterns, timely forewarning of natural disasters, and timely
detection and fixing of malfunctioning of various kinds of devices,
infrastructure, and software in the field of operation and maintenance [7–10],
among many others. Understandably, abnormal behaviors cannot be expected
to occur frequently, and thus the relevant data may not be so abundant that
direct application of machine-learning based approaches will always be very
rewarding. In those situations, the systems often generate data with complex
characteristics including long-range spatial–temporal correlations, extreme
variations (sometimes caused by small disturbances), time-varying mean and
variance, and multiscale behavior (i.e., different behavior depending on the
scales at which the data are examined). Such situations have been increasingly
manifesting themselves in science, engineering, and society. To adequately
cope with these situations, it is often beneficial to resort to complexity science
to analyze the relevant data. In fact, when dealing with such highly challenging
situations, many analyses using machine-learning based approaches may be
considered pre-processing of the data or the first step that can facilitate further
application of complexity-based approaches, or as post-processing of the
features obtained through multiscale analysis. An excellent article along this
line (more precisely, study of segmental organization of the human genome by
combining complexity with machine learning approaches) has recently been
reported by Karakatsanis et al. [11]. In short, the variety of complex behaviors in nature,
science, engineering, and society must be infinite. To help one peek into this
infinity, it will often be beneficial to go beyond statistical analysis and
machine learning by resorting to the type of mathematics that embodies an
element of infinity.
At this point, it is important to pause for a moment to discuss a peculiar
phenomenon: while many consider complexity science to be very useful, some
others doubt its relevance to reality. Why is this so? The basic reason is that in
complexity research, conceptual thinking, simulational study, and applications
have not been well connected. For example, Science magazine dedicated the
April 1999 issue to Complex Systems. A number of leading experts in their
respective fields, including chemistry, physics, economics, ecology, and biology,
expressed their views on the relevance/importance of complexity science in
their fields. While the special issue was influential in making some concepts of
complex systems known to a wider research community and even the general
public, it did little to teach readers how to solve real-world problems. This
may have contributed to the waning of enthusiasm in complexity science
research in the subsequent years, as most readers cannot see how complexity
science can help solve their problems. Fortunately, the tide appears to have
been reversed (please see recent reviews on complexity theory and leadership
practice [12] and health [13]).
The purpose of this article is to convey how the many concepts in
complexity science can be effectively applied to help one formulate stimulating
problems pertinent to the data and the underlying system. We will particularly
focus on multiscale approaches. They are the key to finding scaling laws from the
data. With the scaling laws, we can then find defining parameters/properties of
the data and eliminate spurious causal relations in the data. The latter can help
to shed some light on a new generation of AI, which is based on
correlation/causality rather than pure probabilistic thinking [14]. To better
serve our goal, we will discuss various kinds of applications right after a
concept/method is introduced. Our goal here is to fully arouse readers’ interest
in the materials covered, and to equip them with a set of widely applicable
concepts and methods to help solve their own interesting problems.
A random variable X is said to follow a power law distribution if its complementary cumulative distribution function (CCDF) decays as

P[X ≥ x] ∼ x^−α, x → ∞. (2)

Notice here the emphasis that x → ∞. An interesting property of the power law
distribution is that moments of order α and higher do not exist. Therefore, when
α ≤ 2, the variance does not exist, and when α ≤ 1, even the mean does not exist. A
prototypical example is the Pareto distribution,

P[X ≥ x] = (b/x)^α, x ≥ b. (3)
Here, α is the shape parameter, and b the location parameter. In the discrete
case, the Pareto distribution is called the Zipf distribution, which provides an
excellent description of the relationship between the frequency of any word in a
corpus of natural language and its rank in the frequency table.
Somewhat related to the Zipf distribution is another distribution called
Benford's law [40], which is about the probability of occurrence of the leading digit
d ∈ {1, 2, ··· , 9},

P(d) = log10(1 + 1/d). (4)
A good mechanism for explaining the uneven distributions stipulated by
Benford’s law has been proposed in [41].
Benford’s law has been used for evaluating possible fraud in accounting
data [42], legal status [43], election data [44–46], macroeconomic data [47],
price data [48], etc. From Equation (4), we observe that beyond the small digits,
the probability approximately approaches the Zipf distribution with α = 1,
P(d) = log10(1 + 1/d) ≈ (log10 e)/d, d = 3, 4, ··· , 9. (5)
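As a quick check of Equations (4) and (5), the nine leading-digit probabilities can be computed directly (a minimal sketch; function name is illustrative):

```python
import math

# Benford's law, Equation (4): P(d) = log10(1 + 1/d)
def benford(d):
    return math.log10(1 + 1 / d)

probs = [benford(d) for d in range(1, 10)]
print(round(sum(probs), 10))   # the nine probabilities sum to exactly 1

# beyond the small digits, P(d) approaches a Zipf-like 1/d decay, Equation (5)
for d in (3, 6, 9):
    print(d, round(benford(d), 4), round(math.log10(math.e) / d, 4))
```

The probabilities sum to 1 because the product of the terms (d + 1)/d telescopes to 10.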
Power law distributions are intimately related to the famous 80/20 rule, which
says that roughly 80% of the wealth of a society is owned by 20% of its people.
Suppose the number of people with wealth at least x is

N(X ≥ x) = Ax^−α, (6)

where A is some coefficient. If the minimal wealth of a person is x0, then the
total number of people in the society can be denoted as N(X ≥ x0), and

N(X ≥ x0) = Ax0^−α. (7)

Their ratio gives the percentage of rich people with wealth at least x and is
equal to

(x/x0)^−α. (8)

The probability density function for a person to have wealth x is

f(x) = αx0^α x^(−α−1), x ≥ x0, (9)

so the total wealth of the society is proportional to

∫_{x0}^∞ αx0^α x^(−α−1) · x dx, (10)

and the total wealth of rich people with at least wealth x is given by the same
integral with lower limit x,

∫_{x}^∞ αx0^α x^(−α−1) · x dx. (11)

The ratio of (11) to (10) gives the percentage of wealth owned by the rich,

(x/x0)^(1−α). (12)

Solving for α by letting the ratios given by Equations (8) and (12) be 0.2
and 0.8, respectively, we find
α = ln 5/ ln 4 ≈ 1.16 (13)
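The algebra behind Equation (13) can be verified numerically in a few lines (the value x/x0 = 5^(1/α) follows from setting Equation (8) equal to 0.2):

```python
import math

alpha = math.log(5) / math.log(4)        # Equation (13), ~1.16
ratio = 5 ** (1 / alpha)                 # x/x0 implied by Equation (8) = 0.2
rich_fraction = ratio ** (-alpha)        # Equation (8)
wealth_fraction = ratio ** (1 - alpha)   # Equation (12)
print(round(rich_fraction, 6), round(wealth_fraction, 6))   # → 0.2 0.8
```

That is, 20% of the people own 80% of the wealth, exactly the 80/20 rule.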
Since FX(x) is monotonically nondecreasing, its inverse function exists. Letting U be a random variable uniformly distributed on [0, 1], we then have

X = FX^−1(U). (15)
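Equation (15) is the basis of inverse transform sampling. A minimal sketch for drawing Pareto samples (the function name and parameter values are illustrative):

```python
import random

# Inverse transform sampling for the Pareto distribution of Equation (3):
# F(x) = 1 - (b/x)^alpha, hence F^-1(u) = b * (1 - u)**(-1/alpha)
def pareto_sample(alpha, b, n, seed=42):
    rng = random.Random(seed)
    return [b * (1 - rng.random()) ** (-1 / alpha) for _ in range(n)]

samples = pareto_sample(alpha=2.0, b=1.0, n=50000)
# the empirical CCDF at x = 2 should be close to (1/2)^2 = 0.25
ccdf_2 = sum(s >= 2 for s in samples) / len(samples)
print(round(ccdf_2, 2))
```

Since random() returns values in [0, 1), the quantity 1 − u stays strictly positive and every sample is at least b.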
Suppose the CCDF decays as x^−α. When estimating α from a finite set of data
points, it is important to first estimate the CCDF, then take the logarithm of
the CCDF, and finally check whether the logarithm of the CCDF has a linear
relation with ln x. If one straightforwardly estimates a PDF or CCDF
for the original data by binning, then takes the log of both axes to estimate α,
one will often get a very inaccurate or even wrong estimate. The reason is that
many of the small intervals used for counting the number of data points falling
within them will be empty.
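A sketch of this procedure: evaluate the empirical CCDF at the sorted data points themselves (so no counting bins can be empty), then fit a straight line to log(CCDF) versus log(x). The function name is illustrative:

```python
import math
import random

def estimate_alpha(data):
    xs = sorted(data)
    n = len(xs)
    # rank-based CCDF: P[X >= xs[i]] is approximately (n - i) / n
    pts = [(math.log(x), math.log((n - i) / n)) for i, x in enumerate(xs[:-1])]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    num = sum((x - mx) * (y - my) for x, y in pts)
    den = sum((x - mx) ** 2 for x, _ in pts)
    return -num / den        # the log CCDF decays with slope -alpha

# Pareto test data with alpha = 1.5, generated by inverse transform sampling
rng = random.Random(0)
data = [(1 - rng.random()) ** (-1 / 1.5) for _ in range(20000)]
print(round(estimate_alpha(data), 2))
```

On clean Pareto samples the fitted slope comes out close to the true exponent 1.5.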
When we observe the balls with our naked eyes, we normally will only pay
more attention to the balls of certain size ranges—large balls will block our
vision, and very small balls cannot be seen. Now assume that our eyes are
comfortable with the scales r0, 2r0, r0/2, etc. Our perception is determined by
the relative abundance, i.e., the ratio of the numbers of balls of sizes 2r0, r0, and r0/2:
The distribution of river flow in dry seasons is better described by power law distributions, while that in wet seasons is better
described by log-normal distributions [53]. In deserts, surely flooding does not
constitute a major risk.
[Figure: log10 P(A ≥ a) plotted against log10 A on log–log axes for (a) USA and (b) China; the fitted slopes include −1.576 and −3.796 (USA) and −1.03 (China).]
Figure 3. Complementary cumulative distribution function (CCDF) for the forest fires in
USA and China, where the size of a fire is measured by its area A. The data for USA are
the sizes of individual fires from 1997 to 2018, while those for China are the total annual
size of forest fires in the 30 provinces from 1998 to 2017.
Figure 4. Complementary cumulative distribution function (CCDF) for the products of
volcanic eruptions in the Holocene: (a) tephra volume (km3) and dense rock equivalent
(DRE) (km3), and (b) volcanic sulfate (data were from [54]).
2.2.4. Mechanisms for Power Laws
The prevalence of power laws calls for development of models to explain
the mechanism. Various models have been proposed, including Tsallis non-
extensive statistics [55–57]. For a systematic discussion, we refer to Chapter 11
of [38]. Here, we note two of them, which appear to be relevant to many
different scenarios and thus may better stimulate readers to readily find
mechanisms when they find power laws from their data. One model is related
to spatial heterogeneity and resource allocation (or availability): a
superposition of exponential distributions with different
parameters can give rise to a power law distribution. The other reflects the
underlying local dynamics of the problem to some degree, and thus is in some
sense more thought-provoking. The most well-known example of this class is
perhaps the scale-free power law network model [58]. Another example is
related to social segregation and crimes in a society: distributions of the ratio
between sex offenders and the total population in the states of Ohio and New
York in the USA follow power laws, as shown in Figure 5 [59]. While intuitively
this must be driven by crimes (more concretely, sexual offenses) and by the
laws aimed at preventing them, a concrete model is still lacking.
Such a model is surely worth developing in the future.
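The first mechanism can be sketched in a few lines: when the rate of an exponential distribution is itself random (here Gamma-distributed, a standard choice used purely for illustration), the resulting mixture has the power law tail P[X ≥ x] = (1 + x)^−shape:

```python
import random

rng = random.Random(1)
shape = 1.5
n = 200000
# each sample is exponential, but with a rate drawn afresh from a Gamma
# distribution (shape 1.5, scale 1); the mixture is heavy-tailed
samples = [rng.expovariate(rng.gammavariate(shape, 1.0)) for _ in range(n)]

for x in (5, 10, 20):
    empirical = sum(s >= x for s in samples) / n
    predicted = (1 + x) ** (-shape)     # Lomax (Pareto type II) tail
    print(x, round(empirical, 4), round(predicted, 4))
```

The agreement follows from E[e^(−λx)] = (1 + x)^−k when λ is Gamma(k, 1): each individual sample is exponential, yet the population exhibits a power law.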
[Figure: log10 P(R ≥ r) plotted against log10 r on log–log axes; the fitted exponents are α = 1.87 for (a) Ohio and α = 1.70 for (b) New York.]
Figure 5. Distribution for the ratio between sex offenders and the total population in (a)
Ohio and (b) New York (adapted from [59]).
closed loop, again irrespective of where the trajectory starts. This is called a
globally stable limit cycle. The discrete counterpart of a limit cycle is a periodic
motion with a certain period (say N): the corresponding attractor consists of N
points, and the trajectory amounts to hopping among the N points
with a definite order.
To become more familiar with the concept of phase space, it is useful to
examine certain experiences in daily life. To illustrate the idea, suppose we were
going to a meeting by taxi. On our way, there was a traffic jam, and the taxi
got stuck. Afraid of being late, we decided to call the organizer. How would we
describe our situation? Usually, we would tell the organizer where we got stuck
and how quickly or slowly the taxi was moving. In other words, we actually have
been using the concept of phase space as part of our daily language.
Although the concept of phase space is among the most basic in
dynamical systems theory, its usefulness in geographical science has yet to be
seriously explored [60]. To accelerate the coming of a time that phase space
becomes as basic in geographical science as in complexity science, it is helpful
to discuss two potential applications of phase space in geographical science.
One application is top-down, that is, to systematically think about how many
independent variables are needed to fully characterize an interesting and
important problem in geographical science, and how each variable can be
measured. The other application is bottom-up. It is easiest to illustrate the idea
by using some variables in the World Value Survey (WVS, accessed on 17 April
2021, [Link] org/[Link]) as an example. WVS is an
interesting project that explores values and beliefs of people around the globe,
how the values and beliefs evolve with time, and what social and political
implications they may have. Since 1981, researchers have conducted
representative national surveys in almost 100 countries. During these surveys, values
of many variables have been collected. We show here that phase space offers a
convenient geometrical way to visualize the data and identify co-variations of
the variables. For this purpose, we choose a variable that gives three levels of
religious participation for people in the nations surveyed. The other variable we
choose is happiness, which is given in four levels. How are the two variables
related? How different are people in different countries in terms of these two
variables? To gain insights into these interesting questions, we can form a
phase space spanned by these two variables. The format of the survey data
determines that people surveyed in a nation will belong to one of the 12
different categories. To fully utilize the notion of space, we can associate each
category with a box. Instead of putting every person belonging to that category
at one single point (e.g., the center of the box), we can generate two uniformly
distributed random variables as the coordinate of the person in the
corresponding box. Please see Figure 6. With such a visualization, one can
immediately see the abundance of each category. When WVS data of different
waves (times) are used, one can then examine variation of the percentage of
people in each category over time for a nation, compare among different
nations, deduce functional relationships between these two variables, and
classify nations in the world into different clusters. Note that Figure 6 may be called
a phase space ensemble based visualization, where an ensemble member amounts to a
participant in the survey.
Figure 6. Phase space diagram of religious participation vs. happiness for the USA based on wave 7 of the World Value
Survey Data.
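A sketch of how such a figure can be generated. The category counts below are random stand-ins, not actual WVS data; with real survey responses only the list of respondents would change:

```python
import random

# hypothetical stand-in data: 3 religious-participation levels x 4 happiness levels
rng = random.Random(0)
respondents = [(rng.randint(1, 3), rng.randint(1, 4)) for _ in range(1000)]

points = []
for participation, happiness in respondents:
    # category (p, h) owns the unit box [p-1, p) x [h-1, h); each person
    # gets uniform random coordinates inside the box instead of its center
    px = participation - 1 + rng.random()
    py = happiness - 1 + rng.random()
    points.append((px, py))

in_box = all(p - 1 <= px < p and h - 1 <= py < h
             for (p, h), (px, py) in zip(respondents, points))
print(in_box)   # every point stays inside its own category box
```

Plotting the resulting points (e.g., as a scatter plot) makes the abundance of each category immediately visible as the density of points in its box.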
twisted by the wind? To make this discussion more concrete, we can consider
how a unit circle is transformed by the Henon map [61]:

x_{n+1} = 1 − a x_n^2 + y_n, y_{n+1} = b x_n,

where a = 1.4, b = 0.3. Figure 7 shows the successive (from left to right and top
to bottom) images of the unit circle after n = 1, ··· , 5 iterations. Note that the
fifth image is basically the Henon attractor one can find in textbooks, journal
papers, or certain web sites. It is usually obtained by choosing an arbitrary
initial condition and iterating the Henon map long enough. If the trajectory does
not diverge, then after removing the transient points (which are the first few
points here), the remaining trajectory (not connected by lines) will be very
similar to the fifth image shown here. In our ensemble scenario, we observe
that just after one iteration, the unit circle is already changed to a very different
shape, and by the fourth iteration, the shape of the image is already very
similar to the Henon attractor. By now, one could easily understand that the
Henon attractor can either be readily obtained from an arbitrarily shaped phase
space region (discarding initial conditions which lead to the divergence of the
iterations) or by iterating a single arbitrary initial condition many times. The
equivalence of the two approaches, one based on the evolution of ensembles in
the phase space, the other based on long-time iterations, is a clear
manifestation of the ergodic property of the Henon map (and more generally,
chaotic systems).
Figure 7. Successive transformation of a unit circle by the Henon map. The unit circle is
represented by 36,000 points with equal arc spacing. These points are then taken as
initial conditions for the Henon map. Successive (i = 1, 2, . . . , 5) images of the unit circle
(discarding initial conditions which lead to divergence of the iterations) are shown from
left to right and top to bottom in the figure.
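The experiment of Figure 7 is easy to reproduce; a minimal sketch (the divergence threshold is an arbitrary choice):

```python
import math

# Henon map: x' = 1 - a*x^2 + y, y' = b*x, with a = 1.4, b = 0.3
a, b = 1.4, 0.3

def henon(x, y):
    return 1 - a * x * x + y, b * x

# 36,000 points with equal arc spacing on the unit circle
points = [(math.cos(2 * math.pi * k / 36000), math.sin(2 * math.pi * k / 36000))
          for k in range(36000)]

for n in range(5):
    # iterate every point once, discarding diverging initial conditions
    points = [henon(x, y) for x, y in points]
    points = [(x, y) for x, y in points if abs(x) < 10]

# the surviving cloud of points now approximates the Henon attractor
print(len(points) > 1000)
```

Saving the surviving points after each iteration and plotting them reproduces the successive panels of Figure 7.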
To enhance our understanding of the materials discussed so far, let us
visually observe how chaos manifests itself in the chaotic Lorenz system:

dx/dt = σ(y − x), dy/dt = x(r − z) − y, dz/dt = xy − bz,

with σ = 16, r = 45.92, b = 4. We can choose a cloud of initial conditions centered
at an arbitrarily chosen point, distributed
with very small variance, and monitor the evolution of all those points.
initial conditions are shown in Figure 8 as a magenta block centered at our
chosen initial condition. After 2 units of time, these initial conditions spread to
the points labeled as red in the Figure. After another 2 units of time, the red
points further evolve to the points labeled as green. Two more units of time
later, the green points become the blue points. By that time, the shape of the
points already resembles the chaotic Lorenz attractor we usually see in books,
papers, and on the Internet.
Figure 8. Evolution of point clouds in the chaotic Lorenz system: magenta, red, green,
and blue correspond to t = 0, 2, 4, 6, respectively.
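A sketch of the point-cloud experiment. The parameter set σ = 16, r = 45.92, b = 4 is an assumption (any chaotic parameter set would do), and a simple fixed-step RK4 integrator stands in for a proper ODE solver:

```python
import random

SIGMA, R, B = 16.0, 45.92, 4.0   # assumed chaotic parameter set

def deriv(s):
    x, y, z = s
    return (SIGMA * (y - x), x * (R - z) - y, x * y - B * z)

def rk4_step(s, dt):
    # one fourth-order Runge-Kutta step
    def shift(u, v, c):
        return tuple(ui + c * vi for ui, vi in zip(u, v))
    k1 = deriv(s)
    k2 = deriv(shift(s, k1, dt / 2))
    k3 = deriv(shift(s, k2, dt / 2))
    k4 = deriv(shift(s, k3, dt))
    return tuple(si + dt / 6 * (p + 2 * q + 2 * r2 + w)
                 for si, p, q, r2, w in zip(s, k1, k2, k3, k4))

# a tight cloud of initial conditions with very small variance
rng = random.Random(0)
cloud = [(1 + rng.gauss(0, 0.01), 1 + rng.gauss(0, 0.01), 20 + rng.gauss(0, 0.01))
         for _ in range(50)]

dt = 0.005
spreads = []
for snapshot in range(3):                 # snapshots at t = 2, 4, 6
    for _ in range(int(2 / dt)):
        cloud = [rk4_step(s, dt) for s in cloud]
    spreads.append(max(abs(p[0] - q[0]) for p in cloud for q in cloud))

# the initially tiny cloud spreads over the attractor but remains bounded
print(spreads[-1] > 1, all(abs(c) < 300 for s in cloud for c in s))
```

Plotting the cloud at the three snapshot times reproduces the qualitative picture of Figure 8: exponential spreading followed by saturation over the attractor.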
[Figure: ln|∆x(n)| plotted against n, for n from 0 to 50.]
Figure 9. Error growth in the logistic map.
The largest positive Lyapunov exponent for the Henon map and the
chaotic Lorenz system we discussed in Section 2.3.1 can also be conveniently
computed based on time series data. This will be discussed shortly.
The trajectories of a chaotic attractor are bounded in the phase space. This
is another fundamental property of the chaotic attractor. The ceaseless
stretching due to exponential divergence of nearby trajectories, and folding
from time to time due to boundedness of the attractor, make the chaotic
attractor a fractal, characterized by

N(e) ∼ e^−D, (23)

where N(e) represents the (minimal) number of boxes, with linear length not
larger than e, needed to completely cover the attractor in the phase space. D is
called the box-counting dimension of the attractor. Typically, it is a nonintegral
number. For the chaotic Henon and Lorenz attractor, D is 1.2 and 2.05,
respectively.
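Equation (23) can be turned into a quick numerical estimate of D for the Henon attractor (box sizes and point counts are pragmatic choices; the estimate is rough):

```python
import math

# generate a long Henon orbit, discarding transients
a, b = 1.4, 0.3
x, y = 0.1, 0.1
for _ in range(1000):
    x, y = 1 - a * x * x + y, b * x
points = []
for _ in range(200000):
    x, y = 1 - a * x * x + y, b * x
    points.append((x, y))

# count nonempty boxes at a sequence of halving box sizes e
sizes = [0.1, 0.05, 0.025, 0.0125]
counts = []
for e in sizes:
    boxes = {(math.floor(px / e), math.floor(py / e)) for px, py in points}
    counts.append(len(boxes))

# D is the slope of log N(e) against log(1/e); here e halves at each step
slopes = [math.log(counts[i + 1] / counts[i]) / math.log(2) for i in range(3)]
print(counts, [round(s, 2) for s in slopes])
```

The fitted slopes hover around the quoted box-counting dimension of about 1.2.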
x_{n+1} = 2x_n, 0 ≤ x_n < 1/2; x_{n+1} = 2x_n − 1, 1/2 ≤ x_n < 1. (25)

This map in fact acts as a Bernoulli shift [62], or binary shift, since if we
represent an initial condition x0 in binary form,

x0 = 0.a1a2a3··· = ∑_{j=1}^∞ aj 2^−j, (26)

where each of the digits aj is either 1 or 0, then x1 = 0.a2a3a4···,
x2 = 0.a3a4a5···, and so on. Now it is clear that when
x0 is a rational number, the trajectory is periodic. In fact, we can easily find
cycles of any length.
For example, if x0 is a 3-bit repeating sequence, such as x0 = 0.001001001 ··· ,
then the trajectory is periodic with period 3. Since there are infinitely more
irrational numbers than rational numbers in [0, 1), an arbitrary initial condition
x0 will be an irrational number with probability 1, and will almost surely
generate an aperiodic, chaotic trajectory. Since after each iteration the map
shifts one bit, a digit that is initially very unimportant, say
the 80th digit (corresponding to 2^−80 ≈ 10^−24), becomes the first and
the most important digit after 80 iterations. This is a vivid example that a small
change in the initial condition makes a profound change in xn. Clearly, the largest
Lyapunov exponent λ1 here is ln 2. Next, let us re-consider the logistic map with
µ = 4. If we make a transformation,
xn = sin2(2πyn) (27)
then the logistic map becomes the Bernoulli shift map discussed above.
Therefore, the largest Lyapunov exponent λ1 for the logistic map with µ = 4 is
also ln 2, as we already mentioned.
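This can be confirmed numerically: for a 1D map, the largest Lyapunov exponent is the orbit average of ln |f′(x)|, which for the logistic map with µ = 4 is ln |4 − 8x| (a minimal sketch):

```python
import math

x = 0.3
total, n = 0.0, 100000
for _ in range(n):
    # accumulate ln|f'(x)| = ln|4 - 8x| along the orbit, then iterate
    total += math.log(max(abs(4 - 8 * x), 1e-15))   # guard against x = 0.5
    x = 4 * x * (1 - x)
lam = total / n
print(round(lam, 2), round(math.log(2), 2))
```

The running average settles near ln 2 ≈ 0.693, in agreement with the Bernoulli shift argument.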
Now that we have gained some understanding by considering simple
model systems, we can discuss how to characterize general chaotic systems.
For a chaotic dynamical system with dimensions higher than 1, first we need to
realize that exponential divergence can occur in more than one direction, and
possibly in many directions. That means we have multiple positive Lyapunov
exponents. We denote them by λ+, among them, the largest one is usually
denoted as λ1. How are these Lyapunov exponents related to the rate of
creation of new information, or in other words, loss of prior knowledge, in the
system? To find the answer, we may partition the phase space into boxes of
size e, compute the probability pi that the trajectory visits box i, and finally
calculate the Shannon entropy I = −∑ pi ln pi. For many systems, when e → 0,
information increases with time linearly [63]
I(e, t) = I0 + Kt (28)
already explained. If all these regions are visited by the trajectories with equal
probability, then pi(T) ∼ 1/N, and the information function becomes

I(T) ∼ ln N ∼ (∑λ+)T. (29)
We thus have K = ∑λ+. In general, if these phase space regions are not
visited equally likely, then
K ≤ ∑λ+ (30)
Equation (22) again. Let us set µ = 2 and iterate the map starting with an initial
condition x0 = 0.3. With simple calculations, we can easily find that xn soon
equals 0.5 after a few iterations. If we choose x0 = 0.5, then x1 = x2 = ··· = 0.5. This
means that 0.5 is a stable, fixed-point solution. While it
is easy to prove this statement rigorously [38], here, let us resort to
simulations: For any µ, where µ ∈ [2, 4], we choose an arbitrary initial
value of x0, and iterate Equation (22). After discarding the initial iterations
so that the solution of the map has stabilized, we retain a large number (say, 100) of the values
of the iterations, and form a scatter plot of those values with µ.
has a globally attracting fixed-point solution, then the recorded values of xn will
all become the same since the transients have been discarded. In this situation,
one only observes a single point with the horizontal axis being the chosen µ and
the vertical axis being the converged value of xn. For a periodic solution with
period m, one can observe m distinct points on the vertical axis. When the
motion becomes chaotic, one observes on the vertical axis as many distinct
points as one records (100 in our example). Figure 10a shows the bifurcation
diagram for the logistic map—the interesting structure is the celebrated period-
doubling bifurcation to chaos.
Figure 10. Bifurcation diagram for the logistic map; (b) is an enlargement of the little
rectangular box indicated by the arrow in (a).
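The procedure behind Figure 10 can be sketched in a few lines (transient length and the number of retained values are pragmatic choices):

```python
def logistic_values(mu, n_transient=1000, n_keep=100):
    """Iterate x -> mu*x*(1-x), discard transients, return the recorded values."""
    x = 0.3
    for _ in range(n_transient):
        x = mu * x * (1 - x)
    values = set()
    for _ in range(n_keep):
        x = mu * x * (1 - x)
        values.add(round(x, 6))
    return values

print(len(logistic_values(2.8)))        # globally attracting fixed point: one value
print(len(logistic_values(3.2)))        # period-2 solution: two values
print(len(logistic_values(4.0)) > 50)   # chaos: many distinct values
```

Plotting the recorded values against µ, swept over [2, 4], yields the period-doubling structure of Figure 10.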
The successive period-doubling bifurcation points µk converge to the onset of chaos at a universal rate given by the Feigenbaum constant,

δ = lim_{k→∞} (µk − µk−1)/(µk+1 − µk) ≈ 4.6692. (31)
Other routes to chaos also exist. They include the well-known quasi-
periodicity route [66] and the intermittency route [67]. The former refers to
the scenario that, when a controlling parameter is changed, the motion of the system changes
from a periodic motion with one basic frequency, to a quasi-periodic motion with
two or more basic frequencies, and then to chaotic motion. This route has been
observed in many mechanical and physical systems, including fluid systems. A
bit surprisingly, this route has also manifested itself in the Internet transport
dynamics (concretely, a variable amounting to the round-trip time of a message
transmitting through the Internet can change from periodic and quasi-periodic
motion to chaos when the congestion level increases [68]). The third classic
route to chaos, intermittency, refers to the behavior that the motion of the
system alters between smooth and chaotic modes, again when a controlling
parameter is changed. This route to chaos is very relevant to many
nonstationary phenomena in nature, including river flow dynamics, which are
very different in wet and dry seasons.
where m is the embedding dimension and L the delay time. More explicitly, we
have
V1 = [x(t1), x(t1 + τ), x(t1 + 2τ), ··· , x(t1 + (m − 1)τ)],
V2 = [x(t2), x(t2 + τ), x(t2 + 2τ), ··· , x(t2 + (m − 1)τ)],
···

where ti+1 − ti = ∆t and τ = L∆t. We thus obtain a discrete dynamical system (i.e.,
a map),

Vn+1 = M(Vn). (36)
(1) False nearest neighbors: Suppose there exists a critical dimension m0 such that an
m0-dimensional delay reconstruction is an embedding but an (m0 − 1)-dimensional
reconstruction is not. Passing from m0 − 1 to m0, self-intersection in the
reconstructed trajectory is eliminated. This feature can be quantified by the sharp
decrease in the number of nearest neighbors when m is increased from m0 − 1 to m0.
Therefore, the optimal value of m is m0. More precisely, for each reconstructed vector
Vi(m) = [x(ti), x(ti + τ), x(ti + 2τ), ··· , x(ti + (m − 1)τ)], its nearest neighbor
Vj(m) is found (to ensure unambiguity, here the superscript (m) is used to
emphasize that this is an m-dimensional reconstruction). If m is not large
enough, then Vj(m) may be a false neighbor of Vi(m) (something like both the
north and south poles are mapped to the center of the equator, or
multiple different objects have the same shadow). If embedding can be
achieved by increasing m by 1, then the embedding vectors become
Vi(m+1) = [x(ti), x(ti + τ), x(ti + 2τ), ··· , x(ti + mτ)]. A neighbor is declared
false if Rf, the relative increase in the distance between Vi and Vj when passing
from m to m + 1 dimensions, exceeds a threshold,

Rf > RT, (38)

where RT is a heuristic threshold value. Abarbanel [72] recommends RT = 15.
After m is determined, τ can be obtained by minimizing Rf .
While this method is intuitively appealing, it should be pointed out that it
works less effectively in the noisy case. Partly, this is because nearest
neighbors may not be well defined when data have noise.
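A simplified sketch of this test (the separation criterion used here, comparing the extra coordinate gained at dimension m + 1 with the distance at dimension m, is one common variant, shown only for illustration):

```python
import math

def fnn_fraction(series, m, L=1, RT=15.0):
    """Fraction of points whose nearest neighbor at dimension m is false."""
    n = len(series) - m * L            # leave room for the (m+1)-th coordinate
    V = [series[i:i + m * L:L] for i in range(n)]
    false_count = 0
    for i in range(n):
        # nearest neighbor of V[i] in dimension m (excluding V[i] itself)
        j = min((k for k in range(n) if k != i),
                key=lambda k: sum((u - v) ** 2 for u, v in zip(V[i], V[k])))
        d_m = math.sqrt(sum((u - v) ** 2 for u, v in zip(V[i], V[j])))
        extra = abs(series[i + m * L] - series[j + m * L])
        if d_m > 0 and extra / d_m > RT:
            false_count += 1
    return false_count / n

# Henon x-coordinate data: dimension 2 suffices, so the false-neighbor
# fraction should drop sharply when m is increased from 1 to 2
a, b, x, y = 1.4, 0.3, 0.1, 0.1
data = []
for _ in range(500):
    x, y = 1 - a * x * x + y, b * x
    data.append(x)
data = data[100:]                      # discard transients
f1, f2 = fnn_fraction(data, m=1), fnn_fraction(data, m=2)
print(f1 > f2)
```

The sharp drop in the false-neighbor fraction between m = 1 and m = 2 signals that m0 = 2 for the Henon map.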
(2) Time-dependent exponent curves: This is a dynamical method developed
by Gao and Zheng [73,74]. The basic idea is that false neighbors will fly
apart rapidly if we follow them on the trajectory. Denote the
reconstructed trajectory by V1(m), V2(m), ··· .
If Vi(m) and Vj(m) are false neighbors, then it is unlikely that the points Vi+k(m) and
Vj+k(m), where k is the evolution time, will remain close neighbors. That is,
the distance between Vi+k(m) and Vj+k(m) will be much larger than that
between Vi(m) and Vj(m) if the delay reconstruction is not an embedding.
The metric recommended by Gao and Zheng is

Λ(m, L, k) = ⟨ ln( ‖Vi+k − Vj+k‖ / ‖Vi − Vj‖ ) ⟩. (39)
Here, for simplicity, the superscript (m) in the reconstructed vectors is no
longer indicated. The angle brackets denote the average of all possible
(Vi, Vj) pairs satisfying the condition
ei ≤ kVi − Vjk ≤ ei + ∆ei, i = 1, 2, 3, ··· , (40)
where ei and ∆ei are more or less arbitrarily chosen small distances.
Geometrically speaking, Equation (40) defines a shell, with ei being the
diameter of the shell and ∆ei the thickness of the shell. When ei = 0, the
shell becomes a ball; in particular, if the embedding dimension m is 2,
then the ball is a circle. Note that the computation is carried out for a
series of shells, i = 1, 2, 3, ··· , and ∆ei may depend on the index i. With this
approach, the effect of noise can be greatly suppressed.
As a rule of thumb, Gao and Zheng find that for a fixed small k, the
minimal m is such that when further increasing m, Λ(m, L, k) no longer
decreases significantly. After m is determined, L can be chosen by minimizing
Λ(m, L, k).
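To make the procedure concrete, here is a minimal sketch for a scalar time series. A tent-like map with slope 1.99 is used as test data so that the true λ1 = ln 1.99 is known exactly; the shell and all parameter values are illustrative:

```python
import math

def lambda_curve(series, m, L, shell, w, kmax):
    """Time-dependent exponent Lambda(k): average log growth of the distance
    between pairs whose initial separation lies in a shell, with tangential
    pairs (|i - j| <= w) excluded, as in Equations (39), (40), and (43)."""
    n = len(series) - (m - 1) * L
    V = [series[i:i + m * L:L] for i in range(n)]

    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    pairs = [(i, j) for i in range(n - kmax) for j in range(i + w + 1, n - kmax)
             if shell[0] <= dist(V[i], V[j]) <= shell[1]]
    return [sum(math.log(dist(V[i + k], V[j + k]) / dist(V[i], V[j]))
                for i, j in pairs) / len(pairs) for k in range(1, kmax + 1)]

# tent-like map with slope 1.99, for which lambda_1 = ln 1.99 exactly
x, data = 0.33, []
for _ in range(1200):
    x = 1.99 * min(x, 1 - x)
    data.append(x)

curve = lambda_curve(data, m=2, L=1, shell=(0.001, 0.002), w=10, kmax=4)
slope = curve[1] - curve[0]
print(round(slope, 2), round(math.log(1.99), 2))
```

Before the curve saturates, its slope per unit evolution time recovers the Lyapunov exponent ln 1.99 ≈ 0.69.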
Now that we have determined an optimal embedding, we can discuss how
to estimate the largest positive Lyapunov exponent, dimension, and
Kolmogorov entropy of chaotic attractors.
B. Estimation of the largest positive Lyapunov exponent
A number of algorithms for estimating the Lyapunov exponents have been
developed. A classic method is Wolf et al.’s algorithm [75]. The basic idea is to
select a fiducial trajectory and monitor how the deviation from it grows with
time. Let the distance between the two trajectories at time ti and ti+1 be di0 and
di+1, respectively. The rate of the exponential divergence over this time period is given by

ln(di+1/di0) / (ti+1 − ti).

Averaging over the whole fiducial trajectory, with the deviation renormalized
whenever it grows too large, one obtains

λ1 = [1/(tN − t0)] ∑_i ln(di+1/di0). (41)

A related approach is to compute the quantity

Λ(k) = ⟨ ln⟨ ‖Vi+k − Vj+k‖ ⟩(average over j) ⟩(average over i), (42)

where Vi is a reference point and Vj are neighbors of Vi, satisfying the
condition ‖Vi − Vj‖ < e. If Λ(k) ∼ k for a certain intermediate range of k,
then the slope is the largest Lyapunov exponent. This is the most fundamental part of the
algorithm: it explicitly tests whether the dynamics of the data possess the
exponential divergence that defines chaos.
While in principle this method can distinguish chaos from noise, with finite
noisy data it may not function as desired. One of the major reasons is that in
order for the average over j to be well defined, e has to be small. In fact,
sometimes the e-neighborhood of Vi is replaced by the nearest neighbor of Vi.
For this reason, the method cannot handle short, noisy time series well.
Gao and Zheng’s algorithm [73,74,79] contains three basic
ingredients:
Equations (39) and (40), and the condition
|i − j| > w. (43)
Equation (39) plays the same role as but is simpler than Equation (42),
since it eliminates the necessity of performing two rounds of averages. More
important are the conditions specified by two Inequalities (40) and (43). The
condition specifying the series of shells makes the method a direct test for
deterministic chaos, which will be explained momentarily. The condition
specified by Inequality (43) ensures that tangential motions corresponding to
the condition that Vi and Vj follow each other along the orbit are removed.
Tangential motions contribute a Lyapunov exponent of zero and, hence, including
them severely underestimates the positive Lyapunov exponent. An example is
exhibited in Figure 13. We find that when w = 1, the slope of the curve severely
underestimates the largest positive Lyapunov exponent, while w = 54 solves
the problem. In practice, w can be chosen to be larger than one orbital time,
when orbital times are defined in the dynamical system (Lorenz and Rossler
attractor are such systems). If an orbital time cannot be defined, it can be more
or less arbitrarily set to be a large integer if the dataset is not too small.
Figure 13. Λ(k) vs. k curves for the Lorenz system. When w = 1, the slope of the curve
severely underestimates the largest Lyapunov exponent. When w is increased to 54, the
slope correctly estimates the largest Lyapunov exponent (reproduced from [74]).
To see how the condition specifying the series of shells gives rise to a
direct test for deterministic chaos, we can compare the behavior of the time-
dependent exponent curves for truly chaotic data and independent, identically
distributed random variables. The basic results are illustrated in Figure 14. We
observe that for true chaotic signals, the time-dependent exponent curves from
different shells not only grow linearly for some intermediate range of the
evolution time k, but form a common envelope. As one expects, the slope of
the common envelope gives an accurate estimation of the largest positive
Lyapunov exponent. Such a common envelope does not exist for IID random
variables. In fact, the behavior of the IID random variables vividly illustrates the
Figure 14. Time-dependent exponent curves for the chaotic Lorenz data (left) and IID
random variables (right), where the curves, from bottom up, correspond to shells
(2^−(i+1)/2, 2^−i/2), i = 1, 2, ··· , 9 (adapted from [74]).
where the parameter α signifies what kind of diffusion the dynamics execute:
the dynamics are called sub-diffusive, normally diffusive, and super-diffusive when
0 < α < 1/2, α ≈ 1/2, and 1/2 < α, respectively. In the case of tremors, the
dynamics basically are normal diffusions [80]. Typical Λ(k) curves for normal
diffusions are of the shape shown in Figure 15 (right), which are also true for
the fluid dynamics in wakes behind circular cylinders in low Reynolds numbers
[81,82]. Other types of diffusions, although rarer, are also possible. We will
return to this issue later when we consider chaos communications.
Figure 15. 2D phase diagram for essential tremor data (left) and time-dependent
exponent curves (right), where the curves, from bottom up, correspond to shells
(2^{−(i+1)/2}, 2^{−i/2}), i = 1, 2, ···, 9 (adapted from [80]).
C. Estimation of fractal dimension and Kolmogorov entropy
There is an elegant algorithm, the Grassberger–Procaccia algorithm
[64,84], that takes care of both. To fully understand the algorithm, we first
extend the box-counting dimension defined in Equation (23). Recall that when
we defined the box-counting or capacity dimension of a chaotic attractor, we
partitioned the phase space where the attractor locates into many small regions
called cells or boxes of linear size e, and we counted the number of non-empty
cells or boxes. We can monitor the non-empty boxes more precisely by
counting how many points of the attractor have fallen into each of them. We
can then assign a probability pi to the ith cell that is not empty. The simplest
way to compute pi is by using ni/N, where ni is the number of points that fall
within the ith cell, and N is the total number of points. Then
Dq = lim_{e→0} (1/(q − 1)) ln(∑_{i=1}^{n} p_i^q)/ln e, (46)

where n is the total number of nonempty cells, and q is real. Generally speaking,
Dq is a nonincreasing function of q. D0 is the very box-counting or capacity
dimension we have already discussed, since ∑_i p_i^0 simply counts the nonempty
cells. Taking the limit q → 1 gives the information dimension DI. Around a
reference point on the attractor, the measure scales as

p(l) ∼ l^α, l → 0, (47)

where p(l) is the measure (i.e., probability) for the trajectory to fall within a
neighborhood of size l centered at the reference point, and α is called the
pointwise dimension. D2 is called the correlation dimension. It is what the
Grassberger–Procaccia algorithm calculates. It involves computing the
correlation integral

C(e) = lim_{N→∞} (1/N²) ∑_{i,j=1}^{N} H(e − ||Vi − Vj||), (48)

where Vi and Vj are the embedding vectors, H(y) is the Heaviside function,
which is 1 if y ≥ 0 and 0 if y < 0. N is the number of points randomly chosen
from the reconstructed vectors.
from the reconstructed vectors. The term involving the Heaviside function
amounts to counting the number of points falling within a cell of radius e that is
centered around Vi. Therefore, C(e) estimates the average fraction of points
within a distance of e. One then checks the following scaling behavior:
C(e) ∼ e^{D2}, as e → 0. (49)
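As a concrete illustration of Equations (48) and (49), here is a minimal NumPy sketch (the Henon map data, embedding parameters, and scale range are illustrative choices, not from the original):

```python
import numpy as np

def correlation_integral(x, m, L, e):
    """Finite-N estimate of C(e), Equation (48): the fraction of distinct
    pairs of embedding vectors closer than e (max norm)."""
    n = len(x) - (m - 1) * L
    V = np.column_stack([x[i * L : i * L + n] for i in range(m)])
    d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=2)
    iu = np.triu_indices(n, k=1)
    return np.mean(d[iu] < e)

# illustrative data: x-coordinate of the Henon map
a, b = 0.1, 0.3
traj = []
for _ in range(1600):
    a, b = 1 - 1.4 * a * a + b, 0.3 * a
    traj.append(a)
traj = np.array(traj[600:])                       # discard transients

es = np.logspace(-2, -0.5, 8)
Cs = np.array([correlation_integral(traj, 2, 1, e) for e in es])
D2 = np.polyfit(np.log(es), np.log(Cs), 1)[0]     # slope = D2, Equation (49)
print(D2)    # roughly 1.2 for the Henon attractor
```

In practice one checks that log C(e) vs. log e is straight over a significant range of e before trusting the slope.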
The Kolmogorov–Sinai (KS) entropy K is given by

K = −lim_{τ→0} lim_{e→0} lim_{d→∞} (1/(dτ)) ∑_{i1,···,id} p(i1, ···, id) ln p(i1, ···, id), (50)

where K characterizes the rate of creation of entropy. To see this, we can start
from the block entropy:

Hd(e, τ) = −∑_{i1,···,id} p(i1, ···, id) ln p(i1, ···, id). (51)

It is on the order of dτK. The difference between Hd+1(e, τ) and Hd(e, τ) gives the rate:

hd = (1/τ)[Hd+1(e, τ) − Hd(e, τ)]. (52)

Let

K = lim_{τ→0} lim_{e→0} lim_{d→∞} (1/τ)[Hd+1(e, τ) − Hd(e, τ)]. (54)

Similarly, one can define the order-q entropies

Kq = −lim_{τ→0} lim_{e→0} lim_{d→∞} (1/(dτ)) (1/(q − 1)) ln ∑_{i1,···,id} p^q(i1, ···, id). (55)

When q → 1, Kq → K. Like the correlation dimension, the correlation
entropy K2 can be computed by the Grassberger–Procaccia algorithm by the
following equation:

K2(e) ≈ (1/τ) ln[Cm(e)/Cm+1(e)], (56)

where τ = Lδt is the actual delay time. The above equation can also be
expressed as

K2 = lim_{e→0} lim_{m→∞} (1/τ) ln[Cm(e)/Cm+1(e)]. (57)
Although the above equations involve taking limits, in practice, data are of
finite length, and one really looks for power-law scaling behaviors between
Cm(e) and e when m is changed. When power law relations hold, in log-log
scale, one should observe a series of curves, which are straight over a
significant range of e, and the curves for smaller embedding dimension m lie
above those for larger m. In certain applications, one may just fix e to some
small value e∗, say 10% or 15% of the standard deviation of the original time
series, then compute K2(e∗). This K2(e∗) is called sample entropy, which has
been widely used in various kinds of physiological data analyses. Sample
entropy can also be computed for filtered data. When the filter is simply a
moving average, the simplest filter there is, the resulting series of
entropies corresponding to different parameters of the moving average is
called multiscale entropy. For more details, we refer to [86].
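A minimal sketch of the sample entropy just described (the tolerance e* = 0.15 std, embedding dimension, and test signals are illustrative; a production implementation would follow [86]):

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.15):
    """Sample entropy at the fixed scale e* = r_frac * std(x): minus the
    log of the conditional probability that vector pairs within e* for m
    points remain within e* for m + 1 points (a simplified sketch)."""
    x = np.asarray(x, dtype=float)
    r = r_frac * np.std(x)

    def pair_count(mm):
        n = len(x) - mm
        V = np.column_stack([x[i : i + n] for i in range(mm)])
        d = np.max(np.abs(V[:, None, :] - V[None, :, :]), axis=2)
        iu = np.triu_indices(n, k=1)
        return np.sum(d[iu] <= r)

    return -np.log(pair_count(m + 1) / pair_count(m))

rng = np.random.default_rng(0)
se_noise = sample_entropy(rng.normal(size=1000))
se_sine = sample_entropy(np.sin(0.1 * np.arange(1000)))
print(se_noise, se_sine)     # the irregular signal has the larger entropy
```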
Before ending this subsection, we note a simple but very interesting and
useful technique for testing nonlinearity. It is called the surrogate data
approach [87,88]. The basic idea is to examine whether the original time series
is distinctly different from a random time series sharing some basic properties
of the original time series, such as the distribution or the power-spectral
density. In the former case, the random time series can be readily obtained by
simply shuffling the original time series. In the latter case, one can randomize
the phase of the Fourier transform of the original time series and take the
inverse transform.
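Both kinds of surrogates can be generated in a few lines (a sketch; the seed and the test series are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

def shuffle_surrogate(x):
    """Destroys all temporal structure; preserves the distribution."""
    return rng.permutation(x)

def phase_surrogate(x):
    """Randomizes the Fourier phases; preserves the power spectrum
    (and hence the autocorrelation), destroying nonlinear structure."""
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=len(X))
    phases[0] = 0.0                    # keep the mean component real
    if len(x) % 2 == 0:
        phases[-1] = 0.0               # keep the Nyquist component real
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x))

x = np.cumsum(rng.normal(size=1024))   # a strongly correlated test series
s = phase_surrogate(x)
# the two power spectra agree to numerical precision
print(np.allclose(np.abs(np.fft.rfft(x)) ** 2, np.abs(np.fft.rfft(s)) ** 2))
```

A nonlinear statistic (e.g., sample entropy) computed on the original series is then compared against its distribution over an ensemble of such surrogates.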
2.3.6. Chaos-Based Communications and Effect of Noise on Dynamical Systems
Among the most promising applications of chaos theory is the exploitation
of the short-term deterministic and long-term unpredictable aspects of chaotic
behavior for the development of chaos-based communication systems. The
actual research in this area goes in two directions. One, started in the early
1990s, is chaos-based secure communications [89]. The other, which is more
recent, is to use chaos to rapidly generate random bits in physical devices, for a
range of applications in cryptography and secure communication [90–99]. The
potential of each direction is dictated by the role of noise played in the
corresponding dynamical systems, which we will explain here.
In chaos-based secure communications, the most extensively studied is
the scheme exploiting synchronization of chaos in two similar and coupled
nonlinear systems [100–111]. The unpredictable behavior of chaos provides a
means of security since chaotic signals are hard to decode by a third party
(called an eavesdropper). The chaotic signal is used as a carrier to mask a
message in the time or frequency domain. The synchronization of a chaotic
receiver with the chaotic emitter is then used to retrieve the message. In
mathematical notation,
• an emitter generates a chaotic signal x(t),
Here, x and y are both vectors, x′ = F(x) is a chaotic system, and α is the
parameter that couples the systems x and y. An invariant subspace of the
coupled system is given by x(t) = y(t). If this subspace is locally attractive, then
the two systems can synchronize perfectly. The role of α > 0 is to suppress the
divergence between the x and the y systems: in general, the larger the α, the
easier the synchronization. To find the critical α, let us focus on v = x − y.
Assuming v to be small, we can use a Taylor series expansion. Further assuming
that higher order nonlinearities can be neglected, we obtain a linear differential
equation

v′ = DF(x(t))v − 2αv. (59)

Here, DF(x(t)) is the Jacobian of the vector field along the solution. When
α = 0, we have

u′ = DF(x(t))u, (60)
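This picture is easy to check numerically with two bidirectionally coupled Lorenz systems (a sketch using Euler integration; the coupling strength, step size, and initial conditions are illustrative):

```python
import numpy as np

def lorenz(s, sigma=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), r * x - y - x * z, x * y - b * z])

def mean_sync_error(alpha, dt=0.005, steps=20000, tail=2000):
    """Evolve x' = F(x) + alpha (y - x) and y' = F(y) + alpha (x - y), so
    that v = x - y obeys v' ~ DF(x) v - 2 alpha v; return the mean
    ||x - y|| over the last `tail` steps (simple Euler integration)."""
    x = np.array([1.0, 1.0, 1.0])
    y = np.array([-3.0, 2.0, 25.0])
    errs = []
    for k in range(steps):
        x, y = (x + dt * (lorenz(x) + alpha * (y - x)),
                y + dt * (lorenz(y) + alpha * (x - y)))
        if k >= steps - tail:
            errs.append(np.linalg.norm(x - y))
    return np.mean(errs)

print(mean_sync_error(0.0))   # uncoupled: distance stays of order the attractor size
print(mean_sync_error(5.0))   # 2*alpha exceeds the largest LE: distance -> ~0
```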
Figure 16. Clean (open triangles) and noisy (filled circles) trajectories for (a) µ = 3.55 and
(b) µ = 3.74 (reproduced from [118]).
Figure 17. Time-dependent exponent curves for the noisy Logistic map: (a) µ = 3.55 and
σ = 0.01; (b) µ = 3.63 and σ = 0.005; (c) µ = 3.74 and σ = 0.002; and (d) µ = 3.83 and σ =
0.005. Six curves, from the bottom up, correspond to shells (2−(i+1)/2, 2−i/2) with i = 7, 8,
9, 10, 11, and 12 (reproduced from [118]).
Figure 18. Logarithmic displacement curves illustrating the mechanism for noise-induced
chaos. Each group actually consists of three curves, corresponding to shells (2^{−(i+1)/2},
2^{−i/2}) with i = 12, 13, and 14. They basically collapse on each other. The parameters for
the four groups are (a) µ = 3.74 and σ = 0.0003; (b) µ = 3.83 and σ = 0.001; (c) µ = 3.63
and σ = 0.0003; and (d) µ = 3.55 and σ = 0.0005. To separate these four groups (a–d) of
curves from each other, they are shifted by 2, 1, −0.5, and −0.2 units, respectively, where
a positive number indicates shifting upward, and a negative number indicates shifting
downward. All four groups of curves are well modeled by ln k^α with α = 1.5, 1.0, ···
(reproduced from [118]).
understand the differences in the mathematics of the two theories, and fail to
fully appreciate important issues such as distinguishing chaos from noise—a
newcomer tackling the issue would think it sufficient to distinguish chaos from
simple white noise. Unfortunately, this is not the case. Only if we can
distinguish chaos from all known models of random processes can we say we
can distinguish chaos from noise.
Below, we first discuss the basics of fractal theory, then we focus on
random fractal theory. We will resume discussion of distinguishing chaos from
noise in Section 2.5.
where N(e) is the number of intervals walked. Amazingly, N(e) scales with e as
a power law, just as described by Equation (23), where 1 < D < 2 is not an
integer. Such a nonintegral D is the celebrated fractal dimension of the hiking
trail.
What is the meaning of a nonintegral D? To find the answer, we start from
the measurement of certain length, area, or volume. The basics of calculus
teach us that we can measure a curve, a surface area, or a volume using very
small intervals, squares, or cubes by properly covering the object we are
interested in measuring. Take the unit length, unit area, or unit volume as the
unit of measurement, with linear size e. Now suppose we measure the length of
a straight line with length 1. What is the minimal number of intervals of length
e needed to fully cover this unit length? We need at least N(e) ∼ e⁻¹ intervals.
Extending to 2D and 3D, when covering an area or volume by boxes with linear
length e, we need at least N(e) ∼ e⁻² squares to cover the area, and N(e) ∼ e⁻³
cubes to cover the volume. The exponent D here is called the topological
dimension: when we talk about 0 − D (e.g., M isolated points, for which the
scaling is simply N(e) ∼ M), 1 − D, 2 − D, and 3 − D objects, we are talking
about their topological dimensions. We can now discuss the consequence of
1 < D < 2 for an irregular mountain trail.
L = e N(e) ∼ e^{1−D}. (67)

Since 1 − D < 0, a smaller step size e yields a longer measured length; for a
suitable ratio of the two step sizes,

L_tortoise = 2 L_hare. (68)
The tortoise has to crawl twice the distance that the hare runs! Based on
this simple calculation, we now understand when we walk along a wild trail, get
tired, slow down, we are actually shrinking our step sizes, so we will be walking
out a longer trail!
Next, we consider the Cantor set, one of the prototypical fractal objects,
so that we can appreciate the concept of fractal dimension better.
The standard Cantor set is obtained by first partitioning a line segment
into three equal parts and deleting the middle one. This step is then repeated,
deleting the middle third of each remaining segment iteratively. See Figure 20a.
Note that such a process can be related to the iteration of a nonlinear discrete
map. The removed middle thirds can be related to the intervals that make the
map diverge to infinity, while the remaining structures are linked to the
invariant points of the map. At the limiting stage, n → ∞, the Cantor set
consists of infinitely many isolated points. Consistent with isolated point(s)
having dimension 0, the topological dimension here is 0. The length of the total
segments removed is

1/3 + 2/9 + 4/27 + ··· = (1/3) ∑_{n=0}^{∞} (2/3)ⁿ = 1. (69)

Therefore, the entire unit interval has been removed! Is the fractal
dimension here the same as the topological dimension, which is 0?
To find out, let us focus on stage n. One needs N(e) = 2ⁿ boxes of length
e = (1/3)ⁿ to cover the set. Hence, the fractal dimension for the Cantor set is

D = −ln N(e)/ln e = ln 2/ln 3 ≈ 0.63. (70)
It is not zero!
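The calculation is easy to verify numerically (a sketch; stage n = 10 is an arbitrary choice):

```python
import math

def cantor_intervals(n):
    """Intervals remaining at stage n of the standard Cantor set."""
    intervals = [(0.0, 1.0)]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3.0
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

n = 10
e = 3.0 ** (-n)                                  # box size at stage n
boxes = {round(a / e) for a, _ in cantor_intervals(n)}
D = -math.log(len(boxes)) / math.log(e)
print(len(boxes), D)     # 2^10 = 1024 boxes; D = ln 2 / ln 3 ~ 0.631
```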
Figure 20. Standard Cantor set (a) and its variants (b–d). See the text for details.
of fractal forms in living things. This more space-filling property of fractals has
also been exploited to design fractal antennas by maximizing the effective
length or perimeter of the material that receives or transmits electromagnetic
radiation. Fractal antennas are excellent for wideband and multiband
applications [123].
2.4.2. Overview of Random Fractal Theory
Gaussian white noise is the most extensively studied noise in engineering.
In complex systems, however, the temporal or spatial fluctuations often cannot
be modeled by Gaussian white noise. Rather, they are characterized by a power
law decaying spectrum in the frequency domain, denoted as 1/fα noise [38]. Its
dimensionality cannot be reduced by popular methods such as principal
component analysis [124]. Interesting examples of such processes include
genetics [125–129], human cognition [130] and coordination [131], posture
[132], vision [133,134], physiological signals [80,135–143], neuronal firing
[144,145], urban traffic [146], tree rings [147], global terrorism [148], human
response to natural and social phenomena [149], foreign exchange rate [76],
and the distribution of prime numbers [150].
Consider a stationary stochastic process X = {Xt, t = 1, 2, ···}. Its mean is µ,
variance is σ², and its autocorrelation function r(k), k ≥ 0, has the following form:

r(k) = (1/2)[(k + 1)^{2H} − 2k^{2H} + |k − 1|^{2H}], (71)

where H is a parameter called the Hurst exponent. It is in the unit interval, 0 <
H < 1. The exponent α for the spectra of the process, 1/f^α, is related to H by a
simple equation,
α = 2H − 1. (72)
When 0 < H < 1/2, the process is said to have anti-persistent correlations; when
H = 1/2, the process is memoryless or only has short memory, when 1/2 < H <
1, the process is said to have persistent long-range correlations. In this case, it is
easy to prove ∑k r(k) = ∞. This is why the process is said to have long-range
correlation [38].
Let us now smooth the process X to obtain a time series X^(m) = {Xt^(m), t ≥ 1},
m = 1, 2, 3, . . . , where

Xt^(m) = (X_{tm−m+1} + ··· + X_{tm})/m, t ≥ 1. (73)

One can also construct a random walk process from X,

yk = ∑_{i=1}^{k} (Xi − X̄), (76)

where X̄ is the mean of X. The Hurst exponent can then be estimated from the
scaling law

⟨|y(i + m) − y(i)|²⟩ ∼ m^{2H}, (78)
where the angle brackets denote averaging over i. Equation (78) is often called
fluctuation analysis (FA). The superiority of Equation (78) over Equation (74) is
that it can be readily generalized to a multifractal formulation.
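Equation (78) translates directly into code (a sketch; the scales and the white-noise test signal, for which H = 1/2, are illustrative):

```python
import numpy as np

def fa_hurst(x, scales):
    """Estimate H via fluctuation analysis, Equation (78):
    <|y(i+m) - y(i)|^2> ~ m^{2H}, with y the walk built from the
    centered series."""
    y = np.cumsum(x - np.mean(x))
    F = [np.mean((y[m:] - y[:-m]) ** 2) for m in scales]
    return np.polyfit(np.log(scales), np.log(F), 1)[0] / 2   # slope = 2H

rng = np.random.default_rng(2)
scales = np.unique(np.logspace(0.5, 3, 12).astype(int))
H = fa_hurst(rng.normal(size=20000), scales)
print(H)    # close to 0.5 for white noise
```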
and covariance:

E[BH(t)BH(s)] = (1/2)(t^{2H} + s^{2H} − |t − s|^{2H}), (80)

where H is the Hurst parameter. The increment process of the fBm,
Xi = BH((i + 1)∆t) − BH(i∆t), i ≥ 1, has the autocorrelation function:

γ(k) = E(Xi Xi+k) = (∆t^{2H}/2)[(k + 1)^{2H} − 2k^{2H} + |k − 1|^{2H}], k ≥ 0. (81)
where q is real-valued. The average is taken over all possible pairs of (y(i + m),
y(i)). Note that q > 0 highlights large absolute values of y(i + m) − y(i), while
q < 0 highlights small absolute values of y(i + m) − y(i) (to understand this
better, it is beneficial to consider q = 10 with maxi |y(i + m) − y(i)| = 100, and
q = −10 with mini |y(i + m) − y(i)|).
Consider the partition sum ∑_{i=1}^{N(e)} µ_i^q, where N(e) is the minimal
number of boxes of linear size e that are used to cover the support of the
measure µ. The spectrum of the generalized dimensions Dq is defined by

Dq = lim_{e→0} (1/(q − 1)) ln(∑_{i=1}^{N(e)} µ_i^q)/ln e, (84)
Comparing with our discussions on the Dq spectrum for chaotic systems, we
readily see that D0 is the capacity (or box-counting) dimension, and D1 is the
information dimension. Just as the H(q) spectrum, D(q) is a non-increasing
function of q. When D(q) is constant in q, the measure is called monofractal;
otherwise, it is called multifractal.
There is another interesting way to characterize the properties of the
measure. It is by the singular spectrum f(α), where α is called the pointwise
dimension. The basic equation connecting the two characterizations is the
Legendre transform,
q = df(α)/dα, τ(q) = qα − f(α). (85)
Dq = τ(q)/(q − 1). (86)
We thus see that Dq and f(α) provide the same amount of information.
The Random Cascade Model
At each cascade step, an interval is divided into two, receiving fractions r_{ij}
and 1 − r_{ij} of the parent's weight, where r_{ij} and 1 − r_{ij} both have
marginal distribution P(r). The weights at stage N, {w_i, i = 1, ···, 2^N}, can be
expressed as

w = u_1 u_2 ··· u_N, (87)

where u_l, l = 1, ···, N, are either r_{ij} or 1 − r_{ij}. Thus, {u_i, i ≥ 1} are IID
random variables.
The cascade model has many interesting properties. We list a few here:
• The weights at stage N are log-normally distributed. To see this, one can
take logarithm on both sides of Equation (87), then the multiplication
becomes summation, and one can use the central limit theorem.
• We can readily derive that

H(q) = −ln µq/(q ln 2), where µq = E(r^q), (89)

and

τ(q) = qH(q) − 1. (90)
We now illustrate Equations (88) and (89) using an example, the random
binomial model, whose P(r) is
P(r) = [δ(r − p) + δ(r − (1 − p))]/2 (91)
and, consequently,

H(q) = −(1/q) log2[(p^q + (1 − p)^q)/2]. (93)

Clearly, H(q) is a non-increasing function of q. Without loss of generality, we may
take p ≤ 1/2. When q → ∞, H converges to its tight lower bound of
−ln(1 − p)/ln 2 < 1; when q → −∞, H converges to its tight upper bound of
−ln p/ln 2 > 1.
In the cascade model, many different functional shapes for P(r) can be
used [163,164], and the model can simulate a random function with very high
accuracy. Two examples are shown in Figure 22 for sea-clutter amplitude data
[38,166]. The model can also be readily generalized to the high-dimensional
case. The case for 2D is shown in Figure 23.
Figure 22. Sea clutter amplitude data: (a) is the original data without target, (b) is the
original data with a primary target, and (c,d) are the modeled data.
Figure 23. Schematic of the cascade model in 2D: a weight wi is split among four
children with weights wi r1, wi r2, wi r3, and wi r4.
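The cascade construction and the scaling implied by Equations (89) and (90) can be checked with a short simulation (a sketch; the stage N = 14, p = 0.3, and q = 2 are illustrative, and the ordering of child intervals is ignored since only the weight statistics matter here):

```python
import numpy as np

rng = np.random.default_rng(3)

def binomial_cascade(N, p=0.3):
    """Stage-N weights of the random binomial cascade, Equation (91):
    each interval passes fractions r and 1 - r of its weight to its two
    children, where r = p or 1 - p with probability 1/2 each."""
    w = np.array([1.0])
    for _ in range(N):
        r = np.where(rng.random(len(w)) < 0.5, p, 1 - p)
        w = np.concatenate([w * r, w * (1 - r)])   # child ordering ignored
    return w

N, p, q = 14, 0.3, 2.0
w = binomial_cascade(N, p)
tau_emp = -np.log2(np.sum(w ** q)) / N             # empirical tau(q)
H = -np.log2((p ** q + (1 - p) ** q) / 2) / q      # H(q), Equation (93)
print(w.sum(), tau_emp, q * H - 1)   # mass conserved; tau(q) matches qH(q) - 1
```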
2.5. Going from Distinguishing Chaos from Noise to Fully Understanding the
System Dynamics
A long-standing problem in time series analysis, which is still of interest
today, is to distinguish chaos from noise. This problem naturally arises when
one wishes to understand whether certain complex behaviors in physics,
finance, life sciences, ecology, and other fields, are of deterministic origin, or
genuinely random. An unambiguous answer to the question can greatly help
one to choose a proper model to study the behavior one wishes to understand.
For a long time, however, when one computes a nonintegral fractal dimension,
or a positive largest Lyapunov exponent, or a finite Kolmogorov entropy from a
time series, one would think the time series is chaotic. In many applications,
many researchers are still assuming so! Is this a sound assumption?
Unfortunately, it is not. As one can expect, the most convincing counter-example
would be one in which a genuinely random time series is interpreted
as deterministic chaos under this assumption. It turns out that all 1/f^α random
processes can be proven to have non-integral fractal dimensions of 1/H [38],
and finite Kolmogorov entropies [167,168], and thus may be misclassified as
chaos. Because of this problem, it is desirable that whenever one studies chaos
in observational data, one explicitly tests whether the data truly have the
signature of chaos, the exponential divergence. In Section 4, we will discuss the
scale-dependent Lyapunov exponent (SDLE), which generalizes the notion of
the Lyapunov exponent. We will see there that SDLE can readily solve the
problem.
Nowadays, efforts are still being made to develop innovative methods to
distinguish chaos from noise. In our view, it is more important to find the
defining parameters of the complex time series that one studies. In particular,
one has to ask: If the time series is truly chaotic, what is the exponential growth
rate? If the time series is random, what type of randomness is it? Only if we can
unambiguously answer these fundamental questions can we truly understand
the system under study. Clearly, this is more than simply trying to distinguish
chaos from noise. In doing so, one will find that chaos and random fractals may
both play significant roles in one’s problem: chaos and random fractal may be
manifested on different scales. This is the essence of multiscale phenomena:
signals may exhibit different quantifiable features on different scales.
Therefore, to best characterize a complex system, we need to use a number of
tools synergistically. With this rationale, another fundamental question arises:
what are the relations among the different complexity measures?
We have already introduced a number of different complexity measures.
Recall that the sample entropy, instead of taking the limits lim_{e→0} and
lim_{m→∞}, uses a fixed scale e and two small embedding dimensions (say m
and m + 1). While this is computationally convenient, it is limited in resolving
the scaling behavior; this is part of the reason that multiscale entropy is built
on top of the sample entropy. Another widely used measure is the permutation
entropy (PE). We start from the embedding vectors Xi = [x(i), x(i + L), ···,
x(i + (m − 1)L)]. Let us sort the elements of the vector in ascending order,
x(i + (j1 − 1)L) ≤ x(i + (j2 − 1)L) ≤ ··· ≤ x(i + (jm − 1)L). When an equality
occurs, e.g., x(i + (j1 − 1)L) = x(i + (j2 − 1)L), we choose the natural order, i.e.,
if j1 < j2, then x(i + (j1 − 1)L) ≤ x(i + (j2 − 1)L). This way, the vector Xi is
mapped onto a permutation (j1, j2, ···, jm) of (1, 2, ···, m). Each permutation
can be considered as an m-dimensional symbol, so that there are a total of
K ≤ m! distinct symbols. The PE, denoted by Ep, for the time series {x(i)} is then

Ep = −∑_{j=1}^{K} Pj ln Pj, (94)

where Pj is the frequency of occurrence of the jth permutation.
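Equation (94) can be implemented in a few lines (a sketch; m = 3 and the test signals, white noise versus the fully chaotic logistic map, are illustrative):

```python
from collections import Counter
import numpy as np

def permutation_entropy(x, m=3, L=1):
    """Permutation entropy, Equation (94): map each embedding vector to
    the permutation that sorts it (ties broken by natural order, which a
    stable argsort provides), then take the Shannon entropy of the
    permutation frequencies."""
    n = len(x) - (m - 1) * L
    counts = Counter(
        tuple(np.argsort(x[i : i + m * L : L], kind="stable")) for i in range(n)
    )
    P = np.array(list(counts.values()), dtype=float) / n
    return -np.sum(P * np.log(P))

rng = np.random.default_rng(4)
pe_noise = permutation_entropy(rng.normal(size=5000))
x = [0.37]
for _ in range(5000):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))      # fully chaotic logistic map
pe_chaos = permutation_entropy(np.array(x))
print(pe_noise, pe_chaos)    # noise is near ln 3! ~ 1.79; the map is lower
```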
(
The two weights, w, can be written as 1
) =
− dj/n , j 1, 2, where dj are the distances of the point from the centers of the
two fitted polynomials. Therefore, the weights decrease linearly with the
distance from the center of the segment.
The weighting ensures symmetry. The scheme ensures that the overall fitted
curve is continuous everywhere, has the right- or left-derivatives at the
boundary, and is differentiable at non-boundary points.
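A rough sketch of the overlapped fitting scheme just described (the stitching details, the window half-length n, and the noisy-sine test are illustrative assumptions; see the original references for the exact formulation):

```python
import numpy as np

def adaptive_trend(x, n, order=2):
    """Fit order-`order` polynomials to windows of length 2n + 1 that
    overlap by n + 1 points, then stitch neighboring fits in each overlap
    with weights w_j = 1 - d_j/n, where d_j is the distance from the
    center of each fitted segment. Assumes len(x) = k*n + 1."""
    N = len(x)
    assert (N - 1) % n == 0 and N >= 2 * n + 1
    trend = np.zeros(N)
    t = np.arange(2 * n + 1)
    prev = None
    for s in range(0, N - 2 * n, n):
        fit = np.polyval(np.polyfit(t, x[s : s + 2 * n + 1], order), t)
        if prev is None:
            trend[: n + 1] = fit[: n + 1]            # first half window
        else:
            w1 = 1 - np.arange(n + 1) / n            # weight of the left fit
            trend[s : s + n + 1] = w1 * prev[n:] + (1 - w1) * fit[: n + 1]
        prev = fit
    trend[-n:] = prev[n + 1 :]                       # last half window
    return trend

# denoise a noisy sine and compare with the clean signal
rng = np.random.default_rng(6)
tt = np.linspace(0, 4 * np.pi, 2001)
clean, noisy = np.sin(tt), np.sin(tt) + 0.3 * rng.normal(size=2001)
err = np.sqrt(np.mean((adaptive_trend(noisy, n=50) - clean) ** 2))
print(err)    # well below the noise level of 0.3
```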
The adaptive filter can readily determine any kind of trend from the data.
An example for determining the trend from the global annual sea surface
temperature (SST) data is shown in Figure 24a, where the blue straight line is
the global linear fit, the black curve is the global second-order polynomial fit,
and the red curve is the adaptive trend with a window size about the half of the
total data length. It is amazing that with such a large window size, not only the
global warming trend but also the local brief cooling periods are clearly shown.
In fact, the residual noise (i.e., the difference between the fitting and the
original data) shown in Figure 24b with these fits is comparable to that
obtained by empirical mode decomposition (EMD) [187]. Since EMD involves
dyadic decomposition while the window size used by the adaptive method can
be varied continuously, the adaptive filtering is more flexible and can be
more accurate.
Figure 24. Annual temperature data: (a) the original data together with the global linear
trend and the adaptive trends with window sizes 61 and 121; (b) the residual data.
Figure 25. Denoising of the chaotic Lorenz signal: (a) phase diagrams constructed from the clean and the noisy
signal, which are marked as green and red, respectively; (b) the filtered signal obtained by a chaos-based approach; (c)
the filtered
signal obtained by the adaptive algorithm; and (d) the filtered signal obtained by a wavelet method.
In order to see the details of xdetrended, Figure 26b shows a small segment of
it as the blue curve. We observe a diurnal cycle in the data. This is reasonable
since electricity consumption in daytime and during night has to be quite
different. The signal does not have a fixed amplitude though, as it is still quite
noisy. This noise, which is high frequency, can also be removed by applying the
adaptive filter again, with a small window size. The trend thus determined will
better represent the diurnal cycle. It is a band-pass signal. An example of this
signal is shown in Figure 26b as the red curve, where we used a window size of
9 and a polynomial of order 2. From this signal, we can construct a phase
diagram with delayed coordinates. This is shown in Figure 26c. A limit cycle-like
structure does emerge.
We can further analyze the oscillatory feature of the trend signal by
computing the power spectral density (PSD) from the data. The result is shown
in Figure 26d, where the blue, red, and green curves are for the raw, detrended,
Appl. Sci. 2021, 11, 5736 48 of 67
and the band-passed signals, respectively. The PSD curves show very sharp
spectral peaks at the frequency of 1 day⁻¹ and its harmonics. Note that the blue
curve is basically covered by the other two colors, except at very low frequencies
(i.e., close to 0 Hz). This is due to the red trend signal shown in Figure 26a.
Figure 26. Electricity consumption analysis: (a) raw data (blue) and the trend signal (red);
(b) enlargement of the high-frequency load data showing the diurnal cycle (blue) and its
filtered band-pass data (red); (c) 2D phase diagrams constructed from the data shown in
(b); (d) PSD for the raw, detrended, and denoised data, which are marked by blue, red,
and green, respectively.
3.2. Adaptive Fractal Analysis (AFA)
In the past three decades, many efforts have been made to estimate H,
the most important parameter for random fractals. As a result, many excellent
methods for estimating H have been proposed. Among them is the celebrated
detrended fluctuation analysis
(DFA) [151,152]. It works as follows: To analyze a time series, x1, x2, x3, ···, one
first determines its mean x̄, then constructs a random walk process using
Equation (76). By doing so, one has assumed that the data are like a noise process.
One then partitions the random walk into non-overlapping segments of length l
(therefore, the number of distinct segments is not larger than N/l, where N is
the length of the time series). One further determines the local trend in each
segment by using the best linear or polynomial fitting. This procedure is
schematically shown in Figure 27, where a short EEG signal is used as an
example. Finally, one obtains the difference between the original “walk” and
the local trend. Denote it by u(n). H is then estimated by
Fd(l) = ⟨(1/l) ∑_{i=1}^{l} u(i)²⟩^{1/2} ∼ l^H, (98)

where the angle brackets are a short-hand notation for averages over all the
segments.
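The DFA procedure above can be sketched as follows (the scales and the white-noise test signal, for which H = 1/2, are illustrative):

```python
import numpy as np

def dfa_hurst(x, scales, order=1):
    """DFA, Equation (98): build the random walk, remove the best local
    polynomial trend in non-overlapping segments of length l, and read H
    off the scaling F_d(l) ~ l^H."""
    y = np.cumsum(x - np.mean(x))            # the "walk", Equation (76)
    F = []
    for l in scales:
        t = np.arange(l)
        res = []
        for k in range(len(y) // l):
            seg = y[k * l : (k + 1) * l]
            local = np.polyval(np.polyfit(t, seg, order), t)
            res.append(np.mean((seg - local) ** 2))   # u(i) = walk - trend
        F.append(np.sqrt(np.mean(res)))
    return np.polyfit(np.log(scales), np.log(F), 1)[0]

rng = np.random.default_rng(5)
H = dfa_hurst(rng.normal(size=2 ** 14), [2 ** k for k in range(4, 10)])
print(H)    # near 0.5 for white noise
```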
Figure 27. A short EEG signal x(n), used to illustrate the DFA procedure.
have discussed earlier, positive q values highlight large values in |u(i) − v(i)|,
and negative q values highlight small values in |u(i) − v(i)|. Equation (99) can
readily be extended to long-range cross-correlation analysis [188] between two
series: x(i), i = 1, ···, n and y(i), i = 1, ···, n. Denote their trend signals
corresponding to window size w by trend_x^(w)(i), i = 1, ···, n and
trend_y^(w)(i), i = 1, ···, n, respectively. Then we have
F(w) = [(1/n) ∑_{i=1}^{n} (x(i) − trend_x^(w)(i)) (y(i) − trend_y^(w)(i))]^{1/2}.
Figure 28. AFA of power load data: (a) an example of log2 F(w) vs. log2 w for the load
data of an arbitrarily chosen day, (b) temporal variation of the Hurst parameter (red)
and the rescaled temperature (black).
The scale-dependent Lyapunov exponent (SDLE) is defined through the evolution
of the distance e_t between two nearby trajectories (see Figure 29):

λ(e_t) = (ln e_{t+∆t} − ln e_t)/∆t. (104)

Equivalently, we can write

de_t/dt = λ(e_t) e_t. (105)
Figure 29. A schematic showing how a small distance between two nearby trajectories
grows with time.
Now that we have introduced SDLE, we can better understand the classic
algorithm for computing the largest Lyapunov exponent λ1 discussed earlier
[75]. That algorithm assumes e_t ∼ e_0 e^{λ1 t} and then, through averaging,
estimates λ1 by (ln e_t − ln e_0)/t. This assumption may not even hold for true
chaotic signals. This is illustrated by the detail of the schematic plot shown in
Figure 29: e_{t+δt}
may be smaller than et. As already mentioned, a fundamental difficulty with this
assumption is that for any type of noise, when e0 is small (which is the case
when nearest neighbors are used), λ1 can always be positive, leading to
misinterpreting noise as chaos. The reason is simple: et will rapidly converge to
the most probable distance between the constructed vectors, and thus will
almost be surely larger than e0. However, when we define SDLE using Equation
(103), we have not made any assumptions, except ∆t being small (usually taken
to be the sampling time interval). As we will see, chaos is characterized by a
constant λ(e) over a range of e.
In the computation of SDLE, we first examine which embedding vectors
defined by Equation (32) fall within the series of shells defined by Equation (40).
Then, the evolution of those vector pairs (Vi, Vj) can be monitored, and their
average behavior of divergence (not necessarily exponential) can be computed.
So far as exponential or power law divergence are concerned, we can exchange
the order of taking the logarithm and averaging. Then,
Equation (104) becomes
λ(e_t) = ⟨ln ||V_{i+t+∆t} − V_{j+t+∆t}|| − ln ||V_{i+t} − V_{j+t}||⟩/∆t, (106)
where t and ∆t are measured in terms of the sampling time, and the average,
denoted by the angle brackets, is over all indices i, j with their corresponding
vectors satisfying Equation (40).
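The pair-tracking computation of Equation (106) can be sketched as follows (a simplified illustration, not the program of [202]; the logistic-map data, the shell, and the de-correlation window w are illustrative choices):

```python
import numpy as np

def embed(x, m, L):
    n = len(x) - (m - 1) * L
    return np.column_stack([x[i * L : i * L + n] for i in range(m)])

def log_error_growth(x, m=2, L=1, shell=(1e-3, 2e-3), w=20, t_max=8):
    """Average log-distance curve for embedding-vector pairs that start
    inside the shell, with |i - j| > w to exclude tangential pairs."""
    V = embed(x, m, L)
    n = len(V) - t_max
    pairs = []
    for i in range(0, n, 3):                     # subsample i for speed
        d = np.abs(V[i + w : n] - V[i]).max(axis=1)
        for j in np.nonzero((d >= shell[0]) & (d < shell[1]))[0]:
            pairs.append((i, i + w + j))
    curve = []
    for t in range(t_max):
        logs = [np.log(np.abs(V[i + t] - V[j + t]).max()) for i, j in pairs]
        curve.append(np.mean(logs))
    return np.array(curve)

# fully chaotic logistic map: the slope of the curve should approach ln 2
x = [0.37]
for _ in range(4000):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))
curve = log_error_growth(np.array(x))
print(np.diff(curve))    # increments close to ln 2 ~ 0.69 before saturation
```

The slope of the curve over the range where it is linear estimates λ(e_t); scanning a series of initial shells traces out λ(e) as a function of scale.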
The program for computing SDLE is explained in detail in [202], and can be
obtained from the authors. The major scaling laws of SDLE that are most
relevant for analyzing complex data are summarized below [189]:
• For deterministic chaos,
λ(e) ∼ constant, (107)
• For noisy dynamics (e.g., noisy chaos), on small scales,

λ(e) ∼ −γ ln e, (108)

where γ > 0 controls the speed of information loss.

• For 1/f^{2H+1} processes,

λ(e) ∼ He^{−1/H}. (109)

• For Lévy flights characterized by an index α,

λ(e) ∼ (1/α) e^{−α}. (110)
• For stochastic oscillations, both scaling laws λ(e) ∼ −γ ln e and λ(e) ∼
He−1/H can be observed when different embedding parameters are used.
• When the dynamics of a system are very complicated, one or more of the
above scaling laws may manifest themselves on different e ranges.
It is now clear that with the help of these scaling laws, distinguishing chaos
from noise can be readily solved. More importantly, we can now understand
very well the nature of each type of behavior of the data by obtaining the
defining parameters for that behavior.
To illustrate how SDLE characterizes chaotic features and the effect of
noise, let us briefly discuss Boolean chaos in a ring oscillator. Boolean chaos
normally refers to the continuous time dynamics of a system of interconnected
digital gates whose output updates are not regulated by an external clock.
Recently, an alternative Boolean architecture for generating chaotic oscillations
was proposed by Blakely et al. [204]. See Figure 30. Three typical kinds of
waveforms for the variable v3 are shown in Figure 31. The chaotic behaviors of
the oscillations can be aptly characterized by SDLE, as shown in Figure 32—the
Figure actually has shown more than chaos: the chaotic behavior is best defined
for the variable v1, and the effect of noise is most clearly visible for the variable
v3. The reason is straightforward: in this series circuit, the noise at the third gate
is the largest.
Among the many properties of SDLE, two make it unique. One is its ability to
deal with nonstationarity, including detecting intermittent chaos from
models as well as observational data [155,195]. To understand intermittency, it
is useful to consider the evolution of river flow dynamics over 1 year. With
some thinking, one can readily realize that the time period may be divided into
two periods, wet and dry, where the wet season may be associated with
frequent rain and snow melting, and the dry season may be largely associated with
no or little rain, and constant evaporation. The river flow dynamics must be
very different in these two periods. Since standard methods for detecting chaos
assume the existence of a single chaotic attractor, those methods are ill
positioned to unambiguously determine whether river flow dynamics are
chaotic or not. To illustrate how intermittent chaos can be detected by SDLE,
Figure 33 shows an example of the Umpqua river in Oregon. The exponential
divergence is evident from the linear ln e(t) vs. t curve for t going from
about 20 days to about 100–150 days. Consequently, there are well-defined
plateaus of SDLE, i.e., a constant SDLE, shown in Figure 33a2 (the blue curves).
It is also interesting to note the scaling law of Equation (108) on small scales.
This is caused by the faster-than-exponential growth of small distances in the
initial period (less than 20 days), and it is mainly due to stochasticity, i.e.,
randomly driven by snow melting, rain, etc., besides measurement noise. The
chaotic and the noisy dynamics depicted in Figure 33 can be improved by using
the adaptive algorithm discussed earlier. The results using the filtered data are
shown in Figure 33 as the red curves.
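The quantity being computed can be made concrete with a minimal SDLE-style sketch (not the authors' exact algorithm): delay-embed the series, select pairs of points whose initial separation falls in a chosen shell, and track the ensemble-averaged log-distance as the pairs evolve; the local slope of that curve approximates λ(ε). The embedding parameters, shell, and demo series below are illustrative.

```python
import math

def embed(x, m, L):
    # Delay-embed a scalar series into m-dimensional vectors with delay L.
    n = len(x) - (m - 1) * L
    return [tuple(x[i + j * L] for j in range(m)) for i in range(n)]

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def sdle_curve(x, m=4, L=1, shell=(0.01, 0.1), t_max=10, min_sep=20):
    # Average ln(distance) over pairs whose initial separation lies in the
    # shell; its local slope in t approximates lambda(eps) at that scale.
    vecs = embed(x, m, L)
    last = len(vecs) - t_max
    pairs = [(i, j) for i in range(last) for j in range(i + min_sep, last)
             if shell[0] <= dist(vecs[i], vecs[j]) <= shell[1]]
    curve = [sum(math.log(dist(vecs[i + t], vecs[j + t])) for i, j in pairs)
             / len(pairs) for t in range(t_max)]
    lam = [curve[t + 1] - curve[t] for t in range(t_max - 1)]  # slope per step
    return curve, lam

# Demo on the fully chaotic logistic map, x_{n+1} = 4 x_n (1 - x_n).
x = [0.3]
for _ in range(500):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))
curve, lam = sdle_curve(x)
```

For a chaotic series, the averaged log-distance rises roughly linearly before saturating at the attractor size, which is the signature read off from curves like Figure 33.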
Figure 30. (a) A three-inverter ring oscillator. (b) A ring oscillator driven by an external
periodic signal. The resistor-capacitor stages may represent either discrete components
or the finite bandwidth of non-ideal inverters.
Figure 32. SDLE calculated from the experimental time series of v1 (blue), v2 (green), and v3 (red).
[Figure 33: panel (a1) plots ln ε(t) vs. t (day); panel (a2) plots λ(ε) (per day) vs. ln ε; each panel shows curves for the original and the denoised data.]
Figure 33. Intermittent chaos in the Umpqua river. Shown in (a1,a2) are the error growth curves and SDLE
curves, respectively. The blue curves are for the original data, while the red curves are for the filtered data. The
embedding parameters used in the computation are m = 6, L = 3. Three different shells specified by Equation
(40) are used. These curves collapse on each other, except when t is small. This highlights that the computational
results are essentially independent of the initial shells chosen.
Figure 34. Epileptic seizure detection from continuous EEG data of a patient, illustrating
that SDLE can serve as a basis to unify commonly used complexity measures. Shown in
the figure are the temporal variations of (a) λsmall−e, (b) λlarge−e, (c) the LE, (d) the K2
entropy, (e) the D2, and (f) the Hurst parameter. Seizure occurrence times were
determined by clinical experts and are indicated by the vertical dashed lines.
H(t) that occurred around 1987. This suggests that the cold war between the
USA and the former Soviet Union also greatly strained the US. In the case of
China, the local maxima and minima of the H(t) curve correspond well to changes
of national leaders (concretely, one local maximum is at 1997, when
DENG Xiaoping died and JIANG Zemin took over the leadership; two local
minima are at 2002 and 2012, when HU Jintao took over power from JIANG,
and when XI Jinping took over power from HU, respectively). This is also
observed for many other countries. In general, H(t) increases when the policies
of a country are strengthened and sustained, and decreases when internal or
external conditions change such that many policies have to be modified or
replaced by new ones. Therefore, the temporal variation of H(t) parsimoniously
and accurately summarizes the evolution of the political processes (and hence
the history) of a country.
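The Hurst parameter underlying H(t) can be estimated in sliding windows over an event series. Below is a minimal rescaled-range (R/S) sketch; the paper may well use a different estimator (e.g., detrended fluctuation analysis), and the window sizes and step are illustrative.

```python
import math
import random

def hurst_rs(x):
    # Rescaled-range (R/S) estimate of the Hurst exponent of series x:
    # the slope of log(R/S) vs. log(window size).
    n = len(x)
    pts = []
    for s in (n // 8, n // 4, n // 2, n):
        rs_vals = []
        for start in range(0, n - s + 1, s):
            seg = x[start:start + s]
            mean = sum(seg) / s
            dev = [v - mean for v in seg]
            cum, acc = [], 0.0
            for d in dev:
                acc += d
                cum.append(acc)
            rng = max(cum) - min(cum)  # range of the cumulative deviations
            sd = math.sqrt(sum(d * d for d in dev) / s)
            if sd > 0:
                rs_vals.append(rng / sd)
        if rs_vals:
            pts.append((math.log(s), math.log(sum(rs_vals) / len(rs_vals))))
    mx = sum(px for px, _ in pts) / len(pts)
    my = sum(py for _, py in pts) / len(pts)
    return (sum((px - mx) * (py - my) for px, py in pts)
            / sum((px - mx) ** 2 for px, _ in pts))

def hurst_t(x, window=256, step=32):
    # H(t): the Hurst exponent estimated in sliding windows.
    return [hurst_rs(x[i:i + window]) for i in range(0, len(x) - window + 1, step)]

# Demo: white noise should give H near 0.5; a pure trend gives H near 1.
random.seed(7)
noise = [random.gauss(0.0, 1.0) for _ in range(1024)]
h_noise = hurst_rs(noise)
h_trend = hurst_rs([float(i) for i in range(1024)])
ht = hurst_t(noise)
```

Plotting `ht` against time yields an H(t)-style curve, whose local maxima and minima are the features discussed above.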
The above understanding has an important implication for overseas
infrastructure investment. This is a key issue that has to be considered
seriously by China in the implementation of the Belt and Road Initiative, and
by any other country that wishes to make infrastructure investments overseas.
A necessary condition for the smooth implementation of a project is that its
construction time be shorter than half of the average cycle of policy changes
in the target country. To understand this, consider the construction of
high-speed rail as an example. We can now understand why the Ankara-Istanbul
line, even though it took 11 years to construct, from 2003 to 2014, was
successfully completed: it fell within a long episode of increasing H(t). Such
long episodes are rare among the countries of the world, though. In contrast,
the H(t) curve for Indonesia varies with a much higher frequency. Indeed, there
is a strong anti-China sentiment in Indonesia, partly induced by the
construction of a high-speed rail there.
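The half-cycle condition can be made concrete with a toy calculation. The extremum years below are the ones mentioned for China in the text, used purely for illustration; estimating the true policy cycle requires the full H(t) curve.

```python
def average_policy_cycle(extrema_years):
    # A full cycle spans two successive extrema of the same kind, i.e.
    # roughly twice the mean spacing between successive extrema of H(t).
    gaps = [b - a for a, b in zip(extrema_years, extrema_years[1:])]
    return 2.0 * sum(gaps) / len(gaps)

def satisfies_half_cycle_condition(duration_years, extrema_years):
    # Necessary condition from the text: construction time shorter than
    # half the average cycle of policy changes in the target country.
    return duration_years < 0.5 * average_policy_cycle(extrema_years)

# H(t) extrema mentioned for China in the text: 1997, 2002, 2012.
extrema = [1997, 2002, 2012]
cycle = average_policy_cycle(extrema)  # 2 * (5 + 10) / 2 = 15 years
ok_11yr = satisfies_half_cycle_condition(11, extrema)  # 11 < 7.5 -> False
ok_5yr = satisfies_half_cycle_condition(5, extrema)    # 5 < 7.5 -> True
```

On this toy estimate, an 11-year project would violate the condition, while a 5-year project would satisfy it.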
[Figure 35: x-axis is Year, 1979–2018.]
Figure 35. Long-range correlations (or inertia) of political processes in four countries: (a)
USA, (b) China, (c) Turkey, and (d) Indonesia. The blue curve has a temporal resolution of
1 month, while the red one has a temporal resolution of 1 year.
References
1. Manyika, J.; Chui, M.; Brown, B.; Bughin, J.; Dobbs, R.; Roxburgh, C.; Byers, A.H. Big Data: The Next Frontier for
Innovation, Competition, and Productivity. 2011. Available online:
[Link]
(accessed on 17 June 2021).
2. Boyd, D.; Crawford, K. Six Provocations for Big Data; The Center for Open Science: Charlottesville, VA, USA, 2017.
[CrossRef]
3. Sakaki, T.; Okazaki, M.; Matsuo, Y. Earthquake shakes Twitter users: Real-time event detection by social sensors. In
Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010.
4. Ginsberg, J.; Mohebbi, M.H.; Patel, R.S.; Brammer, L.; Smolinski, M.S.; Brilliant, L. Detecting influenza epidemics using
search engine query data. Nature 2009, 457, 1012–1014. [CrossRef] [PubMed]
5. Butler, D. When Google got flu wrong. Nature 2013, 494, 155–156. [CrossRef] [PubMed]
6. Available online: [Link]
url=zP_UWpBFHUI5PYen8cvlzKsXUhprdWaw97tSQ3L7ffOjjUYCTfnq_
NMnxZG6IsKS5t0y85b2vMuIPa02atZFjStLmWoJMAFEvlfGlfvJ7zK#f-comment (accessed on 17 June 2021).
7. Dash, S.; Shakyawar, S.K.; Sharma, M.; Kaushik, S. Big data in healthcare: Management, analysis and future prospects.
J. Big Data 2019, 6, 1–25. [CrossRef]
8. Raghupathi, W.; Raghupathi, V. Big data analytics in healthcare: Promise and potential. Health Inf. Sci. Syst. 2014, 2, 1–
10. [CrossRef] [PubMed]
9. Reichman, O.J.; Jones, M.B.; Schildhauer, M.P. Challenges and opportunities of open data in ecology. Science 2011,
331, 703–705. [CrossRef]
10. O’Donovan, P.; Leahy, K.; Bruton, K.; O’Sullivan, D.T.J. Big data in manufacturing: A systematic mapping study. J. Big
Data 2015, 2, 1–22. [CrossRef]
11. Karakatsanis, L.P.; Pavlos, E.G.; Tsoulouhas, G.; Stamokostas, G.L.; Mosbruger, T.; Duke, J.L.; Pavlos, G.P.; Monos, D.S.
Spatial constrains and information content of sub-genomic regions of the human genome. Iscience 2021, 24, 102048.
[CrossRef]
12. Rosenhead, J.; Franco, L.A.; Grint, K.; Friedland, B. Complexity theory and leadership practice: A review, a critique, and
720.
50. Durant, W.; Durant, A. The Story of Civilization, Rousseau and Revolution; Simon & Schuster: New York, NY, USA, 1967;
p. 643.
51. Gao, J.B.; Hu, J.; Mao, X.; Zhou, M.; Gurbaxani, B.; Lin, J.W.-B. Entropies of negative incomes, Pareto-distributed loss,
and financial crises. PLoS ONE 2011, 6, e25053. [CrossRef]
52. Fan, F.L.; Gao, J.B.; Liang, S.H. Crisis-like behavior in China’s stock market and its interpretation. PLoS ONE 2015, 10,
e0117209. [CrossRef]
53. Bowers, M.C.; Tung, W.W.; Gao, J.B. On the distributions of seasonal river flows: Lognormal or powerlaw? Water
Resour. Res. 2012, 48, W05536. [CrossRef]
54. Deligne, N.; Coles, S.; Sparks, R. Recurrence rates of large explosive volcanic eruptions. J. Geophys. Res. 2010, 115,
B06203. [CrossRef]
55. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487. [CrossRef]
56. Tsallis, C. Introduction to Nonextensive Statistical Mechanics: Approaching a Complex World; Springer:
Berlin/Heidelberg, Germany, 2009.
57. Pavlos, G.P.; Karakatsanis, L.P.; Xenakis, M.N.; Pavlos, E.G.; Iliopoulos, A.C.; Sarafopoulos, D.V. Universality of non-
extensive Tsallis statistics and time series analysis: Theory and applications. Phys. A Stat. Mech. Appl. 2014, 395, 58–
95. [CrossRef]
58. Barabasi, A.L.; Albert, R. Emergence of scaling in random networks. Science 1999, 286, 509–512. [CrossRef] [PubMed]
59. Gao, J.B.; Han, Q.; Lu, X.L.; Yang, L.; Hu, J. Self organized hotspots and social tomography. EAI Endorsed Trans. Complex
Syst. 2013, 13, e1. [CrossRef]
60. Jones, M. Phase space: Geography, relational thinking, and beyond. Prog. Hum. Geogr. 2009, 33, 487–506. [CrossRef]
61. Henon, M. A two-dimensional mapping with a strange attractor. Commun. Math. Phys. 1976, 50, 69–77. [CrossRef]
62. Shields, P. The Theory of Bernoulli Shifts; Univ. Chicago Press: Chicago, IL, USA, 1973.
63. Atmanspacher, H.; Scheingraber, H. A fundamental link between system theory and statistical mechanics. Found. Phys.
1987, 17, 939–963. [CrossRef]
64. Grassberger, P.; Procaccia, I. Estimation of the Kolmogorov entropy from a chaotic signal. Phys. Rev. A 1983, 28, 2591–
2593. [CrossRef]
65. Feigenbaum, M.J. Universal behavior in nonlinear systems. Phys. D 1983, 7, 16–39. [CrossRef]
66. Ruelle, D.; Takens, F. On the nature of turbulence. Commun. Math. Phys. 1971, 20, 167. [CrossRef]
67. Pomeau, Y.; Manneville, P. Intermittent transition to turbulence in dissipative dynamical systems. Commun. Math.
Phys. 1980, 74, 189–197. [CrossRef]
68. Gao, J.B.; Rao, N.S.V.; Hu, J.; Jing, A. Quasi-periodic route to chaos in the dynamics of Internet transport protocols.
Phys. Rev. Lett. 2005, 94, 198702. [CrossRef] [PubMed]
69. Packard, N.H.; Crutchfield, J.P.; Farmer, J.D.; Shaw, R.S. Geometry from a time series. Phys. Rev. Lett. 1980, 45, 712–
716. [CrossRef]
70. Takens, F. Detecting strange attractors in turbulence. In Dynamical Systems and Turbulence, Lecture Notes in
Mathematics; Rand, D.A., Young, L.S., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; Volume 898, p. 366.
71. Sauer, T.; Yorke, J.A.; Casdagli, M. Embedology. J. Stat. Phys. 1991, 65, 579–616. [CrossRef]
72. Abarbanel, H.D.I. Analysis of Observed Chaotic Data; Springer: Berlin/Heidelberg, Germany, 1996.
73. Gao, J.B.; Zheng, Z.M. Local exponential divergence plot and optimal embedding of a chaotic time series. Phys. Lett. A
1993, 181, 153–158. [CrossRef]
74. Gao, J.B.; Zheng, Z.M. Direct dynamical test for deterministic chaos and optimal embedding of a chaotic time series.
Phys. Rev. E 1994, 49, 3807–3814. [CrossRef]
75. Wolf, A.; Swift, J.B.; Swinney, H.L.; Vastano, J.A. Determining Lyapunov exponents from a time series. Phys. D 1985, 16,
285–317. [CrossRef]
76. Gao, J.B.; Hu, J.; Tung, W.W.; Zheng, Y. Multiscale analysis of economic time series by scale-dependent Lyapunov
exponent. Quant. Financ. 2013, 13, 265–274. [CrossRef]
77. Rosenstein, M.T.; Collins, J.J.; De Luca, C.J. Reconstruction expansion as a geometry-based framework for choosing
proper delay times. Phys. D 1994, 73, 82–98. [CrossRef]
78. Kantz, H. A robust method to estimate the maximal Lyapunov exponent of a time series. Phys. Lett. A 1994, 185, 77–
87. [CrossRef]
79. Gao, J.B.; Zheng, Z.M. Direct dynamical test for deterministic chaos. Europhys. Lett. 1994, 25, 485–490. [CrossRef]
80. Gao, J.B.; Tung, W.W. Pathological tremors as diffusional processes. Biol. Cybern. 2002, 86, 263–270. [CrossRef]
81. Gao, J.B. Recognizing randomness in a time series. Phys. D 1997, 106, 49–56. [CrossRef]
82. Gao, J.B.; Chen, C.C.; Hwang, S.K.; Liu, J.M. Noise-induced chaos. Int. J. Mod. Phys. B 1999, 13, 3283–3305. [CrossRef]
83. Hu, S.Q.; Raman, A. Chaos in Atomic Force Microscopy. Phys. Rev. Lett. 2006, 96, 036107. [CrossRef]
84. Grassberger, P.; Procaccia, I. Characterization of strange attractors. Phys. Rev. Lett. 1983, 50, 346–349. [CrossRef]
85. Theiler, J. Spurious dimension from correlation algorithms applied to limited time-series data. Phys. Rev. A 1986, 34,
2427–2432. [CrossRef]
86. Gao, J.B.; Hu, J.; Liu, F.Y.; Cao, Y.H. Multiscale entropy analysis of biological signals: A fundamental bi-scaling law.
Front. Comput. Neurosci. 2015, 9, 64. [CrossRef]
87. Theiler, J.; Eubank, S.; Longtin, A.; Galdrikian, B.; Farmer, J.D. Testing for nonlinearity in time series: The method of surrogate data. Phys. D 1992, 58, 77–94. [CrossRef]
185. Nielbo, K.L.; Baunvig, K.F.; Liu, B.; Gao, J.B. A curious case of entropic decay: Persistent complexity in textual cultural
heritage. Digit. Scholarsh. Humanit. 2018. [CrossRef]
186. Hu, J.; Gao, J.B.; Wang, X.S. Multifractal analysis of sunspot time series: The effects of the 11-year cycle and Fourier
truncation. J. Stat. Mech. 2009, 2009, P02066. [CrossRef]
187. Wu, Z.H.; Huang, N.E.; Long, S.R.; Peng, C.K. On the trend, detrending, and variability of nonlinear and nonstationary
time series. Proc. Natl. Acad. Sci. USA 2007, 104, 14889–14894. [CrossRef]
188. Podobnik, B.; Stanley, H.E. Detrended cross-correlation analysis: A new method for analyzing two non-stationary time series. Phys. Rev. Lett. 2008, 100, 084102. [CrossRef] [PubMed]
189. Gao, J.B.; Hu, J.; Tung, W.W.; Cao, Y.H. Distinguishing chaos from noise by scale-dependent Lyapunov exponent. Phys.
Rev. E 2006, 74, 066204. [CrossRef]
190. Torcini, A.; Grassberger, P.; Politi, A. Error Propagation in Extended Chaotic Systems. J. Phys. A Math. Gen. 1995, 28,
4533. [CrossRef]
191. Aurell, E.; Boffetta, G.; Crisanti, A.; Paladin, G.; Vulpiani, A. Growth of non-infinitesimal perturbations in turbulence.
Phys. Rev. Lett. 1996, 77, 1262. [CrossRef] [PubMed]
192. Aurell, E.; Boffetta, G.; Crisanti, A.; Paladin, G.;Vulpiani, A. Predictability in the large: An extension of the concept of
Lyapunov exponent. J. Phys. A 1997, 30, 1–26. [CrossRef]
193. Gao, J.B.; Tung, W.W.; Hu, J. Quantifying dynamical predictability: The pseudo-ensemble approach (in honor of
Professor Andrew Majda’s 60th birthday). Chin. Ann. Math. Ser. B 2009, 30, 569–588. [CrossRef]
194. Gao, J.B.; Hu, J.; Mao, X.; Tung, W.W. Detecting low-dimensional chaos by the “noise titration” technique: Possible
problems and remedies. Chaos Solitons Fractals 2012, 45, 213–223. [CrossRef]
195. Hu, J.; Gao, J.B.; Tung, W.W.; Cao, Y.H. Multiscale analysis of heart rate variability: A comparison of different
complexity measures. Ann. Biomed. Eng. 2010, 38, 854–864. [CrossRef]
196. Hu, J.; Gao, J.B.; Tung, W.W. Characterizing heart rate variability by scale-dependent Lyapunov exponent. Chaos
Interdiscip. J. Nonlinear Sci. 2009, 19, 028506. [CrossRef]
197. Ryan, D.A.; Sarson, G.R. The geodynamo as a low-dimensional deterministic system at the edge of chaos. EPL 2008, 83,
49001. [CrossRef]
198. Fan, Q.B.; Wang, Y.X.; Zhu, L. Complexity analysis of spatial-temporal precipitation system by PCA and SDLE. Appl.
Math. Model. 2013, 37, 4059–4066. [CrossRef]
199. Hu, J.; Gao, J.B. Multiscale characterization of sea clutter by scale-dependent Lyapunov exponent. Math. Probl. Eng.
2013, 2013, 584252. [CrossRef]
200. Blasch, E.; Gao, J.B.; Tung, W.W. Chaos-based Image Assessment for THz Imagery. In Proceedings of the 2012 11th
International Conference on Information Science, Signal Processing and Their Applications, Montreal, QC, Canada, 3–5
July 2012.
201. Li, Q.; Gao, J.B.; Huang, Q.; Wu, Y.; Xu, B. Distinguishing Epileptiform Discharges from Normal Electroencephalograms
Using Scale-Dependent Lyapunov Exponent. Front. Bioeng. Biotechnol. 2020, 8, 1006. [CrossRef]
202. Gao, J.B.; Hu, J.; Tung, W.W.; Blasch, E. Multiscale analysis of physiological data by scale-dependent Lyapunov
exponent. Front. Fractal Physiol. 2012, 2, 110. [CrossRef]
203. Berera, A.; Ho, R.D.J.G. Chaotic Properties of a Turbulent Isotropic Fluid. Phys. Rev. Lett. 2018, 120, 024101. [CrossRef]
204. Blakely, J.N.; Corron, N.J.; Pethel, S.D.; Stahl, M.T.; Gao, J.B. Non-autonomous Boolean chaos in a driven ring oscillator.
In New Research Trends in Nonlinear circuits—Design, Chaotic Phenomena and Applications; Kyprianidis, I.,
Stouboulos, I., Volos, C., Eds.; Nova Publishers: New York, NY, USA, 2014; Chapter 8, pp. 153–168.
205. Lazer, D.; Pentland, A.; Adamic, L.; Aral, S.; Barabasi, A.L.; Brewer, D.; Christakis, N.; Contractor, N.; Fowler, J.; Van
Alstyne, M.; et al. Computational social science. Science 2009, 323, 721–723. [CrossRef] [PubMed]
206. Goldstein, J.S. A Conflict-Cooperation Scale for WEIS Events Data. J. Confl. Resolut. 1992, 36, 369–385. [CrossRef]
207. Schrodt, P.A.; Gerner, D.J.; Ömür, G. Conflict and Mediation Event Observations (CAMEO): An Event Data Framework
for a Post Cold War World. In International Conflict Mediation: New Approaches and Findings; Bercovitch, J., Gartner,
S., Eds.; Routledge: New York, NY, USA, 2009.
208. O’Brien, S.P. Crisis Early Warning and Decision Support: Contemporary Approaches and Thoughts on Future Research.
Int. Stud. Rev. 2010, 12, 87–104. [CrossRef]
209. Turchin, P. Historical Dynamics: Why States Rise and Fall; Princeton University Press: Princeton, NJ, USA, 2003.
210. Turchin, P. Arise ’cliodynamics’. Nature 2008, 454, 34–35. [CrossRef] [PubMed]