Application of Machine Learning in Ocean Data
Citation Lou, R., Lv, Z., Dang, S., Su, T., & Li, X. (2021). Application
of machine learning in ocean data. Multimedia Systems.
doi:10.1007/s00530-020-00733-x
Rights Archived with thanks to Springer Science and Business Media LLC
Abstract In recent years, machine learning has become a popular research method in many fields and has been applied to many aspects of daily life, providing intelligent solutions to problems that previously could not be solved or were difficult to solve. Machine learning is driven by data: it learns from one part of the input data and builds a model, and the model is then used to predict and analyze another part of the data to obtain the desired results. With the continuous advancement of ocean observation technology, the volume and dimensionality of ocean data are rising sharply. Traditional data analysis methods have revealed many shortcomings when applied to such massive amounts of data, and the development of machine learning has addressed these shortcomings. The use of machine learning technology to analyze and apply ocean data has therefore become a focus of scientific research. This approach has important practical and long-term significance for protecting the ocean environment, predicting ocean elements, exploring the unknown, and responding to extreme weather. This paper analyzes the state of the art and specific practices of machine learning in ocean data, and reviews application examples in fields such as ocean sound source identification and positioning, ocean element prediction, ocean biodiversity monitoring, and deep-sea resource monitoring. We also point out some constraints that still exist in the research and put forward future development directions and application prospects.

Keywords Ocean · Data · Ocean data · Machine learning

Zhihan Lv
E-mail: lvzhihan@[Link]
Ranran Lou
E-mail: louranran1113@[Link]
Shuping Dang
E-mail: [Link]@[Link]
Tianyun Su
E-mail: sutiany@[Link]
Xinfang Li
E-mail: lixinfang@[Link]

1 School of Data Science and Software Engineering, Qingdao University (QDU), Qingdao, China
2 Laboratory for Regional Oceanography and Numerical Modeling, Pilot National Laboratory for Marine Science and Technology, Qingdao, China
3 Marine Data and Information Center, The First Institute of Oceanography, MNR, Qingdao, China
4 National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Qingdao, China
5 Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia

1 Introduction

Big data has become a focus of discussion and attention in various industries [1]. From the data generated in production and daily life, we can summarize the laws of nature and society and predict future trends. Benefiting from advances in ocean observation technology and system processing capabilities, different types of
ocean data exceeding the TB level are collected from various kinds of sensor equipment every day [2–4]. The ocean itself is a huge and complex ecosystem that spans many disciplines and fields, including marine chemistry, marine geology, physical oceanography, and marine biology. For ocean data research, collecting data is only one aspect; how to process ocean big data of different types and sources for scientific research is an urgent problem. In the past, work remained at the stage of acquiring and analyzing ocean data, and the application of ocean data was not extensive.

Making full use of big ocean data can help humanity respond to climate change, protect the ecological environment, and prevent natural disasters [5–9]. Traditional ocean data processing and analysis mostly relies on manual classification and recognition, traditional statistical analysis, ocean model simulation, and similar methods. These methods are often affected by subjective factors and cannot fully describe the information hidden in the data. Moreover, most ocean big data are unstructured or semi-structured, with complex or unrelated relationships among the data, which also poses challenges to traditional statistical analysis and ocean model simulation [10]. Numerical models themselves require care and expertise to implement. When sufficient external information cannot be obtained, or when computing resources and expertise are limited, the results are not satisfactory. With the development and popularization of machine learning technology, the use of machine learning algorithms to analyze and apply big data has become a hot research field. Big data provides sufficient data support for machine learning to extract patterns and build models [11]. Machine learning algorithms learn from a large amount of data and build models. Compared with traditional ocean data analysis methods, they offer high accuracy, low complexity, and less computation, and in some cases they reduce data requirements. At present, machine learning is applied to ocean big data for underwater identification and localization, prediction of marine elements and marine life distribution, and climate analysis [12–15]. This paper discusses the definition of machine learning and the latest developments in its application to ocean data, summarizes the problems involved, and analyzes future research directions.

2 Machine learning

A learning problem can be defined as the problem of improving the performance of a task through some type of training experience [16]. What is machine learning? Machine learning is an interdisciplinary subject involving mathematics, statistics, and computer science; it uses example data or past experience to train computers to optimize certain performance criteria [17]. With the rapid increase in the amount of data in various industries, the use of appropriate machine learning algorithms can improve the efficiency of data analysis and processing and solve some practical problems. Figure 1 shows the process of machine learning.

Fig. 1 Machine learning process

In 1950, Turing, the “Father of Artificial Intelligence”, proposed the famous “Turing Test”, which initiated scientific research on artificial intelligence. In 1957, Frank Rosenblatt proposed the perceptron and designed the first neural network. In 1969, Marvin Minsky and Seymour Papert raised the XOR problem in their book “Perceptrons”, which brought machine learning to a low ebb. In 1985, Rumelhart, Hinton and Williams proposed the well-known back propagation (BP) algorithm [18], which became an essential algorithm for neural networks. In the 1990s, various machine learning algorithms such as support vector machine (SVM), random forest (RF), and logistic regression (LR) came into use; these are capable of completing basic recognition and classification tasks. In 2006, Hinton and Salakhutdinov proposed a deep learning algorithm [19]. Figure 2 shows the algorithm for feature learning through a multi-layer perceptron (MLP) with multiple hidden layers. An MLP consists of an input layer, an output layer, and one or more hidden layers. The input data undergo a series of weighted sums in the hidden layer. After the weighted sum of each hidden neuron is calculated, the result is passed through a non-linear function, the so-called activation function, and the outputs of this function are weighted and summed to obtain the output. The emergence of deep learning has also promoted the current upsurge in machine learning research. For example, in 2012, Hinton's research team used deep learning to win the influential ImageNet competition in the field of computer vision. In March 2016, AlphaGo, which applies the principles of deep learning, defeated Lee Sedol, a Go world champion and nine-dan professional player, by a total score of 4-1.

Fig. 2 Deep learning of multi-layer perceptron

Machine learning is divided into supervised learning and unsupervised learning according to whether the sample data contain labels. Supervised learning algorithms learn from labeled training data and make label predictions on the test data. Unsupervised learning algorithms do not need labeled training data; they learn the structural features of the unlabeled sample data and group the samples according to the learned features.

3 Application status

3.1 Sound source identification and location

One of the main applications of ocean data is to use acoustic data for identification and localization at sea. Before machine learning was applied, matched field processing (MFP) was the most common ocean positioning method. MFP is a generalized beamforming method that uses the spatial complexity of the sound field in ocean waveguides to locate sources in range, depth, and azimuth or to infer the parameters of the waveguide itself [20, 21]. However, the use of MFP for marine sound source localization relies on marine environmental information, so this method can only be used in simple and stable environments [22, 23]. Machine learning provides a more reliable method for marine source localization, since it can learn directly from data without simulating the marine environment [22].

In the 1990s, researchers began to use machine learning to study marine acoustics [24, 25]. Machine learning algorithms commonly used for maritime positioning include SVM, RF, feedforward neural networks (FNN), and convolutional neural networks (CNN). These algorithms have been applied to ship positioning, ship classification, and estimation of seabed distance, and they have achieved better results than MFP [22–27]. In 2017, Niu et al. preprocessed the pressure received by a vertical linear array into a normalized sample covariance matrix and input it into three machine learning models (FNN, SVM and RF) to learn the source range, demonstrating the potential of machine learning in underwater source localization [22]. They also used acoustic data from ships at different speeds in the Santa Barbara Channel to demonstrate the effectiveness of SVM and FNN classifiers for acoustic localization [26]. Van Komen et al. demonstrated the feasibility of using a deep learning CNN to simultaneously determine the seabed type and source range from 1 s pressure time series of impulsive sounds [27]. Although the data-driven nature of machine learning overcomes the dependence of traditional marine sound source positioning on environmental information, new problems have followed: the existing marine acoustic data cannot supply the amount of data required for training models. For research on ocean sound source positioning, this is the next problem that needs to be solved.

3.2 Ocean forecast

Sea surface temperature, sea waves, sea ice, and similar quantities are all important ocean elements. The analysis and prediction of these elements is of great significance to disaster prevention, environmental protection, and weather forecasting. Numerous machine learning algorithms provide accurate and efficient methods for analyzing and predicting ocean elements. However, the use of machine learning to predict ocean elements still suffers from the problem of insufficiently clear characteristics.

Sea surface temperature

Sea surface temperature (SST) depends on the heat budget of sea water. It shows obvious diurnal and seasonal changes as well as changes in geographical distribution, and it has an important influence on climate and marine ecosystems [28–30]. The long short-term memory (LSTM) neural network has
a strong learning and predictive ability for time series data such as SST, and its structure is shown in Figure 3. The network selectively memorizes the input from the previous node through input gates, forget gates, and output gates, and determines the output of the current state. Based on LSTM, Xiao et al. built a 5-layer deep neural network model for SST anomaly (SSTA) prediction, as shown in Figure 4. At the same time, the AdaBoost ensemble learning model was used to address the problem of overfitting, and combining these two powerful and heterogeneous models improved performance [31]. They also established a spatio-temporal deep learning model that uses convolutional long short-term memory (ConvLSTM) as a building block for training, and obtained relatively accurate results for short-term and mid-term daily forecasts of SST [32]. Lins et al. proposed predicting seasonal SST with the SVM [33].

Waves

Wave forecasting is a great convenience for human activity at sea and is helpful for shipping, fishing, national defense, and offshore energy exploration. In addition, the prediction of ocean waves helps in studying the energy transmission and material exchange of marine ecology [35]. The formation of ocean waves is a complicated seawater movement process: it is the propagation of the undulating shape of the sea surface, a wave formed when water particles oscillate periodically about their equilibrium positions and the disturbance propagates in a certain direction. Traditional ocean wave forecasting establishes a numerical model by simulating the wave evolution process generated by the wind field acting on the ocean surface. Currently, third-generation wave forecasting models are usually used, including the WAM (Wave Model) established by the WAMDI Group in 1988 [36], the SWAN (Simulating WAves Nearshore) model developed by Booij et al. [37], and the WAVEWATCH III model developed by Tolman et al. [38]. However, these forecasting models require more, and more accurate, input data; the forecasting time is long, the complexity is high, and the forecasting results are not satisfactory. In recent years, the use of machine learning to predict ocean waves has become a focus of attention among researchers and has been widely adopted. The artificial neural network (ANN) is widely used to predict wave parameters [39–42]. In 2005, Rao and Mandal used a neural network approach to estimate wave parameters from the wind field generated by cyclones [43]. Compared with traditional numerical prediction models, neural networks improve accuracy, reduce complexity, reduce the amount of calculation, and in some cases reduce the need for data. Although it is commonly used as a classification algorithm, SVM is also one of the machine learning algorithms used for predicting ocean waves [44, 45]. Mahjoubi and Mosabbeb used the current wind speed and the hourly wind speeds of the previous six hours, collected from the deep water of Lake Michigan, as input data to predict wave height with the SVM, and fed the same data to an ANN and a radial basis function (RBF) model. Multiple evaluation indexes (deviation, correlation coefficient (R), root mean square error (RMSE), and scatter index (SI)) show that SVM can be successfully used in wave height prediction and that the error of SVM is slightly lower than that of ANN [44]. Compared with ANN, SVM is less prone to overfitting, requires fewer parameters and shorter calculation time, and achieves higher accuracy. These algorithms are also applicable to the time series interpolation problem of missing buoy data [46–48]. To cope with constantly changing data streams, Durán-Rosal et al. proposed using the evolutionary product unit neural network (EPUNN) with a linear model as the input part to reconstruct the data. This method performs well in a real case of reconstructing a large amount of lost data on 6 wave buoys in the Gulf of Alaska [48].

Ocean eddy

An eddy is a vortex-type water motion, also known as a black hole in the ocean. Ocean eddies are usually caused by tides. The global ocean circulation is also largely affected by mesoscale ocean eddies [49]. These eddies exist in all sea areas around the world and play a role in the transmission of kinetic energy in the ocean circulation. In order to monitor and track eddies, Franz et al. proposed a framework that combines CNN with the Kanade-Lucas-Tomasi (KLT) image processing tool and compared it with the LSTM. This method achieves a high recognition rate and accuracy [49]. Bai et al. developed a deep learning method called streampath-based region-based convolutional neural networks (SP-RCNN) for automatically identifying ocean vortices from flow field data. First, a large-scale eddy dataset is constructed from ocean current data through a streampath-based method. Then the multi-layer features in the neural network are combined with the features of the eddy, and more particles are placed in the eddy domain image to enhance the display of the eddy. The mean average precision (mAP) of the detection results is 90.64% and the success of detection rate (SDR) is 98.91%, which addresses the difficulty of detecting eddies in areas with sparse flow paths. This is also the first method to apply deep learning technology to identifying eddies in flow field data [50]. In addition, machine learning can be used to predict turbulence processes and ocean flow fields, and to classify eddies [51, 52].
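The evaluation indexes named in the wave-height comparison above (deviation, correlation coefficient R, RMSE, and scatter index SI) can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of the cited studies: the function name `wave_height_scores` and the sample values are hypothetical, deviation is taken to be the mean bias (predicted minus observed), and SI is assumed to be the RMSE divided by the mean observed value, which is one common convention.

```python
import math


def wave_height_scores(observed, predicted):
    """Return bias, RMSE, correlation coefficient R, and scatter index SI.

    Assumptions (not taken from the cited studies): bias is the mean of
    (predicted - observed), and SI is RMSE divided by the mean observation.
    The series must not be constant and the mean observation must be non-zero.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Point-wise prediction errors.
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient between observations and predictions.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r = cov / math.sqrt(var_obs * var_pred)

    # Scatter index: RMSE normalized by the mean observed value (assumed convention).
    si = rmse / mean_obs
    return {"bias": bias, "rmse": rmse, "r": r, "si": si}


# Hypothetical significant wave heights in metres, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
scores = wave_height_scores(observed, predicted)
print(scores)
```

Reporting several indexes together is useful because they capture different failures: in this toy case the predictions are perfectly correlated with the observations, yet the bias and RMSE still expose a constant offset.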
3.3 Ocean biodiversity monitoring

Biodiversity is a broad concept that describes the degree of diversity in the natural world. May holds that biodiversity exists at many different levels, from the genetic diversity within local populations of a species, to the genetic diversity between geographically distinct populations of the same species, to communities or ecosystems [53]. Compared with land, the ocean contains more types of life, but it receives less attention [54]. Machine learning is expected to replace methods such as equation fitting and manual monitoring, making marine biological monitoring more accurate, convenient and efficient.

In order to better study marine life, researchers combine various data collected in the ocean with machine learning algorithms to carry out experiments. Wei et al. used the RF to predict global seabed biomass in the Census of Marine Life (CoML) project. This method models the complex and potentially nonlinear relationships between ocean attributes and seabed populations and, by analyzing the cycle of organic matter, effectively predicts the biomass and abundance of the global seabed [55]. Fish species classification is one of the important topics in the study of marine life. In the 1990s, scientists tried to use principal component analysis (PCA) and linear discriminant analysis to extract the main characteristics of fish for classification, but the accuracy was not high [56, 57]. Huang et al. proposed a new balanced optimization tree (BEOTR) classifier with a rejection option for live fish recognition. After the recognition phase, a rejection system based on the Gaussian mixture model (GMM) is added to the classifier. The rejection function evaluates the posterior probability of the test sample, which can overcome some misclassifications in the BEOTR classifier. The classifier was tested on 24,150 manually labeled images containing 15 common fish species in the waters of Taiwan, with an accuracy rate of 74.8% [58]. In 2018, Siddiqui et al. used a CNN model pre-trained on a public image set to extract features from images of 16 different fish species in the temperate and subtropical coastal waters of Western Australia, and then applied a linear SVM with a one-vs-rest strategy for classification, reaching an accuracy of 94.3% [59]. In addition, in broader marine biological research, Reus et al. established the first publicly available seagrass image dataset and proposed a machine learning method to automatically
estimate seagrass coverage on the seafloor, and studied the use of CNN to describe seagrass patches and superpixels [60]; Glotin et al. used machine learning to study sperm whale bioacoustics [61, 62]; Al-Barazanchi et al. used CNN not only to classify plankton images, but also to extend to new classifications [63].

3.4 Deep-sea resource monitoring

Human development is inseparable from the development and utilization of various resources. Today, as land resources are gradually depleted, people have turned their attention to the deep ocean. There are huge reserves of various energy and mineral resources in the deep sea. Deep-sea resource development will not only significantly increase the world's resource base, but also bring considerable economic benefits to the world in the future. Measuring the distribution of resources is an indispensable task in the early stage of seabed resource development. Traditional marine resource exploration requires various types of observation equipment for sampling and uses mathematical methods for modeling, which is very time-consuming and labor-intensive and has low accuracy. Machine learning can measure and model deep-sea resources quickly and accurately.

The deep-sea iron-manganese nodules found in the Clarion-Clipperton Zone (CCZ) of the Pacific Ocean are a huge potential source of metals such as nickel, cobalt, and manganese. In order to obtain data on the quantity and quality distribution of nodules in the CCZ, Hari et al. proposed a method based on an artificial neural network to model the nodule parameters in the CCZ using the limited data available in the open domain [64]. Similarly, in order to measure the coverage of nodules, Jie used data collected by side scan sonar and an autonomous underwater vehicle (AUV) in the Clarion and Clipperton Fracture Zone (CCFZ), and proposed an ANN-based evaluation of the abundance of polymetallic nodules (PMN) with a test accuracy of 84% [65].

4 The problem

4.1 Data

First, in terms of ocean data standards: with the development of ocean observation technology, the data standards used by various observation methods also differ, which is a challenge for data-driven machine learning. Ocean metadata is the main means of managing ocean data. The current ocean metadata standards include the marine environmental data inventory (MEDI), the European directory of the initial ocean-observing system (EDIOS), the array for real-time geostrophic oceanography (ARGO), and many more. These standards apply to different scopes. Therefore, establishing a unified ocean data standard is one of the important prerequisites for increasing the use of machine learning on marine data and improving the accuracy of the models. Second, in terms of data preprocessing: because data sources differ and data types are diverse, it is sometimes necessary to feed data of different types and from different sources into the machine learning model, and it takes a long time to analyze the data before building the model. For preprocessing, the fusion technology for various marine data needs to be improved in the future to raise the efficiency of model building. Finally, in terms of data volume: although a large amount of ocean observation data is generated every day, the types of these data are not evenly distributed, and some types of data cannot meet the needs of machine learning. Source identification and positioning in the ocean is one example, where the amount of marine acoustic data is not enough to meet the training needs of the model. In the future, it is necessary to expand the scope of marine monitoring and increase the amount of data collected of each type.

4.2 Scope of application

At present, most applications of machine learning to ocean data select data from a specific geographic range for experimental research. There are no comparative experiments across different sea areas. The environment of each sea area is different, and the input data of the models differ considerably. Therefore, whether a research method can be applied in other regions is a problem that researchers need to solve in the research process.

4.3 Algorithms comparison

Machine learning has many algorithms, and the application scenarios in the ocean are increasing. In research, most researchers adapt existing algorithms or compare multiple algorithms during training, and it is not clear which algorithm should be used in a given research field. Finding a suitable, optimal solution by constantly adjusting the algorithm requires a huge time cost.

5 Application prospects

In the future, machine learning will be widely applied to ocean data in natural disaster prevention, marine environment monitoring, marine resource development, marine transportation research, and other fields. Together with the continuous expansion of marine data, it will promote the development of the marine industry.

In the prevention of natural disasters, analyzing the data collected by marine sensors, meteorological satellites and other observation methods with machine learning algorithms improves the forecasting and early warning of severe maritime weather in coastal areas and reduces the loss of life and property. A typhoon is a tropical cyclone that carries huge amounts of energy. Wherever it goes, it may bring natural disasters such as squalls and heavy rain. Machine learning will improve the traditional typhoon prediction model and make predictions more accurate [66–68].

In the field of marine environmental monitoring, a three-dimensional monitoring network composed of sea, land and air is used to monitor the entire sea area, and machine learning is used to better identify marine environmental problems such as red tides, storm surges, sea waves, sea ice, and marine oil spills. In recent years, with the increasing frequency of offshore oil exploration and development and of marine transportation activities, oil spills have become frequent, and marine oil spill pollution has become one of the most important threats to the marine environment. The analysis and processing of ocean image data using CNN can effectively classify and identify oil spills [69–72]. As shown in Figure 5, the output of each layer of a CNN is the input of the next layer. The convolutional layer extracts features. The pooling layer merges similar features to reduce the amount of data and generalize the features. The fully connected layer merges the results after convolution and pooling. With the rapid development of the aquaculture industry, eutrophication has intensified, and the concentration and structure of nutrients in the water body have changed, resulting in the frequent occurrence of harmful algal blooms. In the future, the water environment can be predicted and identified through machine learning to prevent red tides from polluting sea water [73–75].

In the field of marine resource development, machine learning for marine fish identification and monitoring is applied to fishery development, so as to make fishing more efficient and create higher economic benefits [76, 77]. In addition, it can regulate fishing behaviors, prevent ecological damage, and facilitate supervision by law enforcement personnel [78, 79].

With the development of economic globalization, more and more ships are participating in maritime trade. Unlike land transportation, maritime transportation is characterized by uncertain ship routes, which makes maritime traffic monitoring increasingly difficult. In the near future, machine learning is expected to improve maritime traffic conditions by calculating ship density from ocean big data combined with ship traffic, port, and waterway information, to reduce traffic accidents caused by harsh ocean environments, and to assess the risk of sailing routes through harsh environments and remote areas [80–84].

In recent years, wireless communication technology has made considerable progress [85–89]. A large amount of ocean observation data is transmitted wirelessly. It is especially important to manage the wireless sensors properly to ensure that they can transmit data reliably and continuously [90]. In the future, while machine learning is used to analyze wirelessly transmitted data efficiently, it can also be used to control the wireless sensors appropriately [91].

Today, neuromorphic computing is very popular. Silicon neurons provide a medium that can simulate neural networks directly in hardware, and they are more suitable for real-time large-scale neural simulations than general-purpose computers [92]. Multicompartment emulation is an important step toward enhancing the biological realism of neuromorphic systems and further understanding the computational power of neurons. It can accurately reproduce the biodynamics of a single neuron. So far, scientists have proposed a neuromorphic structure that can be used to realize a large-scale biologically meaningful neural network with one million multicompartment neurons [93]. In work on event-based neuromorphic systems, activity-driven event-based vision sensors can quickly output compressed digital data in the form of events [94]. Based on this, underwater identification will become more efficient in the future.

In addition, the analysis of ocean data through machine learning may also help address scientific issues such as global warming, sea level rise, and “La Madre” [95, 96]. “La Madre” is also known as the “Pacific Decadal Oscillation”; it appears over the Pacific alternately in two forms, a “warm phase” and a “cold phase”, each of which lasts for 20 to 30 years. When “La Madre” appears in its “warm phase”, the water temperature of the sea near the North American continent rises abnormally, while the temperature of the North Pacific ocean surface drops abnormally; when the “cold phase” appears, the situation is just the opposite. The cold phase is a period of concentrated outbreaks of strong earthquakes around the globe. The development of machine learning and ocean data mining technology may provide a new approach to the study and prediction of the “La Madre” phenomenon.

6 Conclusions

The various applications of machine learning in ocean data at this stage show that machine learning has changed the traditional way of manually performing ocean data analysis, improved the efficiency of data analysis, and provided solutions for specific scientific research problems in this field. The new method is of great significance for revealing the laws of the ocean, protecting the ocean ecological environment, and developing the marine economy. At the same time, when machine learning and ocean data are combined, there are still problems such as inconsistent data standards, low data utilization, narrow application scope, and unclear algorithm selection. With the continuous development of machine learning and ocean observation technology, it is believed that in the future the application range of machine learning and ocean data will be wider, the application will become cheaper, and the application efficiency will be higher.

Acknowledgements This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 61902203 and by the Key Research and Development Plan - Major Scientific and Technological Innovation Projects of Shandong Province (2019JZZY020101).

Conflict of interest

The authors declare no competing interests.

References

2. Shuai, L., Ge, C., Ying-Jie, L., Feng-Lin, T.: Research and analysis on marine big data applied technology. Periodical of Ocean University of China (2020)
3. Riser, S.C., Freeland, H.J., Roemmich, D., Wijffels, S., Troisi, A., Belbéoch, M., Gilbert, D., Xu, J., Pouliquen, S., Thresher, A., et al.: Fifteen years of ocean observations with the global Argo array. Nature Climate Change 6(2), 145–153 (2016)
4. Shi, R., Gan, Y., Wang, Y.: Evaluating scalability bottlenecks by workload extrapolation. In: 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 333–347 (2018). DOI 10.1109/MASCOTS.2018.00039
5. Deo, R.C., Şahin, M.: Application of the extreme learning machine algorithm for the prediction of monthly effective drought index in eastern Australia. Atmospheric Research 153, 512–525 (2015)
6. Rasouli, K., Hsieh, W.W., Cannon, A.J.: Daily streamflow forecasting by machine learning methods with weather and climate inputs. Journal of Hydrology 414, 284–293 (2012)
7. Kim, Y.H., Im, J., Ha, H.K., Choi, J.K., Ha, S.: Machine learning approaches to coastal water quality monitoring using GOCI satellite data. GIScience & Remote Sensing 51(2), 158–174 (2014)
8. Rosso, I., Mazloff, M.R., Talley, L.D., Purkey, S.G., Freeman, N.M., Maze, G.: Water mass and biogeochemical variability in the Kerguelen sector of the Southern Ocean: A machine learning approach for a mixing hot spot. Journal of Geophysical Research: Oceans 125(3), e2019JC015877 (2020)
9. Mosavi, A., Ozturk, P., Chau, K.w.: Flood prediction using machine learning models: Literature review. Water 10(11), 1536 (2018)
10. Sun, M., Yu, F.U., Chongjing, L., Jiang, X.: Deep learning application in marine big data mining. Science & Technology Review (2018)
11. Zhou, L., Pan, S., Wang, J., Vasilakos, A.V.: Machine learning on big data: Opportunities and challenges. Neurocomputing 237, 350–361 (2017)
12. Asefa, T., Kemblowski, M., McKee, M., Khalil, A.: Multi-time scale stream flow predictions: The support vector machines approach. Journal of Hydrology 318(1-4), 7–16 (2006)
13. Guilford, T., Meade, J., Willis, J., Phillips, R.A., Boyle,
References D., Roberts, S., Collett, M., Freeman, R., Perrins, C.:
Migration and stopover in a small pelagic seabird, the
1. Jin, X., Wah, B.W., Cheng, X., Wang, Y.: Significance manx shearwater puffinus puffinus: insights from machine
and challenges of big data research. Big Data Research learning. Proceedings of the Royal Society B: Biological
2(2), 59–64 (2015) Sciences 276(1660), 1215–1223 (2009)
14. Krinitskiy, M.: Application of machine learning methods to the solar disk state detection by all-sky images over the ocean. Oceanology 57(2), 265–269 (2017)
15. Deo, M.: Artificial neural networks in coastal and ocean engineering (2010)
16. Jordan, M.I., Mitchell, T.M.: Machine learning: Trends, perspectives, and prospects. Science 349(6245), 255–260 (2015)
17. Alpaydin, E.: Introduction to machine learning. MIT Press (2020)
18. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
19. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
20. Baggeroer, A.B., Kuperman, W.A., Mikhalevsky, P.N.: An overview of matched field methods in ocean acoustics. IEEE Journal of Oceanic Engineering 18(4), 401–424 (1993)
21. Baggeroer, A.B., Kuperman, W.A.: Matched field processing in ocean acoustics. In: Acoustic signal processing for ocean exploration, pp. 79–114. Springer (1993)
22. Niu, H., Reeves, E., Gerstoft, P.: Source localization in an ocean waveguide using supervised machine learning. The Journal of the Acoustical Society of America 142(3), 1176–1188 (2017)
23. Choi, J., Choo, Y., Lee, K.: Acoustic classification of surface and underwater vessels in the ocean using supervised machine learning. Sensors 19(16), 3492 (2019)
24. Steinberg, B.Z., Beran, M.J., Chin, S.H., Howard Jr, J.H.: A neural network approach to source localization. The Journal of the Acoustical Society of America 90(4), 2081–2090 (1991)
25. Caiti, A., Parisini, T.: Mapping ocean sediments by RBF networks. IEEE Journal of Oceanic Engineering 19(4), 577–582 (1994)
26. Niu, H., Ozanich, E., Gerstoft, P.: Ship localization in Santa Barbara Channel using machine learning classifiers. The Journal of the Acoustical Society of America 142(5), EL455–EL460 (2017)
27. Van Komen, D.F., Neilsen, T.B., Howarth, K., Knobles, D.P., Dahl, P.H.: Seabed and range estimation of impulsive time series using a convolutional neural network. The Journal of the Acoustical Society of America 147(5), EL403–EL408 (2020)
28. Cane, M.A., Clement, A.C., Kaplan, A., Kushnir, Y., Pozdnyakov, D., Seager, R., Zebiak, S.E., Murtugudde, R.: Twentieth-century sea surface temperature trends. Science 275(5302), 957–960 (1997)
29. Castro, S.L., Wick, G.A., Steele, M.: Validation of satellite sea surface temperature analyses in the Beaufort Sea using UpTempO buoys. Remote Sensing of Environment 187, 458–475 (2016)
30. Chaidez, V., Dreano, D., Agusti, S., Duarte, C.M., Hoteit, I.: Decadal trends in Red Sea maximum surface temperature. Scientific Reports 7(1), 1–8 (2017)
31. Xiao, C., Chen, N., Hu, C., Wang, K., Gong, J., Chen, Z.: Short and mid-term sea surface temperature prediction using time-series satellite data and LSTM-AdaBoost combination approach. Remote Sensing of Environment 233, 111358 (2019)
32. Xiao, C., Chen, N., Hu, C., Wang, K., Xu, Z., Cai, Y., Xu, L., Chen, Z., Gong, J.: A spatiotemporal deep learning model for sea surface temperature field prediction using time-series satellite data. Environmental Modelling & Software 120, 104502 (2019)
33. Lins, I.D., Araujo, M., das Chagas Moura, M., Silva, M.A., Droguett, E.L.: Prediction of sea surface temperature in the tropical Atlantic by support vector machines. Computational Statistics & Data Analysis 61, 187–198 (2013)
34. Olah, C.: Understanding LSTM networks (2015)
35. Savitha, R., Al Mamun, A., et al.: Regional ocean wave height prediction using sequential learning neural networks. Ocean Engineering 129, 605–612 (2017)
36. Group, T.W.: The WAM model - a third generation ocean wave prediction model. Journal of Physical Oceanography 18(12), 1775–1810 (1988)
37. Booij, N., Ris, R.C., Holthuijsen, L.H.: A third-generation wave model for coastal regions: 1. Model description and validation. Journal of Geophysical Research: Oceans 104(C4), 7649–7666 (1999)
38. Tolman, H.L., Chalikov, D.: Source terms in a third-generation wind wave model. Journal of Physical Oceanography 26(11), 2497–2518 (1996)
39. Makarynskyy, O.: Improving wave predictions with artificial neural networks. Ocean Engineering 31(5-6), 709–724 (2004)
40. Agrawal, J., Deo, M.: On-line wave prediction. Marine Structures 15(1), 57–74 (2002)
41. Jain, P., Deo, M.: Artificial intelligence tools to forecast ocean waves in real time. The Open Ocean Engineering Journal 1(1) (2008)
42. James, S.C., Zhang, Y., O’Donncha, F.: A machine learning framework to forecast wave conditions. Coastal Engineering 137, 1–10 (2018)
43. Rao, S., Mandal, S.: Hindcasting of storm waves using neural networks. Ocean Engineering 32(5-6), 667–684 (2005)
44. Mahjoobi, J., Mosabbeb, E.A.: Prediction of significant wave height using regressive support vector machines. Ocean Engineering 36(5), 339–347 (2009)
45. Quan, J., Feng, H., Yong-Zeng, Y.: Prediction of the significant wave height based on the support vector machine. Advances in Marine Science 37(2), 199–209 (2019)
46. Alexandre, E., Cuadra, L., Nieto-Borge, J., Candil-Garcia, G., Del Pino, M., Salcedo-Sanz, S.: A hybrid genetic algorithm–extreme learning machine approach for accurate significant wave height reconstruction. Ocean Modelling 92, 115–123 (2015)
47. Salcedo-Sanz, S., Borge, J.N., Carro-Calvo, L., Cuadra, L., Hessner, K., Alexandre, E.: Significant wave height estimation using SVR algorithms and shadowing information from simulated and real measured X-band radar images of the sea surface. Ocean Engineering 101, 244–253 (2015)
48. Durán-Rosal, A., Hervás-Martínez, C., Tallón-Ballesteros, A., Martínez-Estudillo, A., Salcedo-Sanz, S.: Massive missing data reconstruction in ocean buoys with evolutionary product unit neural networks. Ocean Engineering 117, 292–301 (2016)
49. Franz, K., Roscher, R., Milioto, A., Wenzel, S., Kusche, J.: Ocean eddy identification and tracking using neural networks. In: IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 6887–6890. IEEE (2018)
50. Bai, X., Wang, C., Li, C.: A streampath-based RCNN approach to ocean eddy detection. IEEE Access 7, 106336–106345 (2019)
51. Lguensat, R., Sun, M., Fablet, R., Tandeo, P., Mason, E., Chen, G.: EddyNet: A deep neural network for pixel-wise classification of oceanic eddies. In: IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 1764–1767. IEEE (2018)
52. Bolton, T., Zanna, L.: Applications of deep learning to ocean data inference and subgrid parameterization. Journal of Advances in Modeling Earth Systems 11(1), 376–399 (2019)
53. May, R.M.: Conceptual aspects of the quantification of the extent of biological diversity. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 345(1311), 13–20 (1994)
54. Ormond, R.: Marine biodiversity: causes and consequences. Journal of the Marine Biological Association of the United Kingdom 76(1), 151–152 (1996)
55. Wei, C.L., Rowe, G.T., Escobar-Briones, E., Boetius, A., Soltwedel, T., Caley, M.J., Soliman, Y., Huettmann, F., Qu, F., Yu, Z., et al.: Global patterns and predictions of seafloor biomass using random forests. PLoS ONE 5(12), e15323 (2010)
56. Turk, M., Pentland, A.: Eigenfaces for recognition. Journal of Cognitive Neuroscience 3(1), 71–86 (1991)
57. Mika, S., Ratsch, G., Weston, J., Scholkopf, B., Mullers, K.R.: Fisher discriminant analysis with kernels. In: Neural networks for signal processing IX: Proceedings of the 1999 IEEE signal processing society workshop (cat. no. 98TH8468), pp. 41–48. IEEE (1999)
58. Huang, P.X.: Hierarchical classification system with reject option for live fish recognition. In: Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data, pp. 141–159. Springer (2016)
59. Siddiqui, S.A., Salman, A., Malik, M.I., Shafait, F., Mian, A., Shortis, M.R., Harvey, E.S.: Automatic fish species classification in underwater videos: exploiting pre-trained deep neural network models to compensate for limited labelled data. ICES Journal of Marine Science 75(1), 374–389 (2018)
60. Reus, G., Möller, T., Jäger, J., Schultz, S.T., Kruschel, C., Hasenauer, J., Wolff, V., Fricke-Neuderth, K.: Looking for seagrass: Deep learning for visual coverage estimation. In: 2018 OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO), pp. 1–6. IEEE (2018)
61. Glotin, H., Spong, P., Symonds, H., Roger, V., Balestriero, R., Ferrari, M., Poupard, M., Towers, J., Veirs, S., Marxer, R., et al.: Deep learning for ethoacoustical mapping: Application to a single cachalot long term recording on joint observatories in Vancouver Island. The Journal of the Acoustical Society of America 144(3), 1776–1777 (2018)
62. Bermant, P.C., Bronstein, M.M., Wood, R.J., Gero, S., Gruber, D.F.: Deep machine learning techniques for the detection and classification of sperm whale bioacoustics. Scientific Reports 9(1), 1–10 (2019)
63. Al-Barazanchi, H., Verma, A., Wang, S.X.: Intelligent plankton image classification with deep learning. International Journal of Computational Vision and Robotics 8(6), 561–571 (2018)
64. Hari, V.N., Kalyan, B., Chitre, M., Ganesan, V.: Spatial modeling of deep-sea ferromanganese nodules with limited data using neural networks. IEEE Journal of Oceanic Engineering 43(4), 997–1014 (2017)
65. Jie, W.L., Kalyan, B., Chitre, M., Vishnu, H.: Polymetallic nodules abundance estimation using sidescan sonar: A quantitative approach using artificial neural network. In: OCEANS 2017-Aberdeen, pp. 1–6. IEEE (2017)
66. Jiang, G.Q., Xu, J., Wei, J.: A deep learning algorithm of neural network for the parameterization of typhoon-ocean feedback in typhoon forecast models. Geophysical Research Letters 45(8), 3706–3716 (2018)
67. Hashemi, M.R., Spaulding, M.L., Shaw, A., Farhadi, H., Lewis, M.: An efficient artificial intelligence model for prediction of tropical storm surge. Natural Hazards 82(1), 471–491 (2016)
68. Zhang, C., Durgan, S.D., Lagomasino, D.: Modeling risk of mangroves to tropical cyclones: A case study of Hurricane Irma. Estuarine, Coastal and Shelf Science 224, 108–116 (2019)
69. Khlongkhoi, P., Chayantrakom, K., Kanbua, W.: Application of a deep learning technique to the problem of oil spreading in the Gulf of Thailand. Advances in Difference Equations 2019(1), 306 (2019)
70. Topouzelis, K., Psyllos, A.: Oil spill feature selection and classification using decision tree forest on SAR image data. ISPRS Journal of Photogrammetry and Remote Sensing 68, 135–143 (2012)
71. Xu, L., Li, J., Brenning, A.: A comparative study of different classification techniques for marine oil spill identification using RADARSAT-1 imagery. Remote Sensing of Environment 141, 14–23 (2014)
72. Brekke, C., Solberg, A.H.: Classifiers and confidence estimation for oil spill detection in ENVISAT ASAR images. IEEE Geoscience and Remote Sensing Letters 5(1), 65–69 (2008)
73. Grasso, I., Archer, S.D., Burnell, C., Tupper, B., Rauschenberg, C., Kanwit, K., Record, N.R.: The hunt for red tides: Deep learning algorithm forecasts shellfish toxicity at site scales in coastal Maine. Ecosphere 10(12), e02960 (2019)
74. Bak, S.H., Hwang, D.H., Kim, H.M., Kim, B.K., Enkgjargal, U., Oh, S.Y., Yoon, H.J.: A study on red tide detection technique by using multi-layer perceptron. International Journal of Grid and Distributed Computing 11(9), 93–102 (2018)
75. Fdez-Riverola, F., Corchado, J.M.: FSfRT: forecasting system for red tides. A hybrid autonomous AI model. Applied Artificial Intelligence 17(10), 955–982 (2003)
76. Sala, E., Mayorga, J., Costello, C., Kroodsma, D., Palomares, M.L., Pauly, D., Sumaila, U.R., Zeller, D.: The economics of fishing the high seas. Science Advances 4(6), eaat2504 (2018)
77. Fernandes, J.A., Irigoien, X., Goikoetxea, N., Lozano, J.A., Inza, I., Pérez, A., Bode, A.: Fish recruitment prediction, using robust supervised classification methods. Ecological Modelling 221(2), 338–352 (2010)
78. Stamoulis, K.A., Delevaux, J.M., Williams, I.D., Poti, M., Lecky, J., Costa, B., Kendall, M.S., Pittman, S.J., Donovan, M.K., Wedding, L.M., et al.: Seascape models reveal places to focus coastal fisheries management. Ecological Applications 28(4), 910–925 (2018)
79. de Souza, E.N., Boerder, K., Matwin, S., Worm, B.: Improving fishing pattern detection from satellite AIS using data mining and machine learning. PLoS ONE 11(7), e0158248 (2016)
80. Ning, J., Huang, T., Diao, B., et al.: A fine grained grid-based maritime traffic density algorithm for mass ship trajectory data. Computer Engineering & Science 37(12), 2242–2249 (2015)
81. Kim, D., Park, M.S., Park, Y.J., Kim, W.: Geostationary ocean color imager (GOCI) marine fog detection in combination with Himawari-8 based on the decision tree. Remote Sensing 12(1), 149 (2020)
82. Tang, J., Deng, C., Huang, G.B., Zhao, B.: Compressed-domain ship detection on spaceborne optical image using deep neural network and extreme learning machine. IEEE Transactions on Geoscience and Remote Sensing 53(3), 1174–1185 (2014)
83. Khan, B., Khan, F., Veitch, B., Yang, M.: An operational risk analysis tool to analyze marine transportation in Arctic waters. Reliability Engineering & System Safety 169, 485–502 (2018)
84. Trucco, P., Cagno, E., Ruggeri, F., Grande, O.: A Bayesian belief network modelling of organisational factors in risk analysis: A case study in maritime transportation. Reliability Engineering & System Safety 93(6), 845–856 (2008)
85. Wen, M., Chen, X., Li, Q., Basar, E., Wu, Y.C., Zhang, W.: Index modulation aided subcarrier mapping for dual-hop OFDM relaying. IEEE Transactions on Communications 67(9), 6012–6024 (2019)
86. Wen, M., Zheng, B., Kim, K.J., Di Renzo, M., Tsiftsis, T.A., Chen, K.C., Al-Dhahir, N.: A survey on spatial modulation in emerging wireless systems: Research progresses and applications. IEEE Journal on Selected Areas in Communications 37(9), 1949–1972 (2019)
87. Wen, M., Li, Q., Basar, E., Zhang, W.: Generalized multiple-mode OFDM with index modulation. IEEE Transactions on Wireless Communications 17(10), 6531–6543 (2018)
88. Wen, M., Basar, E., Li, Q., Zheng, B., Zhang, M.: Multiple-mode orthogonal frequency division multiplexing with index modulation. IEEE Transactions on Communications 65(9), 3892–3906 (2017)
89. Wen, M., Ye, B., Basar, E., Li, Q., Ji, F.: Enhanced orthogonal frequency division multiplexing with index modulation. IEEE Transactions on Wireless Communications 16(7), 4786–4801 (2017)
90. Li, Y., Zhang, Y., Li, W., Jiang, T.: Marine wireless big data: Efficient transmission, related applications, and challenges. IEEE Wireless Communications 25(1), 19–25 (2018)
91. Park, S., Byun, J., Shin, K.S., Jo, O.: Ocean current prediction based on machine learning for deciding handover priority in underwater wireless sensor networks. In: 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), pp. 505–509. IEEE (2020)
92. Indiveri, G., Linares-Barranco, B., Hamilton, T., van Schaik, A., Etienne-Cummings, R., Delbruck, T., Liu, S.C., Dudek, P., Häfliger, P., Renaud, S., Schemmel, J., Cauwenberghs, G., Arthur, J., Hynna, K., Folowosele, F., Saïghi, S., Serrano-Gotarredona, T., Wijekoon, J., Wang, Y., Boahen, K.: Neuromorphic silicon neuron circuits. Frontiers in Neuroscience 5, 73 (2011). DOI 10.3389/fnins.2011.00073
93. Yang, S., Deng, B., Wang, J., Li, H., Lu, M., Che, Y., Wei, X., Loparo, K.A.: Scalable digital neuromorphic architecture for large-scale biophysically meaningful neural network with multi-compartment neurons. IEEE Transactions on Neural Networks and Learning Systems 31(1), 148–162 (2020). DOI 10.1109/TNNLS.2019.2899936
94. Delbrück, T., Linares-Barranco, B., Culurciello, E., Posch, C.: Activity-driven, event-based vision sensors. In: Proceedings of 2010 IEEE International Symposium on Circuits and Systems, pp. 2426–2429 (2010). DOI 10.1109/ISCAS.2010.5537149
95. D’Alelio, D., Rampone, S., Cusano, L.M., Morfino, V., Russo, L., Sanseverino, N., Cloern, J.E., Lomas, M.W.: Machine learning identifies a strong association between warming and reduced primary productivity in an oligotrophic ocean gyre. Scientific Reports 10(1), 1–12 (2020)
96. Su, H., Li, W., Yan, X.H.: Retrieving temperature anomaly in the global subsurface and deeper ocean from satellite observations. Journal of Geophysical Research: Oceans 123(1), 399–410 (2018)