Quantitative Evaluation of Flood Extent Detection Using Attention U-Net: Case Studies From Eastern New South Wales, Australia, in March 2021 and July 2022
Keywords Flood, Sentinel-1, Remote sensing, Attention U-Net, Deep learning, Machine learning, Artificial
intelligence
Floods are among the most threatening and damaging environmental hazards, resulting in significant
infrastructure losses and damage to national and regional economies1. The severity and frequency of flooding
have increased in recent years due to the expansion of urbanization and deforestation, as well as extreme weather,
such as hurricanes2,3. According to a report by the Centre for Research on the Epidemiology of Disasters4,
floods have accounted for 47% of all weather-related disasters and have risen to an average of 171 events per
year for the period between 2005
and 2014. After a flood occurs, governments, non-governmental organizations, disaster charters, and national
organizations are the first to respond to this event, and reliable and accurate mapping of the flood can help them
in rescue missions, damage assessment of infrastructure, and planning response measures.
References5,6 demonstrated that traditional flood monitoring uses simulations of hydrological processes with
precipitation data from surface hydrological stations, satellite observations, re-analysis data based on numerical
models, and assimilation. These data have certain limitations in terms of temporal resolution, spatial resolution,
and accuracy. Additionally, conventional flood management depends on the ground-based monitoring of
rainfall and river discharge. Many parts of the world lack coverage from these sensor networks, and ground-
based systems are also costly.
Remote sensing technology offers a unique capability of observing the earth on a large scale and has been
proven to be the most reliable source for generating flood maps7. Investigating the effects and extent of flood
events is critical for quantifying damage, organizing rescue measures, determining insurance refunds, and calibrating
prediction models for risk assessment and management3,8. Over the years, different types of imaging sensors
have been employed to monitor flood events, such as optical and Synthetic Aperture Radar (SAR) sensors9,10.
Optical sensors, including RGB and multispectral, operate only in daylight and cloudless sky conditions and can
discriminate flooded areas from other landcover types. However, since flooding usually occurs after heavy
rainfall, optical data are not optimal for flood applications. SAR images can be acquired regardless of
cloud cover and time of day or night by the active sensors with specific frequency ranges. Moreover, microwaves
penetrate the foliage, providing complete information about the inundated area11. Furthermore, SAR missions
are very popular nowadays, and a large variety of constellations from the public and private sectors exist. Public-
sponsored missions include the TerraSAR-X (TSX) from the German Aerospace Centre, the COSMO-SkyMed
(CSK) from the Italian Space Agency, and the Sentinel-1 by the European Space Agency.
Mere thresholding was the first exploited technique to detect water bodies from SAR backscatter. Since water
bodies present very low backscatter coefficients, the threshold is usually chosen to maximize the histogram
separation between water and non-water pixels12–14.
Following these thresholding strategies, a global water body layer was generated from TerraSAR-X and
TanDEM-X SAR data at the German Aerospace Centre. By taking into account the system's geometric layout and
utilizing a watershed-based segmentation technique, water surfaces are categorized in single TanDEM-X images.
The global TerraSAR-X add-on for Digital Elevation Measurement (TDM) global water body (WBL) product,
which consists of a binary averaged water/non-water layer as well as a permanent/temporary water indication
layer, is then created by mosaicking together single overlapping acquisitions in a two-step logical weighting
process15.
Though thresholding techniques can perform very well on heterogeneous datasets, e.g., data presenting a
bimodal distribution, their accuracy can drop rapidly if the water body occupies a small portion of the satellite
scene, which is the case for flooded areas16. In the last decade, machine learning-based algorithms have been
widely used in different applications, including natural disasters, providing promising results. For example,
reference17 employed machine learning algorithms such as Support Vector Machine and Random Forest to detect
landslides and floods around Salzburg, Austria. The analysis considered 13 influencing factors, such as elevation, slope,
aspect, topographic wetness index (TWI), stream power index (SPI), normalized difference vegetation index
(NDVI), geology, lithology, rainfall, land cover, distance to roads, distance to faults, and distance to drainage,
which were used to create multi-hazard exposure maps. Using the commonly applied splitting ratio, they divided
the flood and landslide inventory data into training and validation sets, with 70% of the locations for training and
30% for validation. The accuracy of the exposure maps was evaluated using the receiver operating characteristic
(ROC) curve and the R-Index (relative density).
In recent years, machine learning, in particular data-driven and deep learning methods, has received special
attention for analysing remotely sensed images. Deep Learning (DL) is an effective method for a large variety of
computer vision tasks18–20. The remote sensing community has employed DL methods to increase computational
performance and product accuracy using optical-to-radar datasets for many applications.
In many mapping applications, DL models have contributed to increasing semantic segmentation algorithms’
accuracy21–23. Reference24 reviewed the application of many DL models for flood mapping. The
models included the multi-layer perceptron (MLP), recurrent neural networks (RNN), and the CNN. They report
that the CNN model outperformed both the RNN and MLP models. More recently, a variant of the Fully Convolutional
Network called the U-Net architecture has been widely used in flood mapping25,26. Reference16 used ALOS
single polarized HH data with a U-Net architecture to map river flood inundation. Due to the lack of a training
dataset, image augmentation techniques were used to increase the sample size, ultimately achieving an accuracy
of 89.5%. However, the algorithm often failed to detect some flooded areas within the urban environment. This
can be attributed to the strong backscatter signal from the infrastructure. Also, some agricultural fields were
classified as floodwaters due to their low backscatter intensity.
Reference27 assessed the performance of U-Net and an alternative model, XNet, on Sentinel-1 data. The
models were trained on a large dataset derived by28, which includes flood events since 2007. Both models
achieved a high accuracy of 97% in flood mapping.
They evaluated different CNN architectures and developed models utilizing flood masks produced by a
combination of traditional semi-automated methods, thorough manual cleaning, and eye inspection. The
technology employed demonstrates high performance across a wide range of locations and environmental
conditions, reducing the time required to produce a flood map by half. This technology can also
be implemented into end-to-end pipelines for timelier and more continuous flood monitoring due to the open-
source data and the minimal image cleaning needed.
According to29, floods can be mapped and monitored with remotely sensed data acquired by aircraft and
satellites, or even from ground-based platforms. The sensors and data processing techniques that exist to derive
information about floods are numerous. Instruments that record flood events may operate in the electromagnetic
spectrum’s visible, thermal, and microwave range. Due to the limitations posed by adverse weather conditions
during flood events, active radar is invaluable for monitoring floods; however, if a visible image of flooding can
be acquired, retrieving useful information from this is often more straightforward.
Reference26 adopted an attention U-Net architecture that combines high-level semantic information with
low-level features to obtain accurate segmentation results. The methodology of this study worked well on
COSMO-SkyMed data for flood detection, but there were some difficulties with water extraction in dense built-
up and high mountainous areas. The performance of the algorithm is not reported in the study. Furthermore,
COSMO-SkyMed data are not openly accessible free of cost, thereby hindering the ability to map flooded and
inundated areas rapidly after an event occurs. Reference30 selected 67 globally distributed Sentinel-1 scenes to
derive flood extent and employed a modified version of the U-Net based on ResNet-34. Compared to U-Net, the
encoder is replaced by a residual network structure. The authors found that the segmentation results depend on
the polarization configuration and reported an accuracy of 91% using single polarized VV and 95% using dual
polarized VV-VH data. They reported that the contrast between land and water was low in the VV polarization
channel, leading to increased false positives.
Recent advancements in deep learning (DL) techniques have significantly improved flood mapping
capabilities. While Convolutional Neural Networks (CNNs) have been widely used for semantic segmentation
tasks31, the U-Net has emerged as particularly effective for flood extent detection due to its fully convolutional
nature and its ability to perform pixel-wise classification. The U-Net architecture is uniquely equipped to handle
high-resolution remote sensing data by using skip connections that preserve spatial details, making it well-
suited for tasks requiring fine boundary delineation, such as flood mapping. Although other models like ResNet
and VGG have shown strong performance in image classification, their inability to maintain spatial resolution
during the down-sampling process makes them less effective for flood extent detection. Additionally, traditional
machine learning models like Random Forest and Support Vector Machines (SVM) are often limited by the need
for extensive feature engineering and do not leverage the hierarchical feature representation of deep learning
models. The U-Net’s capacity to combine high-level semantic information with low-level spatial features makes
it the most appropriate choice for our study, particularly in complex terrains where flood delineation can be
challenging.
This manuscript presents a strategy for training DL models for flood extent mapping from the Sentinel-1
time series. Specifically, the algorithm is designed to cope with a pair of SAR images: a pre- and a post-event
image. We further address the problem of low label availability by implementing an attention mechanism on top
of the U-Net model. Indeed, we show how the proposed methodology implicitly adapts to data characteristics
by suppressing irrelevant regions while giving more importance (or attention) to the salient features of flooded
areas.
Study area
New South Wales (NSW) is a state on the east coast of Australia. Its coast borders the Coral and Tasman Seas
to the east. This state has a long history of flooding, which is increasing due to climate change, as shown by32.
Historical data show that after the big dry in southeast Australia ended in 2010, a subsequent heavy rain event,
heralding Australia's wettest 2-year period on record, led to significant river flooding in New South Wales and
Queensland, including the latter's capital, Brisbane.
Reference32 show that tropical climate systems and monsoonal rains dominate the northern part of Australia
during the summer wet season with a dry season for the remainder of the year. Toward the south, the climate is
characterized by winter storm activity originating in the Southern Ocean, and high-pressure summer systems
pushed southward by the monsoon32. Rainfall from tropical cyclones can penetrate the continent, causing
intermittent widespread inundation32.
Most of the continent received above-average rainfall, except the southwest corner, which in 2010 had its
driest year on record. It also coincided with warm ocean temperatures. While much of Australia experienced
extreme heat in early to mid-January 2013, from 22 to 29 January, heavy rainfall triggered severe flooding along
coastal Queensland and northern New South Wales, comparable in magnitude to the flooding in 2010.
Reference33 reported that natural disasters’ frequency, intensity, and duration are projected to increase,
including bushfires, flooding, and storms. There is also some emerging evidence that climate change could
increase the likelihood of multiple interconnected events occurring in proximity. Recently, the region has been
affected by extreme rainfall starting on 18 March 2021, which led to widespread flooding (Fig. 1).
Data
The research methodology includes two phases: Phase one addresses data augmentation strategies for generating
training data, and phase two investigates the identification of suitable hyperparameter combinations. The overall
workflow of the study is depicted in Fig. 2.
Data augmentation is used, whereby additional training data are artificially created by performing flipping,
rotation, and shear operations on the existing dataset. Additionally, to overcome the limitations of other studies,
we perform texture analysis, namely the Grey Level Co-occurrence Matrix (GLCM), on the SAR data. Texture
analysis aids in the discrimination between different land cover classes, improving the algorithm's performance.
As indicated in previous studies and reports, the study area was subjected to rapid flooding within a short
period. Hence, in order to take these challenges into consideration, prevent false conclusions, and detect the
flood extent, Sentinel-1 (S1) Interferometric Wide Swath (IW) image pairs, both Level-1 Single Look Complex
(SLC) and Level-1 Ground Range Detected High Resolution (GRDHR), were collected for two frames with
matching dates, giving four images per frame. In addition, an acquisition from 2022 was
used as an unseen frame (Table 1).
A DL attention U-Net was chosen to detect the flooded areas during the investigated period (see “Deep
learning for automated flood mapping” section). However, this requires feature engineering to create the prior
label mask of flood extensions.
Fig. 1. The area of interest (New South Wales) is shown on the left. On the right, a zoomed-in view of the
flooded areas is depicted: the South-Eastern New South Wales region is shown in the red rectangle and the
Sydney-Wollongong region in the green rectangle. Figures showcasing flood extent and the area of interest,
based on calibrated SAR GRDH images, were generated by the authors using the open-source software34,35. The
authors are authorized to access the open-source data and utilize the software because they are publicly available.

Labelling based on optical images presents challenges due to weather conditions, particularly cloud cover. To
address this, we utilized a subset of labels to validate and train the deep learning model. Unlike optical imagery,
SAR images remain unaffected by cloud cover, making them a more reliable option in such scenarios. During
peak flooding, cloud cover is often extensive, making near real-time mapping with Sentinel-2 difficult. Therefore,
Sentinel-1 provides a viable alternative for timely flood monitoring. Once cloud cover clears, Sentinel-2 data can
be used to validate the generated flood maps, ensuring accuracy and reliability36,37.
Hence, the only choice was to visually interpret the radiometric values of the microwave data, which essentially
means converting the processed (GRDHR) images into measurements of radar backscatter and thereby obtaining
samples of pixel values of the visible flood extent. The two processed acquisitions before and after the event were
carefully digitized for each frame, avoiding the adjacent existing bodies of water, and used as labels for training
our model.

Table 1. Information on the SAR data employed in the study.
The detection of water bodies in SAR imagery relies on the backscatter signature. Water generally appears
as a smooth surface with a well-defined intensity of backscatter that can be seen in the SAR image. However,
depending on environmental conditions such as landscape topography and the generated shadows, a universal
threshold for water backscatter intensity does not exist27. For these reasons, the radar backscatter coefficient
sigma naught (σo) becomes part of the solution. Equation (1) shows the σo formula used:

σo = βo · sin θ (1)

The radiometric calibration of a microwave image is mandatory for measuring backscattering values. The digital
numbers (DN) of the pixels are used to convert the SAR image into the βo image, while the σo computation
requires βo and the incidence angle at each image pixel38,39.
As shown in Fig. 1, the study area shows pronounced topographical patterns, and it is divided by natural
features into four distinct geographical sections. The Coast is a thin coastal strip stretching along 1460 km
from the subtropical Northern Rivers region near Queensland, through the mid-north Coast, Hunter, Sydney,
Illawarra, and Shoalhaven down to the cooler far south coast. The Mountains comprise the Great Dividing Range,
which includes the Snowy Mountains, the Northern, Central and Southern Tablelands, the Southern Highlands, and
the Southwest Slopes; although these mountains are not steep, many peaks rise above 1000 m, and they form
one of the oldest mountain chains on Earth. The cultivated Central Plains spread 500 km from
east to west and are the agricultural powerhouse of the NSW economy due to the rich, fertile soil and adequate
water supply. The arid Western Plains cover more than two-thirds of the state, though they
are sparsely populated compared to the coastal regions; the land is fertile but has low levels of rainfall and
inadequate river systems54.
To include the land cover features with respect to the topographic structure and to enhance the
typical CNN architecture block of the Attention U-Net, a digital elevation model (DEM) at a spatial resolution
of 30 m from the ALOS Global Digital Surface Model55 was used to derive the following parameters: slope,
aspect, roughness, ruggedness index, plan curvature, profile curvature, and total curvature by means of digital
terrain analysis56. In this case study, the parameters mentioned above did not add any value to the accuracy of
the model, and therefore, they were excluded from further analysis (see “Deep learning for automated flood
mapping” and “Discussions” sections.). The DEM was resampled to 13 m using bilinear interpolation. This
resampling method was chosen to ensure that the spatial resolution of the DEM matches the Sentinel-1 SAR
imagery, allowing for a more precise analysis of flood-prone areas. The decision to use 13 m was based on a
trade-off between computational efficiency and the need for finer topographic detail in the study area. However,
it should be noted that resampling may introduce uncertainties in regions with steep elevation changes, and
future studies should explore the use of alternative interpolation methods to minimize this effect.
Model architecture
When performing classification tasks like identifying flood footprints using the low-level and high-level
characteristics of satellite images, a convolutional neural network (CNN) can learn hierarchical feature
representations of image data57.
However, more complicated models are needed, particularly when working with SAR data, to adequately define
and segment the pixels for determining the locations of the flooded areas. In order to categorize each pixel of
the image into a binary class for the purposes of this study, we modified a variant of U-Net termed Attn-U-Net.
Attention U-Net is a variant of the U-Net model that is trained with Attention Gates (AGs), which implicitly learn
to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task58; see
Eqs. (3)-(5).
q_att^l = ψ^T σ1(W_x^T x_i + W_g^T g_i + b_g) + b_ψ (3)

α_i^l = σ2(q_att^l(x_i, g_i; Θ_att)) (4)

where x_i denotes the features from the contracting path and g_i the gating signal. The term σ2(x_i,c) represents the
sigmoid function:

σ2(x_i,c) = 1 / (1 + exp(−x_i,c)) (5)
One reason the network is well suited to this task is that it follows the traditional U-Net encoder-decoder
structure with skip connections, which preserves the structural integrity of the image and reduces the distortions
associated with plain convolution operations. The additional layers of the Attn-U-Net, namely the attention
gates59, enable the network to recognize pertinent spatial information from low-level feature maps and pass this
information to the decoding phase. Each gate takes two inputs, the vector x and the vector g. x undergoes a
strided convolution whose result is summed with the gating branch, increasing the weight of aligned features.
g originates from the network's lowest layer, which has lower dimensions and better feature representation.
The attention gates are implemented in the skip connections, where they actively inhibit activations in
unnecessary areas, lowering the number of redundant features. A scheme of the U-Net involving attention gates
is shown in Fig. 3.
Fig. 3. Schematic architecture of the Attention U-Net, encompassing the attention gates and the mask and
predicted flood extent of the two frames. The lower-right corner of the figure illustrates how the image is split
into tiles; only a subset is shown, as including all the tiles would obscure the split or make it less visible.
U-Nets. The linkage of pixels to class labels, in this case flooded area and non-flooded area, is achieved via
semantic segmentation. We employed an encoder-decoder network structure with Attn-U-Nets that enables the
categorization of each pixel into its predicted class and produces a final image that is the same size as the input.
By adopting such a network structure, a 128 × 128-pixel input image is output as 128 × 128, enabling pixel-by-
pixel segmentation of the image. We determined that this patch size is ideal because none of the other tested
sizes increased the models' accuracy. Additionally, feeding a neural network with an entire satellite image is not
computationally feasible due to its large memory footprint27. Maintaining a balance between the flood and
non-flood classes is just as necessary as maintaining accurate training labels. If not, there may be more instances
of false negatives (actual flooded areas being missed). Methods like data augmentation can therefore be quite
useful in this situation. However, labels could be placed in regions that lack important SAR amplitude
information relevant to the flooded area; in this case, augmentations could be harmful, since they might
increase the proportion of geometrically deformed pixels in the dataset (see "Discussions" section). However,
because flipping a satellite image produces new flood images that exhibit diverse flood orientations without
significantly distorting the original image, we picked vertical and horizontal random flip augmenters, as
sketched below.
The entire procedure was developed in Python, with ArcMap handling GIS processing and TensorFlow
handling machine learning61. As the loss function for training the DL model, we utilized the Dice Loss62:
Dice Loss = 1 − Σc [ (Σi=1..N Pic Gic + ε) / (Σi=1..N (Pic + Gic) + ε) ] (6)
Equation (6) depicts a 2-class Dice score coefficient (DSC) variation for class c, with Gic ∈ [0, 1] and Pic ∈ [0, 1]
representing the actual and predicted labels, N denoting the total number of pixels, and ε ensuring numerical
stability by preventing division by zero.
Dice Loss is an optimal choice for our architecture because it directly addresses class imbalance by
emphasizing the overlap between predicted and ground truth masks, which is crucial for segmentation tasks.
This makes it particularly effective for our Attention U-Net model, where accurate delineation of regions, such
as flooded areas, is essential for reliable predictions63.
We used a stochastic gradient descent (SGD) method based on adaptive estimates of first- and second-order
moments (Adam), which is effective for problems with noisy data and/or sparse gradients. Finally,
when training DL models, optimal hyperparameter combinations are essential to maximize model performance
and achieve the best possible results. To train the model, we repeatedly combined a number of hyperparameters.
Specifically, we iteratively trained the model using a variety of hyperparameter combinations: the learning rate,
batch size, number of filters, and number of epochs were (1e-3 to 1e-5), (8, 16), (8, 16, 32), and (100),
respectively.
Results
A classical form of the U-Net structure was considered for the same purpose as well. The results are
reported in Table 2. The results of the Attn-U-Net model, which achieved the highest performance, are
presented here. The model was applied to two adjacent frames of the same descending orbit, and the effect of
augmentations was evaluated for all the input combinations per S1 frame. For each combination, hyperparameters
were tuned to achieve the best possible results. The state-of-the-art model successfully identified the flood
coverage with the best overall metrics in terms of precision, recall, and F1-score for both frames. The basic
U-Net model metrics are lower than those achieved after adding an attention gate (see Table 2 and Figs. 4, 5 and 6).
Sample training and validation loss curves are provided in Fig. 6. The model achieves a performance higher than
the baseline model. Furthermore, the metrics achieved on the unseen frame, used as future test data, were
acceptable.
The F1-score is measured per pixel with a threshold greater than 0. Combinations of the VH polarization of
pre-processed S1 data, sigma naught (σo), GLCM energy and entropy, the water map, and the coherence of GRD
and SLC products were significantly more accurate than those obtained with VV over the two frames; therefore,
VV was discarded. Furthermore, the topographic input features were discarded because they decreased the
precision, recall, and F1-score alongside the different VH and VV combinations.
Discussions
The results demonstrated that the VH polarization channel provided the best performance in terms of flood
extent detection. This aligns with the findings of11, who reported that VH is more sensitive to surface roughness
and vegetation, making it ideal for detecting water surfaces in flood-prone areas. The incorporation of the
Attention U-Net architecture further improved model performance by allowing the network to focus on salient
features and suppress irrelevant regions. This was particularly beneficial in urban and forested areas where
traditional U-Net architectures struggled to delineate flood boundaries. Compared to studies that utilized
VV polarization30, our model achieved higher precision and recall due to VH’s enhanced capability to capture
subtle water-related features. However, the variability of SAR backscatter intensity in different environmental
conditions remains a source of uncertainty, which could be addressed by incorporating multi-polarization data
or integrating optical imagery.
Furthermore, data augmentation can improve the generalization potential of the model significantly without
needing to fuse radar data with any other optical data65. The focal loss metric is constructed to address the
imbalance of water/non-water, difficult/easy pixels during training66. Various data combinations were applied for
the detection of flooded pixels. Nevertheless, the topographic features resulted in significantly poorer metrics,
which may be attributed to one or more of three reasons: (1) there are too many training channels, bands,
and features, leading to the Curse of Dimensionality67, where the volume of the feature space increases so
fast that the available data becomes sparse, reducing accuracy as we further train the model iteratively;
(2) most of the topographic features are hidden in the sourced SAR C-band images because the water
elevation rises, and ultimately they are not matched by any similar features among the topographic input
features; (3) the principal architecture of the U-Net, which requires only very few training images while
yielding precise segmentations68.
The best results in this study for detecting flood extent were achieved by utilizing the VH polarization for both
trained and unseen frames11,30. The geometric calibration used in this study was Range-Doppler terrain
correction with the SRTM 1-arcsecond digital elevation model (DEM). The water map layer played an important
role in significantly increasing the evaluation metrics. Performing a GLCM analysis on the SAR data aided the
discrimination between landcover classes, improving the performance of the algorithm in this study.

Fig. 4. Flood event in the Sydney-Wollongong region. Top left: radiometrically enhanced SAR image before
the flood event. Top right: radiometrically enhanced SAR image after the flood event. Bottom left: training
data. Bottom right: flood predictions. Figures showcasing flood extent and the area of interest, based on
calibrated SAR GRDH images, were generated by the authors using the open-source software34,35; the source of
the data is64, freely available. The authors are authorized to access the open-source data and utilize the software
because they are publicly available. SNAP, [Link] Version 10. QGIS Development Team 2020. QGIS Geographic
Information System (Long term release Hannover 3.16.16). https://blog.qgis.org/2020/10/27/qgis-3-16-hannover-is-released/.
Flip, shear, and rotate augmentations were applied to the training data. Augmentations can be harmful,
since they might increase the proportion of geometrically deformed pixels in the dataset, leading to increased
false positive predictions, because they can alter the key features of flood representation in radar images69,70.
This approach enables the rapid mapping of flooded areas by processing satellite data acquired during or
immediately after the satellite’s pass over the affected area. It is crucial to emphasize that this refers specifically
to the satellite’s pass coinciding with or shortly following the event, rather than the satellite’s temporal revisit
interval, which is typically six or twelve days.
As highlighted in the literature, by leveraging optical and radar satellite imagery from the European
Commission's Copernicus Sentinel fleet and the Italian Space Agency's COSMO-SkyMed constellation, these
datasets have proven to be a very valuable reservoir of information. They enable comprehensive assessments of
the duration, extent, and severity of natural disasters70.

Fig. 5. Flood event in the South-Eastern New South Wales region. Top left: radiometrically enhanced SAR
image before the flood event. Top right: radiometrically enhanced SAR image after the flood event. Bottom left:
training data. Bottom right: flood predictions. Figures showcasing flood extent and the area of interest, based
on calibrated SAR GRDH images, were generated by the authors using the open-source software34,35; the source
of the data is64, freely available. The authors are authorized to access the open-source data and utilize the
software because they are publicly available. SNAP, https://step.esa.int/main/download/snap-download/ Version
10. QGIS Development Team 2020. QGIS Geographic Information System (Long term release Hannover
3.16.16). https://blog.qgis.org/2020/10/27/qgis-3-16-hannover-is-released/
In this context, automatic and accurate satellite-derived flood maps play a critical role in enabling swift
emergency response and comprehensive damage assessment. However, current operational approaches for flood
mapping face significant challenges, including cloud coverage in optical satellite imagery, limitations in flood
detection accuracy, and the lack of method generalization across diverse geographic regions71. Conversely, the
flexibility to utilize various sensors and advanced post-processing techniques in our study provides resilience
and enhances model performance. This adaptability enables the rapid and precise delineation of highly dynamic
flood extents, overcoming key obstacles such as cloud cover in optical data during extreme events.
Conclusions
This study successfully demonstrated the effectiveness of the Attention U-Net architecture in detecting flood
extents using Sentinel-1 SAR data. By leveraging the VH polarization and integrating texture analysis, the
model achieved high precision and recall, particularly in complex environments such as urban areas. The low
computational requirements and high performance on unseen data frames make the model suitable for real-
time flood monitoring applications. Future research should focus on integrating multi-sensor datasets, including
optical and infrared imagery, to further improve accuracy. Additionally, addressing uncertainties related to
SAR backscatter variability through advanced modelling techniques or enhanced data fusion approaches could
further enhance the robustness of flood detection systems.
The ability to perform accurate flood detection is important in assisting the government to collect timely
information about the flooded area. With the use of satellite data, we can detect floods over very large areas.
We found that the cross-polarized VH channel is the best for detecting open water areas. The reason lies in the
scattering mechanism: the more a signal scatters, the higher the likelihood that it is depolarized, and over smooth
open water the signal does not scatter multiple times, so it is barely depolarized and the cross-polarized return
remains very low.
In this study, we identified effective combinations of S1 products. In conjunction with the Attention U-Net
combined with focal loss, the model achieved state-of-the-art accuracy for the identification of challenging
boundary pixels of floodwater extent at different water depths. Data pre-processing does not require a long time
to accomplish; nevertheless, both in general and in the specific instance of this study, the pre-processing should
be designed to match the features required to be extracted from the S1 imagery, whether SLC and/or GRDH.
Another significant consideration is to feed and train the model with the most relevant data in order to decrease
the model's processing time and to avoid the curse of dimensionality. The model is recommended for use in
emergency circumstances because of the low amount of data necessary for training and its high performance on
the unseen data of the other NSW frame in this research.
A qualitative analysis of the flood spot extraction followed by a quantitative evaluation based on the F1 score
shows a high agreement with the ground truth (0.89) confirming that the proposed method allows for mapping
flood extents from S1 imagery.
Data availability
The datasets generated and/or analysed during the current study are available in the Dr. Falah Fakhri
(falahfakhri-Iraq) GitHub repository, https://github.com/falahfakhri-Iraq/Attention-U-Net-for-flood-extent-detection-in-South-New-Wales.
Code availability
The post-processed data and the code are available within the GitHub repository https://github.com/falahfakh
References
1. Firman, T., Surbakti, I., Idroes, I. & Simarmata, H. Potential climate-change related vulnerabilities in Jakarta: Challenges and
current status. Habitat Int. 35(2), 372–378 (2011).
2. Swiss Re. Flood—An underestimated risk: Inspect, inform, insure (2012). Available online: https://media.swissre.com/documents/Flood.pdf
3. Refice, A., Capolongo, D., Chini, M. & D’Addabbo, A. Improving flood detection and monitoring through remote sensing. Water
14, 36 (2022).
4. Centre for Research on the Epidemiology of Disasters. The Human Cost of Weather-Related Disasters 1995–2015. United Nations
Office for Disaster Risk Reduction, Geneva, Switzerland (2015).
5. Wu, H. et al. Real-time global flood estimation using satellite-based pre-capitation and a coupled land surface and routing model.
Water Resour. Res. 50(3), 2693–2717 (2014).
6. Rahman, M. S. & Di, L. The state of the art of spaceborne remote sensing in flood management. Nat. Hazards 85, 1223–1248.
[Link] (2017).
7. Vanama, V., Rao, Y. & Bhatt, C. Change detection-based flood mapping using multi-temporal earth observation satellite images:
2018 flood event of Kerala, India. Eur. J. Remote Sens. 54(1), 42–58 (2021).
8. Bouchard, I., Rancourt, M. -È., Aloise, D. & Kalaitzis, F. On transfer learning for building damage assessment from satellite
imagery in emergency contexts. Remote Sens. 14, 2532. [Link] (2022).
9. Serpico, S. B. et al. Information extraction from remote sensing images for flood monitoring and damage evaluation. Proc. IEEE
100, 2946–2970 (2012).
10. Schumann, G., Brakenridge, G., Kettner, A., Kashif, R. & Niebuhr, E. Assisting flood disaster response with earth observation data
and products: A critical assessment. Remote Sens. 10, 1230 (2018).
11. Carreño Conde, F. & De Mata Muñoz, M. Flood monitoring based on the study of Sentinel-1 SAR images: The Ebro River case
study. Water 11, 2454. [Link] (2019).
12. Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979).
13. Khurana, S. Comparative study on threshold techniques for image analysis. Int. J. Eng. Res. Technol.
https://doi.org/10.1109/TSMC.1979.4310076 (2015).
14. Nguyen, B. D. Automatic detection of surface water bodies from Sentinel-1 SAR images using valley-emphasis method. Vietnam J.
Earth Sci. 37(4), 328–343 (2016).
15. Bueso-Bello, J.-L. et al. The global water body layer from TanDEM-X interferometric SAR data. Remote Sens. 13, 5069.
https://doi.org/10.3390/rs13245069 (2021).
16. Katiyar, V., Tamkuan, N. & Nagai, M. Flood area detection using SAR images with deep neural network during the 2020 Kyushu
flood, Japan (2020). Available online: [Link]
17. Nachappa, T., Ghorbanzadeh, O., Gholamnia, K. & Blaschke, T. Multi-hazard exposure mapping using machine learning for the
state of Salzburg, Austria. Remote Sens. 12(17), 2757 (2020).
18. Yu, F. & Koltun, V. Multi-scale context aggregation by dilated convolutions (2016). arXiv:1511.07122. https://arxiv.org/abs/1511.07122
19. Badrinarayanan, V., Kendall, A. & Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation.
IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495. [Link] (2017).
20. Chen, L., Papandreou, G., Kokkinos, I., Murphy, K. & Yuille, A. DeepLab: Semantic image segmentation with deep convolutional
nets, atrous convolution, and fully connected CRFs (2017). Available online: [Link]
21. Eui-ik, J., Sunghak, K., Soyoung, P., Juwon, K. & Imho, C. Semantic segmentation of seagrass habitat from drone imagery based on
deep learning: A comparative study. Ecol. Inf. 66, 101430. [Link] (2021).
22. Yuan, X., Shi, J. & Lichuan, G. A review of deep learning methods for semantic segmentation of remote sensing imagery. Expert
Systems with Applications 169, 114417 (2021).
23. Zaffaroni, M. & Rossi, C. Water segmentation with deep learning models for flood detection and monitoring.
http://hdl.handle.net/2318/1832212. WiP Paper—AI Systems for Crisis and Risks. In Proceedings of the 17th ISCRAM
Conference, Blacksburg, VA, USA, May 2020 (eds Hughes, A. L., McNeill, F. & Zobel, C.) (2020).
24. Bentivoglio, R., Isufi, E., Jonkman, S.N. & Taormina, R. Deep learning methods for flood mapping: A review of existing applications
and future research directions. Hydrol. Earth Syst. Sci. (2021)
25. Kotaridis, I. & Lazaridou, M. Integration of convolutional neural networks for flood risk mapping in Tuscany, Italy. Nat. Hazards
114, 3409–3424. [Link] (2022).
26. Xu, C. et al. SAR image water extraction using the attention U-net and multi-scale level set method: flood monitoring in South
China in 2020 as a test case. Geo-spatial Inf. Sci. 25, 1–14 (2021).
27. Nemni, E., Bullock, J., Belabbes, S. & Bromley, L. Fully convolutional neural network for rapid flood segmentation in synthetic
aperture radar imagery. Remote Sens. 12(16), 2532. [Link] (2020).
28. UNOSAT 2019 Preliminary satellite derived flood assessment in White Nile State, Sudan. Product ID:2743, GLIDE:FL20190815SDN.
[Link]
29. Schumann, G. J.-P. Preface: Remote sensing in flood monitoring and management. Remote Sens. 7, 17013–17015.
https://doi.org/10.3390/rs71215871 (2015).
30. Helleis, M., Wieland, M., Krullikowski, C., Martinis, S. & Plank, S. Sentinel-1-based water and flood mapping: Benchmarking
convolutional neural networks against an operational rule-based processing chain. IEEE J. Select. Top. Appl. Earth Observ. Remote
Sens. 15, 2023–2036 (2022).
31. Bentivoglio, R., Isufi, E., Jonkman, S. N. & Taormina, R. Deep learning methods for flood mapping: A review of existing applications
and future research directions. Hydrol. Earth Syst. Sci. 8, 9. [Link] (2022).
32. Head, L., Adams, M., McGregor, H. V. & Toole, S. Climate change and Australia. Wiley Interdiscip. Rev. WIREs Clim. Change 5(2),
175–197 (2014).
33. Wood, N., Beauman, M. & Adams, P. An indicative assessment of four key areas of climate risk for the 2021 NSW Intergenerational
Report, TTRP21-05. NSW Intergenerational Report Team NSW Treasury (2021).
34. QGIS Development Team 2020. QGIS Geographic Information System (Long term release Hannover 3.16.16).
https://blog.qgis.org/2020/10/27/qgis-3-16-hannover-is-released/ Open Source Geospatial Foundation Project. [Link]
35. SNAP-ESA Sentinel Application Platform v8.0.0. [Link]
36. Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space (PDF). Remote
Sens. Environ. 58(3), 257–266. [Link] (1996).
37. Eudaric, J. et al. A satellite imagery-driven framework for rapid resource allocation in flood scenarios to enhance loss and damage
fund effectiveness. Sci. Rep. 14, 19290. [Link] (2024).
38. Ren, S., Chang, W., Jin, T. & Wang, Z. Automated SAR reference image preparation for navigation. Progress Electromagnet. Res.
121, 535–555 (2011).
39. Koo, V. C. et al. A new unmanned aerial vehicle synthetic aperture radar for environmental monitoring. Progress Electromagnet.
Res. 122, 245–268 (2012).
40. Kuhn, M. & Johnson, K. Applied Predictive Modelling. Corrected at 5th printing (Springer, 2016).
https://doi.org/10.1007/978-1-4614-6849-3
41. Mwangi, B., Tian, T. S. & Soares, J. C. A review of feature reduction techniques in neuroimaging. Neuroinformatics 12(2), 229–244.
[Link] (2014).
42. Lee, J. S. et al. Improved Sigma filter for speckle filtering of SAR imagery. IEEE Trans. Geosci. Rem. Sens. 47(1), 202–213 (2009).
43. Iqbal, N., Mumtaz, R., Shafi, U. & Zaidi, S. M. H. Gray level co-occurrence matrix (GLCM) texture based crop classification using
low altitude remote sensing platforms. PeerJ Comput. Sci. 19(7), e536. [Link] (2021).
44. Haralick, R. M., Shanmugam, K. & Dinstein, I. H. Textural features for image classification. IEEE Trans. Syst. Man. Cybern. 3,
610–621 (1973).
45. Small, D. SAR backscatter multitemporal compositing via local resolution weighting. In 2012 IEEE International Geoscience and
Remote Sensing Symposium 4521–4524. [Link]
46. ASF, Alaska Satellite Facility. Tools developed by ASF for working with SAR data, 2156 Koyukuk Drive, Fairbanks.
https://github.com/ASFHyP3
47. Braun, A., Fakhri, F. & Hochschild, V. Refugee camp monitoring and environmental change assessment of Kutupalong, Bangladesh,
based on radar imagery of Sentinel-1 and ALOS-2. Remote Sens. 11, 2047. [Link] (2019).
48. Fakhri, F. & Gkanatsios, I. O. Integration of Sentinel-1 and Sentinel-2 data for change detection: A case study in a war conflict area
of Mosul city. Remote Sens. Appl. Soc. Environ. 22, 100505. [Link] (2021).
49. Chust, G., Ducrot, D. & Pretus, J. L. Land cover mapping with patch-derived landscape indices. Landsc. Urban Plann. 69, 437–449
(2004).
50. Fahsi, A., Tsegaye, T., Tadesse, W. & Coleman, T. Incorporation of digital elevation models with Landsat-TM data to improve land
cover classification accuracy. For. Ecol. Manag. 128, 57–64 (2000).
51. Fan, H. Land-cover mapping in the Nujiang Grand Canyon: Integrating spectral, textural, and topographic data in a random forest
classifier. Int. J. Rem. Sens. 34, 7545–7567 (2013).
52. Hale, S. R. & Rock, B. N. Impact of topographic normalization on land-cover classification accuracy. Photogramm. Eng. Remote
Sens. 69, 785–791 (2003).
53. Small, D. Flattening gamma: Radiometric terrain correction for SAR imagery. IEEE Trans. Geosci. Remote Sens. 49, 3081–3093
(2011).
54. NSW Government. https://www.nsw.gov.au/about-nsw/key-facts-about-nsw#toc-geography-of-nsw
55. Takaku, J., Tadono, T., Doutsu, M., Ohgushi, F. & Kai, H. Updates of 'AW3D30' ALOS global digital surface model with other
open access datasets. In ISPRS—International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
XLIII-B4-2020, 183–189. https://doi.org/10.5194/isprs-archives-XLIII-B4-2020-183-2020 (2020).
56. Schillaci, C., Braun, A. & Kropáček, J. Terrain analysis and landform recognition. Geomorphol. Tech. 2, 1–18 (2015).
57. Marcelino, P. Transfer learning from pre-trained models. Towards Data Sci. 10, 23 (2018).
58. Oktay, O. et al. Attention U-Net: Learning where to look for the pancreas. ArXiv Prepr. (2018). https://doi.org/10.48550/arXiv.1804.03999
59. Abraham, N. & Khan, N. M. A novel focal Tversky loss function with improved attention U-Net for lesion segmentation. In
2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) 683–687. [Link]
(2019).
60. Pan, Z., Xu, J., Guo, Y., Hu, Y. & Wang, G. Deep learning segmentation and classification for urban village using a worldview
satellite image based on U-Net. Remote Sens. 12, 1574. [Link] (2020).
61. Marcham, F. TensorFlow: Large-scale machine learning on heterogeneous distributed systems (Preliminary White Paper,
November 9, 2015). arXiv 2016. arXiv:1603.04467
62. Milletari, F., Navab, N. & Ahmadi, S. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In
2016 Fourth International Conference on 3D Vision (3DV) 565–571. [Link] (2016).
63. Sudre, C. H., Li, W., Vercauteren, T., Ourselin, S. & Jorge Cardoso, M. Generalised Dice overlap as a deep learning loss function
for highly unbalanced segmentations. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision
Support. DLMIA ML-CDS 2017. Lecture Notes in Computer Science Vol. 10553 (eds Cardoso, M. et al.) (Springer, 2017).
https://doi.org/10.1007/978-3-319-67558-9_28.
64. Copernicus Service information [2021–2022]. https://sentinels.copernicus.eu/documents/247904/690755/Sentinel_Data_Legal_Notice
65. Bai, Y. et al. Enhancement of detecting permanent water and temporary water in flood disasters by fusing Sentinel-1 and Sentinel-2
imagery using deep learning algorithms: Demonstration of Sen1Floods11 benchmark datasets. Remote Sens. 13, 2220.
https://doi.org/10.3390/rs13112220 (2021).
66. Goyal, P. & Kaiming, H. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer
Vision, Venice, Italy, 22–29 October 2017 2980–2988.
67. Debie, E. & Shafi, K. Implications of the curse of dimensionality for supervised learning classifier systems: Theoretical and
empirical analyses. Pattern Anal. Appl. [Link] (2019).
68. Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. (Computer Science
Department and BIOSS Centre for Biological Signalling Studies, 2015).
69. Wangiyana, S., Samczyński, P. & Gromek, A. Data augmentation for building footprint segmentation in SAR images: An empirical
study. Remote Sens. 14, 2012. [Link] (2022).
70. Tapete, D. & Cigna, F. Poorly known 2018 floods in Bosra UNESCO site and Sergiopolis in Syria unveiled from space using
Sentinel-1/2 and COSMO-SkyMed. Sci. Rep. 10, 12307. [Link] (2020).
71. Portalés-Julià, E. et al. Global flood extent segmentation in optical satellite images. Sci. Rep. 13, 20316.
https://doi.org/10.1038/s41598-023-47595-7 (2023).
72. Nava, L., Bhuyan, K., Meena, S. R., Monserrat, O. & Catani, F. Rapid mapping of landslides on SAR data by attention U-Net. Remote
Sens. 14, 1449. [Link] (2022).
Acknowledgements
Thanks to Dr. Leila Hashemi-Beni, Kushanav Bhuyan and Dr. Francescopaolo Sica for their early support of this
work.
Author contributions
F.F. designed the project and pre-processed the data. F.F. and I.G. wrote the introduction. F.F. wrote the
literature review. F.F. wrote the first draft. F.F. made the last revision.
Funding
This research received no external funding.
Declarations
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to F.F.
Reprints and permissions information is available at [Link]/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide
a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have
permission under this licence to share adapted material derived from this article or parts of it. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by-nc-nd/4.0/.