In view of the above challenges, numerous methods have been proposed for multimodal matching in the past two decades. These multimodal matching methods can be generally grouped into three categories: area-based methods, feature-based methods, and learning-based methods [12], [13]. With the development of deep learning technology, learning-based matching methods exhibit excellent matching performance and have developed into a pipeline that cannot be ignored in the field of MRSIM [14]. Wang et al. [15] presented an effective deep neural network that optimizes the whole processing (learning the mapping function) through information feedback, and transfer learning was used to improve their framework's performance and efficiency. Addressing the different stages of feature matching, Hughes et al. [16] proposed a fully automated SAR–optical matching framework composed of a goodness network, a correspondence network, and an outlier reduction network, and each of these subnetworks has been proven to individually improve the matching performance. Furthermore, Zhou et al. [17] employed deep learning techniques to refine structure features and designed the multiscale convolutional gradient features (MCGFs) by utilizing a shallow pseudo-Siamese network. Similarly, Quan et al. [18] exploited more similar features using a self-distillation feature learning network (SDNet) for optimization enhancement of deep networks, which achieved robust matching of optical–SAR images. Ye et al. [19] designed a multiscale framework without costly ground-truth labels and a novel loss function paradigm based on structural similarity, which can directly learn the end-to-end mapping from multimodal image pairs to their transformation parameters. Their matching framework also performs steadily and is robust to nonlinear radiometric differences (NRD) between multimodal image pairs.

Although learning-based matching methods can significantly improve their resistance to geometric and radiation distortion by extracting finer common features than traditional handcrafted features, the main limitations of this pipeline are also significant. On the one hand, supervised learning-based methods often rely on a large amount of training data [15], [16], [17], [18], and the transferability of the trained model is poor, so their matching performance generally drops sharply on different test datasets. On the other hand, although unsupervised learning methods can overcome the dependence on training data, the process of converting the various parameters is very complex [19], and their efficiency depends on the basic configuration of the computing infrastructure. These deficiencies limit the widespread application of the learning-based pipeline in multimodal matching fields.

Generally, traditional area-based methods identify correspondences by selecting some classical similarity metrics to evaluate the similarity of intensity information within a template window. There are three commonly used similarity metrics in the spatial domain: sum of squared differences (SSD), normalized cross correlation (NCC), and mutual information (MI). In addition, phase correlation is the most commonly used similarity metric in the frequency domain because of its illumination invariance [20]. Recently, a structure feature-based pipeline has been developed for NRD between multimodal images. The methods of this pipeline evaluate the similarity of generated features rather than intensity information by using the above similarity metrics (i.e., SSD, NCC, and phase correlation). Histogram of orientated phase congruency (HOPC) [21], phase congruency structural descriptor (PCSD) [22], channel features of orientated gradients (CFOG) [23], and optical–SAR phase correlation (OS-PC) [24] are the most representative ones. However, area-based matching methods are very sensitive to geometric distortions (i.e., scale and rotation deformations) between images and usually require a georeferencing implementation to eliminate the significant global geometric distortions [25].

In contrast, feature-based matching methods rely on salient and distinctive features (i.e., points, lines, and regions) between images and are more robust to geometric distortions and NRD compared with area-based methods [26]. Among these methods, point features are the most common local invariant features in the remote sensing domain. This matching pipeline usually consists of two key components: feature detection and feature description. In the past several decades, feature matching methods for monomodal images have been well studied, and many classical feature detectors and feature descriptors have been developed. These traditional feature detectors detect salient features between images based on the gradient information of the images, such as Moravec [27], Harris [28], differences of Gaussian (DoG) [29], and features from accelerated segment test (FAST) [30]. Nevertheless, it is difficult for these gradient-based detectors to detect interest points (IPs) with high repeatability among multimodal images. According to the inherent properties of optical and SAR images, Xiang et al. [31] constructed two Harris scale spaces to extract IPs by designing consistent gradients for optical and SAR images, utilizing the multiscale Sobel and multiscale ratio of exponentially weighted averages operators, respectively. Furthermore, some studies have found that the use of the phase congruency (PC) model can effectively resist significant NRD and extract more stable and repeatable IPs than using only the gradient information. Ye et al. [32] combined the minimum moment of PC with the Laplacian of Gaussian (MMPC-Lap) to detect stable IPs in image scale space. Subsequently, Li et al. [33] detected corner feature points and edge feature points on the minimum moment map and the maximum moment map of the PC, respectively. Although these PC-based feature detectors have a certain resistance to NRD between multimodal images, they come at the cost of high computational complexity.

Once the feature detection of MRSIs is completed, corresponding local invariant feature descriptors must be explored. Similarly, the construction of many well-known feature descriptors also utilizes the gradient information of the images, and hence, they cannot achieve robust matching of MRSIs with both geometric distortion and radiation differences. Scale-invariant feature transform (SIFT) [29], gradient location and orientation histogram (GLOH) [34], DAISY [35], and their improved variants [31], [36], [37] are the most representative feature descriptors. In particular, as shown in Fig. 1,
such significant intensity differences and severe speckle noise in multimodal images will further degrade the matching performance of these gradient-based descriptors, making it difficult to identify accurate correspondences.

A recently popular pipeline for radiation-robust description relies on structural features, which are more resistant to modality variations than the gradient-based descriptions discussed in the literature above. With a number of descriptors derived from structural features having been developed for multimodal image matching, the most commonly used feature descriptors can be divided into two categories. The former is based on local self-similarity (LSS) descriptors that utilize a log-polar spatial structure, which can effectively capture the internal geometric composition of self-similarities within local image patches and are less sensitive to significant NRD to a certain extent [38], [39], [40], [41]. Ye and Shan [40] introduced the LSS descriptor as a new similarity metric to detect correspondences for the matching of multispectral remote sensing images. Based on LSS, a shape descriptor named dense local self-similarity (DLSS) was further designed for optical and SAR image matching [41]. Sedaghat and Mohammadi [38] improved the distinctiveness of the histogram of oriented self-similarity (HOSS) descriptor by adding directional attributes to the image patches on which the self-similarity values are computed. To address the computational complexity of LSS, Xiong et al. [39] proposed a feature descriptor named oriented self-similarity (OSS), which uses offset mean filtering to calculate the self-similarity features quickly based on the symmetry of the self-similarity. However, these descriptors still have limitations, because the relatively low discriminative capability of LSS descriptors may lead to an inability to maintain robust matching performance in some multimodal matching cases [22].

The other structural-feature route to radiation-robust description utilizes the PC model, which is based on the position perception feature of the maximum Fourier component [42]. Given that the PC model is more robust to illumination and contrast changes compared with gradient information, many PC-based descriptors have been developed [32], [33], [43], [44]. Ye et al. [32] presented a local HOPC (LHOPC) descriptor by combining the extended PC model and the arrangement of DAISY. Li et al. [33] developed a radiation-variation-insensitive feature transform (RIFT) method, in which a maximum index map (MIM) was introduced based on the PC model for feature description. Xiang et al. [44] improved different PC models to construct features for the matching of optical and SAR images. In a similar work, Fan et al. [43] designed a multiscale PC descriptor, named multiscale adaptive binning phase congruency (MABPC), which uses an adaptive binning spatial structure to encode multiscale phase congruency features while improving its robustness to geometric and radiometric discrepancies. Nevertheless, in the process of feature description, the above methods either lack rotation invariance [44], rely on time-consuming loop traversal based on the log-Gabor convolution sequence to achieve rotation invariance [33], or estimate the dominant orientation by combining an orientation histogram with local PC features, which is also time-consuming and easily prone to generating outliers [32], [43], [45] and thus vastly affects the final matching performance.

Although numerous efforts have been made to enhance the robustness of MRSIM, the current feature detectors are still not efficacious in terms of IP repeatability, and feature descriptors remain challenged by rotation invariance. To address the aforementioned limitations of the pivotal components in feature matching, we present a robust and efficient feature-based method (called R2FD2) for multimodal matching in this work. First, to improve the repeatability of feature detection, we construct a repeatable feature detector called the multichannel autocorrelation of the log-Gabor (MALG). The MALG detector combines the multichannel autocorrelation strategy with the log-Gabor wavelets and is capable of extracting evenly distributed IPs with high repeatability. Subsequently, we build a rotation-invariant feature descriptor named the rotation-invariant maximum index map of the log-Gabor (RMLG). The RMLG descriptor consists of a fast assignment strategy of dominant orientation and an advanced descriptor configuration. In the process of fast assignment of the dominant orientation, we propose a novel rotation-invariant MIM (RMIM) to achieve reliable rotation invariance. Then, the RMLG descriptor incorporates the rotation-invariant RMIM with the spatial configuration of DAISY to depict discriminative features of multimodal images, aiming to construct a feature representation that is as robust as possible against differences in radiation and rotation.

The following is a summary of the main contributions.
1) A repeatable feature detector called MALG is defined to detect evenly distributed IPs with high repeatability.
2) A rotation-invariant feature descriptor named RMLG is constructed based on the RMIM with rotation invariance and the spatial configuration of DAISY.
3) The presented R2FD2 matching method, consisting of the MALG detector and RMLG descriptor, is quantitatively and qualitatively evaluated against existing state-of-the-art methods using various types of MRSIs.

The remainder of this article is organized as follows. The proposed multimodal feature matching method is introduced in Section II, with an emphasis on the construction of the MALG detector and RMLG descriptor. Section III examines and evaluates the matching performance of the proposed R2FD2 by conducting experiments on various multimodal image pairs. Finally, the conclusion is summarized in Section IV.

II. METHODOLOGY

In this section, a fast and robust method (named R2FD2), involving the MALG detector and the RMLG descriptor, is proposed to improve the matching performance of multimodal images. Specifically, the MALG detector is first presented to detect IPs with high repeatability between multimodal image pairs. Then, the RMLG descriptor is employed to robustly depict the local invariant characteristics of the detected IPs. The flowchart of the proposed R2FD2 is shown in Fig. 2 and is elaborated in detail below.
A. Construction of MALG Detector

As mentioned above, the PC model is more resistant to significant NRD between multimodal images compared with gradient information, and there have been relevant studies [32], [33] using the PC model to extract stable IPs. Nevertheless, detectors using the minimum moment or maximum moment of PC may lose multidirectional features and are computationally expensive, because these moments are the weighted responses of PC in different orientations and represent the moment changes with orientation [46]. Also, the responses of PC in different orientations are calculated by making use of log-Gabor wavelets because of their good antinoise and edge extraction performance [47]. In order to improve the reliability of the feature detector while ensuring the high repeatability of IPs, in this article, the MALG detector is proposed by incorporating the multichannel autocorrelation strategy with the log-Gabor wavelets for IP detection.

Given that good noise suppression and edge preservation are two crucial characteristics of an excellent feature detector [48], the 2-D log-Gabor wavelets are employed during the construction of the proposed MALG detector. They can provide a useful description of edge feature information at multiple orientations and multiple scales from multimodal image pairs, which is suitable for describing the local structure of multimodal images. Generally, a 2-D log-Gabor filter is expressed as follows:

$$LG_{s,o}(f,\theta)=\exp\left(-\frac{(\log(f/F_s))^2}{2(\log\beta)^2}\right)\exp\left(-\frac{(\theta-\theta_o)^2}{2\delta_\theta^2}\right)\quad(1)$$

where o and s represent the orientation and scale of the log-Gabor filter, respectively; β determines the bandwidth of the filter; f and F_s define the frequency and central frequency of the filter, respectively; δ_θ is the angular bandwidth; and θ_o represents the filter's orientation.

Since the 2-D log-Gabor is a frequency-domain filter, its expression in the space domain can be obtained by the inverse Fourier transform based on the corresponding frequency response of the log-Gabor filters in polar coordinates [49]. Therefore, the 2-D log-Gabor function in the space domain can be typically decomposed into an even-symmetric filter and an odd-symmetric filter, which is defined as follows:

$$LG(x,y,s,o)=LG_{even}(x,y,s,o)+i\cdot LG_{odd}(x,y,s,o)\quad(2)$$

where the real component LG_even(x, y, s, o) and the imaginary component LG_odd(x, y, s, o) represent the even- and odd-symmetric filters, respectively, of the log-Gabor wavelets at scale s with orientation o.

Accordingly, the space response components E(x, y, s, o) and O(x, y, s, o) of the log-Gabor filters can be yielded by convolving the image I(x, y) with the even- and odd-symmetric filters

$$\begin{cases}E(x,y,s,o)=I(x,y)\ast LG_{even}(x,y,s,o)\\O(x,y,s,o)=I(x,y)\ast LG_{odd}(x,y,s,o).\end{cases}\quad(3)$$

Then, the amplitudes of the log-Gabor for all N_s scales are summed at orientation o to obtain the multichannel log-Gabor features, which are formally defined as follows:

$$\begin{cases}A(x,y,s,o)=\sqrt{E_{s,o}^{2}(x,y)+O_{s,o}^{2}(x,y)}\\A=\{A_i(x,y,o)\}_{1}^{N_o}=\sum_{s=1}^{N_s}A(x,y,s,o)\end{cases}\quad(4)$$

where A(x, y, s, o) is the amplitude component of I(x, y) at scale s and orientation o, N_s represents the number of scales of the log-Gabor filter banks, N_o represents the number of orientations of the log-Gabor filter banks, and the sum in (4) runs over the log-Gabor filter banks at the different scales for orientation o. Also, A_i(x, y, o) equals the amplitude response of the log-Gabor at the location (x, y) for orientation o, with i = 1, 2, . . . , N_o. In this article, N_s = 4 and N_o = 6 are fixed values.
Fig. 3. Schematic of extracted IPs by the proposed MALG detector. (a) Optical–infrared image pairs (repeatability = 55.87%). (b) Optical–depth image
pairs (repeatability = 44.10%). (c) Optical–SAR image pairs (repeatability = 36.79%).
For the multichannel log-Gabor features A_i(x, y, o), the self-similarity of the log-Gabor features of each orientation after a shift (Δx, Δy) at the location (x, y) can be yielded by the following autocorrelation function:

$$CA^{o}=\{CA_i(x,y,\Delta x,\Delta y,o)\}_{1}^{N_o}=\sum_{(u,v)\in W(x,y)}w(u,v)\left[A^{o}(u,v,o)-A^{o}(u+\Delta x,v+\Delta y,o)\right]^{2}\quad(5)$$

where W(x, y) is a window centered at the location (x, y), and w(u, v) is a weighting function, which is either a constant or a Gaussian weighting function. According to the Taylor expansion, a first-order approximation is performed after shifting (Δx, Δy) for the log-Gabor feature of each channel

$$A^{o}(u+\Delta x,v+\Delta y,o)=A^{o}(u,v,o)+A_{x}^{o}(u,v,o)\Delta x+A_{y}^{o}(u,v,o)\Delta y+O(\Delta x^{2}+\Delta y^{2})\approx A^{o}(u,v,o)+A_{x}^{o}(u,v,o)\Delta x+A_{y}^{o}(u,v,o)\Delta y\quad(6)$$

where A_x^o and A_y^o are the partial derivatives of the log-Gabor feature in the corresponding orientation o. Therefore, the above autocorrelation function for each orientation can be simplified as

$$CA^{o}=\{CA_i(x,y,\Delta x,\Delta y,o)\}_{1}^{N_o}=[\Delta x,\Delta y]\,M(x,y,o)\begin{bmatrix}\Delta x\\\Delta y\end{bmatrix}\quad(7)$$

with M(x, y, o) denoting the autocorrelation matrix of orientation o defined as

$$M(x,y,o)=\begin{bmatrix}\sum_{W}w\,(A_{x}^{o}(x,y,o))^{2}&\sum_{W}w\,A_{x}^{o}(x,y,o)A_{y}^{o}(x,y,o)\\\sum_{W}w\,A_{x}^{o}(x,y,o)A_{y}^{o}(x,y,o)&\sum_{W}w\,(A_{y}^{o}(x,y,o))^{2}\end{bmatrix}.\quad(8)$$

The autocorrelation matrices of all N_o orientation channels are then accumulated into the comprehensive autocorrelation matrix M_com

$$M_{com}(x,y)=\begin{bmatrix}\sum_{o=1}^{N_o}\sum_{W}w\,(A_{x}^{o}(x,y,o))^{2}&\sum_{o=1}^{N_o}\sum_{W}w\,A_{x}^{o}(x,y,o)A_{y}^{o}(x,y,o)\\\sum_{o=1}^{N_o}\sum_{W}w\,A_{x}^{o}(x,y,o)A_{y}^{o}(x,y,o)&\sum_{o=1}^{N_o}\sum_{W}w\,(A_{y}^{o}(x,y,o))^{2}\end{bmatrix}.\quad(9)$$

The autocorrelation response value R of the multichannel log-Gabor features at each pixel is calculated by utilizing the comprehensive autocorrelation matrix M_com

$$R=\det[M_{com}(x,y)]-\alpha\,[\mathrm{trace}\,M_{com}(x,y)]^{2}\quad(10)$$

where det[M_com(x, y)] is the determinant of the matrix M_com, and trace M_com(x, y) is its trace. Also, α is a constant ranging from 0.04 to 0.06. Finally, the local maxima of R are extracted as candidate IPs, and nonmaximum suppression is carried out to discard closely adjacent IPs; that is, the first N local extrema with the largest response values are selected as the final IPs by our MALG detector.

Moreover, Fig. 3 presents three illustrations of the IPs extracted by the proposed MALG detector; specifically, Fig. 3(a)-(c) shows the IPs extracted from optical–infrared, optical–depth, and optical–SAR image pairs, respectively. As seen, our MALG detector is capable of extracting IPs with high repeatability and uniform distribution between multimodal image pairs. The definition of repeatability and a more detailed performance evaluation of MALG are given in Section III-B.
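Under the assumption that the weighted sums in (8) and (9) are realized as Gaussian smoothing of the channel gradient products, which is the standard implementation of Harris-style structure tensors, the detection stage of (8)-(10) can be sketched as follows; the smoothing scale, α = 0.05 (within the 0.04-0.06 range stated above), and the nonmaximum-suppression radius are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def malg_response(A, sigma=2.0, alpha=0.05):
    """Response R of Eq. (10) from the multichannel features A of shape (No, H, W)."""
    H, W = A.shape[1:]
    Sxx, Syy, Sxy = np.zeros((H, W)), np.zeros((H, W)), np.zeros((H, W))
    for Ao in A:                                   # accumulate over the No channels, Eq. (9)
        Ay, Ax = np.gradient(Ao)                   # partial derivatives A_y^o and A_x^o
        Sxx += gaussian_filter(Ax * Ax, sigma)     # Gaussian-weighted sums of Eq. (8)
        Syy += gaussian_filter(Ay * Ay, sigma)
        Sxy += gaussian_filter(Ax * Ay, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - alpha * trace ** 2                # Eq. (10)

def detect_ips(R, n_points=1000, nms_radius=5):
    """Nonmaximum suppression, then keep the n_points strongest local maxima as IPs."""
    local_max = R == maximum_filter(R, size=2 * nms_radius + 1)
    ys, xs = np.nonzero(local_max)
    keep = np.argsort(R[ys, xs])[::-1][:n_points]
    return np.stack([xs[keep], ys[keep]], axis=1)  # (x, y) coordinates
```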
B. Establishment of RMLG Descriptor

Once repeatable IPs have been extracted, the next critical step is to design a robust feature descriptor with the intent of increasing the distinctiveness of the features. A feature descriptor usually consists of two components: assignment of the dominant orientation and construction of the feature representation. However, as mentioned earlier, gradient-based descriptors are very sensitive to NRD, and the existing structural feature-based descriptors either rely on time-consuming loop traversal based on the log-Gabor convolution sequence to achieve rotation invariance [33] or complicatedly assign the dominant orientation by combining an orientation histogram with local PC features [32], [43], [45]. Therefore, it is difficult for these descriptors to achieve fast and robust multimodal image matching.

1) Fast Assignment of Dominant Orientation: From the previous equation (4), we can get the multichannel log-Gabor features A^o, that is, the log-Gabor response sequence {A_i(x, y, o)}_1^{N_o}. Nevertheless, the log-Gabor response sequence does not possess rotation invariance, in contrast to the gradient map. This means that rotating the log-Gabor response sequence will not yield the corresponding log-Gabor response sequence of the rotated image patch. To obtain rotation invariance, the MIM and the circular (cyclic-shift) effect were proposed by means of loop traversal [33]. The calculation of the MIM is given as follows:

$$\mathrm{MIM}(x,y)=\arg\max_{o}\{A_i(x,y,o)\}_{1}^{N_o}\quad(11)$$

where arg max_o represents the orientation index corresponding to the maximum value in the log-Gabor response sequence {A_i(x, y, o)}_1^{N_o}.

Furthermore, Yu et al. [45] assigned the dominant orientation by combining the orientation histogram with the amplitudes and orientations of PC, and auxiliary orientations were also estimated in the same way as in SIFT. Then, the corresponding MIM patch was rotated by the dominant or auxiliary orientations; subsequently, the index of the rotated MIM patch for the reference and the sensed image was cyclically shifted by k_ref and k_sen positions, respectively

$$k_{ref}=\mathrm{round}\left(\frac{rotation}{180^{\circ}/N_o}\right)\quad(12)$$

$$k_{sen}=\begin{cases}\mathrm{ceil}\left(\dfrac{rotation}{180^{\circ}/N_o}\right)\\[4pt]\mathrm{floor}\left(\dfrac{rotation}{180^{\circ}/N_o}\right)\end{cases}\quad(13)$$

where round indicates the rounding operation, and ceil and floor represent the round-up and round-down operations, respectively.

Specifically, one feature vector was constructed for each orientation of the IPs in the reference image, and two feature vectors were constructed for each orientation of the IPs in the sensed image. The aforementioned loop traversal strategy to achieve rotation invariance was very time-consuming [33]. Meanwhile, the estimation of the dominant orientation by Yu et al. [45] required calculating the complex amplitudes and orientations of PC, increasing the auxiliary directions of the feature points, and designing two feature vectors for each orientation of the IPs in the sensed image, which further makes their descriptor time-consuming.

We note that the essence of the orientation histogram in the SIFT descriptor is to count the gradient amplitudes and orientations of the pixels in the neighborhood, and the orientation corresponding to the peak of the histogram represents the dominant direction of the IPs. The range of the gradient orientation is [0°, 360°] and is continuous, while the value range of the MIM based on the log-Gabor response sequence is [1, N_o] and is very discrete. Therefore, there are many redundant orientation estimations based on the orientation histogram, because the index of the rotated MIM patch needs to be cyclically shifted by k_ref and k_sen positions in the reconstruction of the MIM.

Inspired by the orientation histogram of SIFT and combined with the above analysis, we design a fast assignment strategy for the dominant orientation, in which a novel MIM with rotation invariance is obtained by a statistical measure based on the MIM. This strategy avoids weighting the histogram calculation by trilinear interpolation, which computes the weight of each pixel over the spatial and directional bins. Specifically, the fast assignment strategy of the dominant orientation is calculated as follows.

The essence of the orientation histogram is to count the gradient amplitudes and orientations of the pixels in the neighborhood, while the MIM itself carries the directional characteristics of the log-Gabor convolution sequence. Hence, we directly count the value with the most occurrences in the MIM (denoted as C_MIM) and use it to derive the dominant orientation of the IPs, which can be expressed as follows:

$$\begin{cases}C_{MIM}=\mathrm{mode}[\mathrm{MIM}(x,y)]\\DO=C_{MIM}\cdot\dfrac{180^{\circ}}{N_o}\end{cases}\quad(14)$$

where mode represents the operation that calculates the sample mode of the MIM, that is, the value that appears most often in the MIM, and DO represents the dominant orientation. Fig. 4 shows the feasibility of the above strategy for calculating the dominant orientation on differently rotated images. Given a reference image without rotation, its corresponding sensed image without rotation, and the sensed image rotated by 90°, we select a pair of corresponding IPs between these images, and then their dominant orientations are computed. It is not difficult to find that the dominant orientation obtained by the proposed strategy is the same in each case, and this example preliminarily indicates that the proposed strategy for the dominant orientation is feasible.

What follows is the reconstruction of the MIM: C_MIM is used to calculate the new MIM based on the following equation:

$$\begin{cases}\mathrm{MIM}_{new}(x,y)=\mathrm{MIM}(x,y)-C_{MIM}+1\\\mathrm{MIM}_{new}(x,y)=\mathrm{MIM}_{new}(x,y)+N_o,&\mathrm{MIM}_{new}(x,y)<1.\end{cases}\quad(15)$$

Actually, MIM_new(x, y) represents the new MIM that is recalculated by circularly shifting the C_MIM-th layer of the log-Gabor convolution sequence to be the first layer of the sequence. Finally, the novel MIM with rotation invariance (named RMIM) can be obtained by rotating the recalculated MIM by the dominant orientation

$$\mathrm{RMIM}=\mathrm{rotate}[\mathrm{MIM}_{new}(x,y),DO].\quad(16)$$
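A compact sketch of (11) and (14)-(16) is given below: the MIM is an arg max over the orientation channels, C_MIM is the sample mode of the MIM patch around an IP, the orientation indices are cyclically re-indexed so that the C_MIM-th layer becomes the first, and the patch is rotated by the dominant orientation. The patch handling and the nearest-neighbor rotation are implementation assumptions.

```python
import numpy as np
from scipy.ndimage import rotate

def maximum_index_map(A):
    """Eq. (11): orientation index in [1, No] of the maximal channel response."""
    return np.argmax(A, axis=0) + 1

def dominant_orientation(mim_patch, No=6):
    """Eq. (14): C_MIM is the sample mode of the MIM patch; DO = C_MIM * 180/No."""
    values, counts = np.unique(mim_patch, return_counts=True)
    c_mim = int(values[np.argmax(counts)])
    return c_mim, c_mim * 180.0 / No               # (C_MIM, DO in degrees)

def rmim(mim_patch, No=6):
    """Eqs. (15) and (16): cyclically re-index the MIM, then rotate it by DO."""
    # mim_patch: integer array of orientation indices in [1, No]
    c_mim, do = dominant_orientation(mim_patch, No)
    mim_new = mim_patch - c_mim + 1                # Eq. (15), first case
    mim_new[mim_new < 1] += No                     # wrap indices smaller than 1
    # nearest-neighbor rotation keeps the values valid orientation indices;
    # pixels rotated in from outside the patch are filled with 0 by default
    return rotate(mim_new, do, reshape=False, order=0)
```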
Fig. 5. Comparison of rotation invariance for the MIM and RMIM. (a) Input image. (b) MIM of (a). (c) MIM_new of (a). (d) RMIM of (a). (e) Error map of MIM. (f) Rotated 30° image. (g) MIM of (f). (h) MIM_new of (f). (i) RMIM of (f). (j) Error map of RMIM.

Fig. 6. Three different spatial arrangements for feature description. (a) SIFT. (b) GLOH. (c) DAISY.
TABLE I
Average Matching Performance of the RMLG Descriptor With Different Spatial Arrangements
Fig. 8. Comparison of the repeatability and distributions of IPs between the FASTMPC and our MALG detector. Contrastive example of extracted IPs for
(a) optical–infrared image pairs, (b) optical–LiDAR image pairs, and (c) optical–SAR image pairs.
The detailed process is given as follows. First, we randomly selected two sets of image pairs without rotation from the above MRSIs datasets for experimentation. Then, one of the selected image pairs was rotated from 0° to 360° with an interval of 10°, and a total of 72 rotated images were obtained. These rotated images and their corresponding optical images constitute 72 pairs of test cases, which were finally matched by our R2FD2. Two examples of the rotation invariance tests of our R2FD2 are shown in Fig. 9.

The NCMs are marked with red dots, and it can be clearly seen that the NCMs at all rotation angles were not less than 100 and that more than half of the NCMs were greater than 300. What is more, the matching success rate (SR) at all rotation angles was up to 100%, which further verifies that our R2FD2 can maintain rotation invariance in the range of [0°, 360°]. Fig. 10 shows the matching results for several rotation angles (30°, 130°, 240°, and 350°) and their corresponding registration results. As can be seen, the distribution of correspondences is relatively uniform, and the checkerboard maps of the registration results are aligned correctly.

D. Matching Performance Evaluation of R2FD2

In this section, we compared our R2FD2 with five state-of-the-art matching methods: HOSS, RIFT, RI-ALGH, MS-HLMO, and HOWP. Each image pair of the above MRSIs datasets was rotated from 0° to 180° with an interval of 10°, and a total of 1140 (19 × 20 × 3) rotated images were obtained for experimentation. The correct matches of each image pair were manually determined by selecting 10-20 evenly distributed correspondences to estimate the projective model (denoted as P_truth). Matched correspondences with residuals of less than three pixels under the estimated projective model P_truth were considered correct matches. For the quantitative evaluation, we employed four criteria to evaluate the performance of each matching method: NCM, SR, root-mean-square error (RMSE), and running time (RT). Among them, NCM represents the number of correctly matched correspondences. If the NCM was less than ten, the corresponding image pair was marked as a matching failure. The RMSE can be calculated as follows:

$$\mathrm{RMSE}=\sqrt{\frac{\sum_{i=1}^{N}\left\|R(x,y)-P_{truth}\ast S(x,y)\right\|^{2}}{N}}\quad(19)$$

where R(x, y) and S(x, y) represent the correct matches of the reference and sensed images, respectively, P_truth is the projective model, and N represents the NCM. The smaller the RMSE, the higher the accuracy of the matching
method. The definition of SR is given as follows:

$$\mathrm{SR}=\frac{\sum_{i}I(P_i)}{T}\times100\%\quad(20)$$

$$I(P_i)=\begin{cases}1,&\mathrm{NCM}(P_i)\geq10\\0,&\text{else}\end{cases}\quad(21)$$

where T represents the total number of image pairs in a multimodal image set, I(P_i) is a logical value in which 1 represents a successful matching trial and 0 represents a failed matching trial, and NCM(P_i) represents the NCM of the ith image pair. SR is thus the ratio of the number of successfully matched image pairs to the total number of image pairs. The larger the values of NCM and SR, the stronger the robustness of the corresponding matching method.

Fig. 11. Comparisons of average NCM criteria for different matching methods. Average NCM of (a) optical–infrared datasets, (b) optical–LiDAR datasets, and (c) optical–SAR datasets.

Fig. 12. Comparisons of SR criteria for different matching methods. SR of (a) optical–infrared datasets, (b) optical–LiDAR datasets, and (c) optical–SAR datasets.
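As an illustration of this evaluation protocol, the sketch below computes the NCM, the RMSE of (19) over the correct matches, and the SR of (20) and (21) over a dataset, assuming that P_truth is represented as a 3 × 3 projective (homography) matrix acting on homogeneous coordinates; the helper names are hypothetical.

```python
import numpy as np

def project(P, pts):
    """Apply a 3 x 3 projective model P to an N x 2 array of points."""
    h = np.hstack([pts, np.ones((len(pts), 1))]) @ P.T
    return h[:, :2] / h[:, 2:3]

def ncm_and_rmse(ref_pts, sen_pts, P_truth, tol=3.0):
    """Residual < tol pixels counts as a correct match; Eq. (19) on those matches."""
    resid = np.linalg.norm(ref_pts - project(P_truth, sen_pts), axis=1)
    correct = resid < tol
    ncm = int(correct.sum())
    rmse = float(np.sqrt(np.mean(resid[correct] ** 2))) if ncm else np.inf
    return ncm, rmse

def success_rate(ncm_per_pair, min_ncm=10):
    """Eqs. (20) and (21): percentage of image pairs with NCM >= 10."""
    return 100.0 * np.mean([n >= min_ncm for n in ncm_per_pair])
```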
As shown in Fig. 11, the comparison results of the average NCM criterion for the different matching methods on each multimodal image dataset are demonstrated. Here, the average NCM refers to the average of all NCMs over the 19 sets of images generated from each image pair at intervals of 10° from 0° to 180°. As can be seen, MS-HLMO matched the fewest NCMs for all types of multimodal image pairs, followed by HOSS. This may be related to the fact that MS-HLMO utilizes a Harris-based function to detect IPs, which usually yields fewer extracted IPs than detectors such as MALG and FASTMPC. The average NCM criteria of HOSS and RIFT were comparable on the optical–infrared and optical–LiDAR datasets, while the average NCMs of HOSS dropped sharply on the optical–SAR dataset, where RIFT achieved more NCMs than HOSS. This could be caused by the relatively low discriminative capability of HOSS based on the LSS description, which leads to an inability to maintain robust matching performance on the optical–SAR dataset with its significant NRD. In contrast, it is obvious that our R2FD2 outperformed the other methods in the average NCM criterion and obtained the most matches on all types of multimodal image pairs, followed by HOWP and RI-ALGH. This indicates that the features detected by our MALG are more repeatable and that the features described by our RMLG are more discriminative.

Fig. 12 shows the comparison results of the SR criterion for the different matching methods. The SR of HOSS was the worst; there are even cases where the SR of HOSS was zero on the optical–SAR datasets. It was followed by MS-HLMO and RIFT, while RI-ALGH and HOWP had comparable performance regarding the SR criterion on each type of dataset. On the whole, our R2FD2 obtained the highest SR on all the datasets: the matching SR of R2FD2 reached 100% on most datasets and was close to 100% on a few image pairs.

To further evaluate the accuracy of the different matching methods, Fig. 13 shows the comparison results of the average RMSE criterion, where the RMSE was set to five to indicate a failed match. Similar to the average NCM, the average RMSE refers to the average of all RMSEs over the 19 sets of images generated from each image pair at intervals of 10° from 0° to 180°. It can be seen from Fig. 13 that our R2FD2 yielded the
best results on the criterion of average RMSE and achieved a matching accuracy of fewer than two pixels on all datasets. It was followed by HOWP and RI-ALGH, whose RMSEs were somewhat worse than that of our R2FD2. Nevertheless, HOSS and MS-HLMO were both likely to exhibit the worst performance on the criterion of average RMSE in different cases. These experimental results further illustrate the validity of the proposed approaches compared with the state-of-the-art methods and show that the rotation invariance achieved by the RMLG descriptor is more reliable than that of the others.

Fig. 13. Comparisons of average RMSE criteria for different matching methods. RMSE of (a) optical–infrared datasets, (b) optical–LiDAR datasets, and (c) optical–SAR datasets.

Fig. 14. Correspondence visualization of R2FD2. Matching results of (a) optical–infrared datasets, (b) optical–LiDAR datasets, and (c) optical–SAR datasets.

TABLE III
Comparisons of RT Criteria for Each Matching Method

Table III gives the average RT of each compared method over the whole dataset, implemented on a laptop with an i7-10750H 2.6-GHz CPU and 16-GB RAM. The average RT of our R2FD2 was the fastest, with a time consumption of about 11 s. The efficiency of HOWP was second best, RIFT ranked third, HOSS and MS-HLMO fourth, and RI-ALGH last. Specifically, the RT of our R2FD2 was about 9 times, 6.5 times, 5 times, and 1.4 times faster than those of RI-ALGH, MS-HLMO (HOSS), RIFT, and HOWP, respectively. It is obvious that our R2FD2 has a great advantage in matching efficiency, which is attributed to the fast assignment of the dominant orientation and the construction of the RMLG descriptor using the RMIM with rotation invariance.

Furthermore, we carried out qualitative evaluations of R2FD2 by displaying correct correspondences and registration results for visual inspection. Fig. 14 shows further matching results of our R2FD2, in which at least four image pairs were randomly selected from each multimodal dataset and different rotation deformations in the range of [0°, 360°] were applied. Fig. 15 shows the corresponding registration results of Fig. 14
by using checkerboard maps. Each edge of all the checkerboard maps is well aligned without obvious misalignment, which further verifies the satisfactory generality of R2FD2.

Fig. 15. Checkerboard visualization of R2FD2. Registration results of (a) optical–infrared datasets, (b) optical–LiDAR datasets, and (c) optical–SAR datasets.

Overall, these evaluations and the coherence analysis prove that our R2FD2 achieves high computational efficiency and that its effectiveness in resisting significant radiation and rotation differences among multimodal images is far superior to that of the state-of-the-art feature matching methods. The excellent matching performance of R2FD2 is mainly due to the following two reasons. On the one hand, the feature detection of R2FD2 adopts the novel MALG detector, and MALG has the excellent properties of high repeatability and uniform IP distribution, which is rather advantageous for subsequent matching. On the other hand, the feature description of R2FD2 utilizes the discriminative RMLG descriptor, and RMLG integrates the rotation-invariant RMIM with the arrangement of DAISY to depict more discriminative invariant features, which lays a foundation for fast and robust matching.

IV. CONCLUSION

In this article, a novel feature matching method (named R2FD2) was presented for MRSIM, involving both the repeatable MALG detector and the rotation-invariant RMLG descriptor. The MALG detector was first designed by integrating the multichannel autocorrelation strategy with the log-Gabor wavelets for IP extraction. In this way, the IPs extracted by MALG generally had a high repetition rate and were evenly distributed over the multimodal images. Then, the fast assignment strategy of the dominant orientation was proposed to establish the novel RMIM with rotation invariance. Subsequently, the RMLG descriptor was constructed by incorporating the rotation-invariant RMIM with the spatial arrangement of DAISY for feature representation. Qualitative and quantitative experiments were performed utilizing different types of MRSIs datasets (optical–infrared, optical–LiDAR, and optical–SAR image pairs) to evaluate the matching performance of our R2FD2. The experimental results demonstrated that the proposed R2FD2 outperformed five state-of-the-art feature matching methods (i.e., HOSS, RIFT, RI-ALGH, MS-HLMO, and HOWP) in all criteria (including NCM, SR, RMSE, and RT). As a result, our R2FD2 is capable of reliably achieving fast and robust feature matching for MRSIs.

Although the proposed R2FD2 exhibited superior adaptation to rotation and radiation differences for multimodal feature matching, it was sensitive to scale distortions between
multimodal images because it did not address the question of scale invariance. Accordingly, our future research will explore these limitations more deeply. For example, it is of great significance to establish a suitable scale space for achieving scale invariance, such as the co-occurrence scale space [53], the nonlinear diffusion scale space [22], and the Gaussian scale space [54].

REFERENCES

[1] Y. Zhang, Z. Zhang, and J. Gong, "Generalized photogrammetry of spaceborne, airborne and terrestrial multi-source remote sensing datasets," Acta Geodaetica et Cartographica Sinica, vol. 50, no. 1, pp. 1–11, 2021.
[2] J. Ma, Y. Ma, and C. Li, "Infrared and visible image fusion methods and applications: A survey," Inf. Fusion, vol. 45, pp. 153–178, Jan. 2019.
[3] L.-J. Deng, M. Feng, and X.-C. Tai, "The fusion of panchromatic and multispectral remote sensing images via tensor-based sparse modeling and hyper-Laplacian prior," Inf. Fusion, vol. 52, pp. 76–89, Dec. 2019.
[4] F. Luo, Z. Zou, J. Liu, and Z. Lin, "Dimensionality reduction and classification of hyperspectral image via multistructure unified discriminative embedding," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5517916.
[5] S. Hao, W. Wang, Y. Ye, T. Nie, and L. Bruzzone, "Two-stream deep architecture for hyperspectral image classification," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 4, pp. 2349–2361, Apr. 2018.
[6] Y. Ye, W. Liu, L. Zhou, T. Peng, and Q. Xu, "An unsupervised SAR and optical image fusion network based on structure-texture decomposition," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, 2022.
[7] H. Cao, P. Tao, H. Li, and J. Shi, "Bundle adjustment of satellite images based on an equivalent geometric sensor model with digital elevation model," ISPRS J. Photogramm. Remote Sens., vol. 156, pp. 169–183, Oct. 2019.
[8] L. Tang, Y. Deng, Y. Ma, J. Huang, and J. Ma, "SuperFusion: A versatile image registration and fusion network with semantic awareness," IEEE/CAA J. Autom. Sinica, vol. 9, no. 12, pp. 2121–2137, Dec. 2022.
[9] Y. Ye et al., "Feature decomposition-optimization-reorganization network for building change detection in remote sensing images," Remote Sens., vol. 14, no. 3, p. 722, Feb. 2022.
[10] J. Ma, J. Zhao, J. Jiang, H. Zhou, and X. Guo, "Locality preserving matching," Int. J. Comput. Vis., vol. 127, no. 5, pp. 512–531, 2019.
[11] B. Zhu, J. Zhang, T. Tang, and Y. Ye, "SFOC: A novel multi-directional and multi-scale structural descriptor for multimodal remote sensing image matching," Int. Arch. Photogramm., Remote Sens. Spatial Inf. Sci., vol. 127, pp. 113–120, May 2022.
[12] J. Ma, X. Jiang, A. Fan, J. Jiang, and J. Yan, "Image matching from handcrafted to deep features: A survey," Int. J. Comput. Vis., vol. 129, no. 1, pp. 23–79, Aug. 2020.
[13] B. Zhu, L. Zhou, S. Pu, J. Fan, and Y. Ye, "Advances and challenges in multimodal remote sensing image registration," IEEE J. Miniaturization Air Space Syst., early access, Feb. 14, 2023, doi: 10.1109/JMASS.2023.3244848.
[14] Y. Deng and J. Ma, "ReDFeat: Recoupling detection and description for multimodal feature learning," IEEE Trans. Image Process., vol. 32, pp. 591–602, 2022.
[15] S. Wang, D. Quan, X. Liang, M. Ning, Y. Guo, and L. Jiao, "A deep learning framework for remote sensing image registration," ISPRS J. Photogramm. Remote Sens., vol. 145, pp. 148–164, Nov. 2018.
[16] L. H. Hughes, D. Marcos, S. Lobry, D. Tuia, and M. Schmitt, "A deep learning framework for matching of SAR and optical imagery," ISPRS J. Photogramm. Remote Sens., vol. 169, pp. 166–179, Nov. 2020.
[17] L. Zhou, Y. Ye, T. Tang, K. Nan, and Y. Qin, "Robust matching for SAR and optical images using multiscale convolutional gradient features," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, 2022.
[18] D. Quan et al., "Self-distillation feature learning network for optical and SAR image registration," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 4706718.
[19] Y. Ye, T. Tang, B. Zhu, C. Yang, B. Li, and S. Hao, "A multiscale framework with unsupervised learning for remote sensing image registration," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5622215.
[20] B. Zhu, Y. Ye, L. Zhou, Z. Li, and G. Yin, "Robust registration of aerial images and LiDAR data using spatial constraints and Gabor structural features," ISPRS J. Photogramm. Remote Sens., vol. 181, pp. 129–147, Nov. 2021.
[21] Y. Ye, J. Shan, L. Bruzzone, and L. Shen, "Robust registration of multimodal remote sensing images based on structural similarity," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 5, pp. 2941–2958, Mar. 2017.
[22] J. Fan, Y. Wu, M. Li, W. Liang, and Y. Cao, "SAR and optical image registration using nonlinear diffusion and phase congruency structural descriptor," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 9, pp. 5368–5379, Sep. 2018.
[23] Y. Ye, L. Bruzzone, J. Shan, F. Bovolo, and Q. Zhu, "Fast and robust matching for multimodal remote sensing image registration," IEEE Trans. Geosci. Remote Sens., vol. 57, no. 11, pp. 9059–9070, Nov. 2019.
[24] Y. Xiang, R. Tao, and H. You, "OS-PC: Combining feature representation and 3-D phase correlation for subpixel optical and SAR image registration," IEEE Trans. Geosci. Remote Sens., vol. 58, no. 9, pp. 6451–6466, Mar. 2020.
[25] Y. Ye, B. Zhu, T. Tang, C. Yang, Q. Xu, and G. Zhang, "A robust multimodal remote sensing image registration method and system using steerable filters with first- and second-order gradients," ISPRS J. Photogramm. Remote Sens., vol. 188, pp. 331–350, Jun. 2022.
[26] A. Sedaghat and N. Mohammadi, "Uniform competency-based local feature extraction for remote sensing images," ISPRS J. Photogramm. Remote Sens., vol. 135, pp. 142–157, Jan. 2018.
[27] H. P. Moravec, Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Stanford, CA, USA: Stanford Univ., 1980.
[28] C. Harris and M. Stephens, "A combined corner and edge detector," in Proc. Alvey Vis. Conf., vol. 15, Manchester, U.K., 1988, pp. 5210–5244.
[29] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.
[30] E. Rosten, R. Porter, and T. Drummond, "Faster and better: A machine learning approach to corner detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 1, pp. 105–119, Jan. 2010.
[31] Y. Xiang, F. Wang, and H. You, "OS-SIFT: A robust SIFT-like algorithm for high-resolution optical-to-SAR image registration in suburban areas," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 8, pp. 3078–3090, Jun. 2018.
[32] Y. Ye, J. Shan, S. Hao, L. Bruzzone, and Y. Qin, "A local phase based invariant feature for remote sensing image matching," ISPRS J. Photogramm. Remote Sens., vol. 142, pp. 205–221, Aug. 2018.
[33] J. Li, Q. Hu, and M. Ai, "RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform," IEEE Trans. Image Process., vol. 29, pp. 3296–3310, 2020.
[34] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 10, pp. 1615–1630, Oct. 2005.
[35] E. Tola, V. Lepetit, and P. Fua, "DAISY: An efficient dense descriptor applied to wide-baseline stereo," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 5, pp. 815–830, May 2010.
[36] A. Sedaghat and H. Ebadi, "Remote sensing image matching based on adaptive binning SIFT descriptor," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 10, pp. 5283–5293, Oct. 2015.
[37] A. Sedaghat, M. Mokhtarzade, and H. Ebadi, "Uniform robust scale-invariant feature matching for optical remote sensing images," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 11, pp. 4516–4527, Nov. 2011.
[38] A. Sedaghat and N. Mohammadi, "Illumination-robust remote sensing image matching based on oriented self-similarity," ISPRS J. Photogramm. Remote Sens., vol. 153, pp. 21–35, Jul. 2019.
[39] X. Xiong, G. Jin, Q. Xu, and H. Zhang, "Self-similarity features for multimodal remote sensing image matching," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 12440–12454, 2021.
[40] Y. Ye and J. Shan, "A local descriptor based registration method for multispectral remote sensing images with non-linear intensity differences," ISPRS J. Photogramm. Remote Sens., vol. 90, pp. 83–95, Apr. 2014.
[41] Y. Ye, L. Shen, M. Hao, J. Wang, and Z. Xu, "Robust optical-to-SAR image matching based on shape properties," IEEE Geosci. Remote Sens. Lett., vol. 14, no. 4, pp. 564–568, Apr. 2017.
[42] P. Kovesi, "Image features from phase congruency," J. Comput. Vis. Res., vol. 1, no. 3, pp. 1–26, 1999.
[43] J. Fan, Y. Ye, J. Li, G. Liu, and Y. Li, "A novel multiscale adaptive binning phase congruency feature for SAR and optical image registration," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5235216.
[44] Y. Xiang, R. Tao, F. Wang, H. You, and B. Han, "Automatic registration of optical and SAR images via improved phase congruency model," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 13, pp. 5847–5861, 2020.
[45] Q. Yu, D. Ni, Y. Jiang, Y. Yan, J. An, and T. Sun, "Universal SAR and optical image registration via a novel SIFT framework based on nonlinear diffusion and a polar spatial-frequency descriptor," ISPRS J. Photogramm. Remote Sens., vol. 171, pp. 1–17, Jan. 2021.
[46] P. Kovesi, "Phase congruency detects corners and edges," in Proc. Austral. Pattern Recognit. Soc. Conf. DICTA, 2003, pp. 1–10.
[47] Y. Xiang, F. Wang, L. Wan, and H. You, "SAR-PC: Edge detection in SAR images via an advanced phase congruency model," Remote Sens., vol. 9, no. 3, p. 209, Feb. 2017.
[48] J. Fan, Y. Wu, F. Wang, Q. Zhang, G. Liao, and M. Li, "SAR image registration using phase congruency and nonlinear diffusion-based SIFT," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 3, pp. 562–566, Mar. 2015.
[49] J. Arróspide and L. Salgado, "Log-Gabor filters for image-based vehicle verification," IEEE Trans. Image Process., vol. 22, no. 6, pp. 2286–2295, Jun. 2013.
[50] Y. Wu, W. Ma, M. Gong, L. Su, and L. Jiao, "A novel point-matching algorithm based on fast sample consensus for image registration," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 1, pp. 43–47, Jan. 2015.
[51] C. Gao, W. Li, R. Tao, and Q. Du, "MS-HLMO: Multiscale histogram of local main orientation for remote sensing image registration," IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 5626714.
[52] Y. Zhang et al., "Histogram of the orientation of the weighted phase descriptor for multi-modal remote sensing image matching," ISPRS J. Photogramm. Remote Sens., vol. 196, pp. 1–15, Feb. 2023.
[53] Y. Yao, Y. Zhang, Y. Wan, X. Liu, X. Yan, and J. Li, "Multi-modal remote sensing image matching considering co-occurrence filter," IEEE Trans. Image Process., vol. 31, pp. 2584–2597, 2022.
[54] F. Dellinger, J. Delon, Y. Gousseau, J. Michel, and F. Tupin, "SAR-SIFT: A SIFT-like algorithm for SAR images," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 1, pp. 453–466, Jan. 2014.

Bai Zhu received the B.S. degree from the Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu, China, in 2019, where he is currently pursuing the Ph.D. degree in surveying and mapping science and technology. His research is mainly focused on remote sensing image processing, multimodal image matching, image registration, and feature extraction.

Chao Yang received the B.S. degree from the Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu, China, in 2019, where he is currently pursuing the Ph.D. degree in surveying and mapping science and technology. His research interests include image matching, deep learning, and image processing.

Jinkun Dai received the B.S. degree from the Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu, China, in 2022, where he is currently pursuing the M.S. degree in surveying and mapping science and technology. His research interests include image matching, image fusion, and classification.

Jianwei Fan received the B.S. degree in electronic information science and technology from the Henan University of Science and Technology, Luoyang, China, in 2011, and the Ph.D. degree in pattern recognition and intelligent systems from Xidian University, Xi'an, China, in 2017. He is currently a Lecturer with the School of Computer and Information Technology, Xinyang Normal University, Xinyang, China. His main research interests include remote sensing image processing, image registration, and feature extraction.

Yao Qin (Student Member, IEEE) received the B.S. degree in information engineering from Shanghai Jiaotong University, Shanghai, China, in 2013, and the M.S. and Ph.D. degrees in information and communication engineering from the College of Electronic Science, National University of Defense Technology (NUDT), Changsha, China, in 2015 and 2019, respectively. He was a Visiting Ph.D. Student with the Remote Sensing Laboratory, Department of Information Engineering and Computer Science, University of Trento, Trento, Italy. He has been a Research Assistant with the Northwest Institute of Nuclear Technology, Xi'an, China, since 2020. His research interests include infrared small target detection, hyperspectral image classification and clustering, and domain adaptation.

Yuanxin Ye (Member, IEEE) received the B.S. degree in remote sensing science and technology from Southwest Jiaotong University, Chengdu, China, in 2008, and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 2013. He is currently a Research Fellow with the Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, Chengdu. His research interests include remote sensing image processing, image registration, change detection, and object detection. Dr. Ye received "the International Society for Photogrammetry and Remote Sensing (ISPRS) Prizes for Best Papers by Young Authors" at the 23rd ISPRS Congress in Prague, in 2016, and "the Best Youth Oral Paper Award" at ISPRS Geospatial Week 2017 in Wuhan, in 2017.