Asymmetric Distances For Binary Embeddings∗

∗ Albert Gordo is partially supported by the Spanish projects TIN2008-04998, TIN2009-14633-C03-03, and CONSOLIDER-INGENIO 2010 (CSD2007-00018). This work was done while Albert Gordo was a visitor at XRCE.
Abstract

In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH) and Semi-Supervised Hashing (SSH). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques. We also propose a novel, simple binary embedding technique – PCA Embedding (PCAE) – which is shown to yield competitive results with respect to more complex algorithms such as SH and SSH.

1. Introduction

We are interested in computing efficiently the distance between two image signatures. The application we focus on is query-by-example image retrieval – i.e. given a query image, rank a set of database images from closest to farthest – but our work finds applications in numerous other problems such as nearest neighbor classification, density estimation or clustering.

When dealing with a large number of database images, there are two considerations of paramount importance. The first one is the computational cost: the computation of the distance between two image signatures should rely on efficient operations. The second one is the memory cost: the memory footprint of the objects should be small enough that all database image signatures fit in RAM. If this is not the case, i.e. if a significant portion of the database signatures has to be stored on disk, then the response time of a query collapses because disk access is much slower than RAM access.

To address these two interrelated issues, many binary embedding techniques have been proposed, including among others Locality Sensitive Hashing (LSH) [4, 2], Spectral Hashing (SH) [17], Kernel LSH (KLSH) [8], Locality Sensitive Binary Codes (LSBC) [13] and Semi-Supervised Hashing (SSH) [16]. They consist in transforming the initial signatures into a binary space where the Hamming distance Ha can be employed. Binary embeddings address both the computational and the memory issues: the Hamming distance between two binary signatures can be computed extremely efficiently and the memory footprint is drastically reduced. In all the previous techniques, both the database signatures and the query signature are binarized. For this reason, we will refer to such distance computation algorithms as symmetric.

However, it has been noted that compressing the query signature is not mandatory [3, 6, 1]. Indeed, the additional cost of storing in memory a single non-binarized signature is negligible. Also, the distance between an original signature and a compressed signature can still be computed efficiently through look-up table operations. As the distance is computed in two different spaces, these algorithms are referred to as asymmetric. A major benefit of asymmetric algorithms is that they can achieve higher accuracy for a fixed compression rate because they take advantage of the more precise position information of the query. We note however that the asymmetric algorithms presented in [3, 6, 1] are tied to specific compression schemes: Dong et al. [3] presented an asymmetric algorithm for compression schemes based on random projections, Jégou et al. [6] proposed an asymmetric algorithm for compression schemes based on vector quantization, and Brandt [1] subsequently used a similar idea.
In this work, we generalize these asymmetric schemes to the broad family of binary embeddings. More precisely, our contributions are the following ones:

• We first provide a theoretical analysis of several binary embedding algorithms (including LSH, LSBC, SH and SSH) showing that they can be decomposed into two steps: i) the signatures are first embedded in an intermediate real-valued space and ii) thresholding is performed in this space to obtain binary outputs. A key insight is that the Euclidean distance is a natural metric in the intermediate real-valued space.

• Building on the previous analysis, we propose two asymmetric distances which can be broadly applied to binary embedding algorithms. The first one is an expectation-based technique inspired by [6]. The second one is a lower-bound-based technique which generalizes [3].

• We show experimentally on four datasets of different nature that the proposed asymmetric distances consistently and significantly improve the retrieval accuracy of LSH, LSBC, SH and SSH over the symmetric Hamming distance. Somewhat surprisingly, although the lower-bound and expectation-based techniques are very different in nature, they are shown to yield very similar improvements.

• We provide an experimental comparison of LSH, LSBC, SH and SSH and draw conclusions about the respective merits of these algorithms. Especially, we show that the key ingredient in the success of SH and SSH is the dimensionality reduction. We therefore propose a much simpler PCA-based binary embedding technique, called PCAE (for PCA Embedding), which is shown to perform comparably.

2. Binary Embeddings

Let h : Ω → {0, 1}^K denote a binary embedding of the input space Ω, with h(x) = [h_1(x), ..., h_K(x)]′. We show that for LSH, LSBC, SH, SSH and PCAE the functions h_k can be decomposed as follows:

    h_k(x) = q_k[g_k(x)],    (1)

where g_k : Ω → R and q_k : R → {0, 1}. We denote g : Ω → R^K with g(x) = [g_1(x), ..., g_K(x)]′. If we have two image signatures x and y, we also show that the Euclidean distance is the natural metric between g(x) and g(y).

2.1. Locality Sensitive Hashing (LSH)

In LSH, the functions h_k are called hash functions and are selected to approximate a similarity function sim in the original space Ω = R^D. Valid hash functions h_k must satisfy the LSH property:

    Pr[h_k(x) = h_k(y)] = sim(x, y).    (2)

Here we focus on the case where sim is the cosine similarity, for which a suitable hash function is [2]:

    h_k(x) = σ(r_k′ x),    (3)

with

    σ(u) = 0 if u < 0,  σ(u) = 1 if u ≥ 0.    (4)

The vectors r_k ∈ R^D are drawn from a multi-dimensional Gaussian distribution p with zero mean and identity covariance matrix I_D. We therefore have q_k(u) = σ(u) and g_k(x) = r_k′ x. In such a case, the natural distance between g(x) and g(y) in the intermediate space is the Euclidean distance, as random Gaussian projections preserve the Euclidean distance in expectation. This property stems from the following equality:

    E_{r_k ∼ p}[(g_k(x) − g_k(y))²] = ||x − y||².    (5)

We also note that, since the r_k have zero mean, mean-centering the data can positively impact LSH in practice (see section 4.2).
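To make the decomposition concrete, the following NumPy sketch implements the LSH pipeline above: g(x) = [r_1′x, ..., r_K′x] followed by the step function σ. This is an illustrative sketch (function names are ours), not the code used in our experiments.

    import numpy as np

    def lsh_embed(X, K, seed=0):
        """LSH sketch: rows of X are signatures in R^D.
        Returns the intermediate embedding G = g(X) and the codes H = h(X)."""
        rng = np.random.default_rng(seed)
        D = X.shape[1]
        R = rng.standard_normal((K, D))  # the r_k ~ N(0, I_D), one per row
        G = X @ R.T                      # g_k(x) = r_k' x
        H = (G >= 0).astype(np.uint8)    # q_k(u) = sigma(u)
        return G, H

A symmetric scheme keeps only H for both query and database; the asymmetric distances of section 3 additionally keep G on the query side.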
We note that LSH was extended to the kernel case (KLSH) [8], where the similarity is a dot product in another space using a non-linear mapping φ(x) of the input vectors. This enables extending LSH, as well as the asymmetric distance computations, beyond vectorial representations.

2.2. Locality-Sensitive Binary Codes (LSBC)

Consider a Mercer kernel K : R^D × R^D → R that satisfies the following properties for all points x and y:

1. It is translation invariant: K(x, y) = K(x − y).

2. It is normalized: i.e., K(x − y) ≤ 1 and K(0) = 1.

3. ∀α ∈ R : α ≥ 1, K(αx − αy) ≤ K(x, y).

Two well-known examples are the Gaussian kernel and the Laplacian kernel. Raginsky and Lazebnik [13] showed that, for such kernels, K(x, y) can be approximated by 1 − Ha(h(x), h(y)) with h_k(x) = q_k(g_k(x)) where:

    g_k(x) = cos(w_k′ x + b_k),    (6)
    q_k(u) = σ(u − t_k).    (7)

b_k and t_k are random values drawn respectively from unif[0, 2π] and unif[−1, +1]. The vectors w_k are drawn from a distribution p_K which depends on the particular choice of the kernel K. For instance, if K is the Gaussian kernel with bandwidth γ, then p_K is a Gaussian distribution with mean zero and covariance matrix γI_D. As the size of the binary embedding space increases, 1 − Ha(h(x), h(y)) is guaranteed to converge to K(x, y).

We know from Rahimi and Recht [14] that g(x)′g(y) is guaranteed to converge to K(x, y), since E_{w_k ∼ p_K}[g_k(x)g_k(y)] = K(x, y), and therefore that the embedding g preserves the dot-product in expectation. Since K(x, x) = K(x − x) = K(0) = 1, the norm ||g(x)||² is also guaranteed to converge to 1, ∀x. In such a case, the dot-product is equivalent to the Euclidean distance, and the Euclidean distance is preserved in the intermediate space.
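A minimal sketch of the LSBC embedding (6)-(7) for the Gaussian kernel case follows; the scaling of the w_k by √γ matches the zero-mean, γI_D-covariance distribution described above (function names are ours).

    import numpy as np

    def lsbc_embed(X, K, gamma, seed=0):
        """LSBC sketch (Gaussian kernel): g_k(x) = cos(w_k' x + b_k),
        h_k(x) = sigma(g_k(x) - t_k)."""
        rng = np.random.default_rng(seed)
        D = X.shape[1]
        W = rng.standard_normal((K, D)) * np.sqrt(gamma)  # w_k ~ N(0, gamma I_D)
        b = rng.uniform(0.0, 2.0 * np.pi, size=K)         # b_k ~ unif[0, 2pi]
        t = rng.uniform(-1.0, 1.0, size=K)                # t_k ~ unif[-1, +1]
        G = np.cos(X @ W.T + b)                           # intermediate embedding
        H = (G >= t).astype(np.uint8)                     # thresholded codes
        return G, H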
2.3. Spectral Hashing (SH)

Given a similarity sim between objects in Ω = R^D, and assuming that the distribution of objects in Ω may be described by a probability density function p, SH [17] attempts to minimize the following objective function with respect to h:

    ∫_{x,y} ||h(x) − h(y)||² sim(x, y) p(x) p(y) dx dy,    (8)

subject to the following constraints:

    h(x) ∈ {−1, 1}^K,    (9)
    ∫_x h(x) p(x) dx = 0,    (10)
    ∫_x h(x) h(x)′ p(x) dx = I_K.    (11)

As optimizing (8) under constraint (9) is an NP-hard problem (even in the case where K = 1, i.e. of a single bit), Weiss et al. propose to optimize a relaxed version of the problem, i.e. to remove constraint (9), and then to binarize the real-valued output at 0. This is equivalent to minimizing:

    ∫_{x,y} ||g(x) − g(y)||² sim(x, y) p(x) p(y) dx dy,    (12)

with respect to g and then writing h(x) = 2σ(g(x)) − 1, i.e. q(u) = 2σ(u) − 1. The solutions to the relaxed problem are eigenfunctions of the weighted Laplace-Beltrami operators, for which there exists a closed-form formula in certain cases, e.g. when sim is the Gaussian kernel and p is separable and uniform. To satisfy, at least approximately, the separability condition, a PCA is first performed on the input vectors.

Minimizing the objective function (12) enforces the Euclidean distance between g(x) and g(y) to be (inversely) correlated with the Gaussian kernel, and therefore with the Euclidean distance, in the original space. This shows that, in the case of SH, the natural measure in the intermediate space is the Euclidean distance.
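The following sketch gives a simplified reading of the closed-form recipe of Weiss et al. [17] for the uniform separable case: the one-dimensional eigenfunctions sin(π/2 + jπx/(b − a)) on [a, b] and eigenvalues 1 − exp(−ε²(jπ/(b − a))²/2) follow their paper, but normalization details may differ from the authors' released code, and function names and the eps parameter are our assumptions.

    import numpy as np

    def sh_train(X, K, eps=1.0):
        """Simplified SH training sketch: PCA, per-direction box [a_k, b_k],
        then keep the K one-dimensional modes with smallest eigenvalue."""
        mu = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        P = Vt[:K].T                         # (D, K) PCA directions
        Z = (X - mu) @ P
        a, b = Z.min(axis=0), Z.max(axis=0)  # box fitted to each direction
        modes = []
        for k in range(K):
            for j in range(1, K + 1):        # mode j on PCA direction k
                om = j * np.pi / (b[k] - a[k])
                lam = 1.0 - np.exp(-0.5 * (eps * om) ** 2)
                modes.append((lam, k, om))
        modes.sort()                         # K smallest analytical eigenvalues
        return mu, P, a, modes[:K]

    def sh_embed(X, mu, P, a, modes):
        """g: eigenfunction values; h: binarization at 0."""
        Z = (X - mu) @ P
        G = np.stack([np.sin(np.pi / 2 + om * (Z[:, k] - a[k]))
                      for (_, k, om) in modes], axis=1)
        return G, (G >= 0).astype(np.uint8)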
2.4. Semi-Supervised Hashing (SSH)

Let {x_i, i = 1 ... N} be a set of points in Ω = R^D. SSH [16] assumes the availability of supervisory information in the form of labeled matching and non-matching pairs. Using the notations of [16], we have two sets M and C. A pair (x_i, x_j) ∈ M if x_i and x_j are considered similar (neighbor pair) and (x_i, x_j) ∈ C if x_i and x_j are considered dissimilar (non-neighbor pair). The functions h_k are of the form:

    h_k(x_i) = sign(w_k′ x_i),    (13)

i.e. q(u) = 2σ(u) − 1 and g_k(x) = w_k′ x. The goal is to learn the set W = {w_k, k = 1 ... K} which maximizes the following objective function:

    J(W) = Σ_{(x_i,x_j) ∈ M} h(x_i)′ h(x_j) − Σ_{(x_i,x_j) ∈ C} h(x_i)′ h(x_j),    (14)

subject to the constraints:

    Σ_{i=1}^{N} h_k(x_i) = 0, ∀k,    (15)
    (1/N) Σ_{i=1}^{N} h(x_i) h(x_i)′ = I_K.    (16)

This is an NP-hard problem, and Wang et al. propose to relax the binary assumption both in the objective function and in the constraints. This leads to the following objective function:

    J(W) = Σ_{(x_i,x_j) ∈ M} g(x_i)′ g(x_j) − Σ_{(x_i,x_j) ∈ C} g(x_i)′ g(x_j) + ηR(W),    (17)
where R(W) is a regularization function which arises from the relaxation of constraint (15), and η strikes a balance between the two terms. We observed experimentally on several datasets that the value of η had little impact on the accuracy. We note that in the extreme case where η → ∞, SSH simply performs a PCA (see [16]). This shows that, for large η values, SSH approximately preserves the Euclidean distance in the intermediate space.
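Under our reading of [16], the relaxed, orthogonality-constrained solution takes W as the top-K eigenvectors of an "adjusted covariance" matrix combining the supervised term and the unsupervised (PCA-like) term; the sketch below makes that assumption explicit (the sequential variant of Wang et al. is not reproduced here, and function names are ours).

    import numpy as np

    def ssh_train(X, pairs_M, pairs_C, K, eta=5.0):
        """Relaxed SSH sketch. X: (N, D) zero-mean data;
        pairs_M / pairs_C: lists of (i, j) neighbor / non-neighbor pairs.
        Assumption: the W maximizing the relaxed (17) is given by the top-K
        eigenvectors of X' S X + eta X' X, with S_ij = +1 on M, -1 on C."""
        N = X.shape[0]
        S = np.zeros((N, N))
        for i, j in pairs_M:
            S[i, j] = S[j, i] = 1.0
        for i, j in pairs_C:
            S[i, j] = S[j, i] = -1.0
        M_adj = X.T @ S @ X + eta * (X.T @ X)      # adjusted covariance
        evals, evecs = np.linalg.eigh(M_adj)
        W = evecs[:, np.argsort(evals)[::-1][:K]]  # top-K eigenvectors
        return W                                   # g(x) = W' x, h = sign(g)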
2.5. PCA Embedding (PCAE)

We also propose the following very simple embedding:

    g_k(x) = w_k′ (x − µ),    (18)
    q_k(u) = σ(u),    (19)

where µ is the data mean and w_k is the eigenvector associated with the k-th largest eigenvalue of the data correlation matrix. The previous equations can be trivially extended to kernel PCA (kPCA), and therefore this embedding can be easily extended to non-vectorial signatures. This embedding function is interesting as it can be related to SH and SSH. In the case of SH, the PCA is the first step of the embedding (to ensure the approximate separability of the data). In the case of SSH, this corresponds to the limit case η → ∞. We will show experimentally that, despite its simplicity, this embedding technique yields competitive results. In the case of PCAE, the Euclidean distance is also the natural distance in the intermediate space, as the PCA projections approximately preserve the Euclidean distance.
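The whole of PCAE fits in a few lines; this sketch (function names ours) obtains the leading eigenvectors from an SVD of the mean-centered data.

    import numpy as np

    def pcae_train(X, K):
        """PCAE training: the data mean mu and the K leading PCA directions."""
        mu = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        return mu, Vt[:K].T                  # (D, K), columns are the w_k

    def pcae_embed(X, mu, W):
        """g_k(x) = w_k'(x - mu), h_k(x) = sigma(g_k(x))."""
        G = (X - mu) @ W                     # intermediate embedding
        H = (G >= 0).astype(np.uint8)        # binarization at 0
        return G, H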
3. Proposed Asymmetric Distances

In the previous section we decomposed several binary embedding algorithms as h_k(x) = q_k[g_k(x)] and argued that the Euclidean distance is a natural metric between the intermediate representations g(x) and g(y). Since the database signatures are only available in binarized form h(y), we seek to approximate the distance²

    d(g(x), g(y)) = Σ_k d(g_k(x), g_k(y)),    (20)

using g(x) on the query side but only h(y) on the database side.

3.1. Expectation-Based Asymmetric Distance

In [6], Jégou et al. proposed an asymmetric algorithm for compression schemes based on vector quantization. A codebook is learned through k-means clustering and a database vector is encoded by the index of its closest centroid in the codebook. The distance between the uncompressed query and a quantized database signature is simply computed as the Euclidean distance between the query and the corresponding centroid.

We now adapt this idea to binary embeddings. We note that in the case of LSH, LSBC, SH, SSH or PCAE we have no notion of centroid. However, we note that the centroid of a given cell in k-means clustering can be interpreted as the expected value of the vectors assigned to this particular cell. Similarly, we propose an asymmetric expectation-based approximation d_E for binary embeddings. Assuming that the samples in the original space Ω are drawn from a distribution p, we define:

    d_E(x, y) = Σ_k d(g_k(x), E_{u∼p}[g_k(u) | h_k(u) = h_k(y)]).    (21)

d(g_k(x), E_{u∼p}[g_k(u) | h_k(u) = h_k(y)]) is the distance in the intermediate space between g_k(x) and the expected value of the samples g_k(u) such that h_k(u) = h_k(y).

Since we generally do not have access to the distribution p, the expectation operator is approximated by a sample average. In practice, we randomly draw a set of signatures S = {x_i, i = 1 ... N} from Ω. For each dimension k of the embedding, we partition S into two subsets: S_k^0 contains the signatures x_i such that h_k(x_i) = 0, and S_k^1 those signatures x_i that satisfy h_k(x_i) = 1. We compute offline the following query-independent values:

    α_k^0 = (1 / |S_k^0|) Σ_{u ∈ S_k^0} g_k(u),    (22)
    α_k^1 = (1 / |S_k^1|) Σ_{u ∈ S_k^1} g_k(u).    (23)

Online, for a given query signature x, we pre-compute the values:

    β_k^0 = d(g_k(x), α_k^0),    (24)
    β_k^1 = d(g_k(x), α_k^1),    (25)

so that the asymmetric distance reduces to a sum of look-ups:

    d_E(x, y) = Σ_k β_k^{h_k(y)}.    (26)

The cost of pre-computing the β values is negligible with respect to the cost of computing many d_E(x, y)'s for a large number of database signatures y. Note that the sum (26) can be efficiently computed, e.g. by grouping the dimensions in blocks of 8 bits and having one 256-dimensional look-up table per block (rather than one 2-dimensional look-up table per dimension). This reduces the number of summations as well as the number of accesses to memory.

² We use the following abuse of notation for simplicity: d denotes both the distance between the vectors g(x) and g(y), i.e. d : R^K × R^K → R, and the distance between the individual dimensions, i.e. d : R × R → R.
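The following sketch implements (22)-(26) with d taken to be the squared difference (an assumption consistent with the squared Euclidean distance in the intermediate space); the 8-bit block packing is omitted for clarity, and function names are ours.

    import numpy as np

    def de_precompute_alpha(G_train, H_train):
        """Offline: per-dimension means alpha_k^0 / alpha_k^1 of g_k(u)
        over the training samples whose bit h_k(u) is 0 / 1."""
        B = H_train.astype(float)
        n1 = B.sum(axis=0)
        alpha1 = (G_train * B).sum(axis=0) / n1
        alpha0 = (G_train * (1.0 - B)).sum(axis=0) / (len(B) - n1)
        return alpha0, alpha1

    def de_distances(g_query, H_db, alpha0, alpha1):
        """Online: beta tables for one query, then
        d_E(x, y) = sum_k beta_k^{h_k(y)} over all database codes."""
        beta0 = (g_query - alpha0) ** 2     # d(g_k(x), alpha_k^0)
        beta1 = (g_query - alpha1) ** 2     # d(g_k(x), alpha_k^1)
        return np.where(H_db.astype(bool), beta1, beta0).sum(axis=1)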
3.2. Lower-Bound-Based Asymmetric Distance

In [3], Dong et al. proposed an asymmetric algorithm for binary embeddings based on random projections. We now generalize this idea and show that a similar approach can be applied to a much wider range of binary embedding techniques. For simplicity of presentation, we assume that q_k has the form q_k(u) = σ(u − t_k), where t_k is a threshold, but this can be trivially generalized to other quantization functions.

The idea is to lower-bound the quantity (20) by bounding each of its terms. We note that t_k splits R into two half-lines and consider two cases:

• If h_k(x) ≠ h_k(y), i.e. g_k(x) and g_k(y) are on different sides of t_k, then a lower-bound on the distance between g_k(x) and g_k(y) is the distance between g_k(x) and the threshold t_k, i.e. d(g_k(x), g_k(y)) ≥ d(g_k(x), t_k).

• If h_k(x) = h_k(y), i.e. g_k(x) and g_k(y) are on the same half-line, then we have the following obvious lower-bound: d(g_k(x), g_k(y)) ≥ 0 (actually, this bound is always true).

Merging the two cases in a single equation, we have the following lower-bound on d(g(x), g(y)):

    d_LB(x, y) = Σ_k δ̄_{h_k(x), h_k(y)} d(g_k(x), t_k),    (27)

where δ̄_{i,j} is the negation of the Kronecker delta, i.e. δ̄_{i,j} = 0 if i = j and 1 otherwise. We note that, in the case of LSH, equation (27) is equivalent to the asymmetric LSH distance proposed in [3].

In practice, for a given query signature x, we can pre-compute online the values:

    γ_k^0 = δ̄_{h_k(x), 0} d(g_k(x), t_k),    (28)
    γ_k^1 = δ̄_{h_k(x), 1} d(g_k(x), t_k).    (29)

We note that one of these two values is guaranteed to be 0 for each dimension k. We can subsequently pack these values in blocks of 8 dimensions for faster computation, as was the case for the expectation-based approximation. By definition we have:

    d_LB(x, y) = Σ_k γ_k^{h_k(y)}.    (30)

A major difference between d_E and d_LB is that the former makes use of the data distribution while the latter does not. Despite its very crude nature, we will see that d_LB leads to excellent results on a variety of binary embedding algorithms.
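A sketch of (27)-(30) along the same lines, again with d the squared difference; the thresholds t_k default to 0, which covers LSH, SSH and PCAE (function names are ours).

    import numpy as np

    def dlb_distances(g_query, H_db, t=None):
        """Lower-bound asymmetric distance d_LB.
        g_query: (K,) intermediate query embedding; H_db: (n, K) binary codes."""
        t = np.zeros_like(g_query) if t is None else t
        h_query = g_query >= t                  # query bits h_k(x)
        cost = (g_query - t) ** 2               # d(g_k(x), t_k)
        gamma0 = np.where(h_query, cost, 0.0)   # paid when h_k(y) = 0 differs
        gamma1 = np.where(~h_query, cost, 0.0)  # paid when h_k(y) = 1 differs
        return np.where(H_db.astype(bool), gamma1, gamma0).sum(axis=1)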
4. Experimental Validation

We now show the benefits of the asymmetric distances proposed in section 3 on the binary embeddings we reviewed in section 2. We first describe the four public benchmarks we experiment on. We then provide implementation details for the different embedding algorithms. We finally report and discuss results.

4.1. Datasets

We run experiments on four public benchmarks: LabelMe, MNIST, the University of Kentucky Benchmark (UKB) and INRIA Holidays. We describe these datasets as well as the associated standard experimental protocols.

LabelMe. The LabelMe dataset [15] contains more than 180,000 scene and object images. We follow the protocol of [13] and use the same subset of 14,871 images; 1,000 are used as queries and the remaining 13,781 as database images. Images are described using a 320-dimensional GIST descriptor [10]. A nominal threshold of the average distance to the 50th nearest neighbor is used to determine whether a database image returned for a given query is considered a true positive [13]. We note that the number of true positives varies widely from one query to another (from 0 to 2,353). We compute the Average Precision (AP) for each query (with a non-zero number of true positives) and report the mean over the 1,000 queries (Mean AP or MAP).

MNIST. The MNIST dataset³ consists of 70,000 handwritten digits of size 28 × 28 pixels. The dataset is partitioned into a training set of 60,000 samples and a test set of 10,000 samples. Following the protocol of [16], we use the pixel values directly as features, resulting in 784-dimensional image signatures. We randomly select 1,000 test samples as queries, and all 60,000 training samples are used as database images. Accuracy is reported in terms of the proportion of relevant images in the top 500 retrieved images, i.e. precision@500.

³ http://yann.lecun.com/exdb/mnist/

UKB. The University of Kentucky Benchmark (UKB) [9] contains 10,200 images of 2,550 objects (4 images per object). Each image is used in turn as a query to search through the 10,200 images. The accuracy is measured in terms of the number of relevant images retrieved in the top 4, i.e. 4 × recall@4. We use the same low-level feature detection and description procedure as in [5] (Hessian-Affine extractor and SIFT descriptor) since the code is available online. We reduce the dimensionality of these features from 128 to 64 dimensions with PCA. To aggregate these local feature vectors into a single image-level signature, we use the Fisher Vector (FV) framework [11], which was recently shown to yield excellent results for object and scene retrieval [12, 7]. We use probabilistic codebooks (i.e. GMMs) with 64 Gaussians, so that an image is described by a 64 × 64 = 4,096-dimensional FV.
For learning purposes (e.g. to learn the PCA on the SIFT descriptors and the GMM for the FV), we use an additional set of 60,000 images (Flickr60K) made available by the authors of [5].

Holidays. The INRIA Holidays dataset [5] contains 1,491 images of 500 scenes and objects. The first image of each scene is used as a query to search through the remaining 1,490 images. We measure the accuracy for each query using AP and report the MAP over the 500 queries. As was the case for UKB, images are described using 4,096-dimensional FVs.

4.2. Implementation Details

To learn the parameters of the embedding functions and to compute offline the α values for d_E, we need training data. For LabelMe and MNIST we use respectively the 13,781 and 60,000 database images. For UKB and Holidays we use the Flickr60K dataset [5].
For LSH and LSBC, which perform binarization through random projections, experiments are repeated 5 times with 5 different projection matrices and we report the average results.

For LSBC, the γ bandwidth parameter is set to the inverse of the mean distance between the descriptors in the training set. This heuristic provided good results in practice (i.e. hand-tuning this parameter did not lead to major improvements). We are not aware of any principled approach to set this parameter automatically.

For SSH, the η parameter, which controls the relative weight of the unsupervised vs the supervised information during learning, is set to 5. We observed that this parameter has, in general, a very limited impact on the final accuracy. Again, we are not aware of any principled approach to set this parameter automatically.

SSH also requires neighboring relationships between training samples. This is straightforward to derive for MNIST, where digit labels are available. For the other datasets, we follow a procedure similar to that of [16]: we sort the distances between all pairs of images in the training set and use the distances at the 5th and 95th percentiles as thresholds to decide whether two images are neighbors (distance in the smallest 5%), not neighbors (distance in the largest 5%) or unrelated (distance between the smallest and largest 5%), as sketched below.
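A minimal sketch of this pseudo-labeling rule (using Euclidean distances on a training subsample; only the 5%/95% cut-offs come from the text above, and function names are ours):

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def pseudo_pairs(X_train, low=5.0, high=95.0):
        """Build neighbor / non-neighbor pairs from distance percentiles."""
        D = squareform(pdist(X_train))              # all pairwise distances
        iu = np.triu_indices_from(D, k=1)           # each pair counted once
        t_low, t_high = np.percentile(D[iu], [low, high])
        pairs_M = [(i, j) for i, j in zip(*iu) if D[i, j] <= t_low]
        pairs_C = [(i, j) for i, j in zip(*iu) if D[i, j] >= t_high]
        return pairs_M, pairs_C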
SSH requires the data to have zero mean [16] and, as discussed in section 2.1, mean-centering the data can very positively impact LSH. Therefore, the LabelMe and MNIST signatures (respectively GIST and pixel values) are mean-centered. This is not necessary for FVs (i.e. for UKB and Holidays) since, by definition, they are already (approximately) zero-mean. We note that centering the data on the origin does not impact SH and PCAE (which perform a PCA of the signatures) or LSBC (which is shift-invariant).

Table 1. mAP results (in %) as a function of the number of bits on the LabelMe dataset. The uncompressed baseline is 100%. For SSH and PCAE, the number of embedding dimensions is limited by the dimensionality of the GIST vectors (320).

  dimensions      16    32    64   128   256   512
  PCA no bin    66.8  87.4  97.3  99.6   100     -
  LSH   Ha      11.5  16.9  24.1  32.2  40.3  46.5
        d_E     13.9  20.7  29.1  37.2  44.4  49.2
        d_LB    13.6  20.3  28.5  36.9  44.3  49.4
  LSBC  Ha       7.5  11.9  19.4  29.7  44.2  59.8
        d_E      9.4  16.4  26.8  39.8  55.1  68.5
        d_LB     8.4  13.9  22.6  34.2  49.5  65.1
  SH    Ha      10.4  12.0  16.8  23.1  24.6  18.3
        d_E     16.7  20.7  29.7  39.6  37.5  22.9
        d_LB    14.4  17.6  28.8  42.1  41.0  24.2
  SSH   Ha       9.6   8.9   8.7   7.7   6.4     -
        d_E     16.8  19.6  21.4  21.8  21.9     -
        d_LB    16.6  19.3  21.1  21.6  21.9     -
  PCAE  Ha       9.8   9.1   8.6   7.7   6.2     -
        d_E     17.5  20.2  21.9  22.4  22.5     -
        d_LB    17.2  19.4  21.1  21.8  21.9     -

Table 2. Precision@500 (in %) as a function of the number of bits on the MNIST dataset. The uncompressed baseline is 72.0%.

  dimensions       8    16    24    32    40    48
  PCA no bin    69.0  75.2  76.2  75.9  75.3  75.0
  LSH   Ha      25.9  37.3  43.6  48.8  52.8  55.7
        d_E     27.7  42.9  49.8  55.2  59.0  61.7
        d_LB    27.7  42.8  49.4  54.6  58.4  60.9
  LSBC  Ha      17.7  24.8  28.1  31.5  34.7  37.3
        d_E     19.6  29.4  34.2  39.1  43.6  46.4
        d_LB    17.9  26.0  29.9  33.6  37.1  39.7
  SH    Ha      44.9  52.5  55.9  58.7  59.8  59.8
        d_E     48.9  62.3  66.6  69.1  70.4  70.0
        d_LB    47.9  60.2  64.3  66.8  67.6  67.5
  SSH   Ha      54.2  55.8  57.2  57.0  56.4  55.5
        d_E     58.1  65.3  68.1  69.0  69.5  69.6
        d_LB    58.1  65.0  67.5  68.1  68.4  68.4
  PCAE  Ha      47.4  53.6  56.7  57.4  57.0  56.2
        d_E     52.3  64.3  68.0  69.4  69.8  70.0
        d_LB    51.7  64.3  67.7  68.9  69.2  69.4

4.3. Results and Analysis

We report results on the four datasets in Tables 1 to 4. We provide two baselines: i) the uncompressed baseline, which computes the Euclidean distance between the original vectors, and ii) the PCA baseline, which computes the Euclidean distance between the vectors after PCA but with no binarization (referred to as "PCA no bin" in the tables). The PCA baseline is important for two reasons: PCA is an integral component of several embedding techniques (SH, SSH and PCAE), and this operation often improves the retrieval accuracy. For the five binary embedding algorithms discussed in section 2 and for each of the four datasets, we provide results for the symmetric Hamming distance Ha and the proposed asymmetric distances d_E and d_LB. We can draw the following conclusions on the proposed asymmetric distances.

Asymmetric vs symmetric. The use of asymmetric distances consistently improves the results over the symmetric Hamming distance, independently of the dataset and of the binary embedding technique. The gain in accuracy is often impressive, both in terms of absolute and relative improvement. For instance, at 128 bits, the improvement of LSBC on LabelMe is approximately 10% absolute and 30% relative, and the improvement of PCAE on Holidays is approximately 10% absolute and 25% relative.
Table 3. 4 × recall@4 as a function of the number of bits on the UKB dataset. The uncompressed baseline is 3.35.

  dimensions      16    32    64   128   256   512  1024
  PCA no bin    2.71  3.02  3.19  3.31  3.39  3.44  3.35
  LSH   Ha      1.04  1.16  1.44  1.90  2.39  2.77  3.01
        d_E     1.06  1.28  1.70  2.21  2.64  2.94  3.12
        d_LB    1.06  1.26  1.65  2.14  2.58  2.90  3.10
  LSBC  Ha      0.40  1.04  1.13  1.33  1.66  2.10  2.51
        d_E     0.27  1.03  1.27  1.63  2.06  2.49  2.82
        d_LB    0.41  1.05  1.17  1.42  1.80  2.25  2.63
  SH    Ha      1.15  1.72  2.15  2.54  2.82  3.00  3.15
        d_E     1.14  2.06  2.53  2.90  3.13  3.25  3.33
        d_LB    1.25  2.00  2.46  2.85  3.09  3.23  3.31
  SSH   Ha      1.32  1.83  2.25  2.56  2.79  2.95  3.09
        d_E     1.40  2.18  2.63  2.91  3.08  3.20  3.29
        d_LB    1.47  2.19  2.63  2.90  3.07  3.19  3.29
  PCAE  Ha      1.31  1.82  2.24  2.56  2.78  2.94  3.09
        d_E     1.38  2.17  2.62  2.90  3.08  3.20  3.29
        d_LB    1.45  2.18  2.62  2.89  3.06  3.18  3.28

Table 4. mAP results (in %) as a function of the number of bits on the Holidays dataset. The uncompressed baseline is 59.7%.

  dimensions      16    32    64   128   256   512  1024
  PCA no bin    40.8  47.2  52.0  55.8  57.9  60.0  59.7
  LSH   Ha       2.8   5.6  12.9  22.3  33.2  42.2  49.8
        d_E      3.7   9.4  18.5  29.1  39.0  47.0  52.3
        d_LB     3.7   8.8  17.1  27.2  37.5  46.0  51.8
  LSBC  Ha       0.9   2.2   4.5   8.7  17.0  25.3  35.7
        d_E      1.2   3.5   8.0  15.2  25.6  34.9  43.8
        d_LB     1.0   2.7   5.7  11.2  19.3  28.6  38.5
  SH    Ha      12.6  23.4  33.0  40.4  45.3  48.3  52.4
        d_E     18.2  31.3  39.1  47.9  51.6  53.6  56.8
        d_LB    17.8  29.8  38.5  46.0  50.6  53.5  56.3
  SSH   Ha      12.9  23.7  35.4  39.6  44.9  49.3  53.4
        d_E     19.2  33.4  43.5  49.9  54.4  58.3  60.8
        d_LB    18.8  33.4  43.7  49.7  53.0  57.7  60.7
  PCAE  Ha      13.8  24.1  34.2  38.0  44.7  47.3  49.8
        d_E     18.7  34.0  42.5  48.7  53.0  57.8  59.7
        d_LB    18.4  35.7  42.9  48.3  52.7  57.4  59.7
Expectation vs lower-bound. For almost all embedding techniques, d_E and d_LB yield very similar results, with a slight advantage for the expectation-based algorithm. This is somewhat surprising given that the two approximations are very different in nature. The advantage of d_E over d_LB comes from the fact that the former approach uses information about the data distribution (through the pre-computed values α) while the latter does not. The only exception is LSBC, where d_E performs significantly better than d_LB. We are still investigating this difference, but we observed that the distributions of the values g_k(x) in the intermediate real-valued space for LSBC are significantly different from those observed for the other embedding methods (typically U-shaped as opposed to Gaussian-shaped). Note that preliminary experiments on fusing d_E and d_LB yielded only marginal improvements.

We also draw the following conclusions with respect to the different binary embedding techniques:

PCA-based vs random-based. PCA-based approaches (SH, SSH and PCAE) clearly outperform random-based approaches (LSH and LSBC), especially for a small number of bits, except on LabelMe. This shows that one can expect better results by exploiting the data structure than by relying on "chance". We underline that LSH and LSBC still enjoy an important property: the guarantee that with more bits the Hamming distance converges to the true distance.

LSH vs LSBC. LSBC is the best performing method on the LabelMe dataset but is significantly outperformed by LSH (as well as all other embedding techniques) on MNIST, UKB and Holidays. This shows that LSBC is appropriate when the goal is to retrieve database points which are contained in a ball centered on a given query point. This is not surprising given that LSBC approximates the Gaussian kernel and that the iso-density lines for this kernel are hyper-spheres. On the other hand, LSBC seems less appropriate for the general k-nearest neighbors problem, since the distance to the k-th neighbor may vary significantly from one query to another.

SH vs SSH vs PCAE. The three PCA-based embedding techniques perform fairly similarly on all four datasets. However, we observe two noticeable exceptions:

• SH performs significantly better on LabelMe. This might be explained by the fact that SH approximates the Gaussian kernel and is therefore very similar in spirit and in practice to LSBC (as noted in section 3 of [13]). Therefore, SH might be more appropriate for ball search problems (as is the case for LSBC) than for k-nearest neighbor problems.

• SSH performs significantly better on MNIST for small operating points (8 and 16 bits). This shows that SSH has an advantage when "real" labels are provided, but that "creating" neighboring relationships from unlabeled data brings little additional information.

These results show that a crucial factor in the success of SH and SSH is the PCA dimensionality reduction (done explicitly in the case of SH and implicitly through the regularization in the case of SSH) and that a conceptually simpler technique such as PCAE can often perform comparably.

4.4. Large-Scale Experiments

We now show that our good results scale to large datasets. For these experiments, we merge Holidays with a set of 1M distractor Flickr images made available by the authors of [5]. We refer to the combined dataset as Holidays+1M. The same 500 queries are used as for the standard Holidays. We experiment with LSH, since it performs significantly better than LSBC on Holidays, and with PCAE, since it performs comparably to SH and SSH while being much simpler.

We first report in Figure 1 the recall@K for an operating point of 128 bits. For both PCAE and LSH, the asymmetric distances consistently improve the recall@K over the symmetric Hamming distance.
[Figure 1. Large-scale experiments on Holidays+1M using 128-bit signatures. Recall@k as a function of k for PCAE and LSH, with the symmetric (binary) distance and the asymmetric distances d_E and d_LB.]

[Figure 2. Comparison of the proposed asymmetric distances to the best methods of Jégou et al. [7] on Holidays+1M. MAP as a function of the number of bits.]