Towards Open-Set Object Detection and Discovery

Jiyang Zheng⋆† Weihao Li† Jie Hong⋆† Lars Petersson† Nick Barnes⋆

⋆ The Australian National University   † Data61-CSIRO


Abstract

With the human pursuit of knowledge, open-set object detection (OSOD) has been designed to identify unknown objects in a dynamic world. However, an issue with the current setting is that all the predicted unknown objects share the same category, “unknown”, which requires incremental learning via a human-in-the-loop approach to label novel classes. In order to address this problem, we present a new task, namely Open-Set Object Detection and Discovery (OSODD). This new task aims to extend the ability of open-set object detectors to further discover the categories of unknown objects based on their visual appearance without human effort. We propose a two-stage method that first uses an open-set object detector to predict both known and unknown objects. Then, we study the representation of the predicted objects in an unsupervised manner and discover new categories from the set of unknown objects. With this method, a detector is able to detect objects belonging to known classes and define novel categories for objects of unknown classes with minimal supervision. We show the performance of our model on the MS-COCO dataset under a thorough evaluation protocol. We hope that our work will promote further research towards a more robust real-world detection system.

Figure 1. A visual comparison of object detection tasks: (a) closed-set object detection, (b) open-set object detection (OSOD) and (c) open-set object detection and discovery (OSODD). In closed-set detection, objects from unseen classes are ignored or incorrectly classified into the set of known classes, while in open-set object detection, unknown objects are localised but share the same “unknown” category. Our task aims to detect objects of known classes and discover novel visual categories for the identified objects of unknown classes, which provides better scene understanding and a scalable learning paradigm. (Example labels in the figure: Dog, Bird, Cat; Unknown (Vase & Potted Plant); Novel Class 1 (Potted Plant), Novel Class 2 (Vase).)

1. Introduction

Object detection is the task of localising and classifying objects in an image. In recent years, deep learning approaches have advanced the detection models [3, 4, 15, 20, 37, 38, 45] and achieved remarkable progress. However, these methods work under a strong assumption that all object classes are known at the training phase. As a result of this assumption, object detectors would incorrectly treat objects of unknown classes as background or classify them as belonging to the set of known classes [11] (see Fig. 1(a)).

To relax the above closed-set condition, open-set object detection (OSOD) [11, 24, 32] considers a realistic scenario where test images might contain novel classes that did not appear during training. OSOD aims at jointly detecting objects from the set of known classes and localising objects that belong to an unknown class. Although OSOD has improved the practicality of object detection by enabling detection of instances of unknown classes, there is still the issue that all identified objects of an unknown class share the same category as “unknown” (see Fig. 1(b)). Additional human annotation is required to incrementally learn novel object categories [24].

Consider a child who is visiting a zoo for the first time. The child can recognise some animals that are seen and learned before, for example, ‘rabbit’ or ‘bird’, while the child might not recognise the species of many other rarely seen animals, like ‘zebra’ and ‘giraffe’. After observing, the child’s perception system will learn from these previously unseen animals’ appearances and cluster them into different categories even without being told what species they are.

In this work, we consider a new task, where we aim to localise objects of both known and unknown classes, assign pre-defined category labels to known objects, and discover new categories for objects of unknown classes (see Fig. 1(c)). We term this task Open-Set Object Detection and Discovery (OSODD). We motivate our proposed task, OSODD, by suggesting that it is better suited to extracting information from images: new category discovery provides additional knowledge of data belonging to classes not seen before, helping intelligent vision-based systems to handle more realistic use cases.

We propose a two-stage framework to tackle the problem of OSODD. First, we leverage the ability of an open-set object detector to detect objects of known classes and identify objects of unknown classes; the predicted proposals of objects of known and unknown classes are saved to a memory buffer. Second, we explore the recurring pattern of all objects and discover new categories from objects of unknown classes. Specifically, we develop a self-supervised contrastive learning approach with domain-agnostic data augmentation and semi-supervised k-means clustering for category discovery.

Our contributions:

• We formalise the task Open-Set Object Detection and Discovery (OSODD), which enables a richer understanding within real-world detection systems.

• We propose a two-stage framework to tackle this problem, and we present a comprehensive protocol to evaluate the object detection and category discovery performance.

• We propose a category discovery method in our framework using domain-agnostic augmentation, contrastive learning and semi-supervised clustering. The novel method outperforms other baseline methods in experiments.

Table 1. Comparisons of different Object Detection and Discovery tasks. OSOD: open-set object detection; ODL: object discovery and localisation. Loc means localise the objects of interest; Cat means discover novel categories.

Task           Dataset    Known classes   Unknown classes
ODL            Open-Set   Non-Action      Loc/Cat
OSOD           Open-Set   Detect          Loc
OSODD (Ours)   Open-Set   Detect          Loc/Cat

2. Related Work

Open-Set Recognition. Compared with closed-set learning, which assumes that only previously known classes are present during testing, open-set learning assumes the co-existence of known and unknown classes. Scheirer et al. [40] first introduced the problem of open-set recognition with incomplete knowledge at training time, i.e., unknown classes can appear during testing. They developed a classifier in a one-vs-rest setting, which enables the rejection of unknown samples. [22, 41] extend the framework in [40] to a multi-class classifier using probabilistic models with extreme value theory to mitigate the fading confidence of the classifier. Recently, Liu et al. [31] proposed a deep metric learning method to identify unseen classes for imbalanced datasets. Self-supervised learning approaches [14, 35, 43] have been explored to minimise external supervision.

Miller et al. [32] first investigated the utility of label uncertainty in object detection under open-set conditions using dropout sampling. Dhamija et al. [11] defined the problem of open-set object detection (OSOD) and conducted a study on how well traditional object detectors avoid classifying objects of unknown classes into one of the known classes. An evaluation metric is also provided to assess the performance of the object detector under the open-set condition.

Open-World Recognition. The open-world setting introduces a continual learning paradigm that extends the open-set condition by assuming new semantic classes are introduced gradually at each incremental time step. Bendale et al. [2] first formalise the open-world setting for image recognition and propose an open-set classifier using the nearest non-outlier algorithm. The model evolves when new labels for the unknown are provided by re-calibrating the class probabilities.

Joseph et al. [24] transfer the open-world setting to an object detection system and propose the task of open-world object detection (OWOD). The model uses example replay to make the open-set detector learn new classes incrementally without forgetting the previous ones. Neither the OWOD nor the OSOD model can explore the semantics of the identified unknown objects, and extra human annotation is required to learn novel classes incrementally. In contrast, our OSODD model can discover novel category labels for objects of unknown classes without human effort.

Novel Category Discovery. The novel category discovery task aims to identify similar recurring patterns in an unlabelled dataset. In image recognition, it was earlier viewed as an unsupervised clustering problem. Xie et al. [46] proposed a deep embedding network that can cluster data and at the same time learn a data representation. Han et al. [18] formulated the task of novel class discovery (NCD), which clusters unlabelled images into novel categories using deep transfer clustering. The NCD setting assumes that the training set contains both labelled and unlabelled data; the knowledge learned on labelled data can be transferred to the targeted unlabelled data for category discovery [13, 17, 23, 48, 52].

Object discovery and localisation (ODL) [6, 9, 27–29, 36] aims to jointly discover and localise dominant objects from an image collection with multiple object classes in an unsupervised manner. Lee and Grauman [27] used object-graph and appearance features for unsupervised discovery. Rambhatla et al. [36] assumed partial knowledge of class labels and conducted the discovery leveraging a dual memory module. Compared to ODL, our OSODD both performs detection on previously known classes and discovers novel categories for unknown objects, which provides a more comprehensive scene understanding.

Please refer to Tab. 1 for the summarised differences between our setting and other similar settings in the object detection problem.

3. Task Format

In this section, we formulate the task of Open-Set Object Detection and Discovery (OSODD). We have a set of known object classes C_k = {C_1, C_2, · · · , C_m}, and there exists a set of unknown visual categories C_u = {C_{m+1}, C_{m+2}, · · · , C_{m+n}}, where C_k ∩ C_u = ∅. The training dataset contains objects from C_k, and the testing dataset contains objects from C_k ∪ C_u. An object instance I is represented by I = [c, x, y, w, h], denoting the class label (c ∈ C_k or C_u), the top-left x, y coordinates, and the width and height of the object bounding box, respectively. A model is trained to localise all objects of interest. Then, it classifies objects of a known class as one of C_k and clusters objects of an unknown class into novel visual categories C_u.

4. Our Approach

This section describes our approach for tackling OSODD, beginning with an overview of our framework. We propose a generic framework consisting of two main modules, Object Detection and Retrieval (ODR) and Object Category Discovery (OCD) (see Fig. 2).

The ODR module uses an open-set object detector with a dual memory buffer for object instance detection and retrieval. The detector predicts objects of known classes with their semantic labels from C_k and the location information, whereas the unknown objects are localised but with no semantic information available. We store the predicted objects in the memory buffer [36], which is used to explore novel categories. The buffer is divided into two parts: known memory and working memory. The known memory contains predicted objects of known classes with semantic labels; the working memory stores all currently identified objects of unknown classes without categorical information. The model studies the recurring pattern of the objects from the memory buffer and discovers novel categories in the working memory. We assign the predicted objects of unknown classes from the detector with novel category labels using the discovered categories. The visualisation is shown in Fig. 4.

The OCD module explores the working memory to discover new visual categories. It consists of an encoder component as the feature extractor and a discriminator which clusters the object representations. To train the encoder, we first retrieve the predicted objects of known classes saved in the known memory and the identified objects of unknown classes saved in the working memory. Then, these instance samples are transformed using class-agnostic augmentation to create a generalised view over the data [10, 26, 51]. We use unsupervised contrastive learning where the predicted labels for the objects of known classes are ignored; the pairwise contrastive loss [33] penalises dissimilarity of the same object in different views regardless of the semantic information. The contrastive learning enables the encoder to learn a more discriminative feature representation in the latent space [7, 19]. Lastly, with the learned feature space from the encoder, the discriminator clusters the object embeddings into novel categories using the constrained k-means clustering algorithm [44].

[Figure 2: pipeline diagram — image set → open-set object detection (unknown-aware RPN, ROI head) → memory buffer (known objects / unknown objects) → category discovery (object-wise mix-up augmentation, unsupervised contrastive learning, constrained k-means clustering) → novel categories; the OSOD prediction is updated to the OSODD prediction with novel category labels.]

Figure 2. Illustration of the two-stage method for Open-Set Object Detection and Discovery (OSODD). The first stage includes detecting objects of known classes and identifying objects of unknown classes using an open-set object detector. The instances of unknown classes are saved into the working memory for category discovery. The instances of known classes are saved into the known memory with their predicted semantic categories to assist the representation learning and clustering. The second stage pre-processes the objects from the memory buffer in an unsupervised manner: the representations of these saved objects are first learned in the latent space by contrastive learning, followed by a constrained k-means clustering used to find the novel categories beyond the known classes. Lastly, we update the open-set detection predictions with the novel category labels to generate the final OSODD prediction (see visualisations in Figs. 3 and 4).

4.1. Object Detection and Retrieval

Open-Set Object Detector. An open-set object detector predicts the location of all objects of interest. Then it classifies the objects into semantic classes and identifies the unseen objects as unknown (see ‘OSOD’ in Fig. 3).

We use the Faster R-CNN architecture [38] as the baseline model, following ORE [24]. Leveraging the class-agnostic property of the region proposal network, we utilise an unknown-aware RPN to identify unknown objects. The unknown-aware RPN labels the proposals that have high scores but do not overlap with any ground-truth bounding box as potential unknown objects. To learn a more discriminative representation for each class, we use a prototype-based contrastive loss on the feature vectors f_c generated by an intermediate layer in the ROI pooling head. A class prototype p_i is computed as the moving average of the class instance representations, and the features f_c of objects will keep approaching their class prototype in the latent space. The objective is formulated as:

    ℓ_pcl(f_c) = Σ_{i=0}^{C} ℓ(f_c, p_i)
    ℓ(f_c, p_i) = ‖f_c, p_i‖                 if i = c
                = max(0, Δ − ‖f_c, p_i‖)     otherwise          (1)

where f_c is the feature vector of class c, p_i is the prototype of class i, ‖f, p‖ measures the distance between feature vectors, and Δ is a fixed value that defines the maximum distance for dissimilar pairs.
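
As a concrete illustration, the prototype term can be written as a short PyTorch-style function. This is a minimal sketch rather than the authors' implementation: it assumes a Euclidean distance for ‖·, ·‖, sums the per-prototype terms over all class prototypes, and uses an illustrative momentum value for the moving-average prototype update.

```python
import torch

def prototype_contrastive_loss(f_c: torch.Tensor, prototypes: torch.Tensor,
                               c: int, delta: float = 1.0) -> torch.Tensor:
    """Sketch of Eq. (1): pull the ROI feature f_c towards its own class
    prototype and push it at least `delta` away from every other prototype.

    f_c:        (D,) feature of an object predicted as class c
    prototypes: (C, D) class prototypes p_i (running means of class features)
    """
    dists = torch.norm(prototypes - f_c.unsqueeze(0), dim=1)  # ||f_c, p_i|| for every i
    pull = dists[c]                                           # i == c term
    mask = torch.ones_like(dists, dtype=torch.bool)
    mask[c] = False
    push = torch.clamp(delta - dists[mask], min=0.0).sum()    # max(0, Δ − ||f_c, p_i||), i ≠ c
    return pull + push

def update_prototype(p_i: torch.Tensor, f_c: torch.Tensor,
                     momentum: float = 0.9) -> torch.Tensor:
    """Moving-average prototype update (momentum value is illustrative)."""
    return momentum * p_i + (1.0 - momentum) * f_c.detach()
```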

The total loss for the region-of-interest pooling is defined as:

    ℓ_roi = α_pcl · ℓ_pcl + α_cls · ℓ_cls + α_reg · ℓ_reg          (2)

where α_pcl, α_cls and α_reg are positive adjustment ratios, and ℓ_cls, ℓ_reg are the regular classification and regression losses.

Given the encoded feature f_c, we use an open-set classifier with an energy-based model [25] to distinguish the objects of known and unknown classes. The trained model is able to assign low energy values to known data and thus creates dissimilar representations of the distributions for the objects of known and unknown classes. When new known-class annotations are made available, we utilise example replay to alleviate forgetting the previous classes.

Figure 3. Comparison between OSOD and OSODD predictions. OSODD (right) extends the OSOD (left) prediction by assigning novel category labels to instances of an unknown class.

Memory Module. As described above, we propose to use a dual memory module to store predicted instances for category discovery. The open-set detector detects the objects of interest with their locations and the predicted label. The objects of a known class I_k are saved into the known memory M_k with their semantic labels c ∈ C_k. These objects are treated as a labelled dataset for the following category discovery. The identified objects of an unknown class I_u are stored in the working memory M_w. We perform the category discovery on M_w, which aims to assign every instance in M_w a novel category label c ∈ C_u. We update the open-set object detector's prediction using the novel category labels and produce our final OSODD predictions.
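
A minimal sketch of how such a dual buffer could be organised is given below; the class and attribute names are illustrative and not taken from the paper's code.

```python
from typing import List, Tuple
import torch

class DualMemory:
    """Illustrative dual memory buffer: the known memory M_k keeps detected
    known-class objects with their predicted labels, while the working memory
    M_w keeps identified unknown objects without any category information."""

    def __init__(self) -> None:
        self.known: List[Tuple[torch.Tensor, str]] = []   # M_k: (object crop/feature, label in C_k)
        self.working: List[torch.Tensor] = []             # M_w: objects predicted as "unknown"

    def add(self, obj: torch.Tensor, label: str) -> None:
        if label == "unknown":
            self.working.append(obj)          # candidate for category discovery
        else:
            self.known.append((obj, label))   # labelled set used to constrain the clustering
```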

4.2. Object Category Discovery

Category Number Estimation. Our category discovery approach requires an estimation of the number of potential classes. We use the class estimation method from [18], one of the most commonly used techniques for image-level novel category discovery. The model uses a k-means clustering method to estimate the category number in the target dataset without any parametric learning. The generalisation ability of the method towards our problem is evaluated in Sec. 6.2.1.

Representation Learning. Representation learning aims to learn more discriminative features for the input samples. We adapt contrastive learning [33] and utilise objects from both the known and the working memory to help the network learn an informative embedding space. The learning is conducted in an unsupervised manner. Following [19], we build a dynamic dictionary to store samples. The network is trained to maximise similarity for positive pairs (an object and its augmented version) while minimising similarity for negative pairs (different object instances) in the embedding space. For an object representation, the contrastive loss is formulated as [8]:

    ℓ_{q,{k}} = −log [ exp(q · k⁺/τ) / ( exp(q · k⁺/τ) + Σ_{k⁻} exp(q · k⁻/τ) ) ]          (3)

where q is a query object representation, {k} is the queue of key object samples, k⁺ is an augmented version of q, known as the positive key, and k⁻ are the representations of other samples, known as the negative keys. τ is a temperature parameter. On top of the contrastive learning head, we adopt an unsupervised augmentation strategy [26] which replaces all samples with mixed samples. It minimises the vicinal risk [5], which discriminates classes with very different pattern distributions and creates more training samples [47]. For each sample in the queue {k}, we combine it with the query object representation q via linear interpolation and generate a new view k_{m,i}. Correspondingly, a new virtual label v_i for the i-th mixed sample x_{m,i} is defined as:

    v_i = 1  if q and k⁺ are chosen;
          0  otherwise          (4)

where q and k⁺ are the positive sample pair; the virtual label is assigned to 1 if the mixing pair are from the same object instance.
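
The sketch below illustrates the two ingredients described above: the InfoNCE-style loss of Eq. (3) for a single query, and the linear-interpolation view with its virtual label around Eq. (4). It is a simplified stand-in rather than the authors' implementation; in particular, the mixing coefficient is fixed here, whereas an i-Mix-style strategy [26] would sample it.

```python
import torch
import torch.nn.functional as F

def info_nce(q: torch.Tensor, k_pos: torch.Tensor, queue: torch.Tensor,
             tau: float = 0.07) -> torch.Tensor:
    """Contrastive loss of Eq. (3) for one query.

    q:      (D,) encoded query object
    k_pos:  (D,) encoding of the augmented view of the same object (k+)
    queue:  (K, D) encodings of other objects in the dictionary (k-)
    """
    l_pos = (q * k_pos).sum() / tau                  # q · k+ / τ
    l_neg = queue @ q / tau                          # q · k- / τ for every negative key
    logits = torch.cat([l_pos.unsqueeze(0), l_neg])  # positive key placed first
    return -F.log_softmax(logits, dim=0)[0]          # −log( exp(pos) / Σ exp(·) )

def mix_view(q_input: torch.Tensor, k_input: torch.Tensor,
             same_instance: bool, lam: float = 0.5):
    """Mixed view k_{m,i} and virtual label v_i around Eq. (4): each key sample
    is linearly interpolated with the query, and v_i marks whether the mixing
    pair comes from the same object instance. A fixed lam is used for brevity."""
    x_mix = lam * q_input + (1.0 - lam) * k_input
    v_i = 1.0 if same_instance else 0.0
    return x_mix, v_i
```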

Novel Category Labelling. Using the encoded representation of the objects, we perform the label assignments using constrained k-means clustering [44], a non-parametric semi-supervised clustering method. The constrained k-means clustering takes object encodings from both the known and the working memory as its input. It converts the standard k-means clustering into a constrained algorithm by forcing the labelled object representations to be hard-assigned to their ground-truth class. In particular, we treat the object instances from the known memory M_k as the labelled samples. We manually calculate the centroid for each labelled class. These centroids from M_k serve as the first group of initial centroids for the k-means algorithm. We then randomly initialise the rest of the centroids for novel categories using the k-means++ algorithm [1]. In each iteration, the labelled object instances are assigned to the pre-defined clusters, while the unknown object instances from M_w are assigned to the cluster with the minimal distance between the cluster centroid and the object embedding. By doing this, we effectively avoid falsely predicted objects (i.e. objects that belong to one of the semantic classes being predicted as unknown) from influencing the centroid update. We run the last cluster assignment step using only the novel centroids to ensure that all unknown objects from the working memory are assigned to a discovered visual category in the final prediction. The novel centroids from the algorithm represent the discovered novel categories.
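
A compact NumPy sketch of the constrained clustering step is shown below. It is a simplification of the procedure described above (for brevity it seeds the novel centroids randomly rather than with k-means++ and keeps the known centroids fixed at their labelled class means), so it should be read as an illustration of the constraint logic, not as the exact algorithm.

```python
import numpy as np

def constrained_kmeans(z_known, y_known, z_unknown, n_novel, n_iter=50, seed=0):
    """Semi-supervised constrained k-means (simplified). Labelled embeddings from
    M_k stay hard-assigned to their ground-truth class and fix the known centroids;
    embeddings from M_w may join any cluster, and the final assignment uses only
    the novel centroids, so every unknown object receives a discovered category."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y_known)
    known_centroids = np.stack([z_known[y_known == c].mean(axis=0) for c in classes])
    # Novel centroids seeded from unknown samples (the paper uses k-means++ seeding).
    novel_centroids = z_unknown[rng.choice(len(z_unknown), n_novel, replace=False)]

    for _ in range(n_iter):
        centroids = np.concatenate([known_centroids, novel_centroids])
        d = np.linalg.norm(z_unknown[:, None, :] - centroids[None, :, :], axis=-1)
        assign = d.argmin(axis=1)                      # nearest centroid, known or novel
        for j in range(n_novel):                       # only the novel centroids are updated
            members = z_unknown[assign == len(classes) + j]
            if len(members) > 0:
                novel_centroids[j] = members.mean(axis=0)

    # Final step: assign every unknown object to one of the discovered categories.
    d_novel = np.linalg.norm(z_unknown[:, None, :] - novel_centroids[None, :, :], axis=-1)
    return d_novel.argmin(axis=1)
```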

5. Experimental Setup

We provide a comprehensive evaluation protocol for studying the performance of our model in detecting objects from known classes and discovering novel categories for objects of unknown classes in our target dataset.

5.1. Benchmark Dataset

Pascal VOC 2007 [12] contains 10k images with 20 labelled classes. MS-COCO [30] contains around 80k training and 5k validation images with 80 labelled classes. These two object detection datasets are used to build our benchmark. Following the setting of open-world object detection [49], the classes are separated into known and unknown for three tasks T = {T_1, T_2, T_3}. For task T_t ∈ T, all known classes from {T_i | i < t} are treated as known classes for T_t, while the remaining classes are treated as unknown. For the first task T_1, we consider the 20 VOC classes as known classes, and the remaining non-overlapping 60 classes in MS-COCO are treated as the unknown classes. New classes are added to the known set in the successive tasks, i.e., T_2 and T_3. For evaluation, we use the validation set from MS-COCO except for 48 images that are incompletely labelled [49]. We summarise the benchmark details in Tab. 2.

Table 2. Details of the class splits for the benchmark. Task-1, Task-2 and Task-3 have different splits of known and unknown classes.

                  Task-1        Task-2                Task-3
Semantic Split    VOC Classes   Outdoor, Accessory,   Sports, Wild Animal,
                                Appliance, Truck      Food
Known/Unknown     20/60         40/40                 60/20
Training Set      16551         45520                 39402
Validation Set    1000
Test Set          4952

5.2. Evaluation Metrics

Object Detection Metrics. A qualified open-set object detector needs to accurately distinguish unknown objects [11]. UDR (Unknown Detection Recall) [49] is defined as the localisation rate of unknown objects, and UDP (Unknown Detection Precision) [49] is defined as the rate of correct rejection of objects of an unknown class. Let true-positives (TP_u) be the predicted unknown object proposals that have an intersection over union IoU > 0.5 with ground-truth unknown objects, half false-negatives (FN*_u) be the predicted known object proposals that have IoU > 0.5 with ground-truth unknown objects, and false-negatives (FN_u) be the missed ground-truth unknown objects. UDR and UDP are calculated as follows:

    UDR = (TP_u + FN*_u) / (TP_u + FN_u)
    UDP = TP_u / (TP_u + FN*_u)          (5)
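
Given the counts defined above, UDR and UDP reduce to two ratios; the following is a direct transcription of Eq. (5):

```python
def unknown_detection_metrics(tp_u: int, fn_star_u: int, fn_u: int):
    """Eq. (5). tp_u: unknown proposals with IoU > 0.5 to a ground-truth unknown;
    fn_star_u: proposals predicted as a known class with IoU > 0.5 to a
    ground-truth unknown; fn_u: ground-truth unknown objects that were missed."""
    udr = (tp_u + fn_star_u) / (tp_u + fn_u)   # Unknown Detection Recall
    udp = tp_u / (tp_u + fn_star_u)            # Unknown Detection Precision
    return udr, udp
```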

In our task, the other important aspect is to localise and classify objects of interest from the known classes. We evaluate the closed-set detection performance using the standard mean average precision (mAP) at an IoU threshold of 0.5 [38]. To show the incremental learning ability, we provide the mAP measurement for the newly introduced known classes and the previously known classes separately [24, 34].

Category Discovery Metrics. Category discovery can be evaluated using clustering metrics [18, 21, 27, 36, 44, 50]. We adopt the three most commonly used clustering metrics for our object-based category discovery performance. Suppose a predicted proposal of an object of an unknown class has been matched to a ground-truth unknown object. Let the predicted category label of the object proposal be ŷ_i, and let the ground-truth label for the object be denoted as y_i. We calculate the clustering accuracy (ACC) [18] by:

    ACC = max_{p ∈ P_y} (1/N) Σ_{i=1}^{N} 1{ y_i = p(ŷ_i) }          (6)

where N is the number of clusters, and P_y is the set of all permutations of the unknown class labels.

Mutual information I(X, Y) quantifies the correlation between two random variables X and Y. The range of I(X, Y) is from 0 (independent) to +∞. Normalised mutual information (NMI) [42] is bounded in the range [0, 1]. Let Cl be the set of ground-truth classes, and Ĉl be the set of predicted clusters. The NMI is formulated as:

    NMI = I(Cl, Ĉl) / ( [H(Cl) + H(Ĉl)] / 2 )          (7)

where I(Cl, Ĉl) is the sum of the mutual information between each class-cluster pair, and H(Cl) and H(Ĉl) compute the entropy using maximum likelihood estimation. The Purity of the clusters is defined as:

    Purity = (1/N) Σ_{i=1}^{N} max_k |Cl_k ∩ Ĉl_i|          (8)

Here, N is the number of clusters and the max term is the highest count of objects for a single class within each cluster.
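
For reference, the three metrics can be computed with standard tools. The sketch below uses Hungarian matching as the usual stand-in for the maximum over label permutations in Eq. (6) and sklearn's NMI; following common practice, ACC and purity are normalised here by the number of evaluated objects.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """ACC (Eq. (6)): best one-to-one match between predicted clusters and
    ground-truth classes, found with the Hungarian algorithm."""
    n = max(y_true.max(), y_pred.max()) + 1
    count = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        count[t, p] += 1
    rows, cols = linear_sum_assignment(count, maximize=True)
    return count[rows, cols].sum() / len(y_true)

def purity(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Purity (Eq. (8)): each predicted cluster is credited with its most
    frequent ground-truth class; normalised by the number of evaluated objects."""
    total = 0
    for c in np.unique(y_pred):
        total += np.bincount(y_true[y_pred == c]).max()
    return total / len(y_true)

# NMI (Eq. (7)) is available directly, e.g.:
# nmi = normalized_mutual_info_score(y_true, y_pred)
```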

6. Results and Analysis

6.1. Baselines

Object Detection Baselines. Our framework uses an open-set object detector for known and unknown instance detection. We compare two recent approaches: Faster-RCNN+ [24] and ORE [24]. Faster-RCNN+ is a popular two-stage object detection method, modified from Faster R-CNN [38] to localise objects of unknown classes by additionally adapting an unknown-aware region proposal. ORE uses contrastive clustering and an energy-based classifier to discriminate the representations of known and unknown data. Our generic framework can cooperate with any open-set object detector, hence it is highly flexible.

Category Discovery Baselines. We compare our novel method with three baseline methods: k-means, FINCH [39] and a modified approach derived from DTC [18].

K-means clustering is a non-parametric clustering method that minimises within-cluster variances. In every iteration, the algorithm first assigns the data points to the cluster with the minimum pairwise squared deviation between samples and centroids; then, it updates the cluster centroids with the current data points belonging to each cluster.

FINCH [39] is a parameter-free clustering method that discovers linking chains in the data by using the first nearest neighbour. The method directly develops the grouping of data without any external parameters. To make a fair comparison, we set the number of clusters to the same as the other baseline methods. We discuss the performance of FINCH in estimating the number of novel classes in Sec. 6.2.1.

DTC+. The DTC method [18] was proposed for NCD problems [16], where the setting assumes the availability of unlabelled data at the training phase. The algorithm modifies deep embedded clustering [46] to learn knowledge from the labelled subset and transfer it to the unlabelled subset. This setting requires the unlabelled data in the training and testing sets to be from the same classes. However, no unknown instances are available in training under the open-set detection setting; hence, NCD-based approaches such as DTC cannot be directly applied to our problem. To facilitate the method in our setting, we modify it by transferring a portion of the classes from the known memory to the working memory during training and treating them as additional unknown classes. We evaluate DTC's generalisation performance on our problem in Sec. 6.2.3.

6.2. Experimental Results

We report the quantitative results of the novel category number estimation, object detection and novel category discovery performance in Secs. 6.2.1 to 6.2.3. We show and discuss the qualitative results in Fig. 4 and in the supplementary material.

6.2.1 Novel Category Number Estimation

We show the results of estimating the number of novel categories in Tab. 3. The middle two columns show the automatically discovered grouping by the FINCH algorithm [39]: the numbers are under-estimated by a large margin of 30%, 32.5% and 40%, respectively. The last two columns show the results using DTC [18]: the estimated numbers are lower than the ground-truth class numbers, with an average error rate of 21%. By exploring the ground-truth labels in the grouping, we found that both methods tend to ignore object classes with a small number of samples. Compared to class estimation in the image recognition task [18, 44], the detection task faces more biased datasets as well as fewer available samples. Hence, object category number estimation remains a challenging task.

Table 3. Results of novel category number estimation.

Task   GT   FINCH [39]   Error   Est. [18]   Error
1      60   42           30%     48          20%
2      40   27           32.5%   31          22.5%
3      20   12           40%     16          20%

Table 4. Baseline model comparison for open-set detectors. The mean average precision (mAP) is recorded for the previously/currently known objects; there are no previously known classes for Task-1.

            Task-1                      Task-2                        Task-3
Method      mAP        UDR     UDP     mAP            UDR     UDP     mAP            UDR     UDP
F-RCNN+     - / 56.16  20.14   -       51.09 / 23.84  21.54   -       35.69 / 11.53  30.01   -
ORE [24]    - / 56.02  20.10   36.74   52.19 / 25.03  22.63   21.51   37.23 / 12.02  31.82   23.55

6.2.2 Open-Set Object Detection

We compare two baseline models for the object detection part of our framework and show the results in Tab. 4. For each task, we record the mAP of all objects to evaluate the closed-world detection result. UDR and UDP reflect the unknown objectness and discrimination performance. ORE outperforms the modified Faster-RCNN on known-class detection by a small margin (−0.14%, +1.14% and +1.01%, respectively). The mAP scores get lower when new semantic classes are introduced. The UDR results show that ORE performs better on unknown object localisation, with a +0.95% average unknown detection rate. As opposed to closed-set detection, the UDR scores improve when more classes are made available to the model. The Faster-RCNN baseline can only localise objects of an unknown class; it does not distinguish them from known classes, hence there is no UDP score.

6.2.3 Novel Category Discovery

Results of the object category discovery are shown in Tab. 5 and Tab. 6. The test condition is the same as for open-set detection. Our discovery method is able to accurately explore novel categories among the objects of unknown classes.

Using the estimated number of classes, the discovery results are reported in Tab. 5. We observe that our method outperforms the other baseline methods in the first two tasks. In Task-3, where there are 60 known classes and 20 unknown classes, our accuracy and purity scores are slightly lower than the FINCH algorithm, by 0.8% and 0.1%. We suggest that Task-3 may contain more biased unknown object classes, making it more challenging for self-supervised learning to learn generalised representations.

We report the results using the ground-truth number of classes in Tab. 6. The results are similar to Tab. 5, where our method has the best aggregated performance over the three tasks. The method achieves respectable quantitative results considering the difficulty of the task.

Table 5. Results of discovery with the estimated class number (48, 31 and 16 for Task-1, Task-2 and Task-3, respectively). The highest score in each column is bold in black, and the second-highest score in each column is bold in grey. Our novel method outperforms the proposed baseline models for all scores in Task-1 and Task-2. The cluster accuracy and purity scores are the second-highest in Task-3, with a marginal difference to the best-performing baseline.

             Task-1                Task-2                Task-3
Method       NMI    ACC   Purity   NMI    ACC   Purity   NMI    ACC   Purity
K-means      8.5    5.3   9.3      5.0    6.2   12.0     5.3    10.9  27.6
FINCH [39]   2.8    6.0   8.2      5.4    6.3   9.9      5.3    17.2  29.4
DTC+ [18]    7.5    4.6   5.2      4.0    4.2   7.5      3.9    5.0   25.4
Ours         11.0   6.3   12.6     5.8    6.9   13.3     6.5    16.4  29.3

Table 6. Results of discovery with the ground-truth class number (60, 40 and 20 for Task-1, Task-2 and Task-3, respectively). The highest score in each column is bold in black, and the second-highest score in each column is bold in grey. With the pre-defined number of classes, our method achieves the highest scores for all three tasks, except for the accuracy in Task-3, which is behind the highest-scoring baseline method by a small margin. The overall performance of our method is the best among all the proposed baselines.

             Task-1                Task-2                Task-3
Method       NMI    ACC   Purity   NMI    ACC   Purity   NMI    ACC   Purity
K-means      11.9   6.0   12.4     5.9    6.1   12.8     6.0    11.6  27.9
FINCH [39]   10.3   6.1   12.5     4.8    7.5   13.4     5.5    13.6  28.3
DTC+ [18]    8.3    4.7   9.2      4.2    5.0   12.1     5.0    7.7   26.1
Ours         13.1   6.5   13.1     7.0    7.5   13.8     6.1    13.2  29.1

6.3. Ablation Study

To study the contribution of each component in our proposed framework, we design ablation experiments and show the results in Tab. 7.

Figure 4. Visualisation of OSODD predictions for Task-1. The tennis racket, stop sign, fire hydrant, clock, giraffe and zebra are novel classes that have not been introduced at this stage. The same bounding box colour indicates objects that belong to the same class or novel category. The last column demonstrates a failure case where a giraffe is not detected and one of the zebras is assigned to the wrong visual category. More visualised results are provided in the supplementary material.

Table 7. Ablation study on the components of our proposed category discovery method (Representation Learning: mix-up augmentation and contrastive learning; Category Discovery: semi-supervised clustering). The complete method with all the proposed modules achieves the best aggregated performance in all tasks, which shows that each component contributes to the method.

Case   Mix-Up   Contrastive   Semi-sup.     Task-1               Task-2               Task-3
       Aug.     Learning      Clustering    NMI   ACC   Purity   NMI   ACC   Purity   NMI   ACC   Purity
I      ✗        ✗             ✓             8.9   5.6   10.5     4.7   5.4   11.9     5.5   14.7  27.7
II     ✗        ✓             ✓             10.5  6.3   12.0     5.6   5.4   13.2     6.1   15.5  28.6
III-1  ✓        ✓             ✗             9.6   5.7   11.7     5.2   6.3   12.9     5.8   15.9  28.8
III-2  ✓        ✓             ✗             7.4   6.3   12.3     5.4   6.4   13.1     6.0   16.8  28.7
IV     ✓        ✓             ✓             11.0  6.3   12.6     5.8   6.9   13.3     6.5   16.4  29.3

Representation Learning. The effects of representation learning on discovering novel classes are shown in Cases I, II and IV. The clustering result without the learned encoding is reported in Case I, and the result with only contrastive learning is reported in Case II. We observe that the performance without encoding is around 10% lower compared to Case IV, which is our full method. Contrastive learning without the mix-up augmentation yields higher scores than Case I, but it is still around 4% lower in the aggregated scores compared to Case IV. This suggests that representation learning is critical for constructing a strong baseline.

Category Discovery. We evaluate the effects of using semi-supervised clustering in Cases III-1, III-2 and IV. In Case III-1, we make the clustering algorithm fully unsupervised by removing the labelled centroids and instances; the results decrease by around 8% in all tasks. Since the FINCH algorithm [39] shows competitive results in Tab. 5 and Tab. 6, in Case III-2 we replace the semi-supervised clustering with the FINCH algorithm. The results show that Case IV outperforms Case III-2 in the task-aggregated scores, which indicates that our model better clusters the samples within the same learned feature space.

Memory Module. To show the effect of the current memory design, we ablate the module by removing the known memory from the representation learning. We report the results in the supplementary material.

7. Conclusion

In this work, we propose a framework to detect known objects and discover novel visual categories for unknown objects. We term this task Open-Set Object Detection and Discovery (OSODD), as a natural extension of open-set object detection tasks. We develop a two-stage framework and a novel method for label assignment, outperforming other popular baselines. Compared to detection and discovery tasks, OSODD can provide more comprehensive information for real-world practice. We hope our work will contribute to the object detection community and motivate further research in this area.

References

[1] David Arthur and Sergei Vassilvitskii. k-means++: The advantages of careful seeding. Technical report, Stanford, 2006.
[2] Abhijit Bendale and Terrance Boult. Towards open world recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1893–1902, 2015.
[3] Zhaowei Cai and Nuno Vasconcelos. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6154–6162, 2018.
[4] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In European Conference on Computer Vision, pages 213–229. Springer, 2020.
[5] Olivier Chapelle, Jason Weston, Léon Bottou, and Vladimir Vapnik. Vicinal risk minimization. Advances in Neural Information Processing Systems, 13, 2000.
[6] Jia Chen, Yasong Chen, Weihao Li, Guoqin Ning, Mingwen Tong, and Adrian Hilton. Channel and spatial attention based deep object co-segmentation. Knowledge-Based Systems, 211:106550, 2021.
[7] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, pages 1597–1607. PMLR, 2020.
[8] Xinlei Chen, Haoqi Fan, Ross Girshick, and Kaiming He. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020.
[9] Minsu Cho, Suha Kwak, Cordelia Schmid, and Jean Ponce. Unsupervised object discovery and localization in the wild: Part-based matching with bottom-up region proposals. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1201–1210, 2015.
[10] Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. AutoAugment: Learning augmentation policies from data. arXiv preprint arXiv:1805.09501, 2018.
[11] Akshay Dhamija, Manuel Gunther, Jonathan Ventura, and Terrance Boult. The overlooked elephant of object detection: Open set. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1021–1030, 2020.
[12] Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
[13] Enrico Fini, Enver Sangineto, Stéphane Lathuilière, Zhun Zhong, Moin Nabi, and Elisa Ricci. A unified objective for novel class discovery. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9284–9292, 2021.
[14] Spyros Gidaris, Praveer Singh, and Nikos Komodakis. Unsupervised representation learning by predicting image rotations. In ICLR, 2018.
[15] Ross Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440–1448, 2015.
[16] Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, and Andrew Zisserman. Automatically discovering and learning new visual categories with ranking statistics. arXiv preprint arXiv:2002.05714, 2020.
[17] Kai Han, Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Andrea Vedaldi, and Andrew Zisserman. AutoNovel: Automatically discovering and learning novel visual categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
[18] Kai Han, Andrea Vedaldi, and Andrew Zisserman. Learning to discover novel visual categories via deep transfer clustering. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019.
[19] Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9729–9738, 2020.
[20] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 2961–2969, 2017.
[21] Jie Hong, Weihao Li, Junlin Han, Jiyang Zheng, Pengfei Fang, Mehrtash Harandi, and Lars Petersson. GOSS: Towards generalized open-set semantic segmentation. arXiv preprint arXiv:2203.12116, 2022.
[22] Lalit P Jain, Walter J Scheirer, and Terrance E Boult. Multi-class open set recognition using probability of inclusion. In European Conference on Computer Vision, pages 393–409. Springer, 2014.
[23] Xuhui Jia, Kai Han, Yukun Zhu, and Bradley Green. Joint representation learning and novel category discovery on single- and multi-modal data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 610–619, 2021.
[24] KJ Joseph, Salman Khan, Fahad Shahbaz Khan, and Vineeth N Balasubramanian. Towards open world object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5830–5840, 2021.
[25] Yann LeCun, Sumit Chopra, Raia Hadsell, M Ranzato, and F Huang. A tutorial on energy-based learning. Predicting Structured Data, 1(0), 2006.
[26] Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, and Honglak Lee. i-Mix: A domain-agnostic strategy for contrastive representation learning. arXiv preprint arXiv:2010.08887, 2020.
[27] Yong Jae Lee and Kristen Grauman. Object-graphs for context-aware category discovery. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1–8. IEEE, 2010.
[28] Weihao Li, Omid Hosseini Jafari, and Carsten Rother. Deep object co-segmentation. In Asian Conference on Computer Vision, pages 638–653. Springer, 2018.
[29] Weihao Li, Omid Hosseini Jafari, and Carsten Rother. Localizing common objects using common component activation map. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[30] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer, 2014.
[31] Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, and Stella X Yu. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2537–2546, 2019.
[32] Dimity Miller, Lachlan Nicholson, Feras Dayoub, and Niko Sünderhauf. Dropout sampling for robust object detection in open-set conditions. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 3243–3249. IEEE, 2018.
[33] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
[34] Can Peng, Kun Zhao, and Brian C Lovell. Faster ILOD: Incremental learning for object detectors based on Faster R-CNN. Pattern Recognition Letters, 140:109–115, 2020.
[35] Pramuditha Perera, Vlad I Morariu, Rajiv Jain, Varun Manjunatha, Curtis Wigington, Vicente Ordonez, and Vishal M Patel. Generative-discriminative feature representations for open-set recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11814–11823, 2020.
[36] Sai Saketh Rambhatla, Rama Chellappa, and Abhinav Shrivastava. The pursuit of knowledge: Discovering and localizing novel categories using dual memory. arXiv preprint arXiv:2105.01652, 2021.
[37] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.
[38] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1137–1149, 2016.
[39] Saquib Sarfraz, Vivek Sharma, and Rainer Stiefelhagen. Efficient parameter-free clustering using first neighbor relations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8934–8943, 2019.
[40] Walter J Scheirer, Anderson de Rezende Rocha, Archana Sapkota, and Terrance E Boult. Toward open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7):1757–1772, 2012.
[41] Walter J Scheirer, Lalit P Jain, and Terrance E Boult. Probability models for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11):2317–2324, 2014.
[42] Alexander Strehl and Joydeep Ghosh. Cluster ensembles — a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research, 3(Dec):583–617, 2002.
[43] Jihoon Tack, Sangwoo Mo, Jongheon Jeong, and Jinwoo Shin. CSI: Novelty detection via contrastive learning on distributionally shifted instances. In NeurIPS, 2020.
[44] Sagar Vaze, Kai Han, Andrea Vedaldi, and Andrew Zisserman. Generalized category discovery. arXiv preprint arXiv:2201.02609, 2022.
[45] Xin Wang, Thomas E Huang, Trevor Darrell, Joseph E Gonzalez, and Fisher Yu. Frustratingly simple few-shot object detection. arXiv preprint arXiv:2003.06957, 2020.
[46] Junyuan Xie, Ross Girshick, and Ali Farhadi. Unsupervised deep embedding for clustering analysis. In International Conference on Machine Learning, pages 478–487. PMLR, 2016.
[47] Hongyi Zhang, Moustapha Cisse, Yann N Dauphin, and David Lopez-Paz. mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412, 2017.
[48] Bingchen Zhao and Kai Han. Novel visual category discovery with dual ranking statistics and mutual knowledge distillation. Advances in Neural Information Processing Systems, 34, 2021.
[49] Xiaowei Zhao, Xianglong Liu, Yifan Shen, Yuqing Ma, Yixuan Qiao, and Duorui Wang. Revisiting open world object detection. arXiv preprint arXiv:2201.00471, 2022.
[50] Zhun Zhong, Enrico Fini, Subhankar Roy, Zhiming Luo, Elisa Ricci, and Nicu Sebe. Neighborhood contrastive learning for novel class discovery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10867–10875, 2021.
[51] Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, and Yi Yang. Random erasing data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 13001–13008, 2020.
[52] Zhun Zhong, Linchao Zhu, Zhiming Luo, Shaozi Li, Yi Yang, and Nicu Sebe. OpenMix: Reviving known knowledge for discovering novel visual categories in an open world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9462–9470, 2021.