The Effects of Label Errors in Training Data on Model Performance and Overfitting
Nicholas Pellegrino*1 Nolen Zhao*1,2 Paul Fieguth1
1 Vision and Image Processing Group, Systems Design Engineering, University of Waterloo
2 Mechanical & Mechatronics Engineering, University of Waterloo
{npellegr,n37zhao,pfieguth}@uwaterloo.ca
Abstract

Training data used in machine learning applications are often assumed to be perfect, i.e., do not contain any errors; however, this is almost never the case and may lead to limitations in the resulting model performance. In this paper, the effects of the presence of label errors in training data are studied quantitatively and in relation to model overfitting. By artificially creating label errors, it is observed that a constrained (small) CNN model exhibits remarkable generalizability — retaining high accuracy even when most data are mislabelled! Test accuracy catastrophically falls only for unrealistically high label error rates, at a point related to the number of classes present in the data. These preliminary experiments pave the road towards further studies of model robustness, possibly offering a quantitative method through which to compare models.
1 Introduction
In supervised learning problems, a set of labelled data, known as training data, are required to optimize / train the model [1, 2]. Deep neural networks, including convolutional neural networks (CNNs) [3, 4], consist of layers of interconnected artificial neurons with associated weights which must be optimized in order to train the model. Machine learning engineers normally assume that the “ground truth” training data are labelled correctly; however, this is not necessarily the case, and in fact is often not the case! Indeed, in many benchmark datasets, label errors are present at rates on the order of 5% [5], for example, in ImageNet [6]. In biological data, for example the recently introduced BIOSCAN-1M Insect Dataset [7], where images of insects are labelled according to their taxonomy, the presence of labelling errors is nearly inevitable given the difficulty of the taxonomic assessment problem [7, 8] and human error. In cases where training data label errors exist, one must ask how model performance ought to be evaluated, and what it means to achieve a particular percentage accuracy when some (likely unknown) fraction of labels are incorrect.

Fig. 1: Training data may contain both outliers and label errors. The two columns include versions of a 2-class dataset: one with outliers and the other with label errors. The first row shows the data points, while the subsequent rows show nearest-neighbour (1-NN) and 5-nearest-neighbour (5-NN) classification regions. Data point shape (circle vs. triangle) indicates true class and colour (red vs. blue) indicates ground truth label. Outliers and mislabelled data may appear to be similar, but arise from completely different causes.
For illustration purposes, two versions of labelled data from a simple 2-class problem are pictured in the top row of Figure 1. In the first column, the data contain outliers, and in the second column, there are label errors. Data point shape (circle vs. triangle) indicates the true class and colour (red vs. blue) indicates the ground truth (training) label. Note that while outliers and mislabelled data may appear to be similar, the two arise from completely different causes and will impact classification models differently. Assuming data are clustered with high density, surrounding some prototypical center point, outlier points are those that are far from their true class’s center, whereas mislabelled data may appear anywhere but are often (due to the assumption of high density) near their true class’s center. The presence of mislabelled data may lead to local classification error, especially in cases of overfitting. Indeed, a more local classifier, such as a nearest-neighbour scheme (shown in the middle row of Figure 1), would be highly susceptible to overfitting and local errors, whereas a more global classifier, for example a 5-nearest-neighbour scheme (shown in the bottom row of Figure 1), would be less susceptible to overfitting and local classification errors due to its reliance on the consensus of multiple training data points.
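To make this contrast concrete, the following is a minimal sketch (not part of the original experiments) of the effect illustrated in Figure 1, using scikit-learn's KNeighborsClassifier on a synthetic 2-class dataset with a fraction of labels flipped; the cluster parameters and flip rate are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_per_class = 200

# Two well-separated Gaussian clusters (true classes 0 and 1).
def sample():
    return np.vstack([
        rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(n_per_class, 2)),
        rng.normal(loc=[+2.0, 0.0], scale=1.0, size=(n_per_class, 2)),
    ]), np.repeat([0, 1], n_per_class)

X_train, y_true = sample()
X_test, y_test = sample()  # clean test set from the same distribution

# Corrupt 20% of the training labels (flip to the other class).
y_train = y_true.copy()
flip = rng.random(y_train.size) < 0.20
y_train[flip] = 1 - y_train[flip]

# 1-NN memorizes the mislabelled points; 5-NN relies on local consensus.
for k in (1, 5):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    print(f"{k}-NN test accuracy: {clf.score(X_test, y_test):.3f}")
```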
While simple nearest-neighbour classification schemes may be easy to envision and intuitively understood for simple problems such as that of Figure 1, the behaviours of deep-neural-network-based classifiers on real-world problems are not. This paper studies the impact of having mislabelled training data by artificially corrupting the training data from a familiar benchmark dataset, MNIST [9], and then training and evaluating a simple CNN model. By setting the corruption rate, evaluations of model overfitting are made in a very controlled environment. Techniques shown here may also lend themselves toward determining whether a particular model type may be more or less robust to the presence of training label errors.

* Indicates equal contribution, joint first-authorship.

2 Background

As introduced in Section 1, biological data are especially prone to mislabelling due to their complex nature. In particular, the BIOSCAN project [10] is an ecologically important and relevant research effort in which the presence of label errors must be considered. In the BIOSCAN project, insects are hand-labelled by taxonomic experts who make their assessments based on captured images. The main difficulty here, ignoring the requirement for a high level of expertise, is the lack of consensus and certainty about the taxonomy of life itself (i.e., the locations and numbers of branches / subcategories within the tree-like hierarchy). Fundamentally, the taxonomic categorization of life is based on theory more so than an observable underlying structure. Indeed, much controversy may be found within the community of taxonomists! Nonetheless, it is accepted that a hierarchical structure does exist and may eventually be largely uncovered. Therefore, the notion of what should be considered an error is somewhat vague. Errors may arise as a result of human error (e.g., labelling two examples of the same species as being of different taxa), or as a result of simply not knowing in which category a given example belongs (e.g., labelling an example (or an entire group of examples) as being part of a given category when in fact it would better fit elsewhere). In the BIOSCAN-1M Insect Dataset, the error rate is unknown; however, there is no doubt that some errors are present.
To address the presence of label errors, in 2021, Northcutt et al. developed a method for automatically detecting and correcting errors in training data, known as Confident Learning [5, 11]. In doing so, benchmark datasets including MNIST [9], CIFAR [12], ImageNet [6], and more were examined, and possibly mislabelled examples were identified. Crowd-sourcing (Mechanical Turk) was then used both to verify which selected examples were indeed incorrectly labelled, and to propose a corrected label through consensus. These results are available at labelerrors.com and provide a valuable resource for those in the field. While this work provides one possible path forwards in contending with label errors in training data, little is known about the behaviour and robustness of specific deep neural network architectures in terms of handling label errors.
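As a rough illustration of the idea behind Confident Learning (a simplified sketch, not the authors' exact algorithm nor the cleanlab implementation), a per-class confidence threshold can be estimated from out-of-sample predicted probabilities, and examples whose given label disagrees with a sufficiently confident alternative class can be flagged for review:

```python
import numpy as np

def flag_possible_label_errors(labels: np.ndarray, pred_probs: np.ndarray) -> np.ndarray:
    """Simplified confident-learning-style filter.

    labels:     (N,) integer given labels
    pred_probs: (N, M) out-of-sample predicted class probabilities
    Returns a boolean mask of examples flagged as possible label errors.
    """
    n_classes = pred_probs.shape[1]

    # Per-class threshold: mean self-confidence over examples given that label.
    thresholds = np.array([
        pred_probs[labels == j, j].mean() if np.any(labels == j) else 1.0
        for j in range(n_classes)
    ])

    # Flag an example if some other class is predicted above that class's
    # threshold and more strongly than the given label.
    flagged = np.zeros(len(labels), dtype=bool)
    for n, (given, probs) in enumerate(zip(labels, pred_probs)):
        confident = np.where(probs >= thresholds)[0]
        confident = confident[confident != given]
        if confident.size and probs[confident].max() > probs[given]:
            flagged[n] = True
    return flagged
```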
3 Preliminary Experiments & Results
Experiments are conducted upon the MNIST dataset, known to have a very low error rate (0.15%) [5] due to its simplicity. To evaluate the impact of having increased error rates on model accuracy, the training partition of the dataset is artificially corrupted. Data are re-labelled according to a specified corruption rate, r_c ∈ [0, 1]. Whether any given example is re-labelled is determined randomly, according to whether a random number drawn from a uniform distribution is less than r_c. In this manner, over large quantities of data, the proportion of re-labelled data approximates r_c. Note that if selected, an example’s label is necessarily changed, i.e., made incorrect.
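The corruption procedure can be sketched as follows (an illustrative reimplementation under the stated assumptions, not the authors' released code): each training label is independently selected for corruption with probability r_c, and a selected label is replaced by a label drawn uniformly from the other M − 1 classes, guaranteeing it becomes incorrect.

```python
import numpy as np

def corrupt_labels(labels: np.ndarray, corruption_rate: float,
                   num_classes: int = 10, seed: int = 0) -> np.ndarray:
    """Randomly re-label approximately corruption_rate of the examples.

    A selected example is always given a *different* label, drawn uniformly
    from the remaining num_classes - 1 classes.
    """
    rng = np.random.default_rng(seed)
    corrupted = labels.copy()

    # Select examples for re-labelling: uniform draw < corruption_rate.
    selected = rng.random(len(labels)) < corruption_rate

    # Add a random offset in [1, num_classes - 1] so the new label always differs.
    offsets = rng.integers(1, num_classes, size=selected.sum())
    corrupted[selected] = (labels[selected] + offsets) % num_classes
    return corrupted
```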
Throughout all experiments, model and training hyperparameters are set according to values specified in Table 1. To keep experiments simple, a minimalistic model based on an introductory example from PyTorch [13] capable of achieving > 99% accuracy on the MNIST dataset was selected. The model used is a CNN consisting of two convolutional layers, followed by max pooling, dropout, a fully connected layer, dropout, and a final fully connected layer. In total, the model has only 1.2 M trainable parameters.

Table 1: Hyperparameters used for experiments.

Parameter       Setting
Loss function   Cross-Entropy
Optimizer       SGD with momentum
Learning rate   0.01
Momentum        0.9
Batch-Size      64
Num. Epochs     12
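For concreteness, a sketch of a model and training setup matching this description is given below. It follows the structure of the PyTorch basic MNIST example [13] with the Table 1 settings; the exact layer widths are illustrative assumptions and may differ slightly from those used in the experiments, though they yield roughly 1.2 M trainable parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class SmallCNN(nn.Module):
    """Two conv layers, max pooling, dropout, FC, dropout, FC (~1.2 M parameters)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)   # 64 channels x 12 x 12 after pooling
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = self.dropout2(x)
        return self.fc2(x)  # raw logits, paired with CrossEntropyLoss

# Training configuration following Table 1.
model = SmallCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Batch size 64, 12 epochs; data loading and the training loop are omitted.
```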
Firstly, the model validation accuracy is examined as a function of training data corruption rate, shown in Figure 2. Observe that accuracy remains approximately steady and high (over 95%!) until a corruption rate of approximately r_c = 0.9, where an abrupt downward change occurs, before settling out once again. This finding is quite remarkable, given that the model continues to be accurate even when most training data are mislabelled! The abrupt change seems to correspond to the transition point at which, for any class, the number of labels indicating the correct class equals the number of labels for any other, incorrect, class. Before this point, the model still tends to learn the correct class-label association, and performs quite well. After this point, the model has overfit to the mislabelled data and performs poorly during testing. In terms of the number of classes within the dataset (M = 10 for MNIST), the relationship determining the location of this catastrophic change in model behaviour appears to be r_c' = 1 − 1/M.

Fig. 2: Model accuracy as a function of training data corruption rate. Accuracy remains remarkably high even when most training data are mislabelled! Until the corruption rate nears 0.9, model performance is hardly affected. Beyond this point, there are fewer labels of the correct class than of any other, incorrect, class, and accuracy plummets towards zero.

To verify the relationship between the location of the abrupt change and the number of classes, a similar experiment, in which the number of classes is artificially reduced, is conducted. Here, accuracy results for the original 10-class problem are shown alongside those of a 6-class and a 2-class problem. In each case, for the general M-class problem, examples from the first M classes of MNIST are retained, omitting the remainder. Figure 3 shows the results of this experiment, which indeed confirm that the point at which the catastrophic change occurs is related to the number of classes through r_c' = 1 − 1/M.

Fig. 3: Similar to the accuracy vs. corruption rate plot of Figure 2, model accuracy is evaluated for a 10-class, 6-class, and 2-class problem. For the 10-class problem, the abrupt change occurs at a corruption rate of approximately 0.9, whereas for the 6-class problem, the abrupt change occurs at only 0.83, and for the 2-class problem, already at 0.5. This location follows the trend specified by r_c' = 1 − 1/M, where M is the number of classes.
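This threshold follows from a simple counting argument (a brief derivation added here for clarity, assuming corrupted labels are spread uniformly over the other M − 1 classes, as in the corruption procedure above):

```latex
% A fraction (1 - r_c) of each true class's samples keep the correct label, while
% the corrupted fraction r_c is spread uniformly over the other M - 1 labels.
% The correct label stops being the most common one when
\[
  1 - r_c \;=\; \frac{r_c}{M - 1}
  \quad\Longrightarrow\quad
  r_c' \;=\; \frac{M - 1}{M} \;=\; 1 - \frac{1}{M}.
\]
% For M = 10, 6, 2 this gives r_c' = 0.90, 0.83, 0.50, matching Figure 3.
```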
To gain further insight, training and testing loss are examined in Figure 4. Notice that while both losses do increase with increasing corruption rate, the testing loss remains below the training loss until a cross-over point at r_c = 0.9, demonstrating for this range of corruption rates that the model performs better during testing than it does during training and is able to generalize quite well (i.e., not overfit) in spite of large quantities of mislabelled data. At r_c = 0.9, the cross-over point, for each true class there are approximately equal numbers of training samples labelled as each of the ten classes, and the model learns to randomly guess, thereby resulting in equal loss during training and testing. Beyond the cross-over point, fewer training samples of each given class are labelled correctly than as any other class, the model learns to not estimate the correct class (i.e., has overfit to mislabelled data), training loss plateaus, and testing loss spikes.

Fig. 4: Training and testing loss as a function of corruption rate. Observe the cross-over point at r_c = 0.9, whereby testing loss begins to exceed training loss. Training loss tends to plateau as false labels tend towards being fully uncorrelated and then anti-correlated with the data itself, i.e., random but not correct. Testing loss is initially below training loss, as the model is still able to partially learn the correct class-label relationships (given that most data are still correctly labelled); however, beyond the cross-over point, most data are not labelled correctly, the model learns to not estimate the correct class, and testing loss spikes.

4 Discussion

In Figure 2, accuracy tapers quite gradually for modest (i.e., realistic) corruption rates, for example 0.05 < r_c < 0.3. This insensitivity to corruption rate indicates that the model is able to generalize well, and may be a feature useful as a point of comparison between model types. Models for which accuracy decreases at a greater rate would have a greater tendency to overfit and would generalize more poorly than those models for which accuracy decreases more gradually.

Comparing loss with accuracy: in Figure 4, loss increases gradually with corruption rate, while in Figure 2 the accuracy is almost invariant to corruption rate until a point at which there is a catastrophic and large change. This behaviour in accuracy seems to contradict what is seen in the loss:

Why is it that loss changes by only a small amount (specifically surrounding the r_c = 0.9 point) while accuracy rapidly plummets from near 100% to near 0%?

This is a result of how inference is performed and how cross-entropy loss is defined. The model outputs (after running through SoftMax) a set of predicted class probabilities, {p̂_i}, i ∈ [1, 10]. The class with the highest predicted probability is selected as the inferred class for a given input, i.e.,

\[
  \text{predicted class} = \arg\max_i \hat{p}_i. \qquad (1)
\]

So long as the predicted probability for the correct class, p̂_c, is slightly higher than that of all others, p̂_i, i ≠ c, the network will infer the correct class. As corruption rates increase towards r_c = 0.9, fewer and fewer samples are correctly labelled, and the predicted class probabilities tend towards a uniform random distribution. Just prior to r_c = 0.9, the amount of correctly labelled data slightly exceeds the amount of incorrectly labelled data for each label, the predicted class probability for the correct class, p̂_c, generally slightly exceeds that of all others (just greater than 0.1), and the model still tends to classify testing data correctly. However, cross-entropy loss computes the negative natural log of the predicted correct class probability, p̂_c, averaged over all samples, indexed by n, in a batch of size N,

\[
  J_{\mathrm{CE}} = -\frac{1}{N} \sum_{n=1}^{N} \ln\left(\hat{p}_c\right). \qquad (2)
\]

Notice that −ln(0.1) ≈ 2.3026, almost exactly the loss seen at the cross-over point, at r_c = 0.9. The negative log of the predicted correct class probability, −ln(p̂_c), is smooth and does not exhibit a large change surrounding the point p̂_c = 0.1, whereas the highly non-linear class selection method of Equation (1), which simply selects the class with the highest predicted probability, abruptly changes as p̂_c decreases below 0.1, leading to a near-instantaneous loss in accuracy.
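This can be checked numerically with a small sketch (illustrative, not from the paper): as the predicted probability of the correct class drops slightly below the 1/M = 0.1 level, the cross-entropy loss barely moves, while the arg-max decision flips and accuracy collapses.

```python
import numpy as np

num_classes = 10

for p_correct in (0.12, 0.101, 0.099, 0.08):
    # Remaining probability mass spread evenly over the nine incorrect classes.
    p_other = (1.0 - p_correct) / (num_classes - 1)
    probs = np.full(num_classes, p_other)
    probs[0] = p_correct  # class 0 is the correct class

    predicted = int(np.argmax(probs))   # Equation (1): arg-max inference
    loss = -np.log(p_correct)           # Equation (2): per-sample cross-entropy

    print(f"p_correct={p_correct:.3f}  cross-entropy={loss:.4f}  "
          f"predicted class={predicted}  correct={predicted == 0}")
```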
While it is totally unrealistic to assume that models are being trained with data having error rates towards r_c = 0.9 in practice, the resulting observed trends in accuracy vs. corruption rate do reveal a great deal about the robustness of a particular model to the presence of label errors. Robustness to mislabelled data indicates that a model is better able to generalize, and not overfit to mislabelled data. While only one model was explored in this study, this type of approach may be used to analyze and compare other prospective models for use in more complex classification problems in the real world, allowing a designer to discover which models or architectures are most susceptible to overfitting the dataset at hand, and to select the most suitable one.

5 Conclusion

This study investigated the impacts of the presence of label errors in training data on model accuracy and on training and testing loss. A simple CNN model was used, with data artificially corrupted in the MNIST dataset. Remarkably, the model continued to perform with high accuracy (over 95%) even when most training data was mislabelled! While cases of data with large error rates are highly unlikely in practice, similar investigations may be useful for machine learning engineers to learn more about which model architectures tend to generalize better and can be used to avoid overfitting to mislabelled data.

Much future work remains in the study of label errors and model overfitting. Investigations of

• more complicated models and classification problems (datasets),
• non-uniform error distributions (since label errors in real data are likely to exhibit some correlation), and
• constraints that may induce overfitting (e.g., limiting the amount of data)

will be performed in order to better understand the architectural features that make certain models more robust.

Acknowledgments

This research was enabled in part by support provided by Calcul Québec (calculquebec.ca) and the Digital Research Alliance of Canada (alliancecan.ca).

We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada (NSERC), NSERC-PGS D, and NSERC Discovery Grant, funding reference number RGPIN-2020-04490.

Cette recherche a été financée par le Conseil de recherches en sciences naturelles et en génie du Canada (CRSNG), CRSNG-ES D, et CRSNG Subvention à la Découverte, numéro de référence RGPIN-2020-04490.
References
[1] V. Nasteski, “An overview of the supervised machine learning methods,” Horizons. b, vol. 4, pp. 51–62, 2017.
[2] A. Mathew, P. Amudha, and S. Sivakumari, “Deep learning techniques: an overview,” Advanced Machine Learning Technologies and Applications: Proceedings of AMLTA 2020, pp. 599–608, 2021.
[3] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[4] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. [Online]. Available: http://www.deeplearningbook.org
[5] C. G. Northcutt, A. Athalye, and J. Mueller, “Pervasive label errors in test sets destabilize machine learning benchmarks,” NeurIPS 2021 Datasets and Benchmarks Track, 2021.
[6] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009, pp. 248–255.
[7] Z. Gharaee, Z. Gong, N. Pellegrino, I. Zarubiieva, J. B. Haurum, S. C. Lowe, J. T. McKeown, C. C. Ho, J. McLeod, Y.-Y. C. Wei et al., “A step towards worldwide biodiversity assessment: The BIOSCAN-1M Insect Dataset,” arXiv preprint arXiv:2307.10455, 2023.
[8] N. Pellegrino, Z. Gharaee, and P. Fieguth, “Machine learning challenges of biological factors in insect image data,” Journal of Computational Vision and Imaging Systems, vol. 8, no. 1, pp. 34–37, 2022.
[9] L. Deng, “The MNIST database of handwritten digit images for machine learning research,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, 2012.
[10] “BIOSCAN,” Jun 2022. [Online]. Available: https://ibol.org/programs/bioscan/
[11] C. Northcutt, L. Jiang, and I. Chuang, “Confident learning: Estimating uncertainty in dataset labels,” Journal of Artificial Intelligence Research, vol. 70, pp. 1373–1411, 2021.
[12] A. Krizhevsky, “Learning multiple layers of features from tiny images,” Tech. Rep., 2009.
[13] PyTorch, “Basic MNIST example,” Sep 2022. [Online]. Available: https://github.com/pytorch/examples/tree/main/mnist