Extracted Data

ID: Study ID; Title
General Context
1.1 Addressed Problem: What problems in testing MLSs are tackled? (Textual description)
1.2 Testing Levels: What are the testing levels addressed by the proposed ... (input, model, int...)
Abdessalem, ASE, 2016 Testing advanced driver assistance systems using mul Cost system
Abdessalem, ASE, 2018 Testing autonomous cars for feature interaction failu Integration integration
Abdessalem, ICSE, 2018 Testing vision-based control systems using learnable Cost system
Abeysirigoonawardena, ICRA, 2019 Generating Adversarial Driving Scenarios in High-FideCost system
Aniculaesei, SEFAIS, 2018 Towards a holistic software systems engineering ap Online Monitorinsystem
Beglerovic, ITSC, 2017 Testing of autonomous vehicles using surrogate model Cost system
Bolte, IV, 2019 Towards Corner Case Detection for Autonomous DrivOnline Monitorininput
Bühler, SAE, 2004 Automatic Testing of an Autonomous Parking SystemCost system
Byun, AITest, 2019 Input Prioritization for Testing Neural Networks Regression input
Xie, arXiv, 2018 METTLE: A Metamorphic Testing Approach To ValidaOracle model
Cheng, arXiv, 2018 Quantitative Projection Coverage for Testing ML-en Oracle system
Cheng, QRS, 2018 Manifesting bugs in machine learning code: An explora Faults and Debumodel
Ding, MET, 2017 Validating a Deep Learning Framework by MetamorphOracle model
Du, FSE, 2019 DeepStellar: model-based quantitative analysis of staCost model
Dwarakanath, ISSTA, 2018 Identifying implementation bugs in machine learning Faults and Debumodel
Eniser, FASE, 2019 DeepFault: Fault Localization for Deep Neural NetworFaults and Debumodel
Fremont, PLDI, 2019 Scenic: Language-Based Scene Generation Scenario specifi system
Gopinath, arXiv, 2018 Symbolic Execution for Deep Neural Networks Oracle model
Groce, TSE, 2014 You are the only possible oracle: Effective test selec Regression model
Guo, FSE, 2018 DLFuzz: Differential fuzzing testing of deep learning Adequacy Criterimodel
Henriksson, arXiv, 2019 Towards Structured Evaluation of Deep Neural NetwoOnline Monitorininput
Kim, ICSE, 2019 Guiding Deep Learning System Testing using Surpri Adequacy Criterimodel
Klueck, ISSREW, 2018 Using Ontologies for Test Suites Generation for Aut Scenario specifi system
Li, ISSREW, 2018 TensorFI: A Configurable Fault Injector for TensorFlo Faults and Debumodel
Li, IVtran, 2016 Intelligence testing for autonomous vehicles: a new Realism system
Ma, ASE, 2018 DeepGauge: Multi-granularity testing criteria for deepAdequacy Criterimodel
Context
1.3 Domain: What domains can the considered MLSs be applied ... (e.g. Domain-agnostic, Self-Driving Cars, Robot Navigation, Image Classification, Chatbots, Financial Operator, unmanned aerial vehicles, unmanne...)
1.4 ML Algorithm: To which kind of Machine Learning algorithm can the testing solutions be applied? (e.g. Convolutional Neural Net...)
2.1 Test artefacts: What are the generated test artefacts? (Test inputs, oracle, etc.)
2.2 Test Adequacy Criterion: Which test adequacy criteria have been adopted or proposed? (Coverage, combinatorial, diversity, et...)
autonomous sysalgorithm agnostic Test Inputs N/A
autonomous sysalgorithm agnostic Test Inputs N/A
autonomous sysalgorithm agnostic Test Inputs N/A
autonomous sysML-enabled autonomous vehTest Inputs N/A
domain-agnosticML-enabled autonomous vehN/A N/A
autonomous sysalgorithm agnostic Test Inputs N/A
autonomous sysML-enabled autonomous vehN/A N/A
autonomous sysalgorithm agnostic Test Inputs N/A
domain-agnosticNN Prioritized Tests Confidence (based on softmax output)
domain-agnosticML -> clustering algorithms Oracle (Metamorphic) K-projection coverage (proposed in
domain-agnosticalgorithm agnostic N/A Basic State Coverage; k-step State
domain-agnosticalgorithm agnostic Analysis of Mutants N/A
classifiers (imag NN Oracle (Metamorphic) N/A
domain-agnosticNN -> RNN Test Inputs 5 coverage criteria of the abstract st
classifiers (imag ML -> classifiers -> image claOracle (Metamorphic) N/A
domain-agnosticNN Test Inputs Neuron Coverage (based on spectrum
autonomous sysML-enabled autonomous vehTest Inputs N/A
classifiers (imag NN Pixels in the Image Sorted by ImportaN/A
classifiers (generML -> classifiers -> generic Prioritized Tests Diversity
domain-agnosticNN Test Inputs Neuron Coverage
domain-agnosticalgorithm agnostic N/A N/A
domain-agnosticNN N/A Surprise Adequacy
autonomous sysML-enabled autonomous vehTest Inputs Combinatorial
domain-agnosticNN N/A N/A
autonomous sysML-enabled autonomous vehTest Inputs Scenario Space Coverage
domain-agnosticNN N/A Neuron Coverage; k-multisection Ne
Proposed Approach
2.3 Test Input Generation Approach: What approaches are adopted to generate test input data? (Random, search-based, manual, advers...)
2.4 Test Oracle: Which test oracles are used? (Misclassification, mutation killing, m...)
2.5 Access type: What is the access to the tested component? (Black Box, White ...)
2.6 Context Modelled; Context Model D...: Is the context of the MLS modelled? (Yes, No; Description of th...)
2.7 Availability of ...: Are the proposed solutions available? (No, Yes and open...)
Search-Based (NSGA and NSGAII)Failure (Domain-Specific) black box Yes range visible for No
Search-Based Failure (Domain-Specific) white box Yes cars, pedestrian No
Search-Based (with Decision Tree)Failure (Domain-Specific) black box Yes very detailed (to No
Adversarial (using Bayesian OptimisFailure (Domain-Specific) black box Yes The behaviour/poNo
N/A Metamorphic black box No N/A No
Search-Based (Metaheuristic algoriFailure (Domain-Specific) black box Yes Vector of paramet No
N/A N/A black box Yes Image is segment No
Search-Based Failure (Domain-Specific) black box Yes - parking space geometry, i.e. set of points tha ...; - vehicle state (position, orientation and shape No
N/A N/A white box No N/A Yes and Open S
Input Mutation (Metamorphic) Metamorphic black box No N/A No
Linear Programming N/A data box Yes Categories (i.e., No
Adversarial Mutation Killing white box (we neNo N/A No
Input Mutation (Metamorphic) Metamorphic data box No N/A No
Input Mutation (Metamorphic) Metamorphic white box (finite No N/A No
Input Mutation (Metamorphic) Metamorphic data box No N/A No
Search-Based (based on suspiciou N/A white box No N/A Yes and Open So
Random (Adaptive) Misclassification black box Yes parameterized s Yes and Open S
N/A Misclassification white box (need No N/A No
N/A N/A data box (distancNo N/A No
Input Mutation (with optimization al Metamorphic white box (neuroNo N/A Yes and Open S
N/A N/A data box No N/A No
Adversarial (with known methods) N/A white box No N/A Yes and Open S
Combinatorial N/A black box Yes domain ontology No
N/A Accuracy Drop white box (acces No N/A Yes and Open S
Random (Adaptive) Failure (Domain-Specific) black box Yes virtual reality No
N/A N/A white box (neuroNo N/A No
2.7 Link to the prop...: Are the proposed solutions available? (link)
3.1 Evaluation Type: What type of research has been conducted? (no evaluation, ac...)
3.2 Evaluation Met...: What kind of research method has been adopted? (experiment, exper...)
3.3 ML Model: Which ML models have been considered? (Model name, e.g. ...)
3.4 ML Models Avail...: Are the considered ML models available? (Yes, No)
3.5 Pretrained mode...: Are the ML models already trained? (Yes/No)
3.6 Dataset: What datasets have been used to train the ML models? (Dataset Name, e.g...)
3.7 System: Which systems have been considered? (System Name)
N/A academic and indexperiment and N/A N/A N/A N/A ADAS
N/A academic and indexperiment and commercial onesNo No N/A ADAS (SafeDrive)
N/A academic and indexperiment and N/A N/A N/A N/A ADAS
N/A academic experiment N/A N/A N/A N/A ADAS
N/A no evaluation N/A N/A N/A N/A N/A N/A
N/A academic experiment N/A N/A N/A N/A ADAS
N/A academic proof of concept PredNet (Deep CoYes No Cityscapes N/A
N/A industrial experiment automated parkinN/A N/A N/A ADAS
http://www.githubindustrial experiment 4 models for imagYes for image cla No MNIST, EMNIST fo N/A
N/A academic experiment WEKA 3.6.6: k-mea Yes No N/A N/A
N/A academic proof of concept N/A N/A N/A N/A ADAS
N/A academic experiment Weka Naive Bayes No No UCI ML repositorN/A
N/A academic experiment AlexNet, VGG, GoYes No Custom (MedicalN/A
N/A academic experiment Mozilla DeepSpeeYes Yes Fisher, LibriSpeeN/A
N/A academic experiment SVM, ResNet (offiYes No CIFAR-10, SVHN, N/A
K
https://deepfault.academic experiment 3 pretrained modYes Yes MNIST, CIFAR-10N/A
https://github.c academic proof of concept SqueezeDet: DNNYes f No Custom (Images)ADAS (perception
N/A academic experiment "fully connected No No MNIST N/A
N/A academic experiment and Naive Bayes, SVMN/A No "20 Newsgroups",N/A
https://github.c academic experiment 6 pretrained modYes Yes MNIST, ImageNetN/A
N/A academic experiment N/A N/A N/A N/A N/A
https://github.co academic experiment Convolutional Neural Network; Convolutional Neural Network + LSTM N/A Yes MNIST, CIFAR-10, ... N/A
N/A academic proof of concept N/A N/A N/A N/A N/A
Dave-2, Chauffeur
https://github academic experiment kNN, LR (LogisticNo No UCI Machine Learn N/A
N/A academic proof of concept N/A N/A N/A N/A ADAS
N/A academic experiment 5 pretrained modYes Yes MNIST, ImageNetN/A
Evaluation
3.8 Systems Availabi...: Are the systems available? (Yes, No)
3.9 Simulator: Which simulators have been used? (Simulator Name)
3.10 Failure: What type of failures have been considered? (Failure Type)
3.11 Metric: What metrics have been adopted? (Metric name)
3.12 Presence of a comparative study: Are there comparative studies? (Yes, No; Compared Techniq...)
3.13 Experimental data: Is the experimental data available? (Yes, No)
No PreScan (DS:AV) compound number of detected failures and o No N/A No
No PreScan (DS:AV) feature inumber of detected failures and o No N/A Yes
No PreScan (DS:AV:C) collisi (1) various metrics for the pareto No N/A No
No CARLA (DS:AV:C) collisi number & magnitude of collisions Yes random search; croNo
No N/A N/A N/A No N/A No
No Matlab/Simulink(DS:AV:C) collisi Number of collisions found No N/A No
No N/A N/A MSE (Mean Squared Error), SSIM (S No N/A No
No Matlab (DS:AV:C) collisi surface defined by fitness values; surface defined by log(abs(fitness value)) Yes Internal comparison No
No X-plane 11 (M) misclassifica - validation accuracy; - test accuracy [Note: for classification problem, accuracy is the percentage of correct classification, for regression is mea ...; - Efficacy: Variant of Average Percentage of Fault Detected (APFD) [metric commonly used for measuring ... described as the ratio of the technique to the area under the curve of the ideal prioritization criterion Yes - Compares the 3 considered prioritization techniqu ...; - compares against random selection for retraining Yes
No N/A (MET) violation re... clustering percentage: measure of ... No N/A No
Yes CARLA N/A k-projection coverage No N/A No
No N/A (MK) mutation kil mutation score Yes metamorphic relatio No
No N/A (M) misclassificaclassification accuracy Yes AlexNet, VGG, GoogNo
No N/A (DS:OT) Word/Cha 1. Jaccard index between the state-based coverage of two inputs (original and mutant) to measure the cov ...; 2. Word Error Rate against the Jaccard Index to measure correlation between erroneous behaviors of RNN an ...; 3. increase of bsc coverage over time to measure effectiveness of the approach; 4. average Word Error Rate to measure defect detection No N/A No
No N/A N/A mutation score of the metamorphic No N/A No
No N/A (M) misclassifica Loss/Accuracy, distance metrics (... No N/A Yes
No GTAV (M) misclassifica- precision and recall of object det Yes - state of the art s No
No N/A N/A importance score for pixels No N/A No
No N/A (M) misclassificanumber of failures (misclassificatiYes Random, Canonical,No
No N/A (M) misclassificaneuron coverage; L2 distance; numYes DeepXplore No
No N/A N/A N/A No N/A Yes
No N/A (M) misclassificaAUROC, Neuron Coverage (NC), k-M Yes DeepXplore, DeepTYes
No N/A N/A N/A No N/A No
No N/A (AC) accuracy dr accuracy drop No N/A No
No custom (DS:AV:C) collisi score depending on jerk and the d No N/A No
No N/A (M) misclassifica(1) k-multisection neuron coverage Yes DeepXplore (neuron No
Research Trends
3.14 Time Budget: What time budget has been used? (Time Budget Len...)
3.15 Software Setup: What is the software setup? (Software setup d...)
3.16 Hardware Setup: What is the hardware setup? (Hardware setup)
4.1 Year of Publ...: What is the number of published articles per year? (Year)
4.2 Venue Type; Venue Name: In which fora is research on MLSs testing published? (journal, conference, workshop, preprint; acronym)
4.3 List of Authors: Which are the most active researchers in this area? (Authors' Name)
4.4 Organisations: Which are the author affiliations? (Organisations' n...)
N/A N/A N/A 2016 c ASE Abdessalem, R.; Nejati University of L
12h N/A N/A 2018 c ASE Abdessalem, R.; Panich University of L
24h N/A N/A 2018 c ICSE Abdessalem, R.; Nejati University of L
N/A N/A N/A 2019 c ICRA Abeysirigoonawardena McGill University
N/A N/A N/A 2018 c SEFAIS Aniculaesei, A.; GrieTechnische Unive
N/A Matlab/SimulinkN/A 2017 c ITSC Beglerovic, H. ; Stol AVL List GmbH (A
N/A N/A Nvidia GeForce G 2019 c IV Symposium Bolte, J.-A.; Bär, A.; Technische Univ
N/A Matlab (GEA toolbN/A 2004 j SAE Technical PapersBühler, O.; Wegener,STZ Softwaretec
N/A Keras, Ubuntu 16Intel i5 CPU, 32 2019 c AITest Byun, T.; Sharma, V.University of Mi
N/A Windows 10 X64,Intel(R) Core(TM 2018 a arXiv (cs.SE) Xie, X.; Zhang, Z.; CWuhan University
N/A N/A N/A 2018 c ATVA Cheng, C.-H. ; Huanfortiss (Germany
N/A N/A N/A 2018 c QRS Cheng, D.; Cao, C.; Nanjing Universi
N/A Caffe GPU workstation 2017 w MET Ding, J. ; Kang, X. ; East Carolina Un
12h N/A N/A 2018 c ESEC/FSE Xiaoning, Du ; Xiaofei Nanyang Technolo
N/A N/A N/A 2018 c ISSTA Dwarakanath, A.; Ahu Accenture Techn
N/A Keras (v2.2.2) wiUbuntu server wi 2019 c FASE Eniser, H.; GerasimoBogazici Universi
N/A N/A Nvidia Titan Xp 2019 c PLDI Fremont, D.; Yue, X.;University of Cal
N/A N/A N/A 2018 a arXiv (cs.SE) Gopinath, D. ; Wang,Carnegie Mellon U
N/A Mallet framewor N/A 2014 j TSE Groce, A.; Kulesza, Oregon State Uni
N/A N/A 4 cores (Intel i 2018 c ESEC/FSE Guo J.; Jiang Y.; ZhTsinghua Univers
N/A N/A N/A 2019 c AITest Henriksson, J.; BergSemcon AB (Swed
N/A Keras Intel i7-8700 CPU 2019 c ICSE Kim, J.; Feldt, R.; Y Korea Advanced I
N/A N/A N/A 2018 w ISSREW Klueck, F.; Li, Y.; N Graz University
N/A N/A N/A 2018 w ISSREW Li, G.; PattabiramanUniversity of Bri
N/A N/A N/A 2016 j IV Transactions Li, L.; Huang, W.-L.;Tsinghua Univers
N/A Keras 2.1.3; TensComputer cluster 2018 c ASE Ma, L.; Juefei-Xu, F.Harbin Institute
Research Trends
4.5 Countries: Which countries have produced more studies? (Country Codes (I...)
4.6 Citation Count: Which are the most influential studies in terms of citation ... (citation count from Google Scholar)
LUX 39
LUX 10
LUX 32
CAN 2
DEU 3
AUS 7
DEU 1
DEU 45
USA 2
CHN; AUS; SGP 0
DEU; JPN 9
CHN 3
USA; CHN 17
CHN; JPN 19
IND 29
TUR; GBR 2
USA 6
USA 14
USA; GBR 24
CHN 18
SWE 0
KOR; SWE 28
AUT 5
CAN 3
CHN 68
CHN; USA; AUS; 73
Ma, ESEC/FSE, 2018 MODE: Automated neural network model debugging via s Faults and Debu model
Ma, ISSRE, 2018 DeepMutation: Mutation Testing of Deep Learning SyData Quality As model
Ma, SANER, 2019 DeepCT: Tomographic Combinatorial Testing for DeeAdequacy Criterimodel
Majumdar, arXiv, 2019 Paracosm: A Language and Tool for Testing AutonomScenario specifi system
Mullins, JSS, 2018 Adaptive generation of challenging scenarios for test Boundary Identifisystem
Murphy, -, 2008 Improving the Dependability of Machine Learning ApplAdequacy Criterimodel
Murphy, ISSTA, 2009 Automatic System Testing of Programs without Test Cost model
Murphy, RT, 2007 Parameterizing random test data according to equiva Adequacy Criterimodel
Murphy, SEKE, 2007 An approach to software testing of machine learning aOracle model
Murphy, SEKE, 2008 Properties of machine learning applications for use i Oracle model
Nakajima, APSEC, 2016 Dataset coverage for testing machine learning compuOracle model
Nakajima, SEFM-W, 2017 Generalized oracle for testing machine learning com Oracle model
Nakajima, SOFL MSVL, 2018 Dataset Diversity for Metamorphic Testing of Machin Oracle model
Neves, IJES, 2016 Combination and mutation strategies to support test dRealism system
Odena, PMLR, 2019 TensorFuzz: Debugging Neural Networks with Cover Faults and Debumodel
Patel, IROS, 2018 Adversarial Learning-Based On-Line Anomaly MonitoOnline Monitorinmodel
Pei, SOSP, 2017 DeepXplore: Automated Whitebox Testing of Deep L Adequacy Criterimodel
Qin, QRS, 2018 SynEva: Evaluating ML programs by mirror program sOracle model
Rubaiyat, PRDC, 2018 Experimental Resilience Assessment of an Open-Sour Faults and Debusystem
Saha, arXiv, 2019 Fault Detection Effectiveness of Metamorphic Relatio Oracle model
Sekhon, ICSE-Nier, 2019 Towards Improved Testing For Deep Learning Adequacy Criterimodel
Shen, QRS-C, 2018 MuNN: Mutation Analysis of Neural Networks Adequacy Criterimodel
Shi, arXiv, 2019 DeepGini: Prioritizing Massive Tests to Reduce LabelRegression model
Spieker, arXiv, 2019 Towards Testing of Deep Learning Systems with TraiCost model
Strickland, ICRA, 2018 Deep Predictive Models for Collision Risk AssessmenOnline Monitorinmodel
Sun, arXiv, 2018 Testing Deep Neural Networks Adequacy Criterimodel
Sun, ASE, 2018 Concolic testing for deep neural networks Adequacy Criterimodel
Tian, ICSE, 2018 DeepTest: Automated Testing of Deep-Neural-Netwo Realism model
Tuncali, arXiv, 2019,#1 Simulation-based Adversarial Test Generation for A Cost system
Tuncali, arXiv, 2019,#2 Rapidly-exploring Random Trees-based Test GeneratBoundary Identifisystem
Udeshi, arXiv, 2019 Grammar Based Directed Testing of Machine Learni Realism model
Udeshi, ASE, 2018 Automated directed fairness testing Data Quality As model
domain-agnosticNN Test Inputs N/A
domain-agnosticNN Mutation Operators N/A
domain-agnosticNN Test Inputs Neuron Coverage (Combinatorial t-w
autonomous sysML-enabled autonomous vehTest Inputs N/A
autonomous sysML-enabled autonomous vehN/A N/A
domain-agnostic ML -> anomaly detector; ML -> ... Training Data N/A
domain-agnosticML -> classifiers -> generic; Oracle (Metamorphic) N/A
domain-agnosticML -> ranking algorithm Test Inputs Random
domain-agnosticML -> classifiers -> image cl Test Inputs N/A
domain-agnosticML -> ranking algorithm Oracle (Metamorphic) N/A
domain-agnosticML -> classifiers -> generic Oracle (Metamorphic) N/A
domain-agnosticML -> classifiers -> generic Oracle (Metamorphic) N/A
domain-agnosticalgorithm agnostic Oracle (Metamorphic) Diversity
autonomous sysML-enabled autonomous vehTest Inputs N/A
domain-agnosticNN Test Inputs N/A
autonomous sysML-enabled autonomous vehN/A N/A
domain-agnosticNN Test Inputs Neuron Coverage
classifiers (generML -> clustering algorithms Oracle (Differential, created automatN/A
autonomous sysML-enabled autonomous vehN/A N/A
domain-agnosticML -> classifiers -> generic Oracle (Metamorphic) N/A
domain-agnosticNN Test Inputs Neuron Coverage (2-way)
domain-agnosticNN Mutation Operators N/A
domain-agnosticalgorithm agnostic N/A NAC, NBC, SNAC, TKNC, KMNC
classifiers (imag NN Training Data N/A
autonomous sysML-enabled autonomous vehN/A N/A
domain-agnosticNN Test Inputs Sign-Sign Cover, Distance-Sign Cover
domain-agnosticNN Test Inputs Lipschitz continuity; Neuron cover
domain-agnosticNN Test Inputs Neuron Coverage
autonomous sysML-enabled autonomous vehTest Inputs Combinatorial
autonomous sysML-enabled autonomous vehTest Inputs Scenario Space Coverage
domain-agnosticML -> classifiers -> generic Test Inputs N/A
classifiers (generML -> classifiers -> generic Test Inputs N/A
GAN Misclassification white box No N/A No
N/A Mutation Killing white box (we neNo N/A No
Constraint Solving (Search, SAT) Misclassification (of newly generated white box No N/A No
N/A N/A black box No N/A No
Search-Based (Driven by Adaptive N/A black box Yes The model and th No
Manual; Random; Input Mutation ( Metamorphic white box No N/A No
N/A Metamorphic black box No N/A No
Random (with parameters) Differential data box (they p No N/A No
Manual Differential white box No N/A No
Input Mutation (Metamorphic) Metamorphic black box No N/A No
Input Mutation (Metamorphic) Metamorphic white box No N/A No
N/A Metamorphic black box No N/A No
Input Mutation (Metamorphic) Metamorphic white box (weighNo N/A No
Input Mutation (Mutation and recomb N/A black box Yes Time sequence of No
Input Mutation Failure (Domain-Specific) white box No N/A Yes and Open S
N/A N/A black box No The context is r No
Search-Based (Gradient Based) Differential white box No N/A Yes and Open S
Input Mutation Differential (a mirror-like program c black box No N/A No
N/A Failure (Domain-Specific) black box No N/A No
Input Mutation (Metamorphic) Metamorphic data box No N/A No
Search-Based (guided by joint optim Differential white box No N/A No
N/A Mutation Killing white box (need No N/A No
N/A Misclassification (of newly generated white box No N/A Yes and Open So
N/A N/A data box (trainin No N/A No
N/A N/A white box No The context is r No
Linear Programming Metamorphic (according to L-inf no white box No N/A Yes and Open So
Concolic N/A white box No N/A Yes and Open So
Input Mutation (Image TransformatiMetamorphic white box No N/A Yes and Open So
Random; Combinatorial; Search-B Metamorphic (Outputs at the Frontiblack box Yes parameterized scNo
Search-Based (Based on Rapidly-eMetamorphic (Outputs at the Frontiblack box Yes parameterized scNo
Search-Based (Grammar Based) Differential black box No N/A Yes and Open So
Search-Based Metamorphic data box No N/A Yes and Open So
N/A academic experiment MNIST (LeNet-1, Yes Yes MNIST, FASHION-N/A
N/A academic experiment LeNet5,28,39 Yes No MNIST, CIFAR-10N/A
N/A academic experiment 2 pretrained mode N/A Yes MNIST N/A
N/A academic proof of concept https://github.com/naokishibuya/car-behavioral-cloning Yes No KITTI, Pascal; GT ADAS
N/A academic experiment N/A N/A N/A N/A unmanned underw
N/A academic and indproof of concept N/A N/A N/A Custom (device f Device Failure Pr
N/A academic experiment SVM, C4.5(algoritNo No IRIS Flower DataN/A
N/A academic experiment SVM-Light, MartiNo No N/A N/A
N/A academic and indexperiment N/A N/A N/A Custom (device faN/A
N/A academic experiment MartiRank(rankinNo No N/A N/A
N/A academic proof of concept N/A N/A N/A N/A N/A
N/A no evaluation N/A - N/A N/A N/A N/A
N/A academic experiment N/A N/A N/A MNIST N/A
N/A academic experiment N/A N/A N/A Custom (Autonomo Autonomous Vehi
https://github.c academic experiment custom Yes no MNIST N/A
N/A academic experiment PredNet Yes Yes Udacity, Custom N/A
https://github.c academic experiment MNIST (LeNet-1, L Yes Yes/No (both) MNIST, ImageNet,N/A
N/A academic experiment K-means clusterinNo No IRIS Flower DataN/A
N/A academic experiment N/A N/A N/A N/A ADAS
N/A academic experiment kNN No No N/A N/A
N/A academic proof of concept LeNet-1, LeNet-4Yes Yes MNIST N/A
N/A academic experiment fully connected nNo No MNIST N/A
https://github.c academic experiment LeNet-1, LeNet-4Yes Yes (MNIST & CI MNIST, CIFAR-10N/A
N/A academic experiment Conv Neural NetwYes No CIFAR-10 N/A
N/A academic experiment N/A N/A N/A Custom (Autonom Rule-based agent
https://github.c academic experiment N/A N/A N/A MNIST N/A
https://github.c academic experiment N/A N/A N/A MNIST, CIFAR-10N/A
https://deeplearnacademic experiment Chauffeur, RamboYes Yes dataset HMB_3.bag N/A
N/A academic proof of concept SqueezeDet: DNNYes f Yes KITTI ADAS (collision
N/A academic proof of concept N/A N/A N/A N/A ADAS (rule-based
https://github.c academic experiment - 3 real world NLP classifiers: Rosette, uClassify, Aylien Yes Yes N/A N/A
https://github.c academic experiment -52standard
classifiers from scikit-learn
implementations
Yes implementations
of classifiers
No of a regularized
+ 1 classifier
datadesigned linear
from US to
CenN/A model
provide with stochastic gradient
fairness

- standard implementations in scikit-learn (SVM , MLPC, Random Forest, Decision Tree, Ensemble of R
No N/A (M) (DS:AV:D) mis accuracy Yes 14 GAN implementNo
No N/A (MK) mutation kilmutation score & average error raNo N/A No
No N/A (M) misclassificacombinatorial coverage; percentagYes random (random pert No
Yes Udacity (DS:AV:C) collisi variety of closed loop evaluation No N/A No
No ATK (DS:AV) compound: precision (percentage of samples Yes Sobol and Lola No
adap
No N/A (INEC) unexpectedetected issues in the algorithms aNo N/A No
No N/A (MET) violation Number of injected faults exposedNo N/A No
No N/A (INEC) inconsisteinconsistency according to oracle No N/A No
No N/A (INEC) crash; randetected failures (not as a numberNo N/A No
No N/A (MET) violation inconsistency according to metamo No N/A No
No N/A N/A N/A No N/A No
No N/A N/A N/A No N/A No
No N/A N/A squared distance of mean and varia No N/A No
No ROS N/A coverage of nodes/edges in the pr No N/A No
No N/A (INEC) NaN & disnumber of detected failures No N/A No
No N/A (DS:AV:D) deviatidistance from a threshold No N/A No
No N/A (M) (DS:AV:D) mis neuron coverage No N/A Yes
No N/A (M) misclassificasimilarity measurement No N/A No
Yes openpilot (DS:AV:C) collisi occurred failures, occurred hazardous No N/A No
No N/A (MET) violation number of injected faults exposed Yes mutation killing sc No
No N/A (M) misclassifica2-way neuron coverage Yes DeepXplore;DeepC No
No N/A (MK) mutation kilmutation score No N/A No
No N/A (M) misclassificaAPFD (average percentage of fault-Yes NAC, KMNC, NBC,Yes
No N/A N/A similarity between the averaged los Yes Distance-Based, No
Yes V_REP (UN) undetected prediction accuracy No N/A No
No N/A (M) misclassificaCoverage metrics proposed in the p Yes neuron coverage Yes
No N/A (M) misclassificacoverage Yes DeepXplore Yes
No N/A (DS:AV:D) deviatineuron coverage; Jaccard distanceNo N/A Yes
No WeBots (DS:AV:C) collisi mean robustness property values. Yes Internal comparison: it compares the 3 test input g ...: 1. random search; 2. covering arrays for discrete variables and rando ...; 3. covering arrays for discrete variables and simul No
No Matlab (DS:AV:C) collisi - min, mean, max values of the proposed cost function achieved by the considered approaches; - qualitative evaluation of the generated agent trajectories Yes optimization-guided No
No N/A (M) misclassifica - total number of erroneous inputs; - error ratio: of error inducing inputs to the total number of inputs generated; - improvement of error ratio of the proposed technique wrt random test generation; - improvement of classification accuracy after retraining (total number of erroneous inputs before and aft Yes Random test genera Yes
No N/A (FV) individual fa - effectiveness: number of discriminatory inputs generated wrt total generated inputs; - efficiency: time to generate 10k discriminatory inputs; - improvement in fairness after retraining: decrease of estimated ratio of discriminatory inputs (probabili Yes - Compares the 3 considered perturbation approa ...; - compares against random selection Yes
2 to 4 hours (smaN/A N/A 2018 c ESEC/FSE Ma S.; Liu Y.; Lee Purdue Universit
N/A GNU/Linux system high performance 2018 c ISSRE Ma, L.; Zhang, F.; Sun Harbin Institute
N/A N/A N/A 2019 c SANER Ma, L. ; Zhang, F. ; XHarbin Institute
N/A N/A N/A 2019 a arXiv (cs.SE) Majumdar, R. ; Mathur Max Planck Insti
1/2 to 1 h N/A N/A 2018 j JSS Mullins, G.E.; Stank University of Ma
N/A N/A N/A 2008 a CUCS (report) Murphy, C.; Kaiser, Columbia Univers
N/A Weka 3.5.8 quad-core 3GHz 2009 c ISSTA Murphy, C.; Shen, K.Columbia Univers
N/A N/A N/A 2007 w RT Murphy, C.; Kaiser, G Columbia Univers
N/A N/A N/A 2007 c SEKE Murphy, C.; Kaiser, G Columbia Univers
N/A N/A N/A 2008 c SEKE Murphy, C.; Kaiser, G Columbia Univers
N/A N/A N/A 2016 c APSEC Nakajima, S.; Bui, HVietnam National
N/A N/A N/A 2018 w FAACS SEFM-W Nakajima, S. National Institute
N/A N/A N/A 2018 w SOFL+MSVL Nakajima, S. National Institute
N/A N/A N/A 2016 j IJES Neves, V.; Delamaro, Universidade de
N/A TensorFlow N/A 2019 c PMLR Odena, A.; GoodfelloGoogle Brain
N/A N/A N/A 2018 c IROS Patel, N.; Saridena, New York Univer
N/A Python; TensorFloUbuntu 16.04 (on 2017 c SOSP Pei, K. ; Cao, Y.; YaColumbia Univers
N/A N/A commodity PC wi 2018 c QRS Qin Y.; Wang H.; XuNanjing Universi
N/A N/A N/A 2018 c PRDC Rubaiyat, A.; Qin, Y University of Virg
N/A Weka 3.5.7 N/A 2019 a arXiv (cs.SE) Saha, P.; Kanewala,Montana State Un
N/A N/A N/A 2019 w ICSE-Nier Sekhon, J.; Fleming,University of Virg
N/A N/A N/A 2018 w QRS-C Shen, W.; Wan, J.; Nanjing Universi
N/A Keras 2.1.3 with “Intel(R) Xeon(R 2019 a arXiv (cs.SE) Qingkai Shi; Jun W Nanjing Universi
N/A PyTorch Abel Cluster 2019 a arXiv (stat.ML) Spieker, H.; Gotlieb,Simula Research
N/A N/A N/A 2017 c ICRA Strickland, M.; Fain Arizona State Un
N/A N/A MacBook 2.5 GHz 2018 a arXiv (cs.LG) Sun, Y; Huang, X; KUniversity of Oxf
12h N/A 24 core Intel(R) 2018 c ASE Sun, Y.; Huang, X.; University of Oxf
N/A N/A N/A 2018 c ICSE Tian, Y. ; Pei, K. ; J University of Vir
N/A TensorFlow, openN/A 2018 c IV Symposium Tuncali, C. ; Fainekos Toyota Research
50h Matlab N/A 2019 a arXiv (cs.RO) Tuncali, C.; FainekoArizona State Un
N/A N/A N/A 2019 j TSE Udeshi, S.; Chattop Singapore Univer
N/A Ubuntu 16.04 Intel i7 process 2018 c ASE Udeshi, S.; Arora, PSingapore Univer
USA 11
CHN; SGP; JPN; 51
CHN; USA; AUS; 16
USA; CAN 0
USA 10
USA 20
USA 88
USA 19
USA 37
USA 107
JPN; VNM 15
JPN 3
JPN 3
BRA 1
USA 40
USA 1
USA 275
CHN 3
USA 8
USA 3
USA 3
CHN 8
CHN 4
NOR 1
USA 11
GBR 63
GBR 68
USA 242
USA 36
USA 0
SGP 1
SGP; IND 8
Uesato, NeurIPSW, 2018 Rigorous Agent Evaluation: An Adversarial Approach Online Monitorinmodel
Wang, ICSE, 2019 Adversarial Sample Detection for Deep Neural NetworOnline Monitorininput
Wolschke, CVT, 2018 Mining Test Inputs for Autonomous Vehicles Realism input
Wolschke, ISSREW, 2017 Observation based creation of minimal test suites fo Regression system
Xie, ISSTA, 2019 DeepHunter: Hunting Deep Neural Network Defects vAdequacy Criterimodel
Xie, JSS, 2011 Testing and validating machine learning classifiers b Oracle model
Zhang, arXiv, 2019 A Noise-Sensitivity-Analysis-Based Test Prioritizati Regression model
Zhang, ASE, 2018 DeepRoad: GAN-based metamorphic testing and inputOnline Monitorinmodel
Zhang, ISSSR, 2016 Improve the quality of ARC systems based on the metIntegration system
Zhang, ISSTA, 2018 An Empirical Study on TensorFlow Program Bugs Faults and Debumodel
Zhao, QRS-C, 2018 An AI Software Test Method Based on Scene DeductCost model
Zheng, arXiv, 2018 Testing Untestable Neural Machine Translation: An InOracle system
domain-agnosticNN N/A N/A
classifiers (generNN N/A N/A
autonomous sysML-enabled autonomous vehTest Inputs N/A
autonomous sysML-enabled autonomous vehTest Inputs Diversity (of covered behaviours)
domain-agnosticML -> clustering algorithms Test Inputs NAC, NBC, SNAC, TKNC, KMNC
domain-agnosticNN Oracle (Metamorphic) N/A
domain-agnosticML -> classifiers -> generic Prioritized Tests N/A
autonomous sysNN Test Inputs N/A
others (ARC) NN Oracle (Metamorphic) N/A
domain-agnosticCV Repository of Bugs N/A
domain-agnosticML-enabled autonomous vehN/A N/A
others (NMT) NMT Test Inputs N/A
N/A Failure (Domain-Specific) black box No Probability distri No
Adversarial (with known methods) N/A white box No N/A Yes and Open So
N/A N/A black box No N/A no
Manual (Extracted from Real Data Invariant (Domain-Specific) black box Yes predicates to desNo
Input Mutation (Metamorphic) Metamorphic white box (neuroNo N/A No
Input Mutation (Metamorphic) Metamorphic data box No N/A No
Adversarial Misclassification (of adversarial ex white box No N/A No
GAN Metamorphic data box (the tr No N/A No
N/A Metamorphic black box (the ARC contains of two interacting ML systems - segmentation and No N/A No
N/A N/A white box No N/A Yes and Open So
Manual building of mazes N/A black box Yes simple definitio No
Manual (crawled online, subsampled Human Oracle black box No N/A Yes and Closed
N/A academic experiment pretrained modelYes Yes N/A Rule-based agent
https://github.c academic experiment LeNet and Googl Yes Yes MNIST; CIFAR-10N/A
N/A no evaluation N/A N/A N/A N/A N/A N/A
N/A no evaluation N/A N/A N/A N/A N/A N/A
N/A academic experiment 7 pretrained modYes Yes (just for Ima MNIST, CIFAR-10N/A
N/A academic experiment kNN, Naive BayesNo No N/A N/A
N/A academic experiment Two networks trai No No MNIST, CIFAR-10 N/A
N/A academic experiment Autumn (CNN), CYes Yes real-world dataseN/A
N/A no evaluation N/A N/A N/A N/A N/A N/A
https://github. academic experiment N/A N/A N/A N/A N/A
N/A no evaluation N/A N/A N/A N/A N/A N/A
https://bit.ly/2P industrial experiment N/A N/A N/A Custom (Language WeChat
Yes TORCS/MuJoCo (DS:AV:C) collisi risk estimation rate No N/A No
No N/A N/A distance of LCR (label change rat No N/A No
No N/A N/A N/A No N/A No
No N/A N/A N/A No N/A No
No N/A (M) misclassificacoverage (neuron coverage; k-multNo N/A No
No N/A (MET) violation number of injected faults exposed ( Yes k-fold cross-validat No
No N/A (M) misclassificaratio of successful adversarial exa Yes Random input seleNo
No Udacity (DS:AV:D) deviatinumber of inconsistent behaviors o No N/A No
No N/A N/A N/A No N/A No
No N/A N/A N/A No N/A Yes
No N/A N/A N/A No N/A No
No N/A (DS:OT) under-tra Precision, recall, and F-measure Yes primitive dictionar No
N/A N/A N/A 2018 a arXiv (cs.LG) Uesato, J.; Kumar, ADeepMind
N/A N/A N/A 2019 c ICSE Wang, J.; Dong, G.; Singapore Univer
N/A N/A N/A 2018 j CVT Wolschke, C.; Romba Technische Unive
N/A N/A N/A 2017 w ISSREW Wolschke, C.; Kuhn,Technische Unive
24h Keras 2.1.3; TensLinux kernel 4.4 2019 c ISSTA Xie, X. ; Ma, L. ; JueNanyang Technolo
N/A Weka 3.5.7 N/A 2011 j JSS Xie, X.; Ho, J.W.K.; University of Sy
N/A N/A N/A 2019 a arXiv (cs.LG) Zhang, L.; Sun, X.; LUniversity of Ch
N/A Autumn (OpenCV,N/A 2018 c ASE Zhang M.; Zhang Y.;University of Te
N/A N/A N/A 2016 c ISSSR Zhang, J.; Jing X.; Northwestern Pol
N/A N/A N/A 2018 c ISSTA Zhang Y.; Chen Y.; Peking Universit
N/A N/A N/A 2018 w QRS-C Zhao X.; Gao X. The 27th Researc
N/A N/A N/A 2018 a arXiv (cs.CL) Zheng, W. ; Wang, W. University of Ill
GBR 1
SGP; CHN 17
DEU 0
DEU 5
CHN; USA; SGP 2
AUS; USA; CHN 140
CHN 1
USA; CHN 63
CHN 0
CHN 20
CHN 0
USA; CHE 3
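
Columns 4.5 (Countries) and 4.6 (Citation Count) above are what one would tally to answer "Which countries have produced more studies?". The following is a minimal sketch of such a tally, not part of the original spreadsheet. It assumes each row has the form "CODE; CODE N" as printed above; the names ROWS, parse_row and tally are hypothetical, and the sample rows are copied from the first Countries block. Crediting a study's full citation count to every listed country is an assumption made here for illustration only.

```python
from collections import Counter

# Sample rows copied verbatim from the Countries / Citation Count columns.
ROWS = [
    "LUX 39",
    "CAN 2",
    "CHN; AUS; SGP 0",
    "DEU; JPN 9",
    "USA; CHN 17",
]


def parse_row(row):
    """Split 'CODE; CODE N' into (list of country codes, citation count)."""
    codes_part, count = row.rsplit(" ", 1)
    # Drop empty pieces so a truncated cell such as 'CHN; USA; AUS;' still parses.
    codes = [c.strip() for c in codes_part.split(";") if c.strip()]
    return codes, int(count)


def tally(rows):
    """Count studies and sum citations per country code."""
    studies = Counter()
    citations = Counter()
    for row in rows:
        codes, count = parse_row(row)
        for code in codes:
            studies[code] += 1
            # Assumption: each co-authoring country gets the full count.
            citations[code] += count
    return studies, citations


studies, citations = tally(ROWS)
for code, n in studies.most_common():
    print(code, n, citations[code])
```

Splitting from the right with rsplit keeps the citation count separate even when a row lists several country codes, and filtering empty pieces tolerates cells cut off after a trailing semicolon.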
