Study of Discretization Methods in Classification

Abstract—Classification is one of the important tasks in Data Mining or Knowledge Discovery, with prolific applications. Satisfactory classification also depends on the characteristics of the dataset. Numerical and nominal attributes commonly occur in datasets, and classification performance may be aided by discretization of the numerical attributes. At present, several discretization methods and numerous techniques for implementing classifiers exist. This study has three main objectives. The first is to study the effectiveness of discretization of attributes, and the second is to compare the efficiency of eight discretization methods. These are ChiMerge, Chi2, Modified Chi2, Extended Chi2, Class-Attribute Interdependence Maximization (CAIM), Class-Attribute Contingency Coefficient (CACC), Autonomous Discretization Algorithm (Ameva), and Minimum Description Length Principle (MDLP). Finally, the study investigates the suitability of the eight discretization methods when applied to five commonly known classifiers: Neural Network, K Nearest Neighbour (K-NN), Naive Bayes, C4.5, and Support Vector Machine (SVM).

I. INTRODUCTION

… discretization methods. These are ChiMerge, Chi2, Modified Chi2, Extended Chi2, Class-Attribute Interdependence Maximization (CAIM), Class-Attribute Contingency Coefficient (CACC), Autonomous Discretization Algorithm (Ameva), and Minimum Description Length Principle (MDLP). The final objective is to investigate how suitably these eight discretization methods perform together with five commonly known classifiers: Neural Network, K Nearest Neighbour (K-NN), Naive Bayes, C4.5, and Support Vector Machine (SVM).

The paper is organized as follows. Section II introduces the eight discretization methods. Section III presents the related work; this is followed by Section IV, where the datasets and classifiers selected are described. Section V explains the major steps in the study. Results and discussion are included in Section VI. Finally, the paper is concluded in Section VII.
II. DISCRETIZATION METHODS

ChiMerge: This is a bottom-up discretization method which uses the χ² value to determine the merging point [4]. The discretization process of ChiMerge begins with sorting the numerical attribute and calculating the χ² value for every pair of adjacent intervals, as shown in equation 1. The pair of adjacent intervals with the lowest χ² value is merged into one interval. A χ²-threshold is used as the stopping criterion while the merging process is conducted.

    \chi^2 = \sum_{i=1}^{m} \sum_{j=1}^{k} \frac{(A_{ij} - E_{ij})^2}{E_{ij}}    (1)

where:
m = 2 (the two intervals being compared),
k = number of classes,
A_{ij} = number of examples in the ith interval and jth class,
E_{ij} = expected frequency of A_{ij} = R_i C_j / N,
R_i = number of examples in the ith interval = \sum_{j=1}^{k} A_{ij},
C_j = number of examples in the jth class = \sum_{i=1}^{m} A_{ij},
N = total number of examples = \sum_{j=1}^{k} C_j.
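As a concrete illustration of equation (1), the sketch below computes the χ² statistic for a pair of adjacent intervals from their per-class counts. It is a minimal sketch rather than the authors' code (the paper's experiments are run in R; Python is used here only for illustration), and the small guard for empty expected frequencies is an added safeguard, not something specified in the paper.

import numpy as np

def chi2_adjacent(counts):
    """Chi-square statistic of equation (1) for two adjacent intervals.

    counts: 2 x k array, counts[i][j] = A_ij, the number of examples
    of class j falling in interval i (m = 2, k = number of classes).
    """
    A = np.asarray(counts, dtype=float)      # 2 x k matrix of A_ij
    R = A.sum(axis=1, keepdims=True)         # R_i: examples per interval
    C = A.sum(axis=0, keepdims=True)         # C_j: examples per class
    N = A.sum()                              # N: total number of examples
    E = R * C / N                            # E_ij = R_i * C_j / N
    E[E == 0] = 1e-6                         # guard against empty expected cells (assumption)
    return float(((A - E) ** 2 / E).sum())

# Example: two adjacent intervals with counts over three classes.
print(chi2_adjacent([[4, 1, 0],
                     [2, 3, 1]]))            # a low value marks a candidate pair for merging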
Chi2: This is a bottom-up discretization method which also uses the χ² value to determine the merging point [5]. The process of Chi2 is similar to ChiMerge; thus, the χ² value is also calculated by (1). Instead of having the user specify the χ²-threshold as in ChiMerge, Chi2 introduces an inconsistency rate, which is used in the process to determine the proper χ²-threshold. Discretization in Chi2 also includes feature selection, which makes it perform well when the dataset contains noisy attributes.

Modified Chi2: This is a bottom-up discretization method which stems from Chi2; the χ² value is also used to determine the merging point [6]. Instead of using the inconsistency rate as a stopping criterion as in Chi2, the level of consistency (L_c), coined from Rough Sets Theory, is used, as shown in equation 2. Therefore, the inconsistency check (InConCheck(data) < δ) is replaced by (L_c-discretized ≤ L_c-original) after each step of discretization.

    L_c = \frac{card(\underline{B}X_i)}{card(U)}    (2)

where:
U is the set of all objects of the dataset,
X_i can be any subset of U.
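Read literally, equation (2) can be sketched as follows: given the equivalence classes induced by the condition attributes B and a subset X of U, the lower approximation B̲X is the union of the equivalence classes entirely contained in X, and L_c is its size relative to card(U). This is an illustrative sketch only, not code from [6]; the data layout (sets of object identifiers) is an assumption made for the example.

def level_of_consistency(equiv_classes, X, U):
    """L_c of equation (2) for one subset X of U.

    equiv_classes: the B-equivalence classes partitioning U (sets of object ids).
    X, U: sets of object identifiers.
    """
    X, U = set(X), set(U)
    lower = set()                      # lower approximation of X
    for E in equiv_classes:
        if set(E) <= X:                # keep only classes entirely contained in X
            lower |= set(E)
    return len(lower) / len(U)

# Example: U = {0,...,5}, B-equivalence classes {0,1}, {2,3}, {4,5}.
U = {0, 1, 2, 3, 4, 5}
classes = [{0, 1}, {2, 3}, {4, 5}]
print(level_of_consistency(classes, X={0, 1, 2}, U=U))   # only {0,1} fits inside X -> 2/6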
Extended Chi2: This is a bottom-up discretization method which extends Modified Chi2 [7]. Its merging criterion ξ is shown in equation 3.

    \xi_{C,D} = \max\{m_1, m_2\}    (3)

where:
m_1 = 1 - \min\{c(E,D) \mid E \in C^* \text{ and } 0.5 < c(E,D)\},
m_2 = \max\{c(E,D) \mid E \in C^* \text{ and } c(E,D) < 0.5\},
c(E,D) = 1 - \frac{card(E \cap D)}{card(E)},
C is the equivalence relation set,
D is the decision set,
C^* = \{E_1, E_2, \ldots, E_n\} is the set of equivalence classes.

CAIM: Class-Attribute Interdependence Maximization (CAIM) is a top-down discretization method proposed by Kurgan and Cios [8]. The main goal of CAIM is to maximize the class-attribute interdependence and to generate the minimal number of discrete intervals. The CAIM criterion is proposed as a heuristic measure used to quantify the interdependence between classes and discrete intervals; a larger CAIM value indicates higher interdependence. The CAIM method begins the discretizing process by selecting a numerical attribute and sorting it in ascending order; the CAIM value is then computed using equation 4 for the midpoint of every pair of adjacent values, and the cutting point is defined by selecting the midpoint that has the highest CAIM value.

    CAIM(C,D|F) = \frac{1}{n} \sum_{r=1}^{n} \frac{max_r^2}{M_{+r}}    (4)

where:
n is the number of intervals,
r iterates through all intervals,
max_r is the maximum value within the rth column of the quanta matrix,
M_{+r} is the total number of continuous values of attribute F that fall within the rth interval,
C is the class variable,
D is the discretization variable,
F is the attribute.
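As an illustration of equation (4), the sketch below evaluates the CAIM criterion on a quanta matrix with classes in rows and candidate intervals in columns. It is an illustrative sketch, not the authors' implementation, and it assumes every candidate interval contains at least one value.

import numpy as np

def caim(quanta):
    """CAIM criterion of equation (4).

    quanta: S x n array, quanta[i][r] = number of values of class i
    that fall in interval r of the candidate discretization.
    """
    Q = np.asarray(quanta, dtype=float)
    n = Q.shape[1]                      # number of intervals
    max_r = Q.max(axis=0)               # dominant class count per interval
    M_plus_r = Q.sum(axis=0)            # total count per interval (assumed non-zero)
    return float((max_r ** 2 / M_plus_r).sum() / n)

# Two candidate discretizations of the same attribute: the purer one scores higher.
print(caim([[8, 1],
            [0, 7]]))    # well-separated intervals -> larger CAIM
print(caim([[4, 5],
            [4, 3]]))    # mixed intervals -> smaller CAIM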
51
Authorized licensed use limited to: Bauman Moscow State Technical University. Downloaded on September 27,2023 at [Link] UTC from IEEE Xplore. Restrictions apply.
CACC: Class-Attribute Contingency Coefficient (CACC) is a top-down discretization method [9]. In CACC, the contingency coefficient is used to determine the cutting point in the discretizing process. To speed up the discretization and to reduce the number of intervals, the contingency coefficient formula has been customized by dividing y by log(n). The CACC value can then be calculated using equation 5.

    CACC = \sqrt{\frac{y'}{y' + M}}, \quad y' = \frac{y}{\log(n)}, \quad y = M\left(\sum_{i=1}^{S} \sum_{r=1}^{n} \frac{q_{ir}^2}{M_{i+} M_{+r}} - 1\right)    (5)

where:
q_{ir} is the number of samples of class i in the interval (d_{r-1}, d_r],
M_{i+} is the total number of samples of class i,
M_{+r} is the total number of samples in the interval (d_{r-1}, d_r],
S is the number of classes, n is the number of intervals, and M is the total number of samples.

Ameva: Autonomous Discretization Algorithm (Ameva) is a top-down discretization method based on the χ² statistic [10]. A low number of intervals is presented as a main feature of Ameva. Since Ameva is a top-down discretization method, the discretization begins with one interval and repeatedly divides it into more intervals by using the Ameva criterion shown in equation 6.

    Ameva(k) = \frac{\chi^2(k)}{k(l-1)}    (6)

where:
k is the number of intervals,
l is the number of classes.
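Equation (6) can be illustrated as follows, assuming (as in [10]) that χ²(k) is the Pearson χ² statistic of the interval-by-class contingency table for a candidate discretization with k intervals. This is a sketch for illustration, not the authors' code.

import numpy as np

def ameva(contingency):
    """Ameva criterion of equation (6).

    contingency: k x l array, contingency[r][j] = number of examples of
    class j in interval r; chi2(k) is taken as the Pearson chi-square
    statistic of this table (assumption based on [10]).
    """
    T = np.asarray(contingency, dtype=float)
    k, l = T.shape                                   # k intervals, l classes
    N = T.sum()
    expected = np.outer(T.sum(axis=1), T.sum(axis=0)) / N
    chi2 = ((T - expected) ** 2 / expected).sum()
    return float(chi2 / (k * (l - 1)))

# Example: three intervals, two classes.
print(ameva([[9, 1],
             [5, 5],
             [1, 9]]))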
MDLP: Minimum Description Length Principle (MDLP) is a top-down discretization method which uses the entropy minimization heuristic to discretize continuous or numeric data into ranges or intervals [11]. The process starts by sorting the selected numeric attribute, similar to the other top-down discretization methods; after that, the entropy value is computed for every pair of adjacent values to find the optimal cutting point to be a boundary of the new intervals. MDLP is used as a criterion to decide whether a given partition should be accepted or rejected. If the condition in equation 7 is true, the partition induced by a cut point is accepted; otherwise it is rejected.

    Gain(A,T;S) > \frac{\log_2(N-1)}{N} + \frac{\Delta(A,T;S)}{N}    (7)

where:
A is the selected attribute,
T is the cut point,
S is the set of examples,
N is the number of examples.
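The paper does not reproduce the definition of Δ(A,T;S); the sketch below follows Fayyad and Irani's formulation in [11], where Δ(A,T;S) = log2(3^k − 2) − [k·Ent(S) − k1·Ent(S1) − k2·Ent(S2)] and k, k1, k2 are the numbers of classes present in S, S1, S2. It is an illustrative sketch of the acceptance test in equation (7), not the authors' implementation.

import math
from collections import Counter

def entropy(labels):
    """Class entropy Ent(S) in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mdlp_accept(S, S1, S2):
    """Acceptance test of equation (7) for a cut point that splits S into S1 and S2."""
    N = len(S)
    ent_S, ent_1, ent_2 = entropy(S), entropy(S1), entropy(S2)
    gain = ent_S - (len(S1) / N) * ent_1 - (len(S2) / N) * ent_2
    k, k1, k2 = len(set(S)), len(set(S1)), len(set(S2))
    # Delta(A,T;S) as defined by Fayyad and Irani [11] (not spelled out in this paper).
    delta = math.log2(3 ** k - 2) - (k * ent_S - k1 * ent_1 - k2 * ent_2)
    return gain > (math.log2(N - 1) / N) + (delta / N)

# Example: a cut that separates the two classes perfectly is accepted.
S = ['a'] * 10 + ['b'] * 10
S1, S2 = S[:10], S[10:]
print(mdlp_accept(S, S1, S2))    # True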
III. RELATED WORKS

Many discretization methods have been introduced and widely used in classification. In fact, the use of different discretization methods for classification purposes has been investigated with different objectives. Discretization methods have been compared by taxonomy, without experiments with classifiers, in several aspects such as merging or splitting, use of class information, stopping criterion, and sensitivity to outliers, in order to suggest discretization methods for users in various environments and datasets [3]. Extended Chi2 was found superior to Modified Chi2 and Boolean reasoning with the VPRS classifier [7]. Chi2-discretized data were compared with the original data with respect to accuracy and tree size in C4.5 [5]. A comparison between Chi2 and MDLP was carried out in [6], while the work in [10] compared Ameva with CAIM. Application of discretization in C4.5 and Naive Bayes using equal-width binning, RMEP, error-based, and SOM-based methods showed an improvement in accuracy and a reduction in construction time [12]. The work in [8] claimed that CAIM was the discretization method that guaranteed the lowest number of intervals and the highest interdependence between classes and discrete intervals when compared to Equal-Width, Equal-Frequency, Paterson-Niblett, Maximum Entropy, CADD, and IEM with the CLIP4 and C5.0 classifiers. Similar work confirmed these findings in [13]; however, Naive Bayes and Semi-Naive Bayes with IEM discretization were found to be the best for AODE classifiers in terms of accuracy. ChiMerge has been used together with ID3 in [4], while MDLP was adopted for a neural network classifier in [11]. MDLP was also used as the discretization method in a comparison of different variations of Naive Bayes classifiers on medical datasets in [14]. Naive Bayes with Minimal Entropy Partitioning yielded better results than Equal Width Interval and Holte's 1R discretization [15]. Combinations of Naive Bayes with the LD, NDD, WPKID, and WNDD discretization methods were adopted for natural datasets in [16]. Nevertheless, there is a study which contradicts the use of discretization by suggesting that using real data directly could yield better results for ID3 and C4.5 [17].

Therefore, it can be said that, in most cases, discretization of numerical attributes leads to better performance and higher accuracy in classification. While there have been some previous studies comparing discretization methods, they tend to have specific objectives and cover only a few types of classifiers. This work is the first that attempts to compare popular discretization methods together with popular classifiers, with the particular objective of finding suitable combinations of discretization method and type of classifier.

IV. DATASETS AND CLASSIFIERS USED IN THE INVESTIGATION

A. Datasets

In order to validate the study, the numerical datasets tested ought to be as many and as varied as possible, and the number of samples in each dataset must be sufficiently large. The datasets used in this study are taken from the well-known UCI Repository [18]. Attributes in these datasets have numerical values only; such datasets are relatively fewer than those with nominal or mixed attribute types. The number of samples in the datasets ranges from 106 to 20,000, the number of attributes from as few as 4 to as many as 90, and the number of categories from as few as 2 to as many as 30. Altogether, twenty-five datasets are selected; they are summarized in Table I.
TABLE I. DATASETS USED IN THE STUDY

     Dataset                           No. of samples   No. of attributes   No. of categories
 1   letter-recognition                     20000               16                  26
 2   pen-based-recognition                   7494               16                  10
 3   wall-following-robot-navigation         5456               24                   4
 4   waveform                                5000               21                   3
 5   winequality-white                       4898               11                   7
 6   winequality-red                         1599               11                   6
 7   yeast                                   1480                8                  10
 8   banknote-authentication                 1372                4                   2
 9   pima-indians-diabetes                    768                8                   2
10   transfusion                              748                4                   2
11   user-knowledge-modeling                  403                5                   5
12   movement-libras                          360               90                  15
13   leaf                                     340               15                  30
14   ecoli                                    336                7                   8
15   vertebral-column-3C                      310                6                   3
16   vertebral-column-2C                      310                6                   2
17   haberman                                 306                3                   2
18   glass                                    214                9                   6
19   seeds                                    210                7                   3
20   sonar-all-data                           208               60                   2
21   parkinson                                195               22                   2
22   planning-relax                           182               12                   2
23   wine                                     178               13                   3
24   iris                                     150                4                   3
25   breast-tissue                            106                9                   6

B. Classifiers

A classifier in this study is taken to mean any algorithm or technique which enables classification by means of supervised learning. Selecting an appropriate classifier for the task at hand is essential in order to ensure a satisfactory result. At present, many classifiers exist and new ones are continually being developed. Unless the work is about developing or researching particular aspects of classification, scientists and engineers tend to resort to popular, open-source classifiers available in the public domain. This study therefore selects popular and well-known classifiers which are likely to be chosen when classification is required. Five popular classifiers are selected for the study: K-Nearest Neighbor (K-NN), Naive Bayes, Decision Tree (C4.5), Neural Network, and Support Vector Machine (SVM). These classifiers represent different characteristics, from a basic classification algorithm (K-NN) to a mathematically oriented one (SVM). As each of these five classifiers is available in the public domain, users tend to assume that suitable discretization has been done to the numerical attributes in the dataset used, although some of the five classifiers, such as the neural network, are capable of dealing with numerical values directly. It is also an objective of this study, as stated in Section I, to investigate whether discretization of numerical attributes is preferable to using numerical values directly.

V. METHODOLOGY

This study intentionally chooses publicly available implementations of the five classifiers stated in Section IV instead of implementing them, because this is also an opportunity to test some of their default discretization methods. CRAN-R is selected as the main tool [19], with the additional library package caret [20] and the classification and regression libraries e1071 [21], klaR [22], and RWeka [23]. The study can be summarized in three steps as follows:

Data Preparation: This is the initial step, which comprises two tasks: replacing missing values and normalization. Each dataset was scanned, and it was found that some samples had missing attribute values. Missing values were replaced by the mean of that attribute in the dataset. After this was done, normalization was carried out on each attribute of each dataset in order to reduce bias between the attributes. The popular Z-score was adopted for normalization.
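The Data Preparation step was carried out in R in this study; the following sketch shows an equivalent mean imputation followed by Z-score normalization in Python/pandas, purely as an illustration. The column names in the example are hypothetical.

import pandas as pd

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Mean-impute missing numeric values, then Z-score normalize each attribute."""
    out = df.copy()
    num_cols = out.select_dtypes(include="number").columns
    # Replace missing values by the attribute mean (as in the Data Preparation step).
    out[num_cols] = out[num_cols].fillna(out[num_cols].mean())
    # Z-score: (x - mean) / standard deviation, to reduce bias between attributes.
    out[num_cols] = (out[num_cols] - out[num_cols].mean()) / out[num_cols].std()
    return out

# Hypothetical example with one missing value.
raw = pd.DataFrame({"attr_1": [5.1, None, 6.3], "attr_2": [0.2, 1.3, 2.5]})
print(prepare(raw))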
Discretization: The datasets are now ready for discretization. For each dataset, eight duplicates are made, so that each discretization method can be applied to its own duplicate. Therefore, for each type of dataset there are 9 datasets of that type: one is the original with real numerical attribute values, and the other eight are those whose attributes are discretized by the eight discretization methods described in Section II. As there are 25 different types of data used in this study, there are altogether 225 datasets.

Classification: This process is the application of the five different types of classifiers described in Section IV. As there are 9 datasets for each type of data, each classifier is applied to each dataset, with the original numerical attribute values and with the nominal attribute values produced by the eight different discretization methods. To ensure the validity of the classification, ten-fold cross validation is used and the process is repeated 5 times for each classifier. Therefore, each discretization is applied to 25 datasets and subjected to 5 classifiers, so 5,625 different classifications are carried out in this study.
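The Classification step was performed with CRAN-R, caret, and the classifier packages listed above; as a language-neutral illustration of the same evaluation protocol (ten-fold cross validation repeated 5 times), the sketch below uses Python/scikit-learn with a CART decision tree as a hypothetical stand-in for C4.5. It is not the pipeline actually used in the study.

from sklearn.datasets import load_iris
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Ten-fold cross validation repeated 5 times, as in the Classification step.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)
clf = DecisionTreeClassifier(random_state=0)        # stand-in for C4.5 (assumption)

scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print("mean accuracy over 50 folds: %.3f" % scores.mean())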
VI. RESULTS AND DISCUSSION

There are several perspectives from which the results of the study can be presented; however, it seems tedious to report them at a detailed level, such as the best discretization for each dataset, the best classifier for each dataset, and the best combination of both for each type of dataset. Hence, this section reveals the key findings pertaining to the study.

Among the six datasets for which Chi2 yielded the best performance, the classifiers that were used are shown in Table II.

TABLE II. CLASSIFIERS WITH WHICH CHI2 YIELDED THE BEST PERFORMANCE

     Classifier         No. of datasets
     C4.5                     2
     K-NN                     1
     C4.5                     1
     Neural Network           1
VII. CONCLUSION

This study investigates eight discretization methods together with five popular classifiers. It was found that the top three methods that yielded the best performance are Chi2, MDLP, and CACC, and that none of the eight discretization methods outperformed the others by a significant margin. Note that discretization of numerical attributes may not always guarantee better performance, especially when the classifiers used are those which were originally developed to deal with numerical values (i.e. Neural Network, K-NN, and SVM). Nevertheless, this study recommends Chi2 and MDLP for discretization. Hence, discretization of numerical attributes by these two methods ought to be carried out, especially when resorting to public-domain software.

With respect to classifiers, this study recommends C4.5 as the first choice of classifier for nominal attributes, together with either Chi2 or MDLP as the discretization method. It must be emphasized that the above are recommendations and are not meant to be an absolute rule, as the study reveals that the nature of the dataset tends to influence the choice of classifier as well as of discretization. If discretization of numerical attributes cannot be carried out or is disregarded, then classifiers such as Neural Network, SVM, and K-NN ought to be the preferred choice.

As discretization is an additional process in data preparation, the computational complexity of each discretization method must be considered, especially in applications where response time is critical, as some methods are likely to be more resource demanding than others. Future work can also be carried out on lesser-known classifiers, as it may reveal further useful facts about discretization.

ACKNOWLEDGMENT

The authors gratefully acknowledge the School of Information Technology (SIT), King Mongkut's University of Technology Thonburi (KMUTT) for the partial scholarship granted to Mr. Supharoek Chattanachot for this research and for the support of the computing facilities.

REFERENCES

[1] J. Han, J. Pei, and M. Kamber, Data Mining: Concepts and Techniques, Third Edition, Elsevier, 2011.
[2] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Second Edition, Morgan Kaufmann, 2005.
[3] L. Peng, W. Qing, and G. Yujia, "Study on comparison of discretization methods," in Proc. International Conference on Artificial Intelligence and Computational Intelligence (AICI '09), 2009, vol. 4, pp. 380–384.
[4] R. Kerber, "ChiMerge: Discretization of numeric attributes," in Proc. Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, California, 1992, pp. 123–128.
[5] H. Liu and R. Setiono, "Chi2: Feature selection and discretization of numeric attributes," in Proc. Seventh International Conference on Tools with Artificial Intelligence (ICTAI '95), 1995, pp. 388–391.
[6] F. E. H. Tay and L. Shen, "A modified Chi2 algorithm for discretization," IEEE Trans. Knowledge and Data Engineering, vol. 14, no. 3, pp. 666–670, May 2002.
[7] C.-T. Su and J.-H. Hsu, "An extended Chi2 algorithm for discretization of real value attributes," IEEE Trans. Knowledge and Data Engineering, vol. 17, no. 3, pp. 437–441, Mar. 2005.
[8] L. A. Kurgan and K. J. Cios, "CAIM discretization algorithm," IEEE Trans. Knowledge and Data Engineering, vol. 16, no. 2, pp. 145–153, Feb. 2004.
[9] C.-J. Tsai, C.-I. Lee, and W.-P. Yang, "A discretization algorithm based on Class-Attribute Contingency Coefficient," Information Sciences, vol. 178, no. 3, pp. 714–731, Feb. 2008.
[10] L. Gonzalez-Abril, F. J. Cuberos, F. Velasco, and J. A. Ortega, "Ameva: An autonomous discretization algorithm," Expert Systems with Applications, vol. 36, no. 3, part 1, pp. 5327–5332, Apr. 2009.
[11] U. M. Fayyad and K. B. Irani, "Multi-interval discretization of continuous-valued attributes for classification learning," in Proc. 13th International Joint Conference on Artificial Intelligence, 1993, pp. 1022–1027.
[12] F. Kaya, "Discretizing continuous features for naive Bayes and C4.5 classifiers," University of Maryland publications, College Park, MD, USA, 2008.
[13] M. Mizianty, L. Kurgan, and M. Ogiela, "Comparative analysis of the impact of discretization on the classification with Naive Bayes and Semi-Naive Bayes classifiers," in Proc. Seventh International Conference on Machine Learning and Applications (ICMLA '08), 2008, pp. 823–828.
[14] R. Abraham, J. B. Simha, and S. S. Iyengar, "A comparative analysis of discretization methods for medical data mining with naive Bayesian classifier," in Proc. Information Technology (ICIT '06), 2006, pp. 235–236.
[15] J. Dougherty, R. Kohavi, and M. Sahami, "Supervised and unsupervised discretization of continuous features," in Proc. Twelfth International Conference on Machine Learning (ML-95), 1995, vol. 12, pp. 194–202.
[16] Y. Yang and G. I. Webb, "A comparative study of discretization methods for naive-Bayes classifiers," in Proc. 2002 Pacific Rim Knowledge Acquisition Workshop (PKAW 2002), 2002, vol. 2002.
[17] D. Ventura and T. R. Martinez, "An empirical comparison of discretization methods," in Proc. Tenth International Symposium on Computer and Information Sciences (ISCIS X), 1995, pp. 443–450.
[18] M. Lichman, UCI Machine Learning Repository, 2013. [Online]. Available: [Link]
[19] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2016. [Online]. Available: [Link]
[20] M. Kuhn, The caret Package, 2016. [Online]. Available: [Link]/web/packages/caret/[Link]
[21] D. Meyer et al., e1071, 2015. [Online]. Available: [Link]/web/packages/e1071
[22] C. Roever et al., klaR, 2014. [Online]. Available: [Link]/web/packages/klaR
[23] K. Hornik, C. Buchta, and A. Zeileis, "Open-source machine learning: R meets Weka," Computational Statistics, vol. 24, no. 2, pp. 225–232, May 2008.