
Soft Computing Techniques

The document reviews the application of soft computing techniques in customer segmentation, highlighting its potential to enhance segmentation research through advanced data analysis methods. It critiques existing empirical studies, noting that the use of soft computing in segmentation is still in early stages and suggests further exploration for more actionable results. The article discusses critical methodological issues and criteria for effective segmentation, emphasizing the need for improved understanding and application of these techniques in marketing strategies.

Expert Systems with Applications 40 (2013) 6491–6507

Contents lists available at SciVerse ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

Review

Soft computing applications in customer segmentation: State-of-art review and critique
Abdulkadir Hiziroglu ⇑
Yildirim Beyazit University, Management Information Systems Department, Cinnah Campus, 06400 Ankara, Turkey

Article info

Keywords: Customer segmentation; Segmentation review; Soft computing in segmentation; Data mining

Abstract

Segmentation has received immense attention and has been used extensively in strategic marketing. The vast majority of the research in this area focuses on the usage or development of different techniques. By means of the internet and database technologies, a huge amount of data about markets and customers has now become available to be exploited, and this enables researchers and practitioners to make use of sophisticated data analysis techniques apart from the traditional multivariate statistical tools. These sophisticated techniques belong to the family of either data mining or machine learning research. Recent research shows a tendency towards applying them to different business and marketing problems, particularly segmentation. Soft computing, as a family of data mining techniques, has recently started to be exploited in the area of segmentation, and it stands out as a potential area that may be able to shape the future of segmentation research. In this article, the current applications of soft computing techniques to the segmentation problem are reviewed based on certain critical factors, including those related to the segmentation effectiveness that every segmentation study should take into account. The critical analysis of 42 empirical studies reveals that the usage of soft computing in the segmentation problem is still in its early stages and that the ability of these studies to generate knowledge may not be sufficient. Given these findings, it can be suggested that there is more to explore in order to obtain more managerially interpretable and acceptable results in further studies. Also, recommendations are made for other potential uses of soft computing in segmentation research.
© 2013 Elsevier Ltd. All rights reserved.

⇑ Tel.: +90 3122415555.
E-mail address: [email protected]

0957-4174/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.05.052

1. Introduction

Segmentation was first introduced to the marketing literature by Smith (1956). Later, segmentation was mentioned as an alternative concept instead of product differentiation strategy (Beane & Ennis, 1987; Claycamp & William, 1968; Wind, 1978). The main idea of segmentation or clustering is to group similar customers. A segment can be described as a set of customers who have similar characteristics of demography, behaviours, values, and so on (Nairn & Berthon, 2003).

The selection of segmentation techniques has become more important because developments in information and communication technologies, especially database management systems and data mining, have changed the way marketing is done. The vast availability of data and the inefficient performance of traditional statistical techniques (or statistics-oriented segmentation tools) on such voluminous data have stimulated researchers to find effective segmentation tools in order to discover useful information about their markets and customers. Thus, knowledge discovery (KD) and data mining (DM) have been seen as a solution to this problem. Disciplines such as machine learning, statistics, artificial intelligence (soft and hard computing techniques), expert systems, data and knowledge management technologies are incorporated with KD and DM by making use of their theories and algorithms (Freitas, 2002; Mitra, Pal, & Mitra, 2002; Shaw, Subramaniam, Tan, & Welge, 2001; Tyndale, 2002). Marketing researchers are interested in the application of these technologies in marketing-related problems, such as forecasting, segmentation, knowledge-based marketing decision support systems, and so forth, especially in the frame of DM (Liao, 2003; Mitra et al., 2002; Pal, Talwar, & Mitra, 2002; Smith & Gupta, 2000; Vellido, Lisboa, & Vaughan, 1999).

Soft computing, as a family of data mining techniques, has recently started to be exploited in the area of segmentation, and it stands out as a potential area that may be able to shape the future of segmentation research. The significant usage of soft computing techniques in business-related problems, particularly in segmentation, makes segmentation problems more attractive, since these techniques are very effective and applicable. From this perspective, the objective of this article is to find out where the future of segmentation is heading in terms of being able to obtain effective segmentation results. In order to accomplish this objective, the current applications of soft computing techniques in
segmentation problem are reviewed based on some critical factors, including factors related to the segmentation effectiveness that every segmentation study should take into account.

The rest of this study is organised as follows. Critical methodological issues associated with segmentation are presented in Section 2. Section 3 includes background information regarding the soft computing technologies. The method followed in accomplishing the critical analysis of the empirical studies is presented in Section 4, while the results obtained through the critical analysis are shown in Section 5. Section 6 concludes the article by providing some discussions on the future of soft computing in segmentation research.

2. Critical issues in segmentation research

Although most segmentation research takes individual consumers into account as a unit of analysis, considerations in consumer segmentation studies have not been mentioned in the previous literature except for Wind’s landmark article (1978) published 30 years ago. Moreover, researchers such as Goyat (2011), Myers and Tauber (1977), Wilkie and Cohen (1977), Beane and Ennis (1987), Dolnicar (2004), Yankelovich and Meer (2006), Sun (2009) and Tynan and Drayton (1987) provided ample reviews of segmentation research. Additionally, in order to give a comprehensive discussion regarding the methodological issues in segmentation research, the structure in Wind’s work (1978) will be expanded by reviewing additional literature. His structure was based on five main topics: (1) problem definition; (2) research design; (3) data collection; (4) analysis; and (5) implementation and interpretation of data and results. Since then, important developments have occurred particularly in research methodologies, including new bases and tools for segmentation (Wind, 1978). Nevertheless, the main considerations he highlighted are still contemporary and should be taken into account for today’s segmentation studies.

Table 1 represents major considerations related to the five methodological issues mentioned above. Detailed information will not be provided for all the considerations listed in the table; instead, only some of them will be described here, since those are considered to be included as critical factors or variables in the critical analysis part. Since the critical analysis will be done for the academic empirical studies, their evaluations based on other considerations might be difficult. For example, considerations related to real-life practical issues such as budget constraints, information need of the company and the baseline for segmentation (this is about conducting either a one shot or continuous segmentation study) were not selected and were excluded as basis for the critical analysis. The reason for this is that these considerations are mainly company-specific and dependent on the practical conditions of a company. Likewise, considerations including operationalization of variables, segment stability, and issues related to the implementation and interpretation of results were not taken into account, as those are either not easy to evaluate or usually not mentioned in academic studies. Hence, the discussion regarding the considerations taken into account will be brief and the findings from the literature will be summarised.

2.1. Conceptual segmentability

The term “segmentability” questions when it is possible to segment a market, and under what conditions this should be done. Young, Ott, and Feigin (1978) provided practical insights for different segmentation. Green and Carmone (1977) proposed a market segmentability measure in the componential segmentation framework that helps to develop a numerical segmentability index (called omega square measure) before a large-scale segmentation study is undertaken. Also, Dolnicar, Freitag, and Randle (2005) developed an investigation model based on a hypothetical simulation in order to understand the success of different segmentation strategies under varying marketing conditions, which may help one to decide whether to segment or not before undertaking a segmentation study.

Conceptually, there are other arguments regarding what characteristics an effective segmentation study should possess. Kotler (2003) points out that measurability, accessibility, substantiality, differentiability, and actionability are five criteria for effective segmentation. Having a measurable segment means that you have the ability to measure the variables in terms of the size, the purchasing power, and the profiles. Accessibility refers to whether the segments can be effectively reached and served. Also, segments obtained should be large or profitable enough to serve. Differentiability is the issue of being able to have segments which are conceptually distinguishable and respond differently to different marketing mix elements or programmes. Furthermore, the marketing activities should be designed for the segments that are actionable or worth considering in order to attract and serve them. Wedel and Kamakura (2000) omitted the criterion of measurability and added two additional criteria, namely, stability and responsiveness. Stability is a criterion that reflects whether the segments are stable over time or change their structure, while responsiveness refers to the combination of the criteria of differentiability and actionability in the above definitions. Biggadike (1981) looked at the issue from the strategic management perspective and replaced the last three criteria of Kotler with defensibility, durability, and competitiveness. However, he used the term “accessibility” to refer to the meaning of actionability in Kotler’s definition. According to him, defensibility is a measure to check whether the cost of serving a particular segment is unique to it or not. Also, he used the term “durability” for understanding the differences between segments that are likely to endure or erode, and the term “competitiveness” for making sure that the organisation has a relative advantage in terms of the skills required to serve the segments. Furthermore, Raaij and Verhallen (1994) classified these criteria into four categories, namely, typifying the segments (including the criteria of identifiability, differentiability, and measurability), homogeneity (variation, stability, and congruity), usefulness (accessibility and substantiality), and strategic criteria (potentiality, profitability, and attractiveness) by adding some additional criteria.

There are some studies which considered this issue from the point of view of strategic management literature. For example, Goller, Hogg, and Kalafatis (2002) categorised these criteria into two main criteria: segmentability, which mainly includes Kotler’s criteria of measurability, accessibility, and differentiability (homogeneity within and heterogeneity between); and target market selection (e.g., segment size and growth, market share), which consists of Kotler’s criteria of substantiality and actionability. The most comprehensive study on that particular topic was conducted by Dibb (1995, 1999), who operationalises the criteria from two different dimensions by merging them into the segmentation process. The first dimension is criteria-oriented and includes two main criteria, namely, segment qualification (whether segments are operational or not) and segment attractiveness (includes a wide range of internal and external factors within the context of environmental conditions, available resources, and the level of competition). The second is a resource-oriented dimension that shows which different sources of thought take this issue into account and from what perspective. The variables associated with these criteria can be found in the related literature (Dibb, 1995, 1999; Dibb & Simkin, 1997, 2010).

As can be seen, there are many classification efforts, but the literature does not have a comprehensive analysis of those criteria.
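To make the idea of a numerical segmentability index, and of "homogeneity within and heterogeneity between" segments, more concrete, the following minimal sketch computes the textbook one-way ANOVA omega-squared effect size for one variable across candidate segments. It is an illustrative stand-in only: Green and Carmone's (1977) componential formulation is not reproduced in this review, and the function name `omega_squared` and the sample spending figures are hypothetical.

```python
from statistics import fmean


def omega_squared(groups):
    """Textbook one-way ANOVA omega-squared for a list of numeric groups.

    Assumption: the fixed-effects estimator
    (SS_between - (k - 1) * MS_within) / (SS_total + MS_within)
    is used here purely as an illustrative segmentability index.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = fmean([x for g in groups for x in g])
    # Between-segment variation: distance of each segment mean from the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-segment variation: spread of customers around their own segment mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    ms_within = ss_within / (n - k)
    denominator = ss_between + ss_within + ms_within
    if denominator == 0:
        return 0.0  # all values identical: nothing to segment on
    # The raw estimate can be negative; clamping at zero is a common convention.
    return max(0.0, (ss_between - (k - 1) * ms_within) / denominator)


# Hypothetical monthly spending for three candidate segments.
well_separated = [[10, 12, 11], [30, 31, 29], [50, 52, 51]]
overlapping = [[10, 30, 50], [12, 31, 52], [11, 29, 51]]

print(round(omega_squared(well_separated), 3))
print(round(omega_squared(overlapping), 3))
```

Under these assumptions, a value close to 1 suggests that the candidate segments are homogeneous within and clearly different from each other on the chosen variable, while a value close to 0 suggests that segmenting on it would add little.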
This lack of a comprehensive analysis could be because different researchers place those criteria within wider concepts or interpret them differently. Furthermore, there is no clear guidance in the literature regarding how to measure those criteria. It can only be claimed that measurability, accessibility, differentiability, substantiality, and actionability are the five common criteria for effective segmentation, as Kotler (2003) suggested. From the clustering point of view, homogeneity can be added to this list.

Table 1
Considerations about critical methodological issues.

Critical methodological issues          Major consideration
Problem definition related issues       Managerial requirements
                                          • Conceptual segmentability
                                          • Budget and other constraints
                                          • Total information needs of the firm
                                        The baseline for segmentation
                                        Segmentation variables and models
                                          • Classification of segmentation variables
                                          • Selection of segmentation basis
                                          • Choosing appropriate segmentation model
Research design issues                  Unit of analysis
                                        Segmentation objective
                                        Sample design
                                        Data reliability
                                        Operationalization of variables
                                        Stability
Data collection related issues          Data type and source
Data analysis related issues            Segmentation techniques
                                          • Classification of segmentation methods and techniques
                                          • Selection of segmentation techniques
                                        Standardisation/normalisation
                                        Determining the number of clusters (or segments)
                                        Reliability and validity
Issues related to the implementation    Selection of target segments
and interpretation of results           Translating segmentation findings into marketing strategy

2.2. Segmentation variables

Consumers have a variety of differences according to their characteristics. In consumer and industrial marketing literature, several segmentation variables can be found, such as geographic, demographic, firmographic, behavioural, decision making process-related variables, purchasing behaviour, situation factors, personality, lifestyle, psychographics, and so on (Bock & Uncles, 2002; Cheron & Kleinschmidt, 1985; Kotler, 2003; Walters, 1997). Kotler (2003) classifies market segmentation variables into four major areas, namely, geographic, demographic, psychographic, and behavioural variables. On the other hand, some other researchers give a classification based on the level of variables. One example of this kind of classification can be found in the study of Raaij and Verhallen (1994), who make a classification based on two main dimensions: the level of variables (general, domain-specific, brand specific) and the objectivity/subjectivity of variables. Wedel and Kamakura (2000) give the following classification schema for segmentation bases:

• General observable variables (e.g., geographic, demographics, socio-economic variables)
• Product specific observable variables (e.g., usage frequency and loyalty)
• General unobservable variables (e.g., lifestyle, psychographics)
• Product specific unobservable variables (e.g., benefits, preferences and intentions)

Selection of variables suitable for the segmentation model is an important point to consider. Wind (1978) addressed this issue by giving two major considerations. His first consideration is about management’s specific need for a segmentation study, while the second is the current state of the marketing and consumer information. He points out that even though all variables could be used as bases for segmentation, a consensus that some variables are better than others should be reached. Furthermore, from a practical point of view, Greenberg and McDonald (1989) stressed that, when choosing a base for segmentation, the selected dimensions should correlate with market behaviour, be readable for product manipulation and the development of communication strategies, and give directions for media buying. Additionally, Day (1990) suggests two different ways of identifying segment descriptors. The first way starts with the identifiers of consumers and then checks whether segments have distinct response profiles or not. The second works backward in order to find out if the segments have distinct response profiles and can be identified in respect of different characteristics.

Each of these variables has been used in segmentation studies and has advantages and disadvantages (Vriens, 2001). Different segmentation schemas are likely to result depending on which variables are included in the analysis (Segal & Giacobbe, 1994). For some specific marketing objectives, guidelines on which variables should be considered can be found in the literature. For instance, Natter (1999) suggested that benefit-related bases are the most meaningful types to use from the point of view of facilitating other marketing activities, such as product planning, positioning, and advertising. The article also points out that although lifestyle or psychographic variables are not problematic from the statistical standpoint, from the marketing perspective they are not helpful enough, as they may not be directly associated with actual consumer behaviours. Furthermore, while general observable variables are easy to collect, their reliability and validity are questionable. Some researchers agree that demographics and socio-economic variables are not sufficient for an effective segmentation study (Barnett, 1969; Dhalla & Mahatoo, 1976; Greenberg & McDonald, 1989; Haley, 1968; Peltier & Schribrowsky, 1997; Sharma & Lambert, 1994; Yankelovich, 1964). It is suggested that demographic variables provide little guidance for product development and communication strategies (Greenberg & McDonald, 1989). In addition to that, they have poor prediction capabilities for consumer behaviour (Haley, 1968), because customers who are in the same segment may want personalised products and services and might not exhibit similar behaviour, even if they have similar demographic features and lifestyles. However, although general unobservable variables are weakly related to purchasing behaviours, they are also accessible and useful for marketers. The best evaluation of those variables can be found in the book of Wedel and Kamakura (2000). They make the evaluation based on six segmentation criteria which were mentioned in an earlier section when the issue of market segmentability was being discussed. According to them, compared to product-specific bases, general observable variables have higher potential on the criteria of identifiability, substantiality, accessibility, and stability. However, in terms of actionability and responsiveness criteria, they tend to have a lower potential. General unobservable bases are rated between high and low on most of the criteria.

Thirty years after Wind’s (1978) original work, this issue still remains a problem, mainly because of a lack of systematicity and non-representativeness in academic studies. The most important
point here is to determine the objective(s) of the segmentation study. After deciding the objective, one of the variables mentioned earlier or a combination of those variables can be used for a segmentation purpose. It should be noted that one of the most valuable pieces of information is customers’ behavioural characteristics, especially past customer purchases and value-oriented attributes (Bayer, 2010; Kim, Jung, Suh, & Hwang, 2006; Wind & Lerner, 1979). In fact, customer analytics related technological advances have facilitated performing segmentation studies based on those characteristics (Bailey, Baines, Wilson, & Clark, 2009).

2.3. Segmentation models

When building a segmentation model, another crucial consideration is the selection of segmentation methods or techniques. In segmentation literature, several methods and modelling techniques have been proposed. However, the most well-known segmentation models can be found in industrial market segmentation literature and can be classified into three categories: single, two-stage (Wind & Cardozo, 1974) and multi-stage (aka nested approach) (Shapiro & Bonoma, 1984) models. This classification is mainly based on how many times the segmentation process works in respect of the variable bases used in the model. In consumer segmentation literature, most approaches are technique- or method-oriented, ranging from simple inferential statistics to artificial intelligence. It is possible to give a classification of segmentation techniques based on the literature, which were used as analytical techniques or methods for market/customer segmentation. For example, Wind (1978) identified four basic approaches for market segmentation. The first approach is “a priori” segmentation, which chooses some variables of interest and then classifies consumers based on that designation (Green & Krieger, 1991; Wind, 1978). However, in the second approach, called “post hoc” segmentation, the classification job in the segmentation process is based on clustering (Greenberg & McDonald, 1989). The “a priori” segmentation supposes that the number of segments or clusters, along with their dimensions and descriptions, are known. On the other hand, these characteristics are found in the “post hoc” approach after the segmentation process (Greenberg & McDonald, 1989). In the “post hoc” segmentation, multi-variate analytical techniques are commonly used. The third approach is called “flexible” segmentation. This is a dynamic approach and can develop and examine many alternative segments. The last approach was developed by Green (1977), and is an extended version of conjoint analysis, which can make predictions regarding which type of person will be most responsive to which type of products.

A second example of the classification of segmentation techniques is the classification proposed by Wedel and Kamakura (2000), which is provided below:

• A priori descriptive methods
• A priori predictive methods
• Post-hoc descriptive methods
• Post-hoc predictive methods

The most important distinction in this classification is that the methods are classified as descriptive or predictive. In the descriptive methods, no distinction is made between dependent and independent variables. However, the predictive methods suppose that one variable is designated as the dependent variable and the rest are defined as independent (Vriens, 2001). Different combinations of these methods in a single problem are also possible; conceptual examples can be found in Dolnicar (2004).

As the third example regarding this, Raaij and Verhallen (1994) classified methods into three basic approaches: forward, backward, and simultaneous. The forward approach, which is a kind of analysis of customer response, assigns customers to groups on the basis of similarity in behavioural response. In the backward approach, this similarity is based on one or more customer characteristics. The simultaneous approach takes its basis from the relationship between customer characteristics and behavioural responses.

2.4. Unit of analysis and objective of segmentation

Selection of a unit of analysis depends on two decisions (Sausen, Tomczak, & Herrmann, 2005). The first one is associated with companies’ overall marketing strategies that lead them to come up with different objectives with regard to market segmentation, while the second one is the ability to access certain units of analysis. Segmentation literature includes a variety of possible segmentation objectives. Wind (1978) stated that segmentation is implemented through the intent of a company, which could be either strategy generation, like identifying new markets, or product-related decisions, i.e., defining pricing policy and possible changes in existing products. According to Beane and Ennis (1987), the aim of segmentation can be either searching for new product opportunities or gaining a better customer understanding. Segmentation objectives can be extended by considering the company’s resources, customers and products. Then the list can include objectives such as customer acquisition, customer retention, profitability, customer satisfaction, resource allocation by designing marketing measures or programmes, and increasing customer value, etc. (Sausen et al., 2005). However, organisations follow two main dimensions of segmentation strategies, namely, market-induced and customer-induced segmentation (Sausen et al., 2005). In the first dimension, the main objective is the identification and exploitation of new markets and customers by using an anonymous and aggregated unit of analysis. In the second dimension, the objective could be customer acquisition or retention by deploying a unit of analysis based on disaggregated and personalised customers. For customer segmentation this should be an individual customer. Within the scope of this study, the categorisation provided by Sausen et al. (2005) is used; they organised a comprehensive workshop by inviting many marketing scholars and managers in order to identify main segmentation objectives and the capability of the units of analysis to accomplish these objectives. According to their synthesis, Table 2 presents five segmentation objectives and four aggregation levels of objects regarding market segmentation.

2.5. Sample design

For any scientific research, finding an appropriate sample design is crucial for the reasons of validity and reliability. The selection of an appropriate sample design is supposed to have a representative impact on the projectability of the results of a study to the research universe. The choice of a target population and the sampling frame are two key considerations related to this topic (Steenkamp & Hofstede, 2002). Regarding the sample design consideration, only “sample size” will be included in the critical analysis.

2.6. Data type and source

There are two main types of data available for a segmentation study. One of them is primary data, which is commonly used by commercial research; the other one is secondary data, which is accepted as more academically oriented (Wind & Lerner, 1979). With the development of communication and Internet technologies, the problem of data collection or reaching compatible data is
diminishing. This study will take into account three different data types, namely, survey data (for the empirical studies that obtain the data through questionnaire), secondary data (for the studies that make use of data directly taken from a company database), and simulation data (for the studies that generate hypothetical data via simulation).

Table 2
Segmentation objectives and units of analysis. Source: Vellido et al. (1999).

Segmentation objective                          Unit of analysis
Exploitation of new customers potentials        Anonymous sub-markets
Development of existing customer potentials     Anonymous groups or typologies of customers
Increasing customer profitability               Personalised existing customers
Improving targeting of marketing measures       Personalised potential customers
Identification of new sub-markets

2.7. Segmentation techniques

For customer segmentation, a wide variety of data analysis techniques, cluster analysis (Alfansi & Sargeant, 2000; Allred, Smith, & Swinyard, 2006; Balakrishnan, Cooper, Jacob, & Lewis, 1996; Chaturverdi, Carroll, Green, & Rotondo, 1997; Dolnicar, 2003; Dolnicar, 2004; Dolnicar & New Zealand Marketing Academy Conference, 2002; Doyle & Saunders, 1985; Hruschka, Fettes, & Probst, 2004; Hruschka & Natter, 1999; Kuo, Ho, & Hu, 2002a; Lee, Lee, & Wicks, 2004; Li, Wang, & Xu, 2009; Liu & Shih, 2004, 2005; Shih & Liu, 2003; Shoemaker, 1994; Smith & Hirst, 2001; Smith, Willis, & Brooks, 2002; Wang, 2009; Xia et al., 2010), clusterwise regression (Bass, Tigert, & Lonsdale, 1968; Desarbo, Atalay, Lebaron, & Blanchard, 2008; Wedel & Kistemaker, 1989), AID/CHAID (Assael & Roscoe, 1976; Chen, 2003; Chung, Oh, Kim, & Han, 2004; Gensch, 1978; Gil-Saura & Ruiz-Molina, 2008; Jonker, Piersma, & Poel, 2004; McCarty & Hastak, 2007), multiple regression (Suh, Noh, & Suh, 1999), discriminant analysis (Fish, Barnes, & Aiken, 1995; Johnson, 1971; Mazanec, 1992; Tsiotsou, 2006), latent class structure (Green, Carmone, & Wachspress, 1976; Kontoleon & Yabe, 2006; Rajiv & Srinivasan, 1987; Dias & Vermunt, 2007; Wu & Chou, 2011), inductive learning techniques (Leung,

determine which techniques or algorithms perform better than the others. Most of them have their advantages as well as disadvantages (Kuo, Ho, & Hu, 2002b). Segmentation results mostly depend on the algorithms that will be used for clustering or classification. Improper selection of segmentation techniques may cause a negative financial impact. To avoid this problem, the first decision is to determine what kinds of segmentation approaches are suitable for the current segmentation study. In other words, it should be decided whether the data in the segmentation study is appropriate for the “post hoc” approach or the “a priori” approach. The second issue that should be taken into consideration is the understanding of data characteristics. The characteristics can be based on the volume of the data (e.g., large or small) or its structure (e.g., ill-structured or not).

For segmentation problems, previous research suggests that hierarchical approaches do not perform very well with large data sets (Kuo et al., 2002a). Because hierarchical methods build a tree structure using a dendrogram, they are not able to provide a unique clustering, since partitioning by cutting the dendrogram above a certain level becomes imprecise. The process of cutting the dendrogram is usually done by visualising the dendrogram and taking into account the distance between cluster centres, which can be considered an arbitrary process. Moreover, non-hierarchical or partitional methods work based on the assumption that the number of clusters and initial cluster points (not necessarily) are pre-defined, and this affects the final cluster solution (Lee, Lee, & Wicks, 2004). However, integration of hierarchical and partitional methods makes the clustering result powerful, especially in large databases (Kuo et al., 2002b).

In the customer segmentation problem, there are only a few studies that combined two clustering methods together. Punj and Steward (1983) first introduced a two-stage clustering concept by combining a hierarchical (Ward’s minimum variance) and a non-hierarchical technique (K-means). Here, initial clusters (the number of clusters) are determined by a hierarchical method, and then a partitional method is employed to find the final clusters. Similar to this methodology, another approach was proposed by Vesanto and Alhoniemi (2000), which initially was applied in a non-segmentation context, via changing the combination mentioned above by replacing Ward’s minimum variance method with a self-organising map (SOM) approach. This second methodology was implemented by
2009) and soft computing techniques (the details of which will other researchers (Al-Khatib, Stanton, & Rawwas, 2005; Chiu, Chen,
be provided in the following sections) have been used in marketing Kuo, & Kun, 2009; Kuo et al., 2002a, 2002b; Lee, Lee, & Wicks, 2004;
management. Lee, Suh, Kim, & Lee, 2004; Lien, Ramirez, & Haines, 2006) for the
Even though it is very difficult to provide a clear classification segmentation problem. In this approach, a set of initial cluster pro-
for segmentation techniques, Fig. 1 is proposed as a baseline totypes is formed before implementing k-means to obtain final
scheme for the classification of those techniques. In this figure, clusters. Similar to Ward’s method, the determination of cluster
while some techniques are classified under data preparation, oth- number is accomplished by visual inspection; however, this pro-
ers are considered as classification or clustering data analysis tech- cess in SOM method is much less arbitrary as SOM itself provides
niques, depending on the distinction of whether they are ‘‘a priori’’ more insights about how to make the visualisation. Also, there
or ‘‘post hoc’’ methods of approach or not. Besides traditional sta- are a few techniques developed for that purpose that can be found
tistical data analysis techniques, such techniques based on fuzzy in the related literature, such as U-matrix, displaying the number
logic (FL), artificial neural networks (ANNs), rough set theory of hits, and generic projection methods (Vesanto & Alhoniemi,
(RST) and evolutionary methods (EM) such as genetic algorithm 2000). Apart from being able to identify the initial cluster numbers,
(GA) are considered as soft computing tools, which are mostly sup- the main advantage of using SOM initially is that the complexity of
posed to be non-traditional artificial intelligence (AI) technologies. the reconstruction task and noise can be effectively reduced. More
They have been considered in both data analysis and data prepara- technical benefits of this approach are well explained in the related
tion as techniques for segmentation. Among the soft computing literature (Lee, Lee, & Wicks, 2004; Vesanto & Alhoniemi, 2000).
techniques, supervised neural networks, GA, and RST have been Furthermore, since the non-hierarchical methods require an initial
used for classification, while unsupervised neural networks, FL, solution and a specification of cluster number, it was better to first
and GA are appropriate for clustering purposes. However, some solve this problem by performing a hierarchical method.
of them (i.e., rough sets and GA) have also been considered as algo-
rithms for data preparation purposes such as attribute reduction or 2.8. Standardisation/normalisation
selection.
For market segmentation purposes, many algorithms can be The first consideration related to data analysis issue is
found in the literature, and it is a very challenging task to standardisation/normalisation of variables. As mentioned earlier,

Fig. 1. A classification of segmentation techniques.
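The two-stage procedure described in Section 2.7 (a hierarchical pass to obtain initial clusters, followed by a partitional pass to refine them) can be sketched in a few lines. This is a minimal illustration only, not the implementation used in any of the cited studies. It assumes plain Python, a naive Ward-style merge rule with Euclidean distance, and a number of clusters k that has already been chosen (for example by inspecting a dendrogram). The function names and the toy customer data are hypothetical.

```python
from math import dist


def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def ward_clusters(points, k):
    """Stage 1: naive agglomerative merging until k clusters remain.

    Each step merges the pair of clusters whose union gives the smallest
    increase in the within-cluster sum of squares (Ward's criterion).
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                cost = (len(a) * len(b) / (len(a) + len(b))
                        * dist(centroid(a), centroid(b)) ** 2)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def kmeans(points, centres, max_iter=100):
    """Stage 2: k-means iterations started from the given centres."""
    centres = list(centres)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centres)), key=lambda c: dist(p, centres[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties
                centres[c] = centroid(members)
    return labels, centres


def two_stage(points, k):
    """Seed k-means with the centroids of the hierarchical solution."""
    seeds = [centroid(c) for c in ward_clusters(points, k)]
    return kmeans(points, seeds)


# Hypothetical customers described by two already-standardised variables.
customers = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
             (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centres = two_stage(customers, k=2)
```

The hierarchical stage here is cubic in the number of observations, which is consistent with the remark that hierarchical approaches scale poorly. In practice it would be run on a sample, or replaced by SOM prototypes as in the second approach described above.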

segmenting markets or customers can be performed either by employing a clustering or a classification technique. This issue should be taken into account especially when using a cluster analysis technique. In fact, this is not necessarily a prerequisite for every segmentation study. However, since the scope of the study includes the critical analysis of soft computing applications, and some soft computing techniques are based on a clustering procedure, the standardisation/normalisation consideration will be addressed in the methodology part. Although, according to some researchers, standardisation/normalisation has no significant effects, other scholars argue that it eliminates the potential effects of scale differences, since otherwise a subset of variables can dominate the definition of clusters (Ketchen & Shook, 1996). Therefore we took the advice of those who advocate that standardisation/normalisation should be addressed in studies. Since results may differ solely because of standardisation/normalisation, the evaluation for this particular consideration will be based on whether or not a particular study employed any standardisation/normalisation procedure before performing the data analysis.

2.9. Determining the number of clusters (or segments)

Similar to the previous consideration, this is also associated with studies that perform a clustering procedure. The clustering literature provides several techniques for determining the number of clusters in data (Ketchen & Shook, 1996). For example, visualising the cut points on a dendrogram (which is a graph of the order in which similar observations are joined), the agglomeration coefficient (a numerical value at which various observations are merged), and certain cluster validity measures (examples are presented in the validity section) are the common ones. In addition, as we discussed before, combining two clustering methods (the two-stage clustering procedure) is another approach to determining the number of clusters.

2.10. Reliability and validity

An important aspect of segmentation is its validity and reliability. Even after obtaining a segmentation schema, the researcher has no assurance of having obtained a meaningful and useful set of segments. One way of satisfying this condition is to conduct suitable reliability and validity tests. Reliability is a measure of how stable, repeatable, and consistent the results are (Punj & Steward, 1983). Validity is a measure of accuracy. There are mainly two types of validity, namely external and internal validity (Shavelson, 1988). The former concerns whether the findings of the study can be generalised, while the latter measures the extent to which the outcomes of the study result from the variables and techniques being used.

In general, there are two possible approaches to assessing the reliability of a study (Ketchen & Shook, 1996). The first approach is based on the degree of consistency and can be performed via the clustering or classification task. It is possible to do that by altering the methods employed, or by carrying out the execution multiple times, and checking the consistency of the solutions. The second approach, which is based on cross-validation, can be accomplished by splitting the data into two halves and conducting the analysis to check consistency across the sample halves. If a clustering procedure is employed, then the second approach can be modified by obtaining cluster centroids from the first half and using them to define clusters in the second half. Cross-validation is more common than the first approach. For cross-validation, the discriminant measure of Wilks’ Lambda (λ) and the Kappa index are the most popular measures used by marketing researchers (Punj & Steward, 1983). When analysing the empirical studies we will only consider whether or not any form of reliability has been taken into account in those studies.

External validity can be established by showing that the results are useful in a larger sense (Punj & Steward, 1983). A widely used procedure developed by Choffray and Lilien (1980) can be utilised for that purpose. As an alternative, the analysis can be done on a hold-out sample or on a completely different data set to assess the similarity of the results (Ketchen & Shook, 1996). With regard to internal validity of the clustering or classification results, finding a criterion-related validity measure (either in the context of accuracy or in the form of homogeneity/clustering efficiency) can be helpful for evaluating and selecting an optimal clustering or classification schema. If the analysis is classification in nature, then any form of classification accuracy measurement can be used. However, if segmentation is performed through a clustering procedure, there are two relevant measurements, namely, compactness and separation (Kovacs, Legany, & Babos, 2005). Compactness measures how close the members of each cluster are to one another. Separation measures the distance between different clusters. In the related literature, many clustering indexes have

been developed (particularly for the partitional methods) and they can be categorised into three groups (Dimitriadou, Dolnicar, & Weingessel, 2002). The first group of indexes considers the sum of squares within and between the clusters. The second group is based on the scatter matrix of the data points, which is the sum of the scatter matrices in each cluster. The last group consists of indexes that do not belong to the previous ones, such as Davies & Bouldin index, likelihood index, simple structure index, and cluster similarity index of C. Among those indexes, there are some measures that can also be applied to determine the number of fuzzy clusters in the data set. Interested readers can refer to the related literature (Bezdek & Pal, 1998; Chou, Su, & Lai, 2004; Dimitriadou et al., 2002; Estivill-Castro, 2002; Hruschka, 1986; Kovacs et al., 2005; Punj & Steward, 1983; Shin & Sohn, 2004; Vesanto & Alhoniemi, 2000; Xie & Beni, 1991). Some of those indexes can also be used for comparing the results of different clustering methods (Bezdek & Pal, 1998; Estivill-Castro, 2002; Kovacs et al., 2005; Rand, 1971; Shin & Sohn, 2004; Vesanto & Alhoniemi, 2000). Another possible way to measure criterion-related validity is to assess significance with external variables (Ketchen & Shook, 1996). These variables must not have been used in defining clusters but must be theoretically related to the clusters. The categorisation of internal and external validities that was used to assess the empirical studies is presented in the methodology section.

3. An overview of soft computing technologies

Soft computing (SC) is mainly used in order to improve the performance of conventional systems, which can be considered as hard computing. It can also be used for implementing novel intelligent and user-friendly features (Dote & Ovaska, 2001). Soft computing consists of technologies including FL, ANNs, RST, and EM. The history of the soft computing technologies goes back further than the history of soft computing itself (Dote & Ovaska, 2001; Mitra et al., 2002). This is partly because it was realised only late that those technologies could be used in a complementary manner, as they were seen as competing tools in the beginning. The definition of soft computing as “an evolving collection of methodologies, which aims to exploit tolerance for imprecision, uncertainty, and partial truth to achieve robustness, tractability, and low cost” shows that it deals with the problems that have ambiguity and vagueness in human thinking with real-life uncertainty (Dote & Ovaska, 2001). The concepts of uncertainty and vagueness characterise the situations that we regard as the phenomena surrounding us and are concerned with the amount of information available at our disposal (Novak, 1998). Both terms are the main constituents of soft computing. The former is mathematically explained by probability theory and concerns the question of whether something occurs or not, while the latter can be formalised by fuzzy or other approximate sets and deals with what has or has not occurred (Novak, 1998). Zadeh (1994) defined soft computing as “not a body of concepts and techniques, [but] a partnership of distinct methods that in one way or another conform [to] its guiding principle”, while some authors describe it by the opposite term of “hard computing” (Mitra & Hayashi, 2000; Wang & Tan, 1997).

However, definitions of soft computing are not completely satisfactory because there is a risk of epistemological confusion about related thoughts (Dubois & Prade, 1998). It can be said that the only consensus about soft computing is that it is a consortium of methodologies that work synergistically (Mitra & Hayashi, 2000; Mitra et al., 2002; Pal et al., 2002). This consortium works in a cooperative, rather than a competitive manner (Mitra & Acharya, 2003). In addition to this consensus, the techniques mentioned above are the principal components of soft computing, where fuzzy tools enable us to work with vagueness and uncertainty, and EM can involve optimisation and searching processes, while ANNs and RST can solve classification and rule generation problems with their learning and discernability capabilities (Mitra et al., 2002; Pal et al., 2002). Only fuzzy systems work with a deductive logic; the others have inductive capabilities. Both fuzzy sets and RST can work with descriptive and numeric data, while ANN and EM can work only with numeric data (Duntsch & Gediga, 2000).

Soft computing technologies have recently been used for solving data mining problems (Craven & Shavlik, 1997; Kim & Street, 2004; Zhong & Skowron, 2001). In the related literature a guideline is given along several dimensions (complexity of the task, dynamic modelling capability of each technique, the size of training data and data type, the capability of modelling uncertainty and handling noisy data, interpretability of the technique’s results) regarding the decision of which technique to use when developing a soft computing application (Martinez, Magoulas, Chen, & Macredie, 2005). The suitability of each technique for different problems can be extracted from the literature. A classification of soft computing applications within the scope of data mining is provided in Table 3. Considering the fact that segmentation is handled as either a classification or a clustering problem within data mining, one can figure out which soft computing technologies are applicable for segmenting customers or markets.

Fuzzy sets are suitable for handling issues such as understandability of patterns, incomplete and imprecise data, information fusion and linguistic information, deducing knowledge, and finding approximate solutions (Pal et al., 2002; Pedrycz, 1998). Most of the fuzzy-oriented applications utilised fuzzy clustering (Crespo & Weber, 2005; Hruschka, 1986; Hu & Sheu, 2003; Ozer, 2001; Shin & Sohn, 2004; Wedel & Steenkamp, 1989) in order to segment customers. The fuzzy clustering methods partition a data set into a number of overlapping groups based on the similarity or the distance in a metric space between the objects in the data and the cluster prototypes (Setnes, 2000). Therefore, in fuzzy clustering (FC), it is possible to construct clusters with uncertain boundaries by allowing one object to belong to several overlapping clusters, each with some degree of membership.

ANNs are able to extract the embedded knowledge in trained networks, usually in the form of symbolic rules, which helps to identify the classes or the predicted values of the observations and the importance of the attributes regarding the determination of those classes or class values in the data space (Mitra et al., 2002). Rule generalisation, clustering or classification of the objects, forecasting or prediction of future behaviour, and modelling complex mathematical functions are the tasks that neuro-computing is able to deliver within the scope of data mining and knowledge discovery, especially for predictive marketing (consumer behaviour, market segmentation, purchase modelling, customer service support, prediction of bond rating, fraud detection, bankruptcy and corporate failure) (Zahavi & Levin, 1997). With regard to the application of ANN in the area of segmentation, solely the backpropagation algorithm (Bloom, 2005) was used for classification, while the other ANN algorithms (e.g., self-organising maps, SOM) were used for clustering (Chiu et al., 2009; Diez, Coz, Luacez, & Bahamonde, 2008; Ha, 2007; Hung & Tsai, 2008; Kuo, An, Wang, & Chung, 2006; Lee & Park, 2005; Shin & Sohn, 2004). Moreover, Adaptive Resonance Theory, Hopfield ANN, Backpropagation, and the Frequency-Sensitive Competitive Learning Algorithm (FSCL) are other methods that have been used for the market/customer segmentation problem.

EM are capable of arriving at an optimal solution via a fitness function in a robust and efficient way where the search space is large, as in data mining problems. Evolutionary methods (EM), as a member of soft computing techniques, consist of several

computational models of evolutionary processes including evolutionary algorithms, genetic algorithms, evolution strategy, and evolutionary programming (Kusiak, 2000). It is also possible to find some applications of EM (particularly applications of genetic algorithms, which is the most common approach) in the marketing field, including customer targeting (Kim & Street, 2004; Kim, Street, & Menczer, 2001; Kim, Street, Russell, & Menczer, 2005), market modelling (Shiraz, Marks, Midgley, & Cooper, 1998), location analysis and market structuring (Hurley, Moutinho, & Stephens, 1995), acquiring marketing decision rules (Ghosh & Bhabesh, 2004; Terano & Ishino, 1995), and direct marketing applications (Bhattacharyya, 2000). Furthermore, strategic marketing initiatives, such as optimisation of marketing resources, segmentation and other consumer behaviour modelling problems, can be addressed by EM (Chan, 2008; Chiu, 2002; Hurley et al., 1995; Kim & Ahn, 2008; Kuo et al., 2006; Tsai & Chiu, 2004).

RST, which is based on mathematical computations and granular approximation, has been used, like fuzzy sets, for discovering hidden patterns in an uncertain environment (Mitra et al., 2002). Within the framework of data mining, some application purposes of RST can be found, such as attribute reduction and rule generation (Hu & Cercone, 1994; Hu & Cercone, 1996), prediction (Poel & Piesta, 1998), rule extraction (Lingras & Yao, 1998; Zhong & Skowron, 2001), rule induction (Griffin & Chen, 1998) and classification (Chan, 1998; Li & Wang, 2004). For marketing problems, it is possible to find a few applications of RST, such as rule extraction, feature selection, customer retention and response modelling, segmentation and prediction (Changchien & Lu, 2001; Cheng & Chen, 2009; Komorowski, Polkowski, & Skowron, 1999; Poel & Piesta, 1998; Tseng & Huang, 2007; Voges, Pope, & Brown, 2003).

4. Method

The method followed in this article can be found in most literature studies that include a critical analysis of the current body of knowledge, particularly those examining the empirical studies of a literature. The identification of the relevant studies, establishing a coding procedure and maintaining the reliability of this coding procedure are the main constituents of the method. These efforts are described below.

4.1. Identification of relevant studies

To identify the applications of soft computing techniques to the segmentation problem, several journal articles were examined. The article selection procedure was based on several criteria. The first criterion is that the studies should be empirical in nature and use hypothetical (or simulated) or real-world data. The second one is related to the main purpose of using soft computing techniques; the technique or the techniques should be used in order to perform the segmentation either in clustering or classification form. Techniques that are supplementary to the segmentation process, such as data preparation or attribute reduction, were excluded. The third criterion is that only articles where segmentation was the main purpose of the study were considered. Studies whose main focus is not segmentation, but which do something related to the segmentation process, such as predictive modelling or direct marketing, were also excluded. The last criterion is associated with the publication type. Only journal articles were examined, and publications in other forms, such as conference papers, book chapters and research reports, were not included in the study. Hence, regardless of the publication date, all empirical studies were collected through the publication databases, depending on the availability of access to these databases. However, it can be said that the majority of the well-known science and social science journals were searched.

At the end of the searching process, a total of 42 studies were selected for critical analysis. The earliest of these publications dates from 1986, whilst the latest one is from 2012. The majority of the studies were published in science related journals, while only a few of those were in business and marketing related journals. The following section of this study will provide detailed information on the distribution of these studies across different journals and their publication periods.

4.2. Coding procedure

The critical factors described in the previous section served as the basis for the coding procedure. Specifically, each journal article was given an electronic code number and each of the factors used to examine the journal articles was coded respectively. The following table is given to elucidate the coding scheme for each factor. As can be seen from Table 4, a total of 18 factors or variables were taken into consideration as the basis to examine the selected

Table 3
Soft computing applications in different data mining tasks. Adapted from Mitra et al. (2002) and Pal et al. (2002).

Soft computing technologies: Data mining tasks

Fuzzy sets: clustering; association rules; functional dependencies; data summarisation; time series analysis; web mining (information retrieval); image retrieval
Artificial neural networks: rule extraction; rule evaluation; clustering; regression; web mining (information extraction and retrieval, personalisation)
Evolutionary methods: regression; association rules; web mining (search and retrieval, query optimisation, document representation, distributed mining)
Rough sets: decision rule induction; data filtration (including attribute reduction); rule generation; web mining (information retrieval, information fusion, handling multimedia data, document clustering, web usage mining)

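The fuzzy clustering entry in Table 3 rests on the idea, described in Section 3, that an object may belong to several overlapping clusters with different degrees of membership. As a hedged illustration, the sketch below computes such memberships with the standard fuzzy c-means formula, u_i = 1 / Σ_k (d_i / d_k)^(2/(m−1)), where d_i is the distance from the object to prototype i. The choice of this formula, Euclidean distance, the fuzzifier m = 2, the function name and the example prototypes are all assumptions made for illustration. The reviewed studies may use other fuzzy clustering variants.

```python
from math import dist


def fuzzy_memberships(point, prototypes, m=2.0):
    """Degrees of membership of one object in each cluster prototype.

    Uses the fuzzy c-means membership formula with fuzzifier m > 1;
    the returned memberships are non-negative and sum to 1.
    """
    distances = [dist(point, v) for v in prototypes]
    if 0.0 in distances:
        # The object coincides with a prototype: give it full membership
        # there (shared equally if several prototypes coincide).
        zeros = distances.count(0.0)
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances)
            for d_i in distances]


# Two hypothetical segment prototypes and one customer lying between them.
prototypes = [(0.0, 0.0), (4.0, 0.0)]
u = fuzzy_memberships((1.0, 0.0), prototypes)
# u is approximately [0.9, 0.1]: mostly segment 0, partly segment 1
```

A hard (crisp) assignment can be recovered by taking the prototype with the largest membership. Larger values of m make the memberships more evenly spread.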
articles. Thirteen of the 18 variables were explained via a specific categorisation. Some variables took values that include more than one category, whilst for some of them one category was enough. In other words, some of the variables in certain articles needed to be explained with more than one category. Variables such as segmentability criteria, internal validity, and external validity can be given as examples of such cases. However, as can be seen from the table, some of the variables (year, journal name, SC technique, sample size, and industry) do not have any categorisation at all. This is because either the values for those factors were difficult to categorise or a categorisation was not necessary for them.

Regarding the explanation of each variable with reference to the previous sections, the following can be said. The objectives of segmentation in the studies were examined based on the five categories that were previously described in the study of Sausen et al. (2005), and each study was assigned to one category only. At the end of the coding procedure, all five objectives appeared in at least one journal article. For some articles, it was really hard to extract this information as these studies do not mention it clearly. It should be clarified that here “objective” does not refer to the objective of the article, as the objective of the article could be a comparison of different techniques. Rather, it is an identification process that finds the most suitable segmentation objective category for the selected studies in terms of the models that they employed; in other words, which of the specified segmentation objectives is the most suitable to describe the potential implication of the article. The unit of analysis was coded into five categories and, similar to the previous factor, the categorisation was taken from the same literature (Sausen et al., 2005). One extra category was added, labelled as “not available”, to represent studies where the identification of the unit of analysis was impossible. In some articles the unit of analysis is not clearly mentioned in the methodology part. Also, the category coded with one did not appear at the end of the coding process.

The segmentation variable factor was categorised into four categories, namely, general observable, product-specific observable, general unobservable, and product-specific unobservable. This categorisation is based on a common classification scheme accepted by segmentation scholars and suggested by Wedel and Kamakura (2000). As we mentioned earlier, for this factor some of the studies had more than one category. Similarly, the segmentation model categorisation is based upon the industrial segmentation literature. As we expected, none of the studies were assigned to the multi-stage model segmentation category.

The segmentation technology category was created to cover the major technologies of soft computing. There was no single study benefiting from rough computing in the selected literature. With regard to the segmentation technique, no categorisation was created, but the name of each technique was carefully recorded. For the studies where the soft computing technique is not named clearly, a categorisation of “not available” was used. A range of techniques from fuzzy clustering and genetic algorithm to self-organising maps and back propagation was obtained after finishing the coding process. The purpose of using these segmentation techniques on the segmentation problem is another factor included in the critical analysis. Based on the experience of the author, different categories associated with classification and clustering were created, as represented in the table.

Sample size and industry are other variables included in the study. Studies that do not mention the amount of data available were labelled as “not available”. However, from the majority of the studies it was possible to extract this information. Sample sizes ranging from two digits to five and six digits were found in the selected studies. Also, those studies were implemented in different industries such as manufacturing, banking, logistics, and charity. Moreover, the term “data source” indicates the origin or the source of the data.

The studies were examined based on six different segmentation criteria. As we discussed earlier, these are the criteria for an effective segmentation study. While there were studies that satisfy a few of them, some studies do not meet any of those criteria and they were labelled as “none” during the coding procedure.

Apart from the factors or variables mentioned above, the study also took into consideration some other issues related to technical aspects of those studies. The studies were examined, depending on the availability of the information, for whether or not they include any normalisation. For some studies this information was unavailable and they were coded as “not available” for this variable. Also, the studies that are clustering in nature were examined for whether they use any method (these are coded as categories from 1 to 5) to determine the number of segments (or clusters). However, for some studies this information is “not applicable”, and they were coded as such because the technique(s) is (are) used for classification purposes in those studies.

Finally, there are variables associated with the reliability and validity of those segmentation studies. Here, those factors are related to the core of the segmentation process (either in clustering or classification form) employed in these studies. Hence, the concern here is not the reliability and validity of those articles as a whole; the focus is only on the part where the segmentation takes place. The existence of reliability was recorded in a binary manner (yes or no). However, for internal and external validities, different categories were created based on the author’s experience and information available in the social science literature. This categorisation includes different ways of measuring internal and external validities as well as an option for the cases where there is no internal/external validity. As with other factors/variables whose categories are coded by numbers, there were cases where joint coding was necessary.

4.3. Coding reliability

All of the selected studies were coded independently by the two evaluators. In order to ensure consistency, initially a random sample of eight articles was coded and then the results were compared to measure the preliminary inter-rater reliability. The consistency between the two evaluators was measured as 78%. In order to improve this rate, a meeting was held to discuss the current discrepancies. The rest of the articles, 30 studies, were then coded with a 92% inter-rater reliability, which can be considered a quite acceptable rate compared to studies conducted with a somewhat similar objective. The discrepancies were resolved by the evaluators through reviewing the differences again, and consequently a joint decision was obtained by recoding the relevant items.

5. Results

This section highlights the major points regarding the descriptive results of the examination of each factor/variable used in the study. The results are presented in a structured way.

5.1. Distribution of studies by year, journal name and industry

Fig. 2 represents the distribution of the selected articles. The time period covers more than 25 years, from 1986 to 2012. Almost one fourth of the studies (n = 10, 23.8% of the total

studies located) were published in year 2004. Before 1999 there is an even distribution, with on average one article published each year. After that period, however, the average number of publications increased. After 1998, if the extreme case (year 2004) is excluded, the average number of publications per year would be three.

If we look at the distribution of the articles in terms of the journals in which they were published (see Table 5), it can easily be seen that more than 40% of the publications belong to one particular journal, namely Expert Systems with Applications. An interesting point is that the cumulative number of publications in marketing-related journals does not exceed 20% of the total publications.

Table 6 indicates that the studies were implemented in several industries covering different types of sectors. The tourism industry has the largest number of applications, with 19% of the total studies, and retail and e-business follow tourism as the second most implemented areas.

Table 4
The coding scheme for each factor or variable.

Critical factors or variables | Coding scheme (categorisation)
Year | None
Journal name | None
Objective | Exploitation of new customers potentials; Development of existing customer potentials; Increasing customer profitability; Improving targeting of marketing measures; Identification of new sub-markets
Unit of analysis | Not Available; Anonymous sub-markets; Anonymous groups or typologies of customers; Personalised existing customers; Personalised potential customers
Segmentation variable | General observable; Product specific observable; General unobservable; Product specific unobservable
Segmentation model | Single-stage models; Two-stage models; Multi-stage models
SC technology | Fuzzy computing; Neural computing; Evolutionary computing; Rough computing
SC technique | None
Purpose of the usage of SC technique | Clustering; Classification; Both clustering and classification; Contributory to clustering; Both clustering and contributory to clustering
Data type | Survey/Secondary data/Simulation data
Sample size | None
Industry | None
Segmentability criteria satisfied | None; Homogeneity; Substantiality; Identifiability; Actionability; Measurability; Differentiability
Normalisation | Yes/No/Not Available
Determining the number of segments | Not Applicable; Two-stage method; Special techniques; Agglomeration coefficient; Dendogram; Arbitrary
Reliability | Yes/No
Internal validity | Clustering efficiency; Statistical tests on non-clustering variables; None
External validity | Hold-out samples; Applying in another data; None

5.2. Distribution of studies by SC technology, technique and the purpose of usage of the techniques

With regard to the deployment of soft computing technologies across the studies, Table 7 shows that around 65% of the studies used neuro computing as the soft computing technology. There is only one recorded study that made use of rough computing. The usage of evolutionary and fuzzy computing is substantially lower than that of neuro computing. Also, three studies utilised both neural and evolutionary computing in a collaborative way in the form of hybrid soft computing.

Should one look at the distribution of the soft computing technologies across the industries, as shown in Table 8, it can be noticed that neuro computing technologies had applications in all industries. Rough computing technology was applied only in the e-business area, while the application of fuzzy and evolutionary computing technologies can be seen in half of the industries. Moreover, it is possible to see that three different soft computing technologies were utilised in the e-business, logistics and transportation, and manufacturing-automotive-food industries.

Pertaining to the above soft computing technologies, a variety of techniques were utilised, as can be seen from Table 9. This variety in fact mainly belongs to neuro computing, which consists of techniques including self-organising maps, back propagation, vector quantisation, and Hopfield–Kagmar algorithms. The self-organising maps method is the most utilised technique across studies, at 45% of total publications including its complementary usage with evolutionary algorithms. Also, the studies that do not specify the technique made use of neuro computing as well. Regarding fuzzy computing, only the fuzzy clustering technique was utilised, while for evolutionary computing, genetic algorithm and particle swarm optimization were used.

As far as the purpose of the usage of these techniques is concerned, Fig. 3 indicates that those techniques are used to perform the clustering task in more than 80% of the studies. It should be noted that this percentage also includes the usage of the techniques as contributory to the clustering task. The contribution of the technique stems from the purpose of either determining the number of clusters or increasing clustering efficiency.

5.3. Segmentation objective and unit of analysis

In terms of the pre-specified segmentation objectives, Table 10 shows that studies with the objectives of increasing customer profitability and developing existing customer potentials form the two highest categories. This is followed by the objective of improving the targeting of marketing measures. The objective related to new markets or customers appeared in only three studies.

Basically, this result is linked with the unit of analysis used across studies, as can be observed from Table 11. If we look at the frequencies and percentages, more than 80% of the studies used personalised existing customers as the unit of analysis. Among the pre-specified categories for unit of analysis, anonymous

sub-markets option was not used in a single study. We could say the anonymity of the customers or customer groups is a key factor for soft computing techniques to perform segmentation. This is because almost all of the soft computing techniques need an information table consisting of individual-level attributes in order to perform a clustering or classification task. Similar to traditional clustering analysis, the data should be presented to these techniques in matrix format (cases as rows and attributes as columns).

As far as the distribution of the objectives of those studies across industries is concerned, the objectives related to existing customers were pursued in the majority of the industries, while studies aimed at the exploitation of new customers or the identification of new markets were applied only in the banking-insurance-stock markets and e-business sectors, respectively, as shown in Table 12.

Fig. 2. Publications by year.

5.4. Segmentation variables and models and segmentability criteria

The results indicated that all types of segmentation variables were used in the studies, as shown in Table 13. However, general unobservable variables were used in only two studies, in one of which they were combined with general and product specific observable variables. The general observable category was used alone in two studies and occurred in 12 studies in different combinatorial categories. The highest percentages belong to the product specific observable and unobservable categories, each with a standalone usage rate of 28.6%. They were also involved in combinatorial categories with different usage rates.

With regard to the segmentation model used in those studies, it can be concluded that there is no application of the multi-stage model, as can be seen from Fig. 4. This is partly because almost all studies were conducted within the context of customer segmentation rather than within the scope of industrial or global market segmentation. More than 90% of the studies fall into the category of the single-stage model. However, it is noticeable that three studies utilised the two-stage segmentation model.

The results associated with the segmentability criteria are the most important aspect of this study. It could be argued that there is a real gap between applied science and social science research in terms of taking this issue as a priority. As mentioned before, the majority of those studies were written from the perspective of applied science. The analysis results shown in Table 14 indicated that segmentability criteria were not taken into consideration from the social science perspective. Almost 12% of the studies did not consider any of the criteria as a proof of segmentation effectiveness. Homogeneity was calculated in around 17% of these studies. Identifiability stands out with the highest percentage among the criteria. Also, the results showed that there are combinatorial categories (double and triple) with respect to the usage of those criteria. The combination of homogeneity, identifiability and actionability was measured in only five studies. In fact, whether a segmentation study can be considered effective depends on the measurement of those criteria. Furthermore, the measurement should not be based on only one of them but, if possible, should cover all criteria to prove effectiveness.

5.5. Factors related to the analysis stage of the studies

Table 15 provides the results of the examination of the articles with respect to the issues related to data analysis. The examination indicated that almost half of the studies clearly mentioned that they performed a normalisation process before conducting the data analysis. However, for almost the other half it was impossible to extract this information from the corresponding manuscripts. In six studies it was clear that no course of action regarding normalisation was taken.

The data used in those studies were usually either in the form of a questionnaire or secondary data procured from an external party. Although the results show that simulation (hypothetical) data was used in only one study, it is certain that some of the studies that used secondary data also used hypothetical data. However, during the coding process, if there were two different data sets in a particular study, one from a secondary source and the other hypothetical, the secondary data source was accepted as the main data type. In cases where the simulation data was considered as another sample, it was accepted that an analysis concerning external validation was carried out.

With regard to the actions taken to determine the number of clusters during the analysis stage, the results illustrated that sixteen studies did not utilise any of the specified methods. When there was a method for determining the number of clusters in a particular study, either a special technique from the clustering literature or the two-stage method was utilised. For the classification



techniques, this option was not applicable since the issue in those studies was in the form of binary classification.

An interesting result occurred when the reliability factor was explored. Approximately 65% of the studies did not include a reliability measure regarding the consistency of the clustering and/or classification methodology they employed. Among the remaining studies, 13 articles were considered to possess reliability since they either provided a reliability measure or conducted the analysis on at least two data sets and obtained consistent results. In the latter cases, reliability was ensured through obtaining stable results from replication of the analysis.

In connection with reliability, 14 studies carried out some analyses regarding external validity, as can be seen in Table 16. They assured external validity either by having a different data set or by allocating a hold-out sample. However, in general, external validity did not exist in 28 studies. In comparison with external validity, better results emerged with respect to the internal validity of the examined studies. In total, 31 studies have at least one measurement associated with internal validity. Yet no method or technique was carried out in 11 studies to ensure internal validity.

Table 5
Publications by journal name.

Journal name | Frequency | Percentage
Advanced engineering informatics | 1 | 2.4
Annals of tourism research | 1 | 2.4
Asian journal of management and humanity sciences | 1 | 2.4
Australasian marketing journal | 1 | 2.4
Computers and industrial engineering | 1 | 2.4
Computers and operations research | 1 | 2.4
Decision support systems | 1 | 2.4
European journal of marketing | 1 | 2.4
European journal of operational research | 2 | 4.8
Expert systems with applications | 18 | 42.9
Fuzzy sets and systems | 2 | 4.8
Industrial marketing management | 1 | 2.4
International journal of research in marketing | 2 | 4.8
Journal of operational research society | 1 | 2.4
Journal of organisational computing and electronic | 1 | 2.4
Journal of research in marketing | 1 | 2.4
Journal of retailing and consumer services | 1 | 2.4
Journal of travel and tourism marketing | 1 | 2.4
Journal of travel research | 1 | 2.4
Omega | 1 | 2.4
Tourism management | 2 | 4.8
Total | 42 | 100.0

Table 6
Publications by industry.

Industry | Frequency | Percentage
Banking & insurance & stock markets | 4 | 9.5
Charity & social club | 2 | 4.8
E-business | 7 | 16.7
Household & universal products | 4 | 9.5
Logistics and transportation | 3 | 7.1
Manufacturing & automotive & food | 4 | 9.5
Retail | 6 | 14.3
Telecommunication | 3 | 7.1
Tourism | 8 | 19.0
Not available | 1 | 2.4

Table 7
Publications by soft computing technology.

SC technology | Frequency | Percentage
Evolutionary computing | 5 | 11.6
Fuzzy computing | 6 | 13.9
Hybrid computing | 3 | 7.0
Neuro computing | 28 | 65.1
Rough computing | 1 | 2.0

Table 8
Publications by industry and SC technology.

Industry | Evolutionary computing | Fuzzy computing | Hybrid soft computing | Neuro computing | Rough computing
Banking & insurance & stock markets | 0 | 2 | 0 | 2 | 0
Charity & social club | 1 | 0 | 0 | 1 | 0
E-business | 1 | 1 | 0 | 4 | 1
Household & universal products | 0 | 1 | 0 | 3 | 0
Logistics and transportation | 0 | 1 | 1 | 1 | 0
Manufacturing & automotive & food | 1 | 1 | 0 | 2 | 0
Retail | 2 | 0 | 0 | 5 | 0
Telecommunication | 0 | 0 | 1 | 2 | 0
Tourism | 0 | 0 | 1 | 7 | 0
Not available | 0 | 0 | 0 | 1 | 0

6. Discussions for the future of SC in segmentation research

A large volume of data about customers has created opportunities for businesses and enables them to gain competitive advantage (Shaw et al., 2001). However, because of the lack of appropriate tools and techniques to analyse customer databases, a wide variety of customer information and buying patterns remain hidden in these databases (Shaw et al., 2001). Soft computing techniques have been attracting the attention of interdisciplinary researchers in this context. The application areas of soft computing are mainly human-related fields, ranging from manufacturing, automation and robotics to transportation and communication systems, that involve uncertainty and vagueness to some extent (Bonissone, Chen, Goebel, & Khedkar, 1999; Dote & Ovaska, 2001; Martinez et al., 2005). Those applications have stimulated other applications related to business and finance. In fact, Kordon (2006) pointed out eight future industrial needs associated with business and finance for which SC technologies can be very useful. The article lists four problem domains: predictive marketing, accelerated new product diffusion, manufacturing at economic optimum, and predictive optimal supply chain.

However, compared to other disciplines, it is difficult to see a growing number of applications of soft computing technologies to business and management problems, particularly customer segmentation. There may be four main reasons why SC has not received enough attention from academics and practitioners. The first reason can be linked to human factors in the context of "politics", as discussed in Kordon (2006). According to the author, although there are people who advocate the benefits of SC and are ready to take risks to implement it, there is also a lot of scepticism regarding the value of SC, as some people see the implementation efforts as a "research toy exercise". The second reason is

associated with the technical and methodological requirements (including data requirements) of these technologies, as additional analytical reasoning skills or training may be needed before implementing them. The third reason is that both researchers in social science and practitioners see those technologies as quite complex and hard to implement in a segmentation study. Although those technologies have some technical aspects, the majority of them are included in some of the latest data mining software, and they are as easy to use as their counterpart data analysis tools. The last reason stems from the gap between applied science researchers and social scientists. This is either because the researchers in both parties may not be completely aware of each other's studies or partly because the two research orientations follow different scientific paradigms. A systematic solution to the credibility of SC in general (which can also be applied to social science research and practice) is provided by Kordon (2006), to which interested readers can refer.

Table 9
Usage of SC technique across the studies.

SC technique | Frequency | Percentage
Back propagation | 5 | 11.9
FSCL | 1 | 2.4
Fuzzy clustering | 6 | 14.3
Genetic algorithm | 4 | 9.5
Hopfield–Kagmar | 1 | 2.4
Self-organising maps | 15 | 35.7
Self-organising maps and genetic algorithm | 2 | 4.8
Self-organising maps and particle swarm optimisation | 2 | 4.8
Vector quantisation | 3 | 7.1
Rough set theory | 1 | 2.4
Not available | 2 | 4.8

Table 10
Segmentation objective across studies.

Objective | Frequency | Percentage
Exploitation of new customers potentials | 2 | 4.8
Development of existing customer potentials | 14 | 33.3
Increasing customer profitability | 13 | 31.0
Improving targeting of marketing measures | 10 | 23.8
Identification of new sub-markets | 3 | 7.1

Table 11
Unit of analysis across studies.

Unit of analysis | Frequency | Percentage
Not available | 1 | 2.4
Anonymous groups or typologies of customers | 4 | 9.5
Personalised existing customers | 35 | 83.3
Personalised potential customers | 2 | 4.8

Fig. 3. Purpose of the usage of SC technique in publications.

Although business and finance applications may not be as sophisticated as science and engineering applications, SC can still play a role in business-related problems; the applicability of SC to social science problems represents a significant paradigm shift (breakthrough) in the aim of computing, one that reflects the remarkable ability of the human mind to store and process information (Dote & Ovaska, 2001). Within the scope of customer segmentation, although the techniques based on statistical approaches to classifying customers into segments have met with various degrees of success, it is noteworthy that those approaches are not capable of handling large amounts of data and do not provide as flexible a segmentation structure as soft computing technologies do.

The soft computing technologies can be used individually, integrated or in combination as hybridizations like neuro-fuzzy, fuzzy-neuro, fuzzy-genetic, genetic-fuzzy, neuro-genetic, rough-neuro, rough-fuzzy, rough-neuro-fuzzy, rough-neuro-genetic or rough-neuro-fuzzy-genetic (Bonissone et al., 1999; Komorowski et al., 1999; Mitra & Hayashi, 2000; Mitra et al., 2002; Pal et al., 2002). The main difference between the usages of these techniques, either in an integrated or combinatorial manner, is that combination consists of merging two or more techniques in a unitary structure, while integration refers to a dividable series of continua. In the customer segmentation literature most of them were used individually rather than integrated or hybridised. In other words, an appropriate combination of those techniques has not been accomplished yet. Appropriateness here means obtaining efficient clusters from both the clustering and marketing points of view. Therefore, one needs to look at methodological frameworks that give an idea of how to combine different clustering or classification techniques in an appropriate and, more importantly, simplified way. This simplicity is needed partly because the majority of researchers in the business and management area may have difficulty understanding the technical background of the methods developed by information technologists and computer scientists.

When it comes to the application of soft computing technologies, there are however some critical issues regarding their usage in a specific problem domain (Mitra et al., 2002). These are (1) scalability problem, (2) feature evaluation and dimensionality reduction, (3) choice of metrics and evaluation



techniques for dynamic changes in data, (4) incorporation of domain knowledge and user interaction, (5) efficient integration of soft computing tools. The majority of those issues stem from the contemporary challenges of data mining itself. What is more important to point out is that the integration or hybridisation of these technologies in different data mining application domains has become an important future research area.

Table 12
Publications by industry and segmentation objective.

Industry | Exploitation of new customers potentials | Development of existing customer potentials | Increasing customer profitability | Improving targeting of marketing measures | Identification of new sub-markets
Banking & insurance & stock markets | 1 | 0 | 3 | 0 | 0
Charity & social club | 0 | 0 | 2 | 0 | 0
E-business | 0 | 3 | 0 | 2 | 2
Household & universal products | 1 | 1 | 0 | 2 | 0
Logistics and transportation | 0 | 1 | 1 | 1 | 0
Manufacturing & automotive & food | 0 | 1 | 1 | 2 | 0
Retail | 0 | 2 | 4 | 0 | 0
Telecommunication | 0 | 2 | 1 | 0 | 0
Tourism | 0 | 4 | 1 | 2 | 1
Not available | 0 | 0 | 0 | 1 | 0

Table 13
The usage of segmentation variables across studies.

Segmentation variables | Frequency | Percentage
General observable | 2 | 4.8
General unobservable | 1 | 2.4
Product specific observable | 12 | 28.6
Product specific unobservable | 12 | 28.6
General observable & product specific observable | 6 | 14.3
General observable & product specific unobservable | 2 | 4.8
Product specific observable & product specific unobservable | 1 | 2.4
General unobservable & product specific unobservable | 2 | 4.8
General observable & product specific observable & general unobservable | 1 | 2.4
General observable & product specific observable & product specific unobservable | 3 | 7.1

Table 14
The deployment of segmentability criteria across studies.

Segmentability criteria satisfied | Frequency | Percentage
None | 5 | 11.9
Homogeneity | 7 | 16.7
Identifiability | 8 | 19.0
Actionability | 1 | 2.4
Homogeneity & substantiality | 1 | 2.4
Homogeneity & identifiability | 11 | 26.2
Homogeneity & actionability | 1 | 2.4
Identifiability & actionability | 2 | 4.8
Homogeneity & identifiability & actionability | 5 | 11.9
Substantiality & identifiability & differentiability | 1 | 2.4

Table 15
Publications by factors related to the analysis (normalisation, data type, determination of the number of clusters).

Normalisation | Frequency | Data type | Frequency | Determination of no. of clusters | Frequency
Not available | 17 | Survey | 20 | Not applicable | 5
No | 6 | Secondary data | 21 | Two-stage method | 8
Yes | 19 | Simulation data | 1 | Special techniques | 9
 | | | | Dendogram | 2
 | | | | Arbitrary | 16
 | | | | Two-stage method & special techniques | 2

Table 16
Internal and external validity status of the studies.

Internal validity | Frequency | External validity | Frequency
Clustering efficiency | 25 | Hold-out sample | 3
Statistical tests on non-clustering variables | 1 | Different data | 10
Clustering efficiency & statistical tests on non-clustering variables | 5 | Hold-out sample & different data | 1
None | 11 | None | 28

Fig. 4. Segmentation model across studies.

Should one consider the fact that data mining researchers highlight especially the necessity of dealing with web data, there is still much to research on the application of those technologies either in single or

integrated/combinatorial way within the context of web mining (Pal et al., 2002). In particular, within the web mining domain, performing different segmentation studies by making use of soft computing technologies could be of great importance for further studies.

Although soft computing is a major area of academic research, the concept is still in its evolving stage, and new methodologies, e.g., chaos computing and immune networks, are nowadays considered to belong to SC (Dote & Ovaska, 2001). In conclusion, the advances in fuzzy systems such as investigations on computing with words, cognitive and reactive distributed artificial intelligence applications including intelligent agents, the emerging applications of evolutionary computing including meta-heuristics, probabilistic models, and rough computing will lead us to the construction of more advanced intelligent systems, which can also be applicable to business problems (Kordon, 2006; Verdegay, Yager, & Bonissone, 2008).

References

Alfansi, L., & Sargeant, A. (2000). Market segmentation in the Indonesian banking sector: the relationship between demographics and desired customer benefits. International Journal of Bank Marketing, 18, 64–74.
Al-Khatib, J. A., Stanton, A. A., & Rawwas, M. Y. A. (2005). Ethical segmentation of consumers in developing countries: a comparative analysis. International Marketing Review, 22, 225–246.
Allred, C. R., Smith, S. M., & Swinyard, W. R. (2006). E-shopping lovers and fearful conservatives: a market segmentation analysis. International Journal of Retail & Distribution Management, 34, 308–333.
Assael, H., & Roscoe, A. M. (1976). Approaches to market segmentation analysis. Journal of Marketing, 40, 67–76.
Bailey, C., Baines, P. R., Wilson, H., & Clark, M. (2009). Segmentation and customer insight in contemporary services marketing practice: why grouping customers is no longer enough. Journal of Marketing Management, 25, 227–252.
Balakrishnan, P. V. S., Cooper, M. C., Jacob, V. S., & Lewis, P. A. (1996). Comparative performance of the FSCL neural net and k-means algorithm for market segmentation. European Journal of Operational Research, 93, 346–357.
Barnett, N. L. (1969). Beyond market segmentation. Harvard Business Review, 47, 152–166.
Bass, F. M., Tigert, D. J., & Lonsdale, R. T. (1968). Market segmentation: group versus individual behavior. Journal of Marketing Research, 5, 264–270.
Bayer, J. (2010). Customer segmentation in the telecommunications industry. Database Marketing & Customer Strategy Management, 17, 247–256.
Beane, T. P., & Ennis, D. M. (1987). Market segmentation: a review. European Journal of Marketing, 21, 20–42.
Bezdek, J. C., & Pal, N. R. (1998). Some new indexes of cluster validity. IEEE Transactions on Systems, Man, and Cybernetics, 28, 301–315.
Bhattacharyya, S. (2000). Evolutionary algorithms in data mining: multi-objective performance modelling for direct marketing. In Proceedings of the ACM-SIGKDD international conference on knowledge discovery and data mining, 2000 Boston (pp. 465–473). ACM.
Biggadike, E. R. (1981). The contributions of marketing to strategic management. Academy of Management Review, 6, 621–632.
Bloom, J. Z. (2005). Market segmentation: a neural network application. Annals of Tourism Research, 32, 93–111.
Bock, T., & Uncles, M. (2002). A taxonomy of differences between consumers for market segmentation. International Journal of Research in Marketing, 19, 215–224.
Bonissone, P., Chen, Y., Goebel, K., & Khedkar, P. S. (1999). Hybrid soft computing systems: industrial and commercial applications. Proceedings of the IEEE, 1641–1667.
Chan, C. C. (1998). A rough set approach to attribute generalization in data mining. Journal of Information Sciences, 107, 169–176.
Chan, C. C. (2008). Intelligent value-based customer segmentation method for campaign management: a case study of automobile retailer. Expert Systems with Applications, 34, 2754–2762.
Changchien, S. W., & Lu, T. Z. (2001). Mining association rules procedure to support on-line recommendation by customers and products fragmentation. Expert Systems with Applications, 20, 325–335.
Chaturverdi, A., Carroll, J. D., Green, P. E., & Rotondo, J. A. (1997). A feature-based approach to market segmentation via overlapping K-centroids clustering. Journal of Marketing Research, 34, 370–377.
Chiu, C. (2002). A case-based customer classification approach for direct marketing. Expert Systems with Applications, 22, 163–168.
Chiu, C.-Y., Chen, Y.-F., Kuo, I., & Kun, H. C. (2009). An intelligent market segmentation system using k-means and particle swarm optimization. Expert Systems with Applications, 36, 4558–4565.
Choffray, J. M., & Lilien, G. L. (1980). Market planning for new industrial products. New York: Wiley.
Chou, C., Su, M., & Lai, E. (2004). A new cluster validity measure and its application to image compression. Pattern Analysis and Applications, 7, 205–220.
Chung, K. Y., Oh, S. Y., Kim, S. S., & Han, S. Y. (2004). Three representative market segmentation methodologies for hotel guest room customers. Tourism Management, 24, 429–441.
Claycamp, H. J., & William, F. M. (1968). A theory of market segmentation. Journal of Marketing Research, 5, 388–394.
Craven, M. W., & Shavlik, J. W. (1997). Using neural networks for data mining. Future Generation Computer Systems, 13, 211–229.
Crespo, F., & Weber, R. (2005). A methodology for dynamic data mining based on fuzzy clustering. Fuzzy Sets and Systems, 150, 267–284.
Day, G. S. (1990). Market-driven strategy: process for creating value. New York: The Free Press.
Desarbo, W. S., Atalay, A. S., Lebaron, D., & Blanchard, S. J. (2008). Estimating multiple consumer segment ideal points from context-dependent survey data. Journal of Consumer Research, 35, 142–153.
Dhalla, N. K., & Mahatoo, W. H. (1976). Expanding the scope of segmentation research. Journal of Marketing, 40, 34–41.
Dias, J. G., & Vermunt, J. K. (2007). Latent class modeling of website users' search patterns: implications for online market segmentation. Journal of Retailing and Consumer Services, 14, 359–368.
Dibb, S. (1995). Developing a decision tool for identifying operational and attractive segments. Journal of Strategic Marketing, 3, 189–203.
Dibb, S. (1999). Criteria guiding segmentation implementation: reviewing the evidence. Journal of Strategic Marketing, 7, 107–129.
Dibb, S., & Simkin, L. (1997). A program for implementing market segmentation. Journal of Business & Industrial Marketing, 12, 51–65.
Dibb, S., & Simkin, L. (2010). Judging the quality of customer segments: segmentation effectiveness. Journal of Strategic Marketing, 18, 113–131.
Diez, J., Coz, J. J., Luacez, O., & Bahamonde, A. (2008). Clustering people according to their preference criteria. Expert Systems with Applications, 34, 1274–1284.
Dimitriadou, E., Dolnicar, S., & Weingessel, A. (2002). An examination of indexes for determining the number of clusters in binary data sets. Psychometrika, 67, 137–160.
Dolnicar, S. (2002). A review of unquestioned standards in using cluster analysis for data-driven market segmentation. In Australian and New Zealand marketing academy conference.
Dolnicar, S. (2003). Using cluster analysis for market segmentation – typical misconceptions, established methodological weaknesses and some recommendations for improvement. Journal of Market Research, 11, 5–12.
Dolnicar, S. (2004). Beyond commonsense segmentation: a systematics of segmentation approaches in tourism. Journal of Travel Research, 42, 244–250.
Dolnicar, S., Freitag, R., & Randle, M. (2005). To segment or not to segment?: an investigation of segmentation strategy success under varying market conditions. Australasian Marketing Journal, 13, 20–35.
Dote, Y., & Ovaska, S. J. (2001). Industrial applications of soft computing: a review. Proceedings of the IEEE, 1243–1265.
Doyle, P., & Saunders, J. (1985). Market segmentation and positioning in specialized industrial markets. Journal of Marketing Research, 49, 24–32.
Dubois, D., & Prade, H. (1998). Soft computing, fuzzy logic and artificial intelligence. Soft Computing, 2, 7–11.
Duntsch, I., & Gediga, G. (2000). Rough set data analysis. Encyclopaedia of computer science and technology. New York: Marcel Dekker.
Estivill-Castro, V. (2002). Why so many clustering algorithms – a position paper. SIGKDD Explorations, 4, 65–75.
Fish, K., Barnes, J., & Aiken, M. (1995). Artificial neural networks: a new methodology for industrial market segmentation. Industrial Marketing Management, 24, 432–438.
Freitas, A. A. (2002). A survey of evolutionary algorithms for data mining and knowledge discovery. In A. Ghosh & S. S. Tsutsui (Eds.), Advances in evolutionary computation. Berlin: Springer Verlag.
Gensch, D. H. (1978). Image-measurement segmentation. Journal of Marketing Research, 15, 384–394.
Ghosh, A., & Bhabesh, N. (2004). Multi-objective rule mining using genetic algorithms. Information Sciences, 163, 123–133.
Gil-Saura, I., & Ruiz-Molina, M.-E. (2008). Customer segmentation based on commitment and ICT use. Industrial Management & Data Systems, 109, 206–223.
Goller, S., Hogg, A., & Kalafatis, S. P. (2002). A new research agenda for business segmentation. European Journal of Marketing, 36, 252–271.
Goyat, S. (2011). The basis of market segmentation: a critical review of literature. European Journal of Business and Management, 3, 45–54.
Chen, J. S. (2003). Developing a travel segmentation methodology: a criterion-based Green, P. E. (1977). A new approach to market segmentation. Business Horizons, 20,
approach. Journal of Hospitality & Tourism Research, 27, 310–327. 61–73.
Cheng, C.-H., & Chen, Y.-S. (2009). Classifying the segmentation of customer value Green, P. E., & Carmone, F. J. (1977). Segment congruence analysis: a method for
via RFM model and RS theory. Expert Systems with Applications, 36, 4176–4184. analysing association among alternative bases for market segmentation. Journal
Cheron, E. J., & Kleinschmidt, E. J. (1985). A review of industrial market of Consumer Research, 3, 217–222.
segmentation research and a proposal for an integrated segmentation Green, P. E., Carmone, F. J., & Wachspress, D. P. (1976). Consumer segmentation via
framework. International Journal of Research in Marketing, 2, 101–115. latent class analysis. Journal of Consumer Research, 3, 170–174.
Green, P. E., & Krieger, A. M. (1991). Segmenting markets with conjoint analysis. Journal of Marketing, 55, 20–31.
Greenberg, M., & McDonald, S. S. (1989). Successful needs/benefits segmentation: a user’s guide. The Journal of Consumer Marketing, 6, 29–36.
Griffin, G., & Chen, Z. (1998). Rough set extension of Tcl for data mining. Knowledge-Based Systems, 11, 249–253.
Ha, S. H. (2007). Applying knowledge engineering techniques to customer analysis in the service industry. Advanced Engineering Informatics, 21, 293–301.
Haley, R. I. (1968). Benefit segmentation: a decision-oriented research tool. Journal of Marketing, 32, 30–35.
Hruschka, H. (1986). Market definition and segmentation using fuzzy clustering methods. International Journal of Research in Marketing, 3, 117–134.
Hruschka, H., Fettes, W., & Probst, M. (2004). Market segmentation by maximum likelihood clustering using choice elasticities. European Journal of Operational Research, 154, 779–786.
Hruschka, H., & Natter, M. (1999). Comparing performance of feed-forward neural nets and k-means for cluster-based market segmentation. European Journal of Operational Research, 114, 346–353.
Hu, X. & Cercone, N. (1994). Discovery of decision rules in relational databases: a rough set approach. In CIKM’94, 1994 Gaithersburg (pp. 392–400).
Hu, X. & Cercone, N. (1996). Mining knowledge rules from database: a rough set approach. In Proceedings 12th international conference on data engineering, 1996 Washington (pp. 96–105).
Hu, T., & Sheu, J. (2003). A fuzzy-based customer classification method for demand-responsive logistical distribution operations. Fuzzy Sets and Systems, 139, 431–459.
Hung, C., & Tsai, C.-F. (2008). Market segmentation based on hierarchical self-organizing map for markets of multimedia on demand. Expert Systems with Applications, 34.
Hurley, S., Moutinho, L., & Stephens, N. M. (1995). Solving marketing optimization problems using genetic algorithms. European Journal of Marketing, 29, 39–56.
Johnson, R. M. (1971). Market segmentation: a strategic management tool. Journal of Marketing Research, 8, 13–18.
Jonker, J., Piersma, N., & Poel, D. V. (2004). Joint optimization of customer segmentation and marketing policy to maximize long-term profitability. Expert Systems with Applications, 27, 159–168.
Ketchen, D. J. J., & Shook, C. L. (1996). The application of cluster analysis in strategic management research: an analysis and critique. Strategic Management Journal, 17, 441–458.
Kim, K., & Ahn, H. (2008). A recommender system using GA k-means clustering in an online shopping market. Expert Systems with Applications, 34, 1200–1209.
Kim, Y., Street, W. N., & Menczer, F. (2001). An evolutionary multi-objective local selection algorithm for customer targeting. In Proceedings of the 2001 congress on evolutionary computation, 2001 Seoul (pp. 759–766).
Kim, S.-Y., Jung, T.-S., Suh, E.-H., & Hwang, H.-S. (2006). Customer segmentation and strategy development based on customer lifetime value: a case study. Expert Systems with Applications, 31, 101–107.
Kim, Y., & Street, W. N. (2004). An intelligent system for customer targeting: a data mining approach. Decision Support Systems, 37, 215–228.
Kim, Y., Street, W. N., Russell, G. J., & Menczer, F. (2005). Customer targeting: a neural network approach guided by genetic algorithms. Management Science, 51, 264–276.
Komorowski, J., Polkowski, L., & Skowron, A. (1999). Rough sets: a tutorial. Rough-fuzzy hybridization: a new trend in decision making. Singapore: Springer-Verlag.
Kontoleon, A., & Yabe, M. (2006). Market segmentation analysis of preferences for GM derived animal foods in the UK. Journal of Agricultural & Food Industrial Organization, 4, 1–38.
Kordon, A. K. (2006). Future trends in soft computing industrial applications. In IEEE international conference on fuzzy systems (pp. 1663–1670).
Kotler, P. (2003). Marketing management. New Jersey: Prentice-Hall.
Kovacs, F., Legany, C., & Babos, A. (2005). Cluster validity measurement techniques. In 6th international symposium of Hungarian researchers on computational intelligence, 2005 Budapest.
Kuo, R. J., An, Y. L., Wang, H. S., & Chung, W. J. (2006). Integration of self-organizing feature maps neural network and genetic K-means algorithm for market segmentation. Expert Systems with Applications, 30, 313–324.
Kuo, R. J., Ho, L. M., & Hu, C. M. (2002a). Cluster analysis in industrial market segmentation through artificial neural network. Computers and Industrial Engineering, 42, 391–399.
Kuo, R. J., Ho, L. M., & Hu, C. M. (2002b). Integration of self-organizing feature map and k-means algorithm for market segmentation. Computers and Operations Research, 29, 1475–1493.
Kusiak, A. (2000). Evolutionary computation and data mining. In Proceedings of the SPIE conference on intelligent systems and advanced manufacturing, 2000 Boston (pp. 1–10).
Lee, C., Lee, Y., & Wicks, B. E. (2004). Segmentation of festival motivation by nationality and satisfaction. Tourism Management, 25, 61–70.
Lee, J. H., & Park, S. C. (2005). Intelligent profitable customers segmentation system based on business intelligence tools. Expert Systems with Applications, 29, 145–152.
Lee, S. C., Suh, Y. H., Kim, J. K., & Lee, K. J. (2004). A cross-national market segmentation of online game industry using SOM. Expert Systems with Applications, 27, 559–570.
Leung, C.-H. (2009). An inductive learning approach to market segmentation based on customer profile attributes. Asian Journal of Marketing, 3, 65–81.
Li, R., & Wang, Z. (2004). Mining classification rules using rough sets and neural networks. European Journal of Operational Research, 157, 439–448.
Li, J., Wang, K., & Xu, L. (2009). Chameleon based on clustering feature tree and its application in customer segmentation. Annals of Operations Research, 168, 225–245.
Liao, S. (2003). Knowledge management technologies and applications: literature review from 1995 to 2002. Expert Systems with Applications, 25, 155–164.
Lien, C.-H., Ramirez, A., & Haines, G. H. (2006). Capturing and evaluating segments: using self-organizing maps and k-means in market segmentation. Asian Journal of Management and Humanity Sciences, 1, 1–15.
Lingras, P. J., & Yao, Y. Y. (1998). Data mining using extensions of the rough set model. Journal of the American Society for Information Science, 49, 415–422.
Liu, D., & Shih, Y. (2004). Hybrid approaches to product recommendation based on customer lifetime value and purchase preferences. The Journal of Systems and Software, 77, 181–191.
Liu, D., & Shih, Y. (2005). Integrating AHP and data mining for product recommendation based on customer lifetime value. Information & Management, 42, 387–400.
Martinez, E., Magoulas, G., Chen, S., & Macredie, R. (2005). Modeling human behavior in user-adaptive systems: recent advances using soft computing techniques. Expert Systems with Applications, 29, 320–329.
Mazanec, J. A. (1992). Classifying tourists into market segments: a neural network approach. Journal of Travel and Tourism Marketing, 1, 39–59.
McCarty, J. A., & Hastak, M. (2007). Segmentation approaches in data-mining: a comparison of RFM, CHAID, and logistic regression. Journal of Business Research, 60, 656–662.
Mitra, S., & Acharya, T. (2003). Data mining: multimedia, soft computing, and bioinformatics. New Jersey: Wiley-Interscience.
Mitra, S., & Hayashi, Y. (2000). Neuro-fuzzy rule generation: survey in soft computing framework. IEEE Transactions on Neural Networks, 11, 748–768.
Mitra, S., Pal, S. K., & Mitra, P. (2002). Data mining in soft computing framework: a survey. IEEE Transactions on Neural Networks, 13, 3–14.
Myers, J. H., & Tauber, E. (1977). Market structure analysis. Chicago: American Marketing Association.
Nairn, A., & Berthon, P. (2003). Creating the customer: the influence of advertising on consumer market segments. Journal of Business Ethics, 42, 83–99.
Natter, M. (1999). Conditional market segmentation by neural networks: a monte-carlo study. Journal of Retailing and Consumer Services, 6, 237–248.
Novak, V. (1998). Towards formal theory of soft computing. Soft Computing, 2, 4–6.
Ozer, M. (2001). User segmentation of online music services using fuzzy clustering. Omega, 29, 193–206.
Pal, S. K., Talwar, V., & Mitra, P. (2002). Web mining in soft computing framework: relevance, state of the art and future directions. IEEE Transactions on Neural Networks, 13, 1163–1177.
Pedrycz, W. (1998). Fuzzy set technology in knowledge discovery. Fuzzy Sets and Systems, 98, 279–290.
Peltier, J. M., & Schribrowsky, J. A. (1997). The use of need-based segmentation for developing segment-specific direct marketing strategies. Journal of Direct Marketing, 11, 54–62.
Poel, D. V., & Piesta, Z. (1998). Purchase prediction in database marketing with the probrough system. In RSCTC’98 (pp. 593–600). Berlin: Springer-Verlag.
Punj, G., & Steward, D. W. (1983). Cluster analysis in marketing research: review and suggestions for applications. Journal of Marketing Research, 20, 134–148.
Raaij, W. F., & Verhallen, T. M. M. (1994). Domain-specific market segmentation. European Journal of Marketing, 28, 49–66.
Rajiv, G., & Srinivasan, V. (1987). A simultaneous approach to market segmentation and market structuring. Journal of Marketing Research, 24, 139–153.
Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66, 846–850.
Sausen, K., Tomczak, T., & Herrmann, A. (2005). Development of a taxonomy of strategic market segmentation: a framework for bridging the implementation gap between normative segmentation and business practice. Journal of Strategic Marketing, 13, 151–173.
Segal, M. N., & Giacobbe, R. W. (1994). Market segmentation and competitive analysis for supermarket retailing. International Journal of Retail & Distribution Management, 22, 38–48.
Setnes, M. (2000). Supervised fuzzy clustering for rule extraction. IEEE Transactions on Fuzzy Systems, 8, 416–424.
Shapiro, B. P., & Bonoma, T. V. (1984). How to segment industrial markets. Harvard Business Review, 62, 104–110.
Sharma, A., & Lambert, D. M. (1994). Segmentation of markets based on customer service. International Journal of Physical Distribution & Logistics Management, 24, 50–58.
Shavelson, R. J. (1988). Statistical reasoning for the behavioral sciences. Boston: Allyn and Bacon.
Shaw, M. J., Subramaniam, C., Tan, G. W., & Welge, M. E. (2001). Knowledge management and data mining for marketing. Decision Support Systems, 31, 127–137.
Shih, Y., & Liu, C. (2003). A method for customer lifetime value ranking: combining analytic hierarchy process and clustering analysis. Journal of Database Marketing & Customer Strategy Management, 11, 159–172.
Shin, H. W., & Sohn, S. Y. (2004). Segmentation of stock trading customers according to potential value. Expert Systems with Applications, 27, 27–33.
Shiraz, G. M., Marks, R. E., Midgley, D. F., & Cooper, L. G. (1998). Using genetic algorithms to breed competitive marketing strategies. In IEEE international conference on systems, man and cybernetics (pp. 2367–2372).
Shoemaker, S. (1994). Segmenting the U.S. travel market according to benefits realised. Journal of Travel Research, 32, 8–21.
Smith, G., & Hirst, A. (2001). Strategic political segmentation: a new approach for a new era of political marketing. European Journal of Marketing, 35, 1058–1073.
Smith, K. A., & Gupta, J. N. D. (2000). Neural networks in business: techniques and applications for the operations researchers. Computers & Operations Research, 127, 1023–1044.
Smith, K. A., Willis, R. J., & Brooks, M. (2002). An analysis of customer retention and insurance claim patterns using data mining: a case study. Journal of Operational Research Society, 51, 532–541.
Smith, W. R. (1956). Product differentiation and market segmentation as an alternative marketing strategy. Journal of Marketing, 21, 3–8.
Steenkamp, J. E. M., & Hofstede, F. T. (2002). International market segmentation: issues and perspectives. International Journal of Research in Marketing, 19, 185–213.
Suh, E. H., Noh, K. C., & Suh, C. K. (1999). Customer list segmentation using the combined response model. Expert Systems with Applications, 17, 89–97.
Sun, S. (2009). An analysis on the conditions and methods of market segmentation. International Journal of Business and Management, 4.
Terano, T. & Ishino, Y. (1995). Marketing data analysis using inductive learning and genetic algorithms with interactive and automated phases. In IEEE international conference, evolutionary computation (pp. 771–776).
Tsai, C. Y., & Chiu, C. C. (2004). A purchase-based market segmentation methodology. Expert Systems with Applications, 27, 265–276.
Tseng, T., & Huang, C. (2007). Rough set-based approach to feature selection in customer relationship management. Omega, 35, 365–383.
Tsiotsou, R. (2006). Using visit frequency to segment ski resorts customers. Journal of Vacation Marketing, 12, 15–26.
Tynan, A. C., & Drayton, J. (1987). Market segmentation. Journal of Marketing Management, 1, 301–335.
Tyndale, P. (2002). A taxonomy of knowledge management software tools: origins and applications. Evaluation and Program Planning, 25, 183–190.
Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in business: a survey of applications (1992–1998). Expert Systems with Applications, 17, 51–70.
Verdegay, J. L., Yager, R. R., & Bonissone, P. P. (2008). On heuristics as a fundamental constituent of soft computing. Fuzzy Sets and Systems, 159, 846–855.
Vesanto, J., & Alhoniemi, E. (2000). Clustering of the self-organizing map. IEEE Transactions on Neural Networks, 11, 586–600.
Voges, K., Pope, N., & Brown, M. (2003). A rough cluster analysis of shopping orientation data. In ANZMAC proceedings 2003, Adelaide (pp. 1625–1631).
Vriens, M. (2001). Market segmentation: analytical developments and application guidelines. Millward Brown IntelliQuest.
Walters, P. G. P. (1997). Global market segmentation: methodologies and challenges. Journal of Marketing Management, 13, 165–177.
Wang, C. H. (2009). Outlier identification and market segmentation using kernel-based clustering techniques. Expert Systems with Applications, 36, 3744–3750.
Wang, P., & Tan, S. (1997). Soft computing and fuzzy logic. Soft Computing, 1, 35–41.
Wedel, M., & Kamakura, W. (2000). Market segmentation: conceptual and methodological foundations. Norwell, MA: Kluwer Academic Publishing.
Wedel, M., & Kistemaker, C. (1989). Consumer benefit segmentation using clusterwise linear regression. International Journal of Research in Marketing, 6, 45–59.
Wedel, M., & Steenkamp, J. E. M. (1989). Fuzzy clusterwise regression approach to benefit segmentation. International Journal of Research in Marketing, 6, 241–258.
Wilkie, W. L., & Cohen, J. B. (1977). An overview of market segmentation: behavioral concepts and research approaches. Cambridge: Marketing Science Institute.
Wind, Y. (1978). Issues and advances in segmentation research. Journal of Marketing Research, 15, 317–337.
Wind, Y., & Cardozo, R. (1974). Industrial market segmentation. Industrial Marketing Management, 3, 153–166.
Wind, Y., & Lerner, D. (1979). On the measurement of purchase data: surveys versus purchase diaries. Journal of Marketing Research, 16, 39–47.
Wu, R.-S., & Chou, P.-H. (2011). Customer segmentation of multiple category data in e-commerce using a soft-clustering approach. Electronic Commerce Research and Applications, 10, 331–341.
Xia, J., Evans, F. H., Spilsbury, K., Ciesielski, V., Arrowsmith, C., & Wright, G. (2010). Market segments based on the dominant movement patterns of tourists. Tourism Management, 31, 464–469.
Xie, X. L., & Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13, 841–847.
Yankelovich, D. (1964). New criteria for market segmentation. Harvard Business Review, 42, 83–90.
Yankelovich, D., & Meer, D. (2006). Rediscovering market segmentation. Harvard Business Review, 84, 121–132.
Young, S., Ott, L., & Feigin, B. (1978). Some practical considerations in market segmentation. Journal of Marketing Research, 15, 405–412.
Zadeh, L. A. (1994). Soft computing and fuzzy logic. IEEE Software, 11, 48–56.
Zahavi, J., & Levin, N. (1997). Applying neural computing to target marketing. Journal of Direct Marketing, 11, 5–22.
Zhong, N., & Skowron, A. (2001). A rough set-based knowledge discovery process. International Journal of Applied Mathematics and Computer Science, 11, 603–619.
