978-3-642-05253-8
Artificial Intelligence
and Computational
Intelligence
International Conference, AICI 2009
Shanghai, China, November 7-8, 2009
Proceedings
Series Editors
Randy Goebel, University of Alberta, Edmonton, Canada
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Wolfgang Wahlster, DFKI and University of Saarland, Saarbrücken, Germany
Volume Editors
Hepu Deng
RMIT University, School of Business Information Technology
City Campus, 124 La Trobe Street, Melbourne, Victoria 3000, Australia
E-mail: [email protected]
Lanzhou Wang
China Jiliang University, College of Metrological Technology and Engineering
Hangzhou 310018, Zhejiang, China
E-mail: [email protected]
Fu Lee Wang
City University of Hong Kong, Department of Computer Science
83 Tat Chee Avenue, Kowloon Tong, Hong Kong, China
E-mail: [email protected]
Jingsheng Lei
Hainan University, College of Information Science and Technology
Haikou 570228, China
E-mail: [email protected]
ISSN 0302-9743
ISBN-10 3-642-05252-5 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-05252-1 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2009
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12781256 06/3180 543210
Preface
Organizing Committee

General Co-chairs
Program Committee Co-chairs
Local Arrangements Chair
Proceedings Co-chairs
Publicity Chair
Sponsorship Chair
Program Committee
Ahmad Abareshi RMIT University, Australia
Stephen Burgess Victoria University, Australia
Jennie Carroll RMIT University, Australia
Eng Chew University of Technology Sydney, Australia
Vanessa Cooper RMIT University, Australia
Minxia Luo China Jiliang University, China
Tayyab Maqsood RMIT University, Australia
Ravi Mayasandra RMIT University, Australia
Elspeth McKay RMIT University, Australia
Alemayehu Molla RMIT University, Australia
Konrad Peszynski RMIT University, Australia
Siddhi Pittayachawan RMIT University, Australia
Ian Sadler Victoria University, Australia
Pradip Sarkar RMIT University, Australia
Carmine Sellitto Victoria University, Australia
Peter Shackleton Victoria University, Australia
Sitalakshmi Venkatraman University of Ballarat, Australia
Leslie Young RMIT University, Australia
Adil Bagirov University of Ballarat, Australia
Philip Branch Swinburne University of Technology, Australia
Feilong Cao China Jiliang University, China
Maple Carsten University of Bedfordshire, UK
Caroline Chan Deakin University, Australia
Jinjun Chen Swinburne University of Technology, Australia
Richard Dazeley University of Ballarat, Australia
Yi-Hua Fan Chung Yuan Christian University Taiwan, Taiwan
Richter Hendrik HTWK Leipzig, Germany
Furutani Hiroshi University of Miyazaki, Japan
Bae Hyeon Pusan National University, Korea
Tae-Ryong Jeon Pusan National University, Korea
Sungshin Kim Pusan National University, Korea
Wei Lai Swinburne University of Technology, Australia
Edmonds Lau Swinburne University of Technology, Australia
Qiang Li University of Calgary, Canada
Xiaodong Li RMIT University, Australia
Kuoming Lin Kainan University, Taiwan
YangCheng Lin National Dong Hwa University, Taiwan
An-Feng Liu Central South University, China
Liping Ma University of Ballarat, Australia
Costa Marly Federal University of the Amazon, Brazil
Jamie Mustard Deakin University, Australia
Syed Nasirin Brunel University, UK
Lemai Nguyen Deakin University, Australia
Heping Pan University of Ballarat, Australia
Reviewers
Adil Bagirov Chen Lifei Du Junping
Ahmad Abareshi Chen Xiang Du Xufeng
Bai ShiZhong Chen Ling Duan Yong
Bi Shuoben Chen Dongming Duan Rongxing
Bo Rui-Feng Chen Ming Fan Jiancong
Cai Weihua Chen Ting Fan Xikun
Cai Zhihua Chen Yuquan Fan Shurui
Cai Kun Chen Ailing Fang Zhimin
Cao Yang Chen Haizhen Fang Gu
Cao Qingnian Chu Fenghong Fuzhen Huang
Cao Guang-Yi Chung-Hsing Yeh Gan Zhaohui
Cao Jianrong Congping Chen Gan Rongwei
Carmine Sellitto Cui Shigang Gao Chunrong
Chai Zhonglin Cui Mingyi Gao Xiaoqiang
Chang Chunguang DeJun Mu Gao Jiaquan
Chen Yong Liang Deng Liguo Gao Jun
Chen Haiming Deng-ao Li Gao Boyong
Chen Yong Ding Yi-jie Gu Xuejing
Chen Pengnian Dingjun Chen Gu Tao
Neural Computation

A Uniform Solution to HPP in Terms of Membrane Computing (p. 59)
Haizhu Chen and Zhongshi He

Information Security

Research of Trust Chain of Operating System (p. 96)
Hongjiao Li and Xiuxia Tian

A Novel Application for Text Watermarking in Digital Reading (p. 103)
Jin Zhang, Qing-cheng Li, Cong Wang, and Ji Fang

Immune Computation

Optimization of Real-Valued Self Set for Anomaly Detection Using Gaussian Distribution (p. 112)
Liang Xi, Fengbin Zhang, and Dawei Wang
Genetic Algorithms

A GP Process Mining Approach from a Structural Perspective (p. 121)
Anhua Wang, Weidong Zhao, Chongchen Chen, and Haifeng Wu

Effects of Diversity on Optimality in GA (p. 131)
Glen MacDonald and Gu Fang

Dynamic Crossover and Mutation Genetic Algorithm Based on Expansion Sampling (p. 141)
Min Dong and Yan Wu

Multidisciplinary Optimization of Airborne Radome Using Genetic Algorithm (p. 150)
Xinggang Tang, Weihong Zhang, and Jihong Zhu

Global Similarity and Local Variance in Human Gene Coexpression Networks (p. 159)
Ivan Krivosheev, Lei Du, Hongzhi Wang, Shaojun Zhang, Yadong Wang, and Xia Li

A Grid Based Cooperative Co-evolutionary Multi-Objective Algorithm (p. 167)
Sepehr Meshkinfam Fard, Ali Hamzeh, and Koorush Ziarati

Fuzzy Computation

Fuzzy Modeling for Analysis of Load Curve in Power System (p. 176)
Pei-Hwa Huang, Ta-Hsiu Tseng, Chien-Heng Liu, and Guang-Zhong Fan

Biological Computing

Honey Bee Mating Optimization Vector Quantization Scheme in Image Compression (p. 185)
Ming-Huwi Horng
Robotics

Robot Virtual Assembly Based on Collision Detection in Java3D (p. 270)
Peihua Chen, Qixin Cao, Charles Lo, Zhen Zhang, and Yang Yang

Pattern Recognition

A Novel Character Recognition Algorithm Based on Hidden Markov Models (p. 298)
Yu Wang, Xueye Wei, Lei Han, and Xiaojin Wu

New Algorithms for Complex Fiber Image Recognition (p. 306)
Yan Ma and Shun-bao Li

Laplacian Discriminant Projection Based on Affinity Propagation (p. 313)
Xueping Chang and Zhonglong Zheng

An Improved Fast ICA Algorithm for IR Objects Recognition (p. 322)
Jin Liu and Hong Bing Ji

Facial Feature Extraction Based on Wavelet Transform (p. 330)
Nguyen Viet Hung

Fast Iris Segmentation by Rotation Average Analysis of Intensity-Inversed Image (p. 340)
Wei Li and Lin-Hua Jiang

Neural Networks

A New Criterion for Global Asymptotic Stability of Multi-delayed Neural Networks (p. 350)
Kaiyu Liu and Hongqiang Zhang

An Improved Approach Combining Random PSO with BP for Feedforward Neural Networks (p. 361)
Yu Cui, Shi-Guang Ju, Fei Han, and Tong-Yue Gu

Fuzzy Multiresolution Neural Networks (p. 369)
Li Ying, Shang Qigang, and Lei Na

Research on Nonlinear Time Series Forecasting of Time-Delay NN Embedded with Bayesian Regularization (p. 379)
Weijin Jiang, Yusheng Xu, Yuhui Xu, and Jianmin Wang

An Adaptive Learning Algorithm for Supervised Neural Network with Contour Preserving Classification (p. 389)
Piyabute Fuangkhon and Thitipong Tanprasert

Machine Vision

Object Recognition Based on Efficient Sub-window Search (p. 435)
Qing Nie, Shouyi Zhan, and Weiming Li
Machine Learning

A Multi-Scale Algorithm for Graffito Advertisement Detection from Images of Real Estate (p. 444)
Jun Yang and Shi-jiao Zhu

Intelligent Scheduling

A Controlled Scheduling Algorithm Decreasing the Incidence of Starvation in Grid Environments (p. 517)
Minna Liu, Kairi Ou, Yuelong Zhao, and Tian Sun

Others

Formalizing the Modeling Process of Physical Systems in MBD (p. 685)
Nan Wang, Dantong OuYang, Shanwu Sun, and Chengli Zhao

Erratum

Implementation of On/Off Controller for Automation of Greenhouse Using LabVIEW (p. E1)
R. Alimardani, P. Javadikia, A. Tabatabaeefar, M. Omid, and M. Fathi
An Experimental Research on Vector Control of Induction Motor

Abstract. Given the heavy computation, easy saturation and cumulative errors of conventional direct vector control, a vector control scheme for the induction motor based on a simple model is studied, and the detailed scheme is described on the basis of decomposing and approximating the rotor flux. Because the magnetizing current and the torque current are controlled directly in closed loop and the complex current regulation is performed by PI regulators, the direct vector control of the induction motor is simplified. The experimental results show that the proposed method is effective in reducing dynamic disturbance and has the advantages of a simple code program, little saturation and few oscillations.
1 Introduction
Vector control enables an induction motor to achieve dynamic performance comparable to that of a DC motor. Vector control based on rotor field orientation has received considerable attention because it readily achieves complete decoupling between the magnetizing current and the torque current [1-2]. Direct vector control and indirect vector control are the two schemes most commonly used in induction motor drives. The former is a speed closed-loop system that also closes the torque and flux loops, whereas the latter controls the flux in open loop. Direct vector control involves a large number of operations such as differentiation and multiplication. Furthermore, the complex current regulation required by a voltage-source inverter brings heavy computation [3-5], easy saturation, cumulative errors and other uncertain disturbances, resulting in deterioration of system performance.
Therefore, in order to overcome the above disadvantages of direct vector control while retaining its performance advantages, a vector control scheme for the induction motor based on a simple model is studied and described in detail in this paper. After decomposing and approximating the rotor flux, the magnetizing current and the torque current are controlled directly in closed loop, and the complex current regulation of the voltage-source inverter is performed by PI regulators to achieve regulation without static error. Thus, the proposed method simplifies the direct vector control of the induction motor and yields a system that is simple and clear, with less code, little saturation and few oscillations.
Fig. 1. The conventional direct vector control system of the induction motor (block diagram not reproduced)
The following are the main formulas used in the conventional direct vector control system of the induction motor.
The rotor flux is expressed as

$$\psi_r = \frac{L_m}{T_r p + 1}\, i_{sm} \qquad (1)$$
The torque feedback is calculated by

$$T_e = \frac{p_n L_m}{L_r}\, i_{st}\, \psi_r \qquad (2)$$
The angular speed of synchronous rotation is given by

$$\omega_s = \omega_r + \omega_{sl} = \omega_r + \frac{L_m i_{st}}{T_r \psi_r} \qquad (3)$$
In most practical applications a voltage-source inverter is widely used, so the current commands must be converted into voltage commands. This process is called current regulation, and the conversion equations are
$$
\begin{cases}
u_{sm} = R_s\left(1 + \dfrac{L_s}{R_s}\,p\,\dfrac{\sigma T_r p + 1}{T_r p + 1}\right) i_{sm} \;-\; \sigma L_s\left(\omega_r + \dfrac{(T_r p + 1)\, i_{st}}{T_r\, i_{sm}}\right) i_{st} \\[2ex]
u_{st} = \left[R_s\left(\dfrac{\sigma L_s}{R_s}\,p + 1\right) + \dfrac{L_s}{T_r}\left(\sigma T_r p + 1\right)\right] i_{st} \;+\; \omega_r L_s\,\dfrac{\sigma T_r p + 1}{T_r p + 1}\, i_{sm}
\end{cases}
\qquad (4)
$$
The parameters of equations (1) to (4) are defined as follows:

$$\sigma = 1 - \frac{L_m^2}{L_s L_r}$$
Lm — Mutual inductance
Lr — Rotor inductance
Rs — Stator resistance
Ls — Stator inductance
Tr — Rotor time constant
p — Differential operator
ωr — Rotor angular speed
ωsl — Slip angular speed
pn — The number of pole pairs
ism — M-axis stator current
ist — T-axis stator current
usm — M-axis stator voltage
ust — T-axis stator voltage
From equations (1) to (3), the estimation accuracy of equations (2) and (3) depends on equation (1), which shows the importance of estimating the rotor flux accurately. Because of the differential term, a transient drift appears in the DC component following transient changes in the digital computation when a rectangular discrete integrator is used; this makes the estimated rotor flux inaccurate and in turn reduces the calculation accuracy of the torque and the synchronous speed. Furthermore, the large number of differentiation and multiplication operations in equation (4) brings heavy computation, easy saturation and cumulative errors. In the low-speed region especially, the torque is very prone to oscillation.
Fig. 2. The vector control system of induction motor based on simple model
Compared with the conventional direct vector control, the system in Fig. 2 includes a speed control subsystem and a torque current control subsystem, where the inner loop of the speed control subsystem is the torque current closed-loop control subsystem. The system is composed of three PI regulators: the speed regulator, the torque current regulator and the magnetizing current regulator.
The derivation of the main formulas used in the vector control system based on the simple model is as follows; the formulas are discretized to suit DSP computation.
Equation (1) can be rewritten as

$$T_r p \psi_r + \psi_r = L_m i_{sm} \qquad (5)$$
The rotor flux equation based on the MT reference frame is

$$\psi_r = L_m i_{sm} + L_r i_{rm} \qquad (6)$$
The rotor leakage inductance Lσr is small compared with the mutual inductance Lm, so Lm and Lr can be regarded as approximately equal according to Lr = Lm + Lσr. Furthermore, in a rotor-field-oriented control system the rotor flux must remain constant below the rated frequency in order to achieve ideal speed-regulation performance; above the rated frequency, flux-weakening control is adopted, usually implemented through a look-up table. Lm ism is regarded as constant and absorbed into Lr irm, so equation (6) can be simplified as
$$\psi_r = L_m i_{rm} \qquad (7)$$
In equation (7), irm is the M-axis component of the rotor current. Applying equation (7) to (5), we obtain

$$T_r \frac{d i_{rm}}{dt} + i_{rm} = i_{sm} \qquad (8)$$
After discretizing the differential, equation (8) can be rewritten as

$$i_{rm}(k+1) = i_{rm}(k) + \frac{T_s}{T_r}\big(i_{sm}(k) - i_{rm}(k)\big) \qquad (9)$$
where Ts is the switching period. By the same method, equation (3) can be modified as

$$\omega_s(k+1) = \omega_r(k+1) + \frac{1}{T_r}\,\frac{i_{st}(k)}{i_{rm}(k+1)} \qquad (10)$$
Equations (9) and (10), derived from the decomposition and approximation of the rotor flux, are the crucial observation equations of the system shown in Fig. 2 and are used to calculate the angular speed of synchronous rotation. Compared with the conventional direct vector control system, the simplified system does not obtain the synchronous angular speed by calculating the rotor flux but obtains it directly from the rotor magnetizing current. The heavy computational processes are thereby avoided, along with saturation drift and cumulative errors, which provides a good foundation for studying real-time high-performance vector control systems.
It can be seen from Fig. 2 that, in contrast to the conventional vector control, the simplified system closes the loops directly on the magnetizing current and the torque current rather than on the torque and the flux. The calculation of the torque feedback is therefore avoided, and the whole system can regulate the torque and the flux more quickly and efficiently. In addition, the complex current regulation is handled by PI regulators to achieve regulation without static error, which makes the system simple, clear and computationally light, and thus more suitable for real-time high-performance control.
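The PI regulators mentioned above can be realized, for instance, as simple discrete PI controllers with output clamping; the sketch below is a generic, assumed implementation (including the anti-windup clamping), not the exact regulator code used in the paper.

```python
class DiscretePI:
    """Minimal discrete PI regulator with output clamping (illustrative sketch)."""

    def __init__(self, kp, ki, out_min, out_max):
        self.kp, self.ki = kp, ki
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0

    def update(self, reference, feedback):
        error = reference - feedback
        self.integral += self.ki * error  # integral action removes the static error
        # clamp the integral term as a simple anti-windup measure (assumption)
        self.integral = max(self.out_min, min(self.out_max, self.integral))
        output = self.kp * error + self.integral
        return max(self.out_min, min(self.out_max, output))

# Example: current-loop PI with the gains reported in the experiment section.
current_pi = DiscretePI(kp=1.12, ki=0.0629, out_min=-1.0, out_max=1.0)
```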
4 Experiment Analysis
The hardware and software environment of the experiment is as follows: the power inverter is built with IGBTs (IRGB15B60KD) produced by IR; the control system adopts the TI 2407 DSP EVM by Wintech; the current sensor is an LA28-NP produced by LEM; the resolution of the speed encoder is 1000 P/R; and the digital oscilloscope is a Tektronix TDS2002. The switching frequency is 16 kHz, giving a switching period Ts = 62.5 µs. PI parameters: in the current loop, P = 1.12, I = 0.0629; in the speed loop, P = 4.89, I = 0.0131.
The parameters of the induction motor: the rated power: Pn =500W, the rated cur-
rent: In =2.9A, the rated voltage: Vn=220V, the rotor resistance: Rr =5.486Ω, the stator
resistance: Rs =4.516Ω, the mutual inductance: Lm =152mH, Lσr =13mH, Lr =165mH,
pn=2, Ls =168mH.
Fig. 3 shows the trajectory of the total flux circle at an operation frequency of f = 5 Hz in the vector control system based on the simple model. The experimental waveform shows that the trajectory of the total flux circle is nearly circular and that the system runs well at low speed.
Fig. 3. The total flux circle with the operation frequency f =5Hz
(a) rotor speed and phase current; (b) rotor speed and torque current
Fig. 4. The start-up of induction motor with no-load
Fig. 5. (a) rotor speed and torque current; (b) magnetizing current and torque current
Fig. 5 shows the experimental waveforms under a random load when the motor runs in steady state at f = 10 Hz. Fig. 5a gives the waveforms of the rotor speed and the torque current; the maximum dynamic speed dip is about 16% and the recovery time is shorter than the duration of the disturbance. Fig. 5b shows the waveforms of the magnetizing current and the torque current; the magnetizing current ism remains essentially unchanged and achieves ideal decoupling from the torque current. The vector control of the induction motor based on the simple model therefore exhibits good anti-disturbance performance.
5 Conclusion
In this paper, the vector control of the induction motor based on a simple model is studied and the detailed scheme is described through decomposing and approximating the rotor flux. The proposed method simplifies the system, reduces the code size and shortens the execution time. The experimental results show that the anti-disturbance performance of the simplified vector control system is reliable and efficient, making it suitable for real-time high-performance control systems.
References
1. Noguchi, T., Kondo, S., Takahashi, I.: Field-oriented control of an induction motor with robust on-line tuning of its parameters. IEEE Trans. Industry Applications 33(1), 35–42 (1997)
2. Telford, D., Dunnigan, M.W.: Online identification of induction machine electrical parame-
ters for vector control loop tuning. IEEE Trans. Industrial Electronics 50(2), 253–261 (2003)
3. Zhang, Y., Chen, J., Bao, J.: Implementation of closed-loop vector control system of induc-
tion motor without voltage feedforward decoupler. In: IEEE 21st Annual Applied Power
Electronics Conference and Exposition, APEC, pp. 1631–1633 (2006)
4. Lorenz, R.D., Lawson, D.B.: Performance of feedforward current regulators for field-oriented induction machine controllers. IEEE Trans. Industry Applications 23(4), 597–602 (1987)
5. Del Blanco, F.B., Degner, M.W., Lorenz, R.D.: Dynamic analysis of current regulators for
AC motors using complex vectors. IEEE Trans. Industry Applications 35(6), 1424–1432
(1999)
An United Extended Rough Set Model Based on
Developed Set Pair Analysis Method
Abstract. Unlike the traditional set pair analysis method, a new method to micro-decompose the discrepancy degree is proposed in line with the actual distribution of missing attributes. Integrated with rough set theory, it advances the united set pair tolerance relation and gives the corresponding extended rough set model. Different identity and discrepancy degree thresholds can modulate the performance of this model and extend its range of application, and several existing extended rough set models are shown to be special cases of it. Finally, simulation experiments and conclusions are given, which validate that the united set pair tolerance relation model can improve classification capability.
Keywords: Set pair analysis, Otherness, Rough set, United set pair tolerance
relation.
1 Introduction
Classic rough set theory is based on the equivalence relation and assumes a complete information system. In practice, however, knowledge acquisition often faces incomplete information systems because of data measurement errors or limitations of data acquisition. There are two kinds of null values: (1) omitted but existing [2]; (2) lost and not allowed to be compared. Kryszkiewicz [3] established the tolerance relation for type (1). Stefanowski [4] and others built the similarity relation for type (2). Wang [5] proposed the limited tolerance relation through an in-depth study of the tolerance and similarity relations.
The degree of incompleteness and the distribution of missing data differ from system to system, so any single extended rough set model achieves good results only sometimes. Some quantified extended rough set models, such as the quantified tolerance relation and the quantified limited tolerance relation [6], perform well, but the quantification process is computationally expensive.
Set pair analysis (SPA) theory was formally proposed by the Chinese scholar Zhao Ke-qin in 1989 to study the relationship between two sets [7]. It uses the connection number "a + bi + cj" to deal with uncertain systems, such as fuzzy, stochastic or intermediary ones, and studies the uncertainty between two sets from three aspects: identity, difference and opposition. At present, set pair analysis theory is widely used in artificial intelligence, system control, management decision-making and other fields [7].
In this paper, we first expand set pair analysis theory and then put forward a united extended rough set model based on it. With different identity and discrepancy thresholds, this model reduces to different existing models, including the tolerance relation, the similarity relation and the limited tolerance relation. Experiments on UCI data sets show the validity, rationality and effectiveness of this model.
A set pair is composed of two sets A and B, namely H = (A, B). Under given circumstances W, the connection degree is

$$u_W(A, B) = \frac{S}{N} + \frac{F}{N}\,i + \frac{P}{N}\,j \qquad (1)$$
Here N is the total number of features of the set pair, S is the number of identity features, and P is the number of contrary features. F = N − S − P is the number of features of the two sets that are neither identity nor contrary. S/N, F/N and P/N are called the identity degree, the discrepancy degree and the contrary degree of the two sets under the given circumstances, respectively. To simplify formula (1), we set a = S/N, b = F/N, c = P/N, and then uW(A, B) can be rewritten as
u = a + bi + cj (2)
It is obvious that 0 ≤ a, b, c ≤ 1, and the “a”, “b”, and “c” satisfy the equation a + b + c = 1.
An information system is the following tuple: I = (U, A, F), where U is a finite nonempty set of objects, A is a finite nonempty set of attributes, Va is a nonempty set of values of a ∈ A, and F = {Fl: U → ρ(Va)} is an information function that maps an object in U to exactly one value in Va. For every al ∈ A and every xi ∈ U, if Fl(xi) is a single-point set, then (U, A, F) is a complete information system. If there are some al ∈ A and some xi ∈ U such that Fl(xi) is not a single-point set, then (U, A, F) is an incomplete information system. A complete information system is a special case of an incomplete information system.
Let I = (U, A, F) be an incomplete information system. For any B ⊆ A, the tolerance relation TR(B) proposed by M. Kryszkiewicz is defined as follows [3]:

TR(B) = {(x, y) ∈ U × U | ∀a ∈ B, a(x) = a(y) ∨ a(x) = * ∨ a(y) = *}
Let I = (U, A, F) be an incomplete information system. For any B ⊆ A, the similarity relation SR(B) proposed by Stefanowski is defined as follows [4]:

SR(B) = {(x, y) ∈ U × U | ∀b ∈ B, b(x) = b(y) or b(x) = *}
Let I = (U, A, F) be an incomplete information system. For any B ⊆ A, the limited tolerance relation LTR(B) proposed by Wang is defined in [5].
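To illustrate how these relations behave on incomplete data, the following sketch computes the tolerance class and the similarity class of an object, with None standing for a missing value *; it is a generic illustration rather than code from the cited papers.

```python
def tolerance_class(table, i, attrs):
    """Objects y with a(x) = a(y) or a(x) = * or a(y) = * for every attribute (Kryszkiewicz)."""
    x = table[i]
    return [j for j, y in enumerate(table)
            if all(x[a] is None or y[a] is None or x[a] == y[a] for a in attrs)]

def similarity_class(table, i, attrs):
    """Objects y with b(x) = b(y) or b(x) = * for every attribute (Stefanowski)."""
    x = table[i]
    return [j for j, y in enumerate(table)
            if all(x[a] is None or x[a] == y[a] for a in attrs)]

# Example: three objects over two attributes, * encoded as None.
table = [(1, None), (1, 2), (3, 2)]
print(tolerance_class(table, 0, [0, 1]))   # -> [0, 1]
print(similarity_class(table, 0, [0, 1]))  # -> [0, 1]
```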
The connection degree u = a + bi + cj can be expanded breadthwise and lengthwise according to the needs of the research. The transverse expanded form decomposes the discrepancy term, giving

u(x, y) = a + (b1 + b2 + b3)i + cj

with b3 = |{a | a(x) = * and a(y) = *}|. Here b1 represents the proportion of attributes missing only for x, b2 the proportion missing only for y, and b3 the proportion missing for both x and y.

The traditional set pair analysis theory decomposes the discrepancy term bi according to the possible values of the missing attributes, whereas in this paper we decompose the discrepancy term bi according to the distribution of the missing attributes. In a later section we prove that the symmetry and transitivity of the binary relation depend on the thresholds of b1 and b2.
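To make the micro-decomposition concrete, the following Python sketch computes a, b1, b2, b3 and c for two objects whose attribute values may be missing (represented by None); normalizing by the total number of attributes and testing equality for the contrary part are assumptions made here for illustration.

```python
def connection_degree(x, y):
    """Compute (a, b1, b2, b3, c) between two objects x and y.

    x, y : sequences of attribute values, with None standing for a missing value (*).
    a  : proportion of attributes known in both objects and equal      (identity)
    b1 : proportion missing only in x
    b2 : proportion missing only in y
    b3 : proportion missing in both x and y
    c  : proportion known in both objects but different                (contrary)
    """
    n = len(x)
    a = b1 = b2 = b3 = c = 0
    for vx, vy in zip(x, y):
        if vx is None and vy is None:
            b3 += 1
        elif vx is None:
            b1 += 1
        elif vy is None:
            b2 += 1
        elif vx == vy:
            a += 1
        else:
            c += 1
    return tuple(v / n for v in (a, b1, b2, b3, c))

# Example: two objects with five attributes, '*' encoded as None.
print(connection_degree([1, 2, None, 4, None], [1, 3, 5, None, None]))
# -> (0.2, 0.2, 0.2, 0.2, 0.2)
```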
B Extended Rough Set Model Based on United Set Pair Tolerance Relation
The connection degree between two objects is composed of three parts: the ratio of known and identical attributes (the identity degree a), the ratio of known but different attributes (the contrary degree c) and the ratio of unknown attributes (the discrepancy degree b). The contribution of the contrary degree to similarity is negative; since noise data are not considered here, we set the contrary degree to 0, that is, c = 0. Based on the expanded set pair analysis theory, we give the united set pair tolerance relation.
$$\overline{USPTR}_B(X) = \{x \mid USPTR_B(x) \cap X \neq \emptyset\}, \qquad \underline{USPTR}_B(X) = \{x \mid USPTR_B(x) \subseteq X\}$$
C Performance Analysis of Extended Rough Set Model Based on United Set Pair Tolerance Relation

The united set pair tolerance relation is reflexive, but whether it satisfies symmetry and transitivity is not automatic. When the identity degree and discrepancy degree thresholds are changed, this extended model exhibits different behavior. We have three theorems, as follows.
Let I = (U, A, V) be an incomplete information system, ∀x, y ∈ U, B ⊆ A, α the threshold of the identity degree a, and β1, β2 and β3 the thresholds of the components of the discrepancy degree b, respectively.

Proof. Let y ∈ USPTRB(x), that is, u(x, y) = a + (b1 + b2 + b3)i with b1 ≤ β1 and b2 ≤ β2. Because β1 ≠ β2, there may be some y in USPTRB(x) with b1 > β2 or b2 > β1. Since u(y, x) = a + (b2 + b1 + b3)i, it follows that x ∉ USPTRB(y).

Proof. First consider the case β1 = 0. Let y ∈ USPTRB(x) and z ∈ USPTRB(y). For any a ∈ A, if a(x) ≠ *, then a(y) = a(x). Simul-
The performance of the united set pair tolerance relation changes with the identity and discrepancy thresholds, and some representative existing extended rough set models are special cases of it; see Table 1.

Table 1. The relationship between the united set pair tolerance relation and some existing models
4 Simulation Experiments
In this section, we select three test databases from the UCI machine learning repository: Iris, Zoo and cmc-data (details are given in Table 2). We generate nine incomplete databases by randomly removing data, namely I-5%, I-10%, I-30%, Z-5%, Z-10%, Z-30%, C-5%, C-10% and C-30%.
Let U be the test database, E(xi) the complete equivalence class of xi, and R(xi) the tolerance class of xi under the different extended rough set models. We perform ten random experiments on every database and take their average as the final result. The comparability (u) of the test result is computed through formulas (3) and (4):

$$u = \frac{1}{|U|}\sum_{i=1}^{|U|} \frac{|E(x_i) \cap R(x_i)|}{|E(x_i)| + |R(x_i)| - |E(x_i) \cap R(x_i)|} \qquad (3)$$

$$\mu = \frac{1}{10}\sum_{t=1}^{10} \mu_t \qquad (4)$$
(1) β1=β2
(2) β1≠β2
5 Conclusions
In this paper, we first develop the set pair analysis method and give a micro-decomposition method for the discrepancy degree, then propose the united set pair tolerance relation and give the corresponding extended rough set model. The thresholds α, β1 and β2 can be adjusted according to the degree of incompleteness of the databases, so this model is more objective, flexible and effective. The experiments on three UCI databases, Iris, Zoo and cmc-data, clearly demonstrate the advantage of the new methods in this paper. How to calculate the identity and discrepancy degree thresholds quickly, however, remains a problem to be solved.
Acknowledgement
References
1. Pawlak, Z.: Rough set. International Journal of Computer and Information Sciences 11(5),
341–356 (1982)
2. Grzymala Busse, J.W.: On the unknown attribute values in learning from examples. In: Raś,
Z.W., Zemankova, M. (eds.) ISMIS 1991. LNCS (LNAI), vol. 542, pp. 368–377. Springer,
Heidelberg (1991)
3. Stefanowski, J., Tsoukias, A.: Incomplete information tables and rough classification. Computational Intelligence 17, 545–566 (2001)
4. Kryszkiewicz, M.: Rough Set Approach to Incomplete Information System. Information
Sciences (S0020-0255) 112(1/4), 39–49 (1998)
5. Wang, G.-y.: The extension of rough set theory in incomplete information system. Com-
puter Research and Development 39(10), 1238–1243 (2002)
6. Sun, C.-m., Liu, D.-y., Sun, S.-y.: Research of rough set method oriented to incomplete
information system. Mini-micro Computer System 10(10), 1869–1873 (2007)
7. Zhao, K.-q.: Set pair and its prilimitary applications. Hangzhou Zhejiang Science Publishers
(2000)
8. Xu, Y., Li, L.-s., Li, X.-j.: Extended rough set model based on set pair power. Journal of
System Simulation 20(6), 1515–1522 (2008)
9. Stefanowski, J., Tsoukias, A.: On the Extension of Rough Sets under Incomplete Informa-
tion. In: Zhong, N., Skowron, A., Ohsuga, S. (eds.) RSFDGrC 1999. LNCS (LNAI),
vol. 1711, pp. 73–82. Springer, Heidelberg (1999)
10. Lei, Z., Lan, S.: Rough Set Model Based on New Set Pair Analysis. Fuzzy Systems and Mathematics (S1001-7402) 20(4), 111–116 (2006)
Learning Rules from Pairwise Comparison Table
1 Introduction
Multiple Criteria Decision Analysis (MCDA) aims at helping a decision maker (DM)
to prepare and make a decision where more than one point of view has to be consid-
ered. There are three major models used until now in MCDA: (1) the functional
model expressed in terms of a utility function within multiple attribute utility theory
[1]; (2)the relational model expressed in the form of an outranking relation [2] and a
fuzzy relation [3]; (3) the function-free model expressed in terms of symbolic forms,
like “if… then…” decision rules [6] or decision trees, or in a sub-symbolic form, like
artificial neural nets [4].
Both functional and relational models require that the DM gives some preference
information, such as importance weights, substitution ratios and various thresholds
on particular criteria, which is often quite difficult for DMs not acquainted with the
MCDA methodology [4]. According to Slovic [5], people make decisions by search-
ing for rules that provide good justification of their choices. The decision rule ap-
proach [6] follows the paradigm of artificial intelligence and inductive learning.
These decision rules are induced from preference information supplied by the DM in
terms of some decision examples. Therefore, the decision rules are intelligible and
speak the language of the DM [7].
Roy [8] has stated that the objective of an MCDA is to solve one of the following
five typologies of problems: classification, sorting, choice, ranking and description.
Classification concerns an assignment of a set of actions to a set of pre-defined
classes. The actions are described by a set of regular attributes and the classes are not preference ordered.
Let C be the set of criteria used for evaluation of actions from A. For any criterion
q∈C, let Tq be a finite set of binary relations defined on A on the basis of the evalua-
tions of actions from A with respect to the considered criterion q, such that for every
(x, y)∈A×A exactly one binary relation t∈Tq is verified.
The preferential information has the form of pairwise comparisons of reference ac-
tions from B⊆A, considered as exemplary decisions. The pairwise comparison table
(PCT) is defined as data table SPCT =(B, C∪{d}, TC∪Td, g), where B⊆B×B is a non-
empty set of exemplary pairwise comparisons of reference actions, TC = ∪q∈C Tq, and d is a decision corresponding to the comprehensive pairwise comparison of the reference actions.
Let CO be the set of criteria expressing preferences on an ordinal scale, and CN, the set
of criteria expressing preferences on a quantitative scale or a numerical nonquantita-
tive scale, such that CO∪CN=C and CO∩CN=∅. Moreover, for each P⊆C, let PO be the
subset of P composed of criteria expressing preferences on an ordinal scale, i.e.
PO=P∩CO, and PN, the subset of P composed of criteria expressing preferences on a
quantitative scale or a numerical non-quantitative scale, i.e. PN=P∩CN. For each P⊆C,
we have P=PO∪PN and PO∩PN=∅. The following three situations are considered:
(1) P=PN and PO=∅.
The exemplary pairwise comparisons made by the DM can be represented in terms
of graded preference relations Pqh : for each q∈C and for every (x, y)∈A×A, Tq={ Pqh :
h∈Hq}, where Hq is a particular subset of the relative integers and
• x Pqh y, h>0, means that action x is preferred to action y by degree h with respect
to the criterion q,
• x Pqh y, h<0, means that action x is not preferred to action y by degree h with re-
spect to the criterion q,
• x Pq0 y means that action x is similar to action y with respect to the criterion q.
For each q∈C and for every (x, y)∈A×A, [x Pqh y, h>0]⇒[y Pqk x, k≤0] and [x Pqh y,
h<0]⇒[y Pqk x, k≥0].
Given PN⊆C (PN≠∅), (x, y), (w, z)∈A×A, the pair of actions (x, y) is said to domi-
nate (w, z), taking into account the criteria from PN, denoted by (x, y) D P N (w, z), if x
is preferred to y at least as strongly as w is preferred to z with respect to each q∈PN.
Precisely, "at least as strongly as" means "by at least the same degree", i.e. h_q ≥ k_q, where h_q, k_q ∈ H_q, x P_q^{h_q} y and w P_q^{k_q} z, for each q ∈ PN.
Exact definition of the cumulated preferences, for each (x, y)∈A×A, q∈C and
h∈Hq, is the following:
Given P⊆C and (x, y)∈A×A, the P-dominating set and the P-dominated set are de-
fined, respectively, as:
• a set of pairs of actions dominating (x, y), DP+ (x, y)={(w, z)∈A×A: (w, z)DP(x, y)},
• a set of pairs of actions dominated by (x, y), DP− (x, y)={(w, z)∈A×A: (x, y)DP(w, z)}.
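The P-dominating and P-dominated sets just defined can be enumerated directly once the preference degrees are available; in the sketch below the degree is supplied as a function, and taking it as the plain difference of evaluations is an assumption for illustration only, not the grading used in [6].

```python
def dominates(pair1, pair2, criteria, degree):
    """(w, z) dominates (x, y) on criteria iff degree_q(w, z) >= degree_q(x, y) for every q."""
    return all(degree(q, *pair1) >= degree(q, *pair2) for q in criteria)

def p_dominating_set(pair, pairs, criteria, degree):
    """D_P^+(x, y): pairs (w, z) dominating (x, y)."""
    return [p for p in pairs if dominates(p, pair, criteria, degree)]

def p_dominated_set(pair, pairs, criteria, degree):
    """D_P^-(x, y): pairs (w, z) dominated by (x, y)."""
    return [p for p in pairs if dominates(pair, p, criteria, degree)]

# Student example of Table 1; the degree is taken as the mark difference (assumption).
marks = {"a": {"Maths": 18, "Phys": 16, "Lit": 10},
         "b": {"Maths": 10, "Phys": 12, "Lit": 18},
         "c": {"Maths": 14, "Phys": 15, "Lit": 15}}
degree = lambda q, x, y: marks[x][q] - marks[y][q]
pairs = [(x, y) for x in "abc" for y in "abc"]
print(p_dominating_set(("b", "a"), pairs, ["Maths", "Phys", "Lit"], degree))  # [('b', 'a')]
```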
Using the approximations of S and Sc based on the dominance relation defined above, it
is possible to induce a generalized description of the available preferential information
in terms of decision rules. The decision rules have in this case the following syntax:
(1) D≥-decision rules:
IF x P_{q1}^{≥h(q1)} y and x P_{q2}^{≥h(q2)} y and … x P_{qe}^{≥h(qe)} y and c_{qe+1}(x) ≥ r_{qe+1} and c_{qe+1}(y) ≤ s_{qe+1} and … c_{qp}(x) ≥ r_{qp} and c_{qp}(y) ≤ s_{qp}, THEN xSy,
where P = {q1, q2, …, qp} ⊆ C, PN = {q1, q2, …, qe}, PO = {qe+1, qe+2, …, qp}, (h(q1), h(q2), …, h(qp)) ∈ Hq1 × Hq2 × … × Hqp and (rqe+1, …, rqp), (sqe+1, …, sqp) ∈ Cqe+1 × … × Cqp.
These rules are supported only by pairs of actions from the P-lower approximation of S.
(2) D≤-decision rules:
IF x P_{q1}^{≤h(q1)} y and x P_{q2}^{≤h(q2)} y and … x P_{qe}^{≤h(qe)} y and c_{qe+1}(x) ≤ r_{qe+1} and c_{qe+1}(y) ≥ s_{qe+1} and … c_{qp}(x) ≤ r_{qp} and c_{qp}(y) ≥ s_{qp}, THEN xScy,
where P = {q1, q2, …, qp} ⊆ C, PN = {q1, q2, …, qe}, PO = {qe+1, qe+2, …, qp}, (h(q1), h(q2), …, h(qp)) ∈ Hq1 × Hq2 × … × Hqp and (rqe+1, …, rqp), (sqe+1, …, sqp) ∈ Cqe+1 × … × Cqp.
These rules are supported only by pairs of actions from the P-lower approximation of Sc.
(3) D≥≤-decision rules:
IF x P_{q1}^{≥h(q1)} y and … x P_{qe}^{≥h(qe)} y and x P_{qe+1}^{≤h(qe+1)} y and … x P_{qf}^{≤h(qf)} y and c_{qf+1}(x) ≥ r_{qf+1} and c_{qf+1}(y) ≤ s_{qf+1} and … c_{qg}(x) ≥ r_{qg} and c_{qg}(y) ≤ s_{qg} and c_{qg+1}(x) ≤ r_{qg+1} and c_{qg+1}(y) ≥ s_{qg+1} and … c_{qp}(x) ≤ r_{qp} and c_{qp}(y) ≥ s_{qp}, THEN xSy or xScy,
where O′ = {q1, q2, …, qe} ⊆ C, O″ = {qe+1, qe+2, …, qf} ⊆ C, PN = O′ ∪ O″ (O′ and O″ not necessarily disjoint), PO = {qf+1, qf+2, …, qp}, (h(q1), h(q2), …, h(qf)) ∈ Hq1 × Hq2 × … × Hqf and (rqf+1, …, rqp), (sqf+1, …, sqp) ∈ Cqf+1 × … × Cqp.
These rules are supported only by pairs of actions from the P-boundary of S and Sc.
Applying the decision rules induced from a given SPCT, a final recommendation for choice or ranking can be obtained upon a suitable exploitation of this structure [12]. In the following, the concepts of decision matrices and decision functions are used to generate the minimal decision rules.
where m[(x, y), (w, z)] = {p_i, q_j : (x, y) D_{p_i} (w, z) with h_{p_i} > k_{p_i}; (x, y) D^1_{q_j} (w, z) and c_{q_j}(x) ≠ c_{q_j}(y) or c_{q_j}(w) ≠ c_{q_j}(z); (x, y) D^2_{q_j} (w, z) and c_{q_j}(x) ≠ c_{q_j}(w) or c_{q_j}(y) ≠ c_{q_j}(z)}.
Definition 2. Let M(S) be the decision matrix of S in SPCT. The decision function of (x, y) with respect to M(S) is defined as

$$f_S[(x, y)] = \bigwedge_{(w, z)} \Big\{ \big(\bigvee_i p_i^*\big) \vee \big(\bigvee_j q_j^*\big) : p_i, q_j \in m[(x, y), (w, z)],\ m[(x, y), (w, z)] \neq \emptyset \Big\}$$

where $p_i^*$ and $q_j^*$ are Boolean variables corresponding to the attributes $p_i$ and $q_j$, respectively.
The decision function fS[(x, y)] is a Boolean function that expresses how a pair (x, y) ∈ S can be discerned from all of the pairs (w, z) ∈ Sc. Turning fS[(x, y)] into disjunctive normal form, the prime implicants of fS[(x, y)] reveal the minimal subsets of P that are needed to discern the pair (x, y) from the pairs in Sc; they correspond to the minimal D≥-decision rules. Similarly, we can define the decision matrix of Sc, M(Sc), and the decision function with respect to M(Sc).
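The following Python sketch illustrates, under simplifying assumptions, how a decision matrix and the Boolean decision function can be assembled and reduced to prime implicants: each matrix cell stores the set of criteria discerning a pair in S from a pair in Sc, the decision function is the conjunction over cells of the disjunction of their criteria, and the small helper functions are hypothetical, not the authors' implementation.

```python
from itertools import product

def decision_matrix(pairs_S, pairs_Sc, discerning_criteria):
    """Build m[(x,y),(w,z)]: the set of criteria discerning (x,y) in S from (w,z) in Sc.
    'discerning_criteria' is a user-supplied function returning that set (an assumption)."""
    return {(p, q): discerning_criteria(p, q) for p, q in product(pairs_S, pairs_Sc)}

def decision_function(matrix, pair):
    """f_S[pair] as a conjunction (list) of disjunctions (frozensets of criteria),
    one disjunction per opposing pair whose discerning set is non-empty."""
    return [frozenset(crit) for (p, q), crit in matrix.items() if p == pair and crit]

def prime_implicants(cnf):
    """Naive expansion of the CNF into minimal criterion subsets (prime implicants),
    sufficient for the small student example."""
    implicants = [frozenset()]
    for clause in cnf:
        implicants = {imp | {c} for imp in implicants for c in clause}
    return [s for s in implicants if not any(t < s for t in implicants)]

# Tiny usage example for the pair (c, c) of Table 1 (criterion names assumed).
pairs_S  = [("c", "c")]
pairs_Sc = [("a", "c"), ("b", "a"), ("b", "c")]
discern = lambda p, q: {"Lit"} if q == ("a", "c") else {"Maths", "Phys"}
M = decision_matrix(pairs_S, pairs_Sc, discern)
cnf = decision_function(M, ("c", "c"))
print(prime_implicants(cnf))   # e.g. [{'Lit', 'Maths'}, {'Lit', 'Phys'}]
```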
An example. Let us consider the example used in [6]. The students are evaluated according to their level in Maths, Phys and Lit. Marks are given on a scale from 0 to 20. The three students presented in Table 1 are considered.
Table 1

Pair of students   Maths    Phys     Lit      Comprehensive outranking relation
(a, a)             18, 18   16, 16   10, 10   S
(a, b)             18, 10   16, 12   10, 18   S
(a, c)             18, 14   16, 15   10, 15   Sc
(b, a)             10, 18   12, 16   18, 10   Sc
(b, b)             10, 10   12, 12   18, 18   S
(b, c)             10, 14   12, 15   18, 15   Sc
(c, a)             14, 18   15, 16   15, 10   S
(c, b)             14, 10   15, 12   15, 18   S
(c, c)             14, 14   15, 15   15, 15   S
The analogous partial preorders can be induced using dominance relations on Phys
and on Lit. The entities to compute the lower and upper approximations of S are
shown in Table 3.
Table 3. Entities used to compute the lower and upper approximations of S

Pair of students: D+_Maths(x, y); D+_Phys(x, y); D+_Lit(x, y); D+_P(x, y)

(a, a): {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(b,b),(c,c)}
(a, b): {(a,b)}; {(a,b)}; A×A (all nine pairs); {(a,b)}
(a, c): {(a,b),(a,c)}; {(a,b),(a,c)}; A×A (all nine pairs); {(a,c)}
(b, a): A×A (all nine pairs); A×A (all nine pairs); {(b,a)}; {(b,a)}
(b, b): {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(b,b),(c,c)}
(b, c): {(a,a),(a,b),(a,c),(b,b),(b,c),(c,b),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(b,c),(c,b),(c,c)}; {(b,a),(b,c)}; {(b,c)}
(c, a): {(a,a),(a,b),(a,c),(b,b),(c,a),(c,b),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,a),(c,b),(c,c)}; {(b,a),(c,a)}; {(c,a)}
(c, b): {(a,b),(c,b)}; {(a,b),(c,b)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}; {(c,b)}
(c, c): {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(b,b),(c,c)}
$\underline{P}(S) = \{(a, a), (a, b), (b, b), (c, a), (c, b), (c, c)\}$, $\overline{P}(S) = \{(a, a), (a, b), (b, b), (c, a), (c, b), (c, c)\}$, $Bn_P(S) = \emptyset$.
The entities to compute the lower and upper approximations of SC are shown in
Table 4.
Table 4. Entities used to compute the lower and upper approximations of Sc

Pair of students: D−_Maths(x, y); D−_Phys(x, y); D−_Lit(x, y); D−_P(x, y)

(a, a): {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,b),(c,c)}
(a, b): A×A (all nine pairs); A×A (all nine pairs); {(a,b)}; {(a,b)}
(a, c): {(a,a),(a,c),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(a,c),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,b),(a,c)}; {(a,c)}
(b, a): {(b,a)}; {(b,a)}; A×A (all nine pairs); {(b,a)}
(b, b): {(a,a),(b,a),(b,b),(b,c),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,b),(c,c)}
(b, c): {(b,a),(b,c)}; {(b,a),(b,c)}; {(a,a),(a,b),(a,c),(b,b),(b,c),(c,b),(c,c)}; {(b,c)}
(c, a): {(b,a),(c,a)}; {(b,a),(c,a)}; {(a,a),(a,b),(a,c),(b,b),(c,a),(c,b),(c,c)}; {(c,a)}
(c, b): {(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}; {(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}; {(a,b),(c,b)}; {(c,b)}
(c, c): {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(b,a),(b,b),(b,c),(c,a),(c,c)}; {(a,a),(a,b),(a,c),(b,b),(c,b),(c,c)}; {(a,a),(b,b),(c,c)}
$\underline{P}(S^c) = \{(a, c), (b, a), (b, c)\}$, $\overline{P}(S^c) = \{(a, c), (b, a), (b, c)\}$, $Bn_P(S^c) = \emptyset$.
The decision matrix of S in SPCT is

M(S) =
           (a, c)         (b, a)         (b, c)
(a, a)     Lit            Maths, Phys    Maths, Phys
(a, b)     Maths, Phys    Maths, Phys    Maths, Phys
(b, b)     Lit            Maths, Phys    Maths, Phys
(c, a)     Lit            Maths, Phys    ∅
(c, b)     ∅              Maths, Phys    Maths, Phys
(c, c)     Lit            Maths, Phys    Maths, Phys
fS[(c, b)]=Maths∨Phys,
fS[(c, c)]=(Lit∧Maths)∨(Lit∧Phys).
Then, the following minimal D≥-decision rules can be obtained:
IF (cMaths(x) ≥ 18 and cMaths(y) ≤ 18) and (cLit(x) ≥ 10 and cLit(y) ≤ 10) THEN xSy, (a, a),
IF (cPhys(x) ≥ 16 and cPhys(y) ≤ 16) and (cLit(x) ≥ 10 and cLit(y) ≤ 10) THEN xSy, (a, a),
IF cMaths(x) ≥ 18 and cMaths(y) ≤ 10 THEN xSy, (a, b),
IF cPhys(x) ≥ 16 and cPhys(y) ≤ 12 THEN xSy, (a, b),
IF (cMaths(x) ≥ 10 and cMaths(y) ≤ 10) and (cLit(x) ≥ 18 and cLit(y) ≤ 18) THEN xSy, (b, b),
IF (cPhys(x) ≥ 12 and cPhys(y) ≤ 12) and (cLit(x) ≥ 18 and cLit(y) ≤ 18) THEN xSy, (b, b),
IF (cMaths(x) ≥ 14 and cMaths(y) ≤ 18) and (cLit(x) ≥ 15 and cLit(y) ≤ 10) THEN xSy, (c, a),
IF (cPhys(x) ≥ 15 and cPhys(y) ≤ 16) and (cLit(x) ≥ 15 and cLit(y) ≤ 10) THEN xSy, (c, a),
IF cMaths(x) ≥ 14 and cMaths(y) ≤ 10 THEN xSy, (c, b),
IF cPhys(x) ≥ 15 and cPhys(y) ≤ 12 THEN xSy, (c, b),
IF (cMaths(x) ≥ 14 and cMaths(y) ≤ 14) and (cLit(x) ≥ 15 and cLit(y) ≤ 15) THEN xSy, (c, c),
IF (cPhys(x) ≥ 15 and cPhys(y) ≤ 15) and (cLit(x) ≥ 15 and cLit(y) ≤ 15) THEN xSy, (c, c).
Similarly, we can obtain the D≤-decision rules as follows.
IF (cMaths(x) ≤ 18 and cMaths(y) ≥ 14) and (cLit(x) ≤ 10 and cLit(y) ≥ 15) THEN xScy, (a, c),
IF (cPhys(x) ≤ 16 and cPhys(y) ≥ 15) and (cLit(x) ≤ 10 and cLit(y) ≥ 15) THEN xScy, (a, c),
IF cMaths(x) ≤ 10 and cMaths(y) ≥ 18 THEN xScy, (b, a),
IF cPhys(x) ≤ 18 and cPhys(y) ≥ 10 THEN xScy, (b, a),
IF cMaths(x) ≤ 10 and cMaths(y) ≥ 14 THEN xScy, (c, b),
IF cPhys(x) ≤ 18 and cPhys(y) ≥ 15 THEN xScy, (c, b).
4 Conclusions
Learning decision rules from preference-ordered data differs from usual machine learn-
ing, since the former involves preference orders in domains of attributes and in the set of
decision classes. This requires that a knowledge discovery method applied to prefer-
ence-ordered data respects the dominance principle, which is addressed in the Domi-
nance-Based Rough Set Approach. This approach enables us to apply a rough set
approach to multicriteria choice and ranking. In this paper, we propose the concepts of
decision matrices and decision functions to generate the minimal decision rules from a
pairwise comparison table. Then, the decision rules can be used to obtain a recommen-
dation in multicriteria choice and ranking problems. We will further present some
extensions of the approach that make it a useful tool for other practical applications.
References
1. Keeney, R.L., Raiffa, H.: Decision with Multiple Objectives-Preferences and Value Trade-
offs. Wiley, New York (1976)
2. Roy, B.: The Outranking Approach and the Foundation of ELECTRE Methods. Theory
and Decision 31(1), 49–73 (1991)
3. Fodor, J., Roubens, M.: Fuzzy Preference Modelling and Multicriteria Decision Support.
Kluwer, Dordrecht (1994)
4. Zopounidis, C., Doumpos, M.: Multicriteria Classification and Sorting Methods: A litera-
ture Review. European Journal of Operational Research 138(2), 229–246 (2002)
5. Slovic, P.: Choice between Equally-valued Alternatives. Journal of Experimental Psychol-
ogy: Human Perception Performance 1, 280–287 (1975)
6. Greco, S., Matarazzo, B., Slowinski, R.: Rough Set Theory for Multicriteria Decision
Analysis. European Journal of Operational Research 129(1), 1–47 (2001)
7. Fortemps, P., Greco, S., Slowinski, R.: Multicriteria Decision Support Using Rules That
Represent Rough-Graded Preference Relations. European Journal of Operational Re-
search 188(1), 206–223 (2008)
8. Roy, B.: Méthodologie multicritère d’aide à la décision. Economica, Paris (1985)
9. Pawlak, Z.: Rough Sets. International Journal of Computer and Information
Sciences 11(5), 341–356 (1982)
10. Pawlak, Z., Slowinski, R.: Rough Set Approach to Multi-attribute Decision Analysis.
European Journal of Operational Research 72(3), 443–459 (1994)
11. Greco, S., Matarazzo, B., Slowinski, R.: Rough Sets Methodology for Sorting Problems in
Presence of Multiple Attributes and Criteria. European Journal of Operational Re-
search 138(2), 247–259 (2002)
12. Greco, S., Matarazzo, B., Slowinski, R.: Extension of the Rough Set Approach to Multicri-
teria Decision Support. INFOR 38(3), 161–193 (2000)
13. Greco, S., Matarazzo, B., Slowinski, R.: Rough Approximation of a Preference Relation by
Dominance Relations. European Journal of Operational Research 117(1), 63–83 (1999)
14. Skowron, A., Rauszer, C.: The Discernibility Matrices and Functions in Information Table.
In: Intelligent Decision Support: Handbook of Applications and Advances of the Rough
Set Theory, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1991)
A Novel Hybrid Particle Swarm Optimization
for Multi-Objective Problems
1 Introduction
Multi-objective Evolutionary Algorithms (MOEAs) are powerful tools for solving multi-objective optimization problems (MOPs) and have gained popularity in recent years [1]. Recently, several elitist algorithms have been proposed, such as NSGA-II [2], SPEA2 [3], GDE3 [4] and MOPSO [5, 6, 7].
NSGA-II adopts a fast non-dominated sorting approach to reduce the computational burden and uses ranking and crowding distance to choose the candidate solutions [2]. SPEA2 proposes a fitness assignment with a clustering technique and designs a truncation operator based on the nearest-neighbor density estimation metric [3]. GDE3 is a developed version of Differential Evolution suited to global optimization with an arbitrary number of objectives and constraints [4].
MOPSO, proposed by Coello et al., adopts swarm intelligence to optimize MOPs and uses the Pareto-optimal set to guide the particles' flight [5]. Sierra adopts crowding distance to filter the leaders, with different mutation methods acting on subdivisions of the particles [6]; Mostaghim introduces a new Sigma method to find local best information to guide the particles [7].
The project was supported by the Research Foundation for Outstanding Young Teachers, China University of Geosciences (Wuhan) (No. CUGQNL0911).
For one decision variable Xj with boundary [lj, uj], the quantization technique divides the domain into Q levels αj1, αj2, · · · , αjQ, where the design parameter
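Although the passage above is truncated, the quantization step it describes can be sketched as follows; the even spacing of the Q levels and the way a uniform design table row selects one level per variable are assumptions made for illustration.

```python
def quantize_domain(lower, upper, Q):
    """Divide [lower, upper] into Q evenly spaced levels alpha_1 .. alpha_Q."""
    if Q == 1:
        return [0.5 * (lower + upper)]
    step = (upper - lower) / (Q - 1)
    return [lower + i * step for i in range(Q)]

def initial_population(bounds, design_table, Q):
    """Map each row of a uniform design table (entries in 1..Q) to a candidate
    solution by picking the corresponding quantized level of every variable."""
    levels = [quantize_domain(lo, hi, Q) for lo, hi in bounds]
    return [[levels[j][row[j] - 1] for j in range(len(bounds))] for row in design_table]

# Example: two variables on [0, 1] and [-5, 5], three rows of a 5-level design table.
print(initial_population([(0.0, 1.0), (-5.0, 5.0)], [[1, 3], [5, 2], [3, 4]], Q=5))
# -> [[0.0, 0.0], [1.0, -2.5], [0.5, 2.5]]
```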
When an MOEA obtains a set of equally good solutions that fills the preset archive size (usually archiveSize = 100), an acceptance rule must be designed to decide which solution should be cut from the archive. This is a critical issue in MOEAs that directly influences the convergence and spread of the final optimal set. Several popular acceptance rules have been presented: NSGA-II adopts the ranking and crowding distance metric, and SPEA2 uses the nearest-neighbor density estimation metric.
In this paper, we design a new acceptance rule called Minimum Reduce Hypervolume (MRV). Hypervolume is a quality indicator proposed by Zitzler et al. and is adopted in jMetal 2.1 [14]. It calculates the volume covered by the members of a non-dominated set of solutions (in Fig. 1, the region enclosed by the dashed line with respect to the worst point W is ADBECW).
If two solutions D and E are non-dominated with respect to each other, NSGA-II keeps solution D in the archive if CD(D) > CD(E); this maintains the spread along the Pareto front.

$$CD(D) = AD' + D'B, \qquad CD(E) = BE' + E'C \qquad (6)$$
(Figure annotations: CD(D) = AD' + D'B and hv(D) = DD1 · DD2; CD(E) = BE' + E'C and hv(E) = EE1 · EE2.)
Fig. 1. Comparison of crowding distance (left) and MRV (right)
The factor scale is designed to balance the influence of crowding distance and MRV; if it becomes too large it is capped, i.e. scale is set to 1000 when scale > 1000.
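A minimal two-objective sketch of the two acceptance criteria discussed above is given below: the crowding distance of a point is the sum of the side lengths of the box spanned by its neighbours, and the MRV of a point is approximated by the rectangle it exclusively contributes to the hypervolume (its loss if removed); the boundary handling and the rectangle formula are assumptions of this sketch rather than the paper's exact procedure.

```python
def crowding_distance(points, i):
    """Crowding distance of points[i] on a 2-objective front sorted by f1
    (boundary points get infinity, as in NSGA-II)."""
    if i == 0 or i == len(points) - 1:
        return float("inf")
    left, right = points[i - 1], points[i + 1]
    return abs(right[0] - left[0]) + abs(right[1] - left[1])

def mrv(points, i):
    """Hypervolume lost if points[i] is removed (its exclusive rectangle),
    for a minimization front sorted by increasing f1 (hence decreasing f2)."""
    if i == 0 or i == len(points) - 1:
        return float("inf")          # always keep the extreme points
    left, right = points[i - 1], points[i + 1]
    p = points[i]
    return (right[0] - p[0]) * (left[1] - p[1])

front = sorted([(0.0, 1.0), (0.2, 0.75), (0.5, 0.5), (0.9, 0.1), (1.0, 0.0)])
scores = [(mrv(front, i), i) for i in range(len(front))]
print(min(scores))   # the point whose removal reduces the hypervolume least
```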
5 Experiment Results
The experiments are based on jMetal 2.1 [14], a Java-based framework aimed at facilitating the development of metaheuristics for solving MOPs; it provides extensive reusable code and fair comparison of different MOEAs. The paper selects the algorithms OMOPSO [6] and SMPSO [14] as comparison baselines. Each algorithm is run independently 100 times with a maximum of 25,000 evaluations. The test problems are chosen from the ZDT and DTLZ problem families. The performance metrics fall into five categories:
Unary additive epsilon indicator ($I_{\epsilon+}^1$). The indicator of an approximation set A, $I_{\epsilon+}^1(A)$, gives the minimum factor ε by which each point in the real front can be translated such that the resulting transformed approximation set is dominated by A:

$$I_{\epsilon+}^1(A) = \inf_{\epsilon \in \mathbb{R}} \{\forall z^2 \in R\ \exists z^1 \in A : z_i^2 \leq z_i^1 + \epsilon\ \forall i\} \qquad (9)$$
Hypervolume. This quality indicator calculates the volume (in the objective space) covered by the members of a non-dominated set of solutions with respect to a reference point. The hypervolume (HV) is calculated by

$$HV = \mathrm{volume}\Big(\bigcup_{i=1}^{|Q|} v_i\Big) \qquad (10)$$
Inverted Generational Distance. This metric measures how far the elements of the Pareto-optimal set are from the set of non-dominated vectors found. It is defined as

$$IGD = \frac{\sqrt{\sum_{i=1}^{N} d_i^2}}{N} \qquad (11)$$
Generational Distance. This metric measures how far the elements of the set of non-dominated vectors found are from the Pareto-optimal set. It is defined as

$$GD = \frac{\sqrt{\sum_{i=1}^{n} d_i^2}}{n} \qquad (12)$$
Spread. The Spread indicator is a diversity metric that measures the extent of spread achieved among the obtained solutions. This metric is defined as

$$\Delta = \frac{d_f + d_l + \sum_{i=1}^{n-1} |d_i - \bar{d}|}{d_f + d_l + (n-1)\bar{d}} \qquad (13)$$
A higher hypervolume and lower $I_{\epsilon+}^1$, GD, IGD and Spread indicate a better algorithm. The results are compared using the median and the interquartile range (IQR), which measure location (central tendency) and statistical dispersion, respectively; the best result is shown with a grey background.
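For reference, here is a compact Python sketch of four of these indicators (additive epsilon, GD, IGD and Spread) for two-objective minimization fronts; the Euclidean distances and the handling of the extreme points follow the usual jMetal conventions as far as we understand them, so the details should be read as assumptions rather than the exact code used in the experiments.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def additive_epsilon(front, reference):
    """Additive epsilon indicator: smallest shift such that every reference point
    is weakly dominated by some shifted point of the front (standard definition)."""
    return max(min(max(p[i] - r[i] for i in range(len(r))) for p in front)
               for r in reference)

def gd(front, true_front):
    """Generational distance, eq. (12)."""
    d = [min(dist(p, t) for t in true_front) for p in front]
    return math.sqrt(sum(x * x for x in d)) / len(front)

def igd(front, true_front):
    """Inverted generational distance, eq. (11)."""
    d = [min(dist(t, p) for p in front) for t in true_front]
    return math.sqrt(sum(x * x for x in d)) / len(true_front)

def spread(front, extremes):
    """Spread indicator, eq. (13), for a 2-objective front sorted by f1;
    'extremes' are the two extreme points of the Pareto-optimal front."""
    front = sorted(front)
    d = [dist(front[i], front[i + 1]) for i in range(len(front) - 1)]
    d_mean = sum(d) / len(d)
    df = dist(extremes[0], front[0])
    dl = dist(extremes[1], front[-1])
    return (df + dl + sum(abs(x - d_mean) for x in d)) / (df + dl + len(d) * d_mean)
```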
From Table 2, in terms of median and IQR for the additive epsilon metric, HPSODE gets the best results on all 12 MOPs.
From Table 3, in terms of median and IQR for the hypervolume metric, HPSODE gets the best results on all 12 MOPs.
From Table 4, in terms of median and IQR for the generational distance metric, HPSODE gets the best results on 11 MOPs and is worse only on DTLZ6; OMOPSO gets the best result on only 1 MOP, DTLZ6.
From Table 5, in terms of median and IQR for the inverted generational distance metric, HPSODE gets the best results on 10 MOPs and is worse only on ZDT2 and DTLZ6; OMOPSO gets the best result only on ZDT2, and SMPSO only on DTLZ6.
Table 2. Unary additive epsilon indicator ($I_{\epsilon+}^1$): median and IQR
From Table 6, in terms of median and IQR for the Spread metric, HPSODE gets the best results on all 6 MOPs; OMOPSO gets the best results on only 2 MOPs and SMPSO on only 4 MOPs.
The comparison of the three algorithms in five categories (epsilon indicator, hypervolume, generational distance, inverted generational distance and Spread) shows that the new algorithm is more efficient at solving the MOPs. We summarize the highlights as follows:
1. HPSODE adopts the statistical method of uniform design to construct the initial population, which yields well-distributed solutions in the feasible space.
2. HPSODE combines PSO and Differential Evolution operations to generate the next population; the DE operation enhances the diversity of the global guide population.
References
1. Coello, C.A.C.: Evolutionary multi-objective optimization: A historical view of the
Field. IEEE Computational Intelligence Magazine 1(1), 28–36 (2006)
2. Deb, K., Pratap, A., Agarwal, S., Meyarivan, T.: A fast and elitist multiobjec-
tive genetic algorithm: NSGA - II. IEEE Transactions on Evolutionary Computa-
tion 6(2), 182–197 (2002)
3. Zitzler, E., Laumanns, M., Thiele, L.: SPEA2: Improving the strength Pareto evo-
lutionary algorithm, Technical Report 103, Computer Engineering and Networks
Laboratory (2001)
4. Kukkonen, S., Lampinen, J.: GDE3: The third evolution step of generalized dif-
ferential evolution. In: Proceedings of the 2005 IEEE Congress on Evolutionary
Computation (1), pp. 443–450 (2005)
5. Coello Coello, C.A., Toscano Pulido, G., Salazar Lechuga, M.: Handling Multiple
Objectives With Particle Swarm Optimization. IEEE Transactions on Evolutionary
Computation 8, 256–279 (2004)
6. Sierra, M.R., Coello, C.A.C.: Improving PSO-based multi-objective optimization
using crowding, mutation and ε-dominance. In: Coello Coello, C.A., Hernández
Aguirre, A., Zitzler, E. (eds.) EMO 2005. LNCS, vol. 3410, pp. 505–519. Springer,
Heidelberg (2005)
7. Mostaghim, S., Teich, J.: Strategies for Finding Good Local Guides in Multi-
objective Particle Swarm Optimization (MOPSO). In: 2003 IEEE Swarm Intel-
ligence Symposium Proceedings, Indianapolis, Indiana, USA, pp. 26–33. IEEE
Service Center (2003)
8. Zhang, W.J., Xie, X.F.: DEPSO: Hybrid particle swarm with differential evolution
operator. In: IEEE International Conference on Systems Man and Cybernetics, pp.
3816–3821 (2003)
9. Fang, K.T., Ma, C.X.: Orthogonal and uniform design. Science Press (2001)
(in Chinese)
A Novel Hybrid Particle Swarm Optimization 37
10. Leung, Y.W., Wang, Y.: An orthogonal genetic algorithm with quantization for
global numerical optimization. IEEE Transactions on Evolutionary Computa-
tion 5(1), 41–53 (2001)
11. Zeng, S.Y., Kang, L.S., Ding, L.X.: An orthogonal multiobjective evolutionary
algorithm for multi-objective optimization problems with constraints. Evolutionary
Computation 12, 77–98 (2004)
12. Cai, Z.H., Gong, W.Y., Huang, Y.Q.: A novel differential evolution algorithm based
on -domination and orthogonal design method for multiobjective optimization.
In: Obayashi, S., Deb, K., Poloni, C., Hiroyasu, T., Murata, T. (eds.) EMO 2007.
LNCS, vol. 4403, pp. 286–301. Springer, Heidelberg (2007)
13. Leung, Y.-W., Wang, Y.: Multiobjective programming using uniform design and
genetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part
C 30(3), 293 (2000)
14. Durillo, J.J., Nebro, A.J., Luna, F., Dorronsoro, B., Alba, E.: jMetal: A
Java Framework for Developing Multi-Objective Optimization Metaheuristics,
Departamento de Lenguajes y Ciencias de la Computación, University of
Málaga, E.T.S.I. Informática, Campus de Teatinos, ITI-2006-10 (December 2006),
http://jmetal.sourceforge.net
Research on Constrained Layout Optimization Problem
Using Multi-adaptive Strategies Particle Swarm
Optimizer∗
Kaiyou Lei
Faculty of Computer & Information Science, Southwest University, Chongqing, 400715, China
[email protected]
1 Introduction
The classical layout problem is generally divided into two types: the packing problem and the cutting problem. The main goal is to increase the space utilization ratio as much as possible under the condition of non-overlap between pieces (objects) and between each piece (object) and the container. These are called layout problems without behavioral constraints (unconstrained layout). In recent years, more complex layout problems have been attracting a lot of attention, such as the layout design of engineering machines, spacecraft, ships, etc. Solving these kinds of problems requires considering additional behavioral constraints, for instance inertia, equilibrium, stability and vibration. They are called layout problems with behavioral constraints (constrained layout). Constrained layout problems belong to the class of NP-hard problems, and their optimization is highly difficult [1].
As a newly developed population-based computational intelligence algorithm, Particle Swarm Optimization (PSO) originated as a simulation of a simplified social model of birds in a flock [2]. The PSO algorithm is easy to implement and has proven very competitive in a large variety of global optimization problems and application areas compared to conventional methods and other meta-heuristics [3].
∗ The work is supported by Key Project of Chinese Ministry of Education (104262).
Since its introduction, numerous variations of the basic PSO algorithm have been developed in the literature to avoid premature convergence and to speed up the convergence process, which are the two most important topics in the research of stochastic search methods. To make the search more effective, researchers have suggested many approaches, such as various mutation schemes and single inertia-weight selection methods, but these methods share a common weakness: they usually cannot attend to both global search and local search properly, and thus become trapped in local optima, especially on complex constrained layout problems [4].
In this paper, a particle swarm optimizer with better search performance is proposed. It employs multi-adaptive strategies to plan large-scale global search and refined local search as a whole, according to the specialties of constrained layout problems, in order to quicken convergence, avoid premature convergence, economize computational expense and obtain the global optimum. We tested the proposed algorithm and compared it with other published methods on three constrained layout examples. The experimental results demonstrate that this revised algorithm converges rapidly to high-quality solutions.
Suppose that the center of the graph plane is the origin o of the Cartesian system, the graph plane and graph units lie in the plane xoy, the thickness of the graph units is ignored, and (xi, yi) are the coordinates of the center oi of graph unit i, which is also its mass center. The mathematical model for optimization of the problem is given by:
$\min F(X_i) = \max_i \left\{ \sqrt{x_i^{2} + y_i^{2}} + r_i \right\}$  (1)

s.t.

$f_2(X_i) = \sqrt{x_i^{2} + y_i^{2}} + r_i - R \le 0, \quad i \in I$  (3)

$f_3(X_i) = \left( \sum_{i=1}^{n} m_i x_i \right)^{2} + \left( \sum_{i=1}^{n} m_i y_i \right)^{2} - [\delta J] \le 0, \quad i \in I$  (4)
$v_{id}^{t+1} = w \cdot v_{id}^{t} + c_1 r_1 \left( p_{id}^{t} - x_{id}^{t} \right) + c_2 r_2 \left( p_{gd}^{t} - x_{id}^{t} \right)$  (5)
Constants c1 and c2 are learning rates; r1 and r2 are random numbers uniformly distributed in the interval [0, 1]; w is an inertia factor.
To speed up the convergence process and avoid premature convergence, Shi proposed PSO with a linearly decreasing weight (LDWPSO) [3,4]. Suppose wmax is the maximum inertia weight, wmin is the minimum inertia weight, run is the current iteration number and runMax is the total number of iterations; the inertia weight is formulated as:
$w = w_{\max} - (w_{\max} - w_{\min}) \cdot run / runMax$  (7)
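A minimal Python sketch of the standard velocity update (5) together with the linearly decreasing inertia weight (7); the default parameter values and function names are assumptions.

import numpy as np

def ldw_inertia(run, run_max, w_max=0.9, w_min=0.4):
    return w_max - (w_max - w_min) * run / run_max          # equation (7)

def velocity_update(v, x, pbest, gbest, w, c1=1.5, c2=1.5):
    r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # equation (5)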
The inertia weight w can automatically harmonize global and local search abilities, avoid premature convergence and yield rapid convergence to the global optimum. First of all, a larger w enhances the global search ability of PSO, so that the algorithm explores a large-scale search space and rapidly locates the approximate position of the global optimum, while a smaller w enhances the local search ability, so that particles slow down and deploy a refined local search. Secondly, the more difficult the optimization problem is, the stronger the global search ability needs to be; once the approximate position of the global optimum is located, the refined local search is further strengthened to reach the global optimum [5,6,7].
According to the conclusions above, we first constructed equation (8) as a new inertia weight decline curve for PSO, as demonstrated in Figure 2.
3.2 Adaptive Difference Mutation Strategy of the Global Optimum $p_{gd}^{t}$
Considering that the particles may find a better global optimum in the current best region, the algorithm is designed to include a mutation operation with a perturbation operator. The value runmax1, which is the iteration number of the transformation point, divides runMax into two segments that mutate according to their own characteristics, so as to further enhance the global and local search abilities and find a satisfactory solution. pη is the mutation probability within (0.1, 0.3). The computation equation is defined as:
In equation (3), the graph plane radius is a constant, which influences the fitness computation and the search result. In the search process, the smaller the periphery envelope circle radius is, the more converged the graph units are. Evidently, taking the current periphery envelope circle radius as the layout radius lets the search progress in a smaller radius area, which quickens convergence and economizes computational expense. Suppose the periphery envelope circle radius is RS; the computation equation is defined as:
if $R_S^{t+1} > R_S^{t}$ then $R_S^{t+1} = R_S^{t}$, else $R_S^{t+1} = R_S^{t+1}$  (10)
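A minimal sketch of the adaptive layout-radius rule (10): the radius used as the layout radius is never allowed to grow, so the search is confined to an ever smaller area; the helper name is an assumption.

def update_layout_radius(r_prev, r_candidate):
    # equation (10): keep the previous radius if the new envelope radius is larger
    return r_prev if r_candidate > r_prev else r_candidate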
Analyzing equation (5), if the best position of all particles does not change for a long time, the velocity update is dominated by w·v_id^t. Since w < 1, the velocity becomes smaller and smaller, so the particle swarm tends to cluster together; furthermore, the algorithm is trapped in local optima and the premature-convergence problem emerges, as in genetic algorithms. Therefore the adaptive difference mutation strategy of Xi is introduced. When the best position of all particles shows no or little change for a long time, part of the particles are kept and continue the refined local search, while the rest are re-initialized randomly, so as to enhance the global search ability of PSO and simultaneously break away from the attraction of local optima. Suppose runt is an iteration interval; runk and runk+t are the k-th and (k+t)-th iteration numbers; Rk and Rk+t are the corresponding maximal radii; Xρ are the randomly re-initialized particles; ρ is the dimension mutation probability; and ε is an error threshold. The computation equation is defined as:
PSO, modified as MASPSO by the above methods, has excellent search performance for optimizing constrained layout problems. Based on MASPSO, the design of the algorithm is as follows.
All particles are coded based on the rectangular plane axis system. Since the shortest periphery envelope circle radius is the optimization criterion under the above constraint conditions for our problem, the fitness function and the penalty function constructed in MASPSO can be defined, respectively, as:
$\varphi_1(X_i) = F(X_i) + \sum_{i=1}^{3} \lambda_i f_i(X_i) u_i(f_i)$  (12)

$u_i(f_i) = \begin{cases} 0 & f_i(X_i) \le 0 \\ 1 & f_i(X_i) > 0 \end{cases}, \quad i \in I.$
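A minimal sketch of the penalized fitness (12) built directly on the objective (1) and constraints (3)–(4); the penalty coefficient is an assumption, and the non-overlap constraint f1 is omitted because equation (2) is not reproduced above.

import numpy as np

def penalized_fitness(x, y, r, m, R, delta_J, lam=1e3):
    # phi_1(X) = F(X) + sum_i lambda_i f_i(X) u_i(f_i), eq. (12); u_i = 1 only when f_i > 0
    F = np.max(np.sqrt(x**2 + y**2) + r)                       # objective (1)
    f2 = np.max(np.sqrt(x**2 + y**2) + r - R)                  # containment constraint (3)
    f3 = (np.sum(m * x))**2 + (np.sum(m * y))**2 - delta_J     # equilibrium constraint (4)
    return F + sum(lam * f for f in (f2, f3) if f > 0)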
4 Computational Experiments
Taking the literature [8], [9] as examples, we tested our algorithm; comparisons of the statistical results are shown in Table 1, Table 2 and Table 3, respectively.
The parameters used in our algorithm are set to c1 = c2 = 1.5 and runMax = 1000. The running environment is MATLAB 7.1 on a Pentium IV 2 GHz CPU with 256 MB RAM under Windows XP.
Example 1. Seven graph units are contained in the layout problem. The radius of the graph plane is R = 50 mm. The allowed value of the static non-equilibrium J is [gj] = 3.4 g·mm. The results are given in Table 1 and the geometric layout is shown in Figure 3 (the swarm size is 50).
Number | r (mm) | m (g) | Literature [8]: x (mm), y (mm) | Literature [9]: x (mm), y (mm) | Our result: x (mm), y (mm)
1 | 10.0 | 100.00 | -12.883, 17.020 | 14.367, 16.453 | 17.198, -13.618
2 | 11.0 | 121.00 | 8.8472, 19.773 | -18.521, -9.560 | -19.857, 6.096
3 | 12.0 | 144.00 | 0.662, 0.000 | 2.113, -19.730 | -1.365, 19.886
4 | 11.5 | 132.00 | -8.379, -19.430 | 19.874, -4.340 | 18.882, 7.819
5 | 9.5 | 90.25 | -1.743, 0.503 | -19.271, 11.241 | -16.967, -14.261
6 | 8.5 | 72.25 | 12.368, -18.989 | -3.940, 22.157 | -0.606, -22.873
7 | 10.5 | 110.25 | -21.639, -1.799 | -0.946, 2.824 | -0.344, -3.010
Table 1. (continued)
(c) Comparison of the Example 1 layout results over 40 runs of the algorithm
The least circle radius including all graph units (mm) | Times (Literature [9] algorithm) | Times (Our algorithm)
≤32.3 | 10 | 19
(32.3, 32.5] | 16 | 11
(32.5, 32.7] | 5 | 5
(32.7, 32.9] | 2 | 3
(32.9, 33.1] | 2 | 1
(33.1, 33.3] | 2 | 1
>33.3 | 3 | 0
Fig. 3. (a) Literature [8] layout result; (b) Literature [9] layout result; (c) our layout result
Example 2. Forty graph units are contained in the layout problem. The radius of the graph plane is R = 880 mm. The allowed value of the static non-equilibrium J is [gj] = 20 g·mm. The results are given in Table 2 and the geometric layout is shown in Figure 4 (the swarm size is 100).
Table 2. (continued)
5 Conclusions
From table1and table2, we can educe that: PSO with four improved adaptive strate-
gies harmonized the large-scale space global search abilities and the refined local
search abilities much thoroughly, which has rapid convergence and can avoid prema-
ture, synchronously. The effectiveness of the algorithm is validated for the con-
strained layout of the NP-hard problems; the algorithm outperformed the known best
ones in the in quality of solutions and the running time. In addition, the parameters of
、、
runt ε ρ are chose by human experience, which has the certain blemish. How to
choose the rather parameters are one of the future works.
References
1. Teng, H., Shoulin, S., Wenhai, G., et al.: Layout optimization for the dishes installed on
rotating table. Science in China (Series A) 37(10), 1272–1280 (1994)
2. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proc. of IEEE Int’l Conf.
Neural Networks, pp. 1942–1948. IEEE Computer Press, Indianapolis (1995)
3. Eberhart, R.C., Kennedy, J.: A new optimizer using particles swarm theory. In: Sixth Inter-
national Symposium on Micro Machine and Human Science, pp. 39–43. IEEE Service Cen-
ter, Piscataway (1995)
4. Angeline, P.J.: Using selection to improve particle swarm optimization. In: Proc. IJCNN
1999, pp. 84–89. IEEE Computer Press, Indianapolis (1999)
5. Lei, K., Qiu, Y., He, Y.: A new adaptive well-chosen inertia weight strategy to automati-
cally harmonize global and local search ability in particle swarm optimization. In: 1st Inter-
national Symposium on Systems and Control in Aerospace and Astronautics, pp. 342–346.
IEEE Press, Harbin (2006)
6. Shi, Y., Eberhart, R.C.: A modified particle swarm optimizer. In: Proc. of the IEEE Con.
Evolutionary Computation, pp. 69–73. IEEE Computer Press, Piscataway (1998)
7. Jianchao, Z., Zhihua, C.: A guaranteed global convergence particle swarm optimizer. Jour-
nal of Computer Research and Development 4(8), 1334–1338 (2004) (in Chinese)
8. Fei, T., Hongfei, T.: A modified genetic algorithm and its application to layout optimization.
Journal of Software 10(10), 1096–1102 (1999) (in Chinese)
9. Ning, L., Fei, L., Debao, S.: A study on the particle swarm optimization with mutation
operator constrained layout optimization. Chinese Journal of Computers 27(7), 897–903
(2004) (in Chinese)
ARIMA Model Estimated by Particle Swarm
Optimization Algorithm for
Consumer Price Index Forecasting
1 Introduction
The ARIMA model, which is one of the most popular models for time series forecasting analysis, originated from the autoregressive model (AR) proposed by Yule in 1927, the moving average model (MA) invented by Walker in 1931, and the combination of the AR and MA, the ARMA models [1]. The ARMA model can be used when the time series is stationary, while the ARIMA model does not have that limitation: it pre-processes the original time series to make the obtained series stationary so that the ARMA model can then be applied. For example, for an original series that has only serial dependency and no seasonal dependency, if the d-th differenced series is stationary, we can apply the ARMA(p, q) model to the d-th differenced series, and we say the original time series satisfies the modeling condition of ARIMA(p, d, q), which is one kind of ARIMA model. So the
Then we can obtain a stationary time series with zero mean {x1, x2, . . . , xN−d}, where xt = zt − z̄. So the problem of modeling ARIMA(p, d, q) for {y1, y2, . . . , yN} converts to the problem of modeling ARMA(p, q) for {x1, x2, . . . , xN−d}. The generalized form of ARMA(p, q) can be described as follows
where
$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^{2} - \cdots - \varphi_p B^{p}$
$\theta(B) = 1 - \theta_1 B - \theta_2 B^{2} - \cdots - \theta_q B^{q}$
and at is stationary white noise with zero mean, ϕi (i = 1, 2, . . . , p) and θi (i =
1, 2, . . . , q) are coefficients of the ARMA model.
On the other hand, for identifying the orders of the ARMA model, autocorrelation and partial autocorrelation graphs, drawn based on 1 ∼ M lag
numbers, provide information about the AR and MA orders (p and q) [1]. Concretely, p is determined as the number of leading significant coefficients in the partial autocorrelation graph, and similarly q is determined as the number of leading significant coefficients in the autocorrelation graph.
Then we can estimate {θi} and σa², which is the variance of {at}, from (7):

$\hat{\gamma}_0(\bar{x}) = \hat{\sigma}_a^{2} (1 + \hat{\theta}_1^{2} + \cdots + \hat{\theta}_q^{2})$
$\hat{\gamma}_k(\bar{x}) = \hat{\sigma}_a^{2} (-\hat{\theta}_k + \hat{\theta}_{k+1}\hat{\theta}_1 + \cdots + \hat{\theta}_q\hat{\theta}_{q-k}), \quad k = 1, 2, \ldots, q$  (7)

where $\hat{\gamma}_k(\bar{x}) = \sum_{j=0}^{p} \sum_{l=0}^{p} \hat{\varphi}_j \hat{\varphi}_l \hat{\gamma}_{k+l-j}, \quad k = 0, 1, \ldots, q$
So if q = 1, according to (7) and the invertibility (that is, |θ1| < 1) of the ARMA(p, q) model, we can calculate σa² and θ1 using (8) and (9):

$\hat{\sigma}_a^{2} = \frac{\hat{\gamma}_0(\bar{x}) + \sqrt{\hat{\gamma}_0^{2}(\bar{x}) - 4\hat{\gamma}_1^{2}(\bar{x})}}{2}$  (8)

Then

$\hat{\theta}_1 = -\frac{\hat{\gamma}_1(\bar{x})}{\hat{\sigma}_a^{2}}$  (9)
But if q > 1, it is difficult to solve (7) directly. In this case, we can rewrite (7) as (10) and solve it by a linear iterative method:

$\hat{\sigma}_a^{2} = \hat{\gamma}_0(\bar{x}) \left(1 + \hat{\theta}_1^{2} + \cdots + \hat{\theta}_q^{2}\right)^{-1}$
$\hat{\theta}_k = -\left( \frac{\hat{\gamma}_k(\bar{x})}{\hat{\sigma}_a^{2}} - \hat{\theta}_{k+1}\hat{\theta}_1 - \cdots - \hat{\theta}_q\hat{\theta}_{q-k} \right), \quad k = 1, 2, \ldots, q$  (10)
At first, give initial values for {θ̂i} and σ̂a², for example θ̂i = 0 (i = 1, 2, . . . , q) and σ̂a² = γ̂0(x̄), and mark them as {θ̂i(0)} and σ̂a²(0). Substituting them into (10), we obtain {θ̂i(1)} and σ̂a²(1); the rest may be deduced by analogy, so we obtain {θ̂i(m)} and σ̂a²(m). If max{|θ̂i(m) − θ̂i(m − 1)|, |σ̂a²(m) − σ̂a²(m − 1)|} is very small, we can take {θ̂i(m)} and σ̂a²(m) as the approximate estimates of {θi} and σa².
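A minimal Python sketch of this linear iterative scheme (10); the autocovariance input, the stopping tolerance and whether the new or old σ̂a² is used inside the θ update are assumptions.

import numpy as np

def iterate_ma_moments(gamma, q, tol=1e-8, max_iter=500):
    # gamma: array [gamma_0(x), ..., gamma_q(x)] of the autocovariances appearing in (7)
    theta = np.zeros(q + 1)            # theta[1..q]; index 0 unused
    sigma2 = gamma[0]
    for _ in range(max_iter):
        sigma2_new = gamma[0] / (1.0 + np.sum(theta[1:] ** 2))
        theta_new = np.zeros(q + 1)
        for k in range(1, q + 1):
            cross = sum(theta[k + j] * theta[j] for j in range(1, q - k + 1))
            theta_new[k] = -(gamma[k] / sigma2_new - cross)
        if max(np.max(np.abs(theta_new - theta)), abs(sigma2_new - sigma2)) < tol:
            return theta_new[1:], sigma2_new
        theta, sigma2 = theta_new, sigma2_new
    return theta[1:], sigma2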
From the above, we can see that the moment estimation is cumbersome to carry out. Moreover, when q > 1, the approximate results for the parameters are unpredictable, and this might greatly impact the forecasting accuracy. So a PSO method for model estimation is proposed in Section 3; this method has a simple principle and can be implemented with ease. What is more, the PSO estimation for the ARIMA model (marked simply as PSOARIMA) has a higher forecasting precision than the ARIMA model, as will be shown in Section 4.
where

$\varphi_i = \begin{cases} \varphi_i & 1 \le i \le p \\ 0 & i > p \end{cases}, \qquad \theta_i = \begin{cases} \theta_i & 1 \le i \le q \\ 0 & i > q \end{cases}$
Through comparing the terms which have the same degree of B in (12), we can
obtain {πi } in (13), see [7].
$\pi_1 = \varphi_1 - \theta_1$
$\pi_i = \varphi_i - \theta_i + \sum_{j=1}^{i-1} \theta_{i-j} \pi_j, \quad i > 1$  (13)
$\hat{x}_t = \sum_{i=1}^{K} \pi_i \hat{x}_{t-i}$, where $x_{t-i} = 0$ if $t - i \le 0$  (14)
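A minimal Python sketch of the recursion (13) for the π weights and the truncated simulation (14); the truncation length K, the use of the actual past observations on the right-hand side, and all variable names are assumptions.

import numpy as np

def pi_weights(phi, theta, K):
    # phi, theta are padded with zeros beyond p and q, as in the definition above
    p, q = len(phi), len(theta)
    phi_pad = lambda i: phi[i - 1] if i <= p else 0.0
    theta_pad = lambda i: theta[i - 1] if i <= q else 0.0
    pi = np.zeros(K + 1)                       # pi[1..K]
    pi[1] = phi_pad(1) - theta_pad(1)
    for i in range(2, K + 1):
        pi[i] = phi_pad(i) - theta_pad(i) + sum(theta_pad(i - j) * pi[j] for j in range(1, i))  # eq. (13)
    return pi[1:]

def simulate_series(x, phi, theta, K=20):
    # x_hat_t = sum_i pi_i * x_{t-i}, terms with t - i out of range set to 0, eq. (14)
    pi = pi_weights(phi, theta, K)
    x_hat = np.zeros(len(x))
    for t in range(len(x)):                    # 0-based here; the paper indexes from 1
        x_hat[t] = sum(pi[i - 1] * x[t - i] for i in range(1, K + 1) if t - i >= 0)
    return x_hat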
$x_i^{k+1} = x_i^{k} + v_i^{k+1}$  (16)
where $v_i^{k}$ and $x_i^{k}$ are the velocity and position of the i-th particle at time k, respectively; pxbest_i is the individual best position of the i-th particle; gxbest is the global best position of the whole swarm; τ1, τ2 are random numbers drawn uniformly from the interval [0, 1]; c1, c2 ∈ [0, 2] are two positive constants called acceleration coefficients, namely the cognitive and social parameters, and as default in [9] c1 = c2 = 2 are used; w is the inertia weight, which can be determined by
where wmax and wmin are the maximum and minimum of w, respectively; Nummax is the maximum number of iterations and Num is the current iteration. A larger inertia weight favors global exploration, while a smaller inertia weight tends to facilitate local exploration to fine-tune the current search area. Usually, the maximum velocity Vmax is set to half of the length of the search space [10].
In this paper, because there are p + q coefficients (ϕi , θj , i = 1, 2, . . . , p, j =
1, 2, . . . , q) to be selected, the dimension of the swarm n = p + q. What’s more,
when the coefficients {ϕi } and {θj } are denoted by Φ and Θ which have p and
q components respectively, the form of one particle can be denoted by (Φ, Θ),
that is (ϕ1, ϕ2, . . . , ϕp, θ1, θ2, . . . , θq). Then we mark the simulation series {x̂t} obtained from (14) as $\{\hat{x}_t^{(\Phi,\Theta)}\}$ when the coefficients are Φ and Θ.
In the investigation, the mean square error (MSE), shown as (18), serves as
the forecasting accuracy index and the objective function of PSO for identifying
suitable coefficients.
$MSE^{(\Phi,\Theta)} = \frac{1}{N-d} \sum_{t=1}^{N-d} \left( x_t - \hat{x}_t^{(\Phi,\Theta)} \right)^{2}$  (18)
1. Randomly generate m initial particles x_1^0, x_2^0, ..., x_m^0 and initial velocities v_1^0, v_2^0, ..., v_m^0, each with p + q components.
2. for 1 ≤ i ≤ m do
3.     pxbest_i ⇐ x_i^0;  pfbest_i ⇐ MSE^(x_i^0(1:p), x_i^0(p+1:p+q))
4.     x_i^1 ⇐ x_i^0 + v_i^0
5. end for
6. for all i such that 1 ≤ i ≤ m do
7.     if pfbest_s ≤ pfbest_i then
8.         gxbest ⇐ pxbest_s;  gfbest ⇐ pfbest_s
9.     end if
10. end for
11. k ⇐ 1;  wmax ⇐ 0.9;  wmin ⇐ 0.4
12. while k < Num_max do
13.     for 1 ≤ i ≤ m do
14.         if MSE^(x_i^k(1:p), x_i^k(p+1:p+q)) < pfbest_i then
15.             pfbest_i ⇐ MSE^(x_i^k(1:p), x_i^k(p+1:p+q));
16.             pxbest_i ⇐ x_i^k
17.         end if
18.     end for
19.     for all i such that 1 ≤ i ≤ m do
20.         if pfbest_s ≤ pfbest_i then
21.             gxbest ⇐ pxbest_s;  gfbest ⇐ pfbest_s
22.         end if
23.     end for
24.     w ⇐ wmax − k(wmax − wmin)/Num_max
25.     for 1 ≤ i ≤ m do
26.         v_i^{k+1} ⇐ w · v_i^k + c1 τ1 (pxbest_i − x_i^k) + c2 τ2 (gxbest − x_i^k)
27.         if v_i^{k+1} ≥ Vmax or v_i^{k+1} ≤ −Vmax then
28.             v_i^{k+1} ⇐ v_i^{k+1}/|v_i^{k+1}| × Vmax
29.         end if
30.         x_i^{k+1} ⇐ x_i^k + v_i^{k+1}
31.     end for
32. end while
where x(1 : p) and x(p + 1 : p + q) denote the 1st ∼ p-th and the (p + 1)-th ∼ (p + q)-th components of the vector x, respectively. So gxbest(1 : p) and gxbest(p + 1 : p + q) are the optimally estimated values of Φ and Θ.
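Putting the pieces together, a compact Python sketch of the PSO search over (Φ, Θ) with the MSE of (18) as the objective; the swarm size, Vmax handling, position clipping to [−1, 1] and the reuse of simulate_series from the sketch after (14) are assumptions beyond what the paper states.

import numpy as np

def mse(x, coeffs, p, q, K=20):
    phi, theta = coeffs[:p], coeffs[p:p + q]
    x_hat = simulate_series(x, phi, theta, K)        # helper from the earlier sketch
    return np.mean((x - x_hat) ** 2)                 # equation (18)

def pso_estimate(x, p, q, m=30, num_max=200, c1=2.0, c2=2.0,
                 w_max=0.9, w_min=0.4, bound=1.0):
    dim = p + q
    v_max = bound                                    # half the length of the search space [-1, 1]
    pos = np.random.uniform(-bound, bound, (m, dim))
    vel = np.random.uniform(-v_max, v_max, (m, dim))
    pbest, pfit = pos.copy(), np.array([mse(x, s, p, q) for s in pos])
    gbest = pbest[np.argmin(pfit)].copy()
    for k in range(num_max):
        w = w_max - k * (w_max - w_min) / num_max
        r1, r2 = np.random.rand(m, dim), np.random.rand(m, dim)
        vel = np.clip(w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos), -v_max, v_max)
        pos = np.clip(pos + vel, -bound, bound)
        fit = np.array([mse(x, s, p, q) for s in pos])
        improved = fit < pfit
        pbest[improved], pfit[improved] = pos[improved], fit[improved]
        gbest = pbest[np.argmin(pfit)].copy()
    return gbest[:p], gbest[p:]                      # estimated Phi and Theta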
4 Case Study
In this section, a case study of predicting the consumer price index (CPI) of 36 big or medium-sized cities in China is presented. The original data are the CPI from Jan. 2001 to Oct. 2008, obtained from the CEInet statistical database. We mark the original data as {yt}, t = 1, 2, . . . , 94.
We can easily see that {yt} is a non-stationary series by observing its graph. Observing the autocorrelogram of the third-differenced data of the original data (shown in Fig. 1, where the two solid horizontal lines represent the 95% confidence interval), in which the autocorrelation coefficients statistically rapidly decline to zero after lag 1, we confirm that the third-differenced data are stationary and mark them as {zt}, t = 1, 2, . . . , 91. So d is equal to 3. We can then mark {zt − z̄} as {xt}, t = 1, 2, . . . , 91, where z̄ = −0.0275, according to (4).
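A minimal sketch of this pre-processing step (third-order differencing followed by mean removal); the series variable is an assumption since the data themselves are not reproduced here.

import numpy as np

def preprocess(y, d=3):
    z = np.diff(np.asarray(y, dtype=float), n=d)   # d-th differenced series; here d = 3
    return z - z.mean(), z.mean()                  # x_t = z_t - z_bar, plus z_bar for later inversion

# x, z_bar = preprocess(cpi_series)   # cpi_series: the 94 monthly CPI observations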
Then we can draw the autocorrelogram and partial autocorrelogram (Fig. 2). From this figure, two facts stand out: first, the autocorrelation coefficient starts at a very high value at lag 1 and then statistically rapidly declines to zero; second, the PACF values up to 4 lags are individually statistically significantly different from zero, but then also rapidly decline to zero. So we can determine p = 4 and q = 1, that is to say, we can design an ARMA(4, 1) model for {xt}.
Estimating the coefficients Φ = {ϕ1, ϕ2, ϕ3, ϕ4} and Θ = {θ1} according to (6), (8), (9) and by the PSO method with the optimizing range [−1, 1], respectively, we obtain Φ = {−1.0750, −0.8546, −0.4216, −0.1254}, Θ = {0.1614} and Φ = {−0.4273, −0.0554, 0.1979, 0.1487}, Θ = {0.8196}, respectively. We can then obtain the fitted and prediction series; the fitted graphs are shown in Fig. 3. Their MSEs are 1.3189 and 1.2935, respectively, that is to say, the latter fits better than the former.
Fig. 3. Original data and simulation values of the two fitted models (two panels).
Furthermore, the prediction values of the original data {yt} can be obtained according to Section 2.3 by both the moment estimation method and the PSO estimation method. The relevant results are shown in Table 1. From this table we can see that the accuracy of the PSOARIMA(4, 3, 1) model is better than that of the ARIMA(4, 3, 1) model. What is more, we present the prediction figures (Fig. 4) to compare the accuracy of the two models clearly.
Fig. 4. Consumer price index predictions of the two models (original data, actual value in Nov. 2008, and prediction values; x-axis: Year.Month, y-axis: consumer price index).
Additionally, once the actual value in Nov. 2008 is known, the value in Dec. 2008 can be predicted. Continuing by analogy, we predict the values in Jan. 2009 ∼ Mar. 2009, and the relevant results are shown in Table 2. From this table we can see that the accuracy of both models is very good, but the relative error of PSOARIMA(4, 3, 1) is generally smaller than that of ARIMA(4, 3, 1).
5 Conclusion
In this paper, we proposed a novel design methodology that hybridizes the particle swarm optimization algorithm and the ARIMA model. PSO is used for model estimation of ARIMA in order to overcome the deficiency that the traditional estimation method is difficult to implement and may obtain very bad results. Furthermore, the experimental results of forecasting the CPI show that the PSO-based model works more accurately than the traditional moment-estimation-based model.
On the other hand, we can see the power of PSO for optimizing parameters, so in future work we will look for more practical ways to apply PSO
in other fields. It is, however, necessary to note that standard PSO sometimes gets stuck in local optima rather than reaching the global optimum; fortunately, the chaotic particle swarm optimization algorithm (CPSO) can help to solve this problem.
References
1. Ediger, V.Ş., Akar, S.: ARIMA forecasting of primary energy demand by fuel in
Turkey. Energy Policy 35, 1701–1708 (2007)
2. Erdogdu, E.: Electricity demand analysis using cointegration and ARIMA mod-
elling: A case study of Turkey. Energy Policy 35, 1129–1146 (2007)
3. Ong, C.-S., Huang, J.-J., Tzeng, G.-H.: Model identification of ARIMA family
using genetic algorithms. Applied Mathematics and Computation 164, 885–912
(2005)
4. Niu, D., Cao, S., Zhao, L., Zhang, W.: Power load forecasting technology and its
applications. China Electric Power Press, Beijing (1998) (in Chinese)
5. Ediger, V.Ş., Akar, S., Uğurlu, B.: Forecasting production of fossil fuel sources
in Turkey using a comparative regression and ARIMA model. Energy Policy 34,
3836–3846 (2006)
6. Valenzuela, O., Rojas, I., Rojas, F., Pomares, H., Herrera, L.J., Guillen, A., Mar-
quez, L., Pasadas, M.: Hybridization of intelligent techniques and ARIMA models
for time series prediction. Fuzzy Sets and Systems 159, 821–845 (2008)
7. Li, M., Zhou, J.-z., Li, J.-p., Liang, J.-w.: Predicting Securities Market in Shanghai
and Shenzhen by ARMA Model. Journal of Changsha Railway University 3, 78–84
(2000) (in Chinese)
8. Dingxue, Z., Xinzhi, L., Zhihong, G.: A Dynamic Clustering Algorithm Based
on PSO and Its Application in Fuzzy Identification. In: The 2006 International
Conference on Intelligent Information Hiding and Multimedia Signal Processing,
pp. 232–235. IEEE Press, New York (2006)
9. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: The IEEE Interna-
tional Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE Press, New
York (1995)
10. Alatas, B., Akin, E., Bedri Ozer, A.: Chaos embedded particle swarm optimization
algorithms. Chaos, Solitons and Fractals (2008)
A Uniform Solution to HPP in Terms of Membrane
Computing
1 Introduction
P systems are an emergent branch of natural computing, which can be seen as a kind of distributed parallel computing model. They are based upon the assumption that the processes taking place in living cells can be considered as computations. Up to now, several variants of P systems working in linear or polynomial time have been constructed to solve some NP-complete problems, such as SAT [1, 2, 3], HPP [2, 4, 5], Subset Sum [6], Knapsack [7], 2-Partition [8] and Bin Packing [9].
The Hamiltonian path problem (HPP for short) is a well-known NP-complete problem, and there are several P systems dealing with it. The P systems in [4, 5] all work in a semi-uniform way [10]. Specifying the starting and ending vertices, literature [2] presents a uniform solution to HPP with membrane separation. It remains open whether HPP can be solved confluently in polynomial time by P systems with division rules instead of separation rules in a uniform way [2].
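For reference only (this is not how the P system operates), a brute-force Python check of the Hamiltonian path property that the uniform family constructed below decides by massive membrane parallelism; it merely illustrates the decision problem and its exponential search space.

from itertools import permutations

def has_hamiltonian_path(n, edges):
    # edges: set of directed pairs (i, j); vertices are numbered 1..n
    adj = set(edges)
    for perm in permutations(range(1, n + 1)):
        if all((perm[k], perm[k + 1]) in adj for k in range(n - 1)):
            return True
    return False

print(has_hamiltonian_path(4, {(1, 2), (2, 3), (3, 4), (4, 1)}))   # True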
Without specifying the starting vertex and ending vertex of a path, this paper presents a uniform solution to HPP with membrane division in which the communication rules that send objects into membranes [3] are not used. The remainder of the paper is organized as follows: the recognizer P system with membrane division is introduced in the next section; in Section 3 the solution to HPP with membrane division is presented, with some formal details of the solution given in Section 4; finally, conclusions are drawn in the last section.
Π =( V , H , µ , w1 , …, wm , R )
where:
m ≥1 (the initial degree of the system);
V is the alphabet of objects;
H is a finite set of labels for membranes;
µ is a membrane structure, consisting of m membranes, labeled (not necessarily in a
one-to-one manner) with elements of H ;
w1 , …, wm are strings over V describing the multisets (every symbol in a string
representing one copy of the corresponding object) placed in the m regions of µ
respectively;
R is a finite set of developmental rules, of the following forms:
(1) $[a \to v]_h^{e}$, for $h \in H$, $e \in \{+, -, 0\}$, $a \in V$, $v \in V^{*}$
(object evolution rules);
(2) $[a]_h^{e_1} \to [\ ]_h^{e_2}\, b$, for $h \in H$, $e_1, e_2 \in \{+, -, 0\}$, $a, b \in V$
(communication rules);
(3) $[a]_h^{e_1} \to [b]_h^{e_2} [c]_h^{e_3}$, for $h \in H$, $e_1, e_2, e_3 \in \{+, -, 0\}$, $a, b, c \in V$
(division rules for elementary membranes)
Note that, in order to simplify the writing, in contrast to the style customary in the
literature, we have omitted the label of the left parenthesis from a pair of parentheses
which identifies a membrane. All aforementioned rules are applied according to the
following principles:
(a) The rules of type (1) are applied in the parallel way.
(b) The rules of types (2), (3) are used sequentially, in the sense that one membrane
can be used by at most one rule of these types at a time.
(c) All objects and all membranes, which can evolve, should evolve simultaneously.
We will introduce two definitions[1]:
Definition 1. A P system with input is a tuple ( Π , ∑ , iΠ ), where: (a) Π is a P sys-
tem, with working alphabet Γ , with p membranes labeled with 1, ..., p, and initial
multisets w1 , …, w p associated with them; (b) ∑ is an (input) alphabet strictly con-
tained in Γ ; the initial multisets of Π are over Γ - ∑ ; and (c) iΠ is the label of a
distinguished (input) membrane.
Let $w_{in}$ be a multiset over Σ. The initial configuration of (Π, Σ, $i_{\Pi}$) with input $w_{in}$ is (µ, $w_1$, …, $w_{i_{\Pi}} \cup w_{in}$, …, $w_p$).
For any given directed graph G with n vertices, we construct a recognizer P system Π(n) to solve HPP. Therefore the family presented here is
Π = {(Π(n), Σ(n), i_Π) : n ∈ N}
Π(n) = (V(n), H, µ, w_1, w_2, R)
where

$V(n) = \{x_{i,j,k} \mid 1 \le i, j \le n,\ -1 \le k \le i\} \cup \{v_i \mid 1 \le i \le n-1\} \cup \{t_i, z_i \mid 1 \le i \le n\} \cup \cdots$
$H = \{1, 2\}$
$\mu = [[\ ]_2^{+}]_1^{0}$
$w_1 = f_0, \quad w_2 = d_0 v_1$

and the set R contains the following rules:
$[v_i]_2^{+} \to [t_i]_2^{-} [v_{i+1}]_2^{+}$ (1 ≤ i ≤ n−2),  $[v_{n-1}]_2^{+} \to [t_{n-1}]_2^{-} [t_n]_2^{-}$,  $[t_j \to c_{j-1} r_j' g]_2^{-}$ (1 ≤ j ≤ n)  (1)
At the preparing stage, we have {d_0 v_1} and an input multiset w_in, encoded according to the instance G, in the membrane labeled with 2. From v_1, the other vertices in G can be obtained by rules (1). This means that we can search for HPs starting from each of the n vertices respectively. Object r'_j represents the starting vertex, and object c_{j−1} can be regarded as a counter whose subscript is decreased by 1 at each step of the generating stage. These rules are applied at the preparing stage.
$[g]_2^{-} \to [\ ]_2^{0}\, g$  (2)
In the membranes labeled with 2, when their polarization is 0, the third subscript k of x_{i,j,k} and the subscript i of c_i are decreased by 1 simultaneously. For the i-th vertex in G that has been added to the current path, objects r'_i and r_i are two different representations in membranes with different polarizations. x_{i,j,k} can no longer evolve once k reaches −1. Objects c_0 and x_{i,j,0} will appear after i steps if (i, j) ∈ E.
$[c_0]_2^{0} \to [\ ]_2^{+}\, c_0$  (4)
Vertex v_i is the last vertex of the current path, and we extend this path by adding v_j upon the appearance of c_0 if (i, j) ∈ E. Since several vertices besides v_j may connect to v_i, new membranes need to be created to contain the new paths. Rule (4) changes the polarization of the membrane to positive in order to trigger membrane divisions that produce new membranes.
$[d_i \to d_i']_2^{+}$,  $[r_i \to r_i']_2^{+}$,  $[x_{i,j,0}]_2^{+} \to [z_j]_2^{-} [g]_2^{+}$  (1 ≤ i, j ≤ n, 0 ≤ k ≤ i)  (5)
Objects d_i and d'_i both represent the fact that the length of the current path is i, but they are two different versions in membranes labeled with 2 with different polarizations. Object x_{i,j,0} in a membrane with positive polarization causes the membrane division that extends the current path. Since more than one vertex may connect with v_i, there can be several objects x_{i,j,0} whose third subscript k is 0. We non-deterministically choose one of them to trigger the division, and the rest remain in the two new membranes with different polarizations, which indicates that they will be processed by different rules: in the new membrane with positive polarization one of them triggers the division again in order to obtain another new path, while in the one with negative polarization all of them are deleted.
$[z_j \to r_j' c_{j-1} g]_2^{-}$,  $[d_i' \to d_i]_2^{-}$,  $[x_{i,j,k} \to x_{i,j,i}]_2^{-}$,  $[x_{i,j,0} \to \lambda]_2^{-}$,  $[d_i \to d_{i+1}]_2^{-}$  (1 ≤ i, j ≤ n, k ≠ 0)  (6)
In the new membrane labeled with 2 with negative polarization, the new object r'_j is introduced, which indicates that vertex v_j is added to the current path. At the same time, the new counter c_{j−1} is introduced and other objects, such as x_{i,j,k}, return to x_{i,j,i}. After the length of the path is increased, the polarization is changed by rule (2). We return to the beginning of the generating stage and resume the procedure for extending the current path.
$[g \to \lambda]_2^{+}$  (7)
In the new membrane labeled with 2 with positive polarization, object g does not
work any more and is deleted.
$[d_n \to se]_2^{0}$  (8)
When the length of path is n, we know that n vertices are in the path and we can enter
the next stage to check if the path is Hamiltonian or not. Two new objects s and e are
introduced to prepare for the checking stage.
$[e \to e_0 r_0]_2^{+}$,  $[s]_2^{+} \to [\ ]_2^{0}\, s$  (9)
With the appearance of d_n, the counter c_t (t > 0) associated with the last vertex is introduced, and it can still evolve until c_0 is sent out of the membrane. Objects e and s evolve by rule (9) to trigger the next stage.
$[e_i \to g e_{i+1}]_2^{0}$,  $[r_0]_2^{0} \to [\ ]_2^{-}\, r_0$  (0 ≤ i ≤ n)  (10)
In the checking stage, we use rules (10) and (11) in a loop. If the object r_0 is present, we eliminate it and perform a rotation of the objects r_i, ..., r_n with their subscripts decreased by 1. It is clear that a membrane contains an HP if and only if we can run n+1 steps of the loop. Note that this checking method is not the same as the one in [10], since the communication rules are not used.
$[e_{n+1} \to yes]_2^{-}$  (12)
The object en+1 introduced by rules (10) means that the path is Hamiltonian, so it
evolves to object yes, which will leave the membrane labeled with 2 and enter
membrane 1.
$[g \to \lambda]_1^{0}$,  $[c_0 \to \lambda]_1^{0}$,  $[r_0 \to \lambda]_1^{0}$,  $[s \to \lambda]_1^{0}$,  $[f_i \to f_{i+1}]_1^{0}$  (0 ≤ i ≤ 3n(n+3)/2+4)  (14)
All of these rules are applied in parallel in membrane 1 with neutral polarization. Since objects g, c_0, r_0 and s are sent out of the membranes labeled with 2 and will not be used any more, it is necessary to delete them to release computing resources. Object f_i evolves at each computation step, counting how many steps have been performed.
When the polarization changes to positive, object yes in membrane 1 is sent out to the environment to show that there is an HP among the n vertices. If at step 3n(n+3)/2+5 the polarization of membrane 1 has not changed to positive, the answer must be no, and no is sent out to the environment. This particular step will be analyzed in Subsection 3.2. Once the polarization changes, no rule in membrane 1 can be applied any more and the computation halts.
All of the foregoing rules applied in the membranes labeled with 2 are illustrated in Fig. 1, in which 0, + and − respectively represent the polarization of the membranes. Fig. 1 looks like a state machine, and it can be seen that different polarizations and objects indicate distinct stages. Rules (2), (4), (9) and (10) are used to change the polarizations in order to enter another stage of the computation.
At the preparing stage, we have v_1 in the membrane labeled with 2. From v_1, the other vertices in G can be obtained by applying rules (1). After the n starting vertices are obtained, we enter the next stage to find HPs from each of them respectively. The generating stage consists of three phases: in the search phase, a vertex that connects with the last vertex of the current path is sought; in the divide phase, it is added to the path; and in the restore phase, the length of the path is increased and the objects related to the other edges are recovered. Once a vertex is added to the path, all of the edges starting from it are consumed in the extensions of the path. These three phases are looped until the length of the path reaches n. The checking stage is needed to decide whether the path of length n is an HP or not. The answer yes or no is sent out at the output stage.
Fig. 1. Rule applications in the membranes labeled with 2 under the three polarizations.
In this section, we will prove that the uniform family Π of recognizer P systems in Section 3 gives a polynomial solution for the HPP. According to Definition 3 in Section 2, we have:
Theorem 1. HPP ∈ PMCAM.
Proof: (1) It is obvious that the family Π of recognizer P systems defined above is AM-consistent, because every P system of the family is a recognizer P system with active membranes using 2-division, and all of them are confluent.
(2) The family Π of recognizer P systems is polynomially uniform by Turing machines. For each n, which is the size of an instance G, a P system can be constructed; the rules of Π(n) are defined in a recursive manner from n. The necessary resources to construct the system are:
Size of the alphabet: n³ + (7n² + 27n)/2 + 14 = Θ(n³)
Initial number of membranes: 2 = Θ(1)
Initial number of objects: 3 = Θ(1)
Total number of evolution rules: n³ + (9n² + 31n + 38)/2 = Θ(n³)
All these resources are polynomially bounded in n; therefore a Turing machine can build the P system in polynomial time with respect to n.
(3) We have the functions g: I_HPP → ∪_{t∈N} I_{Π(t)} and h: I_HPP → N, defined for an instance u = (v_1, …, v_n) as g(u) = {x_{i,j,k} : 1 ≤ i, j ≤ n} and h(u) = n. Both of them are total and polynomially computable. Furthermore, (g, h) forms an encoding of the set of instances of the HPP problem in the family of recognizer P systems, as for any instance u we have that g(u) is a valid input for the P system.
Π is polynomially bounded with respect to (g, h). A formal description of the computation lets us prove that Π always halts and sends the object yes or no to the environment in the last step. The number of steps of the P system is at most 3n(n+3)/2+4 if the output is yes and 3n(n+3)/2+5 if the output is no; therefore there exists a polynomial bound for the number of steps of the computation.
Π is sound with respect to (g, h). As described in the computations of the P systems of Π above, the fact that object yes is sent out of the system means that there are n different vertices in the path, which is an actual Hamiltonian path in the instance G according to the definition of HPP.
Π is complete with respect to (g, h). The P system searches for an HP starting from every vertex respectively at the beginning of the computation. We obtain objects such as r_i (1 ≤ i ≤ n) and delete objects such as x_{i,j,k} (1 ≤ i, j ≤ n, −1 ≤ k ≤ i) corresponding to the edges whose starting vertex v_i has been added to the current path. When there is a Hamiltonian path in the instance G with n vertices, the system Π obtains objects r_1, r_2, ..., r_n in some membrane and outputs the object yes to the environment. □
Since this class is closed under polynomial-time reduction and complement we
have:
Theorem 2. NP ∪ co-NP ⊆ PMC AM.
5 Conclusion
In this paper, we have presented a uniform solution for HPP in terms of P systems with membrane division. This has been done in the framework of complexity classes in membrane computing.
It can be observed that many membranes no longer evolve at some step before the whole system halts. Some kind of rule should be developed to dissolve these membranes and release the computing resources; this would contribute to the construction of efficient simulators of recognizer P systems for solving NP-complete problems.
Acknowledgments. This research is supported by Major National S&T Program of
China (#2008ZX 07315-001).
References
1. Gutiérrez-Naranjo, M.A., Pérez-Jiménez, M.J., Romero-Campero, F.J.: A uniform solution
to SAT using membrane creation. Theoretical Computer Science 371(1-2), 54–61 (2007)
2. Pan, L., Alhazov, A.: Solving HPP and SAT by P systems with active membranes and
separation rules. Acta Informatica 43(2), 131–145 (2006)
3. Păun, G.: Computing with Membranes: Attacking NP-Complete Problems. In: Proceedings
of the Second International Conference on Unconventional Models of Computation,
pp. 94–115. Springer, London (2000)
4. Păun, A.: On P Systems with Membrane Division. In: Proceedings of the Second Interna-
tional Conference on Unconventional Models of Computation, pp. 187–201. Springer,
London (2000)
5. Zandron, C., Ferretti, C., Mauri, G.: Solving NP-Complete Problems Using P Systems
with Active Membranes. In: Proceedings of the Second International Conference on
Unconventional Models of Computation, pp. 289–301. Springer, London (2000)
68 H. Chen and Z. He
6. Pérez-Jiménez, M.J., Riscos-Núñez, A.: Solving the Subset-Sum problem by active mem-
branes. New Generation Computing 23(4), 367–384 (2005)
7. Jesús Pérez-Jímenez, M., Riscos-Núñez, A.: A Linear-Time Solution to the Knapsack
Problem Using P Systems with Active Membranes. In: Martín-Vide, C., Mauri, G., Păun,
G., Rozenberg, G., Salomaa, A. (eds.) WMC 2003. LNCS, vol. 2933, pp. 250–268.
Springer, Heidelberg (2004)
8. Gutiérrez-Naranjo, M.A., Pérez-Jiménez, M.J., Riscos-Núñez, A.: A fast P system for find-
ing a balanced 2-partition. Soft Computing 9(9), 673–678 (2005)
9. Pérez-Jiménez, M.J., Romero-Campero, F.J.: Solving the BINPACKING problem by rec-
ognizer P systems with active membranes. In: Proceedings of the Second Brainstorming
Week on Membrane Computing, Seville, Spain, pp. 414–430 (2004)
10. Pérez-Jiménez, M.J., Romero-Jiménez, A., Sancho-Caparrini, F.: Computationally Hard
Problems Addressed Through P Systems. In: Ciobanu, G. (ed.) Applications of Membrane
Computing, pp. 315–346. Springer, Berlin (2006)
Self-organizing Quantum Evolutionary Algorithm
Based on Quantum Dynamic Mechanism
1 Introduction
Most studies have used hybrid evolutionary algorithms in solving global optimization problems. A hybrid evolutionary algorithm takes advantage of a dynamic balance between diversification and intensification, and a proper combination of the two is important for global optimization. Quantum computing is a very attractive research area; research on merging evolutionary algorithms with quantum computing [1] has been developed since the end of the 1990s. Han proposed the quantum-inspired evolutionary algorithm (QEA), inspired by the concept of quantum computing, and introduced a Q-gate as a variation operator to drive the optimization of the individuals' Q-bits [2]. Up to now, QEA has developed rapidly and has been applied to the knapsack problem, numerical optimization and other fields [3]. It has gained attention for its good global search capability and effectiveness. Although quantum evolutionary algorithms are considered powerful in terms of global optimization, they still have several drawbacks such as premature convergence. A number of researchers have experimented with biological immunity and cultural dynamic systems to overcome these particular drawbacks implicit in evolutionary algorithms [4], [5]; here we experiment with a quantum dynamic mechanism.
where Q(t) is a population of qubit chromosomes at generation t, and P(t) is a set of
binary solutions at generation t.
1) In the step 'initialize Q(t)', all qubit amplitudes are initialized with the same constant 1/√2. This means that one qubit chromosome represents the linear superposition of all possible states with the same probability.
2) The next step makes a set of binary solutions, P(t), by observing Q(t) states. One
binary solution is formed by selecting each bit using the probability of qubit. And
then each solution is evaluated to give some measure of its fitness.
3) The initial best solution is then selected and stored among the binary solutions,
P(t).
4) In the while loop, a set of binary solutions, P(t), is formed by observing the Q(t−1) states as in the procedure described before, and each binary solution is evaluated to give its fitness value. It should be noted that P(t) can be formed by multiple observations of Q(t−1).
5) In the next step, 'update Q(t)', the set of qubit chromosomes Q(t) is updated by applying the rotation gate defined below:

$U(\Delta\theta_i) = \begin{bmatrix} \cos(\Delta\theta_i) & -\sin(\Delta\theta_i) \\ \sin(\Delta\theta_i) & \cos(\Delta\theta_i) \end{bmatrix}$  (1)
[Figure: rotation of a qubit from (αi, βi) to (αi′, βi′) by the angle Δθi in the plane spanned by |0⟩ and |1⟩.]
6) The best solution among P(t) is selected, and if the solution is fitter than the
stored best solution, the stored solution is replaced by the new one. The binary solu-
tions P(t) are discarded at the end of the loop.
Table 1. Lookup table of ∆θ i , where f(·) is the fitness function, and bi and xi are the i-th bits of
the best solution b and the binary solution x, respectively
xi bi f(x)≥f(b) ∆θ i
0 0 false θ1
0 0 true θ2
0 1 false θ3
0 1 true θ4
1 0 false θ5
1 0 true θ6
1 1 false θ7
1 1 true θ8
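A minimal Python sketch of the two QEA steps just described: observing a qubit chromosome to obtain a binary solution, and updating the amplitudes with the rotation gate (1) using a Δθ looked up from Table 1; since the table only names θ1–θ8, the concrete angle values in the lookup dictionary are assumptions.

import numpy as np

def init_qubits(length):
    # every amplitude pair starts at 1/sqrt(2): the uniform superposition
    return np.full((length, 2), 1.0 / np.sqrt(2.0))

def observe(q):
    # collapse each qubit: P(bit = 1) = |beta|^2
    return (np.random.rand(len(q)) < q[:, 1] ** 2).astype(int)

def rotate(q, x, b, x_not_worse, delta):
    # delta: dict mapping (x_i, b_i, f(x) >= f(b)) -> signed rotation angle, as in Table 1
    for i in range(len(q)):
        t = delta[(int(x[i]), int(b[i]), x_not_worse)]
        c, s = np.cos(t), np.sin(t)
        alpha, beta = q[i]
        q[i] = (c * alpha - s * beta, s * alpha + c * beta)   # U(dtheta) applied to (alpha, beta), eq. (1)
    return q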
At time t, the potential energy function Q(t) that is related to interactive forces among
particles is defined by Eq(3)
$Q_i(t) = \sum_j \int_0^{U_i(t)} \left\{ [1 + \exp(-\zeta_{ij} x)]^{-1} - 0.5 \right\} dx; \qquad Q_D(t) = \sum_i Q_i(t)$  (3)
Let S i (t ) be the vertical coordinate of particle S i at time t. The dynamic equation for
particle S i is defined by Eqs (5) and (6)
$\frac{dS_i(t)}{dt} = \lambda_1 \frac{dU_i(t)}{dt} + \lambda_2 \frac{dJ_D(t)}{dt} - \lambda_3 \frac{dP_D(t)}{dt} - \lambda_4 \frac{dQ_D(t)}{dt}$
$\qquad = \left( \lambda_1 + \lambda_2 \frac{\partial J_D(t)}{\partial U_i(t)} - \lambda_3 \frac{\partial P_D(t)}{\partial U_i(t)} - \lambda_4 \frac{\partial Q_D(t)}{\partial U_i(t)} \right) \frac{dU_i(t)}{dt}$
$\qquad = g(S_i(t), U_i(t)), \qquad 0 \le \lambda_1, \lambda_2, \lambda_3, \lambda_4 \le 1$  (5)

$\frac{dU_i(t)}{dt} = \exp(-fitness(t)) \times fitness'(t) = (1 - U_i(t)) \times fitness'(t) = g_1(S_i(t), U_i(t))$  (6)
g, g1 are functions of S i (t ) and U i (t ) .
If all particles reach their equilibrium states at time t, then finish with success.
4 Experimental Study
In this section, DQEA is applied to the optimization of well-known test functions and
its performance is compared with that of CRIQEA [6] and MIQEA [7] algorithm.
When testing the algorithm on well understood problems, there are two measures of
performance: (i) CR: Convergence Rate to global minima; (ii) C: The average number
of objective function evaluations required to find global minima.
The test examples used in this study are listed below:
$f_1(x_1, x_2) = \frac{x_1^{2} + x_2^{2}}{4000} + 1 - \cos(x_1) \times \cos\!\left(\frac{x_2}{\sqrt{2}}\right), \quad -600 \le x_1, x_2 \le 600$  (7)
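For completeness, the first test function coded directly from (7) as reconstructed above (the √2 divisor follows the standard Griewank form); the value at the global minimum serves as a quick sanity check.

import numpy as np

def f1(x1, x2):
    return (x1**2 + x2**2) / 4000.0 + 1.0 - np.cos(x1) * np.cos(x2 / np.sqrt(2.0))

print(f1(0.0, 0.0))   # 0.0 at the global minimum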
The certainty of convergence of the DQEA may be attributed to its ability to maintain the diversity of its population. In fact, the superior performance of the self-organizing operator may be attributed to its adaptability and learning ability; a larger pool of feasible solutions enhances the probability of finding the optimum solution. Cooperation of subpopulations through the quantum dynamic mechanism can help to obtain the optimal solution faster. Comparison of the results indicates that the self-organizing operator can preserve individual diversity and control the convergence speed.
5 Conclusions
In this study, a self-organizing quantum evolutionary algorithm based on a quantum dynamic mechanism for global optimization (DQEA) has been proposed. The efficiency of the quantum evolutionary algorithm is enhanced by the self-organizing operator. We evaluated the performance of the algorithm, and the efficiency of the approach has been illustrated on some test cases. The results show that integrating the self-organizing operator into the quantum evolutionary algorithm can yield significant improvements in both convergence rate and solution quality. Cooperation of subpopulations through the quantum dynamic mechanism helps the algorithm converge faster. Future work is to exploit the quantum dynamic mechanism further for global optimization.
Acknowledgments. The authors gratefully acknowledge the support of Natural Sci-
ence Foundation of Shanghai (Grant No.09ZR1420800), Development Foundation of
SUES(Grant No.2008XY18 ) and Doctor Foundation of SUES.
References
1. Narayanan, M.M.: Genetic Quantum Algorithm and Its Application to Combinatorial Opti-
mization Problem. In: Proc. IEEE International Conference on Evolutionary Computation
(ICEC 1996), pp. 61–66. IEEE Press, Piscataway (1996)
Self-organizing Quantum Evolutionary Algorithm 77
2. Han, K.H., Kim, J.H.: Quantum-inspired Evolutionary Algorithm for a Class of Combinato-
rial Optimization. IEEE Transactions on Evolutionary Computation 6(6), 580–593 (2002)
3. Han, K.H., Kim, J.H.: Quantum-Inspired Evolutionary Algorithms with a New Termination
Criterion, Hε gate, and Two-Phase Scheme. IEEE Transactions on Evolutionary Com-
putation 8(2), 156–169 (2004)
4. Fukuda, T., Mori, K., Tsukiyama, M.: Parallel search for multi-modal function optimization
with diversity and learning of immune algorithm. In: Artificial Immune Systems and Their
Applications, pp. 210–220. Spring, Berlin (1999)
5. Saleem, S.: Knowledge-Based Solutions to Dynamic Problems Using Cultural Algorithms.
PhD thesis, Wayne State University, Detroit Michigan (2001)
6. You, X.M., Liu, S., Sun, X.K.: Immune Quantum Evolutionary Algorithm based on Chaotic
Searching Technique for Global Optimization. In: Proc. The First International Conference
on Intelligent Networks and Intelligent Systems, pp. 99–102. IEEE Press, New York (2008)
7. You, X.M., Liu, S., Shuai, D.X.: Quantum Evolutionary Algorithm Based on Immune The-
ory for Multi-Modal Function Optimization. Journal of Petrochemical Universities 20, 45–
49 (2007)
8. Hey, T.: Quantum Computing: An Introduction. Computing & Control Engineering Jour-
nal 10, 105–112 (1999)
9. Shuai, D., Shuai, Q., Liu, Y., Huang, L.: Particle Model to Optimize Enterprise Computing.
In: Research and Practical Issues of Enterprise Information Systems. International Federa-
tion for Information Processing, vol. 205, pp. 109–118. Springer, Boston (2006)
Can Moral Hazard Be Resolved by
Common-Knowledge in S4n-Knowledge?
Takashi Matsuhisa
1 Introduction
2 Moral Hazard
Let us consider the principal-agents model as follows: There are the principal
P and n agents {1, 2, · · · , k, · · · , n} (n ≥ 1) in a firm. The principal makes a
profit by selling the products made by the agents. He/she makes a contract with each agent k under which the total amount of all profits is refunded to each agent k in proportion to the agent's contribution to the firm.
Let e_k denote the managerial effort measuring k's productive activities. The set of possible efforts for k is denoted by E_k with E_k ⊆ R. Let I_k(·) be a real-valued continuously differentiable function on E_k; it is interpreted as the profit from selling the products made by agent k with cost c(e_k). Here we assume I_k(·) ≥ 0, and the cost function c(·) is a real-valued continuously differentiable function on E = ∪_{k=1}^{n} E_k. Let I_P be the total amount of all the profits:

$I_P = \sum_{k=1}^{n} I_k(e_k).$
The principal P cannot observe these efforts e_k and shall view them as random variables on a probability space (Ω, µ). The optimal plan for the principal then solves the following problem:

$\max_{e = (e_1, e_2, \cdots, e_k, \cdots, e_n)} \left\{ \mathrm{Exp}[I_P(e)] - \sum_{k=1}^{n} c_k(e_k) \right\}.$
$W_k(e_k) = r_k I_P(e), \qquad \text{with } \sum_{k=1}^{n} r_k = 1,\ 0 \le r_k \le 1,$

where r_k denotes the proportional rate representing k's contribution to the firm. The optimal plan for each agent also solves the problem: for every k = 1, 2, · · · , n,
3 Common-Knowledge
Let N be a set of finitely many agents and i denote an agent. The specification
is that N = {P, 1, 2, · · · , k, · · · , n} consists of the president P and the faculty
Ref ω ∈ Πi (ω);
Trn ξ ∈ Πi (ω) implies Πi (ξ) ⊆ Πi (ω).
This structure is equivalent to the Kripke semantics for the multi-modal logic
S4n. The set Πi (ω) will be interpreted as the set of all the states of nature that
i knows to be possible at ω, or as the set of the states that i cannot distinguish
from ω. We call Πi (ω) i’s information set at ω.
A partition information structure is a RT-information structure Ω, (Πi )i∈N
satisfying the additional postulate: For each i ∈ N and for any ω ∈ Ω,
Ki E = {ω | Πi (ω) ⊆ E }
The event Ki E will be interpreted as the set of states of nature for which i knows
E to be possible.
We record the properties of i’s knowledge operator3: For every E, F of 2Ω ,
N Ki Ω = Ω;
K Ki (E ∩ F ) = Ki E ∩ Ki F ;
T Ki E ⊆ E
4 Ki E ⊆ Ki (Ki E).
5 Ω \ Ki E ⊆ Ki (Ω \ Ki E).
¹ This stands for a reflexive and transitive information structure.
² Cf. Fagin et al. [2].
³ According to these properties we can say the structure ⟨Ω, (K_i)_{i∈N}⟩ is a model for the multi-modal logic S4n.
$K_C F = \bigcap_{n \in \mathbb{N}} (K_E)^{n} F.$
$M E := \Omega \setminus K_C(\Omega \setminus E).$
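A small computational illustration of the operators just defined on a finite state space: the knowledge operator Ki E = {ω : Πi(ω) ⊆ E}, mutual knowledge KE F taken as ∩i Ki F (an assumption consistent with the standard usage), and KC F obtained by iterating KE until a fixed point, which on a finite reflexive structure equals the intersection above; the states, partitions and event are toy assumptions.

def K(pi, E):
    # states where the agent's information set lies entirely inside E
    return frozenset(w for w in pi if pi[w] <= frozenset(E))

def K_everyone(partitions, F):
    out = None
    for pi in partitions:
        out = K(pi, F) if out is None else out & K(pi, F)
    return out

def K_common(partitions, F):
    cur = frozenset(F)
    while True:                      # iterate (K_E)^n F until it stops shrinking
        nxt = K_everyone(partitions, cur)
        if nxt == cur:
            return cur
        cur = nxt

# toy example with states {1,2,3,4}: Pi_1 = {1,2}{3,4}, Pi_2 = {1}{2,3}{4}
pi1 = {1: frozenset({1, 2}), 2: frozenset({1, 2}), 3: frozenset({3, 4}), 4: frozenset({3, 4})}
pi2 = {1: frozenset({1}), 2: frozenset({2, 3}), 3: frozenset({2, 3}), 4: frozenset({4})}
print(K_common([pi1, pi2], {1, 2, 3}))   # frozenset(): {1,2,3} is never common knowledge here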
Let Z be a set of decisions, which set is common for all agents. By a decision
function we mean a mapping f of 2Ω × 2Ω into the set of decisions Z. We refer
the following properties of the function f : Let X be an event.
DUC (Disjoint Union Consistency): For every pair of disjoint events S and T ,
if f (X; S) = f (X; T ) = d then f (X; S ∪ T ) = d;
PUD (Preserving Under Difference): For all events S and T such that S ⊆ T ,
if f (X; S) = f (X; T ) = d then f (X; T \ S) = d.
By the membership function associated with f under agent i’s private infor-
mation we mean the function di from 2Ω × Ω into Z defined by di (X; ω) =
f (X; Πi (ω)), and we call di (X; ω) the membership value of X associated with f
under agent i’s private information at ω.
Definition 3. We say that consensus on X can be guaranteed among all agents
(or they agree on it) if di (X; ω) = dj (X; ω) for any agent i, j ∈ N and in all
ω ∈ Ω.
We can now state explicitly Theorem 1 as below: Let D be the event of the
membership degrees of an event X for all agents at ω, which is defined by
Theorem 3. Assume that the agents have a S4n-knowledge structure and the
decision function f with satisfying the two properties (DUC) and (PUD). If
ω ∈ KC D then di (X; ω) = dj (X; ω) for any agents i, j ∈ N and in all ω ∈ Ω.
Proof. It proceeds along the same lines as in Matsuhisa and Kamiyama [4].
This section investigates the moral hazard problem from the common-knowledge viewpoint. Let us reconsider the principal-agents model, with notations and assumptions the same as in Section 2. We establish Theorem 2 under the additional assumptions A1–A2 below. This gives a possible solution to our moral hazard problem.
From the necessary condition for critical points together with A2, it can be seen that the principal's marginal expected cost for agent k is given by

$[c(e(\cdot); \omega)] = \bigcap_{i \in N} \bigcap_{\xi \in \Omega} \{\zeta \in \Omega \mid f(\xi; \Pi_i(\zeta)) = f(\xi; \Pi_i(\omega))\}.$
7 Concluding Remarks
We end this article by posing additional problems for making further progress:
1. If the proportional rate rk representing k’s contribution to the college de-
pends only on his/her effort for research activities in the principal-agents model,
what solution can we have for the moral hazard problem?
2. Can we construct a communication system for the principal- agents model,
where the agents including Principal communicate each other about their ex-
pected marginal cost as messages? The recipient of the message revises his/her
information structure and recalculates the expected marginal cost under the
revised information structure. The agent sends the revised expected marginal
cost to another agent according to a communication graph, and so on. In the
circumstance does the limiting expected marginal costs actually coincide ? Mat-
suhisa [3] introduces a fuzzy communication system and extends Theorem 3 in
the communication model. By using this model Theorem 4 can be extended in
the communication framework, and the detail will be reported in near future.
References
1. Aumann, R.J.: Agreeing to disagree. Annals of Statistics 4, 1236–1239 (1976)
2. Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Reasoning about Knowledge. MIT
Press, Cambridge (1995)
3. Matsuhisa, T.: Fuzzy communication reaching consensus under acyclic condition. In:
Ho, T.-B., Zhou, Z.-H. (eds.) PRICAI 2008. LNCS (LNAI), vol. 5351, pp. 760–767.
Springer, Heidelberg (2008)
4. Matsuhisa, T., Kamiyama, K.: Lattice structure of knowledge and agreeing to
disagree. Journal of Mathematical Economics 27, 389–410 (1997)
5. Samet, D.: Ignoring ignorance and agreeing to disagree. Journal of Economic
Theory 52, 190–207 (1990)
Casuist BDI-Agent: A New Extended BDI Architecture
with the Capability of Ethical Reasoning
Keywords: ethical artificial agent, explicit ethical agent, BDI agent, ethical rea-
soning, CBR-BDI agent.
1 Introduction
At present, many researchers work on projects such as expert systems that
can assist physicians in diagnosing patients' maladies, warplanes that can be operated without human operators in war, autonomous driverless vehicles that
can be used for urban transportation, and the like. The common goal of
these projects is to increase the autonomy of such machines. As a consequence of this autonomy they can act on our behalf without interference or
guidance, which makes our tasks easier. But if we do not consider and
control the autonomy of these entities, we will face serious problems, because we would be
placing confidence in the intelligence of autonomous systems without any control of or restriction
on their operations.
The new interdisciplinary research area of "Machine Ethics" is concerned with
solving this problem [1,2,3]. Anderson proposed Machine Ethics as a new issue that
considers the consequences of a machine's behavior on humans. The ideal and ultimate goal of this field is the implementation of ethics in machines, so that machines can
autonomously detect the ethical effects of their behavior and follow an ideal ethical
principle or set of principles; that is to say, a machine is guided by this principle or these principles in the decisions it makes about possible courses of action it could take [3]. So by
simulating ethics in autonomous machines we can avoid the problems of autonomy
in autonomous machines.
2 Preliminaries
BDI agents have been widely used in relatively complex and dynamically changing
environments. BDI agents are based on the following core data structures: beliefs,
desires, intentions, and plans [7]. These data structures represent, respectively, information gathered from the environment, a set of tasks or goals contextual to the environment, a set of sub-goals that the agent is currently committed to, and specifications of
how sub-goals may be achieved via primitive actions. The BDI architecture comes
with the specification of how these four entities interact, and provides a powerful
basis for modeling, specifying, implementing, and verifying agent-based systems.
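A minimal Python sketch of these four data structures and their interaction follows; all names are illustrative and are not taken from a specific BDI implementation:

  from dataclasses import dataclass, field

  @dataclass
  class BDIAgent:
      beliefs: set = field(default_factory=set)        # information gathered from the environment
      desires: set = field(default_factory=set)        # tasks or goals contextual to the environment
      intentions: list = field(default_factory=list)   # sub-goals the agent is currently committed to
      plans: dict = field(default_factory=dict)        # sub-goal -> primitive actions achieving it

      def deliberate(self):
          # toy rule: commit to desires not already satisfied by current beliefs
          self.intentions = [d for d in self.desires if d not in self.beliefs]

      def act(self):
          # execute the plan of the first intention, if one is known
          return self.plans.get(self.intentions[0], []) if self.intentions else []

  agent = BDIAgent(beliefs={'at_home'}, desires={'at_work'},
                   plans={'at_work': ['leave', 'drive', 'park']})
  agent.deliberate()
  print(agent.intentions, agent.act())    # ['at_work'] ['leave', 'drive', 'park']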
Case-based reasoning (CBR) has emerged in the recent past as a popular approach to
learning from experience. CBR is a reasoning method based
on the reuse of past experiences, which are called cases [8]. Cases are descriptions of situations in which agents with goals interact with the world around them. Cases in CBR
are represented by a triple (p, s, o), where p is a problem, s is the solution of the problem, and o is the outcome (the resulting state of the world when the solution is carried
out). The basic philosophy of CBR is that the solutions of successful cases should be
reused as a basis for future problems that present a certain similarity [9]. Cases with
unsuccessful outcomes, or negative cases, may provide additional knowledge to the
system by preventing the agent from repeating similar actions that lead to unsuccessful results or states.
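To make the (p, s, o) representation concrete, a small Python sketch is given below; the similarity measure and retrieval threshold are assumptions for illustration, not taken from [8] or [9]:

  from dataclasses import dataclass

  @dataclass
  class Case:
      problem: dict      # feature -> value description of the situation
      solution: list     # the actions that were taken
      outcome: float     # resulting evaluation, e.g. success in [0, 1]

  def similarity(p1, p2):
      # fraction of shared feature values between two problem descriptions
      keys = set(p1) | set(p2)
      return sum(p1.get(k) == p2.get(k) for k in keys) / len(keys)

  def retrieve(memory, problem, threshold=0.5):
      # return the most similar past case, or None if nothing is close enough
      best = max(memory, key=lambda c: similarity(c.problem, problem), default=None)
      if best is not None and similarity(best.problem, problem) >= threshold:
          return best
      return None

  memory = [Case({'road': 'wet', 'speed': 'high'}, ['slow_down'], 0.9)]
  print(retrieve(memory, {'road': 'wet', 'speed': 'low'}))   # the stored case is similar enough to reuse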
2.3 Casuistry
The term "casuistry" refers descriptively to a method of reasoning for resolving perplexities about difficult cases that arise in moral and legal contexts [10]. Casuistry is a
broad term that refers to a variety of forms of case-based reasoning used in discussions of law and ethics. Casuistry is often understood as a critique of a strict principle-based approach to reasoning [11]. For example, while a principle-based approach may
conclude that lying is always morally wrong, the casuist would argue that lying may
or may not be wrong, depending on the details surrounding the case. For the casuist,
the circumstances surrounding a particular case are essential for evaluating the proper
response to it. Casuistic reasoning typically begins with a clear-cut paradigm case,
where "paradigm" means "pattern" or "example". From this model case, the casuist
then asks how closely the particular case currently under consideration matches
the paradigm case. Cases similar to the paradigm case ought to be treated in a similar
manner; cases unlike the paradigm case ought to be treated differently. The less a
particular case resembles the paradigm case, the weaker the justification for treating
that particular case like the paradigm case.
2.4 Consequentialist
Consequentialism refers to those moral theories which hold that the consequences of a
particular action form the basis for any valid moral judgment about that action. Thus,
from a consequentialist standpoint, a morally right action is one that produces a good
outcome, or consequence [12].
evaluation of each action without knowing codes of ethics explicitly. When a human
does any action, he will see the result of his action in the future. If he implicitly considers
the result of his action from an ethical aspect, he can use this experience in future situations. When he finds himself in a situation similar to a previous case, he will use his experience and behave similarly if his previous action was successful and implicitly ethical
(we assume humans do not know codes of ethics). This idea is called "casuistry" in ethics.
To implement an ethical decision-making capability in artificial agents we use this
idea, which is related to the casuistic bottom-up approach in ethics. To consider and
evaluate a situation from an ethical aspect (without knowing codes of ethics), we use and
adapt previous works that try to make ethics computable.
In the BDI architecture, agent behavior is composed of beliefs, desires, and intentions.
These mental attitudes determine the agent's behavior. In the casuist BDI-Agent architecture, the BDI-agent's behavior is adapted so that it behaves ethically. In this architecture, agents
sense the environment <E> and build a triple which represents the current situation, consisting
of the agent's current beliefs, desires and the details of the sensed environment. The current
situation is denoted by a triple <E, B, D>. The current situation is delivered to the Case-Retriever of the casuist BDI architecture. The Case-Retriever is responsible for retrieving
previous cases which are similar to the current situation. Each case in case memory
has a solution part that shows how the agent should act on the basis of the situation part of the
case, so if a case is retrieved, the agent should accept its solution part, adapt it, and behave according to the solution part of the retrieved case. If no case is retrieved, the agent behaves
like a normal BDI-agent. In this situation the evaluator part of the casuist BDI-Agent architecture comes into play. The evaluator part evaluates the result of the agent's behavior.
The result of this evaluation is denoted by <EV>. This evaluation is sent to the Case-Updater. The Case-Updater creates a new case and saves the current situation in the situation
part of the case, the behavior of the agent in the solution part, and the evaluation in the
outcome part (if the case does not already exist in case memory; otherwise it updates the
previous case). Fig. 1 shows the general structure of the Casuist BDI-Agent architecture.
The algorithm of the casuist BDI-Agent is sketched below.
[Fig. 1. General structure of the Casuist BDI-Agent architecture: the BDI-Agent and the Environment, together with the Case-Retriever and Case-Evaluator components]
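As one reading of this decision loop, the following Python sketch may help; every class and function name here is an illustrative stand-in, not the authors' implementation:

  class CaseMemory:
      def __init__(self):
          self.cases = []                                   # (situation, solution, outcome) triples

      def retrieve(self, situation):
          for s, sol, out in self.cases:
              if s == situation and out > 0:                # reuse only ethically successful cases
                  return sol
          return None

      def update(self, situation, solution, outcome):
          self.cases.append((situation, solution, outcome))  # Case-Updater

  def casuist_bdi_step(situation, memory, bdi_deliberate, evaluate):
      solution = memory.retrieve(situation)                 # Case-Retriever
      if solution is None:
          solution = bdi_deliberate(situation)              # behave like a normal BDI agent
          outcome = evaluate(solution)                      # Case-Evaluator judges the ethical result <EV>
          memory.update(situation, solution, outcome)
      return solution

  memory = CaseMemory()
  step = lambda s: casuist_bdi_step(s, memory, bdi_deliberate=lambda s: ['act'], evaluate=lambda b: 1.0)
  print(step(('E1', 'B1', 'D1')))   # first encounter: deliberate, evaluate, store
  print(step(('E1', 'B1', 'D1')))   # second encounter: the stored solution is reused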
Moral Agent (columns: Human, Organization, Artificial Agent) acts on Moral Patient (rows):
Patient Human: a human acts on a human / an organization acts on a human / an artificial agent acts on a human
Patient Organization: a human acts on an organization / an organization acts on an organization / an artificial agent acts on an organization
Patient Artificial Agent: a human acts on an artificial agent / an organization acts on an artificial agent / an artificial agent acts on an artificial agent
With this new simplification, the outcome part of a case is divided into three sections.
These sections contain the ethical evaluation values of the agent's performed intentions on
humans, on organizations and on artificial agents or non-humans. The ethical evaluation values
are calculated by the Case-Evaluator of the casuist BDI architecture. This component is
described in the next section. The main structure of each case in the casuist BDI architecture is
illustrated in Fig. 4. According to the specific application domain of the artificial ethical
agent, more details can be added to this structure.
[Fig. 4. Structure of a case: the solution part (intentions) and the outcome part, which holds the ethical evaluation values x, y and z for the effects on humans, on organizations and on artificial agents]
This computation would be performed for each alternative action. The action with the
highest Total Net Pleasure is the right action. In other words, their proposed equation is equal to Gips's equation extended with a parameter for time. Following Gips's equation, this new equation has the following form:
Σi Wi · Pi · Ti    (3)
where Wi is the weight assigned to each person, Pi is the measure of pleasure or happiness or goodness for each person, and Ti is the duration of the action's pleasure for each
person. Our Case-Evaluator considers the ethical classification of the situation
introduced above. This ethical evaluator determines the kind of entity affected by
each of the agent's intentions or behaviors, and computes the duration, intensity and probability of
the effects of the intentions on each entity. These evaluations are sent to the Case-Updater component of the casuist BDI agent for updating the experiences of the agent, i.e. the Case Memory. The
evaluation method of this evaluator is described in equation (7), which uses equations (4)-(6)
for its operation:
TNPH = Σi=1..n Whi · Phi · Thi    (4)
TNPH is the total net pleasure of humans after the agent's behavior. Whi is the weight
assigned to humans, which shows the importance of each person in the specific situation
and application domain. Phi is the probability that a person is affected. Thi is the duration of pleasure/displeasure of each person after the agent's behavior. n is the number of people in that situation.
TNPO = Σi=1..n Woi · Poi · Toi    (5)
TNPO is the total net pleasure of organizations after the agent's behavior. Woi is the
weight assigned to organizations, which shows the importance of each organization in
the specific situation and application domain. Poi is the probability that an organization is
affected. Toi is the duration of pleasure/displeasure of each organization after the agent's
behavior. n is the number of organizations in that situation.
TNPA = Σi=1..n Wai · Pai · Tai    (6)
TNPA is the total net pleasure of artificial agents after the agent's behavior. Wai is the
weight assigned to artificial agents, which shows the importance of each artificial
agent in the specific situation and application domain. Pai is the probability that an artificial agent is affected. Tai is the duration of pleasure/displeasure of each artificial
agent after the agent's behavior. n is the number of artificial agents in that situation.
TNP = Wh · TNPH + Wo · TNPO + Wa · TNPA    (7)
TNP is the total net pleasure of all three kinds of entities which participate in the application domain of the agent's behavior. Wh, Wo and Wa give the participation degrees of
humans, organizations and artificial agents, respectively. The sum of Wh, Wo
and Wa equals one.
TNPH, TNPO, TNPA and TNP are stored in the outcome part of a case in the Case Memory. These values can be used by the Case-Retriever for retrieving cases when the agent encounters a problem.
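Equations (4)-(7) translate directly into code; the following Python sketch uses illustrative weights, probabilities and durations:

  def tnp(weights, probabilities, durations):
      # equations (4)-(6): sum over the entities of one class of W_i * P_i * T_i
      return sum(w * p * t for w, p, t in zip(weights, probabilities, durations))

  def total_net_pleasure(humans, organizations, agents, Wh, Wo, Wa):
      # equation (7): TNP = Wh*TNPH + Wo*TNPO + Wa*TNPA, with Wh + Wo + Wa = 1
      assert abs(Wh + Wo + Wa - 1.0) < 1e-9
      TNPH, TNPO, TNPA = tnp(*humans), tnp(*organizations), tnp(*agents)
      return Wh * TNPH + Wo * TNPO + Wa * TNPA

  humans = ([0.6, 0.4], [0.9, 0.5], [2.0, 1.0])   # weights, probabilities, durations of two people
  orgs   = ([1.0], [0.7], [3.0])                  # one organization
  agents = ([1.0], [0.2], [1.0])                  # one artificial agent
  print(total_net_pleasure(humans, orgs, agents, Wh=0.5, Wo=0.3, Wa=0.2))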
6 Conclusion
In this paper a new extension of the BDI architecture, the Casuist BDI-Agent, is proposed,
which has the capabilities of the previous BDI-agent architecture in addition to the capability
of ethical decision making. This architecture can be used for designing BDI agents in
domains where BDI agents are applicable and ethical considerations are an important issue.
With the aid of this new architecture, agents can consider the ethical effects of their behaviors when they make decisions. The main idea in this architecture is a
method of ethical reasoning which uses previous experiences and does not use any
codes of ethics. The main advantage of using this method for implementing ethical
decision-making capability in an agent is that it eliminates the need to convert a set of ethical
rules in the agent's application domain into an algorithm, which would require conflict management
among rules. In this architecture an agent can adapt ethically to its application domain and
can augment its implicit ethical knowledge, and so behave more ethically.
References
1. Anderson, M., Anderson, S., Armen, C.: Toward Machine Ethics: Implementing Two
Action-Based Ethical Theories. In: AAAI 2005 Fall Symp. Machine Ethics, pp. 1–16.
AAAI Press, Menlo Park (2005)
2. Allen, C., Wallach, W., Smith, I.: Why Machine Ethics? IEEE Intelligent Systems Special
Issue on Machine Ethics 21(4), 12–17 (2006)
3. Anderson, M., Anderson, S.: Ethical Healthcare Agents. Studies in Computational Intelli-
gence, vol. 107, pp. 233–257. Springer, Heidelberg (2008)
4. Wallach, W., Allen, C.: Moral Machines: Teaching Robot Right from Wrong. Oxford
University Press, Oxford (2009)
5. Moor, J.H.: The nature, importance, and difficulty of Machine Ethics. IEEE Intelligent
Systems Special Issue on Machine Ethics 21(4), 18–21 (2006)
6. Honarvar, A.R., Ghasem-Aghaee, N.: Simulation of Ethical Behavior in Urban Transporta-
tion. Proceedings of World Academy of Science, Engineering and Technology 53,
1171–1174 (2009)
7. Rao, A.S., Georgeff, M.P.: BDI Agents: From Theory to Practice. In: Proceedings of the First International
Conference on Multi-Agent Systems (ICMAS 1995), San Francisco, USA (1995)
8. Pal, S., Shiu: Foundations of Soft Case-Based Reasoning. Wiley-Interscience, Hoboken
(2004)
9. Kolodner: Case-Based Reasoning. Morgan Kaufmann, San Mateo (1993)
10. Keefer, M.: Moral reasoning and case-based approaches to ethical instruction in science.
In: The Role of Moral Reasoning on Socioscientific Issues and Discourse in Science
Education. Springer, Heidelberg (2003)
11. http://wiki.lawguru.com/index.php?title=Casuistry
12. Gips: Towards the Ethical Robot, Android Epistemology, pp. 243–252. MIT Press,
Cambridge (1995)
13. Kendal, S.L., Creen, M.: An Introduction to Knowledge Engineering. Springer, Heidelberg
(2007)
14. Olivia, C., Chang, C., Enguix, C.F., Ghose, A.K.: Case-Based BDI Agents: An Effective
Approach for Intelligent Search on the Web. In: Proceedings of the AAAI 1999 Spring
Symposium on Intelligent Agents in Cyberspace, Stanford University, USA (1999)
15. Corchado, J.M., Pavón, J., Corchado, E., Castillo, L.F.: Development of CBR-BDI Agents:
A Tourist Guide Application. In: Funk, P., González Calero, P.A. (eds.) ECCBR 2004.
LNCS (LNAI), vol. 3155, pp. 547–559. Springer, Heidelberg (2004)
16. Al-Fedaghi, S.S.: Typification-Based Ethics for Artificial Agents. In: Proceedings of Sec-
ond IEEE International Conference on Digital Ecosystems and Technologies (2008)
17. Floridi, L., Sanders, J.W.: On the Morality of Artificial Agents. Minds and
Machines 14(3), 349–379 (2004)
Research of Trust Chain of Operating System
Abstract. Trust chain is one of the key technologies in designing a secure operating system based on TC technology. Constructions of trust chains and trust
models are analyzed. Future work in these directions is discussed.
1 Introduction
It seems likely that computing platforms compliant with Trusted Computing Group
(TCG) specifications will become widespread over the next few years [1, 2]. The advent of TC (Trusted Computing) technology gives new opportunities to the research and
development of secure operating systems [3, 4]. On the one hand, a secure operating
system has the ability to protect legal information flow and to protect information from
unauthorized access. On the other hand, TC technology can protect the operating system's
own security. Therefore, developing a secure operating system with the help of TC technology
is an effective way to solve the security of the terminal. Future operating systems
will be open-source, trusted, secure, etc. [5].
Trust chain is one of the key technologies of TC, and it plays a vital role in the construction of highly assured secure operating systems. In this paper we summarize in detail
two important problems concerning the trust chain of a secure operating system, focusing mainly on
Linux.
The paper starts with a description of transitive trust and the notion of the trust chain of
an operating system in Section 2. Section 3 details the research on the construction of the
operating system trust chain and analyzes future work on Linux. Trust models and their future
work are discussed in Section 4. We conclude the paper in Section 5.
group of functions can give a trustworthy description of the third group of functions,
and so on. Transitive trust is used to provide a trustworthy description of platform characteristics, and also to prove that non-migratable keys are non-migratable.
Transitive trust enables the establishment of a chain of trust by ensuring that
the trust on each layer of the system is based on, and only on, the trust on the
layer(s) underneath it, all the way down to the hardware security component, which
serves as the root of trust. If verification fails at any given stage, the system
may be put into a suspended mode to block possible attacks. The resulting integrity
chain inductively guarantees system integrity. Figure 1 illustrates the process of transitive trust of an operating system. The trust chain starts from the root of trust. The unmodifiable
part of the BIOS measures the integrity of the BIOS, and then the BIOS measures the
MBR (Master Boot Record) of the bootstrap device. Then the MBR measures the OS
loader, and the OS loader measures the OS kernel. Finally, the OS measures the applications.
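The measure-then-load pattern behind this chain can be sketched as a simple hash chain in Python. This is an illustration of the idea only: the component list is hypothetical and real PCR extension is performed inside the TPM, not by application code:

  import hashlib

  def extend(pcr, measurement):
      return hashlib.sha1(pcr + measurement).digest()       # new PCR = H(old PCR || measurement)

  boot_chain = [b"BIOS image", b"MBR", b"OS loader", b"OS kernel", b"application"]

  pcr = b"\x00" * 20                                        # initial register value
  for component in boot_chain:
      measurement = hashlib.sha1(component).digest()        # measure the component before executing it
      pcr = extend(pcr, measurement)                        # record the measurement in the chain

  print(pcr.hex())   # changing any component anywhere in the chain changes the final value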
Many researchers have devoted considerable work to operating systems based on TCPA.
AEGIS is a system structure for integrity examination developed at the University of
Pennsylvania [6]; the AEGIS architecture establishes a chain of trust, driving the trust to
lower levels of the system and, based on those elements, secure boot. It validates
integrity at each layer transition in the bootstrap process. AEGIS also includes a recovery process for integrity check failures.
IBM T.J. Watson Research Center developed an integrity mechanism on Linux [7].
Their system is the first to extend the TCG trust concepts to dynamic executable content, from the BIOS all the way up into the application layer. They present the design
and implementation of a secure integrity measurement system for Linux. All executable content that is loaded onto the Linux system is measured before execution, and these
measurements are protected by the Trusted Platform Module (TPM) that is part of the
TCG standards. Their measurement architecture is applied to a web server application
to show how undesirable invocations, such as rootkit programs, can be detected, and that
measurement is practical in terms of the number of measurements taken and the performance impact of making them.
Huang Tao also designed a trusted boot method for server systems [8]. Their research
proposed a solution for trusted boot. The solution performs integrity verification on
every transfer of control between one layer and the next; if verification succeeds, control is
transferred. Therefore, the boot of the system executes according to the transition of the trusted
chain. In this way, the system boot stays in a trusted state, which improves the system's security. The solution is also easy and flexible to implement.
Dynamic multi-path trust chain (DMPTC) is a software-type and character-based
mechanism to assure system trustworthiness on the WINDOWS system [9]. DMPTC differentiates static system software and dynamic application software and takes different
ways and policies to control the loading and running of various executable codes. The
goal of DMPTC is to build a trusted computing platform by making the computing platform load and run only trustworthy executables. DMPTC also gives great consideration to the impact it has on system performance. Based on the attributes of various
executables and by taking advantage of WINDOWS internal security mechanisms,
DMPTC greatly reduces the time cost of the executable verification process, and
ultimately assures the flexibility and utility of DMPTC.
Maruyama et al. studied in depth the transitive mechanisms from the GRUB boot loader to the
operating system, described a TCPA integrity mechanism from end to end and implemented it on the Linux kernel [10]. Integrity measurement of the kernel is done through a
chaining of PCR (Platform Configuration Register) updates during the bootstrap
process of the GRUB kernel loader. The measurement results, which involve digitally-signed PCR values, are reported to a remote server by using a Java application.
Hadi Nahari describes a mechanism to construct effective CONTAINMENT (that is, a
mechanism to stop an exploited application from launching attacks on another application), which is applied to embedded systems [11]. The MAC (Mandatory Access
Control) is provided by SELinux. The focus is on practical aspects of hardware
integration as well as porting SELinux to resource-constrained devices. The method
provides a high-level, structured overall infrastructure offering the basic and necessary
functions to establish the trust of operating system services (via connection to a hardware-based root of trust).
For the Linux kernel, changes may take place during execution. Because applications are many-sided
and run in varying order, the single-chained boot mechanism of system boot cannot
be applied to the trust transition from the OS to applications. Once the kernel is booted,
user-level services and applications may be run. In Linux, a program execution starts
by loading an appropriate interpreter (i.e., a dynamic loader, such as ld.so) based on the
format of the executable file. Loads of the target executable's code and supporting
libraries are done by the dynamic loader. In the Linux operating system, the kernel and its
modules, binary executable files, shared libraries, configuration files, and scripts run
loosely serially and can change the system's state during execution, which is the key
difficulty in the research of trusted Linux. In such situations, even if the OS loader has verified that the kernel is trusted, the trusted state can still be destroyed, because
kernel modules can be inserted or removed, so that the PCR values no longer
represent the current execution condition. Whether a module affects the kernel state
needs to be verified when kernel modules are inserted or removed.
One must note that the former research, though ensuring the integrity of the operating environment when a hard boot occurs, does not guarantee its integrity during
runtime; that is, in the case of any malicious modification to the operating environment during runtime, this architecture will not detect it until the next hard boot
happens. Trust of applications embodies the integrity of the complete trust
chain. Though in reference [7] the trust is extended to dynamic executable content,
the mandatory access control policy is not measured; future architecture extensions
should include such measurements.
On the other hand, a MAC (Mandatory Access Control) mechanism will be able to
address one of its fundamental shortcomings: providing a level of protection at runtime.
Deploying MAC mechanisms while at the same time balancing performance and control is a
particularly challenging task.
The TPM-EMULATOR provides good technical support for the practice and verification of trust chain research [12, 13]. There is a tremendous opportunity for
enhanced security through enabling projects to use the TPM.
4 Trust Model
Trust models center on how to compute trust in the information world, that is, how to carry
the trust relations among humans over into computing environments to achieve the goal of
trust.
Patel et al. proposed a probabilistic trust model [14]; their research aims to develop a
model of trust and reputation that will ensure good interactions amongst software
agents in large-scale open systems. Key drivers for their model are: (1)
agents may be self-interested and may provide false accounts of experiences with other
agents if it is beneficial for them to do so; (2) agents will need to interact with other
agents with which they have no past experience. Specifically, trust is calculated using
probability theory, taking account of past interactions between agents. When there is a
lack of personal experience between agents, the model draws upon reputation information gathered from third parties.
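One standard way to realize such probability-based trust is to treat each past interaction as a Bernoulli trial and to take the expected value of the resulting beta distribution, adding third-party reports to the counts when direct experience is scarce. The Python sketch below illustrates this idea and is not claimed to reproduce the exact model of [14]:

  def trust(successes, failures):
      # expected probability of a good interaction: mean of Beta(s + 1, f + 1)
      return (successes + 1) / (successes + failures + 2)

  def trust_with_reputation(own, reports):
      s, f = own
      for rs, rf in reports:                                # reputation counts from third parties
          s, f = s + rs, f + rf
      return trust(s, f)

  print(trust(8, 2))                                        # direct experience only
  print(trust_with_reputation((0, 0), [(5, 1), (3, 0)]))    # no own experience: rely on reports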
A trust management model based on fuzzy set theory deals with authenticity in
open networks [15]. The author showed that authentication cannot be based on public
key certificates alone, but also needs to include the binding between the key used for
certification and its owner, as well as the trust relationships between users. A simple
algebra is developed around these elements, together with a description of how it can be used to compute
measures of authenticity.
In the algebra for assessing trust in certification chains [16], the fuzzy nature of subjective
trust is considered, and the conceptions of linguistic variables and fuzzy logic are introduced into subjective trust management. A formal trust metric is given; fuzzy
IF-THEN rules are applied to map the knowledge and experience of trust reasoning that humans use in everyday life into the formal model of trust management;
the reasoning mechanisms of trust vectors are given; and a subjective trust management
model is provided. The formal model proposed provides a valuable new way of
studying subjective trust management in open networks.
Because open networks are dynamic, distributed and non-deterministic, and with reference
to the trust relations among humans, a trust evaluation model based on D-S (Dempster-Shafer) theory has been proposed [17]. The model gives a formal description of direct trust based on the history
of trade records in the grid and constructs a trust recommendation mechanism for grid
nodes. Combining D-S theory with evidence, indirect trust can be obtained; after that,
direct trust and indirect trust are combined effectively. The results show that the trust model
is viable and usable.
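The evidence-combination step can be illustrated with Dempster's rule over a minimal frame of discernment. The following Python sketch illustrates the rule itself and is not a reproduction of the specific model in [17]:

  # Frame {T, D}: T = trustworthy, D = untrustworthy; 'TD' carries the uncertain mass.
  def combine(m1, m2):
      inter = {('T', 'T'): 'T', ('T', 'TD'): 'T', ('TD', 'T'): 'T',
               ('D', 'D'): 'D', ('D', 'TD'): 'D', ('TD', 'D'): 'D',
               ('TD', 'TD'): 'TD'}                          # ('T','D') and ('D','T') are conflicting
      m, conflict = {'T': 0.0, 'D': 0.0, 'TD': 0.0}, 0.0
      for a in m1:
          for b in m2:
              if (a, b) in inter:
                  m[inter[(a, b)]] += m1[a] * m2[b]
              else:
                  conflict += m1[a] * m2[b]
      return {k: v / (1.0 - conflict) for k, v in m.items()}

  direct   = {'T': 0.6, 'D': 0.1, 'TD': 0.3}   # mass from the node's own trade history
  indirect = {'T': 0.4, 'D': 0.2, 'TD': 0.4}   # mass from recommendations by other nodes
  print(combine(direct, indirect))             # fused belief: mostly trustworthy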
Reference [18] proposed a trust management model based on software behavior. An
agent-based software service coordination model is presented to deal with the trust
problem in open, dynamic and changing application environments. A trust valuation
model is given to value trust relationships between software services. Trust is
abstracted as a function of subjective expectation and objective experience, and a reasonable method is provided to combine direct experience and indirect experience from others. In comparison with other work, a complete trust valuation model is
designed, and its reasonability and operability are emphasized. This model can be used
in coordination and security decisions between software services.
Non-interference theory has been introduced into the domain of trusted computing to construct a theoretic model of the trusted chain [19]. The basic theory of trusted computing is
proposed and a non-interference based trusted chain model is built from the dynamic
point of view; the model is then formalized and verified. Finally, the start-up process
based on the Linux operating system kernel is implemented. The implementation
provides a good reference for the development and application of trusted computing
theory as well.
TCG introduces the idea of trust into the computing environment, but there is still no
formalized uniform description. Trusted computing is still a technology rather than a
theory, and its basic theoretical model has not been established. The trust models above focus
on sociological human relations and are not in accordance with the definition of TCG. Moreover,
present models should be further optimized to be objective, simple and usable.
Linux is a multi-task operating system in which multiple programs run simultaneously; therefore,
verifying the former and then transiting to the latter is not viable. To maintain the
trusted state of the system, it is necessary to verify whether a file and its associated execution
parameters are trusted. At the same time, verification is also needed, before
loading for execution, for those objects which may change the system state.
Because of the properties of applications, the formal description should focus on the separation of processes, etc. The theory which supports trust transition should pay more attention to dynamic execution. Security principles such as least privilege and the
need-to-know policy should also be taken into account. The efficiency of integrity measurement and security policy enforcement should also be further improved.
In essence, non-interference theory is based on information flow theory, and it can
detect covert channels of the system, which meets the requirements of designing
high-level secure operating systems. Future research on trust models should pay more
attention to non-interference theory, which supports the construction and extension of
the trusted chain.
5 Conclusions
In this paper, we summarize the construction and formalization of the trust chain of an operating
system. We conclude that future research on the trust chain of a secure operating system
should focus on the dynamic trusted chain: make full use of the trusted mechanisms supplied
by the TPM, extending the TCG concept of trust to the application layer to study the trusted
chain and its formal description.
Acknowledgments. We are grateful to the Shanghai Municipal Education Commission's
Young Teacher Research Foundation, under grant No. SDL08024.
References
1. Shen, C.-x., Zhang, H.-g., Feng, D.-g.: Survey of Information Security. China Sci-
ence 37(2), 129–150 (2007) (in Chinese)
2. Trusted Computing Group. TCG Specification Architecture Overview [EB/OL]
[2005-03-01], http://www.trustedcomputinggroup.org/
3. Changxiang, S.: Trust Computing Platform and Secure Operating System. Network Security
Technology and Application (4), 8–9 (2005) (in Chinese)
4. Shieh, A.D.: Nexus: A New Operating System for Trustworthy Computing. In: Proceedings
of the SOSP, Brighton, UK. ACM, New York (2005)
5. Zhong, C., Shen, Q.-w.: Development of Modern Operating System. Communications of
CCF 9, 15–22 (2008) (in Chinese)
6. Arbaugh, W.A., Farber, D.J., Smith, J.M.: A Secure and Reliable Bootstrap Architecture.
In: IEEE Security and Privacy Conference, USA, pp. 65–71 (1997)
7. Sailer, R., Zhang, X., Jaeger, T., et al.: Design and Implementation of a TCG -based Integ-
rity Measurement Architecture. In: The 13th Usenix Security Symposium, San Diego (2004)
8. Tao, H., Changxiang, S.: A Trusted Bootstrap Scenario based Trusted Server. Journal of
Wuhan University (Nature Science) 50(S1), 12–14 (2004) (in Chinese)
9. Xiaoyong, L., Zhen, H., Changxiang, S.: Transitive Trust and Performance Analysis in
Windows Environment. Journal of Computer Research and Development 44(11),
1889–1895 (2007) (in Chinese)
10. Maruyama, H., Nakamura, T., Munetoh, S., et al.: Linux with TCPA Integrity Measurement.
IBM, Tech. Rep.: RT0575 (2003)
11. Nahari, H.: Trusted Embedded Secure Linux. In: Proceedings of the Linux Symposium,
Ottawa, Ontario Canada, June 27-30, pp. 79–85 (2007)
12. Hall, K., Lendacky, T., Raliff, E.: Trusted Computing and Linux,
http://domino.research.ibm.com/comm/research_projects.nsf/
pages/gsal.TCG.html/FILE/TCFL-TPM_intro.pdf
13. Strasser, M.: A Software-based TPM Emulator for Linux. Swiss Federal Institute of
Technology (2004),
http://www.infsec.ethz.ch/people/psevinc/TPMEmulatorReport.pdf
14. Patel, J., Teacy, W.T.L., Jennings, N.R., Luck, M.: A Probabilistic Trust Model for Han-
dling Inaccurate Reputation Sources. In: Herrmann, P., Issarny, V., Shiu, S.C.K. (eds.)
iTrust 2005. LNCS, vol. 3477, pp. 193–209. Springer, Heidelberg (2005)
15. Wen, T., Zhong, C.: Research of Subjective Trust Management Model based on the Fuzzy
Set Theory. Journal of Software 14(8), 1401–1408 (2003) (in Chinese)
16. Audun, J.: An Algebra for Assessing Trust in Certification Chains. In: The Proceedings of
NDSS 1999, Network and Distributed System Security Symposium, The Internet Society,
San Diego (1999)
17. Lulai, Y., Guosun, Z., Wei, W.: Trust evaluation model based on DempsterShafer evidence
theory. Journal of Wuhan University (Natural Science) 52(5), 627–630 (2006) (in Chinese)
18. Feng, X., Jian, L., Wei, Z., Chun, C.: Design of a Trust Valuation Model in Software Service
Coordination. Journal of Software 14(6), 1043–1051 (2003) (in Chinese)
19. Jia, Z., Changxiang, S., Jiqiang, L., Zhen, H.: A Noninterference Based Trusted Chain
Model. Journal of Computer Research and Development 45(6), 974–980 (2008)
(in Chinese)
A Novel Application for
Text Watermarking in Digital Reading
1 Introduction
Digital watermarking technology emerged several decades ago. Both technological
diversity and theoretical research have made major strides in this field.
But business application has always been a big problem, which perplexes the development of watermarking. Many watermarking companies have gone out of business or
suspended their watermarking efforts, except some that focus on tracking and digital
fingerprinting [1].
This problem is due, in our opinion, to a misunderstanding of the nature of digital
watermarking. The real value of digital watermarking is the information bound in the
watermark. In other words, watermarking content and application methods are the
key issues in digital watermarking research. Unfortunately, the current research in this
area offers little.
As an important part of digital watermarking, text watermarking encounters the same
problem as other kinds of watermarking technology. Although most people agree that
text watermarking should be widely used in the field of digital reading, this is not
the case. On the one hand, the lack of appropriate application methods makes text watermarking
of little use. On the other hand, text watermarking is characterized by its sensitivity to attack
(this is discussed in Section 2), which easily leads to loss of watermark information.
In this paper, an application method is proposed for text watermarking in digital
reading. We combine the original digital content and an advertisement into a whole digital
document. The rules of combination are embedded in this digital document as the watermarking information. Once the document is attacked, the advertisement will be released from it. The attacker will then have to read the advertisement together with the real content, which
serves as compensation for the copyright holder.
This paper is organized as follows: in Section 2, we review other kinds of text watermarking. The procedure of our solution is presented in general terms in Section 3, and
we expound the solution details and experimental results in Section 4. Finally, we draw our
conclusion in Section 5.
2 Related Works
In general, the research object of text watermarking is the digital document, and a document is always made up of two parts: content and format.
Many researchers choose to embed the watermark by treating content as a manifestation of language; that is semantics-based watermarking. When the watermarking
information is embedded, the content is modified according to linguistic rules. Atallah's TMR is representative of this kind of watermarking [2]. Dai and Yang also
proposed their solutions [3, 4]. This kind of watermarking technology is always time-consuming, and it is very sensitive to tampering and deletion attacks
[5]. At the same time, low capacity is another of its shortcomings.
Some other researchers treat the content as a still picture. In this way, they can
build additional redundant space in which to embed the watermark so as to enhance capacity.
Brassil and Low's line-shift/word-shift coding and word feature coding are representative of such solutions [6, 7]. In their solution, the content is a binary image; they use
tiny shifts of lines or words in different directions to embed information, and Huang
complemented their work [8]. Liu and Chen's work is based on connected domains; they regard Chinese characters as special images [9]. It is worth mentioning
that Li and Zhang propose a method which integrates the idea of Liu with semantics-based watermarking [10]. They achieve a significant reduction in overall algorithm
consumption, but OCR technology is their natural enemy. No matter how sophisticated these methods are, the majority of the watermarking information will be removed
after an OCR or retyping attack. Tampering and deletion attacks can also reduce their
performance.
Some researchers hope that the watermark can be hidden in the document's format. Fu and Wang's design is a prime example [11]. Zhou and Sun use special characters in RTF files to embed the watermark [12]. Some solutions are based on
redundant coding space, such as Zhou's work [13]. Other techniques, such as adding
blanks and other perceptible characters, have also been proposed. Although these avoid the
impact of tampering and deletion attacks, any change to the document will have a
fatal impact on them, let alone OCR and retyping attacks.
To sum up, all kinds of watermarking are sensitive to particular attacks. Little
information can be maintained under the matching niche attack.
3 Solution Design
As mentioned above, no text watermarking technology can resist all kinds
of attack; moreover, these attacks are very cheap to mount. Rather
than spending a lot of energy designing a more robust watermark, it is better to find a
method that makes the attacker unwilling to burn the house down to rid it of a mouse.
[Figure: the combination process, in which C0 and CA are combined under the watermark information W to form CW, and attack noise nA turns CW into CWn]
Step 1: C0 and CA are combined in accordance with the rules. These rules are encoded
into a special format, which is the watermarking information W. In other words, both
C0 and CA are watermarked.
Step 2: After W has been embedded, the digital document will be distributed or attacked.
In this way attack noise nA is introduced.
Once an attack happens during transmission, nA is added to the document; at this time
CW turns into CWn, as formula (2) shows. When no attack occurs, nA is given the value 0, and
CW equals CWn in this condition.
Step 3: When the user's device or client software receives CWn, it extracts W and
restores C0 according to W. If nA equals 0, the user can view the original content. But if
the watermark is totally erased, the user will have to view CA while browsing C0.
In the worst case, where W is totally distorted, only garbled characters are left.
We implemented our design on a kind of hand-held digital reading device, and applied
it in a related digital-works release platform. The algorithm in [10] is taken as the default
algorithm. We use a txt document containing 10869 Chinese characters as the experimental object.
Figure 2(a) shows the document which contains the watermark and the advertisement;
in this screenshot, it is opened directly without stripping. Figure 2(b) shows the
result of opening it after stripping: both the advertisement and the watermarking signs have been removed by the content stripper.
The original content C0 is split into N blocks, whose sizes are not equal and are denoted by the sequence {Sai}. The sequence {Sbi} is used to denote the sizes of the CA blocks.
These combination rules are described by W, which is made up of the sequence
{Wi}. We define Wi as follows (3):
Wi = {li, {Ci, ai, bi}R, S}    (3)
li: total length of Wi.
Ci: check code of ai and bi.
{·}R: redundancy construction function. It could be replaced by cryptographic
functions.
S: barrier control code. It is used to isolate different Wi. It is also helpful when a deletion attack appears. Different code words can also play the role of control-information
transmission.
ai: relative length of C0 block i. It equals the distance between the first WT in this block and
the last character in this C0 block.
bi: length of CA block i. It usually equals Sbi.
The description above concerns only the content structure. After this processing, the content can be sealed
in other document formats, such as DOC or HTML. Because of its large system consumption, this work should be carried out on content servers or distribution servers.
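As a toy sketch of how the {Wi} sequence might be assembled from the two size sequences, the Python fragment below can serve as an illustration; the check code, the redundancy construction {·}R and the barrier code S used here are simplified assumptions, not the authors' encoding:

  BARRIER = b"\x00"

  def redundancy(payload):
      return payload + payload                              # naive duplication standing in for {.}R

  def build_W(Sa, Sb):
      W = b""
      for a, b_len in zip(Sa, Sb):
          check = (a + b_len) % 256                         # toy check code Ci of ai and bi
          body = redundancy(bytes([check, a % 256, b_len % 256]))   # {Ci, ai, bi}R
          W += bytes([len(body) + 2]) + body + BARRIER      # li, body, barrier S
      return W

  print(build_W(Sa=[40, 25, 60], Sb=[8, 8, 8]).hex())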
Most mainstream DRM systems use asymmetric cryptographic algorithms and special document
formats to ensure security. Our design focuses on content, which is independent of
document format, and so it can be applied in most DRM systems smoothly.
Dynamic DRM (DDRM) is a special DRM which supports superdistribution and
distribution among users [15]; its technical basis is its license mechanism. All
digital works or digital documents on this platform must be divided into blocks
and reorganized. In particular, dynamic advertisement blocks are supported in DDRM;
that means the advertisement in a digital work can be changed as distribution
goes on.
As mentioned in Section 3, the watermark is embedded into the total content: both
C0 and CA contain the watermark. In this case, the device has to reorganize the digital document to update CA after each distribution. Obviously this is very hard for today's embedded devices.
Therefore we have to amend formula (3) and turn it into another form, as (4)
shows.
CW = F(W, C0) ⊕ CA    (4)
• A new CA block must have the same size as the old one. That is the reason why
we usually give CA blocks a larger size.
• The validity of CA should also be checked, and the barrier control code S's value is
used to indicate whether dynamic CA is supported.
As discussed in Section 2, most text watermarking algorithms, including our design, are
attack sensitive. Once an attack matches a watermarking algorithm's weakness, little
information may be left.
Rather than paying attention to more robust algorithms, we focus on risk management
after an attack happens. Figure 5 shows the result when watermarking information is lost in
our design.
Our research mainly focuses on finding a new method of applying digital watermarking;
therefore the relation to the selected algorithm needs to be discussed.
Though most algorithms can smoothly be applied in our design, we need to use
different ways of splitting the content according to different algorithms.
When an algorithm hiding information in the format is selected, the splitting method is
very easy. If the algorithm regards the text as a picture, the descriptors about size
or offset should be replaced by descriptors about coordinates.
5 Conclusion
As an application technology, digital watermarking research should not only indulge in
finding new algorithms or blindly pursue technical targets; searching for efficient application
methods may be more meaningful.
Based on the viewpoint above, this paper proposes a novel application method for text watermarking in the digital reading field. It releases an advertisement as interference information under attack. On the one hand, reducing the quality of the digital work can
inhibit unauthorized distribution; on the other hand, it points out a new way for
watermarking application.
References
1. Delp, E.J.: Multimedia security: the 22nd century approach. Multimedia Systems 11,
95–97 (2005)
2. Atallah, M.J., Raskin, V., Crogan, M., Hempelmann, C., Kerschbaum, F., Mohamed, D.,
Naik, S.: Natural language watermarking: Design, analysis, and a proof-of-concept imple-
mentation. In: Moskowitz, I.S. (ed.) IH 2001. LNCS, vol. 2137, pp. 185–200. Springer,
Heidelberg (2001)
3. Dai, Z., Hong, F., Cui, G., et al.: Watermarking text document based on statistic property
of part of speech string. Journal on Communications 28(4), 108–113 (2007)
4. Yang, J., Wang, J., et al.: A Novel Scheme for Watermarking Natural Language Text:
Intelligent Information Hiding and Multimedia Signal Processing. IEEE Intelligent Soci-
ety 2, 481–488 (2007)
5. Li, Q., Zhang, J., et al.: A Chinese Text Watermarking Based on Statistic of Phrase
Frequency: Intelligent Information Hiding and Multimedia Signal Processing. IEEE Intel-
ligent Society 2, 335–338 (2008)
6. Brassil, J., Low, S., Maxemchuk, N., O'Gorman, L.: Electronic marking and identification
techniques to discourage document copying. IEEE J. Select. Areas Commun. 13,
1495–1504 (1995)
7. Low, S., Maxemchuk, N.: Capacity of Text Marking Channel. IEEE Signal Processing
Letter 7(12), 345–347 (2000)
8. Huang, H., Qi, C., Li, J.: A New Watermarking Scheme and Centroid Detecting Method
for Text Documents. Journal of Xi’an Jiaotong University 36(2), 165–168 (2002)
9. Liu, D., Chen, S., Zhou, M.: Text Digital Watermarking Technology Based on Topology
of Alphabetic Symbol. Journal of Chinese Computer Systems 27(25), 812–815 (2007)
10. Li, Q., Zhang, Z., et al.: Natural text watermarking algorithm based on Chinese characters
structure. Application Research of Computers 26(4), 1520–1527 (2009)
11. Fu, Y., Wang, B.: Extra space coding for embedding wartermark into text documents and
its performance. Journal of Xi’an Highway University 22(3), 85–87 (2002)
12. Zou, X., Sun, S.: Fragile Watermark Algorithm in RTF Format Text. Computer Engineer-
ing 33(4), 131–133 (2007)
13. Zhou, H., Hu, F., Chen, C.: English text digital watermarking algorithm based on idea of
virus. Computer Engineering and Applications 43(7), 78–80 (2007)
14. Cox, J., Miller, M., et al.: Digital Watermarking and Steganography, 2nd edn. Morgan
Kaufmann, San Francisco (2007)
15. Li, Q., Zhang, J., et al.: A Novel License Distribution Mechanism in DRM System. In:
Proceedings of 22nd Advanced Information Networking and Applications - Workshops,
pp. 1329–1334. IEEE Computer Society, Los Alamitos (2008)
Optimization of Real-Valued Self Set for Anomaly
Detection Using Gaussian Distribution
College of Computer Science and Technology, Harbin University of Science and Technology,
150080 Harbin, China
[email protected], [email protected],
[email protected]
Abstract. The real-valued negative selection algorithm (RNS) has been a key
algorithm of anomaly detection. However, the self set which is used to train detectors has some problems, such as wrong samples, boundary invasion and
overlapping among the self samples. Because the probability distribution of
most real-valued self vectors is close to a Gaussian distribution, this paper proposes a new method which uses Gaussian distribution theory to optimize the
self set before the training stage. The method was tested on 2-dimensional synthetic
data and real network data. Experimental results show that the new method
effectively solves the problems mentioned above.
1 Introduction
The anomaly detection problem can be stated as a two-class problem: given an element of the space, classify it as normal or abnormal [1]. There exist many approaches
to anomaly detection, including statistical, machine learning, data mining and
immunologically inspired techniques [2, 3, 4]. The task of anomaly detection may be
considered analogous to the immunity of natural systems, since both aim
to detect abnormal behaviors of a system that violate the established policy [5, 6].
The negative selection algorithm (NSA) [7] has potential applications in various areas, in
particular anomaly detection. NSA trains the detector set by eliminating any candidate
that matches elements from a collection of self samples (the self set). These detectors subsequently recognize non-self data by using the same matching rule. In this way, it is
used as an anomaly detection algorithm that only requires normal data for training.
Most previous work in anomaly detection mapped the problem to a binary representation. However, many applications are naturally described in real-valued space;
furthermore, such problems can hardly be processed properly using the NSA in binary
representation. Gonzalez et al. [8] proposed the real-valued representation for the
self/non-self space, which differs from the original binary representation. A real-valued self sample or detector is a hypersphere in n-dimensional real space; it can be represented by an n-dimensional point and a radius.
As we know, the detection performance mainly relies on the quality of the detector set: the detector coverage of the non-self space. The detectors are trained on the self
set, so the self set plays an important role in the whole detection process, and the
health of the self set deserves enough attention. It should be noted that
the real-valued self set indeed has some problems which the binary representation does
not have, such as wrong self samples, overlapping of self samples and boundary
invasion. To solve these problems, this paper proposes a novel method to optimize the
self set. Starting from the point that the self samples approach a Gaussian distribution
when the number of samples is large enough, the method employs Gaussian distribution theory to deal with the problems mentioned above.
Before training detectors, the self set must be reliable. However, the real situation may
not always meet this requirement, and the self set is likely to collect some
wrong samples. Figure 1(a) illustrates the problem of wrong self samples in 2-dimensional
space: two samples are so far from the self region that we can consider them
wrong self samples. Wrong self samples may result in holes, so it is important to
discard them before RNS.
Fig. 1. (a) The wrong self samples; (b) the overlapping; (c) the boundary invasion
The overlapping self samples: in a real working situation, the self samples are densely distributed. As shown in Figure 1(b), a large number of self samples are densely distributed
in the self region, the overlapping rate is very high, and there are many unnecessary
samples in the self region. We now discuss the relationship between the number of self
samples and the number of candidates. Some variables are defined as below:
f : the probability that a candidate fails to match any self sample
pf : the probability of an attack that is not detected
pm : the matching probability
Ns : the number of self samples
Nd : the number of mature detectors
Nd0 : the number of candidate detectors
f = (1 − pm)^Ns ≈ e^(−Ns·pm).
When Nd is large enough,
pf = (1 − pm)^Nd ≈ e^(−Nd·pm).
Because
Nd = Nd0 · f = −ln(pf) / pm,
so
Nd0 = −ln(pf) / (pm (1 − pm)^Ns)    (1)
From equation (1), we can see that Nd0 grows exponentially with Ns, so the more self samples, the more candidates and the higher the cost of detector
training.
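Equation (1) can be evaluated directly; the parameter values in the short Python sketch below are illustrative only:

  import math

  def candidates_needed(p_f, p_m, N_s):
      # N_d0 = -ln(p_f) / (p_m * (1 - p_m)^N_s), equation (1)
      return -math.log(p_f) / (p_m * (1.0 - p_m) ** N_s)

  for N_s in (50, 100, 200):
      print(N_s, round(candidates_needed(p_f=0.1, p_m=0.05, N_s=N_s)))
  # the required number of candidates grows exponentially with the size of the self set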
The following equation defines an approximate measure of the overlapping between two self
samples:
Overlapping(si, sj) = e^(−||si − sj||² / rs²)    (2)
The maximum value, 1, is reached when the distance between the two samples is 0; on the
contrary, when the distance is equal to 2rs, the value of this function is almost
0. Based on equation (2), the amount of overlapping of a self set is defined as
Overlapping(S) = Σ_{i≠j} e^(−||si − sj||² / rs²),  i, j = 1, 2, ..., n.    (3)
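Equations (2) and (3) can be computed directly; a small Python sketch with illustrative sample coordinates follows:

  import numpy as np

  def overlapping(S, r_s):
      # equation (3): sum over i != j of exp(-||s_i - s_j||^2 / r_s^2)
      total = 0.0
      for i in range(len(S)):
          for j in range(len(S)):
              if i != j:
                  total += np.exp(-np.sum((S[i] - S[j]) ** 2) / r_s ** 2)
      return total

  S = np.array([[0.50, 0.50], [0.52, 0.50], [0.80, 0.80]])
  print(overlapping(S, r_s=0.05))    # the two nearby samples dominate the overlap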
Because each self sample has a radius, near the border of the self region the covered area
invades the non-self region. This phenomenon can be seen clearly in
Figure 1(c). So when the detectors are trained on the self set, the vicinity of the non-self
boundary may not be covered completely.
3 Optimization Method
In view of the analysis above, optimizing the self set is significant. The aim of the
optimization is to use the fewest self samples to cover the self region but not the non-self region. The optimization problem can be stated as follows:
minimize:
V(S) = Volume{x ∈ U | ∃s ∈ S, ||x − s|| ≤ rs}    (4)
restricted to:
{s ∈ S | ∃d ∈ D, ||s − d|| ≤ rs} = ∅    (5)
and
{si, sj ∈ S, i ≠ j | ||si − sj|| ≤ rsi or ||si − sj|| ≤ rsj} = ∅    (6)
The function defined in equation (4) represents the amount of the self space covered by
the self samples S; it corresponds to the volume covered by the union of the hyperspheres associated with each self sample. The restrictions are: equation (5) says that no self
sample should invade the detector set; equation (6) says that no self sample is covered by
another sample in the self set.
As we said before, if the number of self samples is large enough, the probability
distribution of the self samples approaches a Gaussian distribution. Furthermore, the
central limit theorem states that whatever probability distribution the
samples follow, the distribution of the samples' mean is close to a Gaussian distribution, and the mean of that Gaussian distribution equals the mean of all samples.
From the Gaussian distribution we can also see that the smaller the
distance between a sample and the mean point, the larger the value of the sample's probability density. As discussed before, the nearer a sample is to the
center of the self region, the heavier the overlapping. So we can use this relationship to deal with the problem of unnecessary self samples. The method proposed in this paper adjusts every sample's radius according to its probability density to
deal with the boundary invasion, and then discards unnecessary samples according to the
radius.
There is an important and famous proposition for the Gaussian distribution: the "3σ"
criterion, which states that almost every normal sample lies in the "3σ" district, although the range is from −∞ to ∞. So we can use the "3σ" criterion to deal with
the problem of the wrong self samples.
From the analysis above, we may conclude that optimization based on probability theory is reasonable. To solve the three problems with the self set mentioned
above, the optimization is composed of three steps as follows:
Step 1: Discard the wrong self samples by the "3σ" criterion;
Step 2: Adjust each self radius by the sample's probability density;
Step 3: Discard the unnecessary self samples by the self radius.
Before describing the optimization, some variables frequently used in the following
sections are defined here:
116 L. Xi, F. Zhang, and D. Wang
BEGIN
  // N0: number of collected samples; num: number of radius levels; k: unit radius
  Collect the self samples: S0 ← s;
  // Step 1: discard the wrong self samples
  Regularize S0, then compute the mean µ and the standard deviation σ;
  n = 0;
  while (n < N0) {
    if (sn is outside the "3σ" district) {
      discard sn; N0--;
    }
    n++;
  }
  // Step 2: adjust every sample's radius
  Compute maxpdf and minpdf over all samples;
  L = (maxpdf − minpdf) / num;
  n = 0;
  while (n < N0) {
    l = (pdf(sn) − minpdf) / L;
    sn.r = l × k;
    n++;
  }
  // Step 3: discard the unnecessary self samples
  S ← s0; N = 1; n = 1;
  while (n < N0) {
    flag = 0; i = 0;
    while (i < N) {
      d = ||sn − si||;
      if (d < sn.r || d < si.r) {
        flag = 1; break;
      }
      i++;
    }
    if (flag == 0) {
      S ← sn; N++;
    }
    n++;
  }
END
In step 1, a self sample is discarded if it lies outside the "3σ" district. The time complexity of this step is O(n).
In step 2, the nearer a self sample is to the center of the self region, the larger its radius becomes. The radii of the samples near the boundary of the self region are therefore adjusted to a reasonable level so that the boundary invasion is avoided, while around the central region each sample's radius becomes larger than before, so the overlap there is temporarily even heavier. The time complexity of this step is O(n).
In step 3, discarding the unnecessary samples resolves the increased overlap introduced by step 2. Samples with larger radii cover more space than the original fixed-radius samples, so the self region can be covered by fewer samples than before. The time complexity of this step is O(n^2).
After the optimization the self set is more rational than before; the total time complexity of the optimization is O(n^2).
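For concreteness, the three steps can be sketched in NumPy/SciPy roughly as follows. This is only an illustrative sketch of the pseudocode above, not the authors' implementation: the per-dimension "3σ" check, the diagonal Gaussian density, and the default values of the unit radius k and the number of density levels num are assumptions made for the example.

import numpy as np
from scipy.stats import multivariate_normal

def optimize_self_set(S, k=0.005, num=100):
    # Step 1: discard wrong samples outside the 3-sigma district (checked per dimension)
    mu, sigma = S.mean(axis=0), S.std(axis=0)
    S = S[np.all(np.abs(S - mu) <= 3 * sigma, axis=1)]
    # Step 2: set each sample's radius from its probability density level
    pdf = multivariate_normal(mean=mu, cov=np.diag(sigma ** 2)).pdf(S)
    level_width = (pdf.max() - pdf.min()) / num
    radii = ((pdf - pdf.min()) / level_width) * k
    # Step 3: discard samples already covered by an accepted sample
    kept = [0]
    for n in range(1, len(S)):
        d = np.linalg.norm(S[kept] - S[n], axis=1)
        if not np.any((d < radii[n]) | (d < radii[kept])):
            kept.append(n)
    return S[kept], radii[kept]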
First, a fixed number of random points drawn from a normal distribution are normalized to compose the self set (198 samples, radius fixed at 0.05). This set therefore contains no wrong samples, so the step-1 optimization is illustrated only in the real-network experiment below. The results of each optimization step are shown in Fig. 2.
Fig. 2. (a) The initial self set; (b) the self set optimized by step 2; (c) the self set optimized by step 3; (d) detectors trained by the initial self set; (e) detectors trained by the self set optimized by step 2; (f) detectors trained by the self set optimized by step 3
Figure 2(a) shows the initial self set; the overlap and the boundary invasion are both extremely severe. The detectors trained on this initial self set are shown in Fig. 2(d), where the boundary of the non-self region is clearly not covered completely, which is the consequence of the self samples' boundary invasion. The optimized self sets and their detector sets are shown in the remaining panels of Fig. 2 (the unit radius is set to 0.005). After all steps of the optimization, the number of self samples drops to 67 and Overlapping(S) drops to 2.71% (by equation (3)). As the figures show, adjusting each sample's radius avoids the boundary invasion so that the boundary of the non-self region is covered well, while the coverage of the self region by the remaining self samples stays almost the same as before.
In the real-network experiment, the self samples come from the network security lab of HUST (Harbin University of Science and Technology). Each sample is two-dimensional: pps (packets per second) and bps (bytes per second). We collected 90 samples and plotted their joint probability distribution in Fig. 3; the joint distribution is quite similar to a standardized Gaussian distribution.
Fig. 3. (a) The joint probability distribution of pps and bps; (b) the two-dimensional standardized Gaussian distribution
The results of the optimization are shown in Fig. 4: the optimized self sets are shown in Fig. 4(a–d), and the corresponding detector sets in the remaining panels of Fig. 4.
As shown in Fig. 4(a), the self set contained 3 wrong self samples that were far away from the self region; consequently, the detector set generated from the initial self set has holes, as shown in Fig. 4(e). After step 1, the wrong samples were discarded (Fig. 4(b)). The results of the other optimization steps are analogous to the simulation experiment above. The final self set contains 19 samples.
To compare detector generation with the initial and the optimized self set, we used the two sets to generate 3000 detectors each. As Fig. 5 clearly shows, detector generation is faster with the optimized self set than with the initial one.
Fig. 5. The detector generating speed comparison between initial and optimized self set
To compare the detection performance of the two detector sets, we used SYN flooding to attack two hosts equipped with the different detector sets simultaneously. The results in Table 1 are averages over 100 experiments. The table clearly shows that the detection rate of the detectors trained on the optimized self set is much higher than that of the other set (98.9% vs. 83.7%), and its false alarm rate is remarkably lower (2.3% vs. 8.3%).
Table 1. Detection results (averaged over 100 experiments)

Detector Set    Connection    Invasion   Detecting   Detection Rate   False Alarm Rate
By initial      13581653.3    100        92.5        83.7%            8.3%
By optimized    13581653.3    100        101.2       98.9%            2.3%
5 Conclusion
In this paper we have presented a probability-theoretic optimization of the real-valued self set used in anomaly detection. Experimental results demonstrate that this optimization is necessary and that the method is effective. Moreover, the optimized self set provides a favorable foundation for generating detectors. The detector set trained on the optimized self set is more reliable because of the following advantages of the optimization algorithm:
1. Holes can be better covered by the generated detectors because of discarding the
wrong self samples.
2. Time to generate detectors is saved by using a smaller number of self samples, which also requires less space to store them.
3. The radii of the border self samples are adjusted to a rational range, so the boundary invasion is resolved and the false alarm rate is reduced noticeably.
The influence of real-valued self set optimization needs further study, including more real-network experiments and stricter analysis. The meaning of the self radius, i.e., how to interpret each self sample, is also an important topic to be explored.
Acknowledgment
This work was sponsored by the National Natural Science Foundation of China
(Grant No.60671049), the Subject Chief Foundation of Harbin (Grant
No.2003AFXXJ013), the Education Department Research Foundation of Heilongji-
ang Province (Grant No. 10541044, 1151G012).
A GP Process Mining Approach from a Structural
Perspective
1 Introduction
Today, information about business processes is mostly recorded by enterprise infor-
mation systems such as ERP and workflow management systems in the form of
so-called “event logs” [1]. As in many domains processes are evolving and become
more and more complex, there is a need to understand the actual processes based on
logs. Process mining is therefore employed, since it automatically analyzes these logs to extract explicit process models. Currently, most process mining techniques try to mine models from the behavioral aspect only (i.e., to reflect the exact behavior expressed in the logs) while ignoring the complexity of the mined models. However, as processes become more complex, process designers and analysts also want structurally simple models, since complex processes may result in errors, poor understandability, defects and exceptions [2]. Therefore, "good" process models are required to conform to the logs and at the same time to have a simple structure that clearly reflects the desired behavior [3]. To consider both the behavioral and the structural aspect, we propose a genetic process mining approach coupled with a process complexity metric.
In fact, utilizing evolutionary computation in process mining research is not a new
concept. In 2006, Alves introduced the genetic algorithm (GA) to mine process
models due to its resilience to noisy logs and the ability to produce novel sub-process
combinations [4]. In this case, an individual is a possible model and the fitness
evaluates how well an individual is able to reproduce the behavior in the log. How-
ever, since the individual was abstracted to the level of a binary string, it had prob-
lems when mining certain processes, especially those exhibiting high level of parallel
execution. Thus, a genetic programming (GP) approach coupled with graph-based
representation was proposed in [5]. Its abstraction of a directed graph structure also
allows greater efficiency in evaluating individual fitness.
However, these approaches have two main problems: first, some relations between process tasks are not considered in the individual representation; second, they neglect the complexity of the mined models. We therefore improve the GP approach by extending the individual representation in [4] and define a new part of the fitness measure that benefits individuals with simpler structure. The new structural fitness is based on the structuredness metric (SM) proposed in [6], a process complexity metric. The paper is organized as follows: Section 2 defines the individual representation and Section 3 extends the SM. The GP approach is presented in Section 4 and experiments are discussed in Section 5. Finally, conclusions are drawn in Section 6.
As we can see, without support for other business logic, previous GA and GP approaches may produce unexpected errors. For instance, consider the event log shown in Table 1, which contains three different process instances for job applicants. When HR screens a resume (A), he or she either declines the resume (E) or, when the resume is accepted, arranges examinations consisting of an interview (B) and a written test (C) and scores the applicant (D). Finally, the result is sent to the applicant (F).
The expected model is shown in Fig. 1(a), where the black rectangle is an implicit task. Previous approaches, however, wrongly generate Fig. 1(b), since their individual representation cannot express the relation of an exclusive disjunction of conjunctions. Therefore, we give an improved version of the causal matrix that supports such relations, the extended CM (ECM for short).
Fig. 1. Expected model (a) and mined result using previous GA and GP (b)
An ECM can be denoted as a quadruple (T, C, I, O), where T is the finite set of activities (tasks) appearing in the workflow log, C ⊆ T × T is the causality relation, and I and O are the input and output condition functions, respectively.
We choose a tree representation in the ECM because it makes it feasible to express the OR relation. Moreover, it facilitates operations such as crossover and mutation, since entire sub-tree structures can be swapped with ease. Note that there may be duplicate tasks in sub-trees. Like the original CM, our extended CM can also deal with loop structures.
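To make the quadruple concrete, a minimal Python sketch of an ECM individual is given below, with the condition functions I and O stored as operator trees. The class and field names are illustrative only and do not reproduce the authors' implementation.

from dataclasses import dataclass, field

@dataclass
class Node:
    # Operator node (op in {"AND", "OR", "XOR"}) with children, or a leaf holding a task
    op: str = None
    task: str = None
    children: list = field(default_factory=list)

@dataclass
class ECM:
    T: set                                   # finite set of tasks found in the log
    C: set                                   # causality relation, pairs (t1, t2) in T x T
    I: dict = field(default_factory=dict)    # task -> input condition tree
    O: dict = field(default_factory=dict)    # task -> output condition tree

# Example input condition tree AND(XOR(A,B,C), OR(B,D), E), as used later in Sect. 4.3
i_K = Node(op="AND", children=[
    Node(op="XOR", children=[Node(task="A"), Node(task="B"), Node(task="C")]),
    Node(op="OR", children=[Node(task="B"), Node(task="D")]),
    Node(task="E"),
])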
The GP approach in this paper tries to mine process models not only from the behavioral perspective but also from the structural one. This is done by defining a new partial structural fitness. We first need a process complexity metric based on the ECM. Many complexity metrics for process models have been proposed; in this paper we choose the structuredness metric (SM) [6], which focuses on the model structure. Additionally, both the GA mining algorithm and the SM have been implemented in the context of ProM, a pluggable framework for process analysis, so the improved GP can be implemented there with some modifications.
In this section, we extend the Petri net version of the SM in [6] to an ECM-based SM. The idea behind this metric stems from the observation that process models are often structured as combinations of basic patterns such as sequence, choice, parallelism and iteration [7]. To calculate the complexity, we define a component as a sub-ECM that identifies a particular kind of structure in the ECM, and score each structure by giving it a "penalty" value. Finally, the sum of these values measures the complexity of the process model, i.e., the complexity of the individual in the GA. Some definitions are given below:
Firstly, some structural properties based on ECM are discussed.
Definition 1. Free choice. An ECM is a free-choice ECM iff for every two tasks t1 and t2, I(t1) ∩ I(t2) ≠ ∅ implies I(t1) = I(t2).
Definition 2. State machine. An ECM is a State machine iff for every task t, there exists
no AND/OR in I(t) and O(t).
Definition 3. Marked graph. An ECM is a marked graph iff for every task t, there exists
no XOR in I(t) and O(t).
A component corresponds to a behavioral pattern and is basically a sub-ECM; the penalty value ρ_ECM(C) of each kind of component C is:

C (Component)                  ρ_ECM(C)
MAXIMAL SEQUENCE               Σ_{t∈T} τ(t)
CHOICE                         1.5 · Σ_{t∈T} τ(t)
WHILE                          Σ_{t∈{I(C),O(C)}} τ(t) + 2 · Σ_{t∈T} τ(t)
MAXIMAL MARKED GRAPH           2 · Σ_{t∈T} τ(t) · diff(T)
MAXIMAL STATE MACHINE          2 · Σ_{t∈T} τ(t) · diff(P)
MAXIMAL WELL STRUCTURED        2 · Σ_{t∈T} τ(t) · diff(P) · diff(T)
otherwise                      5 · Σ_{t∈T} τ(t) · diff(P) · diff(T)
We build the initial causal matrices in a heuristic way that determines the causality relation using a dependency measure, which estimates the strength of the relationship between two tasks by counting how often one task is directly preceded by the other. The measure is also able to determine which tasks are in a loop.
Once the causality relation is determined, the condition functions I and O are ran-
domly built. i.e., every task t1 that causally precedes a task t2 is randomly inserted as a
leaf in the tree structure, which corresponds to the input condition function of t2. Op-
erators (AND, OR and XOR) are also inserted randomly to construct a tree. A similar
process is done to set the output condition function of a task.
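The exact dependency measure is not spelled out here, so the sketch below only illustrates the direct-succession counting idea described above: count how often one task directly precedes another in the log and normalize the counts. The normalization formula and the function names are assumptions made for the example.

from collections import Counter

def dependency_measure(log):
    # log: a list of traces, each trace a list of task labels
    follows = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            follows[(a, b)] += 1
    def dep(a, b):
        ab, ba = follows[(a, b)], follows[(b, a)]
        # close to 1 when a is almost always directly followed by b (and rarely the reverse)
        return (ab - ba) / (ab + ba + 1)
    return dep

# Example: dep = dependency_measure([["A", "B", "D"], ["A", "C", "D"]]); dep("A", "B") > 0 > dep("B", "A")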
The fitness function assesses the quality of an individual by:
1) benefiting individuals that can parse more event traces in the logs (the "completeness" requirement);
2) punishing individuals that allow for more extra behavior than the behavior expressed in the logs (the "preciseness" requirement);
3) punishing individuals that are complex, i.e., individuals with a high SM value (the "uncomplexity" requirement).
Equation (1) depicts the fitness function of our GP-based process mining algorithm, in which L denotes the event logs and ECM an individual. The notation ECM[] represents a generation of process models. The three partial fitness measures are detailed below.
In equation (1), the functions PFcomplete and PFprecise are derived from the previous GA approach and measure completeness and preciseness from the behavioral perspective. Since the ECM is a generalization of the CM, these functions are nearly the same as in the previous GA approach [8].
The function PFcomplete in equation (2) is based on continuously parsing all traces of the event log against an individual. All missing tokens and all extra tokens left behind act as a penalty in the fitness calculation; more details can be found in [8]. Obviously, parsing of the 'OR' construct must also be supported.
PFcomplete(L, ECM) = ( allParseTasks(L, ECM) − punishment(L, ECM) ) / numTasksLog(L)    (2)
In terms of preciseness, an individual should exhibit "less extra behavior", i.e., it should tend to have fewer enabled activities. PFprecise in equation (3) measures the amount of extra behavior an individual allows for in comparison to the other individuals in the generation [8].
PFprecise(L, ECM, ECM[]) = allEnabledTasks(L, ECM) / max( allEnabledTasks(L, ECM[]) )    (3)
As mentioned before, the GA approach and most other process mining techniques focus on the behavior in the logs but ignore the structure of the individual itself. We therefore define the uncomplexity fitness in equation (4), which transforms the structuredness metric values to the interval [0, 1], where max and min are the maximum and minimum SM values of the entire generation of individuals, respectively.
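Since equation (4) is only described verbally here, the following one-function sketch shows one plausible way to map the SM values of a generation onto [0, 1] so that structurally simpler individuals (lower SM) receive the higher partial fitness. The exact form of equation (4) is an assumption.

def pf_uncomplex(sm_value, sm_generation):
    # sm_generation: structuredness metric values of all individuals in the generation
    lo, hi = min(sm_generation), max(sm_generation)
    if hi == lo:
        return 1.0  # degenerate case: all individuals are equally complex
    # lower SM (simpler structure) maps closer to 1, higher SM closer to 0
    return (hi - sm_value) / (hi - lo)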
4.3 GP Operators
Genetic operators such as selection, crossover and mutation are used to generate the individuals of the next generation; the operators in [8] have been adapted for our GP approach. In our case, the next population consists of the best individuals plus others generated via crossover and mutation, and parents are selected from a generation via a five-individual tournament selection.
For crossover, a task that exists in both parents is selected randomly as the crossover point. The input and output trees of that task are then split at a randomly chosen swap point (a set of tree nodes) for each parent. Our crossover algorithm covers the complete search space defined by the ECM. The following illustrates the crossover algorithm on the input trees of two parents for task K.
• Crossover algorithm in our GP process mining approach
parentTask1 = AND(XOR(A,B,C),OR(B,D),E)
parentTask2 = OR(AND(A,B),XOR(C,D),F)
swap1 = XOR(A,B,C),OR(B,D); remainder1 = E
swap2 = AND(A,B),F;         remainder2 = XOR(C,D)
If crossover, for each branch in swap1:
  If the selected branch shares the same operation with a
  branch of remainder2 (e.g., XOR(A,B,C) and XOR(C,D) are both
  XOR), select, with equal probability, one of the following
  three crossover methods:
    i)   Add it as a new branch in remainder2
         (e.g., XOR(A,B,C),XOR(C,D))
    ii)  Join it with the existing branch in remainder2
         (e.g., XOR(A,B,C,D))
    iii) Add it as a new branch, then remove from the branch
         with the same operation in remainder2 the tasks that
         are in common (i.e., the same tasks)
         (e.g., XOR(A,B,C),D)
  Else select one crossover method conditionally:
    If the operation of the selected branch is the same as the
    root of parentTask2:
      Add each task of the selected branch as a new leaf of the
      root in remainder2 (e.g., XOR(C,D),B,D)
    Else add the selected branch as a new branch in remainder2
Repeat for the combination swap2 and remainder1
Examples are included in the crossover pseudocode to aid understanding. After execution, the resulting child input tree for task K may be one of:
OR(XOR(A,B,C),XOR(C,D),B,D); OR(XOR(A,B,C,D),B,D); OR(XOR(A,B,C),B,D).
The cycle is repeated for both the input and the output trees of the selected task.
In mutation, every task in the individual has its input and output trees mutated with a probability equal to the mutation rate. The following program depicts the mutation.
• Mutation algorithm in our GP process mining approach
If mutation, for each task t in the individual (assuming I(t)
= AND(XOR(A,B,C),OR(B,D),E)), one of the following operations
is performed:
  i)   Choose a branch and add a task to it (randomly chosen
       from the complete set of tasks in the individual); if
       the branch is a leaf (i.e., a single task), randomly add
       an operation on the new branch.
       (e.g., XOR(A,B,C),OR(B,D,F),E)
  ii)  Remove a task from a chosen branch
       (e.g., XOR(A,B,C),OR(B,D))
  iii) Change the operation of the chosen branch
       (e.g., XOR(A,B,C),XOR(B,D),E)
  iv)  Redistribute the elements in I(t)
       (e.g., XOR(B,C),OR(A,B),D,E)
Repeat for the output tree of each task
Both crossover and mutation operators utilize a repair routine executed after an in-
put/output has been changed to make sure that only viable changes are made.
5 Experiments
We have implemented our GP process mining approach in Java and plugged it into the ProM framework. The experiments in this section validate the approach; Table 4 shows the parameters used in the experiments.
Fig. 3. Two mined models (a) using previous GA/GP approach; (b) using our GP approach
Fig. 4. Another two mined models: (a) using the previous GA/GP approach; (b) using our GP approach
To compare with the previous GA/GP approaches, we tested various workflow logs; for brevity, we give two typical comparisons. Fig. 3(b) shows the expected model, which our GP approach mined successfully, while the previous approaches produced a confusing model, because our GP approach supports the task relation of a disjunction of conjunctions. Fig. 4 shows the two models mined from a process containing a short parallel construct; it illustrates that the structural uncomplexity fitness helps our GP method outperform the others when dealing with process models containing short parallel constructs.
6 Conclusions
The advantage of our GP process mining approach is that it mines processes both behaviorally and structurally, owing to the combination of genetic programming with the structuredness metric. Additionally, the extended CM used to represent individuals provides benefits in crossover and mutation. Thus, our GP approach outperforms other process mining approaches in several respects.
References
[1] van der Aalst, W.M.P., van Dongen, B.F., Herbst, J., Maruster, L., Schimm, G., Weijters,
A.J.M.M.: Workflow mining: a survey of issues and approaches. Data & Knowledge En-
gineering 47(2), 237–267 (2003)
[2] Cardoso, J.: Control-flow complexity measurement of processes and Weyuker’s properties.
Transactions on Enformatika, Systems Sciences and Engineering 8, 213–218 (2005)
[3] Rozinat, A., van der Aalst, W.M.P.: Conformance Testing: Measuring the Fit and Appro-
priateness of Event Logs and Process Models. In: Bussler, C.J., Haller, A. (eds.) BPM 2005.
LNCS, vol. 3812, pp. 163–176. Springer, Heidelberg (2006)
[4] Alves de Medeiros, A.K., Weijters, A.J.M.M.: Genetic Process Mining. Ph.D Thesis,
Eindhoven Technical University, Eindhoven, The Netherlands (2006)
[5] Turner, C.J., Tiwari, A., Mehnen, J.: A Genetic Programming Approach to Business
Process Mining. In: GECCO 2008, pp. 1307–1314 (2008)
[6] Lassen, K.B., van der Aalst, W.M.P.: Complexity metrics for Workflow nets. Information
and Software Technology 51, 610–626 (2009)
[7] van der Aalst, W.M.P., ter Hofstede, A.H.M., Kiepuszewski, B., Barros, A.P.: Workflow
patterns. Distributed and Parallel Databases 14(1), 5–51 (2003)
[8] Alves de Medeiros, A.K., Weijters, A.J.M.M., van der Aalst, W.M.P.: Genetic process min-
ing: an experimental Evaluation. Journal of Data Mining and Knowledge Discovery 14(2),
245–304 (2007)
Effects of Diversity on Optimality in GA
1 Introduction
Genetic Algorithm (GA) is an Evolutionary Computation (EC) method proposed by
Holland [1] which searches the binary space for solutions that satisfy pre-defined
criteria. GA is a heuristic search method that contains three operators: Selection,
Combination and Mutation. It operates on a collection of individuals (a population), whose members have the potential to differ from each other.
The GA works by identifying individuals which most closely satisfy pre-defined
criteria, called selection. These selected individuals (the fittest) are combined. Each
combination cycle is called a generation. To create new individuals, the combined
individuals must differ. The amount of difference in the population is called its diver-
sity. Combination occurs by swapping parts between the individuals. The combination
methods can vary significantly [2-6]. Goldberg [3] suggested Single Point Crossover
(SPC) and Multi-Point Crossover (MPC) as pattern identifying crossover methods.
Punctuated Crossover [4] is a deterministic crossover method which alters the prob-
ability based on the success of previous crossover points. Random Respectful Cross-
over [5] produces offspring by copying the parts at positions where the parents are
identical and filling the remaining positions with random members from the bit set.
Disrespectful Crossover (DC) [6] is a variant on SPC, where the bits in the section
below the crossover point are inverted if they are different or have a 50% chance of
being in either state if they are the same. DC is essentially a localized high probability
mutation algorithm incorporated into a crossover function. Its purpose is to encourage search divergence, not search convergence. Reduced Surrogate [4] randomly
selects bits which are different as the crossover points for single or double point
crossover. This has the advantage of not performing redundant crossovers and is a
more efficient crossover as search diversity decreases. Hoshi [2] showed that encour-
aging similarly ranked individuals to combine resulted in superior solutions.
If the combination of two different individuals can lead to all individuals in the
search space then it is possible that the globally optimal solution will be found. How-
ever, in GA, like in biology, Combination and Selection are what makes the search
converge, as traits in individuals become more common after successive generations.
It is believed that these traits are what make the individuals ‘fit’. However these
‘traits’ could be sub-optimal. If this is the case, a population will fail to continually
improve and will have reached a locally optimal solution. This means that the GA
may not produce an optimal solution for the problem, i.e., the global optima.
To encourage the population to avoid local optima, individuals are altered in a
probabilistic manner, called mutation [3]. This is modeled on biological mutation,
which is the alteration of individual genes within an organism's DNA. Commonly,
mutation is implemented in GA by probabilistically changing bit states, whereby
every single bit in an individual has a specific probability of changing state. This type
of mutation is also known as De-Jong mutation [7].
The effectiveness of De-Jong mutation has proven to be dependent on the mutation
probability and the effectiveness of a mutation probability has been shown to be prob-
lem dependent [8-13]. The mutation probability affects the probability of an individ-
ual being mutated as well as the number of bits mutated within that individual. A high
mutation probability means that an individual has a high chance of having a large
number of bits altered; this results in a highly random search which mostly ignores
convergence through crossover. A low mutation probability means that an individual
has a low chance of having a small number of bits altered; this creates a search which
is highly reliant on combination.
Searches which ignore combination do not converge. Searches which are entirely
reliant on combination are prone to premature convergence. Mutation alone amounts
to a random search and forever diverges. Mutation when combined with selection and
combination is a parallel, noise tolerant, hill-climbing algorithm [8]. Hill-climbing
algorithms guarantee the location of a local optimum, but will be deceived by a prob-
lem with many local optima. Methods of determining the mutation probability based
on generation [9], population size [10], length [11], crossover success and the rate of
change in fitness have been investigated [12]. A method which prolongs the GA
search time by systematically varying the mutation rate based on population diversity
has significantly improved the final solution optimality [13]. These mutation strate-
gies have improved the GA in a problem-independent fashion. However, they do not significantly change the ideology behind mutation. They do not address the search algorithm itself; they primarily present ways of altering one of the algorithm's parameters in order to prolong the search or to disrupt convergence more reliably. This is unlikely to make the search more efficient or more robust, but rather just more exhaustive. They also do not allow convergence and divergence to be complementary.
In this paper a new mutation method is introduced to improve the diversity of the population during convergence. The method is also intended to ensure that convergence is not disrupted by the mutation process.
The remainder of the paper is organized as follows: Section 2 details the methodology used to maintain diversity while ensuring convergence, Section 3 presents the numerical results, and Section 4 gives the conclusions.
D = n/2 − | n/2 − d | .    (2)
Mutation traditionally requires specifying and tuning the rate at which mutation affects the population. The mutation rate for the CDM and for the implemented version of RI was determined by the number of repeated individuals in the population, which requires comparing all individuals in the population with one another. The computational cost of this comparison is:
(N^2/2 − N) × L .    (3)
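As an illustration of this comparison step, the sketch below counts repeated individuals by pairwise comparison of the bit strings in the population; for N individuals of length L this takes on the order of the (N^2/2 − N) × L bit comparisons given in equation (3). The function name is illustrative only.

def count_repeats(population):
    # population: list of equal-length bit strings, e.g. ["0101", "0101", "1100"]
    repeats = 0
    n = len(population)
    for i in range(n):
        for j in range(i + 1, n):               # roughly N^2/2 pairs
            if population[i] == population[j]:  # each comparison inspects up to L bits
                repeats += 1
    return repeats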
3 Results
The GA testbed used ranking selection without elitism and was tested with two population sizes, 50 and 100, over 20 generations. The tests were conducted using both single-point and dual-point crossover. The mutation methods were tested on six benchmark functions with various shapes: two multi-modal functions, two single-optimum functions and two flat functions. The functions were tested over 3 to 5 dimensions with 6 to 12 bit resolution per dimension. The mutation methods were compared with respect to diversity and final fitness. The results were averaged over 100 tests.
The following six benchmark functions [16] are used in this paper to evaluate the effectiveness of the proposed CDM:
Two single-optimum functions:
4. Griewank’s Function:
f(x) = (1/4000) Σ_{i=1}^{n} xi^2 − Π_{i=1}^{n} cos( xi / √i ) + 1 .    (7)
6. Rosenbrock’s Function:
f(x) = Σ_{i=1}^{n−1} ( 100 (x_{i+1} − xi^2)^2 + (xi − 1)^2 ) .    (9)
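A direct NumPy transcription of the two functions quoted above is given below; the other four benchmark functions are not reproduced in the text, so they are omitted here.

import numpy as np

def griewank(x):
    # Equation (7)
    x = np.asarray(x, dtype=float)
    i = np.arange(1, len(x) + 1)
    return np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0

def rosenbrock(x):
    # Equation (9)
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (x[:-1] - 1.0) ** 2)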
The GA results for the six benchmark functions, in terms of mean final fitness, are shown in Tables 1–6. The results were obtained with the different mutation methods: the proposed CDM, the completely random RI, and the De-Jong method with mutation probabilities of 0.001, 0.002, 0.005 and 0.
Table 1. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             0.38   0.86   1.21    0.62    0.16    1.89
Dual point / 6-Bit               1.56   3.24   2.86    2.14    0.92    4.16
Single point / 12-Bit            0.3    0.39   0.54    0.36    0.07    0.9
Dual point / 12-Bit              1.58   2.11   1.79    1.26    0.5     2.56

Table 2. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             0.1    0.18   0.24    0.12    0.01    0.62
Dual point / 6-Bit               0.07   0.08   0.07    0.04    0.01    0.2
Single point / 12-Bit            0.81   1.77   1.31    0.77    0.1     3.22
Dual point / 12-Bit              0.66   0.9    0.65    0.27    0.03    1.43

Table 3. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             1.66   2.34   4.56    4.15    4.04    4.89
Dual point / 6-Bit               1.59   1.89   2.98    3.04    2.69    3.3
Single point / 12-Bit            6.55   9.76   10.56   9.74    7.38    15.81
Dual point / 12-Bit              5.34   6.06   5.91    5.76    4.72    7.27

Table 4. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             0.04   0.05   0.07    0.05    0.03    0.11
Dual point / 6-Bit               0.04   0.03   0.05    0.04    0.03    0.07
Single point / 12-Bit            0.08   0.09   0.09    0.07    0.05    0.16
Dual point / 12-Bit              0.06   0.06   0.07    0.05    0.04    0.09

Table 5. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             8.9    29.6   54.3    35.4    19.4    115
Dual point / 6-Bit               11.1   17.9   22.7    16.4    10.6    42
Single point / 12-Bit            294    448    430     136     55.1    1419
Dual point / 12-Bit              165    257    149     84.2    33.8    305
Table 6. Mean final fitness

Crossover points / no. of bits   CDM    RI     0.001   0.002   0.005   None
Single point / 6-Bit             0.01   0.01   0.07    0.06    0.05    0.09
Dual point / 6-Bit               0.01   0.01   0.05    0.05    0.04    0.05
Single point / 12-Bit            0.01   0.03   0.1     0.06    0.03    0.17
Dual point / 12-Bit              0.03   0.02   0.05    0.03    0.04    0.11
From these results it can be seen that RI produces worse results than CDM. This is expected, as CDM is designed to encourage diversity while still maintaining population convergence.
The more interesting aspect of the results is the fitness for the two single-optimum functions (Tables 1–2), where CDM is compared with the De-Jong methods. CDM does not generate better results than the normal probabilistic mutation methods. We believe this is because single-optimum problems can be solved by hill-climbing methods, into which De-Jong's method essentially falls, so the normal mutation method already generates acceptable results.
For the multi-optimum functions (Tables 3–4), CDM in general generates better or similar results. In these cases, since CDM is designed to encourage diversity, it is likely to generate better, or at least no worse, results than the other mutation methods.
For the flatter functions (Tables 5–6), CDM's final results are comparable to those of the normal mutation methods. This is expected, since on a flatter surface diversity does not have a significant impact on the outcome.
It is also noted that CDM generally performs better on the simpler cases, i.e., with a single crossover point and with shorter bit lengths. CDM performed relatively worse with dual-point crossover (DPC) because DPC is better than single-point crossover (SPC) at maintaining population diversity. CDM also performed relatively worse with longer individuals, presumably because the search space became too large for the given population size to be effective.
During the computation it was noted that diversity varies significantly with different mutation rates and mutation methods, and also between benchmark functions. However, the most significant effect on diversity came from the search space size, which is a combination of the number of parameters and the parameter resolution. Fig. 1 shows the average population Hamming distances for all tests at 6-bit and 12-bit parameter resolution.
There are marked differences in search space size between the 6-bit and 12-bit parameter resolutions: the average distance between all solutions at the beginning of a 6-bit, 5-dimensional case is 12 bits, whereas it is 25 bits for the 12-bit case.
In all tests the diversity measurement was made after crossover. This means that only the useful diversity created by the mutation method is recorded: when mutation creates individuals with above-average fitness and a larger bit difference, it improves the diversity of the next generation; if mutation does not create fitter individuals that are farther from the other relatively fit individuals, it does not improve the next generation's diversity.
Figure 1 shows the generational diversity summaries for the 6-Bit and 12-Bit tests.
In the figure, CDM refers to Coherent Diversity Maintenance mutation. RI is for Ran-
dom Immigrants mutation and DJ .001 – DJ .005 refer to the De-Jong Mutation with
the stated probability.
It can be seen from the figure that the diversity levels of CDM and RI are similar
for both parameter resolutions. However CDM consistently creates higher population
diversity during the mid generations, whilst tapering off at the final generations. The
lower diversity relative to RI in the final generations is to be expected as CDM only
creates individuals which are coherent with the converging search space whereas RI
disregards the convergence of the search space.
RI and CDM encourage roughly equivalent diversity levels for all tests but pro-
duce significantly different final fitness results, thus one can conclude that there is an
inconsistent link between diversity and superior convergence. As RI encourages new
individuals to come from the entire search space and CDM encourages new indi-
viduals to come from the converging search space one can conclude that specific
diversity gives better results than unspecific diversity. This occurs because as con-
vergence progresses, CDM is able to provide crossover with a more representative
sample of the converging space which enables it to better select the next generation
of individuals.
4 Conclusions
In this paper a new mutation scheme, named Coherent Diversity Maintenance (CDM), has been introduced. This method is designed to encourage diversity in the GA population while maintaining the convergence trend. It is believed that this method explores the converging parameter space more fully and therefore yields better convergence results. Numerical results have shown that CDM tends to generate better, or at least no worse, results for multi-optimum and flatter-optimum problems than the normal probabilistic mutation method and the completely random mutation method.
References
1. Holland, J.H.: Adaptation in Natural and Artificial Systems: An Introductory Analysis with
Applications to Biology, Control and Artificial Intelligence. MIT Press, Cambridge (1992)
2. Chakraborty, G., Hoshi, K.: Rank Based Crossover – A new technique to improve the
speed and quality of convergence in GA. In: Proceedings of the 1999 Congress on Evolu-
tionary Computation, CEC 1999, vol. 2, p. 1602 (1999)
3. Goldberg, D.E.: Genetic Algorithms in search, optimization and machine learning. Addi-
son-Wesley Publishing Company, Reading (1989)
4. Dumitrescu, D., Lazzerini, B., Jain, L.C., Dumitrescu, A.: Evolutionary Computation.
CRC Press, Boca Raton (2000)
5. Radcliff, N.J.: Forma Analysis and Random Respectful Recombination. In: Proceedings of
the fourth International Conference on Genetic Algorithms (1991)
6. Watson, R.A., Pollack, J.B.: Recombination Without Respect: Schema Combination and
Disruption in Genetic Algorithm Crossover. In: Proceedings of the 2000 Genetic and Evo-
lutionary Computation Conference (2000)
7. De Jong, K.A.: Analysis of the Behaviour of a class of Genetic Adaptive Systems. Techni-
cal Report, The University of Michigan (1975)
8. Hoffmann, J.P.: Simultaneous Inductive and Deductive Modeling of Ecological Systems
via Evolutionary Computation and Information Theory. Simulation 82(7), 429–450 (2006)
9. Fogarty, T.C.: Varying the Probability of mutation in the Genetic Algorithm. In: The Pro-
ceedings of the Third International Conference on Genetic Algorithms 1989, pp. 104–109
(1989)
10. Hesser, J., Manner, R.: Towards an optimal mutation probability for Genetic Algorithms.
In: Schwefel, H.-P., Männer, R. (eds.) PPSN 1990. LNCS, vol. 496, pp. 23–32. Springer,
Heidelberg (1991)
11. Back, T.: Optimal Mutation Rates in Genetic Search. In: Proceedings of the Fifth Interna-
tional Conference on Genetic Algorithms, June 1993, pp. 2–8 (1993)
12. Davis, L.: Adapting Operator Probabilities in Genetic Algorithms. In: Proceedings of the
Third International Conference on Genetic Algorithms, pp. 61–69 (1989)
13. Ursem, R.K.: Diversity-Guided Evolutionary Algorithms. In: Guervós, J.J.M., Adamidis,
P.A., Beyer, H.-G., Fernández-Villacañas, J.-L., Schwefel, H.-P. (eds.) PPSN 2002. LNCS,
vol. 2439, pp. 462–471. Springer, Heidelberg (2002)
14. Janikow, C.Z., Michalewicz, Z.: An experimental comparison of binary and floating point
representations in genetic algorithms. In: Belew, R.K., Booker, J.B. (eds.) Proceedings of
the Fourth International Conference Genetic Algorithms, pp. 31–36. Morgan Kaufmann,
San Mateo (1991)
15. Gasieniec, L., Jansson, J., Lingas, A.: Efficient Approximation Algorithms for the Ham-
ming Centre Problem. Journal of Discrete Algorithms 2(2), 289–301 (2004)
16. Kwok, N.M., Ha, Q., Liu, D.K., Fang, G., Tan, K.C.: Efficient Particle Swarm Optimiza-
tion: A Termination Condition Based on the Decision-making Approach. In: 2007 IEEE
Congress on Evolutionary Computation, Singapore (2007)
Dynamic Crossover and Mutation Genetic Algorithm
Based on Expansion Sampling
Abstract. The traditional genetic algorithm easily gets trapped in local optima, and its convergence rate is not satisfactory. This paper therefore proposes an improvement that uses dynamic crossover and mutation rates together with expansion sampling to address these two problems. Expansion sampling means that, when a new generation is created, the new individuals must compete with the old generation, and the better half is selected into the next generation. Several experiments were performed to compare the proposed method with other improvements, and the results show that the proposed method is better than the other improvements in both precision and convergence rate.
1 Introduction
The traditional genetic algorithm uses a fixed crossover-rate operator. Because the crossover rate is the same for all individuals, all individuals of the current generation are retained with the same probability after crossover; the currently better individuals are then selected several times in the next selection step while the relatively poor individuals are eliminated, so the population quickly evolves toward the current best individual. If the current best individual is a local optimum, the entire algorithm easily falls into that local optimum. To avoid this and to increase the diversity of the population, this paper presents a dynamic crossover rate, namely the ratio between the distance of the two chromosomes' fitness values and the distance between the largest and the smallest fitness in the population:
Pc = | f(a) − f(b) | / ( max(f) − min(f) )    (1)
where f(a) is chromosome a's fitness, f(b) is chromosome b's fitness, max(f) and min(f) are the largest and smallest fitness in the population, and Pc is the crossover probability. With this crossover rate, individuals in the middle of the population are retained with a greater chance, while individuals at both ends of the population cross with a greater probability, so that both the better and the poorer individuals are changed; this avoids overly intense competition in the next selection step, and the genes of poor individuals can also contribute to the development of the population.
The principle behind this improvement is a simple uniform-distribution probability argument, in which the crossover probability of a is subject to the uniform distribution of f(b). To illustrate, assume max(f) = 2, min(f) = 0, and f(b) uniformly distributed on the interval [0, 2]. If an individual a1 lies in the middle of the interval, i.e., f(a1) = 1, its expected crossover rate with b is 1/4; if an individual a2 lies at an end of the interval, i.e., f(a2) = 0 or 2, its expected crossover rate with b is 1/2, larger than for individuals in the middle. The distribution is shown in Fig. 1 (x-axis: f(a); y-axis: crossover rate).
In addition, the dynamic crossover probability of this algorithm is essentially an effective crossover probability. We call a crossover between two distant individuals an effective crossover: because their difference is relatively large, the crossover changes them considerably, whereas a crossover between very close chromosomes changes the individuals very little, leaving them almost unchanged, and is therefore ineffective. The dynamic rate thus improves the effect of the crossover operation and effectively avoids inbreeding and premature convergence. Therefore, although the crossover probability in this paper is lower than in other genetic algorithms, the overall effect is better, and the lower crossover rate also reduces the computational complexity and accelerates evolution.
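A minimal sketch of the dynamic crossover rate of equation (1) is given below; the fitness values are assumed to be available as a list, and the function name is illustrative only.

def dynamic_crossover_rate(f_a, f_b, fitness_values):
    # Equation (1): pairs with very different fitness cross with higher probability
    f_max, f_min = max(fitness_values), min(fitness_values)
    if f_max == f_min:
        return 0.0  # degenerate population with identical fitness values
    return abs(f_a - f_b) / (f_max - f_min)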
pm = k · ( f(a) / max(f) )^2 ,   0 < k < 1    (2)
where f(a) is the fitness of individual a, max(f) is the largest fitness, Pm is the mutation probability, and k is a parameter valid in the interval (0, 1). This mutation rate gives the better individuals the larger mutation probability in order to prevent them from rapidly occupying the entire population, leading the population to evolve in the direction of diversification. The distribution of the mutation rate is shown in Fig. 2 (x-axis: f(a); y-axis: mutation rate).
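Analogously, the dynamic mutation rate of equation (2) can be written as the short sketch below, assuming nonnegative fitness values; k is the tuning parameter in (0, 1).

def dynamic_mutation_rate(f_a, fitness_values, k=0.5):
    # Equation (2): fitter individuals receive the larger mutation probability
    return k * (f_a / max(fitness_values)) ** 2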
At the same time, this paper uses expansive optimal sampling: the new individuals are pooled with the previous generation and the best half is chosen for the next generation. The traditional genetic algorithm puts the new individuals directly into the next generation, so older, better individuals that have been crossed or mutated have no chance of reaching future generations, which slows down the convergence of the population. Expansive optimal sampling lets those better individuals enter the next generation and thus greatly accelerates convergence.
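The expansive optimal sampling step can be sketched as follows: the offspring are pooled with the previous generation and the better half is carried forward. NumPy is used only for convenience, maximization of fitness is assumed, and the names are illustrative.

import numpy as np

def expansion_sampling(parents, offspring, fitness_fn):
    # Pool old and new individuals and keep the best half (the old population size)
    pool = np.concatenate([parents, offspring], axis=0)
    scores = np.array([fitness_fn(ind) for ind in pool])
    best = np.argsort(scores)[::-1][:len(parents)]
    return pool[best]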
The traditional genetic algorithm uses a fixed mutation rate. As a result, once the algorithm gets trapped in a local optimum it is hard to jump out, because the mutation probability is relatively small; and even when an individual does jump out, it is often useless because it is not selected at the next selection step. To avoid this, this paper combines the dynamic mutation rate with expansive optimal sampling. The dynamic mutation rate makes it easier for the algorithm to jump out of a local optimum, and expansive optimal sampling ensures that such mutated, excellent individuals are selected into future generations, thereby greatly reducing the probability of getting stuck in a local optimum. Given enough evolution, in theory, the genetic algorithm will not remain in a local optimum.
4 Algorithm Processes
The procedures of the proposed method are as follows:
1) Initialize the control parameters: set the population size N, the evolution genera-
tion g;
2) g = 0, randomly generate a population of N individuals Po = (x1, x2, ...);
3) Determine the fitness function, calculate the value of individual fitness f(xi), (i =
1,2, ..., N), to evaluate the individuals in population;
4) Determine the selection strategy. This paper mainly uses roulette-wheel selection to select the advantageous individuals from the individuals Pg of generation g to form the mating pool;
5) Determine the dynamic crossover operator. Determine the crossover rate of each pair of individuals through Pc = | f(a) − f(b) | / (max(f) − min(f)), and then perform the crossover operation to produce the intermediate results;
6) Determine the dynamic mutation operator. Use Pm = k · ( f(a) / max(f) )^2
5 Simulation Experiments
5.1 Experiment 1
In this paper, the first experiment uses the following typical complex optimization function:
y = x · sin(10πx) + 2 ,   x ∈ [−1, 2]    (3)
We compare the improved mutation operator genetic algorithm (IMGA) of [7] and the fuzzy adaptive genetic algorithm (FAGA) of [2] with this paper's dynamic crossover and mutation genetic algorithm based on expansion sampling (ESDGA), simulated in Matlab 7. The parameters of the genetic algorithms were varied in the experiment in order to observe the performance of the algorithms better. In the experiment, N is the population size and Pc is the crossover probability. In IMGA the gene discerption proportion is L = 0.6 and the other parameters are a1 = 0.01, b1 = 0.01, a2 = 0.01, b2 = 0.05, α = 0.03, g0 = 40. The number of generations is 500, and each parameter setting is repeated 50 times. Min and Max are the minimum and maximum optimal solution over the 50 runs, Avg is the average, Gen is the number of generations to convergence, and Local is the number of runs that converged to a local optimum. The results are shown in Tables 1, 2 and 3 (Pc is not used in FAGA and ESDGA; k is used only in ESDGA).
From the tables it is clear that IMGA needs about 130 iterations to converge, with on average about 12% of runs ending in a local optimum. FAGA needs on average about 20 iterations to converge, with 10% local optima. ESDGA needs only about 12 iterations to converge, and there is no local optimum among the total of 200 results, better than the two other algorithms. We also point out that many experiments were run to select the k value for the proposed algorithm; the tables list only representative values. As the results show, the algorithm performs better when k is above 0.3.
Speed and accuracy are conflicting goals in genetic algorithms. If the algorithm must be fast and evolves for only a few generations, the accuracy of the results will not be satisfactory, and the run will most likely not converge or will converge to a local optimum. If the algorithm evolves indefinitely, it must finally converge to the global optimum, provided one exists. In this paper we use the latter setting to compare the performance of the proposed algorithm for different k values in Experiment 1: the algorithm evolves until the population converges to the global optimum. We set the population size N = 100 and choose different k values in the valid interval (0, 1). The result is shown in Fig. 3 (x-axis: k; y-axis: generations to convergence).
From the figure we can clearly see that the larger k is, the better the performance of the algorithm; 0.3 is a dividing line, and there is no large difference above 0.6. This result is consistent with the theory discussed earlier.
5.2 Experiment 2
In this function a = 3.0 and b = 0.05; the maximum value of the function is 3600 at x = 0, y = 0. There are four local maximum points, (−5.12, 5.12), (−5.12, −5.12), (5.12, 5.12) and (5.12, −5.12), with function value 2748.78.
This is a typical GA-deceptive problem. We again compare the improved mutation operator genetic algorithm (IMGA) of [7] and the fuzzy adaptive genetic algorithm (FAGA) of [2] with this paper's dynamic crossover and mutation genetic algorithm based on expansion sampling (ESDGA) in Matlab 7. The number of generations is again 500, and each parameter setting is repeated 50 times. Min and Max are the minimum and maximum optimal solution over the 50 runs, Avg is the average, Gen is the number of generations to convergence, and Local is the number of runs that converged to a local optimum. The results are shown in Tables 4, 5 and 6 (Pc is not used in FAGA and ESDGA; k is used only in ESDGA).
As can be seen from the tables, IMGA and FAGA produce a few local optima; FAGA needs about 50 iterations to converge and IMGA about 270. ESDGA produces no local optima, and its convergence is also very fast, about 20 generations on average, better than IMGA and FAGA.
We then compare the performance of the proposed algorithm for different k values in Experiment 2 in the same way as in Experiment 1, again with population size N = 100 and different k values in the valid interval (0, 1). The result is shown in Fig. 4 (x-axis: k; y-axis: generations to convergence).
The conclusion is similar to Experiment 1, with a few differences: the larger k is, the better the performance of the algorithm.
6 Conclusion
The simple genetic algorithm has the defects of slow convergence and easily getting trapped in local optima, which limits its application. To address these two major defects, this paper has proposed an improved method. As the experimental results show, this paper's dynamic crossover and mutation genetic algorithm
References
1. Kalyanmoy, D., Karthik, S., Tatsuya, O.: Self-adaptive simulated binary crossover for real-
parameter optimization. In: Genetic and Evolutionary Computation Conference, pp. 1187–
1194 (2007)
2. Huang, Y.P., Chang, Y.T., Sandnes, F.E.: Using Fuzzy Adaptive Genetic Algorithm for
Function Optimization. In: Annual meeting of the North American Fuzzy Information Proc-
essing Society, June 3-6, pp. 484–489. IEEE, Los Alamitos (2006)
3. Zhong, W.C., Liu, J., Xue, M.Z., Jiao, L.C.: A Multi-agent Genetic Algorithm for Global
Numerical Optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part
B 34(2), 1128–1141 (2004)
4. Deb, K., Agrawal, S., Pratap, A., Meyarivan, T.: A fast and elitist multi-objective genetic
algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2), 182–197
(2002)
5. Xiang, Z.Y., Liu, Z.C.: Genetic algorithm based on fully adaptive strategy. Journal of Cen-
tral South Forestry University 27(5), 136–139 (2007)
6. Li, Y.Y., Jiao, L.C.: Quantum clone genetic algorithm. Computer Science 34(11), 147–149
(2007)
7. Li, L.M., Wen, G.R., Wang, S.C., Liu, H.M.: Independent component analysis algorithm
based on improved genetic algorithm. Journal of System Simulation 20(21), 5911–5916
(2008)
Multidisciplinary Optimization of Airborne Radome
Using Genetic Algorithm
1 Introduction
Airborne radomes are often used to protect antennas from a variety of environmental and aerodynamic effects. Designing a high-performance airborne radome is a challenging task because aerodynamic, electromagnetic, structural and other requirements are generally involved, and the performance of the radome in the different disciplines is usually conflicting. A thin, lightweight radome with excellent electromagnetic transmission is apparently structurally unreliable, while a radome structure designed for high stiffness and stability will inevitably have poor electromagnetic performance. Radome design is hence a multidisciplinary procedure in which the analyses of the different disciplines are tightly coupled. The traditional engineering design approach, however, separates the design procedure into sequential stages, so a design failure at a certain stage forces the whole design to start from scratch, resulting in a tremendous cost in design cycle time and resources.
The great improvement of the optimization theory and algorithms in the past several
decades has provided an efficient solution for radome design. The optimization procedure
can simultaneously take into account the radome characteristics in different disciplines and
searches for an optimal design with all design requirements to be well-balanced.
In earlier research, techniques for simulating the electromagnetic characteristics of radomes were well developed. They can be classified into two kinds: (1) high-frequency methods, such as the Ray Tracing (RT) technique [1] based on Geometric Optics (GO), Aperture Integration-Surface Integration (AI-SI) [2] and Plane Wave Spectrum-Surface Integration (PWS-SI) [3] based on Physical Optics (PO); and (2) low-frequency methods, such as the Finite Element Method (FEM) [4] and the Method of Moments (MoM) [5]. Generally, the high-frequency methods are more suitable for electrically large problems because of their high computational efficiency, while the low-frequency methods provide more accurate analysis at much higher computational cost. Since optimization relies on an iterative strategy with a large computing cost, the high-frequency methods are preferable to the low-frequency methods for the electromagnetic analysis in radome optimization.
Even though research on radome optimization was pioneered early with a simulated
annealing technique [6], it is only recently that more advanced optimization algorithms
have been applied and have promoted the development of radome design. For example,
particle swarm optimization and the genetic algorithm have been applied to radome
optimization. The layer thicknesses of the sandwich radome wall and the shape of the
radome have been optimized to maximize the overall transmission coefficient over the
entire bandwidth [7] or to minimize the boresight error [8]. A multi-objective optimization
procedure was further proposed to optimize the boresight error and the power transmittance
simultaneously using the genetic algorithm and the RT method [9]. However, these studies
on radome optimization are limited to the electromagnetic characteristics. Recently, the
Multidisciplinary Radome Optimization System (MROS) [10] was proposed as a synthesis
procedure in which material selection, structural analysis, probabilistic fracture analysis
and electromagnetic analysis are incorporated to perform the multidisciplinary optimization.
This work indicates a new trend in radome optimization that will be more applicable to
practical engineering designs.
In this paper, the structural and electromagnetic characteristics are considered
simultaneously in the radome design. A multidisciplinary optimization procedure is
developed based on the finite element model of the radome. Structural analysis and
electromagnetic analysis are carried out to obtain the structural failure indexes and
the overall transmission coefficient, respectively. The genetic algorithm is employed
for the optimization.
Normally, the Finite Element Method (FEM) and the Physical Optics (PO) method are
the preferred approaches for the structural analysis and the electromagnetic analysis
of a radome, respectively. However, in the traditional design scheme illustrated in
Fig. 1, these analyses, as well as the modeling and postprocessing procedures, are
implemented separately. There is no collaboration or communication between the
different analysis procedures, which can no longer match the requirement of a rapid
design cycle for modern products.
[Fig. 1. Traditional design scheme: the finite element model, the finite element (structural) analysis, the physical optics (electromagnetic) analysis and the postprocessing of the analysis results are carried out separately.]
Because of the requirements of a low dielectric constant and high mechanical strength,
the airborne radome is usually manufactured from a sandwich laminate. The choice of
materials and wall thicknesses has a significant influence on the structural strength and
the electrical performance. The idea in this paper is to find the proper configuration of
these parameters to satisfy the design requirements. The validity of a design can be
verified with the structural analysis and the electromagnetic analysis of the radome,
respectively.
The basic issue in radome structural analysis is the material definition. This is difficult
because multilayer thin-walled configurations, including the A sandwich, B sandwich
and C sandwich, are often used, as illustrated in Fig. 3. Taking the A sandwich as an
example, it consists of three layers: two dense, high-strength skins separated by a
lower-density, lower-dielectric core material such as foam or honeycomb. This
configuration provides much higher strength for a given weight than a monolithic wall
of homogeneous dielectric material. The construction of the other two wall
configurations follows accordingly. In our demonstrative application, we use the
A sandwich, which is the most frequently used wall configuration.
Another important aspect of the radome structural analysis is the structural loads and
constraints. For common airborne radomes (non-high-speed radomes), the aerodynamic
load due to airflow is the main cause of mechanical stress. The radome is tightly
attached to the airframe with special bolts.
The radome structural analysis can be performed in Patran/Nastran. The A-sandwich
configuration is modeled in the Patran Laminate Modeler by treating each sandwich
layer as an orthotropic layer with equivalent material properties. The structural validity
of the radome can be assessed by the failure analysis of the composite materials.
Frequently used failure criteria include the Hill, Hoffman, Tsai-Wu, maximum stress
and maximum strain criteria.
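The paper's failure analysis is performed in Patran/Nastran; purely as an illustration of what evaluating one of the listed criteria involves, the following sketch computes the Tsai-Wu failure index for a single orthotropic ply under plane stress. All strength values and stresses below are hypothetical, not taken from the paper.

```python
import math

def tsai_wu_index(s1, s2, t12, Xt, Xc, Yt, Yc, S):
    """Tsai-Wu failure index for an orthotropic ply under plane stress.

    s1, s2, t12 : in-plane stresses in the material axes (MPa)
    Xt, Xc      : longitudinal tensile / compressive strengths (MPa, positive)
    Yt, Yc      : transverse tensile / compressive strengths (MPa, positive)
    S           : in-plane shear strength (MPa)
    An index >= 1 indicates predicted ply failure.
    """
    F1 = 1.0 / Xt - 1.0 / Xc
    F2 = 1.0 / Yt - 1.0 / Yc
    F11 = 1.0 / (Xt * Xc)
    F22 = 1.0 / (Yt * Yc)
    F66 = 1.0 / S**2
    F12 = -0.5 * math.sqrt(F11 * F22)   # common estimate of the interaction term
    return (F1 * s1 + F2 * s2
            + F11 * s1**2 + F22 * s2**2 + F66 * t12**2
            + 2.0 * F12 * s1 * s2)

# Hypothetical skin-ply stresses and strengths (illustrative only).
print(tsai_wu_index(s1=180.0, s2=20.0, t12=35.0,
                    Xt=600.0, Xc=570.0, Yt=40.0, Yc=120.0, S=70.0))
```

In a full finite element model this index would be evaluated per layer and per element, which is exactly the role of the failure indexes f_{i,j} used later in the optimization constraints.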
Based on the equivalence principle, the PWS-SI method can be divided into three steps:
(1) a transmitting antenna is considered and the fields transmitted to the radome interior
surface S1 are computed by the Plane Wave Spectrum method; (2) transmission through
the radome wall is then calculated by the transmission line analogy to obtain the fields
over the radome outer surface S2; (3) radiation to the far field is then determined by the
surface integration of the equivalent currents obtained from the tangential field
components over the outer surface S2.
The antenna-radome system model is illustrated in Fig.4. According to the planar
slab approximation technique, the radome wall is treated as being locally planar and
modeled as an infinite planar slab with complex transmission coefficients, lying in the
tangent plane at the incident points of the radiation rays.
[Fig. 4. Antenna-radome system model: the incident fields (Ei, Hi), reflected fields (Er, Hr) and transmitted fields (Et, Ht), together with the equivalent currents J and M, over the inner surface S1 and the outer surface S2.]
According to the PWS technique, the transmitted electric field at the near-field point
P can be calculated by

E_t(P) = \frac{1}{2\pi} \iint A(k_x, k_y)\, e^{-j\,\mathbf{k}_0 \cdot \mathbf{r}}\, dk_x\, dk_y    (1)

where (x, y, z) are the coordinates of P in the antenna coordinate system,
\mathbf{k}_0 = \hat{i}k_x + \hat{j}k_y + \hat{k}k_z is the wave vector and
\mathbf{r} = \hat{i}x + \hat{j}y + \hat{k}z is the position vector of the point P.
A(k_x, k_y) is the so-called plane wave spectrum, i.e., the Fourier transform of the
antenna aperture field, which can be formulated as

A(k_x, k_y) = \frac{1}{2\pi} \iint_{S} E(\xi, \eta, 0)\, e^{j(k_x \xi + k_y \eta)}\, d\xi\, d\eta    (2)

If the antenna aperture is circularly symmetric, the surface integral in (2) reduces to a
line integral, so the computational complexity is greatly reduced.
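As a minimal numerical sketch of Eq. (2), assume the aperture field E(x, y, 0) is sampled on a regular grid; the plane wave spectrum is then just a scaled two-dimensional discrete Fourier transform. The grid size, sample spacing and cosine aperture taper below are illustrative choices, not values from the paper.

```python
import numpy as np

# Hypothetical sampling of a circular aperture field E(x, y, 0) (illustrative only).
N, d = 256, 0.005                      # samples per side, sample spacing in metres
x = (np.arange(N) - N / 2) * d
X, Y = np.meshgrid(x, x, indexing="ij")
a = 0.3                                # assumed aperture radius (m)
E_ap = np.where(X**2 + Y**2 <= a**2,
                np.cos(np.pi * np.hypot(X, Y) / (2 * a)), 0.0)

# A(kx, ky) = (1/2pi) * integral of E(x,y,0) * exp(+j(kx x + ky y)) dx dy.
# The +j kernel of Eq. (2) corresponds to numpy's inverse DFT kernel.
A = (d * d / (2 * np.pi)) * np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(E_ap)) * N * N)

kx = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N, d))  # spectral grid (rad/m)
print(A.shape, kx.min(), kx.max())
```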
The transmission coefficients for perpendicular polarization, T_\perp(\theta_P), and
parallel polarization, T_\parallel(\theta_P), of the multilayered radome wall can be
determined by the transmission line analogy [11]. Thus the fields over the radome outer
surface can be formulated as

\mathbf{E}^{t}_{M} = (\mathbf{E}^{i}_{M} \cdot \hat{\mathbf{n}}_{BM})\, T_\perp(\theta_P)\, \hat{\mathbf{n}}_{BM} + (\mathbf{E}^{i}_{M} \cdot \hat{\mathbf{t}}_{BM})\, T_\parallel(\theta_P)\, \hat{\mathbf{t}}_{BM}
\mathbf{H}^{t}_{M} = (\mathbf{H}^{i}_{M} \cdot \hat{\mathbf{n}}_{BM})\, T_\parallel(\theta_P)\, \hat{\mathbf{n}}_{BM} + (\mathbf{H}^{i}_{M} \cdot \hat{\mathbf{t}}_{BM})\, T_\perp(\theta_P)\, \hat{\mathbf{t}}_{BM}    (3)

where

\hat{\mathbf{n}}_{BM} = \hat{\mathbf{P}}_{M} \times \hat{\mathbf{n}}_{RM}, \quad
\hat{\mathbf{t}}_{BM} = \hat{\mathbf{n}}_{BM} \times \hat{\mathbf{n}}_{RM}, \quad
\hat{\mathbf{P}}_{M} = \frac{\operatorname{Re}(\mathbf{E} \times \mathbf{H}^{*})}{\left|\operatorname{Re}(\mathbf{E} \times \mathbf{H}^{*})\right|}
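The transmission-line analogy of [11] is not detailed in this excerpt; the sketch below shows one common way to evaluate T_\perp and T_\parallel for a locally planar multilayer wall by cascading per-layer ABCD (chain) matrices between matched air half-spaces. The layer permittivities, loss tangents and thicknesses are assumed values for an A-sandwich, not taken from the paper.

```python
import numpy as np

def wall_transmission(layers, theta_deg, freq_hz, pol="perp"):
    """Transmission coefficient of a planar multilayer wall (transmission-line analogy).

    layers : list of (complex relative permittivity, thickness in metres) per layer
    pol    : "perp" for perpendicular (TE) or "par" for parallel (TM) polarization
    Returns the complex transmission coefficient into the air half-space behind the wall.
    """
    c0, eta0 = 299_792_458.0, 376.730313668
    k0 = 2 * np.pi * freq_hz / c0
    s = np.sin(np.deg2rad(theta_deg))
    cos0 = np.cos(np.deg2rad(theta_deg))
    z0 = eta0 / cos0 if pol == "perp" else eta0 * cos0   # air wave impedance

    M = np.eye(2, dtype=complex)
    for eps_r, d in layers:
        cos_t = np.sqrt(eps_r - s**2) / np.sqrt(eps_r)   # cosine of the refraction angle
        kz = k0 * np.sqrt(eps_r - s**2)                  # normal wavenumber in the layer
        eta = eta0 / np.sqrt(eps_r)
        z = eta / cos_t if pol == "perp" else eta * cos_t
        M = M @ np.array([[np.cos(kz * d), 1j * z * np.sin(kz * d)],
                          [1j * np.sin(kz * d) / z, np.cos(kz * d)]])
    A, B, C, D = M.ravel()
    return 2.0 / (A + B / z0 + C * z0 + D)               # matched air on both sides

# Hypothetical A-sandwich: skin / foam core / skin (illustrative values only).
wall = [(4.0 - 0.06j, 0.0008), (1.1 - 0.002j, 0.008), (4.0 - 0.06j, 0.0008)]
print(abs(wall_transmission(wall, theta_deg=45.0, freq_hz=10e9, pol="perp"))**2)
```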
Thus, the electric far field can be calculated with the surface integration technique as

\mathbf{E}(\hat{\mathbf{R}}_1) = \frac{jk}{4\pi} \iint_{S_2} \left[ \hat{\mathbf{R}}_1 \times \mathbf{M}_t + \eta_0\, \hat{\mathbf{R}}_1 \times (\hat{\mathbf{R}}_1 \times \mathbf{J}_t) \right] e^{jk\,\mathbf{r}' \cdot \hat{\mathbf{R}}_1}\, dS_2    (4)

with the equivalent currents \mathbf{J}_t = \hat{\mathbf{n}} \times \mathbf{H}^{t}_{M} and \mathbf{M}_t = \mathbf{E}^{t}_{M} \times \hat{\mathbf{n}},
where \hat{\mathbf{R}}_1 is the unit vector of the observation direction, \mathbf{r}' is the
position vector of the incidence point on the outer surface S2, and \hat{\mathbf{n}} is the
normal vector of the outer surface S2 at the incidence point.
With the far-field distribution of the antenna-radome system, the electromagnetic
performance of the radome, such as the transmission ratio and the boresight error, can
be evaluated.

The optimization problem is to maximize the overall transmission coefficient with
respect to the layer thicknesses, subject to structural failure constraints and thickness
bounds:

maximize    Tran
subject to  f_{i,j} \le 1
            t_{i,\min} \le t_i \le t_{i,\max}

where t_i denotes the thickness of the ith layer, Tran denotes the overall transmission
coefficient, and f_{i,j} is the failure index of the ith layer of the jth element.
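A hedged sketch of the optimization loop described above: the layer thicknesses are evolved by a simple real-coded genetic algorithm, and the fitness combines the overall transmission with a penalty on the worst failure index. The two analysis functions are placeholders standing in for the Patran/Nastran and PWS-SI computations, which are not reproduced here; the GA operators (blend crossover, Gaussian mutation, truncation selection) are illustrative choices, not necessarily those used by the authors.

```python
import random

N_LAYERS, POP, GENS = 7, 40, 50
LOWER = [0.1, 0.1, 0.1, 1.0, 0.1, 0.1, 0.1]     # bounds as in the design-variable table
UPPER = [0.5, 0.5, 0.5, 10.0, 0.5, 0.5, 0.5]

def transmission(t):        # placeholder for the PWS-SI electromagnetic analysis
    return 1.0 - 0.02 * sum(t)

def max_failure_index(t):   # placeholder for the Patran/Nastran failure analysis
    return 1.5 / sum(t)

def fitness(t):
    """Maximize the overall transmission, penalizing any failure index above 1."""
    penalty = max(0.0, max_failure_index(t) - 1.0)
    return transmission(t) - 10.0 * penalty

def random_individual():
    return [random.uniform(lo, hi) for lo, hi in zip(LOWER, UPPER)]

def crossover(a, b):        # arithmetic (blend) crossover
    w = random.random()
    return [w * x + (1 - w) * y for x, y in zip(a, b)]

def mutate(t, rate=0.1):
    return [min(hi, max(lo, x + random.gauss(0, 0.05 * (hi - lo))))
            if random.random() < rate else x
            for x, lo, hi in zip(t, LOWER, UPPER)]

pop = [random_individual() for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 4]                      # truncation selection with elitism
    children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                for _ in range(POP - len(elite))]
    pop = elite + children

best = max(pop, key=fitness)
print([round(x, 2) for x in best], round(fitness(best), 3))
```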
4 Numerical Results
Design variable   t1     t2     t3     t4     t5     t6     t7
Lower limit       0.1    0.1    0.1    1      0.1    0.1    0.1
Upper limit       0.5    0.5    0.5    10     0.5    0.5    0.5
Initial value     0.2    0.2    0.2    8.8    0.2    0.2    0.2
Optimal value     0.1    0.1    0.1    9.96   0.1    0.1    0.1
[Figure: far-field E-field pattern (dB) with and without the radome.]
5 Conclusion
This paper proposes a multidisciplinary optimization scheme for airborne radome
design. The design procedure considers the structural and electromagnetic performance
of the radome simultaneously. Genetic Algorithm is employed for the optimization to
maximize the overall transmission coefficient under structural failure constraints on
the radome material. The optimization scheme is successfully validated by the design
optimization of a paraboloidal radome.
This work reflects a new trend in radome optimization that is more efficient and
applicable to practical radome design engineering. Even though the results of the
demonstration are preliminary, the proposed optimization scheme and solution approach
can easily be extended to more complicated applications.
References
1. Tricoles, G.: Radiation Patterns and Boresight Error of a Microwave Antenna Enclosed in
an Axially Symmetric Dielectric Shell. J. Opt. Soc. Am. 54, 1094–1101 (1964)
2. Paris, D.T.: Computer-Aided Radome Analysis. IEEE Trans. on AP 18, 7–15 (1970)
3. Wu, D.C.F., Rudduck, R.C.: Wave Spectrum Surface Integration Technique for Radome
Analysis. IEEE Trans. on AP 22, 497–500 (1974)
4. Povinelli, M.J., Angelo, J.D.: Finite Element Analysis of Large Wavelength Antenna Ra-
dome Problems for Leading Edge and Radar Phased Arrays. IEEE Trans. on Magnetics 27,
4299–4302 (1991)
5. Chang, D.C.: A Comparison of Computed and Measured Transmission Data for the
AGM-88 Harm Radome. Master’s thesis, Naval Postgraduate School, Monterey, California,
AD-A274868 (1993)
6. Chan, K.K., Chang, P.R., Hsu, F.: Radome design by simulated annealing technique. In:
IEEE International Symposium of Antennas and Propagation Society, pp. 1401–1404.
IEEE Press, New York (1992)
7. Chiba, H., Inasawa, Y., Miyashita, H., Konishi, Y.: Optimal radome design with particle
swarm optimization. In: IEEE International Symposium of Antennas and Propagation So-
ciety, pp. 1–4. IEEE Press, New York (2008)
8. Meng, H., Dou, W., Yin, K.: Optimization of Radome Boresight Error Using Genetic Al-
gorithm. In: 2008 China-Japan Joint Microwave Conference, pp. 27–30. IEEE Press, New
York (2008)
9. Meng, H., Dou, W.: Multi-objective optimization of radome performance with the struc-
ture of local uniform thickness. IEICE Electronics Express 5, 882–887 (2008)
10. Baker, M.L., Roughen, K.M.: Structural Optimization with Probabilistic Fracture Con-
straints in the Multidisciplinary Radome Optimization System (MROS). In: 48th
AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Confer-
ence, Honolulu, Hawaii (2007); AIAA-2007-2311
11. Ishimaru, A.: Electromagnetic wave propagation, radiation, and scattering. Prentice Hall,
Englewood Cliffs (1991)
12. Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press,
Michigan (1975)
13. Riolo, R., Soule, T., Worzel, B.: Genetic programming theory and practice IV. Springer,
New York (2007)
Global Similarity and Local Variance in Human Gene
Coexpression Networks
1 Introduction
In the last few years, gene coexpression networks have attracted the attention of many
researchers [1,2]. In previous studies, these networks are modeled as a graph in which
each node represents a gene and each edge represents a statistically strong relationship
between two genes. An interaction between two genes in a gene network does not
necessarily imply a physical interaction [4]. For the study presented here, we performed
a comparative analysis of whole-genome gene expression variation in 210 unrelated
HapMap individuals [7] to assess the extent of expression divergence between four
human populations and to explore the connection between the variation of gene
expression and gene function. Gene coexpression networks can be constructed from
GeneChip expression profile data in NCBI GEO (the Gene Expression Omnibus
repository, a database of gene expression data) [3].
Expression profiles of human gene pairs were compared in order to evaluate the
divergence of human gene expression patterns. A total of 47,294 human transcripts per
population were considered. All-against-all comparisons of the gene expression profiles
in the population matrices (47,294 × 60 for CEU, 47,294 × 45 for CHB, 47,294 × 45 for
JPT, and 47,294 × 60 for YRI) were used to generate population-specific coexpression
networks. In these coexpression networks, nodes correspond to genes, and an edge links
two genes from the same population if their expression profiles are considered
sufficiently similar.
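A minimal sketch of this construction step: expression profiles are compared all-against-all with Pearson correlation, and an edge is created when the absolute correlation exceeds a threshold. The correlation measure, the threshold and the toy expression matrix below are illustrative assumptions; the paper's actual similarity measure and cutoff are not specified in this excerpt.

```python
import numpy as np
import networkx as nx

def coexpression_network(expr, gene_ids, threshold=0.9):
    """Build a coexpression graph from an (n_genes x n_samples) expression matrix."""
    corr = np.corrcoef(expr)                       # all-against-all Pearson correlation
    g = nx.Graph()
    g.add_nodes_from(gene_ids)
    n = len(gene_ids)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                g.add_edge(gene_ids[i], gene_ids[j])
    return g

# Toy example: 200 genes x 60 samples (a CEU-sized panel), random data.
rng = np.random.default_rng(0)
expr = rng.normal(size=(200, 60))
genes = [f"g{i}" for i in range(200)]
net = coexpression_network(expr, genes, threshold=0.5)
print(net.number_of_nodes(), net.number_of_edges(),
      nx.average_clustering(net),                       # clustering coefficient
      2 * net.number_of_edges() / net.number_of_nodes())  # average degree
```

The global parameters reported in the table below (clustering coefficient, diameter, centralization, average degree) can be computed from such graphs in the same way.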
Parameter                 CEU      CHB      JPT      YRI
Clustering coefficient    0.350    0.272    0.304    0.320
Network diameter          16       19       24       26
Network centralization    0.047    0.032    0.066    0.042
Average degree            4.576    10.721   16.46    13.781
As described above, the human population gene coexpression networks are closely
similar in terms of their global topological characteristics: they share similar node
degree (k) distributions and C(k) distributions, as well as similar average node degrees
(<k>), clustering coefficients (<C>) and path lengths (<l>). Other neighborhood-related
parameters, such as network density, network centralization and network heterogeneity,
are also closely similar.
We further sought to evaluate the similarity between the population-specific coexpression
networks at a local level. There is as yet no general method for assessing local network
similarity (or graph isomorphism). However, for the human population gene coexpression
networks generated here, the use of orthologous gene pairs results in a one-to-one mapping
between the nodes of the networks. In this sense, the networks can be considered to be
defined over the same set of nodes N, and thus can be directly compared by generating an
intersection network. The human population intersection network is defined as the network
over the node set N in which there is a link between two nodes i and j if i and j denote a
pair of genes that are connected in every human population network. Thus, the intersection
network captures the coexpressed gene pairs conserved between the four human populations.
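Because all four population networks are defined over the same node set, the intersection network can be obtained by keeping only the edges present in every network. A short sketch using networkx follows; the random graphs are toy stand-ins for the CEU, CHB, JPT and YRI networks.

```python
import networkx as nx

def intersection_network(networks):
    """Keep only the edges that are present in every population-specific network."""
    common_nodes = set.intersection(*(set(g.nodes) for g in networks))
    edges = set(frozenset(e) for e in networks[0].edges)
    for g in networks[1:]:
        edges &= set(frozenset(e) for e in g.edges)
    inter = nx.Graph()
    inter.add_nodes_from(common_nodes)
    inter.add_edges_from(tuple(e) for e in edges if len(e) == 2)
    return inter

# Toy stand-ins for the CEU, CHB, JPT and YRI networks.
pops = [nx.gnm_random_graph(100, 400, seed=s) for s in range(4)]
inter = intersection_network(pops)
print(inter.number_of_nodes(), inter.number_of_edges())
```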
The global characteristics of the intersection network are shown in Table 2. The in-
tersection network node degree and C(k) distributions are clearly similar to those of
the population-specific networks as are the average clustering coefficient (<C> =
0.213) and average path length (<l> = 3.04). Network diameter equals 10. The net-
work diameter and the average shortest path length, also known as the characteristic
path length, indicate small-world properties of the analyzed network. Taken together,
these findings indicate that the global structure of the population-specific coexpres-
sion networks is preserved in the intersection network. However, the most striking
feature of the intersection network is the small fraction of genes (~20%) and edges
(~4–16%) that are conserved between the population networks (Table 3). Accordingly,
the average node degree is lower (<k> = 7.518) in the intersection network than in each
of the population-specific networks.
Table 3. Sizes of the intersection network and of the population-specific networks
(percentages are the fractions of each population network retained in the intersection).

                        Nodes          Edges
Intersection network    713            2680
CEU                     5546 (13%)     72073 (4%)
CHB                     3180 (22%)     17047 (16%)
JPT                     3572 (20%)     29398 (9%)
YRI                     3061 (23%)     21092 (13%)
Table 2. Global characteristics of the intersection network and of a comparable random network.

Parameter                    Intersection network    Random network
Clustering coefficient       0.213                   0.001
Network diameter             10                      10
Network centralization       0.061                   -
Average degree               7.518                   0.0
Number of nodes              713                     713
Number of edges              2680                    2680
Network density              0.011                   -
Network heterogeneity        1.592                   -
Characteristic path length   3.040                   0.005
Genes in the networks were functionally categorized using their Gene Ontology (GO)
biological process annotation terms. Overrepresented GO terms were identified with
BiNGO [14] by comparing the relative frequencies of GO terms in specific clusters with
the frequencies of randomly selected GO terms. The hypergeometric test was used for
this purpose, with the Benjamini-Hochberg false discovery rate correction for multiple
testing and a P-value threshold of 0.001. Pairwise similarities between gene GO terms
were measured using the semantic similarity method, which computes the relative
distance between any two terms along the GO graph. The result is shown in Table 4.
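The overrepresentation test described above can be sketched as follows: for each GO term, a hypergeometric p-value compares the term's frequency among the cluster genes with its frequency in the whole gene set, and the p-values are then corrected with the Benjamini-Hochberg step-up procedure. This is only an illustration of the statistics that BiNGO applies internally; the term counts below are hypothetical.

```python
from scipy.stats import hypergeom

def go_enrichment(term_counts, cluster_size, background_size, alpha=0.001):
    """term_counts: {GO term: (hits in cluster, hits in background)}.
    Returns terms surviving a Benjamini-Hochberg FDR correction at `alpha`."""
    terms, pvals = [], []
    for term, (k, K) in term_counts.items():
        # P(X >= k) with X ~ Hypergeometric(background_size, K annotated, cluster_size drawn)
        pvals.append(hypergeom.sf(k - 1, background_size, K, cluster_size))
        terms.append(term)
    # Benjamini-Hochberg step-up: reject all hypotheses up to the largest rank
    # whose p-value is below alpha * rank / m.
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m, cutoff_rank = len(pvals), 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            cutoff_rank = rank
    return [(terms[i], pvals[i]) for i in order[:cutoff_rank]]

# Hypothetical counts: (genes in cluster with the term, genes in background with the term).
counts = {"biopolymer metabolic process": (60, 150),
          "macromolecule metabolic process": (80, 300),
          "signal transduction": (5, 200)}
print(go_enrichment(counts, cluster_size=150, background_size=713))
```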
The graph (Figure 1) visualizes the GO categories found to be significantly
over-represented, shown in the context of the GO hierarchy. The size (area) of a node is
proportional to the number of genes in the test set annotated to that node. The color of a
node represents the (corrected) p-value: white nodes are not significantly
over-represented, while the other nodes are, with a color scale ranging from yellow
(p-value equal to the significance level, e.g. 0.001) to dark orange (p-value five orders of
magnitude smaller than the significance level, e.g. 10^{-5} × 0.001). The color saturates
at dark orange for p-values that are more than five orders of magnitude smaller than the
chosen significance level.
In fact, the figure shows that the category 'biopolymer metabolic process' is the
important one, and that the over-representation of the 'macromolecule metabolic process'
and 'metabolic process' categories is merely a result of the presence of those 'protein
modification' genes. The fact that both categories are colored equally dark is due to the
saturation of the node color at very low p-values.
4 Conclusion
The global topological properties of the human population gene coexpression networks
studied here are very similar, but the specific architectures that underlie these properties
are drastically different. The actual pairs of orthologous genes found to be coexpressed
in the different populations are highly divergent, although we did detect a substantial
conserved component of the coexpression network. One of the most prevalent functional
classes showing clear function-expression coherence is the set of genes involved in
biopolymer metabolism. An example of such a cluster is shown in Figure 2.
The biological relevance of the global network topological properties appears
questionable [10]. Of course, this does not prevent network analysis from being a
powerful approach, possibly the most appropriate one, for the quantitative study of
complex systems made up of numerous interacting parts.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China
(Grant Nos. 30871394, 30370798 and 30571034), the National High Tech Develop-
ment Project of China, the 863 Program (Grant Nos. 2007AA02Z329), the National
Basic Research Program of China, the 973 Program (Grant Nos. 2008CB517302) and
the National Science Foundation of Heilongjiang Province (Grant Nos. ZJG0501,
1055HG009, GB03C602-4, BMFH060044, and D200650).
References
1. Horvath, S., Dong, J.: Geometric interpretation of gene coexpression network analysis.
PLoS Comput. Biol. 4(8), e1000117 (2008)
2. Carter, S.L., et al.: Gene co-expression network topology provides a framework for mo-
lecular characterization of cellular state. Bioinformatics 20(14), 2242–2250 (2004)
3. Bansal, M., et al.: How to infer gene networks from expression profiles. Mol. Syst. Biol. 3,
78 (2007)
4. Potapov, A.P., et al.: Topology of mammalian transcription networks. Genome
Inform. 16(2), 270–278 (2005)
5. Yu, H., et al.: The importance of bottlenecks in protein networks: correlation with gene
essentiality and expression dynamics. PLoS Comput. Biol. 3(4), e59 (2007)
6. Stranger, B.E., et al.: Relative impact of nucleotide and copy number variation on gene
expression phenotypes. Science 315(5813), 848–853 (2007)
7. The International HapMap Project. Nature 426(6968), 789–796 (2003)
8. Margolin, A.A., et al.: ARACNE: an algorithm for the reconstruction of gene regulatory
networks in a mammalian cellular context. BMC Bioinformatics 7(suppl. 1), S7 (2006)
9. Vlasblom, J., et al.: GenePro: a Cytoscape plug-in for advanced visualization and analysis
of interaction networks. Bioinformatics 22(17), 2178–2179 (2006)
10. Tsaparas, P., et al.: Global similarity and local divergence in human and mouse gene
co-expression networks. BMC Evol. Biol. 6, 70 (2006)
11. Khaitovich, P., et al.: A neutral model of transcriptome evolution. PLoS Biol. 2(5), E132
(2004)
12. Yanai, I., Graur, D., Ophir, R.: Incongruent expression profiles between human and mouse
orthologous genes suggest widespread neutral evolution of transcription control.
OMICS 8(1), 15–24 (2004)
13. Jordan, I.K., Marino-Ramirez, L., Koonin, E.V.: Evolutionary significance of gene expres-
sion divergence. Gene 345(1), 119–126 (2005)
14. Babu, M.M.: Introduction to microarray data analysis. In: Grant, R.P. (ed.) Computational
Genomics: Theory and Application. Horizon Press, Norwich (2004)
A Grid Based Cooperative Co-evolutionary
Multi-Objective Algorithm