Machine Learning for Power System Disturbance and Cyber-attack
Discrimination
Conference Paper, August 2016
Raymond C. Borges Hink, Justin M. Beaver,
Mark A. Buckner
Oak Ridge National Laboratory
Email: {borgesre, beaverjm, bucknerma}@[Link]
Tommy Morris, Uttam Adhikari, Shengyi Pan
Critical Infrastructure Protection Center
Mississippi State University
Email: {morrs, ua31, sp821}@[Link]
Abstract: Power system disturbances are inherently
complex and can be attributed to a wide range of sources,
including both natural and man-made events. Currently,
power system operators are heavily relied on to make
decisions regarding the causes of experienced disturbances
and the appropriate course of action as a response. In the
case of cyber-attacks against a power system, human
judgment is less certain since there is an overt attempt to
disguise the attack and deceive the operators as to the true
state of the system. To enable the human decision maker,
we explore the viability of machine learning as a means for
discriminating types of power system disturbances, and
focus specifically on detecting cyber-attacks where
deception is a core tenet of the event. We evaluate various
machine learning methods as disturbance discriminators
and discuss the practical implications for deploying
machine learning systems as an enhancement to existing
power system architectures.
Keywords: machine learning, cyber-attack, SCADA, smart grid
I. INTRODUCTION
The core mission of power systems is resilience - continued
delivery of electricity to the customer. These systems have
been designed with the redundancy and fault tolerance
mechanisms to perform this mission, but at a time when
computer security was not a design driver. Joining formerly
physically isolated power systems to the Internet for
centralized control and management created a greater
potential for unauthorized access and exposed these systems to
the same vulnerabilities that plague traditional computer
systems and networks.
Industrial control systems, such as those used in the Smart
Electric Grid, are becoming more complex in their architecture
and design. The Supervisory Control and Data Acquisition
(SCADA) systems that are used are more interconnected and
span multiple communication protocols and physical interfaces.
The methods by which data are collected from remote
locations, as well as commercially available SCADA software
developed for physically isolated systems, lead to more
potential flaws in the hardware and software and provide a
much larger attack surface to threat agents [20]. Every asset of
the Smart Grid, from home gateways to smart meters to
substations to control rooms, is a potential target for a
cyber-attack [21].
Modern power systems are now connected to the Internet, and
computer security is a new threat to resilience [18, 19]. Power
companies must now engineer security into their systems
after the system design is complete, or rely exclusively on
traditional computer network defenses to prevent unauthorized
access. Power system operators who monitor, assess, and react to
disturbances must now consider the new possibility that the
system is under a cyber-attack. This question is particularly
challenging for a human to answer because, unlike natural
disturbances or faults, a cyber-attack is designed to deceive.
In this work, we explore the suitability of machine learning
methods as a means of discriminating power system
disturbances. We theorize that the machine learning algorithms
will leverage non-linear complex relationships between power
system measurements and that these will be sufficient to
discriminate between malicious, non-malicious and natural
disturbances. Cyber-attacks can have the same effects as
natural events, so differentiating between malicious and
non-malicious events in a large and interconnected system can
be overwhelming if not infeasible for a human. The intent of this
work is to determine an optimal algorithm that is accurate in its
classification such that it can provide reliable decision support
to a power system operator, and thus relieve that operator of the
burden of determining whether a disturbance is an intentional
act. We evaluate the classification performance of various
machine learning methods and discuss the implications for
fielding machine learning systems and any associated
operational constraints. The remainder of this paper is
organized as follows. Section II presents related work. Section III
discusses our methodology for conducting our experiments and
the subsequent testing of machine learning methods. In Section IV
we describe our results. Finally, Section V presents
conclusions.
II. RELATED WORK
Machine learning has distinguished itself as a discriminator of
malicious and anomalous events in intrusion detection for
traditional cyber security networks [32] [33] [34]. These are
systems that analyze the network transactions between
computers and have been trained to characterize and recognize
behavioral patterns in that traffic. Our approach is to extend
this work and apply it to power systems, where networks are
the means for communicating the state and operation of
different power delivery components. This application focuses
on the simultaneous assessment of dozens of variables
associated with devices such as relays and generators as they
are communicated within the power system network. The
subsections below describe the vulnerabilities associated with
modern power systems and the related work in intrusion
detection systems (IDS) in that domain.
A. Synchrophasor-based Smart Grid Cyber Security
The smart grid consists of two layers, cyber and physical
systems. The two layers are coupled with each other and form
the cyber-physical environment. The Synchrophasor or Phasor
Measurement Unit (PMU) technology is built upon the cyber
layer and provides real-time data to the energy management
system (EMS) for the purpose of controlling the physical
system. Such processes are presented as a sequence of
execution events in the cyber-physical environment. The
synchrophasor data includes not only measurements such as
voltage and current phasors but also the status of system
devices including relays, breakers, switches, and transformers
[1]. The extremely low latency offered by time-synchronized data
provides a huge volume of data with extra information and
enables various real-time power system control algorithms in
order to increase smart grid reliability and stability [2] [3] [4].
The deployment of synchrophasor technology accelerates the
use of communication networks within utilities and between
neighboring utilities. Even the latest synchrophasor devices are
vulnerable to cyber-attacks [7], and large numbers of
legacy devices remain in service with little or no protection
against such attacks.
Contemporary attacks against a power system can be launched
from a compromised personal computer (PC) through a
network to control a breaker. For example, the Aurora event
highlights the potential for an attacker to open and close a
breaker at high speed from a remote connection to damage an
electric generator [5]. Vulnerabilities can also be exploited
against Intelligent Electronic Devices (IEDs) by uploading
malicious settings. The Stuxnet worm [1] is an example of
settings changes on a control device causing a physical system
to malfunction. Moreover, most network protocols used in
power systems are open standard protocols without any security
features. Such protocols include the IEEE C37.118 protocol, used
for synchrophasor data streaming, and MODBUS and DNP3, both
used to remotely monitor and control IEDs. The penetration tests
conducted in [6] and [7] have shown that cyber-attacks targeted
against substation computers and devices can lead to Denial of
Service (DoS) by making communication with a device
impossible or causing devices to crash or reset, and therefore
prevent real-time monitoring and control of the power
system.
B. IDS for Smart Grid
In recent years, the emergence of the Smart grid has motivated
research into a variety of intrusion detection techniques.
Researchers with different backgrounds have created various
intrusion detection systems (IDS) that focus on different
intrusions against the Smart grid. One type of IDS research
focuses on IED security within the Smart grid. For example,
Chee-Wooi Ten et al. in [8] developed an anomaly-based
detection technique for intrusions to IEDs. The Chee-Wooi Ten
IDS is host-based and thus only identifies attacks against a
single IED in the substation using sequential events recorded in
the log from that IED.
Another IDS proposed by Chen et al. in [9] provides a
protection mechanism for smart household appliances. Chen et
al. created security rules for individual appliances by proposing
homogeneous functions that model three factors of the
appliance: device security, usability and electricity pricing.
More advanced IDS of this type consider behaviors of
multiple devices within the system to obtain system level
detection. In [10], Robert Mitchell et al. propose a
specification-based IDS for the electric grid by considering the
behaviors of three types of physical devices in the electric grid:
head-ends, distribution access points/data aggregation points and
subscriber energy meters. They use readings from 22 sensors
from the three types of devices as state components. By
quantizing each of the 22 components into a limited number of
ranges, they manually build three state machines with 3456,
1728, and 3456 states for the three devices respectively,
expressed in conjunctive normal form. It is very expensive to
build such an IDS due to the large state space. In addition, this
IDS uses a limited number of sensors and is therefore able to
detect only a small number of attacks. The method is also not
scalable, since there are always new attacks and applications.
Another type of IDS for the Smart grid leverages communication
traffic in the information infrastructure to detect cyber-attacks.
Yang et al. propose an IDS in [11] for synchrophasor systems
that detects cyber-attacks by using access control white lists,
protocol-based white lists and network behavior-based rules,
each of which specifies security rules in a different layer of the
synchrophasor system. The Yang et al. intrusion detection is
limited to cyber-attacks including Man-in-the-Middle (MITM)
and Denial of Service (DoS) against synchrophasor devices and
the IEEE C37.118 protocol. Similar to Yang's IDS, Zhang et al. in
[12] propose a distributed IDS that analyzes communications
traffic at different network levels of the smart grid including
home area networks, neighborhood area networks, and wide area
networks. An intelligent module is deployed at each level to
classify malicious data and possible cyber-attacks using data
mining algorithms. These modules then communicate to get a
system level view of the status of the whole communication
network to improve the detection accuracy. Hadeli et al., in
[13], propose an anomaly detection technique for industrial
control systems that extracts behavior patterns of devices from
protocols used in industrial control systems, for example,
GOOSE messages, IEC 61850, Manufacturing Message
Specification, Modbus TCP and redundant network routing
protocols. The Hadeli et al. IDS uses a system description file
to include a full description of the overall communication
pattern in the industrial control system.
For the case of power system control applications, the system
description file describes expected system behaviors from
information carried by those protocols. Hadeli's method, along
with [11] and [12], is effective at detecting malicious activities
that cause changes in network traffic, but these IDSs fail to
detect malicious actions that result in invalid changes to the
physical system. For example, Hadeli's method cannot detect a
malicious trip command from a valid IP address that trips a
relay, taking a transmission line out of service and causing a
blackout. A specification-based IDS that can track sequential
events in the system is reported in [14] for advanced metering
infrastructure (AMI). The authors manually build the state
machine by extracting specifications from two AMI protocols,
and they consider the device status. To prove the correctness
of the state machine, they use a model checking technique to
verify their specifications. This IDS is also not applicable to
transmission systems because transmission systems have far
more control applications and disturbances than AMI, so
manually building a state machine would be very expensive.
While the two types of IDSs mentioned above were created
from a computer science perspective, there has been work to
create IDSs for the Smart grid using power system theories. For
instance, Valenzuela et al. [15] used optimal power flow
programs to detect cyber-attacks, leveraging the notion that
bad data will cause the power flow to be dispatched
erroneously. Talebi et al. in [16] proposed a mechanism for
identification of bad data attacks in a power system using
weighted state estimation. Zonouz et al. proposed an IDS that
not only examines the measurement data using state estimation
and power flow theory but also includes the results from a
network IDS to calculate the probability that the data is
compromised [17]. Although these works all proved able to
detect false data, the limitation of this type of IDS
is that it is limited to one type of attack and cannot be extended
to detect other attacks against power systems.
In our previous work [36] we applied multiple learning
algorithms to Modbus RTU data in order to show their viability
as intrusion detection tools on a simple gas pipeline system.
State-of-the-practice classification algorithms were applied in
order to demonstrate an ability to discriminate command and
data injection attacks for simple and small-scale SCADA
systems. This was a foundation for the viability of machine
learning in this domain.
In this work, we extend that approach in both the complexity of
the system under evaluation and the sophistication of the
classification methods applied. Our hypothesis is that the
learning algorithms can detect disturbances and reliably
classify them as a natural or malicious disturbance, despite any
attempts at deception.
III. METHODOLOGY
This section describes our approach to evaluating machine
learning classification techniques for discriminating power
system disturbances. The system used for evaluation is
described as well as the different natural and man-made
scenarios. We also discuss the machine learning methods
used and the different approaches to classification.
A. Power System Description
In Figure I we show the power system framework used in
this evaluation, a complex mix of supervisory control systems
interacting with various smart electronic devices complemented
by network monitoring devices such as SNORT and Syslog
systems. The network is composed of 4 breakers controlled
by intelligent electronic relays. These IEDs relay
information through a substation switch and a
router back to the supervisory control and data acquisition
systems. Attack scenarios were built and simulated with the
assumption that an actor had already gained access to the
substation network and poses an insider threat by issuing
commands from the substation switch.
[Figure: network diagram; recoverable labels: Snort, Control Panel, OpenPDC, Control Room]
Fig I. Experiment Network Diagram
In Figure I we have several components. G1 and G2 are
power generators. R1 through R4 are IEDs that can switch the
breakers on or off. These breakers are labeled BR1 through
BR4. We also have two transmission lines. Line 1 spans from
bus B1 to bus B2 and Line 2 spans from bus B2 to bus B3.
Each IED automatically controls one breaker: R1 controls BR1,
R2 controls BR2, and so on. The IEDs use a
distance protection scheme which trips the breakers on detected
faults, whether actually valid or faked, since they have no
internal validation to detect the difference. Operators can also
manually issue commands to the IEDs R1 through R4 to
trip the breakers BR1 through BR4. The manual
override is used when performing maintenance on the lines or
other system components. In our analysis, we explicitly
include examples from multiple operational scenarios in order
to have confidence that any attack discrimination was valid
during normal operations where the breakers were manipulated.
The natural and man-made disturbance scenarios are listed below.
Types of Scenarios:
1. Short-circuit fault - this is a short in a power line and can
occur in various locations along the line; the location is
indicated by the percentage range.
2. Line maintenance - one or more breakers are opened via
the remote relay trip command for maintenance.
3. Remote tripping command injection (Attack) - this is an
attack that sends a command to a relay which causes a
breaker to open. It can only be done once an attacker has
penetrated outside defenses.
4. Relay setting change (Attack) - relays are configured with
a distance protection scheme and the attacker changes the
setting to disable the relay function such that the relay will not
trip for a valid fault or a valid command.
5. Data Injection (Attack) - here we imitate a valid fault by
changing values of parameters such as current, voltage,
sequence components etc., in order to blind the operator
and cause a blackout.
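The reason a data injection attack can trip a breaker follows from how a distance relay decides. As a minimal sketch, assume the relay computes the apparent impedance Z = V / I from the measured phasors and trips when |Z| falls below a configured reach. The function names, phasor values and reach setting below are hypothetical illustrations and are not taken from the testbed.

```python
import cmath

def apparent_impedance(voltage: complex, current: complex) -> complex:
    """Apparent impedance seen by the relay, Z = V / I (phasor division)."""
    return voltage / current

def relay_trips(voltage: complex, current: complex, reach_ohms: float) -> bool:
    """Trip when |Z| falls inside the protected zone.

    The relay trusts its inputs: nothing here checks that the phasors
    reflect the real state of the line.
    """
    return abs(apparent_impedance(voltage, current)) < reach_ohms

# Hypothetical phasors built from (magnitude, angle in radians).
normal = (cmath.rect(130e3, 0.0), cmath.rect(400.0, -0.3))   # |Z| = 325 ohms
fault = (cmath.rect(60e3, 0.0), cmath.rect(4000.0, -1.3))    # |Z| = 15 ohms
injected = fault  # falsified measurements that imitate the fault

REACH_OHMS = 80.0  # hypothetical zone setting

print(relay_trips(*normal, REACH_OHMS))
print(relay_trips(*fault, REACH_OHMS))
print(relay_trips(*injected, REACH_OHMS))
```

Under these assumptions the relay behaves identically for the real fault and the injected values, which is why the measurements alone do not reveal the cause to the relay.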
B. Analytic Approach
To judge the viability of using machine learning for intrusion
detection on smart grid electrical systems we tested various
popular learners using Weka [22] as the machine learning
framework and open-source simulated power system data
provided by Mississippi State University [37]. The
classification of events was performed using three different
classification schemes:
• Multiclass - Each of the 37 event scenarios, which
included attack events, natural events, and normal
operations, was its own class and was predicted
independently by the learners.
• Three-class - The 37 event scenarios were grouped into 3
classes: attack events (28 events), natural events (8 events)
or "No events" (1 event).
• Binary - The 37 event scenarios were grouped as either an
attack (28 events) or normal operations (9 events).
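The three schemes differ only in how a scenario label is mapped to a class. The sketch below illustrates that mapping. The text gives only the group sizes (28 attack, 8 natural, 1 "No events"), so the assignment of scenario IDs to groups here is a hypothetical placeholder, as are the function names.

```python
from collections import Counter

# Hypothetical ID ranges: only the group sizes come from the text.
ATTACK_IDS = set(range(1, 29))     # 28 scenarios
NATURAL_IDS = set(range(29, 37))   # 8 scenarios
NO_EVENT_IDS = {37}                # 1 scenario

def multiclass(scenario_id: int) -> int:
    """Multiclass: every scenario is its own class."""
    return scenario_id

def three_class(scenario_id: int) -> str:
    """Three-class: attack, natural, or no event."""
    if scenario_id in ATTACK_IDS:
        return "attack"
    if scenario_id in NATURAL_IDS:
        return "natural"
    if scenario_id in NO_EVENT_IDS:
        return "no_event"
    raise ValueError(f"unknown scenario id: {scenario_id}")

def binary(scenario_id: int) -> str:
    """Binary: attack versus everything else (normal operations)."""
    return "attack" if three_class(scenario_id) == "attack" else "normal"

all_ids = range(1, 38)
print(Counter(three_class(i) for i in all_ids))
print(Counter(binary(i) for i in all_ids))
```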
The data was drawn from 15 data sets which included
thousands of individual samples of measurements throughout
the power system for each event type. The datasets were
randomly sampled at 1% to reduce the size and evaluate the
effectiveness of small sample sizes. For this analysis, there was
an average of 294 "No event" instances, 3,711 attack instances
and 1,221 natural event instances used across the classification
schemes. The date and time information were removed since
scenarios were run sequentially and time and date would
perfectly classify the data.
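The two preparation steps described above, random subsampling and removal of the date and time fields, can be sketched as follows. This is an illustration only: the row representation (a list of dictionaries) and the column names "date" and "time" are assumptions, not the layout of the published data.

```python
import random

def sample_rows(rows, fraction, seed=0):
    """Keep a random subset of rows, drawn without replacement."""
    rng = random.Random(seed)
    keep = max(1, round(len(rows) * fraction))
    return rng.sample(rows, keep)

def drop_columns(rows, names):
    """Remove columns that would leak the label, such as date and time."""
    return [{key: value for key, value in row.items() if key not in names}
            for row in rows]

# Toy data standing in for one dataset (hypothetical column names).
rows = [{"date": "d", "time": i, "x": float(i), "label": i % 3}
        for i in range(1000)]

subset = sample_rows(rows, 0.01)               # 1% sample
cleaned = drop_columns(subset, {"date", "time"})
print(len(cleaned), sorted(cleaned[0]))
```

Dropping the timestamps matters because scenarios were run one after another, so a learner could otherwise memorize when each scenario occurred instead of learning from the measurements.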
For each of the three schemes, Multiclass, Three-class and
Binary, we tested 7 learners on 15 datasets. When running the
experiments we used 10-fold cross validation. We partitioned
each dataset into 10 sets, randomly selecting instances from
each category. The model was built on ninety percent of the
data and tested on the remaining ten percent to evaluate the
learner's performance, with each of the 10 sets held out in
turn. We repeated this for each learner and each dataset, then
took the average over the fifteen datasets to summarize the
results.
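The procedure above can be sketched in a few lines. This is a generic stratified 10-fold routine written for illustration, not the Weka implementation used in the experiments; the names stratified_folds, cross_validate and train_fn are hypothetical, and the sketch assumes at least k instances so that no fold is empty.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign instance indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            folds[position % k].append(idx)  # deal indices round-robin
            position += 1
    return folds

def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Mean accuracy over k train/test splits.

    train_fn(train_X, train_y) must return a callable predict(x) -> label.
    """
    scores = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predict = train_fn([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        correct = sum(predict(features[i]) == labels[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

Averaging the returned score over the fifteen datasets would then give the per-learner summary described in the text.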
The classification algorithms we tested were:
OneR - This is a learner with a very simplistic method that
evaluates each feature's optimum rule and chooses the best one
[24] from all feature rule sets.
NNge - a nearest-neighbor-like algorithm that classifies
examples by comparing them to those already seen and to
their surrounding data points [27].
Random Forests - this is an ensemble of tree predictors where
each tree casts a vote for the most popular class on input of a
new instance [23]. The collection of decision trees is created
from randomly pulled training data samples.
Naive Bayes - is a probabilistic classifier based on Bayes'
theorem [25] that reflects the conditional probability
distribution of a set of random variables, and was adopted into
the field of machine learning in 1992 [26].
SVM - Support vector machines [28] trained using sequential
minimal optimization [29]. An SVM model is a representation
of the examples as points in a space, with classes divided by a
mathematically determined set of hyperplanes that maximize
the margin between the classes. New examples are then
predicted to belong to a class based on their position in that
space relative to the hyperplanes.
JRipper - Incremental Reduced Error Pruning algorithm that
uses a separate-and-conquer methodology developed in [30]
and modified by Cohen as shown in [31] to generate a
sophisticated rule set.
Adaboost - short for Adaptive Boosting, this is an algorithm
used to improve the performance of other types of learning
algorithms [35]. It is an ensemble learning method where each
new model instance focuses on training examples that were
misclassified in the previous models. By combining Adaboost
with our strongest performer we achieve much better results.
The AdaBoost M1 method used in Weka can be used in conjunction
with learners to improve their performance.
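The reweighting idea behind boosting can be made concrete with a short sketch of the AdaBoost.M1 update. This is an illustrative implementation, not Weka's code, and it boosts a one-feature threshold rule (train_stump, a hypothetical stand-in) rather than JRipper, which is what the experiments boosted.

```python
import math
from collections import defaultdict

def adaboost_m1(X, y, train_weak, rounds=10):
    """Return a list of (vote_weight, predict) pairs.

    train_weak(X, y, weights) must return a callable predict(x) -> label.
    """
    n = len(y)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        predict = train_weak(X, y, weights)
        wrong = [predict(x) != label for x, label in zip(X, y)]
        error = sum(w for w, bad in zip(weights, wrong) if bad)
        if error == 0:
            return [(1.0, predict)]  # a perfect model needs no ensemble
        if error >= 0.5:
            break  # no better than chance on the weighted data
        beta = error / (1.0 - error)
        # Shrink the weight of correctly classified examples and renormalize,
        # so the next round concentrates on the mistakes.
        weights = [w if bad else w * beta for w, bad in zip(weights, wrong)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ensemble.append((math.log(1.0 / beta), predict))
    return ensemble

def vote(ensemble, x):
    """Weighted majority vote; assumes the ensemble is not empty."""
    tally = defaultdict(float)
    for weight, predict in ensemble:
        tally[predict(x)] += weight
    return max(tally, key=tally.get)

def train_stump(X, y, weights):
    """Hypothetical weak learner: best weighted threshold on feature 0."""
    best = None
    labels = sorted(set(y))
    for threshold in sorted(set(x[0] for x in X)):
        for low in labels:
            for high in labels:
                err = sum(w for x, t, w in zip(X, y, weights)
                          if (low if x[0] <= threshold else high) != t)
                if best is None or err < best[0]:
                    best = (err, threshold, low, high)
    _, threshold, low, high = best
    return lambda x: low if x[0] <= threshold else high
```

For example, no single threshold separates the labels a, a, b, b, a, a on the inputs 0 through 5, but three boosted rounds of the threshold rule classify all six training points correctly.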
The classifiers we used can be grouped under these categories:
Probabilistic classification (Naive Bayes)
Rule induction (OneR, NNge, JRipper)
Decision tree learning (Random Forests)
Non-probabilistic binary classification (SVM)
Boosting, a meta-algorithm for learning (Adaboost)
IV. RESULTS
The results of our evaluation and analysis of the viability of
machine learning as a method for power system disturbance
discrimination are presented below. Initially, we evaluate the
accuracy of various learners across all data sets in order to
establish a pattern of consistency in the classification results.
We follow with an evaluation of the various learning methods
applied to the power system data to assess power system
disturbance classification. Next is an analysis of the most
significant individual features that contribute to a decision. We
conclude our analysis with a discussion on the operational
viability of learning methods given the results of this research.
A. Analysis of Accuracy Results
The accuracy of a learner is defined as the percentage of correct
classifications relative to the total number of classification
decisions the learner made. When classes are balanced,
accuracy provides a good general indicator of classifier
performance. The machine learning method evaluation in
Section IV.B presents performance measures of the 10-fold
cross validation averaged across all data sets. The goal of this
initial analysis step is to establish the consistency of learner
performance across data sets so that any averaged performance
values remain credible.
In Figures II, III and IV we show the classification accuracy
over the 15 datasets for multiclass, three-class and
binary classification using 7 different algorithms. Note the
consistency of the results regardless of the data set to which the
learning method is applied. While minor variations exist for
each learner, their individual performance remains steady
regardless of the data set or classification scheme.
B. Machine Learning Method Evaluation
Having established that averaging the 10-fold cross validation
results in a reasonable characterization of classifier
performance over all data sets, we focus on the evaluation of
the learners themselves using those averaged values. While
accuracy provides a general indicator of classifier performance,
recall, precision, and F-measure values give a more complete
picture of how the classifier produces errors. Recall measures
the true positive rate, precision measures the positive predictive
value, and the F-measure is the harmonic mean of precision and
recall. For these measures, values approaching 1.0 indicate
strong classification performance.
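These three measures follow directly from the counts of true positives, false positives and false negatives for a chosen positive class. The helper below is a minimal sketch with a hypothetical name and toy labels; it is not the code used to produce the reported values.

```python
def precision_recall_f(actual, predicted, positive):
    """Precision, recall and F-measure for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# Toy case: a learner that labels every disturbance as an attack.
actual = ["attack"] * 2 + ["normal"] * 6
predicted = ["attack"] * 8
print(precision_recall_f(actual, predicted, "attack"))
```

In this toy case recall is perfect while precision is only 0.25, the pattern of a learner biased toward the attack class.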
Figure V shows the precision value of the various learners
averaged over the 15 datasets where the 10-fold cross
validation approach was used for each data set. Each line
represents learner performance using the three different
classification schemes. As the measure of positive prediction
rate, precision provides a sense of the false positive values
when predicting for a specific class such as cyber-attack. For
precision, Random Forests, JRipper and Adaboost+JRipper
have the strongest performance over all classification schemes,
with Adaboost+JRipper for the three-class scheme having the
highest average precision value (0.991).
Fig II. Multiclass Accuracy over Fifteen Datasets
Fig III. Three-class Accuracy over Fifteen Datasets
Fig IV. Binary Classification Accuracy over Fifteen Datasets
Figure VI shows a similar set of results for averaged recall. As
recall reflects true positive rate, this evaluation identifies the
learning methods that detected cyber-attacks most successfully.
Interestingly, a slightly different set of learners surface as high
performers for this metric. For example, OneR and Naive
Bayes, two of the simplest methods, score very high (1.0 and
0.961, respectively) in terms of averaged recall whereas
Random Forests performs significantly worse. JRipper and
Adaboost+JRipper are consistently strong with recall values in
the 0.8 to 0.9 range. The high recall values coupled with the
low precision values for some learners indicate those learners'
bias towards the positive (attack) class. That is, simple learners
such as OneR and Naive Bayes may correctly classify
malicious power system disturbances, but at the cost of a
disproportionate number of false positives. In a practical
setting, the value that such a learner would bring to a decision
would be low since its classification would not be reliable.
The F-measure, whose averaged values for all data sets are
shown in Figure VII, intrinsically describes classification
performance in terms of both precision and recall. As expected,
those learners that performed well in terms of both precision
and recall have the highest F-measure score, with
JRipper+Adaboost having the highest overall value at 0.955 for
the three-class classification scheme. Based on these results,
the Adaboost+JRipper algorithm using a three-class
classification scheme is the optimum approach to reliably
classifying power system disturbances.
The variation in results based on classification scheme
(multiclass, three-class, binary) is surprising. While the three-
class scheme produced the overall best performer, the results are
inconclusive as to whether this is the optimum classification
scheme across all learners. Different classification schemes
coupled with different learners produce dramatically different
results across all performance metrics. This implies an
unexpected sensitivity to the classification scheme and suggests
homogeneity in the data for all disturbance types. A future
direction for this research is to explore classification schemes
and learner configuration to more thoroughly address this issue,
including the possibility of staging learners for optimum
classification performance. Despite the inconsistencies in
results across classification schemes, the JRipper+Adaboost
algorithm as the optimum learner is still a valid result, as that
approach consistently outperformed the other learners across all
classification schemes.
We attribute the strong performance of the JRipper+Adaboost
approach to its tree-based approach to rule generation coupled
with the learning ensemble. However, it was surprising that
Random Forests, an ensemble method leveraging decision
trees, performed poorly in comparison. We attribute this
difference to the way in which the training data is prepared for
each learning approach. Random Forests do no pruning of their
underlying decision trees, and draw their training data samples
randomly, thus providing a very basic approach to building the
decision trees and combining them in an ensemble. JRipper
applies a pruning algorithm to the sampled training data that
minimizes errors. In addition, the boosting creates an ensemble
that is focused on previously misclassified data, another
intrinsic attempt to minimize error. Given the small number of
training data examples relative to the number of features being
evaluated, methods that explicitly attempt to minimize
classification error should be expected to perform better.
Fig VII. Average F-Measure over Classification Schemes
C. Feature Analysis Discussion
In our framework there were 4 synchrophasors that measured
29 features each for a total of 116 PMU measurements. There
are also three different log types (control panel logs, Snort logs
and relay logs) for each PMU, for an additional 12 features and a
total of 128 features. Table I shows the features extracted from
each PMU and a short description for each. Note that numbers
indicate a range of measurements.
TABLE I. FEATURE DESCRIPTIONS
Feature              Description
PA1:VH - PA3:VH      Phase A - C Voltage Phase Angle
(illegible)          Phase A - C Voltage Magnitude
(illegible)          Phase A - C Current Phase Angle
(illegible)          Phase A - C Current Magnitude
(illegible)          Pos. - Neg. - Zero Voltage Phase Angle
PM7:V - PM9:V        Pos. - Neg. - Zero Voltage Magnitude
PA10:VH - PA12:VH    Pos. - Neg. - Zero Current Phase Angle
(illegible)          Pos. - Neg. - Zero Current Magnitude
(illegible)          Frequency for relays
(illegible)          Frequency Delta (dF/dt) for relays
(illegible)          Apparent Impedance seen by relays
(illegible)          Apparent Impedance Angle seen by relays
(illegible)          Status Flag for relays
Fig VIII. Information Gain Ranked Features
The information gain-ordered features are presented in Figure
VIII. For our measurements, about 50% of the 128 features
provide about 96% of the learning value. The four features
with the highest information gain were Apparent Impedance
measurements for each relay, having values in the 4.8 to 4.9
range. These were followed by Voltage Phase Angles, Current
Phase Angles and Voltage and Current Magnitudes, which had
values in the 3.0 range. Together, these account for the top 40
features. After these 36 additional features there is another
comparatively large drop in information gain, making up what
appear to be three levels of information gain groupings.
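Information gain for a feature is the entropy of the class labels minus the expected entropy that remains after splitting on that feature. The following sketch ranks features that way. It assumes the feature values have already been discretized, and the function names and toy columns are hypothetical rather than taken from the experiments.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(values, labels):
    """Label entropy minus expected entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

def rank_features(columns, labels):
    """Feature names ordered by decreasing information gain."""
    return sorted(columns,
                  key=lambda name: information_gain(columns[name], labels),
                  reverse=True)

# Toy example: one feature determines the label, the other does not.
labels = ["a", "a", "b", "b"]
columns = {"useless": [0, 1, 0, 1], "perfect": [0, 0, 1, 1]}
print(rank_features(columns, labels))
```

Keeping only the first entries of such a ranking is one simple way to form the reduced feature groupings discussed in this section.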
We repeated the experiment using the JRipper algorithm and
evaluated its classification performance using both the
grouping of only the four best features and the grouping of the
top 40 features. Using only the top four features as training data
yielded poor results, but using the top 40 features for training
data resulted in the same classification performance as
when using all of the available features. This identifies an
opportunity for dimensionality reduction, but more importantly
it reinforces the need for an algorithmic decision support
component for power system disturbance classification. The
simultaneous evaluation of the four most significant metrics
(which in itself would be challenging for a human) is
insufficient for reliable classification. It requires the
simultaneous evaluation of dozens of power system metrics to
detect power system disturbances for cyber-attack detection, a
feat that is intractable for a human to perform.
D. Operational Viability Discussion
The classification approach to machine learning is still not
widely used in industry as an intrusion detection system,
mainly due to a poor understanding of the training data
requirements that are necessary to construct a reliable learner.
As the results indicate, a power system disturbance detector
that uses event classification to provide decision support
to its operators would be reliable and effective in determining
the nature of a disturbance and an appropriate associated course
of action. However, an operational deployment of a
classification system would also require the site-specific
acquisition and maintenance of disturbance training data, since
the classification models are not generally applicable, as rules
from signature-based systems are. Both attack and normal
operations data must be acquired in-situ, from the system that
will be monitored, and the learner must then be tuned to
minimize false positives. The technical issue of the need for
labeled training data could be abated by exploring alternative
approaches that minimize or eliminate the amount of labeled
data needed (e.g., unsupervised and semi-supervised methods)
yet retain the classification performance. However, the
operational processes for acquiring and maintaining in-situ
training data and the support processes for learning system
feedback and retraining criteria do not currently exist, and so
are both a barrier to operational viability and an opportunity
for future research.
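As one example of a quantity that in-situ tuning would need to track, the sketch below computes the false positive rate of Attack alarms from labeled events. The function name, labels, and data are assumptions made for illustration and do not describe an existing operational process.

```python
def false_positive_rate(true_labels, predicted_labels, positive="Attack"):
    """Share of non-attack events that were flagged as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    false_alarms = sum(t != positive and p == positive for t, p in pairs)
    negatives = sum(t != positive for t, _ in pairs)
    return false_alarms / negatives if negatives else 0.0

# Toy, made-up events from a hypothetical monitored site.
true_labels = ["Attack", "Natural Disturbance", "No Event",
               "No Event", "Attack", "Natural Disturbance"]
predicted_labels = ["Attack", "Attack", "No Event",
                    "No Event", "Natural Disturbance", "Natural Disturbance"]

attack_fpr = false_positive_rate(true_labels, predicted_labels)
```

Here one of the four non-attack events is flagged as an attack. A deployment could recompute this rate on newly labeled in-situ data to decide when retraining is warranted, although the criteria for that decision remain an open question.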
V. CONCLUSION
We have established initial benchmarks for applying machine
learning approaches to power system disturbance classification
on a smart power grid framework. Using the JRipper+Adaboost
method over a three-class (Attack, Natural Disturbance, and No
Event) classification scheme, we were able to reliably classify
power system disturbances with low false positive rates.
Therefore, based on the results of applying learning methods to
this power system data, we conclude that machine learning is a
viable approach to providing reliable decision support to power
system operators on whether the system is under attack.
Despite these results, we recognize that further work is required
to make learning-based systems deployable in an operational
environment. From a learning perspective, these results need to
be validated on a broader set of power system data and with a
wider variety of learning approaches, classification schemes,
and amounts of labeled data. In addition, more work is
required in understanding the concept of operations associated
with these systems, such as methods for determining training
and retraining needs, approaches for generating and managing
labeled data, and in-situ evaluation tools to select the optimum
learner and tune the performance of that learner in the specific
deployed environment. However, this work serves as an initial
set of evidence for the application of machine learning in this
domain and as motivation for further research.
VI. ACKNOWLEDGEMENT
Research sponsored by the Laboratory Directed Research and
Development Program of Oak Ridge National Laboratory, P.O.
Box 2008, Oak Ridge, Tennessee 37831-6285; managed by
UT-Battelle, LLC, for the U.S. Department of Energy under
contract DE-AC05-00OR22725. This manuscript has been
authored by UT-Battelle, LLC, under contract
DE-AC05-00OR22725 for the U.S. Department of Energy. The United
States Government retains and the publisher, by accepting the
article for publication, acknowledges that the United States
Government retains a non-exclusive, paid-up, irrevocable,
worldwide license to publish or reproduce the published form
of this manuscript, or allow others to do so, for United States
Government purposes.