InfoSphere™ Optim™ & Guardium® Technology Ecosystem
InfoSphere™ Guardium® Technical Training
Classifier
Information Management
© 2011 IBM Corporation
Agenda
What is Classifier?
How does Classification work?
Classification Rule
Classification Policy
Classification Process
Classification Reports
Classification Workflow
Benefits and Use Cases
What is Classifier?
■ Sensitive information may reside in multiple locations without the knowledge
of its current owners, or may have no owner at all:
– Enhancement projects between disparate systems
– Mergers and acquisitions
– Legacy systems that have outlasted their original owners
■ Sensitive information may include credit card numbers and transactions, personal
information, or financial data
■ The Guardium Classifier feature discovers and classifies sensitive data by efficiently
crawling databases or unstructured files, searching for a specified data pattern
and, in the case of databases, catalog information or permissions
How does Classification work?
Classification Policy: a set of rules designed to discover and tag sensitive data
elements. Each rule searches for sensitive data according to its rule type and
performs its rule action when a match is found
Classification Process: a job consisting of a classification policy and one or more
datasources. The process can be scheduled to run on a periodic basis as a task
in a compliance workflow automation process
[Diagram: Classification Rules (Rule Type, Rule Action) make up a Classification Policy; a Classification Process combines the policy with Data Sources and can run within an Audit Process]
Classification Policy and Process Builder
Accounts with admin role:
Tools > Config & Control
Accounts with user role:
Discover > Classification
Classification Rule
Defines what to search for, how to search, and
what action(s) to take if an object is found.
Catalog Search: Search the database catalog
for table or column names
Search by Permission: Search for the types of
access that have been granted to users or roles
Search for Data: Match specific values or
patterns in the data
Search for Unstructured Data: Match specific
values or patterns in an unstructured data file
(CSV, Text, HTTP, HTTPS, Samba)
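To illustrate the idea behind a Catalog Search, here is a minimal sketch that scans a database catalog for column names matching a SQL LIKE pattern. It uses an in-memory SQLite database purely for illustration; the function name catalog_search and the sample tables are hypothetical, and this is not how the Classifier itself is implemented.

```python
import sqlite3


def catalog_search(conn, column_pattern):
    """Return (table, column) pairs whose column name matches a SQL LIKE pattern."""
    matches = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        for col in conn.execute(f'PRAGMA table_info("{table}")').fetchall():
            name = col[1]
            # Let SQLite evaluate the LIKE pattern so wildcard semantics match SQL.
            hit = conn.execute("SELECT ? LIKE ?", (name, column_pattern)).fetchone()[0]
            if hit:
                matches.append((table, name))
    return matches


# Hypothetical sample schema, used only to exercise the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, credit_card TEXT, city TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, payment_card TEXT, total REAL)")

hits = catalog_search(conn, "%card")
for table, column in hits:
    print(f"{table}.{column}")
```

Only the catalog (table and column names) is inspected here; no row data is read, which is what distinguishes this rule type from Search for Data.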
Search Expressions
Catalog (column name, table name) or SQL:
■ % = any sequence of characters (%card will match “credit card” and “payment card”)
Search Expressions (regular expressions):
■ * = any number of characters
■ ^ = Match string from the beginning (^find* will match “findthis”)
■ $ = Match string at the end (*this$ will match “findthis”)
■ [0-9] = match a single digit from 0 through 9
■ {n} = match exactly n instances of the preceding character(s)
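The anchors, character classes, and repetition counts listed above can be tried out with any standard regular expression engine. The sketch below uses Python's re module as a stand-in; note that in standard regex syntax "any number of characters" is written as .* (a dot followed by the asterisk), so the examples use that form. The helper name matches is hypothetical, and whether the Classifier accepts exactly this syntax should be checked against the product documentation.

```python
import re

# ^ anchors the match at the start of the string; .* matches any run of characters.
starts_with_find = re.compile(r"^find.*")
# $ anchors the match at the end of the string.
ends_with_this = re.compile(r".*this$")
# [0-9] matches one digit; {5} repeats the preceding element exactly five times.
five_digits = re.compile(r"^[0-9]{5}$")


def matches(pattern, value):
    """Return True if the compiled pattern is found in value."""
    return pattern.search(value) is not None


for pattern, value in [
    (starts_with_find, "findthis"),
    (ends_with_this, "findthis"),
    (five_digits, "12345"),
    (five_digits, "1234"),
]:
    print(pattern.pattern, repr(value), matches(pattern, value))
```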
Built-in Patterns
guardium://CREDIT_CARD → Detects two credit card number patterns. It tests for a string
of 16 digits or for four sets of four digits, with each set separated by a blank
guardium://SSEC_NUMBER → Detects Social Security Number format: three digits, dash
(-), two digits, dash (-), four digits
guardium://PCI_TRACK_DATA → Detects two patterns of magnetic stripe data used in the
Payment Card Industry
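As a rough illustration of the formats described for the credit card and Social Security Number patterns, the sketch below writes them as ordinary regular expressions in Python. These expressions are assumptions derived only from the descriptions above; they are not the actual definitions behind guardium://CREDIT_CARD or guardium://SSEC_NUMBER, and the function name classify is hypothetical.

```python
import re

# Approximation only: 16 consecutive digits, or four groups of four digits
# separated by single blanks.
CREDIT_CARD = re.compile(r"\b(?:[0-9]{16}|[0-9]{4}(?: [0-9]{4}){3})\b")
# Approximation only: three digits, dash, two digits, dash, four digits.
SSN = re.compile(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b")


def classify(value):
    """Return the labels of all patterns found in value."""
    labels = []
    if CREDIT_CARD.search(value):
        labels.append("CREDIT_CARD")
    if SSN.search(value):
        labels.append("SSN")
    return labels


for sample in ["4111111111111111", "4111 1111 1111 1111", "123-45-6789", "hello"]:
    print(repr(sample), classify(sample))
```

A format check like this only tests the shape of the value; it does not validate that a number is a real card or SSN.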
Classification Actions
Add To Group Of Object-Fields: A member
will be added to the selected Object-Field
group. Can be used for structured data and
unstructured data files.
Add To Group Of Objects: A member will be
added to the selected Object group
Create Access Rule: An access rule will be
inserted into an existing security policy
definition
Ignore: Do not log the match, and take no
additional actions
Create Privacy Set: The selected privacy set's object-field list will be replaced. A privacy
set is a collection of elements that merit special monitoring
Log Policy Violation: A policy violation will be logged. This means that classification
policy violations will be logged (and can be reported) together with access policy
violations (and optionally correlation alerts) that may have been produced
Log Result: Log the match, and take no additional actions
Send Alert: An alert will be sent to one or more receivers
Classification Policy
■ Set of classification rules that have a similar discovery and classification objective
■ Many rules with multiple actions can be defined
■ The number of rules affects the run time of the classification process
■ Classification policy is data source independent → one policy can be run against
many data sources
Classification Process
■ Classification process defines a classification job
■ Classification process applies the policy to one or multiple data sources
■ Can be run ad-hoc or scheduled as part of an audit process
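The relationship between rules, a policy, and a process can be pictured with a small data model. The sketch below is a conceptual illustration only: the class names Rule, Policy, and Process, their fields, and the in-memory lists standing in for data sources are all hypothetical and do not reflect Guardium's internal design.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    pattern: str  # regular expression to search for
    action: str   # label of the action to take on a match, e.g. "Log Result"


@dataclass
class Policy:
    name: str
    rules: list   # list of Rule; the policy knows nothing about data sources


@dataclass
class Process:
    policy: Policy
    datasources: dict  # data source name -> list of string values (stand-in for real data)

    def run(self):
        """Apply every rule of the policy to every data source."""
        results = []
        for ds_name, values in self.datasources.items():
            for rule in self.policy.rules:
                for value in values:
                    if re.search(rule.pattern, value):
                        results.append((ds_name, rule.name, rule.action, value))
        return results


# One policy, reused unchanged against two hypothetical data sources.
policy = Policy("PII", [Rule("ssn", r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b", "Log Result")])
process = Process(policy, {
    "hr_db": ["123-45-6789", "n/a"],
    "crm_db": ["call back Monday", "987-65-4321"],
})
results = process.run()
for row in results:
    print(row)
```

Keeping the data sources on the process rather than on the policy mirrors the point that one policy can be run against many data sources.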
Classification Report
Classification Workflow
■ The Classification Process can be scheduled using Audit Process
– Ensures policies are based on up-to-date groups of sensitive objects
– Important if sensitive information is added or the location of sensitive objects changes
■ Results can be automatically sent out for review to verify the discovery and classification
Benefits and Use Cases
■ Group sensitive objects from multiple data sources
■ Apply security policies to groups of objects with similar properties
■ Secure information and manage risk when the sensitivity of stored information is
not known
■ Ensure compliance when it isn’t clear which information is subject to the terms of
particular regulations
Questions?
Classifier – Lab