Data Rationalization for Analysts

The document discusses data rationalization, which is the process of linking different data models and metadata to provide a holistic view of the relationships between data objects across various levels of abstraction. Data rationalization can improve data governance, analysis, and knowledge transfer by exposing the semantics and relationships embedded in an organization's models.

Uploaded by

Akshara A

Unit 3 DA

Data Rationalization / Variable Rationalization:

Data rationalization through managed metadata and data modeling can
support semantic resolution, enabling improved analysis and knowledge
transfer.

Introduction

Ontology is “the study of the categories of things that exist or may
exist in some domain”. Another definition, applicable to domains, comes
from Seth Earley: “a collection of taxonomies and thesauri”.

Across the Internet and within internal systems, ontologies are used
to improve search capabilities and make inferences for improved
human or computer reasoning. Because an ontology relates terms to one
another, the user doesn’t need to know the exact term actually stored
in the document.
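
As a minimal sketch of this idea, the example below expands a search term with related terms before matching documents. The thesaurus entries, document texts, and function names are all hypothetical, chosen only for illustration; a real ontology would hold far richer relationships.

```python
# Hypothetical thesaurus: each preferred term maps to related or synonymous terms.
THESAURUS = {
    "customer": {"client", "account holder"},
    "invoice": {"bill"},
}

# Hypothetical documents to search.
DOCUMENTS = {
    "doc1": "Each client receives a monthly statement.",
    "doc2": "The bill is issued at month end.",
}

def expand_query(term):
    """Return the term plus every term the thesaurus relates it to."""
    term = term.lower()
    related = {term}
    for preferred, synonyms in THESAURUS.items():
        if term == preferred or term in synonyms:
            related |= {preferred} | synonyms
    return related

def search(documents, term):
    """Return the names of documents containing the term or a related term."""
    terms = expand_query(term)
    return sorted(name for name, text in documents.items()
                  if any(t in text.lower() for t in terms))
```

Searching for "customer" here finds doc1 even though that document only uses the word "client".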

Data rationalization is a Managed Metadata Environment (MME) enabled
application which creates/extends an ontology for a domain into the
structured data world, based on model objects stored in various models
(of varying levels of detail, across model files and modeling tools)
and other metadata. Data models, often unknowingly, express many
aspects of ontology.

These concepts form parts of several components of enterprise data
management.

The primary reason for data modeling is to create physical data
structures, though a critical best practice for data modeling is to
follow a phased modeling approach, typically developing conceptual,
logical, and finally physical data models.

Conceptual data models are sometimes considered to be semantic
models since they are expressed in business terminology and
demonstrate how key business objects relate to each other,
independent of technology or application.

There are other types of data models (e.g. enterprise models) and
other metadata that should be linked together to provide a more
holistic view of a domain.

Unfortunately, most modeling tools are incapable of handling all of the
different levels of models effectively; it is common for more than one
modeling tool to be used in an enterprise, and multiple model files are
usually a necessity. For example, a data modeling tool might be used
for logical and physical models, while a UML class diagram might be
used for the conceptual model.

Connecting these model objects and visualizing the objects and their
relationships is called Data Rationalization, and it is typically enabled
as part of a Managed Metadata Environment (MME) by leveraging a
Metadata Repository (MDR) tool.

Data Rationalization is a form of “vertical data lineage”, as opposed
to the horizontal data lineage employed for data movement (i.e. the
Information Supply Chain).

In Data Rationalization, we’re not trying to find where an actual piece
of data came from (i.e. source to target), but rather which higher order
model objects the data was conceived from (or which objects can help to
explain it), or which downstream objects implement the higher order
model objects (see Figure 1 below).

Figure 1: Example of Data Rationalization versus Information Supply Chain

ECDM: Enterprise conceptual data model

ELDM: Enterprise logical data model

CDM: Conceptual data model

LDM: Logical data model

PDM: Physical data model

DW: Data Warehouse

OLTP: Online Transaction Processing
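
The difference between the two kinds of lineage can be sketched by tagging each link with its kind. This is an illustrative sketch only: the object names borrow the abbreviations above, and the link list and function names are assumptions rather than the format of any particular MDR tool.

```python
# Hypothetical links between objects, named with the abbreviations from Figure 1.
# kind "vertical": the target implements the higher order source object.
# kind "horizontal": data moves from the source to the target.
LINKS = [
    ("LDM.Customer", "PDM.CUSTOMER", "vertical"),
    ("PDM.CUSTOMER", "OLTP.CUSTOMER", "vertical"),
    ("OLTP.CUSTOMER", "DW.DIM_CUSTOMER", "horizontal"),
]

def conceived_from(obj):
    """Vertical lineage: higher order objects that obj directly implements."""
    return [src for src, dst, kind in LINKS if dst == obj and kind == "vertical"]

def sourced_from(obj):
    """Horizontal lineage: objects whose data flows directly into obj."""
    return [src for src, dst, kind in LINKS if dst == obj and kind == "horizontal"]
```

Asking what OLTP.CUSTOMER was conceived from returns a model object, while asking what DW.DIM_CUSTOMER is sourced from returns another data store; the two questions never mix link kinds.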


Benefits of Data Rationalization

What is the benefit of Data Rationalization? To be able to
effectively exploit, manage, reuse, and govern enterprise data
assets (including the models which describe them), it is necessary
to be able to find them.

In addition, there is (or should be) a wealth of semantics (e.g.
business names, definitions, relationships) embedded within an
organization’s models that can be exposed for improved analysis and
knowledge transfer. By linking model objects (across or within models)
it is possible to discover the higher order conceptual objects for any
given object. Conversely, it is possible to identify what implementation
artifacts implement a higher order model object.

For example, using data rationalization, one can traverse from a
conceptual model entity to a logical model entity to a physical model
table to a database table, etc. Similarly, Data Rationalization enables
understanding of a database table by traversing up through the
different model levels.
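
The traversal in both directions can be sketched with a small lookup of meta-relationships. The object names and the dictionary layout below are hypothetical, and the sketch assumes the links contain no cycles.

```python
# Hypothetical meta-relationships: higher order object -> implementing objects.
IMPLEMENTED_BY = {
    "CDM.Customer": ["LDM.Customer"],
    "LDM.Customer": ["PDM.CUSTOMER"],
    "PDM.CUSTOMER": ["DB.SALES.CUSTOMER"],
}

def rationalize_down(obj):
    """List every object that directly or indirectly implements obj."""
    found = []
    for child in IMPLEMENTED_BY.get(obj, []):
        found.append(child)
        found.extend(rationalize_down(child))
    return found

def rationalize_up(obj):
    """List every higher order object obj was derived from, nearest first."""
    found = []
    for parent, children in IMPLEMENTED_BY.items():
        if obj in children:
            found.append(parent)
            found.extend(rationalize_up(parent))
    return found
```

Starting from the conceptual entity, `rationalize_down` walks to the database table; starting from the database table, `rationalize_up` walks back to the conceptual entity.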

Data Rationalization is an enabler of effective Data Governance. It is
not possible to govern information assets without knowing the
location of the data and / or the variety of meaning given to each
object. Similarly, Data Rationalization can aid in the development
of Master Data Management (MDM) solutions. By identifying common data
entities, and how these relate to other pieces of data (again, across
many systems), MDM solutions will be better able to meet business
user needs for all the systems that require the master/reference data.

How does Data Rationalization work?

To be able to rationalize data, meta-relationships between model
objects (across model levels) must be established. Of course, doing
so does not replace normal types of relationships between model
objects in the same model. Meta-relationships can be established in
multiple ways:
1. Use automated modeling tool functionality
2. Use manual modeling tool functionality
3. Use modeling tool metadata fields
4. Use Metadata Repository (MDR) tool to manually establish links
using a GUI or other interface.
5. Use a spreadsheet

Once the meta-relationships are established, they must be imported
into the Metadata Repository (if not established using the MDR tool).
From there, analysts can search, retrieve, and visualize the metadata
to perform Data Rationalization analysis. Analysts do not need a
modeling tool license to explore the models (assuming the higher
order models can be found), nor do they need to rely on the data
modeler to obtain access or export the model metadata.
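
As a rough illustration of the spreadsheet option, the sketch below reads meta-relationships from CSV text and builds lookups in both directions, standing in for the import step. The column names `higher_object` and `lower_object`, the sample rows, and the in-memory dictionaries are assumptions; an actual MDR tool would define its own import format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical spreadsheet export; the column names are assumptions.
SHEET = """higher_object,lower_object
LDM.Customer,PDM.CUSTOMER
LDM.Order,PDM.ORDERS
LDM.Customer,PDM.CUSTOMER_HIST
"""

def load_links(csv_text):
    """Build lookups in both directions from rows of (higher, lower) pairs."""
    down = defaultdict(list)  # higher order object -> implementing objects
    up = defaultdict(list)    # implementing object -> higher order objects
    for row in csv.DictReader(io.StringIO(csv_text)):
        down[row["higher_object"]].append(row["lower_object"])
        up[row["lower_object"]].append(row["higher_object"])
    return down, up

down, up = load_links(SHEET)
```

Keeping both lookups lets an analyst rationalize upwards from a table or downwards from an entity without rescanning the sheet.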

Conclusion

A very simple example of a Data Rationalization analysis might be an
analyst wishing to understand the relationship between two tables.
Assume the analyst does not have a modeling tool license, does not
know what model to look for, or does not have access to the network
directory where the models are stored. Also, assume that foreign keys
have been disabled in the physical model (valid in some cases, e.g.
data warehousing…). Using the MDR, the analyst could search on the
table names and, when these are displayed, rationalize upwards to see
the logical entities (in a separate model file) these tables originated
from. This enables the analyst to see the relationship, to understand
its cardinality, optionality, and identification, and to review the
relationship verb phrase.
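
This scenario can be sketched as two lookups: one from each table up to its logical entity, and one for the relationship recorded between those entities. All table names, entity names, and relationship attributes below are hypothetical stand-ins for what a repository search might return.

```python
# Hypothetical repository contents; every name below is illustrative.
TABLE_TO_ENTITY = {          # physical table -> logical entity it originated from
    "CUST": "Customer",
    "ORD_HDR": "Order",
}
LOGICAL_RELATIONSHIPS = [    # relationships recorded in the logical model
    {"parent": "Customer", "child": "Order", "verb_phrase": "places",
     "cardinality": "one-to-many", "optional": False, "identifying": False},
]

def explain_tables(table_a, table_b):
    """Rationalize both tables upwards and return the logical relationships
    recorded between the two entities they originated from."""
    entities = {TABLE_TO_ENTITY[table_a], TABLE_TO_ENTITY[table_b]}
    return [rel for rel in LOGICAL_RELATIONSHIPS
            if {rel["parent"], rel["child"]} == entities]
```

Even with no foreign key between CUST and ORD_HDR, `explain_tables("CUST", "ORD_HDR")` surfaces the logical relationship along with its verb phrase and cardinality.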

Data Rationalization provides semantic understanding of the object in
question along with its higher and lower order companions, for
increased knowledge of the organization’s data and information.
