WHITE PAPER

TEST DATA MANAGEMENT IN SOFTWARE TESTING LIFE CYCLE -
BUSINESS NEED AND BENEFITS IN FUNCTIONAL, PERFORMANCE,
AND AUTOMATION TESTING

Praveen Bagare (Infosys) and Ruslan Desyatnikov (Citibank)
Abstract

The testing industry today is looking for ways and means to optimize testing effort and costs. One potential area of optimization is test data management. Testing completeness and coverage depend mainly on the quality of test data. It stands to reason that without high quality data, testing assurance is unattainable. A test plan with several comprehensive scenarios cannot be executed unless appropriate data is available to run the scenarios.

The best data is found in production since these are actual entries the application uses. While using production data, it is always prudent to create a sub-set of the data. This reduces the effort involved in test planning and execution and helps achieve optimization. However, live data is not always easily available for testing. Depending on the business, privacy and legal concerns may be associated with using live data. Often the data is not complete and therefore cannot be used for testing. It is best to avoid the use of raw production data to safeguard business and steer clear of expensive lawsuits.

The challenge of TDM lies in obtaining the right data effectively. Before proceeding on this path, we need to find answers to some pertinent questions: Will there be a positive return on the investment? Where do we start implementing Test Data Management (TDM)? Should we start with functional testing or non-functional testing? Can test automation help?

The practice of not including TDM steps in the Testing Life Cycle often leads to ignorance towards TDM on part of the testing team. This paper attempts to explain why testers in the functional, non-functional and automation test arenas need the TDM service. We also discuss the test data challenges faced by testers and describe the unparalleled benefits of a successful TDM implementation.

Significance of Test Data Management

TDM is fast gaining importance in the testing industry. Behind this increasing interest in TDM are major financial losses caused by production defects, which could have been detected by testing with the proper test data. Some years ago, test data was limited to a few rows of data in the database or a few sample input files. Since then, the testing landscape has come a long way. Now financial and banking institutions rely on powerful test data sets and unique combinations that have high coverage and drive the testing, including negative testing. TDM introduces the structured engineering approach to test data requirements of all possible business scenarios.

Large financial and banking institutions also leverage TDM for regulatory compliance. This is a critical area for these institutions due to the hefty penalties associated with non-compliance. Penalties for regulatory non-compliance can run into hundreds of thousands of dollars or more. Data masking (obfuscating) of sensitive information and synthetic data creation are some of the key TDM services that can assure compliance.

External Document © 2018 Infosys Limited


What is Test Data Management?
Test data is any information that is used
as an input to perform a test. It can
be static or transactional. Static data
containing names, countries, currencies,
etc., are not sensitive, whereas data
pertaining to Social Security Number
(SSN), credit card information or medical
history may be sensitive in nature.
In addition to the static data, testing
teams need the right combination of
transaction data sets/conditions to test
business features and scenarios.

TDM is the process of fulfilling the test
data needs of testing teams by ensuring
that test data of the right quality is
provisioned in suitable quantity, correct
format and proper environment, at the
appropriate time. It ensures that the
provisioned data includes all the major
flavors of data, is referentially intact and
is of the right size. The provisioned data
must not be too large in quantity like
production data or too small to fulfill all the
testing needs. This data can be provisioned
by either synthetic data creation or
production extraction and masking or by
sourcing from lookup tables.
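
The "production extraction and masking" route can be illustrated with a minimal sketch. It assumes a keyed hash is an acceptable masking rule: the same SSN always maps to the same masked value, so rows that share the SSN across tables stay referentially intact. The function name `mask_ssn`, the key, and the sample rows are hypothetical and are not taken from any TDM tool.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept outside the test environment.
MASKING_KEY = b"example-masking-key"


def mask_ssn(ssn: str, key: bytes = MASKING_KEY) -> str:
    """Return a deterministic stand-in that keeps the NNN-NN-NNNN layout."""
    digest = hmac.new(key, ssn.encode("utf-8"), hashlib.sha256).hexdigest()
    # Reduce the hex digest to nine decimal digits.
    digits = f"{int(digest, 16) % 10**9:09d}"
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


# Illustrative rows from two tables that share the SSN as a key.
customers = [{"ssn": "123-45-6789", "country": "US"}]
accounts = [{"ssn": "123-45-6789", "currency": "USD"}]

masked_customers = [{**row, "ssn": mask_ssn(row["ssn"])} for row in customers]
masked_accounts = [{**row, "ssn": mask_ssn(row["ssn"])} for row in accounts]
```

Because the masking is deterministic, a join between the masked customer and account rows still matches, while non-sensitive static fields such as country and currency pass through unchanged.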

TDM can be implemented efficiently with
the aid of well-defined processes, manual
methods and proprietary utilities. It can
also be put into practice using well-evolved
TDM tools such as Datamaker, Optim or
others available in the market.

A TDM strategy can be built based on the
type of data requirements in the project.
This strategy can be in the form of:

• Construction of SQL queries that extract data from multiple tables in the databases

• Creation of flat files based on:
    - Mapping rules
    - Simple modification or desensitizing of production data or files

• An intelligent combination of all of these
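
The strategy forms listed above can be combined in a few lines. The sketch below is only an illustration under stated assumptions: the `clients` and `accounts` tables, their columns, the sample rows and the `desensitize` helper are all hypothetical. It runs a SQL query across two tables, then writes a flat file whose column order acts as the mapping rule and whose client names are desensitized.

```python
import csv
import io
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE accounts (
        account_no TEXT PRIMARY KEY,
        client_id INTEGER REFERENCES clients(client_id),
        balance REAL
    );
    INSERT INTO clients VALUES (1, 'Alice Example', 'NAM'), (2, 'Bob Example', 'ASIA');
    INSERT INTO accounts VALUES ('NAM-001', 1, 2500.0), ('ASIA-001', 2, 300.0);
    """
)

# SQL query that pulls test data from multiple tables.
rows = conn.execute(
    """
    SELECT a.account_no, c.name, c.region, a.balance
    FROM accounts AS a
    JOIN clients AS c ON c.client_id = a.client_id
    WHERE a.balance > ?
    ORDER BY a.account_no
    """,
    (1000,),
).fetchall()


def desensitize(name: str) -> str:
    """Simple modification: keep the first letter and hide the rest."""
    return name[:1] + "*" * (len(name) - 1)


# Flat file built with a mapping rule (column order) plus desensitized names.
flat_file = io.StringIO()
writer = csv.writer(flat_file, lineterminator="\n")
writer.writerow(["account_no", "client_name", "region", "balance"])
for account_no, name, region, balance in rows:
    writer.writerow([account_no, desensitize(name), region, balance])

print(flat_file.getvalue())
```

An in-memory `io.StringIO` stands in for the flat file so the sketch has no side effects; writing to a real path would only change the file handle.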



The Challenges of Test Data Sourcing

Some of the most common challenges faced by testing teams while sourcing test data are:

• Test data coverage is often incomplete and the team may not have the required knowledge.

• Clear data requirements with volume specifications are often not gathered and documented during the test requirements phase.

• Testing teams may not have access to the data sources (upstream and downstream).

• Data is generally requested from the development team which is slow to respond due to other priority tasks.

• Data is usually available in large chunks from production dumps and can be sensitive in nature, have limited coverage or may be unsuitable for the business scenarios to be tested.

• Large volumes of data may be needed in a short span of time and appropriate tools may not be at the testing team’s disposal.

• Same data may be used by multiple testing teams, in the same environment, resulting in data corruption.

• Review and reuse of data is rarely realized and leveraged.

• Testers may not have the knowledge of alternate data creation solutions using a TDM tool.

• Logical data relationships may be hidden at the code level and hence testers may not extract or mask all the referential data.

• Data dependencies or combinations to test certain business scenarios can add to the difficulties in sourcing test data.

• Testers often spend a significant amount of time communicating with architects, database administrators, and business analysts to gather test data instead of focusing on the actual testing and validation work.

• A large amount of time is spent in gathering test data.

• Most of the data creation happens during the course of execution based on learning.

• If the data related to defects is not found during testing, it can cause a major risk to production.



TDM Offers Efficient Solutions and Valuable Benefits

An effective TDM implementation can address most of the challenges mentioned above. Some of the key benefits that a business can gain by leveraging the TDM services are:

Superior quality

• Optimal data coverage is achieved by the TDM team through intelligent tools and techniques based on data analysis strategies

• Test data requirements from the TDM team enable the testing team to capture these effectively during the test planning phase. Version-controlled data requirements and test data ensure complete traceability and easier replication of results

• Detailed analysis and review of data requirements ensure early identification of issues and resolution of queries

• Synthetic data can be created from the ground up for new applications

• Errors and data corruption can be reduced by including defined TDM processes in the Testing Life Cycle and by adopting TDM tools

• Clear data security policies increase data safety and recoverability

• Well-defined process and controls for data storage, archival and retrieval support future testing requirements

Minimum time

• The TDM service employs a dedicated data provisioning team with agreed service-level agreements (SLAs) ensuring prompt data delivery

• Compact test design and execution cycles can be achieved for reduced time to market

• Automated processes lead to less rework and reduced result replication time

• TDM tools such as Datamaker can speed up scenario identification and creation of the corresponding data sets

Reduced cost

• Condensed test design and data preparation effort helps achieve cost savings

• Minimized test data storage space leads to reduction of overall infrastructure cost

Less resources

• Database or file access provided to the TDM team facilitates data privacy and reuse

• Professionals with specialized skills, sharp focus on Test Data and access to industry standard tools contribute to the success of TDM

• The TDM team also wears the system architect’s hat, thus understanding data flow across systems and provisioning the right data



TDM in Functional Testing
Most challenges mentioned earlier are part of the day-to-day struggle of functional
testing. The most commonly seen challenges are: low coverage, high dependency, access
limitations, oversized testing environment, and extended test data sourcing timelines.
Successful implementation of TDM in functional testing projects can alleviate most of
these issues and assure the completeness of testing from the business perspective.

In functional testing, TDM is governed by numerous factors highlighted below.

Coverage: Exposure to all the possible scenarios or test cases needs to be the key driving factor for data provisioning in functional testing. The data provisioned for this testing must cover:

• Positive scenarios (valid values that should make the test case pass)

• Negative scenarios (invalid values that should result in appropriate error handling)

• Boundary conditions (data values at the extremities of the possible values)

• All functional flows defined in the requirement (data for each flow)

Low volumes: Single data sets for each of the scenarios are sufficient for the need. Repetitive test data for similar test cases may not be required and can prove to be a waste of time. This can help reduce the execution time significantly.

High reuse: Some test data such as accounts, client IDs, country codes, etc., can be reused across test cases to keep the test data pool optimized. Static and basic transaction data for an application can be base-lined so as to be restored or retrieved for maintenance release testing at regular intervals, depending on the release frequency.

Tools or utilities: As utilities have the capacity to create large volumes of the same type of data, they may not be very useful in functional data preparation. If the tool is capable of creating a spectrum of data to meet all the data requirements and the data can be reused across releases, then it is beneficial to use the utility or tool for TDM implementation.

Optimal Environment Size

Data in the testing environment should be a smart subset (a partial extract of the source data based on filtering criteria) of production or synthetic data. It should comprise data for all testing needs and be of minimal size to avoid hardware and software cost overhead.

Sub-setting referentially intact data in the right volume into testing environments is the solution to overcome the problem of oversized, inefficient and expensive testing environments. This is the key to reduce the test execution window and minimize data-related defects with the same root cause.

Data Requirements Gathering Process

During test case scripting, the test data requirements at the test-case level should be documented and marked as reusable or non-reusable. This ensures that the test data required for testing is clear and well documented. A simple summation of the same type of test data provides the requirement of test data that needs to be provisioned. A sample test data requirement gathering, as per this method, has been demonstrated in Appendix I.

Feasibility Check

TDM is suitable for functional testing projects that:

• Spend over 15% of the testing effort on data preparation or data rework

• Use regression test cases which are run repeatedly across releases (so that the test data methodology identified can be reused)

• Indicate that a high data coverage TDM solution can be identified and implemented
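
The idea of a smart, referentially intact subset described under Optimal Environment Size can be sketched as follows. This is a minimal illustration, not a TDM tool's method: the schema, the sample rows and the function `subset_by_region` are hypothetical. Parent rows are selected by a filtering criterion, and only the child rows that reference those parents are extracted with them.

```python
import sqlite3

# Hypothetical "production" source with a parent-child relationship.
source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE clients (client_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE accounts (
        account_no TEXT PRIMARY KEY,
        client_id INTEGER REFERENCES clients(client_id),
        balance REAL
    );
    INSERT INTO clients VALUES (1, 'NAM'), (2, 'ASIA'), (3, 'NAM');
    INSERT INTO accounts VALUES
        ('A-1', 1, 1500.0), ('A-2', 1, 50.0), ('A-3', 2, 900.0), ('A-4', 3, 120000.0);
    """
)


def subset_by_region(conn: sqlite3.Connection, region: str):
    """Extract parents matching the filter, then only the children that reference them."""
    clients = conn.execute(
        "SELECT client_id, region FROM clients WHERE region = ? ORDER BY client_id",
        (region,),
    ).fetchall()
    accounts = conn.execute(
        """
        SELECT account_no, client_id, balance FROM accounts
        WHERE client_id IN (SELECT client_id FROM clients WHERE region = ?)
        ORDER BY account_no
        """,
        (region,),
    ).fetchall()
    return clients, accounts


clients, accounts = subset_by_region(source, "NAM")

# Every extracted account must point at an extracted client.
client_ids = {client_id for client_id, _ in clients}
orphans = [row for row in accounts if row[1] not in client_ids]
print(len(clients), len(accounts), len(orphans))
```

Checking for orphaned child rows after extraction is one simple way to confirm that the subset kept its referential integrity before it is loaded into a testing environment.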



TDM in Performance Testing
Test data preparation in performance testing is most impacted by issues like large volume requirements with significant coverage, high data preparation time, limited environment availability and short execution windows. The TDM team with their tools and techniques can provide solutions for smart bulk data generation with quick refresh cycles ensuring on-time high quality data provisioning to address these steep demands.

Test data for performance testing is characterized by:

High volumes of data: Performance testing always needs large volumes of test data by virtue of the way it works. Multiple users are simultaneously loaded onto an application to run a flow of test executions. Hence multiple data sets are required in parallel, in large numbers, based on the workload model.

Quick consumption of test data: Since the load or stress on the application is induced by multiple users, the data provisioned for them is consumed rapidly. This leads to quick exhaustion of large volumes of the test data.

Very short data provisioning cycle: Since the data is consumed very quickly in performance testing, a new cycle of execution requires the data to be replenished before start. This implies that the TDM strategy must make sure that the test data can be recreated, extracted again or provisioned in a short period of time so as to avoid the impact of the unavailability of data on the testing cycle.

Workload distribution: Performance testing in today’s time does not use one type of data repeatedly. If the use cases comprise multiple types of data, with a corresponding percentage of each occurrence, a similar workload model is built for the testing environment. Thus data needs to be provided for such a workload model which covers the multiple types. This multiplies the complexity of data creation.

Feasibility Check

TDM is definitely suitable for most of the performance testing projects. The strategy needs to be well designed to make sure that the challenges listed above are addressed appropriately and a satisfactory return on investment (ROI) is achieved soon.
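
The workload distribution point can be made concrete with a small calculation. The sketch below assumes the workload model is expressed as a percentage share per data type; the function `rows_per_type`, the transaction names and the volumes are hypothetical. It splits a total data volume across the types so that the provisioned data mirrors the modelled mix.

```python
def rows_per_type(total_rows: int, mix: dict[str, float]) -> dict[str, int]:
    """Split a total volume across data types according to percentage shares."""
    if abs(sum(mix.values()) - 100.0) > 1e-9:
        raise ValueError("workload percentages must add up to 100")
    counts = {name: int(total_rows * share // 100) for name, share in mix.items()}
    # Hand out rows lost to rounding down, largest share first.
    leftover = total_rows - sum(counts.values())
    for name in sorted(mix, key=mix.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical workload model: share of each transaction type in percent.
workload_model = {"balance_enquiry": 60.0, "funds_transfer": 30.0, "new_account": 10.0}
plan = rows_per_type(10_000, workload_model)
print(plan)
```

Since performance runs consume data quickly, the same calculation would be repeated for every refresh cycle, which is one reason the data creation effort multiplies with the number of data types.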



TDM in Automation Testing
Some of the biggest challenges in
automation testing are associated
with creating test data. Included in this
challenge is the inability to create data
quickly from the front-end, fast burning up
of large volumes of data during test runs,
limited access to dynamic data and partial
availability of the environment.

Implementing a well-designed TDM strategy
can support multiple iterations of dynamic
data in short intervals of time by synthetically
creating or extracting data, using TDM tools.

Test data for automation testing is driven
by factors such as:

Automation of test data creation: The
data required for automation testing
is usually created by using one of the
automated processes, either from user
interface (UI) front-end or via create or edit
data operations in the database. These
methods are time-consuming and may
require the automation team to acquire
application as well as domain knowledge.

Fast consumption of test data: Similar to
performance testing, automation testing
also consumes data at a very quick pace.
Hence the data provisioning strategy must
accommodate fast data creation with a
relatively short life cycle.

High coverage: As with functional testing,
automation testing also requires test data
for each of the automated scenarios. The
data requirement may be restricted to the
regression test pack and yet cover a large
spectrum of data.
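
Fast consumption with a short data life cycle can be pictured as a pool of single-use records. The sketch below is an assumption about how an automation suite might track its provisioned data, not a feature of any particular tool; the class `DataPool`, its threshold and the client IDs are hypothetical.

```python
from collections import deque


class DataPool:
    """Hands out each record once and reports when the pool needs a refill."""

    def __init__(self, records, low_water_mark=2):
        self._records = deque(records)
        self._low_water_mark = low_water_mark

    def take(self):
        """Remove and return the next unused record."""
        if not self._records:
            raise LookupError("test data pool exhausted; provision more data")
        return self._records.popleft()

    def needs_refill(self):
        """True once the remaining records drop to the threshold or below."""
        return len(self._records) <= self._low_water_mark

    def refill(self, records):
        """Add freshly provisioned records to the end of the pool."""
        self._records.extend(records)


# Hypothetical client IDs provisioned before the automated run begins.
pool = DataPool([f"CLIENT-{n:03d}" for n in range(1, 6)])
first = pool.take()
second = pool.take()
third = pool.take()
```

After three of the five records are consumed, `needs_refill()` reports that the pool has reached its threshold, which is the point at which a provisioning step would create or extract the next batch.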

Feasibility Check
TDM as a process implementation is certainly
recommended for an automation project.
The automation team must provide the
data requirement in clear terms and
the TDM team can ensure provisioning
of this data before the test run begins.
Automation tools have abilities to create,
mock and edit data but the TDM team’s
expertise and tools can add significant
value to the proposition.



Conclusion

In functional testing, increasing data coverage plays a significant role in providing the TDM value-add. The sheer volume of test data that is repeatedly used in the regression suites makes it an important focal area from the ROI viewpoint. The right TDM tools can help provision a spectrum of data and ensure continuous ROI in each cycle.

TDM implementation in performance testing projects can deliver quick benefits and the improvement can be significantly highlighted as large volumes of similar data can be created swiftly and efficiently.

Automation testing can benefit from TDM implementation. Tools such as Quick Test Professional (QTP) can create data via user interface but need significant functional knowledge and are slow in nature. TDM solutions can save time and cost by keeping the data ready. Robust data creation methods and tools can be used to achieve these goals.

Based on our learning, interactions and experience gleaned from the TDM world, we can confidently affirm that functional, automation and performance testing can leverage TDM implementation to overcome their respective challenges and achieve optimization. Each type of testing presents its own unique challenges and benefits, but there is a common theme: TDM is a major enhancement and addition to the tools and techniques available to the testing team. This practice can help realize gains at the bottom-line with cost reductions, improved turnaround time and fewer data related defects in test and in production.



We have identified the following set
of metrics to be captured before and
after implementing TDM practices to
measure the ROI and benefits:

• Test data coverage

• Percentage of test data related defects

• Test data management effort (percentage and time in hours)
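
A before-and-after comparison of these metrics could be recorded as in the sketch below. The paper does not define formulas, so the ratios used here are assumptions: coverage as scenarios with suitable data over all scenarios, data related defects as a share of all defects, and TDM effort as a share of total testing effort. The class `TdmSnapshot`, its field names and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TdmSnapshot:
    """Measurements for one test cycle; all field names are illustrative."""

    scenarios_total: int
    scenarios_with_data: int
    defects_total: int
    defects_data_related: int
    data_effort_hours: float
    total_effort_hours: float

    def coverage_pct(self) -> float:
        """Assumed definition: scenarios with suitable data / all scenarios."""
        return 100.0 * self.scenarios_with_data / self.scenarios_total

    def data_defect_pct(self) -> float:
        """Assumed definition: test data related defects / all defects."""
        return 100.0 * self.defects_data_related / self.defects_total

    def data_effort_pct(self) -> float:
        """Assumed definition: test data effort / total testing effort."""
        return 100.0 * self.data_effort_hours / self.total_effort_hours


# Made-up numbers for illustration only, not results from the paper.
before = TdmSnapshot(200, 140, 50, 20, 120.0, 600.0)
after = TdmSnapshot(200, 190, 40, 6, 60.0, 500.0)

for label, snap in (("before", before), ("after", after)):
    print(label, snap.coverage_pct(), snap.data_defect_pct(), snap.data_effort_pct())
```

Capturing the same fields before and after the TDM rollout keeps the comparison like for like, whatever exact definitions a project chooses.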

We wish you success in your TDM
implementation.



Appendix I
An illustration of ‘Sample Test Data Requirement Gathering’

The last three columns depicted in the table below should be a part of the test case documentation and must be updated during test case
authoring.

No.  Test Case     Test Data Requirement      Reusable?  Remarks
1    Test Case 01  NAM Bank Account Number    Y          with balance > $1,000
2    Test Case 02  Client ID                  N          -
3    Test Case 03  NAM Bank Account Number    Y          with balance > $100,000
4    Test Case 04  ASIA Bank Account Number   Y          Account open date > 01 Jan 2013
5    Test Case 05  ASIA Bank Account Number   Y          -

Data requirement summary for the above illustration:

• Two ‘NAM Bank Account Number’ ‘with balance > $100,000’

• Two ‘ASIA Bank Account Number’

• One client ID
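
The "simple summation of the same type of test data" behind this summary can be reproduced with a short sketch. The rows are copied from the table above; the tuple layout and the variable names are illustrative choices.

```python
from collections import Counter

# Rows from the Appendix I table: (test case, test data requirement, reusable, remarks).
requirements = [
    ("Test Case 01", "NAM Bank Account Number", "Y", "with balance > $1,000"),
    ("Test Case 02", "Client ID", "N", "-"),
    ("Test Case 03", "NAM Bank Account Number", "Y", "with balance > $100,000"),
    ("Test Case 04", "ASIA Bank Account Number", "Y", "Account open date > 01 Jan 2013"),
    ("Test Case 05", "ASIA Bank Account Number", "Y", "-"),
]

# Count how many records of each data type the test cases ask for.
summary = Counter(data_type for _, data_type, _, _ in requirements)
for data_type, count in summary.items():
    print(f"{count} x {data_type}")
```

The count only groups by data type. Consolidating the remarks, as the summary does when it lists both NAM accounts with the larger balance condition, is left as a manual review step in this sketch.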



For more information, contact askus@[Link]

© 2018 Infosys Limited, Bengaluru, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys
acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this
documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the
prior permission of Infosys Limited and/ or any named intellectual property rights holders under this document.
