
SOFTWARE TESTING

Unit I:
Introduction: Purpose – Productivity and Quality in Software – Testing vs. Debugging – Model for
Testing – Bugs – Types of Bugs – Testing and Design Style.

Unit II:

Flow Graphs and Path Testing – Achievable Paths – Path Instrumentation and its Application –
Transaction Flow Testing Techniques.

Unit III:

Data Flow Testing Strategies – Domain Testing: Domains and Paths – Domains and Interface
Testing.

Unit IV:
Linguistic Metrics – Structural Metrics – Path Products and Path Expressions. Syntax Testing –
Formats – Test Cases.

Unit V:
Logic-Based Testing – Decision Tables – Transition Testing – States, State Graphs, State Testing.

Text Book:

1. B. Beizer, "Software Testing Techniques", II Edn., DreamTech India, New Delhi, 2003.

2. K. V. K. Prasad, "Software Testing Tools", DreamTech India, New Delhi, 2005.

Reference Book:
1. I. Burnstein, 2003, "Practical Software Testing", Springer International Edn.

2. E. Kit, 1995, "Software Testing in the Real World: Improving the Process", Pearson Education, Delhi.

3. R. Rajani and P. P. Oak, 2004, "Software Testing", Tata McGraw Hill, New Delhi.

Web Resources:

1. https://www.javatpoint.com/software-testing-tutorial

2. https://www.guru99.com/software-testing.html
UNIT – I
What is Software Testing?
Software testing is the process of evaluating a system with the intent of finding bugs. It is
performed to check whether the system satisfies its specified requirements. Testing measures the overall
quality of the system in terms of its correctness, completeness, usability, performance, and other
functional and non-functional attributes.

Attributes of Software testing


Testability
Improved testability can make the testing process more efficient and cost-effective.
Communication
Software testers should have strong verbal and written communication skills to
communicate with programmers, test managers, and customers.
Portability
Software portability should be considered during the development phase. There should be a
clear abstraction between business logic and system interfaces.
Usability
How easily users can interact with the software to accomplish their tasks.
Maintainability
How efficiently the software can be modified to improve it or adapt it to changes in the
environment.
Security
The system's ability to prevent unauthorized access and data breaches.

Efficiency
An inefficient application might consume excessive resources that slow down other applications
or the system.
Reliability
Users should be able to access the system at any time.
Performance
The system should be able to withstand the expected number of active users.

Why is Testing required?


 Software testing, as a separate activity in the SDLC, is required because:
 Testing provides assurance to the stakeholders that the product works as intended.
 Avoidable defects leaked to the end user/customer due to inadequate testing give the
development company a bad reputation.
 A separate testing phase adds a confidence factor for the stakeholders regarding the quality
of the software developed.
 Defects detected in an earlier phase of the SDLC cost less and consume fewer resources
to correct.
 It saves development time by detecting issues in an earlier phase of development.
 The testing team adds another dimension to software development by providing
a different viewpoint on the product development process.

Who does Testing?


Software testing is/can be done by all technical and non-technical people associated with
the software. Testing in its various phases is done by:
 Developer - Developers do the unit testing of the software and ensure that the
individual methods work correctly.
 Tester - Testers are the face of software testing. A tester verifies the functionality and
usability of the application as a functional tester, checks the performance of the
application as a performance tester, and automates the manual functional test cases
and creates test scripts as an automation tester.
 Test Managers/Leads/Architects - Define the test strategy and test plan.
 End users - A group of end users do the User Acceptance Testing (UAT) of
the application to make sure the software can work in the real world.

When do we start Software Testing?


Based on the Software Development Life Cycle model selected for the software
project, the testing phase starts at different points. There is a software myth that testing
is done only once some part of the software is built, but testing can (and should) be started even before
a single line of code is written. It can be done in parallel with the development phase, e.g., in the case of
the V-Model:

How is Software Testing done?


Software testing can be done both manually and using automation tools. Manual
effort includes verification of requirements and design; development of the test strategy and plan;
preparation of test cases; and then the execution of tests. Automation effort includes preparation
of test scripts for UI automation, back-end automation, and performance testing, and the use
of other automation tools.

Why is testing required?


Let's now briefly see why we need testing in the software context:
 Testing is important as it uncovers defects before delivery to the customer,
ensuring the quality of the software.
 Defects or bugs can be identified in the early stages of development; the later the
stage in which a bug is identified, the higher the cost to rectify it.
 It makes the software more reliable and user-friendly to operate.
 Untested software is not only error-prone, it can also cost the customer
business failure, as in the case of Microsoft's MP3 player.
 Software issues can cost lives too, e.g., in the case of the Therac-25: many people died
due to concurrent programming errors wherein patients were given radiation doses
that were hundreds of times greater than normal, resulting in death or serious
injury.
 Well-tested software provides efficient resource utilization, resulting in low cost.
 Thoroughly tested software ensures reliable and high-performance functioning of
the software.
What is Software Quality?
 Software quality is the conformance of a software system to its requirements. From a
software perspective, quality is defined by two factors: verification and
validation. Verification checks that the work products of development comply with the
specified requirements and standards (a static evaluation), whereas validation evaluates
the product itself, by running it, to check that it meets the user's actual needs.

Verification and Validation


 Software testing is basically the sum total of two activities: verification
and validation.
 Verification is the process of evaluating the artifacts of software development in
order to ensure that the product being developed will comply with the standards. It
is a static process of analyzing the documents, not the actual end product.
 Validation is the process of confirming that the developed software product conforms to the
specified business requirements. It involves dynamic testing of the software product
by running it.
 The difference between the two is:

Example
Let's say we are writing a program for addition:
a + b = c
Verification:
Are we getting some output for a + b? (Even 1 + 1 = 6 counts as output.)
Validation:
Are we getting the correct output for a + b? (1 + 1 = 2?)
Verification basically asks whether the program is correct. To use a simple example, a = x
+ y is correct if x = 1 and y = 2 yields a = 3.
Validation asks whether the correct program was produced. For example, suppose the requirement was
to calculate the area of a rectangle with length x and width y. If x = 1 and y = 2, the result should be 2.
The first program is correct but not valid given the requirement: it is not the right
program.
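
A minimal sketch of this distinction in code; the function and values are illustrative, not from the notes:

    def compute(x, y):
        # Requirement: area of a rectangle with sides x and y (x * y).
        # Implemented, as a correct program but the wrong one, as addition.
        return x + y

    # Verification: is the program we wrote internally correct?
    print(compute(1, 2) == 3)      # True  -> addition is implemented correctly

    # Validation: did we build the right program for the requirement?
    print(compute(1, 2) == 1 * 2)  # False -> the area requirement is not met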

Purpose of Software Testing


 Defects can be identified early: Software testing is important because if there are any
bugs they can be identified early and can be fixed before the delivery of the software.
 Improves quality of software: Software Testing uncovers the defects in the software,
and fixing them improves the quality of the software.
 Increased customer satisfaction: Software testing ensures reliability, security, and high
performance which results in saving time, costs, and customer satisfaction.
 Helps with scalability: Non-functional testing helps to identify scalability issues and the
point at which an application might stop working.


 Saves time and money: After the application is launched, it is very difficult to trace
and resolve issues, and doing so incurs more cost and time. Thus, it
is better to conduct software testing at regular intervals during software development.
Different Types Of Software Testing
Software Testing can be broadly classified into 3 types:
1. Functional testing : It is a type of software testing that validates the software systems
against the functional requirements. It is performed to check whether the application is
working as per the software’s functional requirements or not. Various types of functional
testing are Unit testing, Integration testing, System testing, Smoke testing, and so on.
2. Non-functional testing : It is a type of software testing that checks the application for
non-functional requirements like performance, scalability, portability, stress, etc. Various
types of non-functional testing are Performance testing, Stress testing, Usability Testing,
and so on.
3. Maintenance testing : It is the testing performed when the software is changed, modified,
or updated to keep up with the customer's needs. It involves regression testing, which verifies
that recent changes to the code have not adversely affected other previously working
parts of the software.
Apart from the above classification, software testing can be further divided into 2 more
ways of testing:
1. Manual testing : It includes testing software manually, i.e., without using any
automation tool or script. In this type, the tester takes over the role of an end-user and
tests the software to identify any unexpected behavior or bug. There are different stages
for manual testing such as unit testing, integration testing, system testing, and user
acceptance testing. Testers use test plans, test cases, or test scenarios to test software to
ensure the completeness of testing. Manual testing also includes exploratory testing, as
testers explore the software to identify errors in it.
2. Automation testing : Also known as test automation, this is when the tester writes
scripts and uses other software to test the product. This process involves the
automation of a manual process. Automation testing is used to quickly and repeatedly
re-run the test scenarios that were performed manually in manual testing.
Apart from Regression testing , Automation testing is also used to test the application
from a load, performance, and stress point of view. It increases the test coverage,
improves accuracy, and saves time and money when compared to manual testing.
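
As an illustrative sketch (not from the notes), a manual check such as "login rejects empty credentials" can be automated with Python's built-in unittest module, so the same scenario can be re-run quickly and repeatedly; the login function here is hypothetical:

    import unittest

    def login(username, password):
        # Hypothetical unit under test: accepts only non-empty credentials.
        return bool(username) and bool(password)

    class LoginRegressionTests(unittest.TestCase):
        def test_rejects_empty_password(self):
            self.assertFalse(login("alice", ""))

        def test_accepts_valid_credentials(self):
            self.assertTrue(login("alice", "s3cret"))

    if __name__ == "__main__":
        unittest.main()  # one command re-runs the whole suite every time
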
Different Types of Software Testing Techniques
Software testing techniques can be majorly classified into the following categories:
1. Black box Testing : Testing in which the tester doesn't have access to the source code of
the software, and which is conducted at the software interface without any concern for the
internal logical structure of the software, is known as black-box testing.
2. White box Testing : Testing in which the tester is aware of the internal workings of the
product, has access to its source code, and which is conducted by making sure that all internal
operations are performed according to the specifications, is known as white-box testing.
3. Grey Box Testing : Testing in which the testers have some knowledge of the
implementation, without needing to be experts in it.

S No. | Black Box Testing | White Box Testing
1 | Internal workings of the application are not required. | Knowledge of the internal workings is a must.
2 | Also known as closed-box/data-driven testing. | Also known as clear-box/structural testing.
3 | Done by end users, testers, and developers. | Normally done by testers and developers.
4 | Testing can only be done by a trial-and-error method. | Data domains and internal boundaries can be better tested.
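
A small illustration of the difference, using a hypothetical function: the black-box test is derived from the specification alone, while the white-box test is written with the source in view so that the n < 0 branch is covered:

    def absolute(n):
        # Unit under test: two internal branches.
        if n < 0:
            return -n
        return n

    # Black-box: based only on the specification ("returns the magnitude").
    assert absolute(5) == 5

    # White-box: written knowing the code, to exercise the n < 0 branch.
    assert absolute(-5) == 5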

Different Levels of Software Testing


Software level testing can be majorly classified into 4 levels:
1. Unit testing : It is a level of the software testing process where individual units/components
of a software/system are tested. The purpose is to validate that each unit of the software
performs as designed. (A minimal sketch of unit and integration tests follows this list.)
2. Integration testing : It is a level of the software testing process where individual units
are combined and tested as a group. The purpose of this level of testing is to expose faults
in the interaction between integrated units.
3. System testing : It is a level of the software testing process where a complete, integrated
system/software is tested. The purpose of this test is to evaluate the system’s compliance
with the specified requirements.
4. Acceptance testing : It is a level of the software testing process where a system is tested
for acceptability. The purpose of this test is to evaluate the system’s compliance with the
business requirements and assess whether it is acceptable for delivery.
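
A minimal sketch, using hypothetical functions, of how the first two levels differ: unit tests exercise each function in isolation, while an integration test checks their interaction:

    # Unit level: each component tested in isolation.
    def parse_amount(text):
        return float(text.strip())

    def to_cents(amount):
        return round(amount * 100)

    assert parse_amount(" 19.99 ") == 19.99       # unit test for parse_amount
    assert to_cents(19.99) == 1999                # unit test for to_cents

    # Integration level: the two units combined and tested as a group.
    assert to_cents(parse_amount(" 19.99 ")) == 1999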

Benefits of Software Testing


 Product quality: Testing ensures the delivery of a high-quality product as the errors are
discovered and fixed early in the development cycle.
 Customer satisfaction: Software testing aims to detect errors or vulnerabilities in the
software early in the development phase so that the detected bugs can be fixed before
delivery of the product. Usability testing is a type of software testing that checks
how easily users can use the application.


 Cost-effective: Testing any project on time helps to save money and time for the long
term. If the bugs are caught in the early phases of software testing, it costs less to fix
those errors.
 Security: Security testing is a type of software testing that is focused on testing the
application for security vulnerabilities from internal or external sources.
What is Testing?
Testing is the process of verifying and validating that a software or application is bug-
free, meets the technical requirements as guided by its design and development, and
meets the user requirements effectively and efficiently by handling all the exceptional
and boundary cases. The purpose of software testing is to identify the errors, faults, or
missing requirements in contrast to actual requirements.
Types of Testing
Following are the types of software testing:
1. Manual Testing
 White Box Testing
 Black Box Testing
 Gray Box Testing
2. Automated Testing
3. Functional Testing
 Unit Testing
 Integration Testing
 System Testing
4. Non-Functional Testing
 Performance Testing
 Usability Testing
 Compatibility Testing
What is Debugging?
Debugging is the process of fixing a bug in the software. It can be defined as identifying,
analyzing, and removing errors. This activity begins after the software fails to execute
properly and concludes by solving the problem and successfully testing the software. It is
considered to be an extremely complex and tedious task because errors need to be
resolved at all stages of debugging.

Aspects | Testing | Debugging
Definition | Testing is the process to find bugs and errors. | Debugging is the process of correcting the bugs found during testing.


Disadvantages of the Iterative model:


 Requires careful planning and coordination: Iterative projects require careful planning
and coordination to ensure effective iteration cycles and clear communication among
stakeholders.
 May require cultural change: Shifting to an iterative mindset may involve
organizational cultural changes.
 Potential for scope creep: The Iterative model’s flexibility can lead to scope creep if not
managed effectively.
 May not be suitable for all projects: The iterative model may not be ideal for projects
with highly rigid requirements or those requiring extensive upfront planning.

BUGS
A bug is a malfunction in the software/system: an error that may cause a component or the
system to fail to perform its required functions. In other words, an error encountered
during testing can cause a malfunction, for example an incorrect data description,
statement, input data, or design.
There are many different types of software testing, each with specific goals and
strategies. Some of them are below:
1. Acceptance Testing: Ensuring that the whole system works as intended.
2. Integration Testing: Ensuring that software components or functions work together.
3. Unit Testing: To ensure that each software unit operates as expected. The unit is a
testable component of the application.
4. Functional Testing: Evaluating functions by imitating business conditions, based on
functional requirements. Black-box testing is a common way to confirm functions.
5. Performance Testing: A test of how the software works under various operating loads.
Load testing, for example, is used to assess performance under real-life load conditions.
6. Re-testing (Regression Testing): To test whether changes have broken or degraded existing
features. Sanity checks can be used to verify menus, functions, and commands at the highest
level when there is no time for a full regression test.
Reasons Why Bugs Occur?
The Reasons Why Bugs Occur are as follows:



1. Lack of Communication
 This is a key factor contributing to the introduction of software bugs.
 A lack of clarity in communication can lead to misunderstandings of what the
software should or should not do. In many cases, the customer may not fully understand
how the product should ultimately work.
 This is especially true if the software is designed for a completely new product. Such
situations often lead to many misinterpretations on both sides.
2. Constantly Changing Requirements
 Constantly changing software requirements create confusion and pressure in both the
software development and testing teams.
 Usually, adding a new feature or deleting an existing feature affects other
modules or software components; overlooking such linkages introduces software
defects.
3. Lack of a Defined Process Framework
 A bug fixed in one software component may reappear in a different or
similar component.
 Lack of foresight can cause serious problems and increase the number of defects.
 This is one of the biggest problems, because defects accumulate while engineers are
often under pressure from timelines, constantly changing requirements, a growing
defect count, etc.
 In addition, design and redesign, UI integration, module integration, and database
management all add to the complexity of the software and the system as a whole.
4. Programming Errors
 Significant problems with software design and architecture can cause problems for
systems, and programmers can also make mistakes.
 From a tester's perspective, these appear as data/declaration reference errors,
control-flow errors, parameter errors, input/output errors, etc.
5. Lots of Rework
 Reallocating resources, redoing or discarding finished work, and changes in hardware/software
requirements may also affect the software.
 Assigning a new developer to a project midway can introduce defects. This can happen if
proper coding standards are not followed, coding is incorrect, data transfer is
inaccurate, etc.
 Discarding part of the existing code may leave traces in other parts of the software;
ignoring or deleting that code may introduce defects.
 In addition, critical bugs can occur, especially in large projects, because it becomes
difficult to pinpoint the location of a problem.
Life Cycle of a Bug in Software Testing
Below are the steps in the lifecycle of the bug in software testing:

Diagram of Bug Life Cycle


1. Open: The developer begins analyzing the bug here and, where possible, works to fix it. If
the developer thinks the bug is not genuine, it may be moved to one of four other states:
Rejected, Duplicate, Deferred, or Not a Bug.
2. New: This is the first state of a defect in the bug life cycle. When a new defect is found,
it enters the "New" state; validation and testing are performed on it in the later stages of
the life cycle.
3. Assigned: At this stage, the newly reported bug is handed to the development team and
assigned to a developer by the project lead or team manager.
4. Pending Retest: After fixing the error, the developer passes the fix to the tester for
re-checking, and the status remains "pending retest" until the tester works on it.
5. Fixed: If the developer completes the debugging task by making the necessary changes,
the status is marked "Fixed".
6. Verified: If the tester finds no problem in the fixed build and considers the defect
properly resolved, the status is set to "Verified".
7. Reopened: If the error persists, the developer is instructed to check again and the
status is changed to "Reopened".
8. Closed: If the error no longer exists, the tester changes the status to "Closed".
9. Retest: The tester begins retesting to check whether the defect has been fixed by the
developer as required.
10. Duplicate: If the developer considers the defect similar to another defect, or if its
description coincides with any other defect, the status is changed to "Duplicate".
A few more states to be added here are:
1. Rejected: If the developer does not consider the reported defect genuine, it is marked
"Rejected".
2. Duplicate: If the developer finds the defect similar to any other defect, or if the described
malfunction matches an existing defect, the status is changed to "Duplicate".
3. Postponed: If the developer feels the defect is not of high priority and can be fixed in
the next release, the status is changed to "Postponed".
4. Not a Bug: If the reported behavior does not affect the functionality of the application,
the status is changed to "Not a Bug".
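
One way to make these states and transitions concrete is a small state machine. This is an illustrative sketch only, not taken from any defect-tracking tool, and the transition set below is an assumption based on the list above:

    from enum import Enum, auto

    class BugState(Enum):
        NEW = auto(); ASSIGNED = auto(); OPEN = auto(); FIXED = auto()
        PENDING_RETEST = auto(); RETEST = auto(); VERIFIED = auto()
        REOPENED = auto(); CLOSED = auto(); REJECTED = auto()
        DUPLICATE = auto(); POSTPONED = auto(); NOT_A_BUG = auto()

    # Allowed moves, following the life cycle described above (assumed).
    TRANSITIONS = {
        BugState.NEW: {BugState.ASSIGNED, BugState.REJECTED, BugState.DUPLICATE,
                       BugState.POSTPONED, BugState.NOT_A_BUG},
        BugState.ASSIGNED: {BugState.OPEN},
        BugState.OPEN: {BugState.FIXED, BugState.REJECTED, BugState.DUPLICATE},
        BugState.FIXED: {BugState.PENDING_RETEST},
        BugState.PENDING_RETEST: {BugState.RETEST},
        BugState.RETEST: {BugState.VERIFIED, BugState.REOPENED},
        BugState.VERIFIED: {BugState.CLOSED},
        BugState.REOPENED: {BugState.OPEN},
    }

    def move(current, target):
        # Reject any transition the life cycle does not allow.
        if target not in TRANSITIONS.get(current, set()):
            raise ValueError(f"illegal transition: {current.name} -> {target.name}")
        return target

    state = BugState.NEW
    state = move(state, BugState.ASSIGNED)   # ok
    state = move(state, BugState.OPEN)       # ok
    # move(state, BugState.CLOSED)  # would raise: must be fixed and verified first
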
TESTING
Software testing techniques are methods used to design and execute tests to evaluate software
applications.
Principles Of Testing
Below are the principles of software testing:
1. All the tests should meet the customer’s requirements.
2. To keep the evaluation unbiased, software testing should be performed by a third party.



Software Testing Introduction

What is testing? Testing is the process of evaluating a system or its component(s) with the intent of
finding whether it satisfies the specified requirements or not. This activity reports the actual results, the
expected results, and the difference between them. In simple words, testing is executing a system in
order to identify any gaps, errors, or missing requirements contrary to the actual desires or requirements.
Software Testing is a method to assess the functionality of the software program. The process
checks whether the actual software matches the expected requirements and ensures the software is bug-
free. The purpose of software testing is to identify the errors, faults, or missing requirements in contrast to
actual requirements. It mainly aims at measuring the specification, functionality, and performance of a
software program or application.


Purpose of Software Testing

The purpose of software testing is to ensure that the software product is bug-free and to enhance
software quality.
The purpose of software testing is to ensure that a software product meets its intended purpose and
functions as expected:
 Detecting defects
Software testing uncovers bugs and defects early in the development process, when they are less
expensive to fix.
 Improving quality
Software testing helps to ensure that the software is of high quality and meets user
requirements.
 Enhancing user experience
Software testing helps to identify usability issues and ensure that the software is intuitive and
meets user expectations.
 Reducing risk
Software testing helps to identify and mitigate potential risks associated with the software, such
as system failures and data breaches.
 Complying with standards
Software testing ensures that the software adheres to industry standards, regulations, and other
critical compliance requirements. Software testing is most effective when it is continuous,
starting during the design phase and continuing throughout the development and deployment
process.

Goals of Software Testing

The main goal of software testing is to find bugs as early as possible and fix bugs and make sure that the
software is bug-free.

Important Goals of Software Testing:


 Detecting bugs as soon as feasible in any situation.
 Avoiding errors in a project’s and product’s final versions.
 Inspect to see whether the customer requirements criterion has been satisfied.
 Last but not least, the primary purpose of testing is to gauge the project and product level of quality.

The goals of software testing may be classified into three major categories as follows:
1. Immediate Goals
2. Long-term Goals
3. Post-Implementation Goals

Immediate Goals:
1. Bug Discovery
2. Bug Prevention

Long-Term Goals:
1. Quality
2. Customer Satisfaction
3. Reliability
4. Risk Management

Post-Implementation Goals:
1. Reduced Maintenance Cost
2. Improved Software Testing Process

1. Immediate Goals:
These objectives are the direct outcomes of testing. These objectives may be set at any time
during the SDLC process. Some of these are covered in detail below:
 Bug Discovery: The immediate goal of software testing is to find errors at any stage of
software development, with most bugs discovered in the early stages of testing. The primary
purpose of software testing is to detect flaws at any step of the development process; the higher
the number of issues detected at an early stage, the higher the software testing success rate.

 Bug Prevention: This is the follow-on effect of bug discovery. Everyone in the software
development team learns from the behavior and analysis of the issues detected, ensuring
that the same bugs are not repeated in subsequent phases or future projects.

2. Long-Term Goals: These objectives have an impact on product quality in the long run after one
cycle of the SDLC is completed. Some of these are covered in detail below:

 Quality: This goal enhances the quality of the software product. Because software is also a
product, the user’s priority is its quality. Superior quality is ensured by thorough testing.
Correctness, integrity, efficiency, and reliability are all aspects that influence quality. To attain
quality, you must achieve all of the above-mentioned quality characteristics.

 Customer Satisfaction: This goal verifies the customer’s satisfaction with a developed software
product. The primary purpose of software testing, from the user’s standpoint, is customer
satisfaction. Testing should be extensive and thorough if we want the client and customer to be
happy with the software product.

 Reliability: It is a matter of confidence that the software will not fail. In short, reliability means
gaining the confidence of the customers by providing them with a quality product.

 Risk Management: Risk is the probability of occurrence of uncertain events in the organization
and the potential loss that could result in negative consequences. Risk management must be done to
reduce the failure of the product and to manage risk in different situations.
3. Post-Implementation Goals: After the product is released, these objectives become critical. Some of
these are covered in detail below:

 Reduce Maintenance Cost: Post-release errors are costlier to fix and more difficult to identify.
Because software does not wear out, the maintenance cost of a software product is not like that
of a physical product; the failure of a software product due to faults is the only expense of
maintenance. Because they are difficult to discover, post-release mistakes always cost more to
rectify. As a result, if testing is done thoroughly and effectively, the risk of failure is lowered and
maintenance costs are reduced.

 Improved Software Testing Process: These goals improve the testing process for future use or
software projects. These goals are known as post-implementation goals. A project’s testing
procedure may not be completely successful, and there may be room for improvement. As a result,
the bug history and post-implementation results can be evaluated to identify stumbling blocks in
the current testing process that can be avoided in future projects.

Productivity and quality in software


In software testing, productivity is a measure of how efficiently resources are used to produce
software, while quality is a measure of how well the software meets the needs of the user:
Productivity
The ratio of the functional value of the software to the labor and expense required
to produce it. Productivity is dependent on the effective use of human resources, and can be improved
by reusing code, minimizing rework, and following best practices.
Quality
A high-quality product is one that meets the needs of the user. In software testing,
quality is determined by how well the tests contribute to improving the product. Testers should design
tests that provide meaningful insights into the software's capabilities and limitations.
Some metrics used to measure software testing quality include: Number of tests
run per unit of time, Test design efficiency, Test review efficiency, and Number of bugs per test.

Productivity and quality are directly related, so increasing quality will also increase productivity.
Factors of Productivity and Quality in Software
 Quality assurance
 Risk management
 Software quality factors
 Effective management
 Team communication
 Clear goal-setting
 Access to resources
 Employee motivation
 Supportive work environment
Software quality factors:
These include integrity, reliability, usability, accuracy, efficiency, maintainability,
testability, flexibility, interface facility, re-usability, and transferability.
Effective management:
Effective management helps with controlling resources, meeting customer needs,
ensuring on-time delivery, and properly planning, monitoring, and communicating about projects.
Team communication:
Effective communication is important for productivity.

Clear goal-setting:
Clear goal-setting is important for productivity.
Access to resources:
Access to necessary resources is important for productivity.
Employee motivation:
Employee motivation is important for productivity.
Supportive work environment:
A supportive work environment is important for productivity.
Some ways to improve software productivity include:
 Reusing code
 Minimizing rework
 Adopting sound development practices and standards

 Focusing on key aspects of process improvement
 Reducing the need to develop new software
 Integrating specification, design, and implementation

Here are some ways to measure software productivity:


 Size-related metrics
These metrics indicate the size of the outcome, such as the number of lines of code written.
 Function-related metrics
These metrics indicate the amount of useful functionality delivered in a given time period. Function
points and application points are commonly used for waterfall software development, while story
points are commonly used for agile projects.
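
As a hedged worked example (the numbers are invented, not from the notes), function-point productivity is simply the useful functionality delivered divided by the effort spent:

    # Illustrative values only.
    function_points_delivered = 120   # functionality delivered in the period
    effort_person_months = 8          # labor spent producing it

    productivity = function_points_delivered / effort_person_months
    print(f"{productivity:.1f} function points per person-month")  # 15.0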
Here are some ways to improve software quality:
 Follow quality control (QC) methods to detect and remove defective devices
 Follow quality assurance methods to ensure that the organization's processes are followed properly
 Follow total quality management (TQM) methods to continuously improve the organization's procedures

Software Quality
The quality of a software product is defined in terms of its fitness of purpose. That is, a quality product
does precisely what the users want it to do. For software products, fitness of use is generally explained
in terms of satisfaction of the requirements laid down in the SRS document. Although "fitness of purpose"
is a satisfactory interpretation of quality for many devices such as a car, a table fan, or a grinding machine,
for software products "fitness of purpose" is not a wholly satisfactory definition of quality.
Example: Consider a functionally correct software product. That is, it performs all tasks as specified in
the SRS document. But, has an almost unusable user interface. Even though it may be functionally right,
we cannot consider it to be a quality product.
The modern view of a quality associated with a software product several quality methods such as
the following:
Portability: A software device is said to be portable, if it can be freely made to work in various
operating system environments, in multiple machines, with other software products, etc.

Usability: A software product has better usability if various categories of users can easily invoke the
functions of the product.
Reusability: A software product has excellent reusability if different modules of the product can quickly
be reused to develop new products.
Correctness: A software product is correct if various requirements as specified in the SRS document
have been correctly implemented.
Maintainability: A software product is maintainable if bugs can be easily corrected as and when they
show up, new tasks can be easily added to the product, and the functionalities of the product can be
easily modified, etc.

Software Quality Management System


A quality management system comprises the principal methods used by organizations to ensure that the
products they develop have the desired quality.
A quality system consists of the following:
Managerial Structure and Individual Responsibilities: A quality system is the responsibility of the
organization as a whole. However, every organization has a separate quality department to perform the
various quality system activities. The quality system of an organization should have the support of top
management; without backing for the quality system at a high level in a company, few members of staff
will take the quality system seriously.
Quality System Activities: The quality system activities encompass the following:
 Auditing of projects
 Review of the quality system
 Development of standards, methods, and guidelines, etc.
 Production of documents for the top management summarizing the effectiveness of the quality
system in the organization.
Evolution of Quality Management System
Quality systems have evolved rapidly over the last five decades. Before World War II, the usual
method of producing quality products was to inspect the finished products and remove defective ones.
Since that time, the quality systems of organizations have gone through four stages of evolution, as
shown in the fig. The initial product-inspection approach gave way to quality control (QC).
Quality control focuses not only on detecting defective products and removing them but also on
determining the causes behind the defects. Thus, quality control aims at correcting the causes of bugs
and not just rejecting defective products. The next breakthrough in quality methods was the development
of quality assurance methods.
The primary premise of modern quality assurance is that if an organization's processes are proper and
are followed rigorously, then the products are bound to be of good quality. The newer quality functions
include guidance for recognizing, defining, analyzing, and improving the production process.
Total quality management (TQM) advocates that the process followed by an organization must be
continuously improved through process measurements. TQM goes a step further than quality assurance
and aims at continuous process improvement. TQM goes beyond documenting processes to optimizing
them through redesign. A term linked to TQM is Business Process Reengineering (BPR).
BPR aims at reengineering the way business is carried out in an organization. From the above
discussion, it can be stated that over the years, the quality paradigm has shifted from product
assurance to process assurance, as shown in the fig.

Phases in a Tester’s Mental Life:

PHASE 0—There’s no difference between testing and debugging. Other than in support of
debugging, testing has no purpose.

PHASE 1—The purpose of testing is to show that the software works.


PHASE 2—The purpose of testing is to show that the software doesn’t work.

PHASE 3—The purpose of testing is not to prove anything, but to reduce the perceived risk
of not working to an acceptable value.

PHASE 4—Testing is not an act. It is a mental discipline that results in low-risk software
without much testing effort.

Phase 0 Thinking:

I called the inability to distinguish between testing and debugging “phase 0” because it denies
that testing matters, which is why I denied it the grace of a number. See Section 2.1 in this chapter for
the difference between testing and debugging. If phase 0 thinking dominates an organization, then there
can be no effective testing, no quality assurance, and no quality. Phase 0 thinking was the norm in the
early days of software development and dominated the scene until the early 1970s, when testing
emerged as a discipline.

Phase 0 thinking was appropriate to an environment characterized by expensive and scarce


computing resources, low-cost (relative to hardware) software, lone programmers, small projects,
and throwaway software. Today, this kind of thinking is the greatest cultural barrier to good testing and
quality software. But phase 0 thinking is a problem for testers and developers today because many
software managers learned and practiced programming when this mode was the norm—and it’s hard to
change how you think.
Phase 1 Thinking-The Software Works
Phase I thinking represented progress because it recognized the distinction between testing and
debugging. This thinking dominated the leading edge of testing until the late 1970s when its fallacy was
discovered. This recognition is attributed to Myers (MYER79) who observed that it is self-corrupting. It
only takes one failed test to show that software doesn’t work, but even an infinite number of tests won’t
prove that it does. The objective of phase 1 thinking is unachievable. The process is corrupted because
the probability of showing that the software works decreases as testing increases; that is, the more you
test, the likelier you are to find a bug. Therefore, if your objective is to demonstrate a high probability of

working, that objective is best achieved by not testing at all! Although this conclusion may seem silly to
the conscious, rational mind, it is the kind of syllogism that our unconscious mind loves to implement.
Phase 2 Thinking-The Software Doesn’t Work:
When, as testers, we shift our goal to phase 2 thinking we are no longer working in cahoots
with the designers, but against them. The difference between phase 1 and 2 thinking is illustrated by
analogy to the difference between bookkeepers and auditors. The book keeper’s goal is to show that the
books balance, but the auditor’s goal is to show that despite the appearance of balance, the bookkeeper
has embezzled. Phase 2 thinking leads to strong, revealing tests.
While one failed test satisfies the phase 2 goal, phase 2 thinking also has limits. The test reveals
a bug, the programmer corrects it, the test designer designs and executes another test intended to
demonstrate another bug. Phase 2 thinking leads to a never-ending sequence of ever more diabolical
tests. Taken to extremes, it too never ends, and the result is reliable software that never gets shipped.
The trouble with phase 2 thinking is that we don’t know when to stop.
Phase 3 Thinking-Test for Risk Reduction:
Phase 3 thinking is nothing more than accepting the principles of statistical quality control. I say
“accepting” rather than “implementing” because it’s not obvious how statistical quality control should
be applied to software. To the extent that testing catches bugs and to the extent that those bugs are fixed,
testing does improve the product. If a test is passed, then the product’s quality does not change, but our
perception of that quality does. Testing, pass or fail, reduces our perception of risk about a software
product. The more we test, the more we test with harsh tests, the more confidence we have in the
product. We’ll risk release when that confidence is high enough.
Phase 4 Thinking-A State of Mind:
The phase 4 thinker’s knowledge of what testing can and can’t do, combined with knowing what
makes software testable, results in software that doesn’t need much testing to achieve the lower-phase
goals. Testability is the goal for two reasons. The first and obvious reason is that we want to reduce the
labor of testing. The second and more important reason is that testable code has fewer bugs than code
that’s hard to test. The impact on productivity of these two factors working together is multiplicative.
What makes code testable? One of the main reasons to learn test techniques is to answer that question.
Cumulative Goals:
The above goals are cumulative. Debugging depends on testing as a tool for probing
hypothesized causes of symptoms. There are many ways to break software that have nothing to do with
the software’s functional requirements: phase 2 tests alone might never show that the software does
what it’s supposed to do. It’s impractical to break software until the easy demonstrations of workability

are behind you. Use of statistical methods as a guide to test design, as a means to achieve good testing at
acceptable risks, is a way of fine-tuning the process. It should be applied only to large, robust products
with few bugs. Finally, a state of mind isn’t enough: even the most testable software must be debugged,
must work, and must be hard to break.
Software testing methods
 Manual, automated, and continuous testing
Manual testing is hands-on and requires human observation, while automated testing uses scripts and
tools to automate the process. Continuous testing combines automated testing with a scaled approach to
achieve reliable test coverage.
 White-box, black-box, and grey-box testing
These approaches describe the tester's point of view when designing test cases. White-box testing
provides access to the source code, while black-box testing only tests the user interface. Grey-box
testing combines aspects of both approaches, providing access to design documents and the database.
 Code analysis tools
Tools like PyCharm, Checkstyle, and SourceMeter can help identify issues in the code.
 Static analysis techniques
These techniques include data flow analysis, control flow analysis, and cyclomatic complexity.
 Load, stress, endurance, and spike testing
These types of load testing evaluate how a software system responds to different types of demand:
 Load testing: Tests how a system performs under real-world load conditions.
 Stress testing: Tests how a system responds to realistic and unrealistic load scenarios, with the goal of
overloading the system until it breaks.
 Endurance testing: Also known as soak testing, this technique analyzes how a system behaves under a
sustained load over a longer period of time.
 Spike testing: A type of load test that determines how a system responds to large bursts of concurrent
activity.
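
A minimal, self-contained sketch of a spike-style load test; the handle_request function stands in for the real system under test, and all numbers are invented:

    import time
    from concurrent.futures import ThreadPoolExecutor

    def handle_request(n):
        # Stand-in for the system under test (hypothetical).
        time.sleep(0.01)              # simulate 10 ms of work per request
        return n * 2

    def spike(burst_size):
        # Fire a sudden burst of concurrent requests and time the whole burst.
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=burst_size) as pool:
            results = list(pool.map(handle_request, range(burst_size)))
        elapsed = time.perf_counter() - start
        print(f"burst of {burst_size:>3}: {elapsed:.3f}s, {len(results)} responses")

    for size in (10, 100, 500):       # escalating spikes of concurrent activity
        spike(size)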
Testing Vs Debugging
What is Testing?
Testing is the process of verifying and validating that a software or application is bug-free, meets the
technical requirements as guided by its design and development, and meets the user requirements
effectively and efficiently by handling all the exceptional and boundary cases. The purpose of
software testing is to identify the errors, faults, or missing requirements in contrast to actual
requirements.
What is Debugging?
Debugging is the process of fixing a bug in the software. It can be defined as identifying, analyzing,
and removing errors. This activity begins after the software fails to execute properly and concludes by
solving the problem and successfully testing the software. It is considered to be an extremely complex
and tedious task because errors need to be resolved at all stages of debugging.

Aspects | Testing | Debugging
Definition | Testing is the process to find bugs and errors. | Debugging is the process of correcting the bugs found during testing.
Purpose | The purpose of testing is to identify defects or errors in the software system. | The purpose of debugging is to fix those defects or errors.
Focus | It is the process to identify failures of the implemented code. | It is the process to resolve those code failures.
Timing | Testing is done before debugging. | Debugging is done after testing.
Approach | Testing involves executing the software system with test cases. | Debugging involves analyzing the symptoms of a problem and identifying its root cause.
Tools and Techniques | Testing can involve using automated or manual testing tools. | Debugging typically involves using tools and techniques such as logging, tracing, and code inspection.
Methodology | Testing is the display of errors. | Debugging is a deductive process.
Team Involved | Testing is done by the tester. | Debugging is done by either the programmer or the developer.
Design Knowledge | There is no need of design knowledge in the testing process. | Debugging can't be done without proper design knowledge.
Access | Testing can be done by insiders as well as outsiders. | Debugging is done only by insiders; an outsider can't do debugging.
Basis of Categorization | It is based on different testing levels, i.e. unit testing, integration testing, system testing, etc. | Debugging is based on different types of bugs.
SDLC | Testing is a stage of the software development life cycle (SDLC). | Debugging is not a stage of the software development life cycle; it occurs as a consequence of testing.
Process Nature | Testing is composed of the validation and verification of software. | The debugging process seeks to match symptoms with causes, thereby leading to error correction.
Initiation | Testing is initiated after the code is written. | Debugging commences with the execution of a test case.
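
To make the split concrete, here is an invented two-step sketch: executing a test case exposes the failure (testing), then the DEBUG log is used to match the symptom to its cause (debugging):

    import logging

    logging.basicConfig(level=logging.DEBUG)

    def average(values):
        total = 0
        for v in values:
            logging.debug("adding %s, running total %s", v, total)
            total += v
        return total / (len(values) - 1)   # seeded off-by-one bug

    # Testing: run the unit with a test case and observe the failure.
    print(average([2, 4, 6]) == 4)   # False -> the test case found a defect

    # Debugging: the log shows the running total reaches 12 as expected,
    # so the fault must lie in the divisor; the fix is len(values).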

Models of Testing
Testing is an integral part of the software development life cycle. Various models or approaches
are used in the software development process, and each model has its own advantages and
disadvantages. Choosing a particular model depends on the project deliverables and the complexity of
the project.
What Are Software Testing Models?
Software testing models are systematic approaches used to plan, design, execute, and manage testing
activities. They provide guidelines for carrying out testing processes effectively and ensure
comprehensive test coverage.
Each model offers distinct advantages and is chosen based on the specific requirements of the project
and the organization’s preferences. Understanding these models is crucial for selecting the most suitable
approach for software testing in a given scenario.
Now let us go through the various software testing models and their benefits:
1. Waterfall Model
This is the most basic software development life cycle process, which is broadly followed in the
industry. Here, the developers follow a sequence of processes where the processes flow progressively
downward towards the ultimate goal. It is like a waterfall where there are a number of phases.

These phases each have their own unique functions and goals. There are, in fact, four phases: the
requirement gathering and analysis phase, software design, implementation and testing,
and maintenance. All these four phases come one after another in the given order.

In the first phase, all the possible system requirements for developing a particular software are noted and
analyzed. This, in turn, depends on the software requirement specifications, which include detailed
information about the expectations of the end user. Based on this, a requirement specification
document is created, which acts as input to the next phase, i.e., the software design phase. What needs to
be emphasized here is that once you move into the next phase, it won’t be possible to update the
requirements. So you must be very thorough and careful about the end-user requirements.
Advantages
 Easy to implement and maintain.
 The initial phase of rigorous scrutiny of requirements and systems helps save time later in the
developmental phase
 The requirement for resources is minimal, and testing is done after the completion of each phase.
Disadvantages
 It is not possible to alter or update the requirements
 You cannot make changes once you are in the next phase.
 You cannot start the next phase until the previous phase is completed
2. V Model
This model is widely recognized as superior to the waterfall model. Here, the development and
test execution activities are carried out side by side, down one arm of the V and up the other. In this
model, testing starts at the unit level and spreads toward integration of the entire system.

So, SDLC is divided into five phases – unit testing, integration testing, regression testing, system testing,
and acceptance testing.

Advantages
 It is easy to use the model since testing activities like planning and test design are done before
coding
 Saves time and enhances the chances of success.
 Defects are mostly found at an early stage, and the downward flow of defects is generally
avoided
Disadvantages
 It is a rigid model
 Early prototypes of the product are not available since the software is developed during the
implementation phase
 If there are changes in the midway, then the test document needs to be updated
3. Agile model
In this SDLC model, requirements and solutions evolve through collaboration between various
cross-functional teams. This is known as an iterative and incremental model.
Advantages
 Ensure customer satisfaction with the rapid and continuous development of deliverables.
 It is a flexible model as customers, developers, and testers continuously interact with each other
 Working software can be developed quickly, and products can be adapted to changing
requirements regularly

Disadvantages
 In large and complex software development cases, it becomes difficult to assess the effort
required at the beginning of the cycle
 Due to continuous interaction with the customer, the project can go off track if the customer is
not clear about the goals
4. Spiral model
It is more like the Agile model, but with more emphasis on risk analysis. It has four phases:
planning, risk analysis, engineering, and evaluation. Here, the gathering of requirements and risk
assessment is done at the base level, and every upper spiral builds on it.
Advantages
 Risk avoidance is enhanced due to the importance of risk analysis.
 It is a good model for complex and large systems.
 Depending on changed circumstances, additional functionalities can be added later on.
 Software is produced early in the cycle.
Disadvantages
 It is a costly model and requires highly specialized expertise in risk analysis.
 It does not work well for simpler projects.
5. Rational Unified Process
This model also consists of four phases, each of which is organized into a number of separate iterations.
The difference with other models is that each of these iterations must separately satisfy defined criteria
before the next phase is undertaken.
Advantages
 With an emphasis on accurate documentation, this model is able to resolve risks associated with
changing client requirements.
 Integration takes less time as the process goes on throughout the SDLC.
Disadvantages
 The biggest disadvantage is that the team members must be experts in their niche.
 In big projects such as continuous integration, it might give rise to confusion
6. Rapid application development
This is another incremental model, like the Agile model. Here, the components are developed parallel to
each other. The developments are then assembled into a product.
Advantages
 The development time is reduced due to the simultaneous development of components, and the
components can be reused
 A lot of integration issues are resolved due to integration from the initial stage
Disadvantages
 It requires a strong team of highly capable developers with individual efficacy in identifying
business requirements
 It is a module-based model, so only systems that can be modularized can be developed with
this model
 As the cost is high, the model is not suitable for cheaper projects
7 Iterative Model
The iterative model does not require a complete list of requirements before the start of the project. The
development process begins with the functional requirements, which can be enhanced later. The
procedure is cyclic and produces a new version of the software in each cycle. Each iteration develops a
separate component of the system that builds on the functionality delivered by earlier iterations.
Advantages
 It is easier to manage the risks since high-risk tasks are performed first.
 The progress is easily measurable.
 Problems and risks that are labeled within one iteration can be avoided in subsequent sprints.
Disadvantages
 The iterative model needs more resources compared to the waterfall model.
 Managing the process is difficult.
 Some risks may not be fully identified until the final stages of the project.
8. Kanban Model
The Kanban Model is a visual and flow-based approach to software development and project
management. It relies on a visual board to represent work items, which move through different process
stages. These stages include backlog, analysis, development, testing, and deployment.
Each work item in a Kanban system has a card on the board to represent it, and team members move
these cards through the stages as they complete them.
The board provides a real-time visual representation of the work in progress and helps teams identify
bottlenecks or areas for improvement.
Continuous improvement is a key principle of Kanban. Teams regularly review their processes, identify
areas of inefficiency, and make incremental changes to enhance workflow. This adaptability and focus
on improvement make the Kanban Model well-suited for projects with evolving requirements and a need
for continuous delivery.
Advantages of Kanban Model:
 Visual Representation: Provides a clear visual overview of work items and their progress.
 Flexibility: It is adaptable to changing priorities and requirements, making it suitable for
dynamic projects.
 Continuous Improvement: Encourages regular process reviews and enhancements for
increased efficiency.
 Reduced Waste: Minimizes unnecessary work by focusing on completing tasks based on actual
demand.
Disadvantages of the Kanban Model:
 Limited Planning: Less emphasis on detailed planning may be a drawback for projects
requiring extensive upfront planning.
 Dependency on WIP Limits: Ineffective management of work-in-progress (WIP) limits can
lead to bottlenecks.
 Complexity Management: This may become complex for large-scale projects or those with
intricate dependencies.
 Team Dependency: This relies on team collaboration and communication, which can be
challenging if not well coordinated.
9. The Big Bang Model
 No Formal Design or Planning: The Big Bang Model is characterized by an absence of detailed
planning or formal design before the development process begins.
 Random Testing Approach: Testing is conducted randomly, without a predefined strategy or
specific testing phases.
 Suitable for Small Projects: This model is often considered suitable for small-scale projects or
projects with unclear requirements.
Advantages of the Big Bang Model:
1. Simplicity: The model is simple and easy to understand.
2. Quick Start: Quick initiation, as there is no need for elaborate planning.
Disadvantages of the Big Bang Model:
1. Uncertainty: Lack of planning and design can lead to uncertainty and chaos during
development.
2. Testing Challenges: Random testing may result in inadequate test coverage and missed critical
issues.
3. Limited Scalability: Not suitable for large or complex projects due to a lack of structured
processes.
10. Scrum Model
 Framework within Agile: Scrum is a framework operating within the Agile methodology,
emphasizing iterative development and collaboration.
 Sprints for Short Development Cycles: Development occurs in short, fixed intervals known as
sprints, typically lasting 2-4 weeks.
 Adaptability and Rapid Releases: Scrum promotes adaptability to changing requirements and
aims for rapid, incremental releases.
Advantages of Scrum Model:
1. Flexibility: Allows for flexibility in responding to changing project requirements.
2. Customer Satisfaction: Regular deliverables enhance customer satisfaction and engagement.
3. Continuous Improvement: Emphasizes continuous improvement through regular
retrospectives.
Disadvantages of the Scrum Model:
1. Lack of Structure: Some teams may struggle with flexibility and lack of a structured plan.
2. Dependency on Team Collaboration: Success heavily depends on effective collaboration
within the development team.
3. Limited Predictability: It may be challenging to predict the exact outcomes and timeline due to
the iterative nature.
Bugs:
Software bugs are errors, flaws, deficiencies, or defects in a computer program or system that
cause it to produce an incorrect or unexpected result or to behave in unintended ways.
Different Types of Software Bugs
Here are the most common types of software bugs or defects encountered in software testing so that
developers and testers can deal with them better.
1. Functional Bugs
Functional bugs are associated with the functionality of a specific software component.
In simple terms, any component in an app or website that doesn’t function as intended is a functional
bug.
Such bugs are often detected when testers conduct comprehensive functional testing for their apps or
websites in real user conditions. Teams need to ensure that all the functional bugs are resolved in the
early stages so as to avoid delivering bad user experiences in the production environment.
For example, a Login button that doesn’t allow users to log in, an Add to Cart button that doesn’t update
the cart, a search box not responding to a user’s query, etc.
2. Logical Bugs
A logical bug disrupts the intended workflow of software and causes it to behave incorrectly. These
bugs can result in unexpected software behavior and even sudden crashes. Logical bugs primarily take
place due to poorly written code or misinterpretation of business logic.
Examples of logical bugs include:
 Assigning a value to the wrong variable.
 Dividing two numbers instead of adding them together, resulting in unexpected output.
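A minimal sketch of such a logical bug (the function and values are hypothetical): the routine is meant to
return the average of two numbers, but the developer divides where an addition was intended.

    def average(a: float, b: float) -> float:
        return a / b / 2          # BUG: divides instead of adding; should be (a + b) / 2

    def average_fixed(a: float, b: float) -> float:
        return (a + b) / 2        # the intended business logic

    print(average(10, 20))        # 0.25 -- unexpected output
    print(average_fixed(10, 20))  # 15.0 -- expected output

Both versions run without crashing; only a comparison with the expected output reveals the logical bug.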
3. Workflow Bugs
Workflow bugs are associated with the user journey (navigation) of a software application.
Let’s consider an example of a website where a user needs to fill up a form regarding their medical
history. After filling the form, the user has three options to choose from:
 Save
 Save and Exit
 Previous Page
From the available options, if the user clicks on “Save and Exit,” the user intends to save the entered
information and then exit. However, if clicking on the Save and Exit button leads to an exit from the
form without saving the information, it leads to a workflow bug.
4. Unit Level Bugs
Unit level bugs are very common, and they are typically easier to fix. Once the initial modules of
software components are developed, developers perform unit testing to ensure that the small batches of
code are functioning as expected. Here’s where developers encounter various bugs that get overlooked
in the coding stages. Unit level bugs are easier to isolate as developers deal with a comparatively small
amount of code. Moreover, replicating these bugs takes less time, so developers can track the exact bug
and fix it in no time.
For example, if a developer creates a single page form, a unit test will verify whether all the input fields
are accepting appropriate inputs and validate buttons for functionality. In case a field doesn’t accept the
appropriate characters or numbers, developers encounter a unit-level bug.
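As an illustration, here is a minimal, hypothetical sketch of such a unit test in Python (the validator and
the field rules are assumptions, not taken from the source):

    def is_valid_age(value: str) -> bool:
        # Accept only whole numbers between 1 and 120.
        return value.isdigit() and 1 <= int(value) <= 120

    def test_age_field_accepts_appropriate_input():
        assert is_valid_age("30")

    def test_age_field_rejects_inappropriate_input():
        # If any of these pass validation, a unit-level bug has been found.
        for bad in ("", "abc", "-5", "0", "121"):
            assert not is_valid_age(bad)

Run under a unit test runner such as pytest, a failing assertion isolates the bug to this one small piece of
code.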
5. System-Level Integration Bugs
System-level integration bugs primarily pop up when two or more units of code written by different
developers fail to interact with each other. These bugs primarily occur due to inconsistencies or
incompatibility between two or more components. Such bugs are difficult to track and fix as developers
need to examine a larger chunk of code. They are also time-consuming to replicate. Memory overflow
issues and inappropriate interfacing between the application UI and the database are common examples
of system-level integration bugs.
For example: An online booking system integrates with multiple third-party service providers (e.g.,
airlines, hotels). If one of the service providers experiences high latency or timeouts, the entire booking
process may fail, resulting in incomplete bookings or incorrect availability information.
6. Out of Bound Bugs
Out of Bound Bugs show up when the system user interacts with the UI in an unintended manner. These
bugs occur when an end-user enters a value or a parameter outside the limits of intended use.
For example, entering a significantly larger or a smaller number or entering an input value of an
undefined data type. These bugs often pop up in form validations during functional testing of web or
mobile apps.
7. Security Bugs
Security is a major concern for software development. Security Bugs are a major risk for users and
should be taken very seriously and resolved. Due to their high severity and vulnerable nature, security
bugs are considered among the most sensitive bugs of all types and should be handled with criticality
and urgency.
These bugs might not hinder the operation but can compromise the whole system. These should be
checked thoroughly at regular intervals.
A common example is SQL injection, where an attacker can manipulate a database query to gain
unauthorized access.
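A minimal sketch of the difference (using Python’s standard sqlite3 module; the users table is
hypothetical):

    import sqlite3

    def find_user_unsafe(conn: sqlite3.Connection, name: str):
        # VULNERABLE: user input is concatenated into the query, so an input
        # such as  x' OR '1'='1  changes the meaning of the statement.
        query = "SELECT * FROM users WHERE name = '" + name + "'"
        return conn.execute(query).fetchall()

    def find_user_safe(conn: sqlite3.Connection, name: str):
        # SAFE: a parameterized query treats the input purely as data.
        return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

Security testing should include probing inputs like the one above against every query-building path.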
8. Performance Bugs
Performance bugs occur when a software application fails to meet the expected performance
benchmarks, such as load times, response times, or throughput. These bugs can significantly degrade the
user experience, especially in high-traffic or resource-intensive environments.
For example: An e-commerce website experiences a performance bug where the page load time exceeds
5 seconds during peak traffic hours, causing frustration for users and leading to a high abandonment
rate.
9. Compatibility Bugs
Compatibility bugs arise when a software application does not function correctly across different
environments, devices, or platforms. These bugs can lead to inconsistent user experiences and reduced
accessibility.
For example: A mobile app works perfectly on Android devices but crashes or displays incorrectly on
certain iOS devices, leading to a compatibility bug that impacts a significant portion of the user base.
10. Usability Bugs
Usability bugs affect the overall user experience, making it difficult or confusing for users to interact
with the software. These bugs do not necessarily prevent functionality but can lead to poor user
satisfaction and increased user error rates.
For example: A web application has a complex navigation structure that makes it difficult for users to
find essential features, leading to a usability bug that frustrates users and reduces engagement.
11. Concurrency Bugs
Concurrency bugs occur in software systems that involve parallel processing or multi-threading. These
bugs arise when multiple threads or processes interact in unintended ways, leading to unpredictable
behavior, data corruption, or system crashes.
For example: A banking application experiences a concurrency bug where two users attempt to transfer
funds simultaneously, leading to incorrect account balances or duplicate transactions.
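A minimal sketch of such a race condition in Python (the account logic is hypothetical): two threads may
both pass the balance check before either performs the update.

    import threading

    balance = 100                      # shared account balance
    lock = threading.Lock()

    def withdraw_unsafe(amount: int) -> None:
        global balance
        if balance >= amount:          # check ...
            balance -= amount          # ... then act: another thread can run in between

    def withdraw_safe(amount: int) -> None:
        global balance
        with lock:                     # check and update are now atomic
            if balance >= amount:
                balance -= amount

Two simultaneous calls to withdraw_unsafe(100) can both pass the check and drive the balance
negative; the locked version prevents the duplicate withdrawal.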
Impact of Bugs on Software Development
Bugs can have significant impacts on software development, affecting everything from project timelines
to user satisfaction. Here’s how:
1. Delays in Project Delivery: Bugs, especially critical ones, can delay the release of software as
developers need to spend time identifying and fixing them. This can lead to missed deadlines and
increased costs.
2. Increased Development Costs: The longer a bug goes undetected, the more expensive it
becomes to fix. Early-stage bugs might be resolved with a few lines of code, but later-stage bugs could
require extensive rework, leading to higher costs.
3. Reputation Damage: If bugs make it into production, they can lead to a poor user experience,
causing users to lose trust in the software. This can damage the reputation of the company and result in
loss of customers.
4. Security Risks: Bugs that involve security vulnerabilities can be exploited by attackers, leading
to data breaches, legal liabilities, and financial losses.
5. Lower Productivity: Frequent bugs can lead to a more reactive approach to development, where
developers are constantly fixing issues rather than building new features. This can lower overall
productivity and slow down innovation.

LESSON-2
MODEL AND DESIGN OF TESTING
Structure

2.1 Introduction

2.2 Objectives

2.3 The Project

2.4 Overview of Testing Process

2.5 The Environment

2.6 The Program

2.7 Testing and Levels

2.8 Role of Models

2.9 Test Design Techniques

2.10 The Importance of Test Design Techniques

2.11 Types of Test Design Techniques

2.12 Choosing the Right Technique

2.13 Bugs

2.14 Bugs in Software Testing

2.15 Bugs and it’s Source

2.16 Impact of Bugs

2.17 Bug Lifecycle

2.18 Guidelines for implementing Bug Lifecycle


2.19 Summary

2.20 Model Questions

2.1 Introduction
Testing is an essential part of the software development process. A robust and stable software
product can be delivered with the use of standard testing methodologies, which also help to predict
the timeline of the software system. A software application may turn even more complex with a
large number of platforms and devices. More importantly, it is required to ensure that the application
meets the specified requirements and can be efficiently installed and operated on the user’s
machine. Model-based testing is a software testing technique in which the test cases are
derived from a model that describes the functional aspects of the system under test. It makes
use of a model to generate tests and includes both offline and online testing.

2.2 Objectives
 To study the fundamental concepts of software testing which includes objectives, process,
criteria, strategies, and methods.

 To discuss various software testing types and levels of testing, like black box and white box
testing, along with the levels of unit, integration, regression, and system testing.

 To learn the types of bugs and the levels of testing, so that the student can readily
identify a bug and correct it when it happens.

2.3 The Project


Testing is applied to anything from subroutines to systems that consist of millions of statements.
Testing the interfaces between different parts of your own mind is very different from testing the
interface between you and other programmers separated from you by geography, language,
time and disposition. Testing a one-shot routine that will be run only a few times is very different
from testing one that must run for decades and may be modified by some unknown future
programmer. There is an implied context for the test methods; a real-world context is
characterized by the following model project.

 Application: The specifics of the application are unimportant. It is a real-time system


that must provide timely responses to user requests for services. It is an online system
connected to remote terminals.


 Staff: The programming staff consists of twenty to thirty programmers: big enough to
warrant formality, but not too big to manage and big enough to use specialists for some
parts of the system’s design.

 Schedule: The project will take 24 months from the start of design to formal acceptance
by the customer. Acceptance will be followed by a 6-month cutover period. Computer
resources for development and testing will be almost adequate.

 Specification: The specification is good. It is functionally detailed without constraining


the design, but there are undocumented “understandings” concerning the requirements.

 Acceptance Test: The system will be accepted only after a formal acceptance test.
The application is not new, so part of the formal test already exists. At first the customer
will intend to design the acceptance test, but later it will become the software design
team’s responsibility.

 Personnel: The staff is professional and experienced in programming and in the


application. Half the staff has programmed that computer before and most know the
source language. One-third, mostly junior programmers, have no experience with the
application. The typical programmer has been employed by the programming department
for 3 years. Management’s attitude is positive and knowledgeable about the realities of
such projects.

 Standards: Programming and test standards exist and are usually followed. They
understand the role of interfaces and the need for interface standards. There is an
internal, semiformal quality-assurance function. The database is centrally developed
and administered.

 Objectives: The system is the first of many similar systems that will be implemented
in the future. No two will be identical, but they will have 75% of the code in common.
Once installed, the system is expected to operate profitably for more than 10 years.

 Source: One-third of the code is new, one-third extracted from a previous, reliable, but
poorly documented system and one-third is being rehosted.

 History: One programmer will quit before his components are tested. Another
programmer will be fired before testing begins: excellent work but poorly documented.
One component will have to be redone after unit testing: a superb piece of work that


defies integration. The customer will insist on five big changes and twenty small ones.
A facility and/or hardware delivery problem will delay testing for several weeks and force
second and third shift work. Several important milestones will slip but the delivery date
will be met.

2.4 Overview of Testing Process


Fig. 2.1 is a model of the testing process. The process starts with a program embedded in an
environment, such as a computer, an operating system, or a calling program. We understand
human nature and its susceptibility to error. This understanding leads us to create three models:
a model of the environment, a model of the program, and a model of the expected bugs. From
these models we create a set of tests, which are then executed. The result of each test is either
expected or unexpected. If unexpected, it may lead us to revise the test, our model or concept
of how the program behaves, our concept of what bugs are possible, or the program itself.

Figure 2.1 A Model of Testing

2.5 The Environment


A program’s environment is the hardware and software required to make it run. For online
systems the environment may include communications lines, other systems, terminals, and
operators. The environment also includes all programs that interact with – and are used to
create – the program under test, such as operating system, loader, linkage editor, compiler,
utility routines.


Programmers should learn early in their careers that it is not smart to blame the environment
for bugs. Hardware bugs are rare. So are bugs in manufacturer-supplied software. This is not
because logic designers and operating system programmers are better than application
programmers, but because such hardware and software is stable, tends to be in operation for
a long time, and most bugs will have been found and fixed by the time programmers use that
hardware or software. Our model of the environment includes our beliefs regarding such things
as the workings of the computer’s instruction set, operating system macros and commands,
and what a higher-order language statement will do. If testing reveals an unexpected result, we
may have to change our beliefs (our model of the environment) to find out what went wrong. But
sometimes the environment could be wrong: the bug could be in the hardware or firmware after
all.

2.6 The Program


Most programs are too complicated to understand in detail. We must simplify our concept of
the program in order to test it. So although a real program is exercised on the test bed, in our
brains we deal with a simplified version of it – a version in which most details are ignored. If the
program calls a subroutine, we tend not to think about the subroutine’s details unless its operation
is suspect. Similarly, we may ignore processing details to focus on the program’s control
structure or ignore control structure to focus on processing. As with the environment, if the
simple model of the program does not explain the unexpected behaviour, we may have to
modify that model to include more facts and details. And if that fails, we may have to modify the
program.

2.7 Testing and Levels


Tests are formal procedures: inputs must be prepared, outcomes predicted, tests documented,
commands executed, and results observed. All of these steps are subject to error. We do three
distinct levels of testing on a typical software system. They are:

1. Unit / Component Testing: A Unit is the smallest testable piece of software that can be
compiled, assembled, linked, loaded etc. A unit is usually the work of one programmer and
consists of several hundred or fewer lines of code. Unit Testing is the testing we do to show that
the unit does not satisfy its functional specification or that its implementation structure does not
match the intended design structure. A Component is an integrated aggregate of one or more


units. Component Testing is the testing we do to show that the component does not satisfy its
functional specification or that its implementation structure does not match the intended design
structure.

2. Integration Testing: Integration is the process by which components are aggregated to


create larger components. Integration Testing is testing done to show that even though the
components were individually satisfactory (after passing component testing), the combination
of components is incorrect or inconsistent.

3. System Testing: A System is a big component. System Testing is aimed at revealing bugs
that cannot be attributed to components. It includes testing for performance, security,
accountability, configuration sensitivity, startup and recovery.

2.8 Role of Models


Testing is a process in which we create mental models of the environment, the program, human
nature and the tests themselves. Each model is used either until we accept the behavior as
correct or until the model is no longer sufficient for the purpose. Unexpected test results always
force a revision of some mental model and in turn may lead to a revision of whatever is being
modeled. The revised model may be more detailed, and hence more complicated, or it may be simpler.
The art of testing consists of creating, selecting, exploring, and revising models. Our ability to
go through this process depends on the number of different models we have at hand and their
ability to express a program’s behavior.

2.9 Test Design Techniques


Test design is a significant step in the Software Development Life Cycle (SDLC), also known
as creating test suites or testing a program. In other words, its primary purpose is to create a
set of inputs that can provide a set of expected outputs, to address the following concerns: (i)
What to test and what not to test? (ii) How to stimulate the system and with what data values?
and (iii) How the system should react and respond to the stimuli?

Therefore, various techniques exist for test design and execution. It is understandably crucial to
utilize some effective test design techniques since software development is getting more
complicated. Generally, software testing design techniques help you write better test cases
and optimize testing processes. It also helps reduce the time of executing test cases while
escalating test coverage. Next, an overview of software testing design techniques is given.


2.10 The Importance of Test Design Techniques


Test design techniques are applied to satisfy the goals of every individual in software development
projects, including testers. Although the main purpose is to ensure that the products meet the
expectations of clients and their businesses, these techniques allow testers to execute the test
effortlessly based on various risk factors. Here is a checklist of standards that a smooth testing
process meets:

1) Gather information to understand users’ requirements.

2) Derive all important business scenarios.

3) Design test scenarios for every derived critical business scenario.

4) Assign all planned test scenarios to different test cases.

Then, you will have to choose a test design technique for each requirement. At this point, if
things are correctly implemented, you can make significant changes that affect your Return on
Investment (ROI) extraordinarily. The two main advantages of these techniques lie in their
consistency and repeatability:

 Possibility to reproduce a test: At several testing phases, testing techniques are


considered as a set of rules that help ensure a minimum level of consistency. Testers,
indeed, would work much more efficiently with a base, thereby reducing a significant
amount of effort in later fixing.

 Increasing the number of bugs found: Test design techniques can also be used as analytical
tools. When applying techniques to elements, we often see problems in the definition of
those elements.

2.11 Types of Test Design Techniques


Each of these test design techniques is suitable for identifying a certain type of error. The
new standard ISO 29119 Software Testing includes the following techniques:

2.11.1 Specification-based or Black-box techniques

 Equivalence Partitioning: The idea of this approach is grouping inputs with the
same attributes into partitions. Code is not visible to testers. The task is to pick one

condition out of each partition, which covers all possible scenarios, to execute test
cases. If a condition of a partition is valid, the other conditions are valid too. Likewise, if a
condition in a partition is invalid, the other conditions are also invalid. This helps to reduce
the number of test cases (see the sketch after this list).

 Boundary Value Analysis: This is one of the software testing techniques in which
test cases are designed to include values at the boundaries. If the input is within the
boundary value, it is considered ‘Positive testing’. If the input is outside of the boundary
value, it is considered ‘Negative testing’. The goal is to select test cases to execute
boundary values. In other words, the behavior of Negative testing is more likely to be
incorrect than the behavior of Positive testing; and boundaries are an area in which
testing is more likely to yield defects.

 Decision Table Testing: This technique can be used in test design because it helps
testers explore the effects of combining different input values when adhering to business
rules. A Decision Table is a tabular representation of conditions versus test actions.
Conditions are considered as inputs, while actions are considered as outputs.

 State Transition Diagrams: Using this approach, the tester analyzes the behavior of
an Application Under Test (AUT) for different input conditions in a sequence. Both positive
and negative input test values are provided and the system behavior is recorded. Any
system in which we get a different output for the same input is a finite state system.

 Use Case Testing: Use case testing is a functional testing technique, meaning
programming skill is not required. It helps the tester to determine which test scripts are
executed on the entire system from the beginning to the end of each transaction.
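The following minimal sketch in Python (with a hypothetical requirement that a field accepts ages 18 to
60) shows how equivalence partitioning and boundary value analysis together generate a small but
revealing set of test inputs:

    VALID_RANGE = range(18, 61)        # hypothetical requirement: 18..60 inclusive

    def accepts(age: int) -> bool:     # stand-in for the component under test
        return age in VALID_RANGE

    # Equivalence partitioning: one representative per partition.
    partitions = {"below_range": 10, "in_range": 35, "above_range": 75}

    # Boundary value analysis: values at and around each boundary.
    boundaries = [17, 18, 19, 59, 60, 61]

    for name, age in partitions.items():
        print(name, age, "accepted" if accepts(age) else "rejected")
    for age in boundaries:
        print("boundary", age, "accepted" if accepts(age) else "rejected")

Nine inputs cover what exhaustive testing of every possible age value would otherwise require.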

2.11.2 Structure-based or White-Box techniques

 Statement Coverage or Line Coverage: In this technique, every statement in the


source code is executed at least once. Thereby, we can check what the source code is
and is not expected to do. However, we cannot test the false condition in the source
code. Statement coverage = (No. of statements executed / Total no. of statements in
the source code) * 100. A worked example follows this list.

 Condition Coverage or Predicate Coverage: Condition coverage is seen for Boolean


expression. Condition coverage ensures whether all the Boolean expressions have
been covered and evaluated to both TRUE and FALSE.
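A direct application of the statement coverage formula above, together with an analogous computation
for condition coverage (all numbers are illustrative):

    executed_statements = 45
    total_statements = 50
    statement_coverage = executed_statements / total_statements * 100
    print(f"Statement coverage = {statement_coverage:.0f}%")   # 90%

    evaluated_outcomes = 7             # Boolean outcomes exercised (TRUE or FALSE)
    total_outcomes = 10                # e.g. 5 conditions x 2 outcomes each
    condition_coverage = evaluated_outcomes / total_outcomes * 100
    print(f"Condition coverage = {condition_coverage:.0f}%")   # 70%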


LESSON-3
FLOW GRAPHS AND PATH TESTING
Structure

3.1 Introduction

3.2 Objectives

3.3 Flow Graph Elements

3.4 Path Testing – An Introduction

3.5 Path Testing Process

3.6 Path Testing – Paths, Nodes and Links

3.7 Path Testing Criteria

3.8 Loops

3.9 Summary

3.10 Model Questions

3.1 Introduction
A flow graph represents a function in a program as a control flow graph: the nodes of the flow
graph are program statements, while the directed edges represent the flow of control. The
graphical representation of a program’s control structure is known as a control flow graph. A
flow graph consists of nodes and edges. Two nodes in a flow graph can be unconnected,
connected by an edge in one direction, or connected by edges in both directions.

While tracing a path from a source to a sink, a back edge is an edge that leads back to a node
that has already been visited. The flow graph contains one source node and one sink node. A source
node is the node that has no incoming edges while a sink node is the node with no outgoing
edges. A program’s function may contain more than one sink node, but this graph can be
converted into a graph with only one sink. There are some languages that allow more than one
source. This construct is very rare and not used in Structured Programming.


3.2 Objectives
 To study the fundamental concepts of flow graphs and their elements.

 To discuss path testing, its process, and the path testing criteria.

 To learn the basics of loops and their kinds.

3.3 Flow Graph Elements


Flow graphs are used as a main tool for test case identification. They represent the relationship
between program segments, that is the sequence of statements having the property that if the
first member of the sequence is executed then all other statements in that sequence will also
be executed. Nodes represent one program segment. The area bounded by edges and nodes
are called regions. It uses elements like process blocks, decisions, and junctions. The flow
graph is not to be confused with the earlier flowchart, though both are similar. The four different
types of flow graph elements are:

Process Block:

A process block is a sequence of program statements which are not interrupted by either
decisions or junctions. If any one of the statements of the block is executed, then all following
statements are executed. A process block contains a piece of straight-line code of one statement
or hundreds of statements. A process has one entry and one exit. It can consist of a single
statement or instruction, a sequence of statements or instructions, a single entry/exit subroutine
or a macro or function call.


Decisions

A decision is a program point at which the control flow can diverge. In control flow, most of the
decisions are two-way, but in some cases there may be three-way branches.

Standard notations used in constructing a flow graph are as in the following.

To indicate a Sequence:

To indicate “IF-THEN-ELSE”:


To indicate a “WHILE” Loop:

To indicate a “Repeat-Until” Loop:

To indicate a “CASE” Statement:


A case statement is a multi-way branch or a decision. Examples: a jump table in assembly
language and the PASCAL case statement. From the test design point of view, there are no differences
between decisions and case statements.

Junctions

A junction is a point in the program where the control flow can merge. Examples of junctions
are the targets of jump or skip instructions.

On a Flow Graph:

1. Arrows called edges indicate the flow of control.

2. Circles called nodes indicate one or more actions.

3. Areas bounded by edges and nodes are called regions.

4. A predicate node is a node containing a condition.

3.4 Path Testing


Path testing is a structural testing method that involves using the source code of a program in
order to find every possible executable path. It helps to determine all faults lying within a piece
of code. This method is designed to execute all or selected path through a computer program.
Any software program includes multiple entry and exit points. Testing each of these points is
challenging as well as time-consuming. In order to reduce redundant tests and to achieve


maximum test coverage, basis path testing is used. In path testing method, the control flow
graph of a program is designed to find a set of linearly independent paths of execution. In
this method Cyclomatic Complexity is used to determine the number of linearly independent
paths and then test cases are generated for each path.

If the set of paths is properly chosen, then we have achieved some measure of test
thoroughness. For example, pick enough paths to assure that every source statement has
been executed at least once. Path testing techniques are the oldest of all structural test
techniques. Path testing is most applicable to new software for unit testing. It is a structural
technique. It requires complete knowledge of the program’s structure. It is most often used by
programmers to unit test their own code. The effectiveness of path testing rapidly deteriorates
as the size of the software aggregate under test increases.

The objective of basis path testing is to define the number of independent paths, so the number
of test cases needed can be defined explicitly to maximize test coverage. Here we will take a
simple example to get a better idea of what basis path testing involves.

In this example, there are a few conditional statements that execute depending on which
condition is satisfied, giving 3 paths or conditions that need to be tested to cover the output:


 Path 1: 1,2,3,5,6, 7

 Path 2: 1,2,4,5,6, 7

 Path 3: 1, 6, 7
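The code figure for this example is not reproduced in these notes; a minimal Python sketch consistent
with the three paths listed above might look like this (node numbers are shown as comments):

    def sample(x):                       # node 1: entry decision
        result = "default"
        if x is not None:                # node 1: True -> node 2, False -> node 6
            if x >= 0:                   # node 2: inner decision
                result = "non-negative"  # node 3
            else:
                result = "negative"      # node 4
            pass                         # node 5: the two branches rejoin here
        print(result)                    # node 6
        return result                    # node 7

    sample(5)      # exercises Path 1: 1, 2, 3, 5, 6, 7
    sample(-5)     # exercises Path 2: 1, 2, 4, 5, 6, 7
    sample(None)   # exercises Path 3: 1, 6, 7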

3.4.1 The Bug Assumption

The bug assumption for the path testing strategies is that something has gone wrong with the
software that makes it take a different path than intended. As an example “GOTO X” where
“GOTO Y” had been intended. Structured programming languages prevent many of the bugs
targeted by path testing: as a consequence, the effectiveness of path testing for these languages
is reduced. For old code in COBOL, ALP, FORTRAN, and BASIC, path testing is
indispensable.

3.5 Path Testing Process

 Control Flow Graph: Draw the corresponding control flow graph of the program in
which all the executable paths are to be discovered.

 Cyclomatic Complexity: After the generation of the control flow graph, calculate
the cyclomatic complexity of the program using the following formula.

McCabe’s Cyclomatic Complexity = E - N + 2P


Where,

E = Number of edges in control flow graph

N = Number of vertices (nodes) in control flow graph

P = Number of connected components (for a single program or procedure, P = 1)

 Make Set: Make a set of all the paths according to the control flow graph and the calculated
cyclomatic complexity. The cardinality of the set is equal to the calculated cyclomatic
complexity.

 Create Test Cases: Create a test case for each path of the set obtained in the above
step.
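A worked example of the formula (the graph sizes are illustrative): for a single connected flow graph
with 8 edges and 7 nodes,

    E, N, P = 8, 7, 1                  # edges, nodes, connected components
    cyclomatic_complexity = E - N + 2 * P
    print(cyclomatic_complexity)       # 3 -> three linearly independent paths,
                                       #      hence at least three test cases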

3.6 Path Testing – Paths, Nodes and Links


A path through a program is a sequence of instructions or statements that starts at an entry,
junction, or decision and ends at another, or possibly the same junction, decision, or exit. A path
may go through several junctions, processes, or decisions, one or more times. Paths consist
of segments. The smallest segment is a link - a single process that lies between two nodes. Each node in
a flow graph represents a line in the program with an executable statement. A path segment is a
succession of consecutive links that belongs to some path. The length of a path is measured by the
number of links in it, not by the number of instructions or statements executed along that
path. The name of a path is the name of the nodes along the path.

3.6.1 Fundamental Path Selection Criteria

There are many paths between the entry and exit of a typical routine. Every decision doubles
the number of potential paths. And every loop multiplies the number of potential paths by the
number of different iteration values possible for the loop. If a routine has one loop, each pass
through that loop (once, twice, three times, and so on) constitutes a different path through the
routine, even though the same code is traversed each time. Even small routines can have an
incredible number of potential paths. We can define complete testing if it satisfies the following:

1. Exercise every path from entry to exit.

2. Exercise every statement or instruction at least once.

3. Exercise every branch and case statement, in each direction, at least once.


If prescription 1 is followed, then 2 and 3 are automatically followed. But it is impractical for most
routines. It can be done for routines that have no loops, in which case it is equivalent to
prescriptions 2 and 3.

EXAMPLE: Here is the correct version.

A negative value produces the correct answer. Every statement can be executed, but if the test
cases do not force each branch to be taken, the bug can remain hidden. The next example
uses a test based on executing each branch but does not force the execution of all statements:

The hidden loop around label 100 is not revealed by tests based on prescription 3 alone because
no test forces the execution of statement 100 and the following GOTO statement. Furthermore,
label 100 is not flagged by the compiler as an unreferenced label and the subsequent GOTO
does not refer to an undefined label.

A Static Analysis (that is, an analysis based on examining the source code or structure) cannot
determine whether a piece of code is or is not reachable. There could be subroutine calls with
parameters that are subroutine labels, or in the above example there could be a GOTO that
targeted label 100 but could never achieve a value that would send the program to that label.

Only a Dynamic Analysis (that is, an analysis based on the code’s behavior while running -
which is to say, to all intents and purposes, testing) can determine whether code is reachable
or not and therefore distinguish between the ideal structure we think we have and the actual,
buggy structure.


3.7 Path Testing Criteria


Any testing strategy based on paths must at least both exercise every instruction and take
branches in all directions. A set of tests that does this is not complete in an absolute sense, but
it is complete in the sense that anything less must leave something untested. So we have
explored three different testing criteria or strategies out of a potentially infinite family of strategies.

(i) Path Testing (Pinf):

1. Execute all possible control flow paths through the program: typically, this is restricted
to all possible entry/exit paths through the program.

2. If we achieve this prescription, we are said to have achieved 100% path coverage. This
is the strongest criterion in the path testing strategy family: it is generally impossible to
achieve.

(ii) Statement Testing (P1):

1. Execute all statements in the program at least once under some test. If we do enough
tests to achieve this, we are said to have achieved 100% statement coverage.

2. An alternate equivalent characterization is to say that we have achieved 100% node


coverage. We denote this by C1.

3. This is the weakest criterion in the family: testing less than this for new software is
unconscionable (unprincipled or cannot be accepted) and should be criminalized.

iii. Branch Testing (P2):

1. Execute enough tests to assure that every branch alternative has been exercised at
least once under some test.

2. If we do enough tests to achieve this prescription, then we have achieved 100% branch
coverage.

3. An alternative characterization is to say that we have achieved 100% link coverage.

4. For structured software, branch testing and therefore branch coverage strictly includes
statement coverage.

5. We denote branch coverage by C2.


Commonsense and Strategies:

 Branch and statement coverage are accepted today as the minimum mandatory testing
requirement.

 The question “why not use a judicious sampling of paths? what is wrong with leaving
some code untested?” is ineffectual in the view of common sense and experience,
since: (1) Not testing a piece of code leaves a residue of bugs in the program in
proportion to the size of the untested code and the probability of bugs. (2) The high
probability paths are always thoroughly tested if only to demonstrate that the system
works properly.

 Which paths should be tested? You must pick enough paths to achieve C1+C2. The question
of what is the fewest number of such paths is interesting to the designer of test tools
that help automate the path testing, but it is not crucial to the pragmatic (practical)
design of tests. It is better to make many simple paths than a few complicated paths.

Path Selection Example:

Figure 2.1 An example flow graph to explain path selection

Practical Suggestions in Path Testing:

1. Draw the control flow graph on a single sheet of paper.

2. Make several copies - as many as you will need for coverage (C1+C2) and several more.

3. Use a yellow highlighting marker to trace paths. Copy the paths onto master sheets.


4. Continue tracing paths until all lines on the master sheet are covered, indicating that you
appear to have achieved C1+C2.

5. As you trace the paths, create a table that shows the paths, the coverage status of each
process, and each decision.

6. The above paths lead to the following table considering Figure 2.1:

7. After you have traced a covering path set on the master sheet and filled in the table for every
path, check the following:

1. Does every decision have a YES and a NO in its column? (C2)

2. Has every case of all case statements been marked? (C2)

3. Is every three - way branch (less, equal, greater) covered? (C2)

4. Is every link (process) covered at least once? (C1)

8. Revised Path Selection Rules:

 Pick the simplest, functionally sensible entry/exit path.

 Pick additional paths as small variations from previous paths. Pick paths that do not have
loops rather than paths that do. Favor short paths that make sense over paths that
don’t.

 Pick additional paths that have no obvious functional meaning only if it’s necessary to
provide coverage.

 Be comfortable with your chosen paths. Play your hunches (guesses) and give your
intuition free rein as long as you achieve C1+C2.

 Don’t follow rules slavishly (blindly) - except for coverage.


3.8 Loops
Cases for a single loop: A Single loop can be covered with two cases: Looping and Not
looping. But, experience shows that many loop-related bugs are not discovered by C1+C2.
Bugs hide themselves in corners and congregate at boundaries - in the cases of loops, at or
around the minimum or maximum number of times the loop can be iterated. The minimum
number of iterations is often zero, but it need not be.

CASE 1: Single loop, Zero minimum, N maximum, No excluded values

1. Try bypassing the loop (zero iterations). If you can’t, you either have a bug, or zero is not
the minimum and you have the wrong case.

2. Could the loop-control variable be negative? Could it appear to specify a negative number
of iterations? What happens to such a value?

3. One pass through the loop.

4. Two passes through the loop.

5. A typical number of iterations, unless covered by a previous test.

6. One less than the maximum number of iterations.

7. The maximum number of iterations.

8. Attempt one more than the maximum number of iterations. What prevents the loop-
control variable from having this value? What will happen with this value if it is forced?

CASE 2: Single loop, Non-zero minimum, No excluded values

1. Try one less than the expected minimum. What happens if the loop control variable’s
value is less than the minimum? What prevents the value from being less than the
minimum?

2. The minimum number of iterations.

3. One more than the minimum number of iterations.

4. Once, unless covered by a previous test.


5. Twice, unless covered by a previous test.

6. A typical value.

7. One less than the maximum value.

8. The maximum number of iterations.

9. Attempt one more than the maximum number of iterations.

CASE 3: Single loops with excluded values

1. Treat single loops with excluded values as two sets of tests consisting of loops without
excluded values, as in cases 1 and 2 above.

2. For example, suppose the total range of the loop control variable is 1 to 20, but the values 7,
8, 9, 10 are excluded. The two sets of tests are 1-6 and 11-20.

3. The test cases to attempt would be 0,1,2,4,6,7 for the first range and 10,11,15,19,20,21
for the second range.
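A minimal sketch that enumerates the iteration counts suggested by the cases above for a single loop
(the ranges are illustrative):

    def loop_test_values(minimum: int, maximum: int, typical: int) -> list:
        candidates = [
            minimum - 1,    # attempt one less than the minimum (should be rejected)
            minimum,
            minimum + 1,
            2,              # two passes, unless already covered
            typical,
            maximum - 1,
            maximum,
            maximum + 1,    # attempt one more than the maximum
        ]
        return sorted(set(candidates))   # deduplicate and order the values

    print(loop_test_values(0, 20, 10))   # [-1, 0, 1, 2, 10, 19, 20, 21]

For loops with excluded values, the same function is simply applied once to each of the two sub-ranges.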

Kinds of Loops: There are only three kinds of loops with respect to path testing. These three
kinds are illustrated in Figure 2.2:

1. Nested Loops: The number of tests to be performed on nested loops grows roughly as the number of
single-loop tests raised to the power of the nesting depth, so we cannot always afford to test all
combinations of nested loops’ iteration values. Here’s a tactic used to discard some of these values:

1. Start at the inner most loop. Set all the outer loops to their minimum values.

2. Test the minimum, minimum+1, typical, maximum-1, and maximum for the innermost loop,
while holding the outer loops at their minimum iteration parameter values. Expand the tests as
required for out of range and excluded values.

3. If you’ve done the outermost loop, GOTO step 5, else move out one loop and set it up as in step
2 with all other loops set to typical values.

4. Continue outward in this manner until all loops have been covered.

5. Do all the cases for all loops in the nest simultaneously.


2. Concatenated Loops: Concatenated loops fall between single and nested loops with respect
to test cases. Two loops are concatenated if it’s possible to reach one after exiting the other
while still on a path from entrance to exit. If the loops cannot be on the same path, then they are
not concatenated and can be treated as individual loops.

3. Horrible Loops: A horrible loop is a combination of nested loops, the use of code that jumps
into and out of loops, intersecting loops, hidden loops, and cross-connected loops. This makes
iteration value selection for test cases an awesome and ugly task, which is another reason
such structures should be avoided.

Figure 2.2 Example of Loop types


Loop Testing Time: Any kind of loop can lead to long testing time, especially if all the extreme
value cases are to be attempted (Max-1, Max, Max+1).

 This situation is obviously worse for nested and dependent concatenated loops.

 Consider nested loops in which testing the combination of extreme values leads to long
test times. There are several options to deal with this:

 Prove that although the combined extreme cases are hypothetically possible, they are not
possible in the real world.

 Put in limits or checks that prevent the combined extreme cases. Then you have to test
the software that implements such safety measures.

3.9 Summary
Path testing based on structure is a powerful unit-testing tool. With suitable interpretation, it
can be used for system functional tests. The objective of path testing is to execute enough
tests to assure that, as a minimum, C1 + C2 have been achieved. Select paths as deviations
from the normal paths, starting with the simplest, most familiar, most direct paths from the
entry to the exit. Add paths as needed to achieve coverage. Add paths to cover extreme cases
for loops and combinations of loops.

3.10 Model Questions


1. What is a flow graph? Discuss about the flow graph elements.

2. Define: Path testing.

3. Explain in detail about the Path Testing Criteria.

4. What are loops and illustrate the three kinds of loops.


LESSON-4

ACHIEVABLE PATHS AND TRANSACTION FLOW TESTING
Structure

4.1 Introduction

4.2 Objectives

4.3 Achievable and Unachievable Paths

4.4 Path Instrumentation

4.5 Transaction Flow Testing

4.6 Transaction Flow Graphs

4.7 Transaction Flow Testing Techniques

4.8 Summary

4.9 Model Questions

4.1 Introduction
If you can find a solution, then the path is achievable. If you can’t find a solution to any of the
sets of inequalities, the path is unachievable. The act of finding a set of solutions to
the path predicate expression is called Path Sensitization.

Transaction flows are introduced as a representation of a system’s processing. The methods
that were applied to control flow graphs are then used for functional testing. The transaction
flow graph is used to create a behavioral model of the program that leads to functional testing.

4.2 Objectives
 To study the concepts behind achievable and unachievable paths.

 To explore the basics of path instrumentation and its types.

 To discuss the Transaction Flow Graphs and the Transaction Flow Testing Techniques.


4.3 Achievable and Unachievable Paths (Path Sensitizing)


The objective is to select and test just enough paths to achieve a satisfactory notion of test
completeness such as C1 + C2. Extract the program’s control flow graph and select a set of
tentative covering paths. For any path in that set, interpret the predicates along the path as
needed to express them in terms of the input vector. In general individual predicates are
compound or may become compound as a result of interpretation. Trace the path through,
multiplying the individual compound predicates to achieve a boolean expression such as (A+BC)
(D+E) (FGH) (IJ) (K) (L). Multiply out the expression to achieve a sum-of-products form:
ADFGHIJKL+AEFGHIJKL+BCDFGHIJKL+BCEFGHIJKL. Each product term denotes a set of
inequalities that if solved will yield an input vector that will drive the routine along the designated
path. Solve any one of the inequality sets for the chosen path and you have found a set of input
values for the path. If you can find a solution, then the path is achievable. If you can’t find a
solution to any of the sets of inequalities, the path is unachievable. The act of finding a set of
solutions to the path predicate expression is called Path Sensitization.
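For simple routines, sensitization can even be sketched mechanically: express the path predicates in
terms of the input vector and search for a solution. The predicates below are hypothetical.

    from itertools import product

    # Predicates for one chosen path, in terms of the input vector (x, y).
    predicates = [lambda x, y: x > 0, lambda x, y: x + y < 10]

    def sensitize(candidates=range(-5, 6)):
        # Brute-force search for an input vector that drives the chosen path.
        for x, y in product(candidates, repeat=2):
            if all(p(x, y) for p in predicates):
                return (x, y)      # the path is achievable with this input
        return None                # no solution found in the search space

    print(sensitize())             # e.g. (1, -5)

Real predicate expressions are usually solved by hand or with a constraint solver; the brute-force
search above only illustrates the idea.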

4.3.1 Heuristic Procedures for Sensitizing Paths

This is a workable approach: instead of selecting paths without considering how to sensitize them,
attempt to choose a covering path set that is easy to sensitize, and pick hard-to-sensitize paths
only as you must to achieve coverage. Identify all variables that affect the decision. Classify the
predicates as dependent or independent. Start the path selection with uncorrelated, independent
predicates. If coverage has not been achieved using independent uncorrelated predicates, extend
the path set using correlated predicates. If coverage has not been achieved extend the cases
to those that involve dependent predicates. Last, use correlated, dependent predicates.

4.4 Path Instrumentation


Path instrumentation is what we have to do to confirm that the outcome was achieved by the
intended path. Coincidental correctness means achieving the desired outcome for the wrong
reason. Figure 4.1 is an example of a routine that, for the (unfortunately) chosen input value
(X = 16), yields the same outcome (Y = 2) no matter which case we select. Therefore, the tests
chosen this way will not tell us whether we have achieved coverage. For example, the five
cases could be totally jumbled and still the outcome would be the same.


Figure 4.1 Coincidental Correctness
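The exact routine of Figure 4.1 is not reproduced here, but the effect is easy to sketch: for X = 16,
several quite different computations all happen to yield Y = 2 (the case bodies below are hypothetical
illustrations):

    import math

    X = 16
    cases = [
        lambda x: math.sqrt(math.sqrt(x)),   # 16 -> 2.0
        lambda x: x / 8,                     # 16 -> 2.0
        lambda x: math.log2(x) / 2,          # 16 -> 2.0
        lambda x: x % 7,                     # 16 -> 2
        lambda x: x - 14,                    # 16 -> 2
    ]
    print([f(X) for f in cases])             # all five agree at X = 16

A test that only checks the outcome for X = 16 cannot tell which case was taken, which is exactly why
instrumentation is needed.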

The types of instrumentation methods include:

1. Interpretive Trace Program: An interpretive trace program is one that executes every
statement in order and records the intermediate values of all calculations, the statement labels
traversed etc. If we run the tested routine under a trace, then we have all the information we
need to confirm the outcome and, furthermore, to confirm that it was achieved by the intended
path. The trouble with traces is that they give us far more information than we need. In fact, the
typical trace program provides so much information that confirming the path from its massive
output dump is more work than simulating the computer by hand to confirm the path.

2. Traversal Marker or Link Marker: A simple and effective form of instrumentation is called
a traversal marker or link marker. Name every link by a lower case letter. Instrument the links so
that the link’s name is recorded when the link is executed. The succession of letters produced
in going from the routine’s entry to its exit should, if there are no bugs, exactly correspond to the
path name.

Figure 4.2 Single Link Marker Instrumentation
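A minimal sketch of link marker instrumentation in Python (the routine and link names are
hypothetical): each link appends its lower-case letter to a trace, and the finished trace is compared with
the intended path name.

    trace = []                        # records link names as they execute

    def mark(link: str) -> None:
        trace.append(link)

    def routine(x: int) -> int:
        mark("a")                     # link a: entry to the decision
        if x > 0:
            mark("b")                 # link b: true branch
            x += 1
        else:
            mark("c")                 # link c: false branch
            x -= 1
        mark("d")                     # link d: junction to exit
        return x

    routine(5)
    print("".join(trace))             # "abd" must equal the intended path name

The double link marker method described below simply records a marker at both ends of each link
instead of one.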


Why Single Link Markers aren’t enough: Unfortunately, a single link marker may not do the
trick because links can be chewed by open bugs. Figure 4.3 represents the Single Link Marker.

Figure 4.3 Single Link Markers

We intended to traverse the ikm path, but because of a rampaging GOTO in the middle of the
m link, we go to process B. If coincidental correctness is against us, the outcomes will be the
same and we won’t know about the bug.

Two Link Marker Method:

The solution to the problem of the single link marker method is to implement two markers per link:
one at the beginning of each link and one at the end. The two link markers now specify the path
name and confirm both the beginning and end of the link; Figure 4.4 illustrates the Double
Link Marker.

Figure 4.4 Double Link Marker Instrumentation

Link Counter: A less disruptive (and less informative) instrumentation method is based on
counters. Instead of a unique link name to be pushed into a string when the link is traversed, we
simply increment a link counter. We now confirm that the path length is as expected. The same
problem that led us to double link markers also leads us to double link counters.


4.5 Transaction Flow Testing


A transaction is a unit of work seen from a system user’s point of view. A transaction consists
of a sequence of operations, some of which are performed by a system, persons or devices
that are outside of the system. A transaction begins with a birth, that is, it is created as a result
of some external act. At the conclusion of the transaction’s processing, the transaction is no
longer in the system.

Example of a transaction: A transaction for an online information retrieval system might consist
of the following steps or tasks:

• Accept input (tentative birth)

• Validate input (birth)

• Transmit acknowledgement to requester

• Do input processing

• Search file

• Request directions from user

• Accept input

• Validate input

• Process request

• Update file

• Transmit output

• Record transaction in log and clean up (death)

4.6 Transaction Flow Graphs


Transaction flows are introduced as a representation of a system’s processing. The methods that were applied to control flow graphs are then used for functional testing. Transaction flows and transaction flow testing are to the independent system tester what control flows and path testing are to the programmer. The transaction flow graph is used to create a behavioural model of the program that leads to functional testing. The transaction flow graph is a model of the structure of the system’s behaviour (functionality). An example of a transaction flow is in Figure 4.5.


Figure 4.5 An Example of a Transaction Flow

4.6.1 Applications

Transaction flows are indispensable for specifying requirements of complicated systems,


especially online systems. A big system such as an air traffic control or airline reservation
system, has not hundreds, but thousands of different transaction flows. The flows are
represented by relatively simple flow graphs, many of which have a single straight-through
path. Loops are infrequent compared to control flow graphs. The most common loop is used
to request a retry after user input errors. An ATM system, for example, allows the user to try, say
three times, and will take the card away the fourth time.
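As a sketch of such a retry loop (the three-try limit and the PIN check are illustrative assumptions, written in Python):

    MAX_TRIES = 3

    def pin_check(read_pin, correct_pin):
        for _ in range(MAX_TRIES):
            if read_pin() == correct_pin:
                return "AUTHENTICATED"    # the transaction continues on the main path
        return "CARD_RETAINED"            # after three failures the card is kept

    pins = iter(["1111", "2222", "4321"])
    print(pin_check(lambda: next(pins), "4321"))   # AUTHENTICATED on the third try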

4.6.2 Complications

In simple cases, the transactions have a unique identity from the time they are created to the
time they are completed. In many systems the transactions can give birth to others, and
transactions can also merge.


Births: There are three different possible interpretations of the decision symbol, or nodes with
two or more out links. It can be a Decision, Biosis or a Mitosis.

1. Decision: Here the transaction will take one alternative or the other alternative but not both.
(See Figure 4.6 (a))

2. Biosis: Here the incoming transaction gives birth to a new transaction, both transactions continue on their separate paths, and the parent retains its identity. (See Figure 4.6 (b))

3. Mitosis: Here the parent transaction is destroyed and two new transactions are created.(See
Figure 4.6 (c))

Figure 4.6 Nodes with multiple outlinks

4.6.3 Mergers

Transaction flow junction points are potentially as troublesome as transaction flow splits. There
are three types of junctions: (1) Ordinary Junction (2) Absorption (3) Conjugation

 Ordinary Junction: An ordinary junction which is similar to the junction in a control flow
graph. A transaction can arrive either on one link or the other. (See Figure 4.7 (a)).

 Absorption: In the absorption case, the predator transaction absorbs the prey transaction. The prey is gone, but the predator retains its identity. (See Figure 4.7 (b)).

 Conjugation: In the conjugation case, two parent transactions merge to form a new daughter. In keeping with the biological flavour, this case is called conjugation (See Figure 4.7 (c)).


Figure 4.7 Transaction Flow Junctions and Mergers

We have no problem with ordinary decisions and junctions. Births, absorptions, and conjugations
are as problematic for the software designer as they are for the software modeller and the test
designer; as a consequence, such points have more than their share of bugs. The common
problems are: lost daughters, wrongful deaths, and illegitimate births.
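One way to test for these problems is to check a transaction log. The following Python sketch (the event names and log format are assumptions) flags lost daughters, wrongful deaths, and illegitimate births:

    def check_log(log):
        alive, problems = set(), []
        for event, tid in log:
            if event == "birth":
                if tid in alive:
                    problems.append(f"illegitimate birth: {tid} born twice")
                alive.add(tid)
            elif event == "death":
                if tid not in alive:
                    problems.append(f"wrongful death: {tid} died without a birth")
                alive.discard(tid)
        problems += [f"lost daughter: {tid} never died" for tid in alive]
        return problems

    log = [("birth", "T1"), ("birth", "T2"), ("death", "T1")]
    print(check_log(log))    # ['lost daughter: T2 never died']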

4.7 Transaction Flow Testing Techniques


4.7.1 Get the Transaction Flows

Complicated systems that process a lot of different, complicated transactions should have explicit representations of the transaction flows, or the equivalent. Transaction flows are like control flow graphs, and consequently we should expect to have them in increasing levels of detail. The system’s design documentation should contain an overview section that details the main transaction flows. Detailed transaction flows are a mandatory prerequisite to the rational design of a system’s functional test.

4.7.2 Inspections, Reviews and Walkthroughs

Transaction flows are a natural agenda for system reviews or inspections. In conducting the walkthroughs, you should:

 Discuss enough transaction types to account for 98%-99% of the transactions the system is expected to process.

 Discuss paths through flows in functional rather than technical terms.

 Ask the designers to relate every flow to the specification and to show how that
transaction, directly or indirectly, follows from the requirements.


Make transaction flow testing the cornerstone of system functional testing, just as path testing is the cornerstone of unit testing. Select additional flow paths for loops, extreme values, and
domain boundaries. Design more test cases to validate all births and deaths. Publish and
distribute the selected test paths through the transaction flows as early as possible so that they
will exert the maximum beneficial effect on the project.

4.7.3 Path Selection

Select a set of covering paths (C1+C2) using the analogous criteria you used for structural
path testing. Select a covering set of paths based on functionally sensible transactions as you
would for control flow graphs. Try to find the most tortuous, longest, strangest path from the
entry to the exit of the transaction flow.

4.7.4 Path Sensitization

Most of the normal paths are very easy to sensitize: 80%-95% transaction flow coverage (C1+C2) is usually easy to achieve. The remaining small percentage is often very difficult. Sensitization is the act of defining the transaction. If there are sensitization problems on the easy paths, then suspect either a bug in the transaction flows or a design bug.

4.7.5 Path Instrumentation

Instrumentation plays a bigger role in transaction flow testing than in unit path testing. The
information of the path taken for a given transaction must be kept with that transaction and can
be recorded by a central transaction dispatcher or by the individual processing modules. In
some systems, such traces are provided by the operating systems or a running log.

4.7.6 Test Databases

About 30%-40% of the effort of transaction-flow test design is the design and maintenance of
the test database(s). It may be a third of the labour, but it carries a disproportionately high part
of the headaches. In order to avoid a repetition of the previous chaos, it is decided that there will
be one comprehensive database that will satisfy all testing needs. A typical system of a half-
million lines of source code will probably need four or five different, incompatible databases to
support testing. The design of these databases is no less important than the design of the
system data structures. It requires talented, mature, diplomatic, experienced designers –
experienced both in the system design and in test design.


4.7.7 Execution

To do transaction-flow testing for a system of any size, be committed to test execution automation from the start. If more than a few hundred test cases are required to achieve C1 + C2 transaction-flow coverage, you will have to run and rerun those transactions not once but hundreds of times over the project's life. Transaction-flow testing with the intention of achieving C1 + C2 usually leads to a big (four- to fivefold) increase in the number of test cases. Without test execution automation, it can't be done right. Capture/replay tools and other test drivers are essential.

4.8 Summary
The methods discussed for path testing of units and programs can be applied with suitable
interpretation to functional testing based on transaction flows. The biggest problem and the
biggest payoff may be getting the transaction flows in the first place. Full coverage (C1 + C2) is
required for all flows, but most bugs will be found on the strange, meaningless, weird paths.
Transaction-flow control may be implemented by means of an undeclared and unrecognized
internal language.

4.9 Model Questions


1. Define: Achievable paths.

2. Discuss Path Instrumentation and its types.

3. Write short notes on Transaction Flow Graphs.

4. Explain in detail about the Transaction Flow Testing Techniques.



DATA FLOW TESTING

Data Flow Testing


Data Flow Testing is a type of structural testing. It is a method used to find the test paths of a program according to the locations of definitions and uses of variables in the program. It has nothing to do with data flow diagrams. Furthermore, it is concerned with:
 Statements where variables receive values,
 Statements where these values are used or referenced.
Data Flow Testing uses the control flow graph to find the situations that can interrupt the flow of the program. Definition or reference anomalies in the flow of the data are detected at the time of associations between values and variables. These anomalies are (a short code sketch follows the list):
 A variable is defined but not used or referenced,
 A variable is used but never defined,
 A variable is defined twice before it is used
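A short, hypothetical Python fragment illustrating all three anomalies (actually calling it would raise a NameError, which is precisely the second anomaly):

    def anomalous(x):
        a = 1          # 'a' is defined here ...
        a = 2          # ... and defined again before any use (define-define anomaly)
        y = b + x      # 'b' is used but never defined (undefined-reference anomaly)
        c = y * 2      # 'c' is defined but never used afterwards (unused definition)
        return y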
STRATEGIES OF DATA FLOW TESTING:
INTRODUCTION:
o Data Flow Testing Strategies are structural strategies.
o In contrast to the path-testing strategies, data-flow strategies take into account what happens to data objects
on the links in addition to the raw connectivity of the graph.
o In other words, data flow strategies require data-flow link weights (d,k,u,c,p).
o Data Flow Testing Strategies are based on selecting test path segments (also called sub paths) that satisfy
some characteristic of data flows for all data objects.
o For example, all sub paths that contain a d (or u, k, du, dk).
o A strategy X is stronger than another strategy Y if all test cases produced under Y are included in those
produced under X - conversely for weaker.
TERMINOLOGY:
1. Definition-Clear Path Segment, with respect to variable X, is a connected sequence of links such that X is (possibly) defined on the first link and not redefined or killed on any subsequent link of that path segment. All paths in Figure 3.9 are definition-clear because variables X and Y are defined only on the first link (1,3) and not thereafter. In Figure 3.10, we have a more complicated situation. The following path segments are definition-clear: (1,3,4), (1,3,5), (5,6,7,4), (7,8,9,6,7), (7,8,9,10), (7,8,10), (7,8,10,11). Subpath (1,3,4,5) is not definition-clear because the variable is defined on (1,3) and again on (4,5). For practice, try finding all the definition-clear subpaths for this routine (i.e., for all variables).
2. Loop-Free Path Segment is a path segment for which every node in it is visited at most once. For Example,
path (4,5,6,7,8,10) in Figure 3.10 is loop free, but path (10,11,4,5,6,7,8,10,11,12) is not because nodes 10 and
11 are each visited twice.
3. Simple path segment is a path segment in which at most one node is visited twice. For example, in Figure
3.10, (7,4,5,6,7) is a simple path segment. A simple path segment is either loop-free or if there is a loop, only
one node is involved.
4. A du path from node i to k is a path segment such that if the last link has a computational use of X, then the
path is simple and definition-clear; if the penultimate (last but one) node is j - that is, the path is
(i,p,q,...,r,s,t,j,k) and link (j,k) has a predicate use - then the path from i to j is both loop-free and definition-
clear.
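To illustrate the terminology, here is a small Python sketch over a hypothetical flow graph (loosely modeled on the Figure 3.10 discussion: a definition on link (1,3), a redefinition on (4,5), a use on (5,6)); it enumerates entry/exit paths on which the variable is defined on the first link and never redefined afterwards:

    graph = {                       # node -> list of (successor, link actions)
        1: [(3, "d")],              # variable defined on link (1,3)
        3: [(4, ""), (5, "")],
        4: [(5, "d")],              # variable redefined on link (4,5)
        5: [(6, "u")],              # variable used on link (5,6)
        6: [],                      # exit node
    }

    def def_clear_paths(start):
        paths = []
        def walk(node, path, defined):
            if not graph[node]:
                paths.append(path)              # reached the exit
            for nxt, act in graph[node]:
                if defined and "d" in act:
                    continue                    # a redefinition kills the segment
                walk(nxt, path + [nxt], defined or "d" in act)
        walk(start, [start], False)
        return paths

    print(def_clear_paths(1))       # [[1, 3, 5, 6]]; the route through (4,5) is rejected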
STRATEGIES:
The structural test strategies discussed below are based on the program's control flow graph. They
differ in the extent to which predicate uses and/or computational uses of variables are included in the test set.
Various types of data flow testing strategies in decreasing order of their effectiveness are:
1. All-du-Paths (ADUP): The all-du-paths (ADUP) strategy is the strongest data-flow testing strategy discussed here. It requires that every du path from every definition of every variable to every use of that definition be exercised under some test.
For variable X and Y: In Figure 3.9, because variables X and Y are used only on link (1,3), any test
that starts at the entry satisfies this criterion (for variables X and Y, but not for all variables as required by the
strategy).
For variable Z: The situation for variable Z (Figure 3.10) is more complicated because the variable is
redefined in many places. For the definition on link (1,3) we must exercise paths that include subpaths (1,3,4)
and (1,3,5). The definition on link (4,5) is covered by any path that includes (5,6), such as sub path
(1,3,4,5,6, ...). The (5,6) definition requires paths that include sub paths (5,6,7,4) and (5,6,7,8).
For variable V: Variable V (Figure 3.11) is defined only once on link (1,3). Because V has a predicate use at node 12 and the subsequent path to the end must be forced for both directions at node 12, the all-du-paths strategy for this variable requires that we exercise all loop-free entry/exit paths and at least one path that includes the loop caused by (11,4).
Note that we must test paths that include both sub paths (3,4,5) and (3,5) even though neither of these
has V definitions. They must be included because they provide alternate du paths to the V use on link (5,6).
Although (7,4) is not used in the test set for variable V, it will be included in the test set that covers the
predicate uses of array variable V() and U.
The all-du-paths strategy is a strong criterion, but it does not take as many tests as it might seem at first
because any one test simultaneously satisfies the criterion for several definitions and uses of several different
variables.
2. All Uses Strategy (AU):
The all uses strategy is that at least one definition clear path from every definition of every variable to
every use of that definition be exercised under some test.
Just as we reduced our ambitions by stepping down from all paths (P) to branch coverage (C2), say, we
can reduce the number of test cases by asking that the test set should include at least one path segment from
every definition to every use that can be reached by that definition.
For variable V: In Figure 3.11, ADUP requires that we include sub paths (3,4,5) and (3,5) in some
test because subsequent uses of V, such as on link (5,6), can be reached by either alternative. In AU either
(3,4,5) or (3,5) can be used to start paths, but we don't have to use both. Similarly, we can skip the (8,10) link if
we've included the (8,9,10) sub path.
Note the hole. We must include (8,9,10) in some test cases because that's the only way to reach the c
use at link (9,10) - but suppose our bug for variable V is on link (8,10) after all? Find a covering set of paths
under AU for Figure 3.11.
3. All p-uses/some c-uses strategy (APU+C) :
For every variable and every definition of that variable, include at least one definition-clear path from the definition to every predicate use; if there are definitions of the variables that are not covered by the above prescription, then add computational-use test cases as required to cover every definition.
For variable Z: In Figure 3.10, for APU+C we can select paths that all take the upper link (12,13) and
therefore we do not cover the c-use of Z: but that's okay according to the strategy's definition because every
definition is covered. Links (1,3), (4,5), (5,6), and (7,8) must be included because they contain definitions for
variable Z. Links (3,4), (3,5), (8,9), (8,10), (9,6), and (9,10) must be included because they contain predicate
uses of Z. Find a covering set of test cases under APU+C for all variables in this example - it only takes two
tests.
For variable V:In Figure 3.11, APU+C is achieved for V by
(1,3,5,6,7,8,10,11,4,5,6,7,8,10,11,12[upper], 13,2) and (1,3,5,6,7,8,10,11,12[lower], 13,2). Note that the c-use
at (9,10) need not be included under the APU+C criterion.
4. All c-uses/some p-uses strategy (ACU+P) :
The all c-uses/some p-uses strategy (ACU+P) is to first ensure coverage by computational use cases
and if any definition is not covered by the previously selected paths, add such predicate use cases as are needed
to assure that every definition is included in some test.
For variable Z: In Figure 3.10, ACU+P coverage is achieved for Z by path (1,3,4,5,6,7,8,10,
11,12,13[lower], 2), but the predicate uses of several definitions are not covered. Specifically, the (1,3)
definition is not covered for the (3,5) p-use, the (7,8) definition is not covered for the (8,9), (9,6) and (9, 10) p-
uses.
The above examples imply that APU+C is stronger than branch coverage but ACU+P may be weaker
than, or incomparable to, branch coverage.
5. All Definitions Strategy (AD):
The all definitions strategy asks only every definition of every variable be covered by at least one use
of that variable, be that use a computational use or a predicate use.
For variable Z: Path (1,3,4,5,6,7,8, . . .) satisfies this criterion for variable Z, whereas any entry/exit
path satisfies it for variable V. From the definition of this strategy we would expect it to be weaker than both
ACU+P and APU+C.
6. All Predicate Uses (APU), All Computational Uses (ACU) Strategies :
The all predicate uses strategy is derived from APU+C strategy by dropping the requirement that we
include a c- use for the variable if there are no p-uses for the variable. The all computational uses strategy is
derived from ACU+P strategy by dropping the requirement that we include a p-use for the variable if there are
no c-uses for the variable. It is intuitively obvious that ACU should be weaker than ACU+P and that APU
should be weaker than APU+C.
ORDERING THE STRATEGIES:
Figure 3.12 compares path-flow and data-flow testing strategies. The arrows denote that the strategy at the arrow's tail is stronger than the strategy at the arrow's head.
o The right-hand side of this graph, along the path from "all paths" to "all statements" is the more
interesting hierarchy for practical applications.
o Note that although ACU+P is stronger than ACU, both are incomparable to the predicate-biased
strategies. Note also that "all definitions" is not comparable to ACU or APU.
SLICING AND DICING:
o A (static) program slice is a part of a program (e.g., a selected set of statements) defined with respect to a given variable X (where X is a simple variable or a data vector) and a statement i: it is the set of all statements that could (potentially, under static analysis) affect the value of X at statement i, where the influence of a faulty statement could result from an improper computational use or predicate use of some other variables at prior statements. (A sketch of a simple static slice follows this list.)
o If X is incorrect at statement i, it follows that the bug must be in the program slice for X with respect to i.
o A program dice is a part of a slice in which all statements which are known to be correct have been removed.
o In other words, a dice is obtained from a slice by incorporating information obtained through testing or experiment (e.g., debugging).
o The debugger first limits her scope to those prior statements that could have caused the faulty value at statement i (the slice) and then eliminates from further consideration those statements that testing has shown to be correct.
o Debugging can be modeled as an iterative procedure in which slices are further refined by dicing, where the dicing information is obtained from ad hoc tests aimed primarily at eliminating possibilities. Debugging ends when the dice has been reduced to the one faulty statement.
o Dynamic slicing is a refinement of static slicing in which only statements on achievable paths to the statement in question are included.
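A minimal sketch of a static slice over straight-line code, assuming each statement is given as (line number, variable defined, variables used):

    def slice_for(stmts, var, at_line):
        wanted, sliced = {var}, []
        for line, defined, used in reversed([s for s in stmts if s[0] <= at_line]):
            if defined in wanted:
                sliced.append(line)
                wanted |= set(used)     # whatever this statement used can matter too
        return sorted(sliced)

    stmts = [
        (1, "x", []),       # x = input()
        (2, "y", []),       # y = input()
        (3, "a", ["x"]),    # a = x + 1
        (4, "b", ["y"]),    # b = y * 2
        (5, "z", ["a"]),    # z = a - 3
    ]
    print(slice_for(stmts, "z", 5))     # [1, 3, 5]: statement 4 cannot affect z

Dicing would then strike from this slice any statements that testing has already shown to be correct.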
Advantages of Data Flow Testing:
 Data Flow Testing is used to find the following issues-
 To find a variable that is used but never defined,
 To find a variable that is defined but never used,
 To find a variable that is defined multiple times before it is used,
 Deallocating a variable before it is used.
Disadvantages of Data Flow Testing
 Time consuming and costly process
 Requires knowledge of programming languages
Example:
1. read x, y;
2. if(x>y)
3. a = x+1
else
4. a = y-1
5. print a;
Control flow graph of the above example: nodes 1-5 correspond to the statements; the edges are 1->2, 2->3 (when x > y), 2->4 (otherwise), 3->5, and 4->5.
Domain Testing:
It is a software testing technique in which a minimum number of inputs is used to assess the appropriate output of a system, to ensure that the system does not accept invalid input values. The system is expected to give the required outputs while blocking the invalid inputs.
Importance of Domain Testing
 Protection of Input Space: A software application’s complete input area should be sufficiently covered,
and domain testing helps to verify this. It seeks to identify possible problems with data handling and
processing by testing particular domains or ranges of input values.
 Error detection: It works well for identifying mistakes or irregularities that could happen in particular
input domains. Through the focus of testing efforts on pertinent subsets, domain testing can identify
issues that may be missed in more general, random testing situations.
 Preventing Bugs: Software can be designed with fewer vulnerabilities and critical situations if developers have a better understanding of the properties and limits of input domains. By taking a proactive approach to input domain consideration, issues may be avoided before they even arise.
 Enhanced Efficiency of Tests: Domain testing aids in organizing testing efforts according to the
software’s most important and pertinent sections. Because more resources are allocated to testing
scenarios that are more likely to uncover significant faults, test effectiveness is raised as a result.
Structure of Domain Testing

 The process is quite similar everywhere when it comes to building the strategy, where the following step-by-step structure is used that suits most of the scenarios:
 Determine the Domain: To comprehend the needs and requirements of the system, the testing team first
works with stakeholders. They establish the precise domain or set of circumstances that the software is
supposed to function in by doing this.
 Split the Domain: After the domain has been located, it is split up into more manageable sections. The
different sets of inputs, states or situations that the software will experience while operating are
represented by these divisions. This stage guarantees thorough coverage and aids in organizing the
testing effort.
 Choose Test Cases: The testing team chooses representative test cases based on the divided domains.
With the goal of covering a range of scenarios within each partition, these test cases offer a
comprehensive analysis of the behavior of the software under diverse circumstances.
 Design Test Data: Within each partition, test data is created to mimic real-world situations. This entails
selecting values, such as boundary values and potentially error-prone inputs, that are likely to be
encountered during actual usage.
 Run Test Cases: The provided test data is used to run the chosen test cases. The testing team watches
how the software behaves during this phase and contrasts it with the outcomes that are anticipated
based on the requirements and specifications.
 Boundary Value Analysis: Testing at the input domain’s boundaries receives particular attention.
Given that errors frequently arise in these extreme circumstances, this also requires checking values at
each partition’s lower and higher borders.
 Error Handling: Within the specified domain, the testing team confirms how the program handles erroneous or unexpected input. It involves making sure that the program handles exceptions gracefully and doesn’t crash, as well as looking for the proper error messages.
 Automation: Depending on the situation, the domain’s complicated or repetitive scenarios may benefit
from automation. In particular, automated testing methods can be used in situations when manual
testing might not be feasible to improve efficiency, repeatability, and coverage.
Domain Knowledge
Domain knowledge is a good understanding of a particular sphere, i.e., a person is acquainted with a particular field or discipline. It helps to minimize the delivery cycle, improve customer service, and reduce development time.
Is Domain knowledge required for Domain testing?
It is difficult for someone to perform effectively in a field where the person is not familiar. So a domain tester
should have basic domain knowledge. It is important because:
Online banking- A tester must be an expert in online banking activities like login, bill payment, and transfers.
Retail domains- To successfully run a domain test, the tester has to recognize how the workflow operates at different levels. Some examples of retail domains are warehouse management, in-store solutions, etc.
Healthcare- A tester with a proper understanding of domain knowledge should handle a healthcare system. It
is a huge risk to someone’s life when someone with zero knowledge handles the system.
A real-life example of Domain testing
Let there be a group of students on a study tour. For entertainment purposes, they have been given a
ticket to perform a specific activity based on gender and age inputs. Here the entertainment facility acts as the system under test, and the age groups provide the boundary values, with numerous possible scenarios. Students perform activities in the following manner:
 Children less than 5 years old are to tell a poem
 Boys aged 5 to 10 are to draw
 Girls aged 5 to 10 are to sing a song
 Boys aged 10 to 15 are to compete in a sport
 Girls aged 10 to 15 are to participate in a quiz
 The remaining children, above 15, are to participate in an essay competition
Based on the given algorithm, the specialist groups the values into classes i.e., age groups, and then
boundary values are picked i.e., highest and lowest age values in a group. Then different scenarios are built
with expected results for each.
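A Python sketch of the classifier and its boundary-value tests, under the age classes as read above (the exact closure of each age class is an assumption):

    def activity(age, gender):
        if age < 5:
            return "poem"
        if age <= 10:
            return "draw" if gender == "boy" else "sing"
        if age <= 15:
            return "sport" if gender == "boy" else "quiz"
        return "essay"

    # Boundary values: the lowest and highest age in each class and their neighbours.
    for age in [4, 5, 10, 11, 15, 16]:
        print(age, activity(age, "boy"), activity(age, "girl"))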

Advantages of Domain Testing


 Effective Utilization of Testing Materials: Domain testing ensures that testing resources are used
efficiently by enabling focused testing on particular ranges or domains. With this focused approach,
important software domains are prioritized.
 Boosts Test Coverage: Domain testing improves test coverage by testing a broad range of variables inside a domain. This is crucial to guaranteeing thorough testing of the software, particularly in regions where problems are more likely to occur.
 Enhances the Quality of Software: Better software quality is a result of finding and resolving edge case
problems. Domain testing contributes to the development of more durable and dependable software by
addressing weaknesses in important scenarios.
 Cost-Effective Testing Technique: As domain testing concentrates on particular input ranges and
utilizes fewer resources than comprehensive testing, it is thought to be more cost effective. This makes
it a useful method for identifying important problems within certain fields.
Disadvantages of Domain Testing
 False Sense of Security: Supposing domain testing alone is sufficient could give an illusion of security. Although it works well in certain areas, it might miss problems outside those limits, which could result in mistakes.
 Difficult to Determine Domain Boundaries: Establishing precise domain boundaries can be difficult. A
complete understanding of the system is necessary to determine the boundaries of input domains, and
inaccurate boundary definitions could result in insufficient testing.
 Might Miss Complex Problems: Domain testing might not be appropriate for detecting intricate
problems involving the interplay of various variables or system components. In such cases, advanced
testing methods might be needed.
 Not Fit for Every System: Domain testing alone might not be sufficient to test certain systems,
especially those with complicated input and output parameters or those requiring more advanced testing
methods.
UNIT –III
DOMAIN TESTING
(1) Domains and paths:
(i) The Model:
 Domain testing can be based on specifications and/or equivalent implementation
information.
 If domain testing is based on specifications, it is a functional test technique; if based on
implementations, it is a structural technique.
 Domain testing is applied to one input variable or to simple combinations of two variables,
based on specifications.
 The schematic representation of Domain testing is given below.

[Schematic: INPUT -> CLASSIFY -> DO CASE 1 / DO CASE 2 / DO CASE 3 / ... / DO CASE n -> OUTPUT]

 First, the different input variables are provided to the program.
 The classifier receives all input variables and divides them into different cases.
 For every case there should be at least one path to process that specified case.
 Finally, output is produced by these DO CASE blocks.
(ii) A domain is a set:
 An input domain is a set. If the source language supports set definitions, less testing is needed because the compiler (at compile time and run time) does much of it for us.
(iii) Domains, paths and predicates:
 In domain testing, predicates are assumed to be interpreted in terms of input vector
variables.
 If domain testing is applied to structure (implementation), then predicate interpretation must
be based on control flowgraph.
 If domain testing is applied to specifications, then predicate interpretation is based on data
flowgraph.
 For every domain there is at least one path through the routine.
 There may be more than one path if the domain consists of disconnected parts.
 Unless stated otherwise, we’ll assume that domains consist of a single, connected part.
 We’ll also assume that the routine has no loops.
 Domains are defined by their boundaries. For every boundary there is at least one
predicate.
 For example, in the statement IF X > 0 THEN ALPHA ELSE BETA, we know that numbers greater than zero belong to ALPHA and numbers less than or equal to zero belong to BETA.
Review:
1. A domain is a loop free program.
2. For every domain there is at least one path through the routine.
3. The set of interpreted predicates defines the domain boundaries.
(iv) Domain Closure:
 To understand the domain closure, consider the following figure.

[Figure: domains D1, D2, D3 on the number line with boundaries at MIN and MAX: (a) both sides closed, (b) one side open, (c) both sides open]

 If the domain boundary point belongs to the same domain, then the boundary is said to be closed. If the domain boundary point belongs to some other domain, then the boundary is said to be open.
 In the above figure there are three domains D1, D2, D3.
 In figure a D2’s boundaries are closed both at the minimum and maximum values. If D2 is
closed, then the adjacent domains D1 and D3 must be open.
 In figure b D2 is closed on the minimum side and open on the maximum side, meaning
that D1 is open and D3 is closed. In figure c D2 is open on both sides, which mean that the
adjacent domains D1 and D3 must be closed.
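A small Python sketch of case (b), with hypothetical boundary values: D2 is closed at MIN and open at MAX, so D1 must be open and D3 closed:

    MIN, MAX = 0, 17                 # hypothetical boundary values

    def classify(x):
        if x < MIN:
            return "D1"              # D1 is open at MIN (MIN itself belongs to D2)
        if x < MAX:
            return "D2"              # closed on the minimum side, open on the maximum
        return "D3"                  # D3 is closed at MAX (MAX belongs to D3)

    assert classify(MIN) == "D2" and classify(MAX) == "D3"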
(v) Domain Dimensionality:
 Depending on the input variables, the domains can be classified as number-line domains, planar domains, or solid domains.
 That is, for one input variable the domain lies on the number line, for two variables the domain is planar, and for three variables the domain is solid.
 One important thing to note is that the domain's dimensionality should not be confused with the number of predicates, because a boundary may be defined by one or more boundary predicates.
(vi) The Bug Assumptions:
 The bug assumption for domain testing is that processing is okay but the domain definition
is wrong.
 An incorrectly implemented domain means that boundaries are wrong, which mean that
control-flow predicates are wrong.
 The following are some of the bugs that give rise to domain errors.
(a) Double-Zero Representation:
 Boundary errors for negative zero occur frequently in computers or programming
languages where positive and negative zeros are treated differently.
(b) Floating-Point Zero Check:
 A floating-point number can equal zero only if it was previously set to zero, or if it is subtracted from itself or multiplied by zero.
 Floating-point zero checks should always be done within a small interval.
(c) Contradictory Domains:
 Here at least two assumed distinct domains overlap.
(d) Ambiguous Domains:
 These are missing domain, incomplete domain.
(e) Over specified Domains:
 The domain can be overloaded with so many conditions.
(f) Boundary Errors:
 This error occurs when the boundary is shifted or when the boundary is tilted or missed.
(g) Closure Reversal
 This bug occurs when we have selected the wrong predicate such as x>=0 is written as
x<=0.
(h) Faulty Logic:
 This bug occurs when there are incorrect manipulations, calculations or simplifications
in a domain.
(vii) Restrictions:
(a) General
 Domain testing has restrictions; i.e., we cannot use domain testing if they are violated.
 In testing there are no invalid tests, only unproductive tests.
(b) Coincidental Correctness
 Coincidental correctness is assumed not to occur.
 Domain testing is not good at detecting cases in which the outcome is correct for the wrong reason.
 One important point to note is that domain testing does not support Boolean outcomes (TRUE/FALSE).
 If the outputs are discrete values, then there is some chance of coincidental correctness.
(c) Representative Outcome
 Domain testing is an example of partition testing.
 Partition testing divide the program’s input space into domains.
 If the selected input is shown to be correct by a test, then processing is correct, and
inputs within that domain are expected to be correct.
 Most test techniques, functional or structural fall under partition testing and therefore
make this representative outcome assumption.
(d) Simple Domain Boundaries and Compound Predicates
 Each boundary is defined by a simple predicate rather than by a compound predicate.
 Compound predicates in which each part of the predicate specifies a different boundary
are not a problem: for example, x >= 0 .AND. x < 17, just specifies two domain
boundaries by one compound predicate.
(e) Functional Homogeneity of Bugs
 Whatever the bug is, it will not change the functional form of the boundary predicate.
(f) Linear Vector Space
 A linear predicate is defined by a linear inequality using only the simple relational operators >, >=, =, <=, <>, and <.
 A predicate such as x^2 + y^2 > a^2 is nonlinear.
(g) Loop-free Software
 Loops (indefinite loops) are problematic for domain testing.
 If a loop is an overall control loop on transactions, say, there’s no problem.
 If the loop is definite, then domain testing may be useful for the processing within the
loop, and loop testing can be applied to the looping values.
(2) Nice Domains:
(i) Where Do Domains Come From?
 Domains are often created by salesmen or politicians.
 The first step in applying domain testing is to get consistent and complete domain
specifications.
(ii) Specified versus Implemented Domains:
 Implemented domains can’t be incomplete or inconsistent but specified domains can be
incomplete or inconsistent.
 Incomplete means that there are input vectors for which no path is specified and
inconsistent means that there are at least two contradictory specifications.
(iii) Nice Domains:
(1) General
 The representation of Nice two-dimensional domains is as follows. .
[Figure: nice two-dimensional domains: boundary sets U1..U5 and V1..V3 partition the plane into domains D11..D35]

 The U and V represent boundary sets and D represents domains.


 The boundaries have several important properties. They are linear, complete,
systematic, orthogonal, consistently closed, simply connected and convex.
 If domains have these properties, domain testing is very easy; otherwise, domain testing is tough.
(2) Linear and Nonlinear Boundaries
 Nice domain boundaries are defined by linear inequalities or equations.
 The effect on testing stems from the fact that it takes only two points to determine a straight line, three points to determine a plane, and in general n + 1 points to determine an n-dimensional hyperplane.
 Linear boundaries are more frequently used than the non-linear boundaries.
 We can linearize the non-linear boundaries by using simple transformations.
(3) Complete Boundaries
 Complete boundaries are those boundaries which do not have any gap between them.
 Nice domain boundaries are complete boundaries because they cover from plus infinity
to minus infinity in all dimensions.
 Incomplete boundaries are those boundaries which consist of some gaps between them
and are not covered in all dimensions.
 The following figure represents some incomplete boundaries.
[Figure: incomplete boundaries A and E (with gaps) and complete boundaries B, C, D]
 The Boundaries A and E have gaps so they are incomplete & the boundaries B, C, D
are complete.
 The main advantage of a complete boundary is that it requires only one set of tests to verify the boundary.
(4) Systematic Boundaries
 Systematic boundaries refer to boundary inequalities with simple mathematical
functions such as a constant.
 Consider the following relations:

f1(X) >= k1    or    f1(X) >= g(1,c)
f2(X) >= k2          f2(X) >= g(2,c)
...                  ...
fi(X) >= ki          fi(X) >= g(i,c)

 where fi is an arbitrary linear function, X is the input vector, ki and c are constants, and g(i,c) is a decent function that yields a constant, such as k + ic.
(5) Orthogonal Boundaries
 The U and V boundary sets in the nice two-dimensional domains figure are orthogonal; that is, every boundary in V is perpendicular to every boundary in U.
 If two boundary sets are orthogonal, then they can be tested independently.
 If we tilt the boundaries, we must also test their intersection points, which changes the linear growth O(n) into quadratic growth O(n^2).
 If we tilt the boundaries to get the following figure, then we must test the intersections.

(6) Closure Consistency


 Consistent closures are the simplest and most fundamental closures.
 They give consistent and systematic results.
 The following figure shows boundary closures that are consistent.
[Figure: consistent boundary closures: parallel boundaries y = k1 + bx, y = k2 + bx, y = k3 + bx crossed by boundaries x = A1 through x = A5]
 In the figure, the shaded lines show one boundary set and the thick lines the other.
 It shows non-orthogonal domain boundaries, meaning that the inequalities in x are not all perpendicular to the inequalities in y.
(7) Convex
 A figure is convex if, for any two points within it, all the points on the straight line joining them also lie within the figure.
 Nice domains are convex; dirty domains often aren't.
(8) Simply Connected
 Nice domains are usually simply connected because they are in one piece, not dispersed among other domains.
 Simple connectivity is a weaker requirement than convexity; if a domain is convex it is
simply connected, but not vice versa.
(iv) Ugly Domains:
(a) General
 Some domains are born ugly. Some domains are bad specifications.
 So every simplification of ugly domains by programmers can be either good or bad.
 If the ugliness results from bad specifications and the programmer’s simplification is
harmless, then the programmer has made ugly good.
 But if the domain's complexity is essential, such simplifications introduce bugs.
(b) Nonlinear Boundaries
 Nonlinear boundaries are rare in ordinary programming, and there is little information on how programmers err with such boundaries.
 So if a domain boundary is nonlinear, programmers usually make it linear.
(c) Ambiguities and Contradictions:

[Figure: (a) ambiguity: holes in the domain; (b) ambiguity: missing boundary; (c) contradiction: overlapped domains; (d) contradiction: dual closure]
 Domain ambiguity is a missing or incomplete domain boundary.
 In the figure, the domain ambiguities are holes in the A domain and a missing boundary in the B domain.
 An ambiguity in one variable is easy to see.
 An ambiguity in two variables can be difficult to spot.
 An ambiguity in three or more variables is impossible to spot without tools.
 Overlapping domains and overlapping domain closures are called contradictions.
 Two types of contradictions are possible here:
(1) Overlapped domain specifications
(2) Overlapped closure specifications
 In the figure there is an overlapped domain, and there is a dual closure contradiction, which is actually a special kind of overlap.
(d) Simplifying the Topology
 Connecting disconnected boundary segments and extending boundaries is called
simplifying the topology
 There are three generic cases of simplifying the topology.

[Figure: three generic cases: (a) making it convex, (b) filling in the holes, (c) joining the pieces]


 Programmers introduce bugs and testers misdesign test cases by smoothing out concavities, filling in holes, and joining disconnected pieces.
(e) Rectifying Boundary Closures
 Making boundary closures that point in different directions consistent, so that all domain boundaries are closed in one direction, is called rectifying boundary closures.

[Figure (a): consistent closure direction]

[Figure (b): inclusion/exclusion consistency]
 In the figure, the hyperplane boundary that was outside is brought inside; making inclusion and exclusion consistent in this way is called inclusion/exclusion consistency.
(3) Domain Testing:
(i) Overview:
 Domains are defined by their boundaries, so domain testing concentrates test points on or near boundaries.
 Find what can go wrong with boundaries, and then define a test strategy.
 Because every boundary serves at least two different domains, test points used to check one domain can also be used to check adjacent domains.
 Run the tests, and determine if any boundaries are faulty.
 Run enough tests to verify every boundary of every domain.
(ii) Domain Bugs and How to Test for Them:
(a) General:
[Figure: extreme point, boundary point, interior point, and epsilon neighborhood]
 An interior point is a point in the domain such that all points within some small distance of it are also in the domain.
 This distance is known as the epsilon neighborhood.
 A boundary point is a point within whose every epsilon neighborhood there are points both inside and outside the domain.
 An extreme point is a point that does not lie between any other two points of the domain.
[Figure: on points and off points]
 An on point is a point on the boundary. An off point is outside the boundary.


 If the domain boundary is closed, an off point is a point near the boundary but in the
adjacent domain.
 If the domain boundary is open, an off point is a point near the boundary but in the
same domain.
 Here we have to remember CLOSED OFF OUTSIDE, OPEN OFF INSIDE
 i.e. COOOOI
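A Python sketch of this rule for the closed boundary x >= k of a domain A (k and the epsilon are hypothetical):

    k, eps = 10, 0.001

    def in_A(x):
        return x >= k                 # the boundary x = k is closed for A

    on_point = k                      # on the boundary; belongs to A because it is closed
    off_point = k - eps               # closed boundary: the off point lies outside A
    assert in_A(on_point) and not in_A(off_point)   # closed off outside, open off inside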
 The following figure shows the generic ways in which a domain boundary can be wrong.

[Figure: a correct boundary versus shifted, tilted, extra, and missing boundaries, and open/close errors]

(b) Testing One-Dimensional Domains:


 The following figure shows one dimensional domain bugs for open boundaries.
[Figure: one-dimensional domain bugs, open boundary: (a) an open domain A; (b) closure bug; (c) boundary shifted left; (d) boundary shifted right; (e) missing boundary; (f) extra boundary; X marks the test points]

 In figure (a) we assume that the boundary was to be open for A.
 In figure (b) one test point (marked X) on the boundary detects the bug.
 In figure (c) the boundary is shifted left.
 In figure (d) the boundary is shifted right.
 In figure (e) there is a missing boundary. In figure (f) there is an extra boundary.
 The following figure shows one dimensional domain bugs for closed boundaries.
[Figure: one-dimensional domain bugs, closed boundary: (a) a closed domain A; (b) closure bug; (c) boundary shifted left; (d) boundary shifted right; (e) missing boundary; (f) extra boundary; X marks the test points]

 In figure (a) we assume that the boundary was to be closed for A.
 In figure (b) one test point (marked X) on the boundary detects the bug.
 In figure (c) the boundary is shifted left. In figure (d) the boundary is shifted right.
 In figure (e) there is a missing boundary. In figure (f) there is an extra boundary.
 The only difference from the previous diagram is that here the boundaries are closed.
(c) Testing Two-Dimensional Domains:
 The following figure shows domain boundary bugs for two-dimensional domains.
 A and B are adjacent domains; the boundary is closed with respect to A and open with respect to B.
(i) Closure Bug:
 Figure (a) shows a wrong closure, caused by using the wrong operator: for example, x >= k was used when x > k was intended.
 The two on points detect this bug.
(ii) Shifted Boundary:
 In figure (b) the boundary is shifted up, which converts part of domain B into A'.
 This is caused by an incorrect constant in a predicate: for example, x + y >= 17 was used when x + y >= 7 was intended. Similarly, figure (c) shows a shift down.
[Figure: two-dimensional domain boundary bugs: (a) closure bug; (b) boundary shifted up; (c) boundary shifted down; (d) tilted boundary; (e) extra boundary; (f) missing boundary; X marks the test points]

(iii) Tilted boundary:
 A tilted boundary occurs when coefficients in the boundary inequality are wrong.
 For example, we used 3x + 7y > 17 when 7x + 3y > 17 was needed.
 Figure (d) shows a tilted boundary, which creates domain segments A' and B'.
(iv) Extra Boundary:
 An extra boundary is created by an extra predicate.
 Figure (e) shows an extra boundary. The extra boundary is caught by two on points.
(v) Missing Boundary:
 A missing boundary is created by leaving out a predicate.
 A missing boundary, shown in figure (f), is caught by two on points.
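The following Python sketch shows how on points catch a shifted boundary (the intended and buggy predicates are hypothetical):

    def intended(x, y):
        return x + y >= 7             # the boundary the specification calls for

    def buggy(x, y):
        return x + y >= 17            # shifted boundary: wrong constant

    eps = 0.001
    tests = [(3, 4), (7, 0),          # two on points of the intended boundary
             (3, 4 - eps)]            # one off point just outside the closed boundary

    for x, y in tests:
        if intended(x, y) != buggy(x, y):
            print(f"boundary bug detected at ({x}, {y})")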
 The following figure summarizes domain testing for two-dimensional domains.

[Figure: summary of two-dimensional domain testing: two on points (closed circles) per boundary segment and one off point (open circle)]

 Note that the selected test points are shared with adjacent domains.
 The on points for two adjacent boundary segments can also be shared.

[Figure: shared on points for adjacent boundary segments]

(d) Equality and Inequality Predicates:


 Equality predicates are defined by an equality equation such as x + y = 12.
 Equality predicates support only a few domain boundaries.

[Figure: adjacent planar domains A and B separated by the line domain C; test points a, b, c, c', d]

 Inequality predicates are defined by an inequality such as x + y > 12 or x + y < 12.
 Inequality predicates support most domain boundaries.
 In domain testing, an equality predicate in two variables defines a line domain; with three variables it defines a planar domain.
 Test points for inequality predicates are obtained by taking the adjacent domains into consideration.
 In the figure, the domains A and B are planar and the domain C is a line.
 Here domain testing is done with two on points and two off points:
 test point b for B, test point a for A, and test points c and c' for C.
(e) Random Testing:
 Random testing is a form of functional testing that is useful when the time needed to write and run directed tests is too long.
 One of the big issues of random testing is knowing when a test fails.
 When doing random testing we must ensure that the tests cover the specification.
 Random testing is less efficient than directed testing, and it requires random test generators.
(f) Testing n-Dimensional Domains:
 For domains defined over an n-dimensional input space with p boundary segments, the same strategy generalizes to testing n-dimensional domains: each boundary segment is tested with n on points and one off point.
(iii) Procedure:
 Generally domain testing can be done by hand for two dimensions.
 Without tools the strategy is practically impossible for more than two variables.
1. Identify the input variables.
2. Identify variables which appear in domain predicates.
3. Interpret all domain predicates in terms of input variables.
4. For p binary predicates there are up to 2^p domains.
5. Solve the inequalities to find all the extreme points of each domain.
6. Use the extreme points to solve for nearby on points.
(iv) Variations, Tools, Effectiveness:
 Variations can vary the number of on and off points or use the extreme points.
 The basic domain-testing strategy discussed here is called the N x 1 strategy, because it uses N on points and one off point.
 Cost-effectiveness studies of domain testing use partition analysis, which combines domain testing with computation verification and uses both structural and functional information.
 Some specification tools are also used in domain testing.
(4) Domains and Interface Testing:
(i) General:
 Domain testing plays a very important role in integration testing, where we examine the interfaces of the different components.
 We can determine whether the components fit together accurately.
(ii) Domains and Range:
 Domains are the input values used; the range is the opposite of the domain,
 i.e., the range is the output obtained.
 Most testing techniques put more emphasis on the input values,
 because with the help of the input values it is easy to identify the output.
 But interface testing puts more emphasis on the output values.
 An interface test consists of exploring the correctness of the following mappings:

caller domain -> caller range
caller range -> called domain
called domain -> called range
(iii) Closure Compatibility:
 Assume that the caller's range and the called domain span the same numbers, say 0 to 17.
 Closure compatibility covers the four cases in which the caller's range closure and the called's domain closure agree.
 The four cases consist of domains that are closed on both top (17) and bottom (0), open top and closed bottom, closed top and open bottom, and open top and bottom.
 In the figure, a thick line represents closed and a thin line represents open.

[Figure: the four compatible caller/called closures over the span 0 to 17]

 The following figure shows the twelve different ways the caller and the called can disagree about closure. Not all of them are necessarily bugs.

[Figure: the twelve closure disagreements between caller and called]

 Here the four cases in which a caller boundary is open and the called is closed are not
buggy.
(iv) Span Compatibility:
 The following figure shows three possibly harmless span incompatibilities, in which the caller's span is smaller than the called's.

[Figure: three harmless span incompatibilities: caller span (3 to 7) lies inside the called domain (1 to 9)]

 The range of the caller is a subset of the called domain. That is not necessarily a bug.
 The following figure shows the opposite case, in which the called's span is smaller than the caller's.
[Figure: called span (3 to 7) smaller than the caller span (1 to 9)]
(v) Interface Range/ Domain Compatibility Testing:
 The application of domain testing is also very important for interface testing because it tests the range and domain compatibilities between caller and called routines.
 It is the responsibility of the caller to provide valid inputs to the called routine.
 After getting the valid input, the test is done on every input variable.
(vi) Finding the values:
 Start with the called routine’s domains and generate test points.
 A good component test should have included all the interesting domain-testing cases.
 Those test cases are the values for which we must find the input values of the caller.
(5) Domains and Testability:
(i) General:
 Domain testing gives orthogonal domain boundaries, consistent closure, independent
boundaries, linear boundaries, and other characteristics. We know that which makes
domain testing difficult. That is it consists of applying algebra to the problem.
(ii) Linearizing Transformations:
 This is used to transfer non linear boundaries to equivalent linear boundaries.
 The different methods used here are
(i)Polynomials:
 A boundary is specified by a polynomial or multinomial in several variables.
 For a polynomial each term can be replaced by a new variable.
 i.e. x, x2, x3, …can be replaced by y1 = x, y2 = x2, y3 = x3 , …
 For multinomials you add more new variables for terms such as xy, x 2y, xy2, …
 So polynomial plays an important role in linear transformations.
(ii)Logarithmic Transforms:
 Products such as xyz can be linearized by substituting u = log (x), v = log (y), log (z).
 The original predicate xyz > 17 now becomes u + v + w > 2.83.
(iii)More general forms:
 Apart from logarithmic transform & polynomials there are general linearizable forms
such as x / (a + b) and axb. We can also linearize by using Taylor series.
(iii) Coordinate Transformations:
 The main purpose of coordinate transformation technique is to convert Parallel boundary
inequalities into non parallel boundary inequalities and Non-parallel boundary inequalities
into orthogonal boundary inequalities.
(iv) A Canonical Program Form:
 Testing is clearly divided into testing the predicate and coordinate transformations.
 i.e. testing the individual case selections, testing the control flow and then testing the case
processing..
(v) Great Insights:
 Sometimes programmers have great insights into programming problems that result in
much simpler programs than one might have expected.

UNIT – IV

METRICS AND COMPLEXITY


1. SYNOPSIS

Measuring software complexity. Halstead’s metrics, token counts, McCabe’s metric, hybrid metrics.
Application and implementation.

2. METRICS, WHAT AND WHY

2.1. Philosophy

One of the characteristics of a maturing discipline is the replacement of art by science. Science is
epitomized by quantitative methods. In classical Greece, for example, early physics was dominated by
discussions of the “essence” of physical objects, with almost no attempts to quantify such “essences.”
They were struggling with what questions should be asked. Quantification was impossible until the
right questions were asked. By the sixteenth century, physics began to be quantified—first by Galileo
and then by others. In the eighteenth century, we called physical scientists “natural philosophers”
recognizing that the right questions hadn’t been asked, and that all was not yet, even in principle,
amenable to quantification. Today, physics is as quantified as science can get and far removed from its
speculative roots in philosophy.*

2.2. Historical Perspective

There’s no record of programming labor estimates on ENIAC, but I’m sure they fell far short of
reality. The first programmer, Lady Lovelace, who coded for Charles Babbage’s wonderful but
unfinished computer in the nineteenth century, I’m sure often said “Just one more week, Mr. Babbage,
and it’ll be done.” Lucky for her the hardware was never finished, so she didn’t have to go beyond
desk checking. By the mid-1950s individual programmers had coalesced into programming groups,
which meant there was now a new profession—programmer manager. The managers, who were
responsible for productivity and quality, began to measure software effort in terms of “number of lines
of code.” That was used to predict programming costs, testing costs, running time, number of bugs,
salaries, the gross national product, the inflation rate of the peso, and who knows what else. It is today still the primary quantitative measure in use: “Count the number of lines of code, and you know all there is to know.”

2.3. Objectives

2.3.1. How Big Is It?

Science begins with quantification: you can’t do physics without a notion of length and time; you can’t
do thermodynamics until you measure temperature. The most fundamental question you can ask is
“How big is it?” Without defining what “big” means, it’s obvious that it makes no sense to say “This
program will need more testing than that program” unless we know how big they are relative to one
another. Comparing two strategies also needs a notion of size. The number of tests required by a
strategy should be normalized to size. For example, “Strategy A needs 1.4 tests per unit of size, while
strategy B needs 4.3 tests per unit of size.”
What is meant by “size” is not obvious in the early phases of science development. Newton’s use of
mass instead of weight was a breakthrough for physics, and early researchers in thermodynamics had
heat, temperature, and entropy hopelessly confused. We seem to be doing about as well (or as badly)
as the sixteenth- and seventeenth-century physicists did at a comparable phase of development. “Size” isn’t obvious for software.

2.3.2. The Questions

Our objective in this book is not to quantify all of computer science, but only to explore those metrics
of complexity that have proved their worth in practice. To see what kinds of metrics we need, let’s ask
some questions:

1. When can we stop testing?


2. How many bugs can we expect?
3. Which test technique is more effective?
4. Are we testing hard or are we testing smart?
5. Do we have a strong program or a weak test suite?

2.3.3. Desirable Metric Properties

A useful metric should satisfy several requirements:**

**
A formal examination of metrics requirements by Weyuker (WEYU88B) formally extends these concepts and evaluates
the extent to which several popular metrics, including statement count, cyclomatic number, Halstead’s effort metric, and
dataflow complexity do or do not meet them.

1. It can be calculated, uniquely, for all programs to which we apply it.


2. It need not be calculated for programs that change size dynamically or programs that in
principle cannot be debugged.
3. Adding something to a program (e.g., instructions, storage, processing time) can never decrease the
measured complexity.

2.3.4. Metrics Taxonomy

There’s no agreement in the literature on how to classify metrics—and there are so many metrics to
classify. Here are some broad categories: linguistic metrics, structural metrics, and hybrid metrics.
Each can be applied to either programs or specifications; to date, however, application to programs has
dominated. The taxonomy is not rigid because a narrowly defined linguistic metric can define a
structural property. For example, cyclomatic complexity can be defined in terms of links and nodes in
the control flowgraph or alternatively by the number of equivalent branches in the program.

Linguistic Metrics—Metrics based on measuring properties of program or specification text


without interpreting what that text means or the ordering of components of the text. For
example: lines of code, number of statements, number of unique operators, number of unique
operands, total number of operators, total number of operands, total number of keyword
appearances, total number of tokens.
Structural Metrics—Metrics based on structural relations between objects in the program—
usually metrics on properties of control flowgraphs or data flowgraphs; for example, number of
links, number of nodes, nesting depth.
Hybrid Metrics—Metrics based on some combination of structural and linguistic properties of a
program or based on a function of both structural and linguistic properties

3. LINGUISTIC METRICS

3.1. General

Linguistic metrics measure some property of text without interpreting what is measured. A metric is
(mainly) linguistic if its value doesn’t change when you rearrange the text. Linguistic metrics, to date,
have been applied mostly to program text. They can be just as easily applied to formal specifications,
but because formal, processable specifications are rare, there is almost no experience with such usage;
but see RAMA85.

3.2. Lines of Code, Statement Count, and Related Metrics

3.2.1. What Is It?

Count the number of lines of code in a program and use that number as a measure of complexity. If
bugs appear to occur at a rate of 1% per line, a 1000-line program should have 10 bugs and a 10,000-line
program should have 100. Going further, if we find that it takes an average of twenty tests to find a
bug, we might infer (empirically) the expected number of tests needed per line of code.
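To make the counting convention concrete, here is a minimal Python sketch of a line counter that excludes blank lines and comments, as discussed in the next section; the "#" comment prefix and the 1%-per-line bug rate are illustrative assumptions, not rules from the text.

# A minimal sketch: count non-blank, non-comment lines as "lines of code".
def count_loc(source: str, comment_prefix: str = "#") -> int:
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count

program = "x = 1\n# a comment\n\ny = x + 2\n"
loc = count_loc(program)                   # 2 lines of code
print(loc, "expected bugs:", loc * 0.01)   # at an assumed 1%-per-line bug rate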

3.2.2. What to Count and Not Count

Early users of lines of code did not include data declarations, comments, or any other lines that did not
result in object code. Later users decided to include declarations and other unexecutable statements but
still excluded comments and blank lines. The reason for this shift is the recognition that contemporary
code can have 50% or more data statements and that bugs occur as often in such statements as in
“real” code. There is a rationale for including comments. The quality of comments materially affects
maintenance costs because the maintenance programmer will depend on the comments more than
anything else to do her job. Conversely, too many blank lines and wordy but information-poor
comments will increase maintenance effort. The problem with including comments is that we must be
able to distinguish between useful and useless comments, and there’s no rigorous way to do that. The
same can be said of blank lines and formatting text: a little helps us read the code but too much forces
us into excessive page turning.

3.2.3. Statement Counts

Some of the difficulties with lines of code can be overcome by using statements instead—but this
evokes new problems that are as bad as those of lines of code. The problem, aptly stated by Levitin
(LEVI86) is that there’s no unique way to count statements in some languages (e.g., Pascal), and
there’s no simple rule for defining “statement” across different languages. Just as subjectivity is
involved in applying lines of code, there’s subjectivity in deciding what is and what is not to be called
a statement (for some languages).

3.2.4. How Good (Bad) Are They?

Thayer, Lipow and Nelson, in their monumental software reliability study (THAY76) showed error
rates ranging from 0.04% to 7% when measured against statement counts, with the most reliable
routine being one of the largest. The same lack of useful correlation is shown in Rubey (RUBE75).
Curtis, Sheppard, and Milliman (CURT79A) show that lines of code is as good as other metrics for
small programs, but is optimistic for big programs. Moderate performance is also reported for lines of
code by Schneidewind (SCHN79A). The definitive report on the relation between program size and
bug rate is Lipow’s (LIPO82). In his study of 115,000 JOVIAL statements, he found a nonlinear
relation between bugs per line of code and statements; but also included other factors related to
language type and modularization. Small programs had an error rate of 1.3% to 1.8%, with big
programs increasing from 2.7% to 3.2%. Lipow’s study, however, and most of the others, only
included executable lines and not data declarations. The bottom line is that lines of code is reasonably
linear for small programs (under 100 lines) but increases nonlinearly with program size. It seems to
correlate with maintenance costs.

3.3. Halstead’s Metrics

3.3.1. What Are They?

Halstead’s metrics are based on a combination of arguments derived from common sense, information
theory, and psychology. The clearest exposition is still to be found in Halstead’s Elements of Software
Science (HALS77). It’s an easy-to-read little book that should be read before applying these metrics.
The following exposition is intended only as an introductory overview. The set of metrics are based on
two, easily measured, parameters of programs:

n1 = the number of distinct operators in the program (e.g., keywords)

n2 = the number of distinct operands in the program (e.g., data objects)

From these he defines program length, which is not to be confused with the number of statements in
a program, by the following relation:
H = n1 log2 n1 + n2 log2 n2 (1)*
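To make equation (1) concrete, here is a minimal Python sketch; the operator and operand counts below are made-up illustrative values, not measurements of a real program.

# A minimal sketch computing Halstead's predicted program length H.
import math

def halstead_length(n1: int, n2: int) -> float:
    # H = n1*log2(n1) + n2*log2(n2), per equation (1)
    return n1 * math.log2(n1) + n2 * math.log2(n2)

n1 = 20   # distinct operators (e.g., keywords), illustrative
n2 = 35   # distinct operands (e.g., data objects), illustrative
print(halstead_length(n1, n2))  # approx. 266.0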

3.3.2. How Good?

Confirmation of these metrics has been extensively published by Halstead and others. The most solid
confirmation of the bug prediction equation is by Lipow (LIPO82), who compared actual to predicted
bug counts to within 8% over a range of program sizes from 300 to 12,000 executable statements.
The analysis was of postcompilation bugs, so that syntax errors caught by the compiler are properly
excluded. However, of the 115,000 statements in the experiment, only the 75,000 executable
statements were examined. It would be interesting to see whether better accuracy would have been
obtained had all declarations been included in the analysis. Ottenstein, in an earlier report (OTTE79),
showed similar good correlation. Curtis (CURT79A) shows that Halstead’s metrics are at least twice
as good as lines of code and are not improved by augmenting them with lines of code or with
McCabe’s metric.

3.3.3. The Hidden Assumptions and Weaknesses

Because Halstead’s metric is the best established (serious) linguistic metric we have, it’s fitting that it
be subjected to the harshest criticism. There are fewer hidden assumptions in Halstead’s work than in
other comparable efforts, but some assumptions and weaknesses still remain. Some of the following
weaknesses apply to all linguistic metrics.
1. Modularity—Modularity is not ignored because each call, together with the parameters of the
call sequence, will contribute to the values of n1, n2, N1, and N2, and therefore to the predicted
bug count. Note that Halstead treats each distinct subroutine call as a unique operator and not
just as the one keyword “CALL.” In this respect, the impact of hypermodularity on bug rate is
not ignored.

2. Database Impact and Declarations—Halstead isn’t to be faulted on this one. It’s just that
most evaluators and confirmers have continued the erroneous practice of ignoring unexecutable
statements such as declarations and data statements—which is to say that most initialization
and data structure bugs are ignored. They can’t be faulted for this entirely, because in many
cases errors in data declarations and initializing data statements are not even counted as bugs.

3. Operator/Operand Ambiguity—There is a hidden assumption that code is code and data are data,
and never the twain should meet. If Halstead’s metrics are rigidly applied to an assembly
language program that does extensive modification of its own code (ugh!) it will predict the
same bug count as a routine that performs the same operations on a fixed data area.

4. Data-Type Distinctions—In strongly typed languages that force the explicit declaration of all
types and prohibit mixed-type operations unless a type conversion statement has been inserted,
the weakness does not exist if we count all type conversion statements, whether or not they
result in object code. If we count all declarations, including user-defined type declarations,
there is no weakness in Halstead’s metrics.

5. Call Depth—No notice is made of call depth. A routine that calls ten different subroutines as a
sequence of successive calls would actually be considered more complex than a routine that
had ten nested calls. If the totality of the routines and all that it calls were considered, the bug
prediction would be the same for both because the total operator and operand counts would be
the same.

6. Operator Types—An IF-THEN-ELSE statement is given the same weight as a FOR-UNTIL,
even though it’s known that loops are more troublesome. This is an example of a
broader criticism—that operators and operands are treated equally, with no correlation to the
bug incidence associated with specific operators or with operand types.

7. General Structure Issues—Of these, the fact that nesting is ignored (nested loops, nested IF-
THEN-ELSE, and so on) is probably the most serious critique. A Basic statement such as
100 A = B + C @ D = SIN(A)    (N = 9 tokens)

is less bug-prone than


100 D = SIN(B + C)    (N = 6 tokens)

3.4. Token Count

Levitin, in his incisive critique of linguistic metrics (LEVI86), makes a strong argument for the use of
program token count rather than lines of code, statement count, or Halstead’s length. A token in a
programming language is the basic syntactic unit from which programs are constructed. Tokens
include keywords, labels, constants, strings, and variable names. Token counting is easy because the
first stage of compilation is to translate the source code into equivalent numerical tokens. The number
of tokens is easily obtained as a by-product of compilation. Token count gets around the subjectivity
problems we have with statement counts and lines of code.
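As an illustration, here is a minimal Python sketch that counts tokens in Python source using the standard library tokenizer; applying it to Python (rather than a compiled language) and the choice of which structural tokens to skip are assumptions made for the example.

# A minimal sketch: token count obtained as a by-product of lexical analysis.
import io
import tokenize

def token_count(source: str) -> int:
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # skip purely structural tokens (newlines, indentation, end-of-file)
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER}
    return sum(1 for tok in tokens if tok.type not in skip)

print(token_count("x = y + 42\n"))  # 5 tokens: x, =, y, +, 42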

4. STRUCTURAL METRICS

4.1. General

Structural metrics take the opposite viewpoint of linguistic metrics. Linguistic complexity is ignored
while attention is focused on control-flow or data-flow complexity—metrics based on the properties of
flowgraph models of programs. Graph theory is more than a half-century old but the study of graph-
theoretic problems and metrics goes back three centuries: see MAYE72. If it’s a graph (i.e., it consists
of links and nodes) then there are hundreds of interesting metrics that can be applied to it: metrics
whose mathematical properties have been studied in minute detail. The thrust of structural metrics
research has not been so much the invention of new graph-theoretic metrics but the investigation of the
utility of known graph-theoretic metrics. McCabe’s use of the cyclomatic number (MCCA76) is an
archetypal example.

4.2. Cyclomatic Complexity (McCabe’s Metric)

4.2.1. Definition

McCabe’s cyclomatic complexity metric (MCCA76) is defined as:

M = L - N + 2P

where

L = the number of links in the graph

N = the number of nodes in the graph

P = the number of disconnected parts of the graph (e.g., a calling program and a subroutine)

The number M that appeared alongside the flowgraphs in Chapter 3 was McCabe’s metric for that
flowgraph. In all the examples, except for the one on page 73, there was only one connected part, and
consequently the value of P was 1. Figure 7.1 shows some more examples and their associated M
values.
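A minimal Python sketch of the computation, assuming the flowgraph is given as a list of links; the example graph (an IF-THEN-ELSE diamond) is illustrative.

# A minimal sketch: M = L - N + 2P from a flowgraph's links.
def cyclomatic(links, parts: int = 1) -> int:
    L = len(links)                                # number of links
    N = len({n for link in links for n in link})  # number of distinct nodes
    P = parts                                     # number of disconnected parts
    return L - N + 2 * P

# an IF-THEN-ELSE diamond: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4
print(cyclomatic([(1, 2), (1, 3), (2, 4), (3, 4)]))  # M = 4 - 4 + 2 = 2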

4.2.2. Applications to Test Plan Completeness and Inspections

Evaluate the cyclomatic complexity of the program’s design (e.g., from the design control flowgraph).
As part of self-inspection, reevaluate the complexity by counting decisions in the code. Any
significant difference should be explained, because it’s more likely that the difference is due to a
missing path, an extra path, or an unplanned deviation from the design than to something else. Having
verified the code’s cyclomatic complexity, compare the number of planned test cases to the code’s
complexity. In particular, count how many test cases are intended to provide coverage. If the number
70
of covering test cases is less than the cyclomatic complexity, there is reason for caution, because one
of the following may be true:

1. You haven’t calculated the complexity correctly. Did you miss a decision?
2. Coverage is not really complete; there’s a link that hasn’t been covered.
3. Coverage is complete, but it can be done with a few more but simpler paths.
4. It might be possible to simplify the routine.

4.2.3. When to Subroutine

McCabe’s metric can be used to help decide whether it pays to make a piece of code which is common
to two or more links into a subroutine. Consider the graph of Figure 7.3. The program has a common
part that consists of Nc nodes and Lc links. This is the part being considered for conversion to a
subroutine. This common part recurs k times in the body of the main program. The main program has
Nm nodes and Lm links over and above the common part.

                     EMBEDDED COMMON PART         SUBROUTINE FOR COMMON PART

Main nodes           Nm + kNc                     Nm
Main links           Lm + kLc                     Lm + k
Subnodes             0                            Nc + 2
Sublinks             0                            Lc
Main complexity      Lm + kLc - (Nm + kNc) + 2    Lm + k - Nm + 2
Subcomplexity        0                            Lc - (Nc + 2) + 2 = Lc - Nc
Total complexity     Lm + kLc - Nm - kNc + 2      Lm + Lc - Nm - Nc + k + 2
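To see where the break-even point falls, here is a minimal Python sketch comparing the two totals from the table above; the node and link counts are illustrative values chosen to show the crossover, not data from a real program.

# A minimal sketch: total complexity, embedded versus subroutined common part.
def embedded_total(Lm, Nm, Lc, Nc, k):
    return (Lm + k * Lc) - (Nm + k * Nc) + 2

def subroutine_total(Lm, Nm, Lc, Nc, k):
    return (Lm + k - Nm + 2) + (Lc - Nc)

# illustrative values: common part of 4 nodes / 7 links, repeated k times
for k in range(1, 4):
    e = embedded_total(Lm=10, Nm=8, Lc=7, Nc=4, k=k)
    s = subroutine_total(Lm=10, Nm=8, Lc=7, Nc=4, k=k)
    print(k, e, s, "subroutine pays" if s < e else "embedding pays")
# k=1: embedding pays; k=2 and beyond: the subroutine pays.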

4.2.4. A Refinement

Myers (MYER77) points out a weakness of McCabe’s metric and suggests the use of a refinement
thereto. A decision statement in languages such as FORTRAN can contain a compound predicate of
the form: IF A & B & C THEN . . . . A statement such as this could be implemented as a string of IFs,
resulting in a different complexity measure. Figure 7.4 shows three alternate, equivalent,
representations of the same IF-THEN-ELSE statement. If the compound predicate is used in a single
statement as in the first case, the complexity of the construct is only two, but if it is broken down into
its constituent parts, the complexity increases to four. However, intuitively, all three constructs should
have the same complexity. The refinement consists of accounting for each term of the predicate
expression separately. For a predicate expression of the form A&B&C . . . , each predicate should be
counted as if it was a decision. In more complicated expressions, such as A&B&C OR D&E . . . ,
again count each predicate. If a predicate appears more than once in an expression, you can take a
pessimistic point of view and count each appearance or a slightly optimistic point of view and count
only the first appearance.

4.2.5. How Good a Measure; A Critique

Statistics on how well McCabe’s metric correlates with design, test, and debugging difficulty are
encouraging (BELF79, CHEN78A, CURT79A, ENDR75, FEUE79A, FEUE79B, LIPO77,
SCHN79A, SCHN79B, SCHN79D, SHEP79C, THAY76, WALS79, ZOLN77).* The reported results
confirm the utility of McCabe’s metric as a convenient rule of thumb that is significantly superior to
statement count.

4.3. Other Structural Metrics

Many other structural metrics have been proposed and investigated. Chen (CHEN78A) combines
structural properties with information theoretic concepts. Gong (GONG85) combines decisions and
nesting depth. Rodriguez and Tsai (RODR86) apply structural concepts to data flowgraphs, as do Tai
(TAIK84) and Tsai et al. (TSAI86). Van Verth (VANV87) proposes using a combination of control
flow and data flow, as does Whitworth (WHIT80B). This list is incomplete as the number of papers on
alternate complexity metrics is as great as those on alternate testing strategies. The problem with these
metrics is that confirming statistics on their usefulness is lacking. It is intuitively clear that cyclomatic
complexity over data flowgraphs should be as useful a metric as cyclomatic complexity over control
flowgraphs but corroboration is still lacking (OVIE80).

5. HYBRID METRICS

The appearance of McCabe’s and Halstead’s metrics spurred the proposal, development, refinement,
and validation of a host of similar and related metrics, or totally different alternatives based on
different assumptions. Some of those cited below preceded McCabe’s and Halstead’s work and some
followed. It’s too early to tell which will be experimentally confirmed independently over a range of
projects and applications, no matter how rational and sensible their basis might be. Some of the more
interesting and promising alternatives are presented in BAKE79A, BAKE80, BELF79, CHAP79,
CHEN78A, DEYO79, EVAN84, FEUE79B, GILB77, LIPO77, MCCL78B, PAIG80, RAMA88,
SCHN79B, VANV87, WHIT80B, and ZOLN77. Most of these metrics and their variations recognize
one or more weaknesses in the popular metrics and seek to measure reliability and/or predict bug
counts through refinement or alternative formulation of the problem. It is inevitable that increased
predictability and fidelity can only be achieved at a cost of increased sophistication, complexity of
evaluation, and difficulty of use.

PATHS, PATH PRODUCTS, AND REGULAR EXPRESSIONS


1. SYNOPSIS

Path expressions are introduced as an algebraic representation of sets of paths in a graph. With
suitable arithmetic laws (BRZO62A, BRZO62B, BRZO63, MCNA60, PRAT83) and weights, path
expressions are converted into algebraic functions or regular expressions that can be used to examine
structural properties of flowgraphs, such as the number of paths, processing time, or whether a data-
flow anomaly can occur. These expressions are then applied to problems in test design and debugging.

2. MOTIVATION
This chapter and its continuation, Chapter 12, are the two most abstract chapters in this book; but that
doesn’t mean that they’re difficult. Considering the generally pragmatic orientation of this book, some
motivation for this abstraction is warranted. I could be high-handed and patronizing and say: “Trust
me, it’s good for you,” but you’re too intelligent to let me get away with that. This stuff is good for
you and I do want you to trust me about it; but I would rather you take the effort needed to master the
subject and then to see all the nice ways it can be applied. So here’s some motivation for you.

1. I introduced flowgraphs as an abstract representation of programs. I assume that by now


you appreciate their utility and want to know how playing with flowgraphs is really done.
2. Almost any question you have about programs can be cast into an equivalent question about
an appropriate flowgraph. This chapter tells you how to do that.
3. The concepts discussed in this chapter will help you better understand and exploit syntax testing
(Chapter 9) and state testing (Chapter 11).

3.2. Basic Concepts

Every link of a graph can be given a name; the link name will be denoted by lowercase italic letters. In
tracing a path or path segment through a flowgraph, you traverse a succession of link names. The
name of the path or path segment that corresponds to those links is expressed naturally by
concatenating those link names. If you traverse links a, b, c, and d along some path, the name for that
path segment is abcd. This path name is also called a path product. Figure 8.1 shows some examples.

3.3. Path Products

The name of a path that consists of two successive path segments is conveniently expressed by the
concatenation or path product of the segment names. For example, if X and Y are defined as

X = abcde
Y = fghij

then the path corresponding to X followed by Y is denoted by

XY = abcdefghij

Similarly,

YX = fghijabcde
aX = aabcde
Xa = abcdea
XaX = abcdeaabcde

If X and Y represent sets of paths or path expressions, their product represents the set of paths that can
be obtained by following every element of X by any element of Y in all possible ways. For example,

X = abc + def + ghi


Y = uvw + z

Then
XY = abcuvw + defuvw + ghiuvw + abcz + defz + ghiz
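A minimal Python sketch of this product over sets of path names (representing path sums as Python sets, an illustrative encoding):

# Every element of X followed by every element of Y, in all possible ways.
X = {"abc", "def", "ghi"}
Y = {"uvw", "z"}
XY = {x + y for x in X for y in Y}
print(sorted(XY))  # ['abcuvw', 'abcz', 'defuvw', 'defz', 'ghiuvw', 'ghiz']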

If a link or segment name is repeated, that fact is denoted by an exponent. The exponent’s value
denotes the number of repetitions:

a1 = a; a2 = aa; a3 = aaa; an = aaaa . . . n times.

Similarly, if

X = abcde

then

X1 = abcde
X2 = abcdeabcde = (abcde)2
X3 = abcdeabcdeabcde = (abcde)2abcde
= abcde(abcde)2 = (abcde)3

3.4. Path Sums

The “+” sign was used to denote the fact that path names were part of the same set of paths. Consider
the set of paths that can lie between two arbitrary nodes of a graph. Even though these paths can
traverse intermediate nodes, they can be thought of as “parallel” paths between the two nodes. The
path sum denotes paths in parallel between two nodes. Links a and b in Figure 8.1a are parallel paths
and are denoted by a + b. Similarly, links c and d are parallel paths between the next two nodes and
are denoted by c + d. The set of all paths between nodes 1 and 2 of Figure 8.1a can be thought of as a
set of parallel paths between nodes 1 and 2 and can be denoted by eacf + eadf + ebcf + ebdf. If X and Y
are sets of paths that lie between the same pair of nodes, then X + Y denotes the union of those sets of
paths. As an example,

The first set of parallel paths is denoted by X + Y + d and the second set by U + V + W + h + i + j. The
set of all paths in this flowgraph is f(X + Y + d)g(U + V + W + h + i + j)k.

3.5. Distributive Laws

The product and sum operations are distributive, and the ordinary rules of multiplication apply; that is,

Rule 4: A(B + C) = AB + AC and (B + C)D = BD + CD

Applying these rules to Figure 8.1a yields

e(a + b)(c + d)f = e(ac + ad + bc + bd)f


= eacf + eadf + ebcf + ebdf

for the set of all paths from node 1 to node 2.


3.6. Absorption Rule

If X and Y denote the same set of paths, then the union of these sets is unchanged; consequently,

Rule 5: X + X = X (absorption rule)

Similarly, if a set consists of path names and a member of that set is added to it, the “new” name,
which is already in that set of names, contributes nothing and can be ignored. For example, if

X = a + aa + abc + abcd + def

then

X + a = X + aa = X + abc = X + abcd = X + def = X

It follows that any arbitrary sum of identical path expressions reduces to the same path expression.

3.7. Loops

Loops can be understood as an infinite set of parallel paths. Say that the loop consists of a single link
b. Then the set of all paths through that loop point is

b0 + b1 + b2 + b3 + b4 + b5 + . . .

This potentially infinite sum is denoted by b* for an individual link and by X* when X is a
path expression. If the loop must be taken at least once, then it is denoted by a+ or X+. The path
expressions for Figure 8.1c and 8.1d, respectively, as expressed by this notation, are:

ab*c = ac + abc + abbc + abbbc + . . .

and

a(bc)*bd = abd + abcbd + abcbcbd + a(bc)3bd + . . .

Evidently

aa* = a*a = a+

and

XX* = X*X = X+

It is sometimes convenient to denote the fact that a loop cannot be taken more than a certain, say n,
number of times. A bar is used under the exponent to denote that fact as follows:

Xn = X0 + X1 + X2 + X3 + . . . + Xn-1 + Xn
The following rules can be derived from the previous rules:

Rule 6: Xn + Xm = Xn if n is bigger than m

= Xm if m is bigger than n

Rule 7: XnXm = Xn+m

Rule 8: XnX* = X*Xn = X*

Rule 9: XnX+ = X+Xn = X+

Rule 10: X*X+ = X+X* = X+
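To make the bounded-loop notation concrete, here is a minimal Python sketch that expands Xn as the union of the powers X0 through Xn, again encoding path sums as sets of path names:

# A minimal sketch: expand a bounded loop X^n as X^0 + X^1 + ... + X^n.
def bounded_loop(X, n):
    paths = {""}                   # X^0 is the zero-length path "1"
    result = set(paths)
    for _ in range(n):
        paths = {p + x for p in paths for x in X}   # next power of X
        result |= paths
    return result

print(sorted(bounded_loop({"b"}, 3), key=len))  # ['', 'b', 'bb', 'bbb']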

3.8. Identity Elements

Returning to the meaning of terms such as a0 or X0, which denote the path whose length is zero, the
following rules, previously used without explanation, apply:

Rule 11: 1 + 1 = 1

Rule 12: 1X = X1 = X Following or preceding a set of paths by a path of zero length doesn’t change
the set.

Rule 13: 1n = 1n = 1* = 1+ = 1 No matter how often you traverse a path of zero length, it is still a path
of zero length.

Rule 14: 1+ + 1 = 1* = 1

The final notation needed is the empty set or the set that contains no paths, not even the zero-length
path 1. The null set of paths is denoted by the numeral 0.* Zero, in this algebra, plays the same role as
zero does in ordinary arithmetic. That is, it obeys the following rules:

Rule 15: X + 0 = 0 + X = X

Rule 16: X0 = 0X = 0 If you block the paths of a graph fore or aft by a graph that has no paths, there
won’t be any paths.

Rule 17: 0* = 1 + 0 + 02 + . . . = 1

The meaning and behavior of zero and one (the identity elements) given above apply only to path
names. Other applications have different identity elements with different properties.

4. A REDUCTION PROCEDURE
4.1. Overview

This section presents a reduction procedure for converting a flowgraph whose links are labeled with
names into a path expression that denotes the set of all entry/exit paths in that flowgraph. The
procedure is a node-by-node removal algorithm. You follow these steps, which initialize the process:

1. Combine all serial links by multiplying their path expressions.


2. Combine all parallel links by adding their path expressions.
3. Remove all self-loops (from any node to itself) by replacing them with a link of the form
X*, where X is the path expression of the link in that loop.

The remaining steps are in the algorithm’s loop:

4. Select any node for removal other than the initial or final node. Replace it with a set of
equivalent links whose path expressions correspond to all the ways you can form a product of
the set of inlinks with the set of outlinks of that node.
5. Combine any remaining serial links by multiplying their path expressions.
6. Combine all parallel links by adding their path expressions.
7. Remove all self-loops as in step 3.
8. Does the graph consist of a single link between the entry node and the exit node? If yes,
then the path expression for that link is a path expression for the original flowgraph; otherwise,
return to step 4.
Each step will be illustrated and explained in further detail in the next sections.
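Before the step-by-step illustrations, here is a minimal Python sketch of the whole procedure. It represents path expressions as strings, the flowgraph as labeled links, parenthesizes freely, and picks removal nodes arbitrarily; it is a simplification for illustration, not a full implementation.

# A minimal sketch of the node-removal algorithm for a single-entry/
# single-exit flowgraph given as labeled links (from_node, to_node, name).
def reduce_flowgraph(links, entry, exit_node):
    expr = {}
    for u, v, e in links:
        # steps 2/6: combine parallel links by adding their path expressions
        expr[(u, v)] = f"({expr[(u, v)]}+{e})" if (u, v) in expr else e

    def interior():
        return {n for uv in expr for n in uv} - {entry, exit_node}

    while interior():
        n = next(iter(interior()))            # step 4: pick a node to remove
        loop = expr.pop((n, n), None)         # steps 3/7: self-loop -> star term
        star = f"({loop})*" if loop else ""
        ins = [(u, e) for (u, v), e in list(expr.items()) if v == n]
        outs = [(v, e) for (u, v), e in list(expr.items()) if u == n]
        for u, _ in ins:
            del expr[(u, n)]
        for v, _ in outs:
            del expr[(n, v)]
        for u, ei in ins:                     # cross-term step: in x out products
            for v, eo in outs:
                prod = ei + star + eo
                expr[(u, v)] = f"({expr[(u, v)]}+{prod})" if (u, v) in expr else prod
    return expr.get((entry, exit_node), "0")

# a single-link loop: entry -a-> 2, 2 -b-> 2 (self-loop), 2 -c-> exit
print(reduce_flowgraph([(1, 2, "a"), (2, 2, "b"), (2, 3, "c")], 1, 3))  # a(b)*c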

4.2. Cross-Term Step (Step 4)

The cross-term step* is the fundamental step of the reduction algorithm. It removes a node, thereby
reducing the number of nodes by one. Successive applications of this step eventually get you down to
one entry and one exit node. The following diagram shows the situation at an arbitrary node that has
been selected for removal:

The rationale for this step is intuitively obvious. Whatever the set of paths represented by, say, A and
C, there can be no other paths for the combination other than those represented by AC. Similarly, the
removal of the node results in the AD, AE, BC, BD, and BE path expressions or path sets. If the path
expressions are path names, it is clear that the resulting path names will be the names obtained by
traversing the pair of links. If the path expressions denote sets of paths or path sums, using the
definition of multiplication and the distributive rule produces every combination of incoming and
outgoing path segments, as in

(a + b)(c + d) = ac + ad + bc + bd

Applying this step to the example graph, we remove several nodes in order; that is,

Remove node 10 by applying step 4 and combine by step 5 to yield

Remove node 9 by applying steps 4 and 5 to yield

Remove node 7 by steps 4 and 5, as follows:

Remove node 8 by steps 4 and 5, to obtain

4.3. Parallel Term (Step 6)

Removal of node 8 above led to a pair of parallel links between nodes 4 and 5. Combine them to create

a path expression for an equivalent link whose path expression is c + gkh; that is,

4.4. Loop Term (Step 7)

Removing node 4 leads to a loop term. The graph has now been replaced with the following

equivalent, simpler graph:

There are two ways of looking at the loop-removal operation:

In the first way, we remove the self-loop and then multiply all outgoing links by Z*. The second way
shows things in more detail. We split the node into two equivalent nodes, call them A and A’ and put
in a link between them whose path expression is Z*. Then we remove node A’ using steps 4 and 5 to
yield outgoing links whose path expressions are Z*X and Z*Y.

Continue the process by applying the loop-removal step, as follows:

Removing node 5 produces

Remove the loop at node 6 to yield

Remove node 3 to yield

Removing the loop and then node 6 results in the following ugly expression:

a(bgjf)*b(c + gkh)d((ilhd)*imf(bgjf)*b(c + gkh)d)*(ilhd)*e

4.5. Comments, Identities, and Node-Removal Order
I said earlier that the order in which the operations are done affects the appearance of the path
expressions. Such appearances of differences also result in identities that can sometimes be used to
simplify path expressions.

I1: (A + B)* = (A* + B*)*
I2: = (A*B*)*
I3: = (A*B)*A*
I4: = (B*A)*B*
I5: = (A*B + A)*
I6: = (B*A + B)*
I7: (A + B + C + . . .)* = (A* + B* + C* + . . .)*
I8: = (A*B*C* . . .)*

These can be derived by considering different orders of node removals and then applying the series-
parallel-loop rules. Each change in order can produce a different appearance for the path expression
and therefore a path expression identity. Don’t make the mistake of applying these identities to finite
exponents or +. These identities hold only because they denote infinite sets of paths. These identities
are not very important anyhow, because we will rarely deal with path expressions as such but rather
with other kinds of expressions derived from the path expressions by using link weights and link
arithmetics. As an example of misapplying the identities, consider:

(A + B)2 ≠ (A2 + B2)2 ≠ (A2B2)2

If A consists of the single link a and B is link b, the three expressions correspond to the following sets
of paths:

(A + B)2 = aa + ab + bb + ba
(A2 + B2)2 = (a4 + a2b2 + b2a2 + b4)
(A2B2)2 = a2b2a2b2 = (a2b2)2
This algorithm can be used to find the path expression between any two nodes in a graph, including a
node and itself. It is not restricted to finding the path expression between the entry and the exit of a
flowgraph, although that might be the most common and most useful application. The method is
tedious and cumbersome, what with having to constantly redraw the graph.

5. APPLICATIONS

5.1. General

The previous sections of this chapter are more abstract than I and most readers are apt to like. They
are, I admit, remote from the substantive problems of testing and test design. The purpose of all that
abstraction was to present one very generalized concept—the path expression and one very
generalized way of getting it, the node-removal algorithm, so that the same concepts would not have to
be discussed over and over again as one variation on a theme after another. Every application follows
this common pattern:

1. Convert the program or graph into a path expression.


2. Identify a property of interest and derive an appropriate set of “arithmetic” rules that
characterizes the property.
3. Replace the link names by the link weights (remember them?) for the property of interest. The path
expression has now been converted to an expression in some algebra, such as ordinary algebra, regular
expressions, or boolean algebra.

5.2. How Many Paths in a Flowgraph?

5.2.1. The Question

The question is not simple. Here are some ways you could ask it:

1. What is the maximum number of different paths possible?


2. What is the fewest number of paths possible?
3. How many different paths are there really?
4. What is the average number of paths?
In all that follows, by “path” I mean paths from the entrance to the exit of a single-entry/single-exit
routine.*

5.2.2. Maximum Path Count Arithmetic

Label each link with a link weight that corresponds to the number of paths that that link represents.
Typically, that’s one to begin with; but if the link represented a subroutine call, say, and you wanted to
consider the paths through the subroutine in the path count, then you would put that number on the
link. Also mark each loop with the maximum number of times that the loop can be taken. If the answer
is infinite, you might as well stop the analysis because it’s clear that the maximum number of paths
will be infinite. There are three cases of interest: parallel links, serial links, and loops. In what follows,
A and B are path expressions and WA and WB are algebraic expressions in the weights:

CASE        PATH EXPRESSION    WEIGHT EXPRESSION
PARALLEL    A + B              WA + WB
SERIES      AB                 WA × WB
LOOP        A^n                WA^0 + WA^1 + . . . + WA^n

If the minimum number of loop traversals is not 0 but some other number, say j, then you do the
summation from j to n rather than from 0 to n.

5.2.3. A Concrete Example

Here is a reasonably well-structured program. Its path expression, with a little work, is:

Each link represents a single link and consequently is given a weight of “1” to start. Let’s say that the
outer loop will be taken exactly four times and the inner loop can be taken zero to three times. The
steps in the reduction are as follows:

For the inner loop,

Alternatively, you could have substituted a “1” for each link in the path expression and then
simplified, as follows:

1(1 + 1)1 (1(1 × 1)^3 1 × 1 × 1(1 + 1)1)^4 1(1 × 1)^3 1 × 1 × 1

= 2(1^3 × 2)^4 × 1^3

but the inner-loop exponent is the bounded kind, so

1^3 = 1 + 1^1 + 1^2 + 1^3 = 4

giving

= 2(4 × 2)^4 × 4 = 2 × 8^4 × 4 = 32,768

This is the same result we got graphically. Reviewing the steps in the reduction, we:

1. Annotated the flowgraph by replacing each link name with the maximum number of paths
through that link (1) and also noted the number of possibilities for looping. The inner loop was
indicated by the range (0-3) as specified, and the outer loop by the range (4-4).
2. Combined the first pair of parallels outside of the loop and also the pair corresponding to
the IF-THEN-ELSE construct in the outer loop. Both yielded two possibilities.
3. Multiplied things out and removed nodes to clear the clutter.
4. Took care of the inner loop: there were four possibilities, leading to the four values. Then
we multiplied by the link weight following (originally link g) whose weight was also 1.
5. Got rid of link e.
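A minimal Python sketch of the same computation using the maximum path count arithmetic (parallel links add, serial links multiply, a loop taken j to n times contributes the sum of the weight's powers):

# Reproducing the worked example's 32,768 paths.
def loop_weight(w, j, n):
    return sum(w ** i for i in range(j, n + 1))

inner = loop_weight(1, 0, 3)            # inner loop taken 0 to 3 times -> 4
if_then_else = 1 + 1                    # two parallel one-path links -> 2
body = inner * if_then_else             # one pass of the outer loop -> 8
outer = body ** 4                       # outer loop taken exactly 4 times
total = if_then_else * outer * inner    # 2 * 8**4 * 4
print(total)                            # 32768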

5.3. Approximate Minimum Number of Paths

5.3.1. Structured Code

The node-by-node reduction procedure can also be used as a test for structured code. Structured code
can be defined in several different ways that do not involve ad hoc rules such as not using GOTOs. A
graph-based definition by Hecht (HECH77B) is as follows:

A structured flowgraph is one that can be reduced to a single link by successive application
of the series, parallel, and loop transformations.

Note that the cross-term transformation is missing. An alternate characterization by McCabe


(MCCA76) states that

Flowgraphs that do not contain one or more of the graphs shown in Figure 8.4 as subgraphs are
structured.

5.3.2. Lower Path Count Arithmetic

A lower bound on the number of paths in a routine can be approximated for structured flowgraphs. It is not a true lower
bound because, again, unachievable paths could reduce the actual number of paths to a lower number yet. The appropriate
arithmetic is as follows:

CASE        PATH EXPRESSION    WEIGHT EXPRESSION
PARALLEL    A + B              WA + WB
SERIES      AB                 MAX(WA, WB)
LOOP        An                 1, W1

The parallel case is the same as before. The values of the weights are the number of members in a set
of paths. There could be an error here because both sets could contain the zero-length path, but
because of the way the loop expression is defined, this cannot happen. The series case is explained by
noting that each term in the first set will combine with at least one term in the second set. The
minimum number of combinations must be the greater of the number of possibilities in the first set and
the second set.

5.4. The Probability of Getting There

5.4.1. The Problem

I suggested that, if anything, path selection should be biased toward the low- rather than the high-
probability paths. This raises an interesting question: What is the probability of being at a certain point
in a routine? This question can be answered under suitable assumptions, primarily that all probabilities
involved are independent, which is to say that all decisions are independent and uncorrelated. This
restriction can be removed, but the method is beyond the scope of this book. We use the same
algorithm as before—node-by-node removal of uninteresting nodes.

5.4.2. Weights, Notation, Arithmetic

Probabilities can come into the act only at decisions (including decisions associated with loops).
Annotate each outlink with a weight equal to the probability of going in that direction. Evidently, the
sum of the outlink probabilities must equal 1. For a simple loop, if the loop will be taken a mean of N
times, the looping probability is N/(N + 1) and the probability of not looping is 1/(N + 1). A link that is
not part of a decision node has a probability of 1. The arithmetic rules are those of ordinary arithmetic.
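A minimal Python sketch of this arithmetic; the mean loop count and the branch probabilities are illustrative values, not from the text.

# Probability weights under the independence assumption.
N = 3                         # mean number of loop traversals (illustrative)
p_loop = N / (N + 1)          # probability of taking the loop-back link, 0.75
p_exit = 1 / (N + 1)          # probability of leaving the loop, 0.25

# Summing over all numbers of loop traversals gives a geometric series that
# converges to 1, as the outlink probabilities of a well-formed graph must:
total = sum(p_loop ** k * p_exit for k in range(200))
print(round(total, 9))        # ~1.0
print(p_exit)                 # probability of the path that never loops: 0.25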

SYNTAX TESTING
2.8. Overview

Syntax testing consists of the following steps:

1. Identify the target language or format (hidden or explicit).


2. Define the syntax (format) of the language, formally, in a convenient notation such as
Backus-Naur form (BNF).
3. Test and debug the syntax to assure that it is complete and consistent and that it satisfies the
intended semantics.

3. A GRAMMAR FOR FORMATS


3.1. Objectives

Every input has a syntax. That syntax may be formally specified or undocumented and “just
understood,” but it does exist. Data validation consists (in part) of checking the input for correct
syntax. It’s best when the syntax is defined in a formal language—best for the designer and the tester.

3.2. BNF Notation (BACK59)

3.2.1. The Elements

Every input can be considered as if it were a string of characters. The software accepts valid strings
and rejects invalid ones. If the software fails on a string, we’ve really got it. If it accepts an invalid
string, then it’s guilty of GIGO.
alpha_characters ::= A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/Q/R/S/T/U/V/W/X/Y/Z
numerals ::= 1/2/3/4/5/6/7/8/9
zero ::= 0
signs ::= !/#/$/%/&/*/(/)/-/+/=/;/:/“/’/,/./?
space ::= sp

3.2.2. BNF Operators

The operators are the same as those used in path expressions and regular expressions: “or” (/),
concatenate (which doesn’t need a special symbol), “*”, and “+”. Exponents, such as An, have the
same meaning as before—n repetitions of the strings denoted by the letter A.

3.2.3. Repetitions

As before, object1-3 means one to three objects, object* means zero or more repetitions of object
without limit, and object+ means one or more repetitions of object. Neither the star (*) nor the plus (+)
can legitimately appear in any syntax because both symbols mean a possibly infinite number of
repetitions.

3.2.4. Examples

This is an example and not a real definition of a telephone number:

special_digit ::= 1/2/5
zero ::= 0
other_digit ::= 3/4/6/7/8/9
ordinary_digit ::= special_digit / zero / other_digit
exchange_part ::= other_digit2 ordinary_digit
number_part ::= ordinary_digit4
phone_number ::= exchange_part number_part

Syntax is defined in BNF as a set of such definitions.
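A minimal Python sketch that generates valid strings from this example grammar—the "correct" baseline that error seeding (next section) will corrupt; the function names simply mirror the BNF nonterminals.

# Generate valid telephone-number strings from the grammar above.
import random

special_digit = "125"
zero = "0"
other_digit = "346789"
ordinary_digit = special_digit + zero + other_digit

def exchange_part():
    # other_digit2 ordinary_digit: two other_digits, then one ordinary_digit
    return "".join(random.choice(other_digit) for _ in range(2)) + random.choice(ordinary_digit)

def number_part():
    # ordinary_digit4: four ordinary_digits
    return "".join(random.choice(ordinary_digit) for _ in range(4))

def phone_number():
    return exchange_part() + number_part()

print(phone_number())  # e.g. '3751329'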

4. TEST CASE GENERATION

4.1. Generators, Recognizers, and Approach

A data-validation routine is designed to recognize strings that have been explicitly or implicitly
defined in accordance with an input syntax. It either accepts the string, because it is recognized as
valid, or rejects it and takes appropriate action.

String errors can be categorized as follows:


1. High-Level Syntax Errors—The strings have violations at the topmost level in a top-down
BNF syntax specification.
2. Intermediate-Level Syntax Errors—Syntax errors at any level other than the top or bottom.
3. Field-Syntax Errors—Syntax errors associated with an individual field, where a field is
defined as a string of characters that has no subsidiary syntax specification other than the
identification of characters that compose it. A field is the lowest level at which it is productive
to think in terms of syntax testing.
4. Delimiter Errors—Violation of the rules governing the placement and the type of characters that
must appear as separators between fields

4.2. Test Case Design

4.2.1. Strategy

The strategy is to create one error at a time, while keeping all other components of the input string
correct; that is, in the absence of the single error, the string would have been accepted. Once a
complete set of tests has been specified for single errors, do the same for double errors and then triple
errors.
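A minimal Python sketch of the single-error idea, corrupting one character position at a time in an otherwise valid string; the particular set of bad characters is an arbitrary illustrative choice.

# Each variant contains exactly one error; the rest of the string is correct.
def single_error_variants(good, bad_chars="!X "):
    for i in range(len(good)):
        for c in bad_chars:
            if c != good[i]:
                yield good[:i] + c + good[i + 1:]

for t in list(single_error_variants("3751329"))[:5]:
    print(t)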

4.2.2. Top, Intermediate, and Field-Level Syntax Errors

Say that the topmost syntax level is defined as:

item ::= atype / btype / ctype / dtype etype

Here are some obvious test cases:

1. Do It Wrong—Use an element that is correct at some other lower syntax level, but not at
this level.
2. Use a Wrong Combination. The last element is a combination of two other elements in a
specified order. Mess up the order and combine the wrong things:
dtype atype / btype etype / etype dtype / etype etype / dtype dtype

4.2.3. Delimiter Errors

Delimiters are characters or strings placed between two fields to denote where one ends and the other
begins. Delimiter problems are an excellent source of test cases.
1. Missing Delimiter—This kind of error causes adjacent fields to merge. This may result in a
different, but valid, field or may be covered by another kind of syntax error.
2. Wrong Delimiter—It’s nice when several different delimiters are used and there are rules
that specify which can be used where. Mix them up and use them in the wrong places.
3. Not a Delimiter—There are some characters or strings that are not delimiters but could be
put into that position. Note the possibility of changing adjacent field types as a result.
4. Too Many Delimiters—The number of delimiters appearing at a field boundary may be variable.
This is typical for spaces, which can serve as delimiters.

4.2.4. Field-Value Errors

Field-value errors are clearly a domain-testing issue, and domain testing is where it’s at. Whether you
choose to implement field-value errors in the context of syntax testing or the other way around (i.e.,
syntax testing under domain testing) or whether you choose to implement the two methods as separate
test suites depends on which aspect dominates. Syntax-testing methods will usually wear out more
quickly than will domain testing.

4.2.6. State-Dependency Errors

The string or field value that may be acceptable at one instant may not be acceptable at the next
because validity depends on the transaction’s or the system’s state. As an example, say that the
operator’s command-input protocol requires confirmation of all commands. After every command the
system expects either an acknowledgment or a cancellation, but not another command.

4.3. Sources of Syntax

4.3.1. General

Where do you get the syntax? Here’s another paradox for you. If the syntax is served up in a nice,
neat package, then syntax-testing methods probably won’t be effective; and if syntax testing is
effective, you’ll have to dig out and formally define the syntax yourself. Ideally, the syntax comes to
you previously defined, formally defined, in BNF or an equivalent, equally convenient notation.*

4.3.2. Designer-Tester Cooperation and Design Specifications

If there is no BNF specification, I try to get the designers to create one—at least the first version of
one. Realistically, though, if a BNF specification does not exist, the designers will have to create a
document that can be easily converted into one—after all, if the syntax hasn’t been defined, what is
she designing to? If you get the designer to create the first version of the BNF specification, you may
find that it is neither consistent nor complete.

4.3.3. Manuals as Sources

Manuals, such as instruction manuals, reference manuals, and operator manuals are the obvious place
to start for command languages if there isn’t a formal syntax document and you can’t get designers to
do the job for you. The syntax in manuals may be fairly close to a formal syntax definition. Manuals
are good sources because more often than not, we’re dealing with a maintenance situation, rehosting,
or a rewrite of an old application.

4.3.4. Help Screens and Systems

Putting user information such as command syntax into HELP systems and on-line tutorials is becoming
more commonplace, especially for PC software, because it’s cheaper to install a few hundred K of
HELP material on a floppy than it is to print a few hundred pages of instruction manual. You may find
the undocumented syntax on these screens.

4.3.5. Data Dictionary and Other Design Documents


For internal hidden languages, your most likely source of syntax is the data dictionary and other design
documents. Also look at internal interface standards, style manuals, and design practice memos.
Common subroutine and macro library documents are also good sources.

4.3.6. Prototypes

If there’s a prototype, then it’s likely to embody much of the user interface and command language
syntax you need. This source will become more useful in the future as prototyping gains popularity.

4.3.7. Programmer Interviews

The second most expensive way to get user and operator command syntax is to drag the information
out of the implementing programmer’s head by interviews. I would do it only after I had exhausted all
other sources.

4.3.8. Experimental

The most expensive possible way to get the syntax is by experimenting with the running program.
Think back to the times you’ve had to use a new system without an instruction manual and of how
difficult it was to work out even a few simple commands—now think of how much work that can be
for an entire set of commands.
5. IMPLEMENTATION AND APPLICATION
5.1.1. General

Syntax testing, more than any other technique I know, forces us into test execution automation because
it’s so easy to design so many tests (even by hand) and because design automation is also easy.

5.1.2. Manual Execution

Manual execution? Don’t! Even primitive automation methods, such as putting test cases on paper tape
(see the first edition), were better than doing it manually. I found that the only way it could be done by
hand was to use three persons, as in the following scenario. If that doesn’t convince you to automate,
then you’re into compulsive masochism.

5.1.3. Capture/Replay

See Chapter 13 for a more detailed discussion of capture/replay systems. A capture/replay system
captures your keystrokes and stuff sent to the screen and stores them for later execution. However
you’ve designed your syntax tests, execute them the first time through a capture/replay system if that’s
the only kind of execution automation you can manage. These systems (at least the acceptable ones)
have a built-in editor or can pass the test data to a word processor for editing. That way, even if your
first execution is faulty, you’ll be able to correct it.

5.1.4. Drivers

Build or buy a driver—a program that automatically sequences through a set of test cases, usually
stored as data. Don’t build the bad strings (especially) as code in an ordinary programming language,
because you’ll be going down a diverging infinite sequence of test testing.

5.1.5. Scripting Languages

A scripting language is a language used to write test scripts. CASL (CROS89, FURG89) is a nice
scripting language because it can be used to emulate any interface, work from strings stored as data,
and provide smart comparisons for test outcome validation, editing, and capture/replay.

5.2. Design Automation

5.2.1. General

Syntax testing is a good place to begin a test design automation effort because it’s so easy and has
such a high, immediate payoff. It’s about the only test design automation area in which you can count
on a payback the first time out.

5.2.2. Primitive Methods

You can do design automation with a word processor. If you don’t have that, will you settle for a
copying machine and a bottle of white-out? Design a covering set of correct input strings. If you want
to (because you have to produce paper documentation for every test case), bracket your test strings with
control sequences such as “$$$XXX” so that you’ll be able to extract them later on.

5.2.3. Scripting Languages

A scripting language and processor such as CASL has the features needed to automate the replacement
of good substrings by bad ones on the fly. You can use random number generators to select which
single incorrect character will be used in any spot.

5.2.4. Random String Generators

Why not just use a random number generator to generate completely random strings? Two reasons:
random strings get recognized as invalid too soon, and even a weak front end will catch most bad
strings. The probability of hitting vulnerable points is too low, just as it was for random inputs in
domain testing—there are too many bad strings in the world.

5.2.5. Getting Sophisticated

Getting sophisticated means building an anti-parser. It’s about as complicated as a simple compiler.
The language it compiles is BNF, and instead of producing output code it produces structured garbage.
I’ll assume that you know the rudiments of how a compiler works—if not, this section is beyond you.

5.3. Productivity, Training, and Effectiveness

I used syntax test design as basic training for persons new to a test group. With very little effort they
can churn out hundreds of good tests. It’s a great confidence builder for people who have never done
formal test design before and who may be intimidated by the prospect of subjecting a senior designer’s
masterpiece to a barrage of calculated heartburn.

5.4. Ad-Lib Tests

Whenever you run a formal system test there’s always someone in the crowd who wants to try ad-lib
tests. And almost always, the kind of test they want to ad-lib is an input-syntax error test. I used to
object to ad-libbing, because it didn’t prove anything—I thought. It doesn’t prove anything substantive
about the system, assuming you’ve done a good job of testing—which is why I used to object to it. It
may save time to object to ad-lib tests, but it’s not politic.
UNIT IV
PATHS, PATH PRODUCTS AND REGULAR EXPRESSIONS

Paths, Path Products and Regular Expressions: path products & path expressions, reduction
procedure, applications, regular expressions & flow anomaly detection.
Logic-Based Testing: overview, decision tables, path expressions, KV charts, specifications.

PATH PRODUCTS AND PATH EXPRESSION:

∑ MOTIVATION:
o Flow graphs are being an abstract representation of programs.
o Any question about a program can be cast into an equivalent question about an
appropriate flowgraph.
o Most software development, testing and debugging tools use flow graphs
analysis techniques.

∑ PATH PRODUCTS:
o Normally flow graphs used to denote only control flow connectivity.
o The simplest weight we can give to a link is a name.
o Using link names as weights, we then convert the graphical flow graph into an
equivalent algebraic like expressions which denotes the set of all possible paths
from entry to exit for the flow graph.
o Every link of a graph can be given a name.
o The link name will be denoted by lowercase italic letters. In tracing a path or
path segment through a flow graph, you traverse a succession of link names.
o The name of the path or path segment that corresponds to those links is
expressed naturally by concatenating those link names.
o For example, if you traverse links a,b,c and d along some path, the name for that
path segment is abcd. This path name is also called a path product. Figure 5.1
shows some examples:

Figure 5.1: Examples of paths.
∑ PATH EXPRESSION:
o Consider a pair of nodes in a graph and the set of paths between those node.
o Denote that set of paths by Upper case letter such as X,Y. From Figure 5.1c,
the members of the path set can be listed as follows:
ac, abc, abbc, abbbc, abbbbc, . . .
o Alternatively, the same set of paths can be denoted by:
ac + abc + abbc + abbbc + abbbbc + . . .
o The + sign is understood to mean "or": between the two nodes of interest, path ac,
or abc, or abbc, and so on, can be taken.
o Any expression that consists of path names and "OR"s and which denotes a set of
paths between two nodes is called a "Path Expression”.

∑ PATH PRODUCTS:
o The name of a path that consists of two successive path segments is
conveniently expressed by the concatenation or Path Product of the segment
names.
o For example, if X and Y are defined as X = abcde, Y = fghij, then the path
corresponding to X followed by Y is denoted by
XY = abcdefghij
o Similarly,
YX=fghijabcde
aX=aabcde
Xa=abcdea
XaX=abcdeaabcde
o If X and Y represent sets of paths or path expressions, their product represents
the set of paths that can be obtained by following every element of X by any
element of Y in all possible ways. For example,
o X = abc + def + ghi
o Y = uvw + z
Then,
XY = abcuvw + defuvw + ghiuvw + abcz + defz + ghiz
o If a link or segment name is repeated, that fact is denoted by an exponent.
The exponent's value denotes the number of repetitions:
o a1 = a; a2 = aa; a3 = aaa; an = aaaa . . . n times.
Similarly, if X = abcde then

X1 = abcde
X2 = abcdeabcde = (abcde)2
X3 = abcdeabcdeabcde = (abcde)2abcde
= abcde(abcde)2 = (abcde)3
o The path product is not commutative (that is XY!=YX).
o The path product is Associative.
RULE 1: A(BC)=(AB)C=ABC
where A,B,C are path names, set of path names or path expressions.
o The zeroth power of a link name, path product, or path expression is
also needed for completeness. It is denoted by the numeral "1" and denotes the
"path" whose length is zero - that is, the path that doesn't have any links.
o a0 = 1
o X0 = 1

∑ PATH SUMS:
o The "+" sign was used to denote the fact that path names were part of the same
set of paths.
o The "PATH SUM" denotes paths in parallel between nodes.
o Links a and b in Figure 5.1a are parallel paths and are denoted by a + b. Similarly,
links c and d are parallel paths between the next two nodes and are denoted by c +
d.
o The set of all paths between nodes 1 and 2 can be thought of as a set of parallel
paths and denoted by eacf+eadf+ebcf+ebdf.
o If X and Y are sets of paths that lie between the same pair of nodes, then X+Y
denotes the UNION of those set of paths. For example, in Figure 5.2:

Figure 5.2: Examples of path sums.


The first set of parallel paths is denoted by X + Y + d and the second set by U + V
+ W + h + i + j. The set of all paths in this flowgraph is f(X + Y + d)g(U + V + W
+ h + i + j)k
o The path sum is a set union operation; it is clearly commutative and associative.
o RULE 2: X+Y=Y+X
o RULE 3: (X+Y)+Z=X+(Y+Z)=X+Y+Z
∑ DISTRIBUTIVE LAWS:
o The product and sum operations are distributive, and the ordinary rules of
multiplication apply; that is
RULE 4: A(B+C)=AB+AC and (B+C)D=BD+CD
o Applying these rules to Figure 5.1a yields
o e(a+b)(c+d)f=e(ac+ad+bc+bd)f = eacf+eadf+ebcf+ebdf

∑ ABSORPTION RULE:
o If X and Y denote the same set of paths, then the union of these sets is
unchanged; consequently,
RULE 5: X+X=X (Absorption Rule)
o If a set consists of paths names and a member of that set is added to it, the
"new" name, which is already in that set of names, contributes nothing and can be
ignored.
o For example,
o if X=a+aa+abc+abcd+def then
X+a = X+aa = X+abc = X+abcd = X+def = X
It follows that any arbitrary sum of identical path expressions reduces to the same path expression.
∑ LOOPS:
Loops can be understood as an infinite set of parallel paths. Say that the loop consists of a
single link b. Then the set of all paths through that loop point is
b0 + b1 + b2 + b3 + b4 + b5 + . . .

Figure 5.3: Examples of path loops.


This potentially infinite sum is denoted by b* for an individual link and by X* when X is a path expression.

Figure 5.4: Another example of path loops.


o The path expression for the above figure is denoted by the
notation: ab*c = ac + abc + abbc + abbbc + . . .
o Evidently,
aa*=a*a=a+ and XX*=X*X=X+
o It is sometimes convenient to denote the fact that a loop cannot be taken more than a
certain, say n, number of times.
o A bar is used under the exponent to denote the fact as follows:
Xn = X0 + X1 + X2 + X3 + X4 + X5 + . . . + Xn
RULES 6 - 16:
The following rules can be derived from the previous rules:
RULE 6: X^n + X^m = X^n if n > m; X^n + X^m = X^m if m > n
RULE 7: X^n X^m = X^(n+m)
RULE 8: X^n X* = X* X^n = X*
RULE 9: X^n X+ = X+ X^n = X+
RULE 10: X* X+ = X+ X* = X+
RULE 11: 1 + 1 = 1
RULE 12: 1X = X1 = X
Following or preceding a set of paths by a path of zero length does not change the set.
RULE 13: 1^n = 1^n̲ = 1* = 1+ = 1
No matter how often you traverse a path of zero length, it is still a path of zero length.
RULE 14: 1+ + 1 = 1* = 1
The null set of paths is denoted by the numeral 0. It obeys the following rules:
RULE 15: X + 0 = 0 + X = X
RULE 16: 0X = X0 = 0
If you block the paths of a graph fore or aft by a graph that has no paths, there won’t be any paths.
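The loop rules can be illustrated the same way for bounded loops (again a small Python
sketch, not from the text; X* itself denotes an infinite set and is only approximated here
by a bound):

    def power(X, n):
        # X^n: X concatenated with itself n times; X^0 = {""}, the zero-length "1" path.
        result = {""}
        for _ in range(n):
            result = {r + x for r in result for x in X}
        return result

    def bounded_loop(X, n):
        # The underlined exponent: X^n̲ = X^0 + X^1 + ... + X^n.
        result = set()
        for k in range(n + 1):
            result |= power(X, k)
        return result

    print(sorted(bounded_loop({"b"}, 3)))   # ['', 'b', 'bb', 'bbb']
    # Rule 7 check: X^2 X^3 = X^(2+3)
    X = {"ab", "c"}
    assert {p + q for p in power(X, 2) for q in power(X, 3)} == power(X, 5)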
REDUCTION PROCEDURE:

∑ REDUCTION PROCEDURE ALGORITHM:


o This section presents a reduction procedure for converting a flowgraph whose
links are labeled with names into a path expression that denotes the set of all
entry/exit paths in that flowgraph. The procedure is a node-by-node
removal algorithm.
o The steps in Reduction Algorithm are as follows:
1. Combine all serial links by multiplying their pathexpressions.
2. Combine all parallel links by adding their pathexpressions.
3. Remove all self-loops (from any node to itself) by replacing them
with a link of the form X*, where X is the path expression of the link
in that loop.

STEPS 4 - 8 ARE IN THE ALGORITHM'S LOOP:
4. Select any node for removal other than the initial or final node.
Replace it with a set of equivalent links whose path expressions
correspond to all the ways you can form a product of the set of
inlinks with the set of outlinks of that node.
5. Combine any remaining serial links by multiplying their path expressions.
6. Combine all parallel links by adding their path expressions.
7. Remove all self-loops as in step 3.
8. Does the graph consist of a single link between the entry node and
the exit node? If yes, then the path expression for that link is a
path expression for the original flowgraph; otherwise, return to step
4.
o A flowgraph can have many equivalent path expressions between a given
pair of nodes; that is, there are many different ways to generate the set of
all paths between two nodes without affecting the content of that set.
o The appearance of the path expression depends, in general, on the
order in which nodes are removed.
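A sketch of this reduction in Python (the dict-of-dicts flowgraph encoding and the helper
names are assumptions, not the text's notation; this is the familiar state-elimination
construction, with parenthesized sums so later products stay correct):

    def reduce_flowgraph(edges, entry, exit):
        # edges: {from_node: {to_node: "path expression"}}
        def parallel(a, b):                       # step 6: combine parallel links
            return "(" + a + "+" + b + ")"
        nodes = set(edges) | {q for d in edges.values() for q in d}
        for n in list(nodes - {entry, exit}):     # step 4: pick a node to remove
            loop = edges.get(n, {}).pop(n, None)  # steps 3/7: self-loop becomes Z*
            star = "(" + loop + ")*" if loop else ""
            ins = [(p, d[n]) for p, d in edges.items() if n in d]
            outs = list(edges.pop(n, {}).items())
            for p, e_in in ins:
                del edges[p][n]
                for q, e_out in outs:             # cross-term step
                    expr = e_in + star + e_out    # step 5: serial product
                    row = edges.setdefault(p, {})
                    row[q] = parallel(row[q], expr) if q in row else expr
        return edges[entry][exit]

    g = {1: {2: "a"}, 2: {2: "b", 3: "c", 4: "d"}, 3: {4: "e"}}
    print(reduce_flowgraph(g, 1, 4))
    # e.g. a(b)*(d+ce) -- the exact form depends on the removal order,
    # as noted above, but the path set it denotes is the same.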

∑ CROSS-TERM STEP (STEP 4):


o The cross-term step is the fundamental step of the reduction algorithm.
o It removes a node, thereby reducing the number of nodes by one.
o Successive applications of this step eventually get you down to one entry and
one exit node. The following diagram shows the situation at an arbitrary node that
has been selected for removal:

o From the above diagram, one can infer:


o (a + b)(c + d + e) = ac + ad + ae + bc + bd + be

∑ LOOP REMOVAL OPERATIONS:


o There are two ways of looking at the loop-removal operation:

o In the first way, we remove the self-loop and then multiply all outgoing links by
Z*.
o In the second way, we split the node into two equivalent nodes, call them A and A'
and put in a link between them whose path expression is Z*. Then we remove
node A' using steps 4 and 5 to yield outgoing links whose path expressions are
Z*X and Z*Y.

∑ A REDUCTION PROCEDURE - EXAMPLE:


o Let us apply this algorithm to the following graph, where we remove
several nodes in order; that is
Figure 5.5: Example Flowgraph for demonstrating reduction
procedure.

o Remove node 10 by applying step 4 and combine by step 5 to yield


o Remove node 9 by applying steps 4 and 5 to yield

o Remove node 7 by steps 4 and 5, as follows:

o Remove node 8 by steps 4 and 5, to obtain:

o PARALLEL TERM (STEP 6):


Removal of node 8 above led to a pair of parallel links between nodes 4 and 5. Combine
them into an equivalent link whose path expression is c + gkh; that is,

o LOOP TERM (STEP 7):


Removing node 4 leads to a loop term. The graph has now been replaced with
the following equivalent simpler graph:

o Continue the process by applying the loop-removal step as follows:

o Removing node 5 produces:

o Remove the loop at node 6 to yield:

o Remove node 3 to yield

o Removing the loop and then node 6 results in the following expression:
a(bgjf)*b(c+gkh)d((ilhd)*imf(bjgf)*b(c+gkh)d)*(ilhd)*e

o You can practice by applying the algorithm on the following flowgraphs and
generate their respective path expressions:
Figure 5.6: Some graphs and their path expressions.
APPLICATIONS:
o The purpose of the node removal algorithm is to present one very generalized
concept - the path expression - and a way of getting it.
o Every application follows this common pattern:
1. Convert the program or graph into a path expression.
2. Identify a property of interest and derive an appropriate set of
"arithmetic" rules that characterizes the property.
3. Replace the link names by the link weights for the property of interest.
The path expression has now been converted to an expression in some
algebra, such as ordinary algebra, regular expressions, or boolean algebra.
This algebraic expression summarizes the property of interest over the
set of all paths.
4. Simplify or evaluate the resulting "algebraic" expression to answer the
question you asked.

∑ HOW MANY PATHS IN A FLOW GRAPH ?


o The question is not simple. Here are some ways you
could ask it:
1. What is the maximum number of different paths possible?
2. What is the fewest number of paths possible?
3. How many different paths are there really?
4. What is the average number of paths?
o Determining the actual number of different paths is an
inherently difficult problem because there could be unachievable
paths resulting from correlated
and dependent predicates.
o If we know both of these numbers (maximum and minimum number of possible
paths) we have a good idea of how complete our testing is.
o Asking for "the average number of paths" is meaningless.

∑ MAXIMUM PATH COUNT ARITHMETIC:
o Label each link with a link weight that corresponds to the number of paths that
link represents.
o Also mark each loop with the maximum number of times that loop can be taken.
If the answer is infinite, you might as well stop the analysis because it is clear that
the maximum number of paths will be infinite.
o There are three cases of interest: parallel links, serial links, and loops.

o This arithmetic is an ordinary algebra. The weight is the number of paths in


each set.
o EXAMPLE:
ß The following is a reasonably well-structured program.

Each link represents a single link and consequently is given a weight of "1"
to start. Let’s say the outer loop will be taken exactly four times and the inner
loop can be taken zero to three times. Its path expression, with a little
work, is:
Path expression: a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
ß A: The flow graph should be annotated by replacing each link name with
the maximum number of paths through that link (1) and also noting the
number of times for looping.
ß B: Combine the first pair of parallel loops outside the loop and also
the pair in the outer loop.
ß C: Multiply the things out and remove nodes to clear the clutter.
ß D: For the inner loop, calculate the total weight. The inner loop can execute a
minimum of 0 times and a maximum of 3 times, so it is evaluated as follows:
1^3̲ = 1^0 + 1^1 + 1^2 + 1^3 = 1 + 1 + 1 + 1 = 4
ß E: Multiply the link weights inside the loop: 1 × 4 = 4
ß F: Evaluate the loop by multiplying the link weights: 2 × 4 = 8.
ß G: Simplifying the graph further results in the total maximum number
of paths in the flowgraph:
2 × 8^4 × 4 = 32,768.
Alternatively, you could have substituted a "1" for each link in the path expression and then
simplified, as follows:

a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
= 1(1 + 1)1(1(1 × 1)^3̲ × 1 × 1 × 1(1 + 1)1)^4 × 1(1 × 1)^3̲ × 1 × 1 × 1
= 2(1^3̲ × 2)^4 × 1^3̲
= 2(4 × 2)^4 × 4
= 2 × 8^4 × 4 = 32,768
This is the same result we got graphically. Actually, the outer loop should be taken exactly four
times; that doesn't mean it can be taken zero to four times. Consequently, there is a superfluous
"4" on the outlink in the last step. Therefore the maximum number of different paths is 8,192
rather than 32,768.
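The same arithmetic can be mechanized. A small Python sketch of the calculation above
(function names are illustrative; the weights are path counts):

    from math import prod

    def parallel(*weights):            # parallel links: weights add
        return sum(weights)

    def series(*weights):              # serial links: weights multiply
        return prod(weights)

    def loop(weight, n_min, n_max):    # loop taken n_min..n_max times
        return sum(weight ** k for k in range(n_min, n_max + 1))

    inner = loop(1, 0, 3)                              # (fi): 0..3 times -> 4
    body = series(1, inner, 1, 1, parallel(1, 1), 1)   # e(fi)*fgj(m+l)k -> 8
    total = series(parallel(1, 1),                     # (b+c) -> 2
                   loop(body, 4, 4),                   # outer loop: 8^4
                   inner, 1)                           # tail e(fi)*fgh -> 4
    print(total)   # 32768; dropping the superfluous factor of 4 gives 8,192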

STRUCTURED FLOWGRAPH:
Structured code can be defined in several different ways that do not involve ad-hoc rules such as
not using
GOTOs.
A structured flowgraph is one that can be reduced to a single link by successive
application of the transformations of Figure 5.7.

Figure 5.7: Structured Flowgraph Transformations.

The node-by-node reduction procedure can also be used as a test for structured code.Flow graphs
that DO NOT
contain one or more of the graphs shown below (Figure 5.8) as subgraphs are structured.
1. Jumping into loops
2. Jumping out of loops
3. Branching into decisions
4. Branching out of decisions
Figure 5.8: Un-structured sub-graphs.
LOWER PATH COUNT ARITHMETIC:
A lower bound on the number of paths in a routine can be approximated for structured flow
graphs.
The arithmetic is as follows:

The values of the weights are the number of members in a set of paths.
EXAMPLE:
ß Applying the arithmetic to the earlier example gives us identical
steps until step 3 (C), as below:
ß From step 4 onward, the calculation differs from the previous example:

ß If you observe the original graph, it takes at least two paths to cover it,
and it can in fact be covered in two paths.
ß If you have fewer paths in your test plan than this minimum, you
probably haven't achieved coverage. It's another check.

CALCULATING THE PROBABILITY:


Path selection should be biased toward the low- rather than the high-probability paths. This
raises an interesting question:
What is the probability of being at a certain point in a routine?

This question can be answered under suitable assumptions primarily that all probabilities
involved are independent, which is to say that all decisions are independent and uncorrelated.
We use the same algorithm as before: node-by-node removal of uninteresting nodes.
Weights, Notations and Arithmetic:
ß Probabilities can come into the act only at decisions (including decisions
associated with loops).
ß Annotate each outlink with a weight equal to the probability of going in
that direction.
ß Evidently, the sum of the outlink probabilities must equal 1
ß For a simple loop, if the loop will be taken a mean of N times, the looping
probability is N/(N + 1) and the probability of not looping is 1/(N + 1).
ß A link that is not part of a decision node has a probability of 1.
ß The arithmetic rules are those of ordinary arithmetic.

ß In this table, in case of a loop, PA is the probability of the link leaving


the loop and PL is the probability of looping.
ß The rules are those of ordinary probability theory.
1. If you can do something either from column A with a probability of
PA or from column B with a probability PB, then the probability
that you do either is PA + PB.
2. For the series case, if you must do both things, and their
probabilities are independent (as assumed), then the probability that
you do both is the product of their probabilities.
ß For example, a loop node has a looping probability of PL and a
probability of not looping of PA, which is obviously equal to 1 - PL.
ß Following the above rule, all we've done is replace the outgoing
probability with 1 - so why the complicated rule? After a few steps in
which you've removed nodes, combined parallel terms, removed loops
and the like, you might find something like this:

because PL + PA + PB + PC = 1, 1 - PL = PA + PB + PC, and

which is
what we've postulated for any decision. In other words,
division by 1 - PL renormalizes the outlink probabilities so
that their sum equals unity after the loop is removed.
EXAMPLE:
ß Here is a complicated bit of logic. We want to know
the probability associated with cases A, B, and C.

ß Let us do this in three parts, starting with case A. Note that


the sum of the probabilities at each decision node is equal
to 1. Start by throwing away anything that isn't on the way
to case A, and then apply the reduction procedure. To avoid
clutter, we usually leave out probabilities equal to 1.
CASE A:
ß Case B is simpler:

ß Case C is similar and should yield a probability of 1 - 0.125 - 0.158 = 0.717:

ß These are checks. It's a good idea when doing this sort of thing to calculate all
the probabilities and to verify that the sum of the routine's exit
probabilities does equal 1.
ß If it doesn't, then you've made a calculation error or, more likely, you've left
out some branch.
ß How about path probabilities? That's easy. Just trace the path
of interest and multiply the probabilities as you go.
ß Alternatively, write down the path name and do the indicated arithmetic
operation.
ß Say that a path consisted of links a, b, c, d, e, and the associated
probabilities were .2, .5, 1., .01, and 1. respectively. Path
abcbcbcdeabddea would have a probability of 5 × 10^-10.
ß Long paths are usually improbable.
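A few lines of Python make the rule concrete (using the link probabilities just given):

    # Probability of a path = the product of its link probabilities.
    probs = {"a": 0.2, "b": 0.5, "c": 1.0, "d": 0.01, "e": 1.0}
    p = 1.0
    for link in "abcbcbcdeabddea":
        p *= probs[link]
    print(p)   # 5e-10 -- long paths are usually improbable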

MEAN PROCESSING TIME OF A ROUTINE:


Given the execution time of all statements or instructions for every link in a flowgraph, and
the probability of each direction for all decisions, the problem is to find the mean processing
time for the routine as a whole.
The model has two weights associated with every link: the processing time for that link, denoted by
T, and the
probability of that link P.
The arithmetic rules for calculating the mean time:

EXAMPLE:
1. Start with the original flow graph annotated with probabilities and processing time.

2. Combine the parallel links of the outer loop. The result is just the mean of the
processing times for the links because there aren't any other links leaving the first node.
Also combine the pair of links at the beginning of the flow graph.
3. Combine as many serial links as you can.
4. Use the cross-term step to eliminate a node and to create the inner self - loop.
5. Finally, you can get the mean processing time by using the arithmetic rules as
follows:
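As a minimal Python sketch of the same arithmetic (the formulas below are assumptions
restating the rules: serial times add, parallel times are probability-weighted means, and a
loop with looping probability PL contributes a mean of PL/(1 - PL) extra traversals):

    def series(t1, t2):
        # Serial links: both are executed, so their times add.
        return t1 + t2

    def parallel(p1, t1, p2, t2):
        # Alternatives: combined probability and probability-weighted mean time.
        p = p1 + p2
        return p, (p1 * t1 + p2 * t2) / p

    def remove_loop(p_loop, t_loop, t_out):
        # Mean number of loop traversals is p_loop / (1 - p_loop).
        return t_out + (p_loop / (1 - p_loop)) * t_loop

    # A 20-unit loop body taken with probability 0.75 (a mean of 3 traversals)
    # followed by a 10-unit exit link:
    print(remove_loop(0.75, 20, 10))   # 70.0 time units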

PUSH/POP, GET/RETURN:
This model can be used to answer several different questions that can turn up in debugging.
It can also help decide which test cases to design.
The question is:

Given a pair of complementary operations such as PUSH (the stack) and POP (the stack),
considering the set of all possible paths through the routine, what is the net effect of the
routine? PUSH or POP? How many times? Under what conditions?
Here are some other examples of complementary operations to which this model applies:
GET/RETURN a resource block.
OPEN/CLOSE a file.
START/STOP a device or process.
EXAMPLE 1 (PUSH / POP):
ß Here is the Push/Pop Arithmetic:

ß The numeral 1 is used to indicate that nothing of interest (neither


PUSH nor POP) occurs on a given link.
ß "H" denotes PUSH and "P" denotes POP. The operations are
commutative, associative, and distributive.

ß Consider the following flow graph:

P(P + 1)1{P(HH)^n1 HP1(P + H)1}^n2 P(HH)^n1 HPH


ß Simplifying by using the arithmetic tables,
= (P^2 + P){P(HH)^n1(P + H)}^n2 (HH)^n1
= (P^2 + P){H^2n1(P^2 + 1)}^n2 H^2n1
ß Below Table 5.9 shows several combinations of values for the two looping
terms - n1 is the number of times the inner loop will be taken and n2 the
number of times the outer loop will be taken.
Figure 5.9: Result of the PUSH / POP Graph Analysis.
ß These expressions state that the stack will be popped only if the inner
loop is not taken.
ß The stack will be left alone only if the inner loop is iterated once, but it
may also be pushed.
ß For all other values of the inner loop, the stack will only be pushed.

EXAMPLE 2 (GET / RETURN):


ß Exactly the same arithmetic tables used for the previous example are used
for GET / RETURN of a buffer block or resource, or, in fact, for any pair of
complementary operations in which the total number of operations in
either direction is cumulative.
ß The arithmetic tables for GET/RETURN are:

"G" denotes GET and "R" denotes RETURN.


ß Consider the following flowgraph:

ß G(G + R)G(GR)*GGR*R
= G(G + R)G^3 R*R
= (G + R)G^3 R*
= (G^4 + G^2)R*
ß This expression specifies the conditions under which the resources will be
balanced on leaving the routine.
ß If the upper branch is taken at the first decision, the second loop must be
taken four times.
ß If the lower branch is taken at the first decision, the second loop must be
taken twice.
ß For any other values, the routine will not balance. Therefore, the
first loop does not have to be instrumented to verify this behavior because
its impact should be nil.
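Both examples can be checked mechanically by encoding the net effect of each operation as
an integer: +1 for PUSH or GET, -1 for POP or RETURN, 0 for the no-op 1. A Python sketch
(illustrative; the star loop is truncated to a finite bound):

    def product(X, Y):     # concatenation: net effects add
        return {x + y for x in X for y in Y}

    def path_sum(X, Y):    # alternatives: set union
        return X | Y

    G, R = {+1}, {-1}
    # (G^4 + G^2): the net effect on entry to R* is either +4 or +2 GETs.
    heads = path_sum(product(product(G, G), product(G, G)), product(G, G))
    tails = {-k for k in range(6)}      # R* taken 0..5 times (finite stand-in)
    nets = product(heads, tails)
    print(0 in nets)   # True: balance is achievable
    # Balance requires exactly 4 RETURNs after the upper branch (G^4)
    # or exactly 2 after the lower branch (G^2), matching the analysis above.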

LIMITATIONS AND SOLUTIONS:

o The main limitation to these applications is the problem of unachievable paths.


o The node-by-node reduction procedure, and most graph-theory-based algorithms work
well when all paths are possible, but may provide misleading results when some paths
are unachievable.
o The approach to handling unachievable paths (for any application) is to partition
the graph into subgraphs so that all paths in each of the subgraphs are achievable.
o The resulting subgraphs may overlap, because one path may be common to
several different subgraphs.
o Each predicate's truth-functional value potentially splits the graph into two
subgraphs. For n predicates, there could be as many as 2^n subgraphs.
REGULAR EXPRESSIONS AND FLOW ANOMALY DETECTION:

∑ THE PROBLEM:
o The generic flow-anomaly detection problem (note: not just data-flow
anomalies, but any flow anomaly) is that of looking for a specific sequence of
operations considering all possible paths through a routine.
o Let the operations be SET and RESET, denoted by s and r respectively, and we
want to know if there is a SET followed immediately by a SET or a RESET
followed immediately by a RESET (an ss or an rr sequence).
o Some more application examples:
1. A file can be opened (o), closed (c), read (r), or written (w). If the file is
read or written to after it's been closed, the sequence is nonsensical.
Therefore, cr and cw are anomalous. Similarly, if the file is read before
it's been written, just after opening, we may have a bug. Therefore, or is
also anomalous. Furthermore, oo and cc, though not actual bugs, are a
waste of time and therefore should also be examined.
2. A tape transport can do a rewind (d), fast-forward (f), read (r), write (w),
stop (p), and skip (k). There are rules concerning the use of the transport;
for example, you cannot go from rewind to fast-forward without an
intervening stop or from rewind or fast-forward to read or write without an
intervening stop. The following sequences are anomalous: df, dr, dw, fd,
and fr. Does the flowgraph lead to anomalous sequences on any path? If
so, what sequences and under what circumstances?
3. The data-flow anomalies discussed in Unit 4 require us to detect the
dd, dk, kk, and ku sequences. Are there paths with anomalous data
flows?

∑ THE METHOD:
o Annotate each link in the graph with the appropriate operator or the null
operator 1.
o Simplify things to the extent possible, using the fact that a + a = a and 1^2 = 1.
o You now have a regular expression that denotes all the possible sequences
of operators in that graph. You can now examine that regular expression for
the sequences of interest.
o EXAMPLE: Let A, B, C be nonempty sets of character sequences whose smallest
string is at least one character long. Let T be a two-character string of characters.
Then if T is a substring of (i.e., if T appears within) AB^nC, then T will appear in
AB^2C. (HUANG's Theorem)
As an example, let
A = pp
B = srr
C = rp
T = ss
The theorem states that ss will appear in pp(srr)^n rp if it appears in pp(srr)^2 rp.
o However, let

A = p + pp + ps
B = psr + ps(r + ps)
C = rp
T = p^4

Is it obvious that there is a p^4 sequence in AB^nC? The theorem states that we have only
to look at

(p + pp + ps)[psr + ps(r + ps)]^2 rp

Multiplying out the expression and simplifying shows that there is no p^4 sequence.
o Incidentally, the above observation is an informal proof of the wisdom of looping
twice discussed in Unit 2. Because data-flow anomalies are represented by two-
character sequences, it follows from the above theorem that looping twice is what
you need to do to find such anomalies.
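The practical force of the theorem is that a finite expansion suffices. A tiny Python sketch
of the first example above, checking pp(srr)^2 rp instead of pp(srr)^n rp:

    A, B, C = "pp", "srr", "rp"
    expansion = A + B * 2 + C           # pp(srr)^2 rp = "ppsrrsrrrp"
    for anomaly in ("ss", "rr"):
        print(anomaly, anomaly in expansion)
    # ss False -- no SET-SET anomaly on any path
    # rr True  -- a RESET-RESET anomaly exists (if the path is achievable)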

∑ LIMITATIONS:
o Huang's theorem can be easily generalized to cover sequences of greater length
than two characters. Beyond three characters, though, things get complex and this
method has probably reached its utilitarian limit for manual application.
o There are some nice theorems for finding sequences that occur at the beginnings
and ends of strings but no nice algorithms for finding strings buried in an
expression.
o Static flow analysis methods can't determine whether a path is or is not
achievable. Unless the flow analysis includes symbolic execution or similar
techniques, the impact of unachievable paths will not be included in the analysis.
The flow-anomaly application, for example, doesn't tell us that there will be a flow
anomaly - it tells us that if the path is achievable, then there will be a flow anomaly. Such
analytical problems go away, of course, if you take the trouble to design routines for which
all paths are achievable.
UNIT IV(Part-II)
LOGIC BASED TESTING

OVERVIEW OF LOGIC BASED TESTING:

∑ INTRODUCTION:
o The functional requirements of many programs can be specified by decision
tables, which provide a useful basis for program and test design.
o Consistency and completeness can be analyzed by using boolean algebra, which
can also be used as a basis for test design. Boolean algebra is trivialized by using
Karnaugh-Veitch charts.
o "Logic" is one of the most often used words in programmers' vocabularies but one
of their least used techniques.
o Boolean algebra is to logic as arithmetic is to mathematics. Without it, the tester or
programmer is cut off from many test and design techniques and tools that
incorporate those techniques.
o Logic has been, for several decades, the primary tool of hardware logic designers.
o Many test methods developed for hardware logic can be adapted to software logic
testing. Because hardware testing automation is 10 to 15 years ahead of software
testing automation, hardware testing methods and their associated
theory are a fertile ground for software testing methods.
o As programming and test techniques have improved, the bugs have shifted
closer to the process front end, to requirements and their specifications. These
bugs range from 8% to 30% of the total and because they're first-in and last-out,
they're the costliest of all.
o The trouble with specifications is that they're hard to express.
o Boolean algebra (also known as the sentential calculus) is the most basic of all
logic systems.
o Higher-order logic systems are needed and used for formal specifications.
o Much of logical analysis can be and is embedded in tools. But these tools
incorporate methods to simplify, transform, and check specifications, and the
methods are to a large extent based on boolean algebra.

o KNOWLEDGE BASED SYSTEM:

ß The knowledge-based system (also expert system, or "artificial


intelligence" system) has become the programming construct of choice for
many applications that were once considered very difficult.
ß Knowledge-based systems incorporate knowledge from a knowledge
domain such as medicine, law, or civil engineering into a database. The
data can then be queried and interacted with to provide solutions to
problems in that domain.
ß One implementation of knowledge-based systems is to incorporate the
expert's knowledge into a set of rules. The user can then provide data
and ask questions based on that data.
ß The user's data is processed through the rule base to yield conclusions
(tentative or definite) and requests for more data. The processing is done
by a program called the inference engine.
ß Understanding knowledge-based systems and their validation problems
requires an understanding of formal logic.
o Decision tables are extensively used in business data processing; Decision-table
preprocessors as extensions to COBOL are in common use; boolean algebra is
embedded in the implementation of these processors.
o Although programmed tools are nice to have, most of the benefits of boolean
algebra can be reaped by wholly manual means if you have the right conceptual
tool: the Karnaugh-Veitch diagram is that conceptual tool.

∑ DECISION TABLES:

∑ Figure 6.1 is a limited-entry decision table. It consists of four areas called the condition
stub, the condition entry, the action stub, and the action entry.
∑ Each column of the table is a rule that specifies the conditions under which the actions
named in the action stub will take place.
∑ The condition stub is a list of names of conditions.

Figure 6.1 : Examples of Decision Table.


∑ A more general decision table can be as below:
Figure 6.2 : Another Example of a Decision Table.
∑ A rule specifies whether a condition should or should not be met for the rule to be
satisfied. "YES" means that the condition must be met, "NO" means that the condition
must not be met, and "I" means that the condition plays no part in the rule, or it is
immaterial to that rule.
The action stub names the actions the routine will take or initiate if the rule is satisfied.
∑ If the action entry is "YES", the action will take place; if "NO", the action will not take
place.
The table in Figure 6.1 can be translated as follows:

Action 1 will take place if conditions 1 and 2 are met and if conditions 3 and 4 are not met (rule
1) or if conditions 1, 3, and 4 are met (rule 2).
∑ "Condition" is another word for predicate.
∑ Decision-table uses "condition" and "satisfied" or "met". Let us use "predicate" and
TRUE / FALSE.
∑ Now the above translations become:
1. Action 1 will be taken if predicates 1 and 2 are true and if predicates 3 and 4 are
false (rule 1), or if predicates 1, 3, and 4 are true (rule 2).
2. Action 2 will be taken if the predicates are all false, (rule 3).
3. Action 3 will take place if predicate 1 is false and predicate 4 is true (rule 4).
∑ In addition to the stated rules, we also need a Default Rule that specifies the default
action to be taken when all other rules fail. The default rules for Table in Figure 6.1 is
shown in Figure 6.3

Figure 6.3 : The default rules of Table in Figure 6.1

∑ DECISION-TABLE PROCESSORS:

o Decision tables can be automatically translated into code and, as such, are a
higher-order language
o In execution, rule 1 is tried first: if it is satisfied, the corresponding action takes place.
o Otherwise, rule 2 is tried. This process continues until either a satisfied rule
results in an action or no rule is satisfied and the default action is taken.
o Decision tables have become a useful tool in the programmer's kit, in business
data processing.
DECISION-TABLES AS BASIS FOR TEST CASE DESIGN:

The use of a decision-table model to design test cases is appropriate when the following
assumptions hold (a small sketch of an executable decision table follows this list):
1. The specification is given as a decision table or can be easily converted into one.
2. The order in which the predicates are evaluated does not affect interpretation of the
rules or the resulting action - i.e., an arbitrary permutation of the predicate order
will not, or should not, affect which action takes place.
3. The order in which the rules are evaluated does not affect the resulting action - i.e.,
an arbitrary permutation of rules will not, or should not, affect which action takes
place.
4. Once a rule is satisfied and an action selected, no other rule need be examined.
5. If several actions can result from satisfying a rule, the order in which the actions
are executed doesn't matter.
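Under these assumptions a decision table is directly executable. A small Python sketch
(illustrative, not from the text) encoding the earlier translation of Figure 6.1; immaterial
("I") conditions are simply omitted from a rule, and the first satisfied rule fires:

    RULES = [
        # ({predicate: required truth value}, action); omitted = immaterial
        ({1: True,  2: True,  3: False, 4: False}, "action 1"),   # rule 1
        ({1: True,  3: True,  4: True},            "action 1"),   # rule 2
        ({1: False, 2: False, 3: False, 4: False}, "action 2"),   # rule 3
        ({1: False, 4: True},                      "action 3"),   # rule 4
    ]

    def decide(predicates, rules, default="default action"):
        for wanted, action in rules:
            if all(predicates[k] == v for k, v in wanted.items()):
                return action
        return default

    print(decide({1: True, 2: True, 3: False, 4: False}, RULES))   # action 1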

DECISION-TABLES AND STRUCTURE:

o Decision tables can also be used to examine a program's structure.


o Figure 6.4 shows a program segment that consists of a decision tree.
o These decisions, in various combinations, can lead to actions 1, 2, or 3.

Figure 6.4 : A Sample Program


o If the decision appears on a path, put in a YES or NO as appropriate. If the
decision does not appear on the path, put in an I. Rule 1 does not contain
decision C; therefore its entries are: YES, YES, I, YES.
o The corresponding decision table is shown in Table 6.1
             RULE 1   RULE 2   RULE 3   RULE 4   RULE 5   RULE 6
CONDITION A  YES      YES      YES      NO       NO       NO
CONDITION B  YES      NO       YES      I        I        I
CONDITION C  I        I        I        YES      NO       NO
CONDITION D  YES      I        NO       I        YES      NO

ACTION 1     YES      YES      NO       NO       NO       NO
ACTION 2     NO       NO       YES      YES      YES      NO
ACTION 3     NO       NO       NO       NO       NO       YES

Table 6.1: Decision Table corresponding to Figure 6.4
As an example, if we expand the immaterial cases of Table 6.1 - replacing each rule that
contains an "I" entry with one rule for its YES case and one for its NO case - we obtain
Table 6.2 as below:

             RULE 1   RULE 2       RULE 3   RULE 4       RULE 5   RULE 6
CONDITION A  Y  Y     Y  Y  Y  Y   Y  Y     N  N  N  N   N  N     N  N
CONDITION B  Y  Y     N  N  N  N   Y  Y     Y  Y  N  N   Y  N     Y  N
CONDITION C  Y  N     Y  Y  N  N   Y  N     Y  Y  Y  Y   N  N     N  N
CONDITION D  Y  Y     Y  N  Y  N   N  N     Y  N  Y  N   Y  Y     N  N

Table 6.2: Expansion of Table 6.1
1. Sixteen cases are represented in Table 6.2, and no case appears twice.
2. Consequently, the flowgraph appears to be complete and consistent.
3. As a first check, before you look for all sixteen combinations, count
the number of Y's and N's in each row. They should be equal. We
can find the bug that way.

∑ ANOTHER EXAMPLE - A TROUBLESOME PROGRAM:

1. Consider the following specification whose putative flowgraph is shown in
Figure 6.5:
1. If condition A is met, do process A1 no matter what other
actions are taken or what other conditions are met.
2. If condition B is met, do process A2 no matter what other actions are
taken or what other conditions are met.
3. If condition C is met, do process A3 no matter what other actions are
taken or what other conditions are met.
4. If none of the conditions is met, then do processes A1, A2, and A3.
5. When more than one process is done, process A1 must be done first, then
A2, and then A3. The only permissible cases are: (A1), (A2), (A3),
(A1,A3), (A2,A3) and (A1,A2,A3).
2. Figure 6.5 shows a sample program with a bug.

Figure 6.5 : A Troublesome Program


o The programmer tried to force all three processes to be executed for the
case in which none of the conditions is met, but forgot that the B and C
predicates would be evaluated again, thereby bypassing processes A2 and A3.
o Table 6.3 shows the conversion of this flow graph into a decision table
after expansion.

Table 6.3: Decision Table for Figure 6.5

PATH EXPRESSIONS:
GENERAL:
o Logic-based testing is structural testing when it's applied to structure (e.g.,
control flow graph of an implementation); it's functional testing when it's applied
to a specification.
o In logic-based testing we focus on the truth values of control flow predicates.
o A predicate is implemented as a process whose outcome is a truth-functional
value.
o For our purpose, logic-based testing is restricted to binary predicates.
o We start by generating path expressions by path tracing as in Unit V, but this
time, our purpose is to convert the path expressions into boolean algebra, using the
predicates' truth values (e.g., A and Ā) as weights.

BOOLEAN ALGEBRA:
o STEPS:
1. Label each decision with an uppercase letter that represents the truth
value of the predicate. The YES or TRUE branch is labeled with a letter
(say A) and the NO or FALSE branch with the same letter overscored
(say Ā).
2. The truth value of a path is the product of the individual labels.
Concatenation or products mean "AND". For example, the straight-
through path of Figure 6.5, which goes via nodes 3, 6, 7, 8, 10, 11, 12, and
2, has a truth value of ABC. The path via nodes 3, 6, 7, 9 and 2 has a value
of .
3. If two or more paths merge at a node, the fact is expressed by use of a
plus sign (+) which means "OR".

Figure 6.5: A Troublesome Program


o Using this convention, the truth-functional values for several of the nodes can be
expressed in terms of segments from previous nodes. Use the node name to
identify the point.
o There are only two numbers in boolean algebra: zero (0) and one (1). One means
"always true" and zero means "always false".
o RULES OF BOOLEAN ALGEBRA:
ß Boolean algebra has three operators: × (AND), + (OR) and ¯ (NOT).
ß × : meaning AND. Also called multiplication. A statement such as AB (A × B)
means "A and B are both true". This symbol is usually left out, as
in ordinary algebra.
ß + : meaning OR. "A + B" means "either A is true or B is true or both".
ß ¯ : meaning NOT. Also negation or complementation. Ā is read as either "not A"
or "A bar". The entire expression under the bar is negated.
ß The following are the laws of boolean algebra:

In all of the above, a letter can represent a single sentence or an entire boolean
algebra expression. Individual letters in a boolean algebra expression are called
literals (e.g., A, B). The product of several literals is called a product term
(e.g., ABC, DE).
An arbitrary boolean expression that has been multiplied out so that it consists of the sum of
products (e.g., ABC + DEF + GH) is said to be in sum-of-products form.
The result of simplifications (using the rules above) is again in the sum of product form and each
product term in such a simplified version is called a prime implicant. For example, ABC + AB
+ DEF reduce by rule 20 to AB + DEF; that is, AB and DEF are prime
implicants. The path expressions of Figure 6.5 can now be simplified by
applying the rules.
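Such simplifications can be machine-checked by brute force over the truth table. A short
Python sketch verifying that ABC + AB + DEF reduces to AB + DEF:

    from itertools import product

    def lhs(a, b, c, d, e, f):
        return (a and b and c) or (a and b) or (d and e and f)

    def rhs(a, b, c, d, e, f):
        return (a and b) or (d and e and f)

    # Equal for all 2^6 = 64 truth assignments, so AB and DEF are the
    # prime implicants of the expression.
    assert all(lhs(*v) == rhs(*v) for v in product([False, True], repeat=6))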
Similarly,

The deviation from the specification is now clear. The functions should have been:

Loops complicate things because we may have to solve a boolean equation to determine what
predicate value combinations lead to where.

KV CHARTS:

INTRODUCTION:
o If you had to deal with expressions in four, five, or six variables, you could get
bogged down in the algebra and make as many errors in designing test cases as
there are bugs in the routine you're testing.
o Karnaugh-Veitch chart reduces boolean algebraic manipulations to graphical
trivia.
o Beyond six variables these diagrams get cumbersome and may not be effective.
SINGLE VARIABLE:
o Figure 6.6 shows all the boolean functions of a single variable and their
equivalent representation as a KV chart.
Figure 6.6 : KV Charts for Functions of a Single Variable.
o The charts show all possible truth values that the variable A can have.
o A "1" means the variable’s value is "1" or TRUE. A "0" means that the variable's
value is 0 or FALSE.
o The entry in the box (0 or 1) specifies whether the function that the chart
represents is true or false for that value of the variable.
o We usually do not explicitly put in 0 entries but specify only the conditions under
which the function is true.
TWO VARIABLES:
o Figure 6.7 shows eight of the sixteen possible functions of two variables.

Figure 6.7: KV Charts for Functions of Two Variables.


o Each box corresponds to the combination of values of the variables for the row
and column of that box.
o A pair may be adjacent either horizontally or vertically but not diagonally.
o Any variable that changes in either the horizontal or vertical direction does not
appear in the expression.
o In the fifth chart, the B variable changes from 0 to 1 going down the column, and
because the A variable's value for the column is 1, the chart is equivalent to a
simple A.
o Figure 6.8 shows the remaining eight functions of two variables.
Figure 6.8: More Functions of Two Variables.
o The first chart has two 1's in it, but because they are not adjacent, each must
be taken separately.
o They are written using a plus sign.
o It is clear now why there are sixteen functions of two variables.
o Each box in the KV chart corresponds to a combination of the variables' values.
o That combination might or might not be in the function (i.e., the
box corresponding to that combination might have a 1 or 0 entry).
o Since n variables lead to 2^n combinations of 0 and 1 for the variables, and
each such combination (box) can be filled or not filled, there are 2^(2^n) ways of
doing this.
o Consequently, for one variable there are 2^(2^1) = 4 functions, 16 functions of 2
variables, 256 functions of 3 variables, 65,536 functions of 4 variables, and so
on.
o Given two charts over the same variables, arranged the same way, their product is
the term by term product, their sum is the term by term sum, and the
negation of a chart is gotten by reversing all the 0 and 1 entries in the chart.


THREE VARIABLES:
o KV charts for three variables are shown below.
o As before, each box represents an elementary term of three variables with a bar
appearing or not appearing according to whether the row-column heading for
that box is 0 or 1.
o A three-variable chart can have groupings of 1, 2, 4, and 8 boxes.
o A few examples will illustrate the principles:
Figure 6.8: KV Charts for Functions of Three Variables.
o You'll notice that there are several ways to circle the boxes into maximum-
sized covering groups.
UNIT-V
STATES, STATE GRAPHS, AND TRANSITION TESTING

State, State Graphs and Transition testing:- state graphs, good & bad state graphs,
state testing, Testability tips.
Graph Matrices and Application:-Motivational overview, matrix of graph, relations,
power of a matrix, node reduction algorithm, building tools. ( Student should be given an
exposure to a tool like JMeter or Win-runner).

Introduction
The finite state machine is as fundamental to software engineering as boolean algebra
to logic.
State testing strategies are based on the use of finite state machine models for software
structure, software behavior, or specifications of software behavior.
Finite state machines can also be implemented as table-driven software, in which case
they are a powerful design option.
State Graphs
∑ A state is defined as: “A combination of circumstances or attributes belonging for the
time being to a person or thing.”
For example, a moving automobile whose engine is running can have the following
states with respect to its transmission.
ß Reverse gear
ß Neutral gear
ß First gear
ß Second gear
ß Third gear
ß Fourth gear
State graph - Example
∑ For example, a program that detects the character sequence “ZCZC” can be in the
following states.
Neither ZCZC nor any part of it has been detected.
ß Z has been detected.
ß ZC has been detected.
ß ZCZ has been detected.
ß ZCZC has been detected.

States are represented by nodes. States are numbered or may be identified by words or whatever else is
convenient.
Inputs and Transitions
Whatever is being modeled is subjected to inputs. As a result of those inputs, the state
changes, or is said to have made a Transition.
Transitions are denoted by links that join the states.

The inputs that cause the transitions are marked on the link; that is, the inputs are link
weights.
There is one outlink from every state for every input.
∑ If several inputs in a state cause a transition to the same subsequent state, instead of
drawing a bunch of parallel links we can abbreviate the notation by listing the several
inputs as in: “input1, input2, input3………”.

Finite State Machine


A finite state machine is an abstract device that can be represented by a state graph
having a finite number of states and a finite number of transitions between states.
o Outputs
An output can be associated with any link.
∑ Outputs are denoted by letters or words and are separated from inputs by a slash as
follows: “input/output”.
∑ As always, output denotes anything of interest that’s observable and is not restricted to
explicit outputs by devices.
Outputs are also link weights.
If every input associated with a transition causes the same output, then denoted it as:
o “input1, input2, input3…………../output”
State tables
∑ Big state graphs are cluttered and hard to follow.
∑ It’s more convenient to represent the state graph as a table (the state table or
state transition table) that specifies the states, the inputs, the transitions and the
outputs.
∑ The following conventions are used:
∑ Each row of the table corresponds to a state.
∑ Each column corresponds to an input condition.
∑ The box at the intersection of a row and a column specifies the next state
(the transition) and the output, if any.
State Table-Example
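Since the example table itself appears as a figure, here is a sketch of the same idea in
Python: a table-driven implementation of the ZCZC detector described earlier (the state
names and the single "other" input class A are assumptions):

    # (state, input) -> next state; inputs are 'Z', 'C', or 'A' (anything else).
    TABLE = {
        ("none", "Z"): "Z",    ("none", "C"): "none", ("none", "A"): "none",
        ("Z",    "Z"): "Z",    ("Z",    "C"): "ZC",   ("Z",    "A"): "none",
        ("ZC",   "Z"): "ZCZ",  ("ZC",   "C"): "none", ("ZC",   "A"): "none",
        ("ZCZ",  "Z"): "Z",    ("ZCZ",  "C"): "ZCZC", ("ZCZ",  "A"): "none",
        ("ZCZC", "Z"): "ZCZC", ("ZCZC", "C"): "ZCZC", ("ZCZC", "A"): "ZCZC",
    }

    def detects_zczc(sequence):
        state = "none"
        for ch in sequence:
            state = TABLE[(state, ch if ch in "ZC" else "A")]
        return state == "ZCZC"

    print(detects_zczc("XXZCZCXX"))   # True
    print(detects_zczc("ZCZX"))       # False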
Time Versus Sequence
∑ State graphs don’t represent time-they represent sequence.
A transition might take microseconds or centuries;
A system could be in one state for milliseconds and another for years- the state graph
would be the same because it has no notion of time.
The finite state machine model can be elaborated to include notions of time
in addition to sequence, as in timed Petri nets.
o Software implementation
There is rarely a direct correspondence between programs and the behavior of a
process described as a state graph.
The state graph represents, the total behavior consisting of the transport, the software,
the executive, the status returns, interrupts, and so on.
There is no simple correspondence between lines of code and states. The state table
forms the basis.
Good State Graphs and Bad
What constitutes a good or a bad state graph is to some extent biased by the kinds of
state graphs that are likely to be used in a software test design context.
Here are some principles for judging.
o The total number of states is equal to the product of the possibilities of factors
that make up the state.
o For every state and input there is exactly one transition specified to exactly one,
possibly the same, state.
o For every transition there is one output action specified. The output could be
trivial, but at least one output does something sensible.
o For every state there is a sequence of inputs that will drive the system back to
the same state.

Important graphs

State Bugs-Number of States


The number of states in a state graph is the number of states we choose to recognize or
model.
The state is directly or indirectly recorded as a combination of values of variables that
appear in the data base.
For example, the state could be composed of the value of a counter whose possible
values ranged from 0 to 9, combined with the setting of two bit flags, leading to a total of
2*2*10=40 states.
The number of states can be computed as follows:
o Identify all the component factors of the state.
o Identify all the allowable values for each factor.
o The number of states is the product of the number of allowable values of all the
factors.
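For instance, the counter-plus-flags example above, as a trivial Python check:

    from math import prod
    factor_values = {"counter": 10, "flag1": 2, "flag2": 2}   # allowable values
    print(prod(factor_values.values()))   # 40 states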
Before you do anything else, before you consider one test case, discuss the number of
states you think there are with the number of states the programmer thinks there are.
∑ There is no point in designing tests intended to check the system’s behavior in various
states if there’s no agreement on how many states there are.
o Impossible States
Sometimes some combinations of factors may appear to be impossible.
∑ The discrepancy between the programmer’s state count and the tester’s state count is
often due to a difference of opinion concerning “impossible states”.
A robust piece of software will not ignore impossible states but will recognize them and
invoke an illogical-condition handler when they appear to have occurred.

Equivalent States
Two states are Equivalent if every sequence of inputs starting from one state produces
exactly the same sequence of outputs when started from the other state. This notion
can also be extended to set of states.

Merging of Equivalent States


Recognizing Equivalent States
Equivalent states can be recognized by the following procedures:
The rows corresponding to the two states are identical with respect to
input/output/next state but the name of the next state could differ.
There are two sets of rows which, except for the state names, have identical state
graphs with respect to transitions and outputs. The two sets can be merged.

Transition Bugs - Unspecified and Contradictory Transitions
Every input-state combination must have a specified transition.
If the transition is impossible, then there must be a mechanism that prevents the input
from occurring in that state.
Exactly one transition must be specified for every combination of input and state.
∑ A program can’t have contradictions or ambiguities.
Ambiguities are impossible because the program will do something for every input. Even if
the state does not change, by definition this is a transition to the same state.

Unreachable States
An unreachable state is like unreachable code.
A state that no input sequence can reach.
An unreachable state is not impossible, just as unreachable code is not impossible. There
may be transitions from an unreachable state to other states; these usually exist because
the state became unreachable as a result of incorrect transitions.
There are two possibilities for unreachable states:
o There is a bug; that is some transitions are missing.
o The transitions are there, but you don’t know about it.

Dead States
A dead state is a state that once entered cannot be left.
This is not necessarily a bug but it is suspicious.

Output Errors
The states, transitions, and the inputs could be correct, there could be no dead or
unreachable states, but the output for a transition could be incorrect.
Output actions must be verified independently of states and transitions.
State Testing
Impact of Bugs
If a routine is specified as a state graph that has been verified as correct in all details.
Program code or table or a combination of both must still be implemented.
A bug can manifest itself as one of the following symptoms:
Wrong number of states.
Wrong transitions for a given state-input combination.
Wrong output for a given transition.
Pairs of states or sets of states that are inadvertently made equivalent.
States or sets of states that are split to create inequivalent duplicates.
States or sets of states that have become dead.
States or sets of states that have become unreachable.

Principles of State Testing


The strategy for state testing is analogous to that used for path testing flow graphs.
∑ Just as it’s impractical to go through every possible path in a flow graph, it’s
impractical to go through every path in a state graph.
The notion of coverage is identical to that used for flow graphs.
∑ Even though most state testing could be done as a single case in a grand tour, it’s
impractical to do it that way for several reasons.
In the early phases of testing, you will never complete the grand tour because of bugs.
Later, in maintenance, testing objectives are understood, and only a few of the states
and transitions have to be tested. A grand tour is a waste of time.
There is so much history in a long test sequence and so much has happened that
verification is difficult.

Starting point of state testing


Define a set of covering input sequences that get back to the initial state when starting
from the initial state.
For each step in each input sequence, define the expected next state, the expected
transition, and the expected output code.
A set of tests, then, consists of three sets of sequences:
o Input sequences
o Corresponding transitions or next-state names
o Output sequences

Limitations and Extensions


State transition coverage in a state graph model does not guarantee complete testing.
Chow defines a hierarchy of paths and methods for combining paths to produce covers of
state graphs.
∑ The simplest is called a “0 switch” which corresponds to testing each transition
individually.
∑ The next level consists of testing transitions sequences consisting of two transitions
called “1 switches”.
∑ The maximum-length switch is the “n-1 switch”, where n is the number of states.
o Situations at which state testing is useful
Any processing where the output is based on the occurrence of one or more sequences
of events, such as detection of specified input sequences, sequential format validation,
parsing, and other situations in which the order of inputs is important.
Most protocols between systems, between humans and machines, between
components of a system.
Device drivers such as for tapes and discs that have complicated retry and
recovery procedures if the action depends on the state.
Whenever a feature is directly and explicitly implemented as one or more state transition tables.

UNIT-5 (PART-II) GRAPH MATRICES AND APPLICATIONS

Problem with Pictorial Graphs


Graphs were introduced as an abstraction of software structure.
Whenever a graph is used as a model, sooner or later we trace paths through it- to find a
set of covering paths, a set of values that will sensitize paths, the logic function that
controls the flow, the processing time of the routine, the equations that define the
domain, or whether a state is reachable or not.
∑ Path tracing is not easy, and it’s subject to error. You can miss a link here and there or
cover some links twice.
∑ One solution to this problem is to represent the graph as a matrix and to use matrix
operations equivalent to path tracing. These methods are more methodical and
mechanical and don’t depend on your ability to see a path; they are more reliable.

Tool Building
If you build test tools or want to know how they work, sooner or later you will be
implementing
or investigating analysis routines based on these methods.
It is hard to build algorithms over visual graphs, so the properties of graph
matrices are fundamental to tool building.
The Basic Algorithms
The basic tool kit consists of:
Matrix multiplication, which is used to get the path expression from every node to
every other node.
A partitioning algorithm for converting graphs with loops into loop free graphs or
equivalence classes.
A collapsing process which gets the path expression from any node to any other node.
The Matrix of a Graph
∑ A graph matrix is a square array with one row and one column for every node in the
graph.
∑ Each row-column combination corresponds to a relation between the node
corresponding to the row and the node corresponding to the column.
∑ The relation for example, could be as simple as the link name, if there is a link
between the nodes.
Some of the things to be observed:
The size of the matrix equals the number of nodes.
There is a place to put every possible direct connection or link between any node and any
other node. The entry at a row and column intersection is the link weight of the link
that connects the two nodes in that direction.
A connection from node i to j does not imply a connection from node j to node i.
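A sketch of a graph matrix in Python (the nodes and link names are illustrative): entry
[i][j] holds the name of the link from node i to node j, or None if there is no link.

    nodes = [1, 2, 3, 4]
    links = {(1, 3): "a", (1, 2): "b", (2, 4): "c", (3, 2): "d", (4, 3): "e"}

    index = {n: k for k, n in enumerate(nodes)}
    matrix = [[None] * len(nodes) for _ in nodes]
    for (i, j), name in links.items():
        matrix[index[i]][index[j]] = name   # direction matters: i -> j only

    for row in matrix:
        print(row)
    # [None, 'b', 'a', None]
    # [None, None, None, 'c']
    # [None, 'd', None, None]
    # [None, None, 'e', None]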
UNIT-4
Syntax Testing
Syntax Testing is a type of black box testing technique which is used to
examine the format and the grammar of the data inputs used in the software
application, whether external or internal, which may be formally described in technical
or established & specified notations such as BNF and could be used to design input
validation tests.
Syntax testing is performed to verify and validate both internal and
external data input to the system, against the specified format, file format, database
schema, protocol and other similar things. Generally, syntax tests are automated, as
they involve the production of a large number of tests.
Syntax testing consists of the following steps:
The methodology of syntax testing can be perceived and understood through
following steps in the sequential order:
 The very first step of the syntax testing involves the identification of the target
language or format.
 Thereafter, syntax of the language is defined, as specified in the formal
notation such as Backus-Naur form (BNF). As each and every input has some
syntax, which may be formally pre-specified, undocumented, etc.
 The final step involves testing and debugging the syntax to ensure its
completeness & consistency. Generally, the syntax is tested using two
conditions as stated below.
 Testing the normal condition using the covering set of paths of the syntax
graph, for the minimum necessary requirements.
 Testing the garbage condition*, using an invalid set of input data.
Note*: Garbage condition is the method of testing the system's tolerance against
the bad or dirty data. The condition is executed by providing dirty data (invalid
data) to the system, which is being not supported by the specified format and
grammar of the syntax.
Garbage in, Garbage out (GIGO)
GIGO (garbage in, garbage out) is a concept common to computer science
and mathematics: the quality of output is determined by the quality of the input.
So, for example, if a mathematical equation is improperly stated, the answer is
unlikely to be correct. Similarly, if incorrect data is input to a program, the output
is unlikely to be informative.
Applications of Syntax Testing
The following are the applications of syntax testing:

1. Command-Driven Software: This is the most obvious application and
probably the most common. If a system is mostly command driven, then
much of its testing can be organized under syntax testing.
2. Menu-Driven Software: The popular alternative to command-driven
software is menu-driven software, in which actions are initiated by selecting
from choices presented in a menu. However, menu-driven or not, there are
still data fields to enter and those fields have a syntax against which syntax
testing is effective. It’s usually very easy, though, because the syntax tree
tends to be shallow.
3. Macro Languages: Many commercial software packages for PCs have a
macro language, also called a scripting language. This is a programming
language that can be used to automate repetitive operations. These languages
are often implemented in commercial packages not just for the users’
convenience but because they are a neat way to implement a lot of
complicated features, such as mail-merge in a word processor. In their best
form, these are full-blown, albeit specialized, programming languages with
formal specifications (often in BNF or the equivalent). Non-commercial
software or software that serves a narrow segment of an industry*also may
have a macro language, but usually, it’s not as clearly defined or implemented
as the commercial (horizontal) packages. Such languages, if they are part of
an application, warrant syntax testing.
4. Communications: All communications systems have an embedded language.
It is the language of the format of messages. Even telephone exchanges use
language to communicate with each other. But proper telephone numbers,
local, long distance and international, have a very formal syntax called a
number plan. Every message used to communicate must have a format (i.e.,
syntax) that must be parsed.
5. Database Query Languages: Any database system has a command language
used to specify what is to be searched and what is to be retrieved. The
simplest ones allow only one key. The more mature systems allow boolean
searches based on user-supplied parameters, which we recognize as being
predicates. Such predicates obviously have a formal syntax and semantics,
and should, therefore, be treated to syntax testing.
6. Compilers and Generated Parsers: The one place syntax testing should not
be used, especially dirty syntax testing, is to test a modern compiler. This
might seem perverse and strange, but a tester should never repeat the tests
previously done by another. Modern compilers have a parser, of course. But
that parser is generated completely automatically by use of a parser generator
given a formal definition in something like BNF. Lexical analyzers are also

generated automatically from formal specifications. From a testing viewpoint,
there’s not much that syntax testing, even when fully automated, can do to
break such lexers and parsers. Before you spend a lot of effort on syntax
testing (even if automated), look at the application and how it has been
implemented. If a generated lexer and parser are used or planned, you’re
unlikely to have great success in using syntax testing; those kinds of bugs
have been totally prevented. The only thing left to test is semantics, which is
often done by another technique such as domain testing.
Backus-Naur notation (BNF)
Backus-Naur Form (BNF) is a notation technique used to describe recursively
the syntax of
programming languages
document formats
communication protocols, etc.
Syntax:
<symbol>::= exp1|exp2|exp3|……..
Where <symbol> is non-terminal, expression are sequences of terminals and
non-terminals.
Example:
<number> ::= <digit> | <number> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• “::=” means “is defined as”, LHS can be replaced with RHS (some variants
use “:=”)
• “|” means “or”
• Angle brackets mean a nonterminal
• Symbols without angle brackets are terminals
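Such a definition can drive test generation directly. A Python sketch (the grammar
encoding and the depth cutoff are assumptions) for the <number> grammar above:

    import random

    # <number> ::= <digit> | <number> <digit>
    # <digit>  ::= 0 | 1 | ... | 9
    GRAMMAR = {
        "number": [["digit"], ["number", "digit"]],
        "digit":  [[d] for d in "0123456789"],
    }

    def generate(symbol, depth=0):
        if symbol not in GRAMMAR:          # terminal symbol: emit as-is
            return symbol
        alternatives = GRAMMAR[symbol]
        # Past a depth limit, always take the first (non-recursive)
        # alternative so the recursion terminates.
        alt = alternatives[0] if depth > 4 else random.choice(alternatives)
        return "".join(generate(s, depth + 1) for s in alt)

    print(generate("number"))   # e.g. "4071" -- a valid <number> test input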
Operators in BNF:

[] Brackets enclose optional items.

{} Braces enclose items only one of which is required.

| A vertical bar separates alternatives within brackets or braces.

+ or # The character "+" or “#” preceding an element indicates one or more occurrences

* The character "*" preceding an element indicates zero or more occurrences

. (dot) Concatenation

Example:
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<unsigned integer> ::= <digit> | <unsigned integer><digit>
<integer> ::= <unsigned integer> | + <unsigned integer> | −<unsigned
integer>
<letter> ::= a | b | c | . . .
<identifier> ::= <letter> | <identifier><letter> | <identifier><digit>
Designed in the 1950–60s to define the syntax of the programming language
ALGOL
Example
As an example, consider this possible BNF for a postal address:

<postal-address> ::= <name-part> <street-address> <zip-part>
<name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL>
              | <personal-part> <name-part>
<personal-part> ::= <initial> "." | <first-name>
<street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>
<zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>
<opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
<opt-apt-num> ::= <apt-num> | ""
This translates into English as:
 A postal address consists of a name-part, followed by a street-address part, followed by a zip-code part.
 A name-part consists of either a personal-part followed by a last name, an optional suffix, and an end-of-line, or a personal-part followed by a name-part (this second alternative illustrates the use of recursion in BNF, covering the case of people who use multiple first and middle names and initials).
 A personal-part consists of either a first name or an initial followed by a dot.
 A street address consists of a house number, followed by a street name, followed by an optional apartment specifier, followed by an end-of-line.
 A zip-part consists of a town-name, followed by a comma, followed by a state code, followed by a ZIP-code, followed by an end-of-line.
 An opt-suffix-part consists of a suffix, such as "Sr.", "Jr.", or a roman numeral, or an empty string (i.e., nothing).
 An opt-apt-num consists of an apartment number or an empty string (i.e., nothing).
Note that many things (such as the format of a first-name, apartment specifier, ZIP-
code, and Roman numeral) are left unspecified here. If necessary, they may be
described using additional BNF rules.
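To make the grammar concrete, here is a small Python sketch (our own illustration; the sample terminal values are invented, and the recursive name-part alternative and the optional parts are omitted for brevity) that generates syntactically valid postal addresses by random derivation:

import random

# Each nonterminal maps to a list of alternatives; each alternative is a
# sequence of nonterminals (keys of GRAMMAR) and terminal strings.
GRAMMAR = {
    "<postal-address>": [["<name-part>", "<street-address>", "<zip-part>"]],
    "<name-part>":      [["<first-name>", " ", "<last-name>", "\n"]],
    "<street-address>": [["<house-num>", " ", "<street-name>", "\n"]],
    "<zip-part>":       [["<town-name>", ", ", "<state-code>", " ", "<ZIP-code>", "\n"]],
    "<first-name>":     [["Ada"], ["Grace"]],
    "<last-name>":      [["Lovelace"], ["Hopper"]],
    "<house-num>":      [["12"], ["221"]],
    "<street-name>":    [["Main St."]],
    "<town-name>":      [["Springfield"]],
    "<state-code>":     [["NY"], ["CA"]],
    "<ZIP-code>":       [["10001"]],
}

def derive(symbol):
    # Nonterminals are rewritten recursively; terminals are returned as-is.
    if symbol not in GRAMMAR:
        return symbol
    return "".join(derive(s) for s in random.choice(GRAMMAR[symbol]))

print(derive("<postal-address>"))

Each run prints one valid address; a covering set of such good strings is the starting point for the error-seeding strategy described below.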
Why BNF?
• Using a BNF specification is an easy way to design format-validation test cases.
• It is also an easy way for designers to organize their work.
• You should not begin to design tests until you are able to distinguish incorrect
data from correct data.
Test Case Generation
• There are three possible kinds of incorrect actions:
 Recognizer does not recognize a good string.
 Recognizer accepts a bad string.
 Recognizer crashes during attempt to recognize a string.
• Even small BNF specifications lead to many good strings and far more bad
strings.
• There is neither time nor need to test all strings.
Test Case Design Strategy
• Create one error at a time, while keeping all other components of the input string correct.
• Once a complete set of tests has been specified for single errors, do the same for double errors, then triple errors, and so on.
• Focus on one level at a time and keep the levels above and below it as correct as you can (a sketch of the single-error step appears below).
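A minimal sketch of the single-error step in Python, assuming a reference recognizer for the <integer> grammar above (an invented example, not from the text):

import re

INTEGER = re.compile(r"^[+-]?[0-9]+$")   # reference recognizer
GOOD = "+1234"                           # one covering good string
BAD_CHARS = "x. "                        # characters illegal in an integer

def single_error_cases(good):
    # Mutate one position at a time, keeping the rest of the string correct.
    cases = set()
    for i in range(len(good)):
        for c in BAD_CHARS:
            cases.add(good[:i] + c + good[i + 1:])   # replace one character
        cases.add(good[:i] + good[i + 1:])           # delete one character
    # Some single mutations are still legal (e.g., deleting the optional
    # "+"), so filter against the reference recognizer.
    return sorted(s for s in cases if not INTEGER.match(s))

for case in single_error_cases(GOOD):
    print(repr(case))

Double-error cases can then be produced by applying the same mutation step to each single-error string.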
Sources of Syntax
• Designer-Tester Cooperation
• Manuals
• Help Screens
• Design Documents
• Prototypes
• Programmer Interviews
• Experimental (hacking)
Dangers of Syntax Test Design
• It’s easy to forget the “normal” cases.
• Don’t go overboard with combinations:
 Syntax testing is easy compared to structural testing.
 Don’t ignore structural testing because you are thorough in syntax
testing.
 Knowing a program's design may help you eliminate cases without
sacrificing the thoroughness of the testing process.
Syntax Testing Drivers
• Build a driver program that automatically sequences through a set of test cases,
usually stored as data; a minimal sketch is shown below.
• Do not try to build the "garbage" strings automatically, because you will be
going down a diverging, infinite sequence of syntax tests.
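A minimal sketch of such a driver in Python; recognize stands in for the recognizer under test (an assumption here), and CASES would normally be loaded from a data file rather than hard-coded:

import re

def recognize(s):                    # stand-in recognizer for <integer>
    return bool(re.match(r"^[+-]?[0-9]+$", s))

CASES = [                            # (input string, expected verdict)
    ("123",  True),
    ("+123", True),
    ("12x3", False),
    ("",     False),
]

failures = 0
for text, expected in CASES:
    try:
        got = recognize(text)
    except Exception as exc:         # a crash always counts as a failure
        print(f"CRASH on {text!r}: {exc}")
        failures += 1
        continue
    if got != expected:
        print(f"FAIL on {text!r}: expected {expected}, got {got}")
        failures += 1
print(f"{len(CASES)} cases, {failures} failures")

Note that the driver checks for all three kinds of incorrect action listed earlier: rejecting a good string, accepting a bad string, and crashing.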
Design Automation:
Primitive Method
• Use a word processor to specify a covering set of correct input strings.
• Using search/replace, replace correct substrings with incorrect ones.
• Using the syntax definition graph as a guide, generate all single-error cases.
• Do the same for double errors, then triple errors.
Random String Generators
• Easy to do, but useless.
• Random strings get recognized as invalid too soon.
• The probability of hitting vulnerable points is too low because there are
simply too many “garbage” strings.
Productivity, Training, Effectiveness
• Syntax testing is a great confidence builder for people who have never
designed tests.
• A testing trainee can easily produce 20-30 test cases per hour after a bit of
training.
• Syntax testing is an excellent way of convincing a novice tester that:
 Testing is often an infinite process.
 A tester's problem is knowing which tests to ignore.
Ad-lib Testing
• Ad-lib testing is futile and doesn’t prove anything.
• Most of the ad-lib tests will be input strings with format violations.
• Ad-lib testers will try good strings that they think are bad ones!
• If ad-lib tests are able to prove something, then the system is so buggy that it
deserves to be thrown out!
Logic-based testing
Logic-based testing is structural testing when it's applied to structure (e.g.,
control flow graph of an implementation); it is functional testing when it is applied
to a specification. In logic-based testing we focus on the truth values of control
flow predicates. Logic-based testers design tests from logical expressions that
appear in software artifacts such as source code, design models, and requirements
specifications.
Programmers and logic:
 "Logic" is one of the most often used words in programmer’s vocabularies but
one of their least used techniques.
 Boolean algebra is being the simplest form of logic.
 Boolean algebra is to logic as arithmetic is to mathematics. Without it, the
tester or programmer is cut off from many test and design techniques and
tools that incorporate those techniques.
Hardware Logic Testing:
 Logic has been the primary tool of hardware logic designers.
 Many test methods developed for hardware logic can be adapted to software
logic testing, because hardware testing automation is 10 to 15 years ahead of
software testing automation.
 Hardware testing methods and their associated theory are therefore a rich
source of techniques for software testing.
Specification Systems and Languages:
 The trouble with specifications is that they're hard to express.
 Boolean algebra (also known as the sentential calculus) is the most basic of
all logic systems.
 Higher-order logic systems are needed and used for formal specifications.
 Much of logical analysis can be and is embedded in tools. But these tools
incorporate methods to simplify, transform, and check specifications, and
the methods are to a large extent based on Boolean algebra.
KNOWLEDGE BASED SYSTEM:
 The knowledge-based system (also expert system, or "artificial
intelligence" system) has become the programming construct of choice for
many applications that were once considered very difficult.
 Knowledge-based systems incorporate knowledge from a knowledge domain
such as medicine, law, or civil engineering into a database. The data can
then be queried and interacted with to provide solutions to problems in that
domain.
 One implementation of knowledge-based systems is to incorporate the
expert's knowledge into a set of rules. The user can then provide data and
ask questions based on that data.
 The user's data is processed through the rule base to yield conclusions
(tentative or definite) and requests for more data. The processing is done by
a program called the inference engine.
 Understanding knowledge-based systems and their validation problems
requires an understanding of formal logic.
Decision Table Testing:
 Decision table testing is a software testing technique used to test system
behavior for different input combinations.
 This is a systematic approach where the different input combinations and their
corresponding system behavior (Output) are captured in a tabular form.
 That is why it is also called a cause-effect table: causes and effects
are captured for better test coverage.
 A Decision Table is a tabular representation of inputs versus rules/cases/test
conditions. Let's learn with an example.
 A decision table consists of four areas called:
1. Condition stub
2. Condition entry
3. Action stub
4. Action entry
 Each column of the table is a rule that specifies the conditions under which
the actions named in the action stub will take place.
 The condition stub is a list of names of conditions.
 A rule specifies whether a condition should or should not be met for the rule
to be satisfied. "YES" means that the condition must be met, "NO" means that
the condition must not be met, and "I" means that the condition plays no part
in the rule, or it is immaterial to that rule.
 The action stub names the actions the routine will take or initiate if the rule is
satisfied.
 If the action entry is "YES", the action will take place; if "NO", the action
will not take place.
Example 1: How to Make a Decision Table for a Login Screen
• Let's create a decision table for a login screen.
• The condition is simple: if the user provides the correct username and password,
the user is redirected to the homepage. If either input is wrong, an error
message is displayed.
                          Rule 1    Rule 2    Rule 3    Rule 4
Username                  INVALID   VALID     INVALID   VALID
Password                  INVALID   INVALID   VALID     VALID
Action 1 (Error Message)  YES       YES       YES       NO
Action 2 (Home Screen)    NO        NO        NO        YES
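The table maps directly onto a data-driven test. A Python sketch (our own illustration; login, the credentials, and the returned screen names are all hypothetical):

RULES = [
    # (username_valid, password_valid, expected_screen)
    (False, False, "ERROR"),   # Rule 1
    (True,  False, "ERROR"),   # Rule 2
    (False, True,  "ERROR"),   # Rule 3
    (True,  True,  "HOME"),    # Rule 4
]

VALID_USER, VALID_PASS = "alice", "s3cret"    # assumed known-good credentials

def run_decision_table(login):
    # Each rule of the decision table becomes one executed test case.
    for i, (user_ok, pass_ok, expected) in enumerate(RULES, start=1):
        user = VALID_USER if user_ok else "wrong_user"
        pwd = VALID_PASS if pass_ok else "wrong_pass"
        result = login(user, pwd)
        status = "PASS" if result == expected else "FAIL"
        print(f"Rule {i}: login({user!r}, {pwd!r}) -> {result} [{status}]")

def login(user, pwd):                 # toy implementation so the sketch runs
    return "HOME" if (user, pwd) == (VALID_USER, VALID_PASS) else "ERROR"

run_decision_table(login)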
Why is Decision Table Testing important?
• This testing technique becomes important when it is required to test different
combinations of inputs. It also helps achieve better test coverage for complex
business logic.
Advantages of Decision Table Testing
• When the system behavior differs for different inputs, rather than being the same
over a range of inputs, neither equivalence partitioning nor boundary value
analysis will help; a decision table can be used.
• The representation is simple, so it can be easily interpreted and used by
development and business alike.
• The table helps to construct effective input combinations and can ensure better
test coverage.
• Any complex business conditions can easily be turned into decision tables.
• When we are going for 100% coverage, typically when the number of input
combinations is low, this technique can ensure that coverage.
Disadvantages of Decision Table Testing
• The main disadvantage is that as the number of inputs increases, the table
becomes more complex: the number of condition combinations grows
exponentially with the number of conditions.
             RULE 1   RULE 2   RULE 3   RULE 4
CONDITION 1  YES      YES      NO       NO
CONDITION 2  YES      I        NO       I
CONDITION 3  NO       YES      NO       I
CONDITION 4  NO       YES      NO       YES
ACTION 1     YES      YES      NO       NO
ACTION 2     NO       NO       YES      NO
ACTION 3     NO       NO       NO       YES
Rule 2 in the above table contains one I entry and therefore expands into two equivalent
sub-rules. Rule 4 contains two I entries and therefore expands into four sub-rules.
Rule 2 is expanded by converting the I entry for condition 2 into a separate sub-rule
2.1 for YES and 2.2 for NO. Similarly, the I entries for conditions 2 and 3 in rule 4
expand to yield intermediate sub-rules 4.1, 4.2, 4.3, and 4.4. The expansion is
mechanical, as the sketch below shows.
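A Python sketch (our own illustration) that expands the I entries of a rule into explicit YES/NO sub-rules:

from itertools import product

def expand(rule):
    # rule is a list of condition entries, each "YES", "NO", or "I";
    # every I entry is replaced by both YES and NO in turn.
    immaterial = [i for i, entry in enumerate(rule) if entry == "I"]
    subrules = []
    for values in product(["YES", "NO"], repeat=len(immaterial)):
        sub = list(rule)
        for pos, val in zip(immaterial, values):
            sub[pos] = val
        subrules.append(sub)
    return subrules

# Rule 4 above (conditions 1-4 are NO, I, I, YES) -> four sub-rules
for sub in expand(["NO", "I", "I", "YES"]):
    print(sub)

Running this prints the four sub-rules 4.1 through 4.4; applying it to rule 2 (["YES", "I", "YES", "YES"]) prints the two sub-rules 2.1 and 2.2.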
PATH EXPRESSIONS:
 As noted earlier, logic-based testing is structural testing when it is applied to
structure (e.g., the control flow graph of an implementation) and functional
testing when it is applied to a specification.
 In logic-based testing we focus on the truth values of control flow predicates.
 A predicate is implemented as a process whose outcome is a truth-functional value.
 For our purposes, logic-based testing is restricted to binary predicates.
 We start by generating path expressions by path tracing, as in the earlier unit on
path products and path expressions, but this time our purpose is to convert the
path expressions into boolean algebra, using the predicates' truth values (e.g.,
A and its complement Ā) as weights.
BOOLEAN ALGEBRA:
STEPS:
1. Label each decision with an uppercase letter that represents the truth value of the predicate.
The YES or TRUE branch is labeled with a letter (say A) and the NO or FALSE branch with
the same letter overscored (say Ā).
2. The truth value of a path is the product of the individual labels. Concatenation or products
mean "AND". For example, the straight-through path of Figure 6.5, which goes via nodes 3,
6, 7, 8, 10, 11, 12, and 2, has a truth value of ABC. The path via nodes 3, 6, 7, 9, and 2 has a
value of ABC̄.
3. If two or more paths merge at a node, the fact is expressed by use of a plus sign (+) which
means "OR".
Using this convention, the truth-functional values for several of the nodes can be expressed in
terms of segments from previous nodes. Use the node name to identify the point.
N6 = A + ABC
N8 = (N6)B + AB = AB + ABCB + AB
N11 = (N8)C + (N6)BC
N12 = N11 + ABC
N2 = N12 + (N8)C + (N6)BC

(Several of the literals in these expressions are overscored, i.e., complemented, in the
original figure; the overscores distinguish otherwise identical-looking terms, such as the
two AB terms in N8.)
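To see how predicate truth values select paths, here is a small Python sketch (our own illustration, taking the two path values from step 2 above, ABC and ABC̄) that evaluates the path products over every assignment of A, B, and C:

from itertools import product

# Each path is a product (AND) of predicate literals; a complemented
# (overscored) literal is represented by negating the variable.
PATHS = {
    "3,6,7,8,10,11,12,2": lambda A, B, C: A and B and C,        # ABC
    "3,6,7,9,2":          lambda A, B, C: A and B and not C,    # ABC̄
}

for A, B, C in product([True, False], repeat=3):
    taken = [name for name, pred in PATHS.items() if pred(A, B, C)]
    print(f"A={A:d} B={B:d} C={C:d} -> {taken or ['some other path']}")

Each of these path products is true for exactly one of the eight assignments; the remaining six assignments drive control down other paths of the figure.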
KV CHARTS (KARNAUGH-VEITCH CHARTS):
ONE VARIABLE:
[Figure: the KV charts for a single variable A]
 The charts show all possible truth values that the variable A can have.
 A "1" means the variable’s value is "1" or TRUE. A "0" means that the variable's value is
0 or FALSE.
 The entry in the box (0 or 1) specifies whether the function that the chart represents is
true or false for that value of the variable.
 We usually do not explicitly put in the 0 entries but specify only the conditions
under which the function is true.
TWO VARIABLES:
[Figure: the KV charts for two variables]
Each box corresponds to the combination of values of the variables for the row and
column of that box.
 A pair may be adjacent either horizontally or vertically but not diagonally.
 Any variable that changes in either the horizontal or vertical direction does not appear in
the expression.
 In the fifth chart, the B variable changes from 0 to 1 going down the column, and because
the A variable's value for the column is 1, the chart is equivalent to a simple A.
THREE VARIABLES:
 KV charts for three variables are shown below.
 As before, each box represents an elementary term of three variables with a bar appearing
or not appearing according to whether the row-column heading for that box is 0 or 1.
 A three-variable chart can have groupings of 1, 2, 4, and 8 boxes.
 A few examples will illustrate the principles:
[Figure: example three-variable KV charts and their groupings]
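A KV chart is just a truth table folded so that horizontally adjacent cells differ in exactly one variable (Gray-code column order). A small Python sketch (our own illustration) that prints the three-variable chart for any boolean function:

def kv_chart(fn):
    # Columns are BC in Gray-code order, so adjacent cells differ in one bit.
    gray_bc = ["00", "01", "11", "10"]
    print("A\\BC   " + "    ".join(gray_bc))
    for a in (0, 1):
        cells = [str(fn(a, int(b), int(c))) for b, c in gray_bc]
        print(f"  {a}      " + "     ".join(cells))

# Example: f = A AND (B OR NOT C)
kv_chart(lambda A, B, C: int(A and (B or not C)))

The printed 1-cells can then be grouped, exactly as in the charts above, into rectangles of 1, 2, 4, or 8 boxes to read off a simplified expression.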