Metamorphic Testing:

Addressing the oracle problem

Fabrizio Pastore – [email protected]

5 December 2023
§ Faults are subtle
§ Discovered when testing with many inputs (e.g., after deployment in the field)
§ Manually specifying expected results for all of them is impractical
§ Automatically deriving expected results is infeasible
Metamorphic Testing alleviates the Oracle Problem

• Invented by T.Y. Chen in 1998
• Metamorphic Testing (MT) assumes that it is simpler to reason about relations between outputs of multiple test executions than to specify the output of the system for a given input
• MT is a property-based testing approach
• In MT, system properties are captured as metamorphic relations (MRs) that
  • specify how to automatically transform an initial set of test inputs (source inputs) into follow-up test inputs
  • specify the relation between the outputs obtained from source and follow-up inputs
• A failure is observed when such relations are violated

Example MRs:
x2 = (π - x1) ⇒ sin(x1) = sin(x2)
function under test: shortPath [Figure: graph G with nodes a, b, c, d, e, f]
x1 = (G,a,f) ∧ x2 = (G,f,a) ⇒ len(shortPath(x1)) = len(shortPath(x2))
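The two example relations can be turned into executable checks. The following is a minimal Python sketch, not taken from the slides: the breadth-first `short_path` implementation, the edges of graph `G`, and the tolerance are assumptions made only for illustration.

```python
import math
import random
from collections import deque

def check_sin_mr(x1, tol=1e-9):
    """MR: x2 = pi - x1  =>  sin(x1) = sin(x2), up to floating-point tolerance."""
    x2 = math.pi - x1  # follow-up input derived from the source input
    return math.isclose(math.sin(x1), math.sin(x2), abs_tol=tol)

def short_path(graph, start, goal):
    """Stand-in function under test: BFS shortest path in an unweighted graph."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def check_short_path_mr(graph, a, f):
    """MR: x1 = (G, a, f), x2 = (G, f, a)  =>  both paths have the same length."""
    return len(short_path(graph, a, f)) == len(short_path(graph, f, a))

# Hypothetical undirected graph with nodes a..f; the edges are made up.
G = {
    "a": ["b", "c"], "b": ["a", "d", "e"], "c": ["a", "d"],
    "d": ["b", "c", "f"], "e": ["b", "f"], "f": ["d", "e"],
}

random.seed(0)
sin_ok = all(check_sin_mr(random.uniform(-10, 10)) for _ in range(1000))
path_ok = check_short_path_mr(G, "a", "f")
print(sin_ok, path_ok)
```

Neither check needs to know the expected value of `sin(x1)` or the correct shortest path; only the relation between the two executions is verified.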
Application Domains

Picture from:

§ S. Segura, D. Towey, Z. Q. Zhou and T. Y. Chen, "Metamorphic Testing: Testing the Untestable," in IEEE Software, vol. 37, no. 3, pp. 46-53, May-June 2020, doi: 10.1109/MS.2018.2875968.

Other relevant surveys:

§ S. Segura, G. Fraser, A. B. Sanchez and A. Ruiz-Cortés, "A Survey on Metamorphic Testing," in IEEE Transactions on Software Engineering, vol. 42, no. 9, pp. 805-824, 1 Sept. 2016, doi: 10.1109/TSE.2016.2532875.

§ Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, T. H. Tse, and Zhi Quan Zhou. 2018. Metamorphic Testing: A Review of Challenges and Opportunities. ACM Comput. Surv. 51, 1, Article 4 (January 2019), 27 pages. https://doi.org/10.1145/3143561

Examples: Testing Web engines

MR: with an additional filtering criterion, the number of returned results should be lower than or equal to the number returned by the original query

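A minimal Python sketch of this relation follows. The `search` function, its `max_price` filter, and the catalog are hypothetical stand-ins for a real search engine, used only to show the shape of the check.

```python
def search(items, keyword, max_price=None):
    """Toy search engine (hypothetical): keyword match plus an optional price filter."""
    hits = [it for it in items if keyword in it["title"].lower()]
    if max_price is not None:
        hits = [it for it in hits if it["price"] <= max_price]
    return hits

def check_filter_mr(items, keyword, max_price):
    """MR: adding a filtering criterion must not increase the number of results."""
    source = search(items, keyword)                # source input: query only
    follow_up = search(items, keyword, max_price)  # follow-up input: query plus filter
    return len(follow_up) <= len(source)

# Made-up catalog used as test data.
CATALOG = [
    {"title": "Red bicycle", "price": 120},
    {"title": "Bicycle helmet", "price": 35},
    {"title": "Road map", "price": 9},
]

print(check_filter_mr(CATALOG, "bicycle", 50))
```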
Examples: Testing Web engines (2)

Z. Q. Zhou, L. Sun, T. Y. Chen and D. Towey, "Metamorphic Relations for Enhancing System Understanding and Use," in IEEE Transactions on Software Engineering,
vol. 46, no. 10, pp. 1120-1154, 1 Oct. 2020, doi: 10.1109/TSE.2018.2876433.
Examples: Testing Web APIs

S. Segura, J. A. Parejo, J. Troya and A. Ruiz-Cortés, "Metamorphic Testing of RESTful Web APIs," in IEEE Transactions
on Software Engineering, vol. 44, no. 11, pp. 1083-1099, 1 Nov. 2018, doi: 10.1109/TSE.2017.2764464.
Examples: Testing Deep Neural Networks

MR: The steering angle predicted by the DNN should remain the same even in the presence of fog

https://towardsdatascience.com/metamorphic-testing-of-machine-learning-based-systems-e1fe13baf048

Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. 2018. DeepTest: automated testing of deep-neural-network-driven autonomous cars. In Proceedings of the 40th International Conference on Software Engineering (ICSE '18). Association for Computing Machinery, New York, NY, USA, 303–314. https://doi.org/10.1145/3180155.3180220
Examples: Elevators

MR: If we increase the number of elevators (simulated in the test environment), the average wait time should decrease

J. Ayerdi, S. Segura, A. Arrieta, G. Sagardui and M. Arratibel, "QoS-aware Metamorphic Testing: An Elevation Case Study," 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE), Coimbra, Portugal, 2020, pp. 104-114, doi: 10.1109/ISSRE5003.2020.00019.
Examples: Compilers

Chen, T. Y., Kuo, F. C., Ma, W., Susilo, W., Towey, D., Voas, J., & Zhou, Z. Q. (2016). Metamorphic Testing for
Cybersecurity. Computer, 49(6), 48–55. https://doi.org/10.1109/MC.2016.176
Deriving Metamorphic Relations
§ Input-driven approach
  § Think of changes to the program's inputs that should produce expected changes in the outputs.
  § The possible changes to the input parameters depend on their data type.
  § Example for lists: adding an element to the list; removing an element from the list; splitting the list; reordering the list; and so on.
  § numerical and graph-theory programs
§ Output-driven approach
  § Start from relations among outputs typically found in the target domain, then think about what kind of changes to the program's inputs would satisfy the expected relation among outputs.
  § Example output relations: one result set being a subset of another result set; two result sets containing the same items; two disjoint result sets (sets with no common elements).

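The list changes named for the input-driven approach (adding, removing, splitting, reordering) each suggest one relation. Below is a minimal Python sketch, assuming a simple summation function as the program under test; the function `total` and the sample data are illustrative choices, not from the slides.

```python
import random

def total(values):
    """Stand-in program under test: sum of a list of integers."""
    return sum(values)

def mr_reorder(xs):
    """Reordering the list must not change the output."""
    ys = xs[:]
    random.shuffle(ys)
    return total(ys) == total(xs)

def mr_add(xs, extra):
    """Adding an element must increase the output by that element."""
    return total(xs + [extra]) == total(xs) + extra

def mr_remove(xs):
    """Removing the first element must decrease the output by that element."""
    return total(xs[1:]) == total(xs) - xs[0]

def mr_split(xs):
    """Splitting the list: the outputs of the two halves must add up to the whole."""
    mid = len(xs) // 2
    return total(xs[:mid]) + total(xs[mid:]) == total(xs)

random.seed(1)
source_input = [random.randint(-100, 100) for _ in range(20)]
results = [
    mr_reorder(source_input),
    mr_add(source_input, 7),
    mr_remove(source_input),
    mr_split(source_input),
]
print(results)
```

Integers are used so that the equalities hold exactly; with floating-point inputs the comparisons would need a tolerance.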
Inference of Metamorphic Relations

§ State-of-the-art: search-based approach (AutoMR)


§ Zhang, B., Zhang, H., Chen, J., Hao, D., & Moscato, P. (2019). Automatic Discovery and Cleansing of Numerical Metamorphic Relations. 2019 IEEE International
Conference on Software Maintenance and Evolution (ICSME), 235–245. https://doi.org/10.1109/ICSME.2019.00035

§ Other approaches:
§ Genetic programming (industrial application)
§ Zhang, B., Zhang, H., Chen, J., Hao, D., & Moscato, P. (2019). Automatic Discovery and Cleansing of Numerical Metamorphic Relations. 2019 IEEE International Conference on
Software Maintenance and Evolution (ICSME), 235–245. https://doi.org/10.1109/ICSME.2019.00035

§ Symbolic regression
§ Hong, J., Zhang, J., Qiu, Q., Ma, A., Li, M., Yan, S., & Gong, H. (2022). A Dynamic Recognition Method of Metamorphic Relation Identification. 13th International Conference on
Reliability, Maintainability, and Safety: Reliability and Safety of Intelligent Systems, ICRMS 2022, 81–86. https://doi.org/10.1109/ICRMS55680.2022.9944595

SnT contribution:
Metamorphic Testing for Web System Security
Framework for Metamorphic Testing automation

Work partially done with University of Ottawa

Nazanin Bayati (University of Ottawa)
Fabrizio Pastore (University of Luxembourg)
Arda Goknil (SINTEF Digital, Norway)
Lionel Briand (University of Ottawa; University of Luxembourg)

13 September 2023 Nazanin Bayati - Metamorphic Testing for Web System Security
Metamorphic Security Testing

• Source input: a sequence of valid interactions with the system
  {login(Admin), RequestURL(settings_page)}
• Follow-up input: generated by altering valid interactions as an attacker would do
  {login(User1), RequestURL(settings_page)}
• Relations: capture properties that hold when the system is not vulnerable
  If the user in the follow-up input cannot access the URL from her GUI, then the outputs of the source and follow-up inputs should be different

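The relation can be sketched in Python against a toy model of a Web system. Everything below (the `GUI_LINKS` map, `request_url`, and the `enforce_access_control` switch that seeds a vulnerability) is hypothetical and only illustrates the logic of the check; it is not the MST-wi implementation.

```python
# Hypothetical model: URLs each user can reach by clicking through the GUI.
GUI_LINKS = {
    "Admin": {"/home", "/settings_page"},
    "User1": {"/home"},
}

def request_url(user, url, enforce_access_control=True):
    """Toy server: returns the output a logged-in user obtains for a URL."""
    if enforce_access_control and url not in GUI_LINKS[user]:
        return "403 Forbidden"
    return f"content of {url}"

def relation_holds(source_user, follow_up_user, url, enforce_access_control=True):
    """If the follow-up user cannot reach the URL from the GUI,
    the source and follow-up outputs must differ."""
    if url in GUI_LINKS[follow_up_user]:
        return True  # precondition not met, nothing to check
    source_out = request_url(source_user, url, enforce_access_control)
    follow_up_out = request_url(follow_up_user, url, enforce_access_control)
    return source_out != follow_up_out

secure = relation_holds("Admin", "User1", "/settings_page", enforce_access_control=True)
vulnerable = relation_holds("Admin", "User1", "/settings_page", enforce_access_control=False)
print(secure, vulnerable)
```

With access control enforced the relation holds; with the seeded vulnerability both users receive the same page, so the relation is violated and a failure is reported.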
MST-wi: Metamorphic Security Testing for Web Interfaces

1. Select or specify the metamorphic relations (from the catalog of 76 metamorphic relations); output: list of metamorphic relations
2. Translate the metamorphic relations to Java; output: executable metamorphic relations in Java
3. Execute the data collection framework against the Web system (e.g., log in, submit form, logout); output: source inputs
4. Execute the metamorphic testing framework; output: test results

MST-wi – MR Example

• Security issue: Bypass Authorization Schema

Our metamorphic testing algorithm executes each MR multiple times, to ensure that every possible combination of source and follow-up inputs is exercised

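A minimal Python sketch of such a loop is shown below. It is an illustration under assumptions, not the MST-wi algorithm: the toy system, the `/admin/agents` URL, and the empty `PROTECTED` set (a seeded vulnerability) are all hypothetical.

```python
from itertools import product

# Hypothetical toy system: URLs reachable from each user's GUI.
GUI_LINKS = {"admin": {"/home", "/admin/agents"}, "user": {"/home"}}
# Seeded vulnerability: the server protects no URL at all.
PROTECTED = set()

def request_url(user, url):
    """Toy server response for a logged-in user."""
    if url in PROTECTED and url not in GUI_LINKS[user]:
        return "403 Forbidden"
    return f"content of {url}"

def relation_holds(source_user, follow_up_user, url):
    """Outputs must differ when the follow-up user cannot reach the URL from the GUI."""
    if url in GUI_LINKS[follow_up_user]:
        return True
    return request_url(source_user, url) != request_url(follow_up_user, url)

def run_mr(users, urls):
    """Exercise the MR on every combination of source input and follow-up user."""
    failures = []
    for source_user, url in product(users, urls):
        if url not in GUI_LINKS[source_user]:
            continue  # not a valid source input for this user
        for follow_up_user in users:
            if follow_up_user == source_user:
                continue
            if not relation_holds(source_user, follow_up_user, url):
                failures.append((source_user, follow_up_user, url))
    return failures

failures = run_mr(["admin", "user"], ["/home", "/admin/agents"])
print(failures)
```

Each reported tuple identifies one source/follow-up pair for which the relation was violated.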
D4.4 Prototype of a toolset for specification-based functional security testing of CPS

M18 EC review

Demo: Web systems


Project meeting

2023, June 13

MST Demo Objective: detect a real vulnerability

• Only admins should be able to launch/relaunch agent slaves

• But users can do it



Demo: ROS

ROS vulnerability https://github.com/aliasrobotics/RVD/issues/88

When running a test scenario with two interacting actors, executing a third unauthorized actor interacting with one of the two should not alter the output.

• Source input: a test scenario with interacting components
• Follow-up input: a test scenario with additional remote calls to one component from one unauthorized component
• Relations: the output should be the same

[Sequence diagram with actors Publisher1, Master, Subscriber1, Attacker1; messages in order:]
advertise(“position”)
subscribe(“position”)
publisherUpdate([“Publisher1-URI”])
tcpConnect(“position”)
data(“position1”)
print(“position1”)
data(“position2”)
print(“position2”)
publisherUpdate([])
X data(“position3”)
print(“position3”)
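The relation can be illustrated with a toy simulation in Python. This is a sketch under assumptions: the `Subscriber` class is a hypothetical model, not the ROS API, and it assumes that the empty `publisherUpdate([])` call in the diagram is the remote call issued by the unauthorized actor.

```python
class Subscriber:
    """Toy subscriber node (hypothetical model, not the real ROS API)."""

    def __init__(self, check_caller):
        self.check_caller = check_caller  # True models a node that validates callers
        self.publishers = []
        self.printed = []

    def publisher_update(self, caller, uris):
        # A node that validates callers accepts updates only from the Master.
        if self.check_caller and caller != "Master":
            return
        self.publishers = list(uris)

    def data(self, publisher_uri, message):
        # Messages are printed only while the publisher is still registered.
        if publisher_uri in self.publishers:
            self.printed.append(message)

def run_scenario(check_caller, with_attacker):
    """Source input: with_attacker=False. Follow-up input: with_attacker=True."""
    sub = Subscriber(check_caller)
    sub.publisher_update("Master", ["Publisher1-URI"])
    sub.data("Publisher1-URI", "position1")
    sub.data("Publisher1-URI", "position2")
    if with_attacker:
        sub.publisher_update("Attacker1", [])  # additional unauthorized remote call
    sub.data("Publisher1-URI", "position3")
    return sub.printed

def relation_holds(check_caller):
    """MR: the output must be the same with and without the unauthorized actor."""
    return run_scenario(check_caller, False) == run_scenario(check_caller, True)

print(relation_holds(check_caller=True), relation_holds(check_caller=False))
```

In the unvalidated variant the third message is never printed in the follow-up scenario, so the two outputs differ and the relation is violated.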

MR for ROS vulnerability pattern

[Figure: MR listing annotated with the labels: Source input; Follow-up input; 2nd Follow-up input; User-provided catalog file; Source input; 1st Follow-up input; Delay]

Empirical Results
MST-wi – Research Questions

• RQ1. What testing activities can be automated thanks to oracle automation provided by MST-wi?

• RQ2. What vulnerability types can MST-wi detect?

• RQ3. What testability guidelines can we define to enable effective test automation with MST-wi?

• RQ4. How does MST-wi compare to state-of-the-art SAST and DAST tools?

• RQ5. Can we identify patterns for writing MST-wi relations?

• RQ6. Is MST-wi effective?

• RQ7. Is MST-wi efficient?

MST-wi – What vulnerability types can MST-wi detect?

• We investigated the feasibility of implementing MRs that discover the vulnerability types described in the
MITRE Common Weakness Enumeration (CWE) database

• Considered three subsets:


• CWE view for common security architectural tactics
• CWE Top 25 most dangerous software errors
• OWASP Top 10 Web security risks

• To implement an MR, for each weakness, we first inspect its description, its demonstrative examples, the
description of concrete vulnerabilities (CVE) and common attack patterns (CAPEC) associated with the
weakness.

• This process led to a catalog of 76 MRs.


MST-wi – What vulnerability types can MST-wi detect?

Summary of the CWE architectural security design principles and weaknesses addressed by MST-wi.

Security Design Principle   Vulnerability types   Addressed by MST-wi   Rank
Audit                                         6   1 (16%)               10th
Authenticate Actors                          28   12 (43%)              4th
Authorize Actors                             60   34 (57%)              3rd
Cross Cutting                                 9   3 (33%)               6th
Encrypt Data                                 38   8 (21%)               8th
Identify Actors                              12   3 (25%)               7th
Limit Access                                  8   3 (38%)               5th
Limit Exposure                                6   0 (0%)                11th
Lock Computer                                 1   0 (0%)                11th
Manage User Session                           6   4 (67%)               2nd
Validate Inputs                              39   31 (79%)              1st
Verify Message Integrity                     10   2 (20%)               9th
Total                                       223   101 (45%)

MST-wi – How does MST-wi compare to state-of-the-art SAST and
DAST tools?
• We compared the vulnerability types detected by MST-wi with the vulnerability types detected by state-of-the-art SAST and DAST tools reported in a recent empirical study

MST-wi – How does MST-wi compare to state-of-the-art SAST and
DAST tools?

Column groups: (A) weaknesses addressed by each approach; (B) weaknesses addressed by MST but not addressed by the tool; (C) weaknesses not addressed by MST but addressed by the tool.

                          |           (A)            |         (B)         |         (C)
Security Design Principle | MST  Zap  DA2 Sonar  SA2 | Zap  DA2 Sonar  SA2 | Zap  DA2 Sonar  SA2
Audit                     |   1    0    0     0    3 |   1    1     1    0 |   0    0     0    2
Authenticate Actors       |  12    0    2     1    9 |  12   11    11    7 |   0    1     0    4
Authorize Actors          |  34    2    0     1   13 |  32   34    34   25 |   0    0     1    4
Cross Cutting             |   3    0    0     2    0 |   3    3     2    3 |   0    0     1    0
Encrypt Data              |   8    2    5     8   10 |   8    8     7    4 |   2    5     7    6
Identify Actors           |   3    1    1     1    7 |   3    3     3    1 |   1    1     1    5
Limit Access              |   3    0    1     1    5 |   3    3     2    0 |   0    1     0    2
Limit Exposure            |   0    1    0     0    1 |   0    0     0    0 |   1    0     0    1
Lock Computer             |   0    0    0     0    0 |   0    0     0    0 |   0    0     0    0
Manage User Session       |   4    0    0     0    2 |   4    4     4    2 |   0    0     0    0
Validate Inputs           |  31   10    7     2   14 |  24   25    30   19 |   3    1     1    2
Verify Message Integrity  |   2    1    0     0    3 |   2    2     2    1 |   1    0     0    2
Total                     | 101   17   16    16   67 |  92   94    96   62 |   8    9    11   28

The set of weaknesses targeted by MST-wi is larger than what can be targeted by applying all four competing approaches together.

Slide annotation: 84
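The three column groups of the comparison table are tied together by simple set arithmetic: for each tool, the weaknesses it shares with MST (MST total minus "addressed by MST but not by the tool") plus the weaknesses only the tool addresses must equal the tool's own count. The short Python check below applies this to the Total row; reading the headers as these three set sizes is an interpretation of the slide, and the function names are illustrative.

```python
# Total row of the table: MST addresses 101 weaknesses.
MST_TOTAL = 101

# Per tool: (addressed by tool, addressed by MST but not tool, addressed by tool but not MST)
TOTALS = {
    "Zap": (17, 92, 8),
    "DA2": (16, 94, 9),
    "Sonar": (16, 96, 11),
    "SA2": (67, 62, 28),
}

def shared_with_mst(mst_total, mst_not_tool):
    """Weaknesses addressed by both MST and the tool."""
    return mst_total - mst_not_tool

def row_is_consistent(mst_total, addressed, mst_not_tool, tool_not_mst):
    """|tool| must equal |MST and tool| + |tool but not MST|."""
    return shared_with_mst(mst_total, mst_not_tool) + tool_not_mst == addressed

report = {tool: row_is_consistent(MST_TOTAL, *cols) for tool, cols in TOTALS.items()}
print(report)
```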
MST-wi – Is MST-wi effective?

Applied MST-wi to test well-known Web systems:
• Jenkins v. 2.121
• Joomla v. 3.8.7
Assessed MST-wi capability to detect known vulnerabilities:
• 11 for Jenkins, 3 for Joomla.
• One of them discovered by MST-wi (CVE-2018-17857)
Considered two setups:
• Derive source inputs with crawler only
• Consider additional manually implemented functional test cases
Metrics:
• Sensitivity: proportion of vulnerabilities identified
• Specificity: proportion of inputs not leading to false alarms
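The two metrics can be written as simple ratios. The Python sketch below follows the definitions given in the list above; the function names and the numbers in the usage lines are hypothetical and are not results from the study.

```python
def sensitivity(vulnerabilities_detected, vulnerabilities_total):
    """Proportion of known vulnerabilities identified by the MRs."""
    return vulnerabilities_detected / vulnerabilities_total

def specificity(inputs_without_false_alarm, inputs_total):
    """Proportion of follow-up inputs that do not lead to a false alarm."""
    return inputs_without_false_alarm / inputs_total

# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(7, 10)
example_specificity = specificity(990, 1000)
print(example_sensitivity, example_specificity)
```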
MST-wi – Is MST-wi effective?

• The high specificity indicates that only a negligible fraction of follow-up inputs leads to false alarms
• Since sensitivity reflects the fault detection rate (i.e., the proportion of vulnerabilities discovered),
we conclude that our approach is highly effective
• We can discover more than 60% of vulnerabilities in a completely automated manner, using only the crawler
• And up to 85% using both crawler and manual inputs
https://github.com/MetamorphicSecurityTesting/MST
