A General Approach to Modeling Java Framework Behaviors

Linghui Luo
[email protected]
Paderborn University
Germany

ABSTRACT
Interprocedural static analysis tools such as security analyses need good call graphs, which are challenging to scale for framework-based applications. So most tools model frameworks rather than analyze them. These models are manually crafted to capture the framework semantics crucial for the particular analysis, and are inherently incomplete. We propose a general approach to modeling Java frameworks. It is not limited to any framework or analysis tool and is therefore highly reusable. While a generic approximation can be noisy, we show that our carefully constructed one does well. Experiments on Android with a client taint analysis show that our approach produces more complete call graphs than the original analysis. As a result, the client analysis works better: both precision (from 0.83 to 0.86) and recall (from 0.20 to 0.31) are improved.

CCS CONCEPTS
• Theory of computation → Program analysis.

KEYWORDS
call graph, static analysis, framework modeling, taint analysis, Java

ACM Reference Format:
Linghui Luo. 2021. A General Approach to Modeling Java Framework Behaviors. In Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’21), August 23–28, 2021, Athens, Greece. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3468264.3473489

1 INTRODUCTION
Call graphs are a key prerequisite of interprocedural static analyses. To be scalable, most static analysis tools choose to construct application-only call graphs and carefully model the behavior of frameworks [2, 3, 13, 16]. FlowDroid, for instance, models the behavior of the Android framework by crafting a dummy main method which simulates the lifecycle of Android components [3]. However, such carefully crafted models often produce incomplete call graphs, and are impractical to build for every framework [4, 14]. We therefore build a model that over-approximates framework behavior, abstracting away framework details while retaining enough detail for analysis. In particular, we developed Averroes-GenCG, an improvement of Averroes [1] that generates a placeholder library for a given Android/Java framework. This generated library can be used as a replacement of the original framework by popular call graph construction algorithms [7, 15] and further client analyses. The framework behavior is modeled in the placeholder library code and will be reflected in the constructed call graphs.

Figure 1: Simulating Android behavior. (Diagram labels: Library.main(), Library.doItAll(), MainActivity(), PushHandler(), TaskService(), MainActivity.onStart(), MainActivity.onPause(), TaskService.onCreate(), TaskService.onStartCommand(), PushHandler.handleMessage().)

2 BACKGROUND AND RELATED WORK
FlowDroid is a prominent static Android taint analysis tool; it performs context-, field-, and flow-sensitive data-flow analysis and delivers precise results. In comparison to other tools, FlowDroid achieves good results in evaluations on micro benchmarks such as DroidBench [10, 11]. We recently evaluated Android taint analysis tools on TaintBench [8, 9], which consists of 39 real-world malware apps and 203 documented malicious taint flows. Although FlowDroid produced the best result on TaintBench, it has especially low recall (0.2). Specifically, 35% (70/203) of the malicious taint flows in TaintBench could not be detected because relevant methods were missing in the call graphs. Clearly, we need to construct better call graphs.

Our work was inspired by Averroes [1]. Averroes generates a placeholder library to overapproximate the behavior of an original Java library. Averroes relies on the separate compilation assumption: the library can be compiled separately without the client application. Based on that, Averroes generates a Library.doItAll() method which implements behaviors such as object creation, invocation of library callbacks, etc.
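To make the idea of a placeholder library concrete, the following is a minimal, hand-written Java sketch of the shape described here: a Library class with a single libraryPointsTo field and a straight-line doItAll() method that creates an object and invokes a callback on it. It is an illustration under stated assumptions, not output of Averroes. The names PlaceholderSketch, Callback, and AppHandler are hypothetical, and the real tool emits a placeholder JAR rather than source code like this.

```java
import java.util.ArrayList;
import java.util.List;

public class PlaceholderSketch {

    /** Hypothetical callback type that the original library would declare. */
    interface Callback {
        void onEvent(Object data);
    }

    /** Hypothetical application class; it records that its callback was reached. */
    static class AppHandler implements Callback {
        static final List<String> reached = new ArrayList<>();

        @Override
        public void onEvent(Object data) {
            reached.add("AppHandler.onEvent");
        }
    }

    /** Stand-in for the generated placeholder library (assumed shape). */
    static class Library {
        // A single untyped field summarizes every object the library may reference.
        static Object libraryPointsTo;

        // Straight-line code: object creation and a callback, with no branching.
        static void doItAll() {
            libraryPointsTo = new AppHandler();                     // object creation
            ((Callback) libraryPointsTo).onEvent(libraryPointsTo);  // callback invocation
        }
    }

    public static void main(String[] args) {
        Library.doItAll();
        System.out.println(AppHandler.reached);
    }
}
```

Because every value flows through the one Object-typed field, an analysis that sees this code cannot tell apart the different objects the library holds, which is the coarseness discussed in Section 3.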

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ESEC/FSE ’21, August 23–28, 2021, Athens, Greece
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8562-6/21/08. . . $15.00
https://doi.org/10.1145/3468264.3473489

3 APPROACH
The separate compilation assumption works for Android and other Java web frameworks. However, we could not directly use the placeholder.jar generated by Averroes for Android or web apps, since these apps do not have any main entry point. One could take the Library.doItAll() method as an entry point. However, it is less useful for detecting issues that require flow-sensitivity, since Library.doItAll() contains no control flow at all and callbacks are unordered. Furthermore, Averroes uses a single libraryPointsTo field to represent all objects that the library may reference, which is too coarse for a field-sensitive taint analysis: once libraryPointsTo is tainted, it propagates everywhere, potentially resulting in many false positives. To produce call graphs suitable for flow- and field-sensitive client analyses, we adapt Averroes as follows:
● We introduce typed libraryPointsTo fields for every type T that the library could point to.
● We move object creation from Library.doItAll() to a separate Library.main() method to avoid unnecessary strong updates [5]. This simulates the framework as shown in Figure 1 for Android.
● We introduce control flow into Library.doItAll() to model other side effects of the library.
We refer to our adaptation as Averroes-GenCG. Its generated placeholder.jar can be used by client analyses together with the application to construct call graphs with popular analysis frameworks [6, 12] as shown in Figure 2. Note that Library.<clinit>() is the static constructor of the Library class.

Figure 2: Overview of how to utilize our approach for a client analysis. (Diagram labels: Application (.apk, .jar); Library (.jar); Averroes-GenCG; Placeholder.jar; Construct Call Graph with Library.main() and Library.<clinit>() as entry points; Perform Client Analysis.)

4 EVALUATION AND RESULTS

Table 1: Evaluation on TaintBench.

                              FlowDroid    FlowDroid𝐺𝑒𝑛
TP                            40           64
FP                            8            10
FN (incomplete call graph)    70           19
FN (other reasons)            93           120
Precision p=TP/(TP+FP)        0.83         0.86
Recall r=TP/(TP+FN)           0.20         0.31
F-measure f=2pr/(p+r)         0.32         0.46

We evaluate our approach with FlowDroid𝐺𝑒𝑛, which is FlowDroid using our call graphs. We compare it to FlowDroid on TaintBench. Both tools were configured with the sources and sinks that are used by the documented taint flows in each TaintBench app. The evaluation result is shown in Table 1. We explicitly distinguish false negatives (FN) caused by incomplete call graphs. A false negative can be caused by multiple factors; here we mean that an incomplete call graph is one of them. As we can see, 51 (row 4: 70 - 19) more malicious flows are captured in the call graphs of our approach in comparison to FlowDroid. For these 51 flows, our approach enabled FlowDroid’s taint analysis to analyze all methods that are on their data-flow paths. Consequently, FlowDroid𝐺𝑒𝑛 detected 24 (row 2: 64 - 40) more true positives (TP) with just 2 more false positives (FP) (row 3: 10 - 8). Thus, our call graphs improved both precision (from 0.83 to 0.86) and recall (from 0.20 to 0.31) of the client taint analysis.

5 CONCLUSION
Our approach is not limited to any framework or any specific analysis tool. Experiments on Android show that our approach enables detection of more real-world issues without introducing much noise. As a next step, we will evaluate our approach on Java web frameworks.

ACKNOWLEDGMENTS
This research was supported by the research training group Human Centered Systems Security (NERD.NRW) sponsored by the state of North Rhine-Westphalia in Germany.

REFERENCES
[1] Karim Ali and Ondrej Lhoták. 2013. Averroes: Whole-Program Analysis without the Whole Program. In ECOOP 2013 - Object-Oriented Programming - 27th European Conference, Montpellier, France, July 1-5, 2013. Proceedings (Lecture Notes in Computer Science, Vol. 7920), Giuseppe Castagna (Ed.). Springer, 378–400. https://doi.org/10.1007/978-3-642-39038-8_16
[2] Anastasios Antoniadis, Nikos Filippakis, Paddy Krishnan, Raghavendra Ramesh, Nicholas Allen, and Yannis Smaragdakis. 2020. Static analysis of Java enterprise applications: frameworks and caches, the elephants in the room. In Proceedings of the 41st ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, Alastair F. Donaldson and Emina Torlak (Eds.). ACM, 794–807. https://doi.org/10.1145/3385412.3386026
[3] Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick D. McDaniel. 2014. FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, Michael F. P. O’Boyle and Keshav Pingali (Eds.). ACM, 259–269. https://doi.org/10.1145/2594291.2594299
[4] Sam Blackshear, Alexandra Gendreau, and Bor-Yuh Evan Chang. 2015. Droidel: a general approach to Android framework modeling. In Proceedings of the 4th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis, SOAP@PLDI 2015, Portland, OR, USA, June 15 - 17, 2015, Anders Møller and Mayur Naik (Eds.). ACM, 19–25. https://doi.org/10.1145/2771284.2771288
[5] Arnab De and Deepak D’Souza. 2012. Scalable Flow-Sensitive Pointer Analysis for Java with Strong Updates. In ECOOP 2012 - Object-Oriented Programming - 26th European Conference, Beijing, China, June 11-16, 2012. Proceedings (Lecture Notes in Computer Science, Vol. 7313), James Noble (Ed.). Springer, 665–687. https://doi.org/10.1007/978-3-642-31057-7_29
[6] Raja Vallee-Rai et al. 1997. Soot. https://github.com/soot-oss/soot. Accessed: 2021-05-05.
[7] Ondrej Lhoták and Laurie J. Hendren. 2003. Scaling Java Points-to Analysis Using SPARK. In Compiler Construction, 12th International Conference, CC 2003, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2003, Warsaw, Poland, April 7-11, 2003, Proceedings (Lecture Notes in Computer Science, Vol. 2622), Görel Hedin (Ed.). Springer, 153–169. https://doi.org/10.1007/3-540-36579-6_12
[8] Linghui Luo, Felix Pauck, Goran Piskachev, Manuel Benz, Ivan Pashchenko, Martin Mory, Eric Bodden, Ben Hermann, and Fabio Massacci. 2021. TaintBench. https://taintbench.github.io. Accessed: 2021-05-05.
[9] Linghui Luo, Felix Pauck, Goran Piskachev, Manuel Benz, Ivan Pashchenko, Martin Mory, Eric Bodden, Ben Hermann, and Fabio Massacci. 2021. TaintBench: Automatic Real-World Malware Benchmarking of Android Taint Analyses. Empirical Software Engineering (2021). to appear.
[10] Felix Pauck, Eric Bodden, and Heike Wehrheim. 2018. Do Android taint analysis tools keep their promises?. In ESEC/SIGSOFT FSE 2018, Lake Buena Vista, FL, USA, November 04-09, 2018, Gary T. Leavens, Alessandro Garcia, and Corina S. Pasareanu (Eds.). ACM, 331–341. https://doi.org/10.1145/3236024.3236029
[11] Lina Qiu, Yingying Wang, and Julia Rubin. 2018. Analyzing the analyzers: FlowDroid/IccTA, AmanDroid, and DroidSafe. In Proceedings of the 27th ISSTA. ACM. https://doi.org/10.1145/3213846.3213873
[12] IBM Research. 2006. WALA. https://github.com/wala/WALA. Accessed: 2021-05-05.
[13] Manu Sridharan, Shay Artzi, Marco Pistoia, Salvatore Guarnieri, Omer Tripp, and Ryan Berg. 2011. F4F: taint analysis of framework-based web applications. In Proceedings of the 26th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2011, part of SPLASH 2011, Portland, OR, USA, October 22 - 27, 2011, Cristina Videira Lopes and Kathleen Fisher (Eds.). ACM, 1053–1068. https://doi.org/10.1145/2048066.2048145
[14] Li Sui, Jens Dietrich, Amjed Tahir, and George Fourtounis. 2020. On the recall of static call graph construction in practice. In ICSE ’20, Seoul, South Korea, 27 June - 19 July, 2020, Gregg Rothermel and Doo-Hwan Bae (Eds.). ACM, 1049–1060. https://doi.org/10.1145/3377811.3380441
[15] Vijay Sundaresan, Laurie J. Hendren, Chrislain Razafimahefa, Raja Vallée-Rai, Patrick Lam, Etienne Gagnon, and Charles Godin. 2000. Practical virtual method call resolution for Java. In Proceedings of the 2000 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages & Applications (OOPSLA 2000), Minneapolis, Minnesota, USA, October 15-19, 2000, Mary Beth Rosson and Doug Lea (Eds.). ACM, 264–280. https://doi.org/10.1145/353171.353189
[16] Fengguo Wei, Sankardas Roy, Xinming Ou, and Robby. 2014. Amandroid: A Precise and General Inter-component Data Flow Analysis Framework for Security Vetting of Android Apps. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AZ, USA, November 3-7, 2014, Gail-Joon Ahn, Moti Yung, and Ninghui Li (Eds.). ACM, 1329–1341. https://doi.org/10.1145/2660267.2660357
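The three adaptations listed in Section 3 can be pictured with a small hand-written Java sketch. It is only an illustration under assumptions: the component classes are hypothetical stand-ins that borrow names from Figure 1, the field names are invented, and the text does not specify the exact control flow, so the loop and switch driven by an opaque random choice are one possible reading. That Library.main() calls Library.doItAll() is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class GenCGSketch {
    static final List<String> trace = new ArrayList<>();

    // Hypothetical stand-ins for application components (names borrowed from Figure 1).
    static class MainActivity {
        void onStart() { trace.add("MainActivity.onStart"); }
        void onPause() { trace.add("MainActivity.onPause"); }
    }

    static class TaskService {
        void onCreate() { trace.add("TaskService.onCreate"); }
        void onStartCommand() { trace.add("TaskService.onStartCommand"); }
    }

    /** Stand-in for the adapted placeholder library (assumed shape). */
    static class Library {
        // Adaptation 1: one typed field per type T (field names are assumptions).
        static MainActivity libraryPointsToMainActivity;
        static TaskService libraryPointsToTaskService;

        // Adaptation 2: object creation happens here, not in doItAll().
        static void main(Random opaque) {
            libraryPointsToMainActivity = new MainActivity();
            libraryPointsToTaskService = new TaskService();
            doItAll(opaque);
        }

        // Adaptation 3: doItAll() contains control flow. The opaque choice lets
        // callbacks run in any order; the loop is bounded only so that this
        // sketch terminates when executed.
        static void doItAll(Random opaque) {
            for (int i = 0; i < 8 && opaque.nextBoolean(); i++) {
                switch (opaque.nextInt(4)) {
                    case 0: libraryPointsToMainActivity.onStart(); break;
                    case 1: libraryPointsToMainActivity.onPause(); break;
                    case 2: libraryPointsToTaskService.onCreate(); break;
                    default: libraryPointsToTaskService.onStartCommand(); break;
                }
            }
        }
    }

    public static void main(String[] args) {
        Library.main(new Random());
        System.out.println(trace);
    }
}
```

With typed fields, a tainted MainActivity no longer shares a field with a TaskService, which is the separation a field-sensitive client analysis needs to keep unrelated objects apart.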
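The derived rows of Table 1 follow from its count rows using the formulas given in the row labels. The Java sketch below recomputes them. It assumes that the total FN for each tool is the sum of the two FN rows, which is consistent with the 203 documented flows (40 + 70 + 93 = 203 and 64 + 19 + 120 = 203). The class and method names are hypothetical.

```java
import java.util.Locale;

public class TaintBenchMetrics {

    /** Precision p = TP / (TP + FP). */
    static double precision(int tp, int fp) {
        return (double) tp / (tp + fp);
    }

    /** Recall r = TP / (TP + FN). */
    static double recall(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    /** F-measure f = 2pr / (p + r). */
    static double fMeasure(double p, double r) {
        return 2 * p * r / (p + r);
    }

    /** Formats the three metrics for one tool. */
    static String report(String tool, int tp, int fp, int fn) {
        double p = precision(tp, fp);
        double r = recall(tp, fn);
        return String.format(Locale.ROOT, "%s: p=%.3f r=%.3f f=%.3f", tool, p, r, fMeasure(p, r));
    }

    public static void main(String[] args) {
        // Counts from Table 1; FN is assumed to be the sum of both FN rows.
        System.out.println(report("FlowDroid", 40, 8, 70 + 93));
        System.out.println(report("FlowDroidGen", 64, 10, 19 + 120));
    }
}
```

Note that the recall of FlowDroid𝐺𝑒𝑛 works out to 64/203 ≈ 0.315, which the table reports as 0.31.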
