Papers by Yannis Smaragdakis
Shooting from the heap: ultra-scalable static analysis with heap snapshots
Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis

MadMax
Communications of the ACM
Ethereum is a distributed blockchain platform, serving as an ecosystem for smart contracts: full-fledged intercommunicating programs that capture the transaction logic of an account. A gas limit caps the execution of an Ethereum smart contract: instructions, when executed, consume gas, and the execution proceeds as long as gas is available. Gas-focused vulnerabilities permit an attacker to force key contract functionality to run out of gas, effectively performing a permanent denial-of-service attack on the contract. Such vulnerabilities are among the hardest for programmers to protect against, as out-of-gas behavior may be uncommon in nonattack scenarios and reasoning about these vulnerabilities is nontrivial. In this paper, we identify gas-focused vulnerabilities and present MadMax: a static program analysis technique that automatically detects gas-focused vulnerabilities with very high confidence. MadMax combines a smart contract decompiler and semantic queries in Datalog. Our ap...
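As a rough illustration of the gas mechanism this abstract describes, the Python sketch below models a contract loop whose cost grows with attacker-controlled data. The names `pay_all`, `OutOfGas`, and `gas_per_step`, and the flat cost of one gas unit per iteration, are assumptions made for the example. They are not taken from MadMax or from the EVM specification.

```python
class OutOfGas(Exception):
    """Raised when the toy execution exhausts its gas budget."""


def pay_all(recipients, gas_limit, gas_per_step=1):
    """Toy model of a contract loop in which every iteration consumes gas.

    Hypothetical illustration only: real EVM gas costs differ per instruction.
    """
    gas = gas_limit
    paid = []
    for recipient in recipients:
        if gas < gas_per_step:
            # Execution stops as soon as the remaining gas cannot cover a step.
            raise OutOfGas(f"ran out of gas after paying {len(paid)} recipients")
        gas -= gas_per_step
        paid.append(recipient)
    return paid
```

With three recipients and a limit of 5 the call completes. Once an attacker grows the list past the limit, every call raises `OutOfGas`, which is the permanent denial-of-service shape the abstract refers to.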

Proceedings of the 2019 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software
The dream of programming language design is to bring about orders-of-magnitude productivity improvements in software development tasks. Designers can endlessly debate on how this dream can be realized and on how close we are to its realization. Instead, I would like to focus on a question with an answer that can be, surprisingly, clearer: what will be the common principles behind next-paradigm, high-productivity programming languages, and how will they change everyday program development? Based on my decade-plus experience of heavy-duty development in declarative languages, I speculate that certain tenets of high-productivity languages are inevitable. These include, for instance, enormous variations in performance (including automatic transformations that change the asymptotic complexity of algorithms); a radical change in a programmer's workflow, elevating testing from a near-menial task to an act of deep understanding; a change in the need for formal proofs; and more. [W]e've passed the point of diminishing returns. No future language will give us the factor of 10 advantage that assembler gave us over binary. No future language will give us 50%, or 20%, or even 10% reduction in workload.
Artifact: Static Analysis of Enterprise Applications: Frameworks and Caches, the Elephants in the Room
Artifact Digital Object Group

Proceedings of the ACM on Programming Languages
Static analyses aspire to explore all possible executions in order to achieve soundness. Yet, in practice, they fail to capture common dynamic behavior. Enhancing static analyses with dynamic information is a common pattern, with tools such as Tamiflex. Past approaches, however, miss significant portions of dynamic behavior, due to native code, unsupported features (e.g., invokedynamic or lambdas in Java), and more. We present techniques that substantially counteract the unsoundness of a static analysis, with virtually no intrusion to the analysis logic. Our approach is reified in the HeapDL toolchain and consists in taking whole-heap snapshots during program execution, that are further enriched to capture significant aspects of dynamic behavior, regardless of the causes of such behavior. The snapshots are then used as extra inputs to the static analysis. The approach exhibits both portability and significantly increased coverage. Heap information under one set of dynamic inputs all...

Lecture Notes in Computer Science, 2016
We present a points-to analysis for C/C++ that recovers much of the available high-level structure information of types and objects, by applying two key techniques: (1) It records the type of each abstract object and, in cases when the type is not readily available, the analysis uses an allocation-site plus type abstraction to create multiple abstract objects per allocation site, so that each one is associated with a single type. (2) It creates separate abstract objects that represent (a) the fields of objects of either struct or class type, and (b) the (statically present) constant indices of arrays, resulting in a limited form of array-sensitivity. We apply our approach to the full LLVM bitcode intermediate language and show that it yields much higher precision than past analyses, allowing accurate distinctions between subobjects, v-table entries, array components, and more. Especially for C++ programs, this precision is invaluable for a realistic analysis. Compared to the state-of-the-art past approach, our techniques exhibit substantially better precision along multiple metrics and realistic benchmarks (e.g., 40+% more variables with a single points-to target).
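Technique (1) in this abstract can be pictured with a small Python sketch that contrasts a plain allocation-site abstraction with an allocation-site plus type abstraction. The input format (a list of `(site, type)` pairs standing in for the types at which an untyped allocation is later used) and both function names are hypothetical. The analysis itself works on LLVM bitcode, which this toy does not model.

```python
def site_abstraction(typed_uses):
    """One abstract object per allocation site, regardless of type."""
    return {site for site, _type in typed_uses}


def site_plus_type_abstraction(typed_uses):
    """One abstract object per (allocation site, type) pair.

    Each resulting abstract object is associated with exactly one type.
    """
    return {(site, type_name) for site, type_name in typed_uses}


# Hypothetical facts: a malloc wrapper whose result is used at two types.
typed_uses = [
    ("malloc@wrapper.c:10", "struct A"),
    ("malloc@wrapper.c:10", "struct B"),
    ("malloc@main.c:42", "struct A"),
]
```

On these facts the site-only abstraction yields two abstract objects, while the site plus type abstraction yields three, splitting the wrapper's allocation into one object per type.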
A Personal Outlook on Generator Research (A Position Paper)
Dagstuhl Seminars, 2003
Hybrid Context Sensitivity for Points-To Analysis
High-level data structures: technical perspective
Communications of the ACM, Dec 1, 2012

Multiparadigm programming with OO languages
Lecture Notes in Computer Science, 2002
While OO has become ubiquitous for design, implementation, and even conceptualization, many practitioners recognize the need for other programming paradigms, according to problem domain. We seek answers to the question of how to address the need for other programming paradigms in the general context of OO languages. Can OO programming languages effectively support other programming paradigms? The tentative answer seems to be affirmative, at least for some paradigms; for example, significant progress has been made for the case of functional programming in C++. Additionally, several efforts have been made to integrate support for other paradigms as a front-end for OO languages (the Pizza language, extending Java, is a prominent example). This workshop seeks to bring together practitioners and researchers in this developing field to "compare notes" on their work, that is, to describe techniques, idioms, methodologies, language extensions, software, or supporting theoretical work for expressing non-OO paradigms in OO languages. Work-in-progress descriptions are welcome, as are experience papers if they present a lesson to be learned.

Refactoring Java generics by inferring wildcards, in practice
Proceedings of the 2014 ACM International Conference, Oct 15, 2014
Wildcard annotations can improve the generality of Java generic libraries, but require heavy manual effort. We present an algorithm for refactoring and inferring more general type instantiations of Java generics using wildcards. Compared to past approaches, our work is practical and immediately applicable: we assume no changes to the Java type system, while taking into account all its intricacies. Our system allows users to select declarations (variables, method parameters, return types, etc.) to generalize and considers declarations not declared in available source code. It then performs an inter-procedural flow analysis and a method body analysis, in order to generalize type signatures. We evaluate our technique on six Java generic libraries. We find that 34% of available declarations of variant type signatures can be generalized, i.e., relaxed with more general wildcard types. On average, 146 other declarations need to be updated when a declaration is generalized, showing that this refactoring would be too tedious and error-prone to perform manually.

Introspective analysis (context-sensitivity, across the board)
Proceedings of the 35th ACM SIGPLAN Conference, Jun 9, 2014
Context-sensitivity is the primary approach for adding more precision to a points-to analysis, while hopefully also maintaining scalability. An oft-reported problem with context-sensitive analyses, however, is that they are bi-modal: either the analysis is precise enough that it manipulates only manageable sets of data, and thus scales impressively well, or the analysis gets quickly derailed at the first sign of imprecision and becomes orders-of-magnitude more expensive than would be expected given the program's size. There is currently no approach that makes precise context-sensitive analyses (of any flavor: call-site-, object-, or type-sensitive) scale across the board at a level comparable to that of a context-insensitive analysis. To address this issue, we propose introspective analysis: a technique for uniformly scaling context-sensitive analysis by eliminating its performance-detrimental behavior, at a small precision expense. Introspective analysis consists of a common adaptivity pattern: first perform a context-insensitive analysis, then use the results to selectively refine (i.e., analyze context-sensitively) program elements that will not cause explosion in the running time or space. The technical challenge is to appropriately identify such program elements. We show that a simple but principled approach can be remarkably effective, achieving scalability (often with dramatic speedup) for benchmarks previously completely out-of-reach for deep context-sensitive analyses.
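The adaptivity pattern in this abstract (a cheap first pass whose results decide what to refine) can be sketched in Python as follows. The cost proxy used here, the total points-to set size per method, and the `threshold` parameter are illustrative assumptions. The abstract does not state which heuristics the paper uses, and `select_for_refinement` is a hypothetical name.

```python
def select_for_refinement(insensitive_points_to, threshold):
    """Split methods into those to refine and those to leave context-insensitive.

    `insensitive_points_to` maps method name -> {variable: set of abstract
    objects} and is assumed to come from a prior context-insensitive pass.
    The cost proxy below is an assumption, not the paper's heuristic.
    """
    refine, keep_insensitive = set(), set()
    for method, var_sets in insensitive_points_to.items():
        # Large points-to sets suggest that refinement could be expensive.
        cost = sum(len(objs) for objs in var_sets.values())
        if cost <= threshold:
            refine.add(method)
        else:
            keep_insensitive.add(method)
    return refine, keep_insensitive


# Hypothetical first-pass results for two methods.
first_pass = {
    "Parser.next": {"tok": {"o1"}, "buf": {"o2", "o3"}},
    "Util.log": {"msg": {f"o{i}" for i in range(50)}},
}
refine, keep = select_for_refinement(first_pass, threshold=10)
```

Here `Parser.next` is selected for context-sensitive treatment, while `Util.log`, whose first-pass results are already large, stays context-insensitive.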

Forsaking inheritance (supercharged delegation in DelphJ)
Proceedings of the 2013 ACM SIGPLAN International Conference, Oct 29, 2013
We propose DelphJ: a Java-based OO language that eschews inheritance completely, in favor of a combination of class morphing and (deep) delegation. Compared to past delegation approaches, the novel aspect of our design is the ability to emulate the best aspects of inheritance while retaining maximum flexibility: using morphing, a class can select any of the methods of its delegatee and export them (if desired) or transform them (e.g., to add extra arguments or modify type signatures), yet without needing to name these methods explicitly and handle them one-by-one. Compared to past work on morphing, our approach adopts and adapts advanced delegation mechanisms, in order to add late binding capabilities and, thus, provide a full substitute of inheritance. Additionally, we explore complex semantic issues in the interaction of delegation with late binding. We present our language design both informally, with numerous examples, and formally in a core calculus.
Morphing: Structurally shaping a class by reflecting on others
ACM Transactions on Programming Languages and Systems, 2011
We present MorphJ: a language for specifying general classes whose members are produced by iterating over members of other classes. We call this technique “class morphing” or just “morphing.” Morphing extends the notion of genericity so that not only types of methods and fields, but also the structure of a class can vary according to type variables. This adds a
One of the main challenges facing ubiquitous computing research and development is the difficulty of writing software for complex, heterogeneous distributed applications. In this paper, we evaluate automatic application partitioning as an approach to rapid prototyping of ubiquitous computing systems. Our approach allows developers to largely ignore distribution issues when developing their applications, by providing tools for generating distribution code automatically, under user guidance. We claim that automatic partitioning is promising for a large class of ubiquitous computing applications and discuss an example ubicomp application re-engineered using our approach.
Trace Reduction for LRU-Based Simulations
Optimal Trace Reduction for LRU-based Simulations
An Overview of the Oregon Programming Languages Summer School
ACM SIGPLAN Notices, 2009
The Oregon Programming Languages Summer School (OPLSS) has been held at the University of Oregon each summer since 2002. We believe it has been an unqualified success in providing students and instructors a unique opportunity to study relevant background and ...
Proceedings of the 2006 International Symposium on Software Testing and Analysis, 2006
DSD-Crasher is a bug finding tool that follows a three-step approach to program analysis: (D) Capture the program's intended execution behavior with dynamic invariant detection. The derived invariants exclude many unwanted values from the program's input domain. (S) Statically analyze the program within the restricted input domain to explore many paths. (D) Automatically generate test cases that focus on reproducing the predictions of the static analysis. Results confirmed in this way are feasible. This three-step approach yields benefits compared to past two-step combinations in the literature. In our evaluation with third-party applications, we demonstrate higher precision over tools that lack a dynamic step and higher efficiency over tools that lack a static step.
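The dynamic, static, dynamic sequence can be mimicked in a few lines of Python. This is a toy under strong assumptions: the "invariant" is only an observed integer range, the static step is replaced by a caller-supplied over-approximating predicate, and all function names are hypothetical. It does not reproduce the tools DSD-Crasher is built on.

```python
def infer_range(observed_inputs):
    """Step D: stand-in for dynamic invariant detection (observed min and max)."""
    return min(observed_inputs), max(observed_inputs)


def predict_crashes(lo, hi, may_crash):
    """Step S: stand-in for static analysis restricted to the inferred domain.

    `may_crash` is an over-approximating predicate supplied by the caller.
    """
    return [x for x in range(lo, hi + 1) if may_crash(x)]


def confirm(candidates, program):
    """Step D: run each predicted input and keep only the real failures."""
    confirmed = []
    for x in candidates:
        try:
            program(x)
        except Exception:
            confirmed.append(x)
    return confirmed


def program(x):
    # Hypothetical program under test: fails only for x == 3.
    return 100 // (x - 3)


lo, hi = infer_range([1, 2, 5])
warnings = predict_crashes(lo, hi, may_crash=lambda x: x in (3, 4))
confirmed = confirm(warnings, program)
```

The imprecise static stand-in warns about inputs 3 and 4, and the final dynamic step keeps only input 3, the one that actually fails when run.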