0% found this document useful (0 votes)
14 views217 pages

An LLVM Obfuscator For Binary Patch Generation

Abstract As part of this Master’s Thesis some patches to LLVM have been written allowing the application of obfuscation techniques to the LLVM IR. These patches allow both obfuscation and polymorphism which results in code that is both hard to read and different from previous versions. This, makes finding the real changes made between versions harder for the attacker

Uploaded by

fidokido86
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views217 pages

An LLVM Obfuscator For Binary Patch Generation

Abstract As part of this Master’s Thesis some patches to LLVM have been written allowing the application of obfuscation techniques to the LLVM IR. These patches allow both obfuscation and polymorphism which results in code that is both hard to read and different from previous versions. This, makes finding the real changes made between versions harder for the attacker

Uploaded by

fidokido86
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 217

LO!

LLVM Obfuscator
An LLVM obfuscator for binary patch generation
Master of Science Thesis

FRANCISCO BLAS IZQUIERDO RIERA

Chalmers University of Technology


University of Gothenburg
Department of Computer Science and Engineering
Göteborg, Sweden, January 2014
The Author grants to Chalmers University of Technology and University of Gothenburg
the non-exclusive right to publish the Work electronically and in a non-commercial pur-
pose make it accessible on the Internet.
The Author warrants that he/she is the author to the Work, and warrants that the Work
does not contain text, pictures or other material that violates copyright law.
The Author shall, when transferring the rights of the Work to a third party (for example
a publisher or a company), acknowledge the third party about this agreement. If the
Author has signed a copyright agreement with a third party regarding the Work, the
Author warrants hereby that he/she has obtained any necessary permission from this
third party to let Chalmers University of Technology and University of Gothenburg store
the Work electronically and make it accessible on the Internet.

LO! LLVM Obfuscator


An LLVM obfuscator for binary patch generation
FRANCISCO BLAS IZQUIERDO RIERA
©FRANCISCO BLAS IZQUIERDO RIERA, January 2014.
Examiner: ANDREI SABELFELD
Supervisor: JONAS MAGAZINIUS
Chalmers University of Technology
University of Gothenburg
Department of Computer Science and Engineering
SE-412 96 Göteborg
Sweden
Telephone + 46 (0)31-772 1000
Cover:
The dragon representing the LLVM project having a rusty part of him replaced by a
new blurry part.
The picture represents the idea of replacing broken parts of projects with new obfuscated
parts.
The original dragon image is owned by Apple Inc. All right, title and interest in the
image, including the copyright therein, is retained by Apple.
Department of Computer Science and Engineering
Göteborg, Sweden January 2014
Abstract
As part of this Master’s Thesis some patches to LLVM have been written allowing
the application of obfuscation techniques to the LLVM IR. These patches allow both
obfuscation and polymorphism which results in code that is both hard to read and
different from previous versions. This, makes finding the real changes made between
versions harder for the attacker.
The techniques are applied using a function attribute as the seed for the CPRNGs used
by the transformations as a source of entropy. As a result it is possible to mark the
functions that should be obfuscated in the prototypes allowing the developer to create
binaries with the desired amount of changes and a sufficiently large amount of functions
that are hard to read and (if the seed is changed) different from previous versions.
In this Master’s Thesis the possible ways in which the applied techniques can be “re-
versed” have been evaluated to be able to compare the resulting code. For this to succeed
a transformation able to obtain LLVM IR from the resulting binary code is necessary,
this was not done as part of this work.
Acknowledgements

Life is all about choices: you exist because at some point of time your parents made
a choice on having a child1 . You are probably reading this text because you have made
a choice on that it will be interesting and you likely are who you are because others have
influenced your life through their own choices along with yours.
Choices can be good or bad with a whole gamut of grays in the middle. But inde-
pendently of the result, the path that a choice makes you take is more important and
enriching than the choice itself.
Despite I’m the one presenting this work as my Master’s Thesis and thus closing a
chapter of my life, it wouldn’t have been possible for me to do so if some people hadn’t
chosen to support me in one or another way. This pages will never be enough to thank
all of them.
Even worse, though, was making the big mistake of not writing them as I did on my
Computer Engineer final project [11], to make up for this, this section will also cover the
acknowledgments that weren’t done in that publication.
First of all and typical as it may seem I’d like to thank my parents. If they hadn’t
chosen to have me this thesis nor the project would have existed. Transitively the thanks
should expand to their parents and those’s parents (and so until the first freewilled being
I suppose) for taking similar choices.
Next of course comes the rest of my family, they have chosen to support me in my
studies all along and without their help this wouldn’t have been possible.
Going back before I even started my project some people I should thank would be
Jon Ander Gómez who had arranged the ICPC local programming contests in the UPV
as they helped me develop the skills I needed later and Miguel Sánchez and Alberto
Conejero who have been amongst the best teachers I had. Also I should thank the ELP
group at the DSIC department for giving me a chance of tasting what research felt like
back then.
Focusing on my Computer Engineer project I should be thanking Pedro López who
pushed for me, Julio Sahuquillo, my examiner at the UPV, Per Stenström, my exam-
iner at Chalmers and Rubén González, my advisor and the biggest influence in the
project.
Finally focusing on this actual work I’d have to thank the PaX Team and Anthony
(blueness) G. Basile for their great input on the project, Jonas Magazinius for agreeing
to be my advisor (and putting up with me all this time), Andrei Sabelfeld for being
such a nice examiner and Grim Schjetne for being such a nice opponent despite so little
notice. Also thanks to all who attended the presentation and provided input which has
helped improve this document.
1
Or just on not using contraceptives that fateful night and then following with the pregnancy.
But specially, thanks to all of you who hasn’t been mentioned on this page, your
small contributions are what really made this possible.
Whilst doing this work many things have changed in my life, I have seen the Hack-
erspace at Göteborg where I’m writing these lines take off, I have started a relationship
with a girl, and have met some new friends. I don’t know what the future will bring
with it as it’s really hard to see it now, but I’m quite convinced on what the past has
brought thanks to all that people, as this future yet to come wouldn’t have been possible
without their incredible help.
Thus, to all those who have helped in one way or another to make this possible I can
only say ¡Muchísimas gracias desde lo más profundo de mi corazón!2
Francisco, Göteborg 14/3/2013

2
Thanks a lot from the bottom of my heart!
Contents

1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Delimitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Definitions 5
2.1 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1 Intermediate representation . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1.1 Basic block . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1.2 SSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1.3 PHI node . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.2 Frontend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Middle end . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.4 Backend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 LLVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Clang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 DragonEgg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.3 LLVM IR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.4 Evaluation pass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.5 Transformation pass . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.5.1 Optimization transformation . . . . . . . . . . . . . . . . 9
2.2.6 DAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.6.1 Legalization . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.6.2 Register allocation . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Code obfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3.1 Obfuscation transformation . . . . . . . . . . . . . . . . . . . . . . 10
2.3.1.1 Control flattening . . . . . . . . . . . . . . . . . . . . . . 10

vi
CONTENTS

2.3.1.2 Constant obfuscation . . . . . . . . . . . . . . . . . . . . 10


2.3.1.3 Opaque predicate . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Polymorphism transformation . . . . . . . . . . . . . . . . . . . . . 11
2.3.2.1 Register swap . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2.2 Dead code insertion . . . . . . . . . . . . . . . . . . . . . 11
2.3.2.3 Code reordering . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 PRNG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4.1 CPRNG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Encryption algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.1 Symmetric key encryption algorithm . . . . . . . . . . . . . . . . . 13
2.5.2 Block encryption algorithm . . . . . . . . . . . . . . . . . . . . . . 13
2.5.3 AES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.5.3.1 CTR mode of operation . . . . . . . . . . . . . . . . . . . 14
2.5.3.2 CMAC mode of operation . . . . . . . . . . . . . . . . . . 15

3 Methodology 16
3.1 Information gathering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.1.1 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2 Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

4 Algorithm Implementation 19
4.1 Control Flattening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.2 Constant obfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3 Register Swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.4 Code reordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.4.1 Instruction reordering . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.4.2 Basic block reordering . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.5 CPRNG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

5 Code design considerations 30


5.1 Coding conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.1.1 Transformation specific conventions . . . . . . . . . . . . . . . . . 30
5.1.2 Auxiliar library conventions . . . . . . . . . . . . . . . . . . . . . . 30
5.2 Design choices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.1 AES implementation used . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.2 LLVM transformations . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.2.1 Transformation parameters . . . . . . . . . . . . . . . . . 32
5.2.2.2 Transformation implementation . . . . . . . . . . . . . . 32

6 Transformation implementations 33
6.1 Obfuscation key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.1.1 addmodulekey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.1.2 propagatemodulekey . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.2 Obfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

vii
CONTENTS

6.2.1 flattencontrol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.2.2 obfuscateconstants . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
6.3 Polymorphic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3.1 bbsplit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3.2 randbb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3.3 randins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3.4 randfun . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3.5 randglb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.3.6 swapops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

7 Evaluation 52
7.1 Proof: the CPRNG is a good PRNG . . . . . . . . . . . . . . . . . . . . . 52
7.1.1 Determinism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
7.1.2 Uniformity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
7.1.3 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7.1.4 Function period . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
7.2 Proof: the CPRNG is secure . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.2.1 Next bit-test resistance . . . . . . . . . . . . . . . . . . . . . . . . 54
7.2.2 Impossibility of knowing the result if only the state is known . . . 54
7.2.3 State compromise extension resistance . . . . . . . . . . . . . . . . 54
7.3 Proof: the key derivation is secure . . . . . . . . . . . . . . . . . . . . . . 55
7.4 Reversing the transformations . . . . . . . . . . . . . . . . . . . . . . . . . 56
7.4.1 Defining a global ordering of values . . . . . . . . . . . . . . . . . . 56
7.4.2 Defining a global ordering of instructions . . . . . . . . . . . . . . 56
7.4.3 Reversing the randfun and randglb transformations . . . . . . . . . 56
7.4.4 Reversing the swapops transformation . . . . . . . . . . . . . . . . 57
7.4.5 Reversing the randins transformation . . . . . . . . . . . . . . . . 57
7.4.6 Reversing the obfuscateconstants transformation . . . . . . . . . . 57
7.4.7 Reversing the flattencontrol transformation . . . . . . . . . . . . . 57
7.4.8 Reversing the bbsplit transformation . . . . . . . . . . . . . . . . . 57
7.4.9 Reversing the randbb transformation . . . . . . . . . . . . . . . . . 57
7.5 Binary patch obfuscation technique evaluation . . . . . . . . . . . . . . . 59

8 Conclusions 60

9 Future development 61

Bibliography 62

A Using the tools 66

B Popularization 69
B.1 Programming computers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
B.1.1 Assembly . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

viii
CONTENTS

B.1.2 Programming languages . . . . . . . . . . . . . . . . . . . . . . . . 70


B.2 Compilers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
B.3 Antireverse engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
B.3.1 Obfuscation techniques . . . . . . . . . . . . . . . . . . . . . . . . 73
B.3.1.1 Control flattening . . . . . . . . . . . . . . . . . . . . . . 74
B.3.1.2 Constant obfuscation . . . . . . . . . . . . . . . . . . . . 75
B.3.1.3 Register swap . . . . . . . . . . . . . . . . . . . . . . . . 76
B.3.1.4 Instruction Reordering . . . . . . . . . . . . . . . . . . . 77
B.3.2 Focused obfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . 79
B.3.3 Compiler-level obfuscation . . . . . . . . . . . . . . . . . . . . . . . 79
B.4 Deobfuscation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.4.1 Control unflattening . . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.4.2 Constant deobfuscation . . . . . . . . . . . . . . . . . . . . . . . . 81
B.4.3 Register swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.4.4 Instruction Reordering . . . . . . . . . . . . . . . . . . . . . . . . . 81
B.5 Program updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
B.6 Practical example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

C Source code 84
C.1 Patches to LLVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
C.2 Patches to Clang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
C.3 Obf library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
C.4 AES library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

ix
1
Introduction

1.1 Background

A
s stated by [5], there is no silver bullet to prevent programmers from making
mistakes when coding applications. Some of these errors may not necessarily
cause more than a nuisance when hit by the users of the software, but some of
them may actually end up being vulnerabilities exploitable by a third party,
which can have undesirable effects for the software user ranging from unavailability of
software to more serious compromises like attackers gaining control of the system.
Generally, the likelihood of a software error increases with the size and complexity of
the software. Likewise, the probability of one of the errors being exploitable increases
with the amount of software errors. To make matters worse, the majority of current
devices use a reduced set of software programs, due to the size, complexity, and in-
compatibility of some softwares. For example, according to StatCounter more than half
of the opertaing systems (or OS) used to browse the web are Microsoft Windows NT
derivatives [24] (mostly Windows 7 and XP), while two-thirds of web browsing is done
using either Microsoft Internet Explorer, Google Chrome, or Mozilla Firefox [25]. As a
result of this lack of diversity, vulnerabilities can affect large amounts of devices and,
since most are connected to the Internet, attackers can remotely exploit these.
Usually, when a vulnerability is discovered or reported to the creator of the affected
software, he addresses the issue and creates a new version of that software in which the
problem is corrected.
Since distributing an entire copy of the new version of the program may require
many resources (For example, the Firefox 27.0.1 xul.dll file containing most of the GUI
runtime is 21.7 MB.), in most cases the developer instead releases a small file containing
the required updates and, occasionally, a short program to apply these changes to the

1
1.1 CHAPTER 1. INTRODUCTION

software. The files with the required changes are generally known as patches, and the
process of applying them is known as patching, although this word may also be used
to refer to the entire process of correcting a software issue. In addition to addressing
vulnerabilities, updating software in this manner is used to correct other types of software
errors (known as bugs) and to add new features that improve the software, though the
latter is quite rare.
When a developer deploys an upgrade, it necessarily takes some time before all of the
users actually apply the patch and secure their systems from attack, even if the update
process is done automatically by the software without user intervention. This period,
from the time the patch is made public to the time the last user applies it, represents
a window in which third parties can attack any non-upgraded software. Furthermore,
such parties could infer what vulnerability is present in the (un-patched) system, based
on the published patches.

2
1.2 CHAPTER 1. INTRODUCTION

1.2 Problem statement

A
s patches tend to be small to reduce the amount of resources required by the
updating process, it is relatively easy for the attacker to identify what is being
changed on the system. Thus, the attacker can discover the fault being fixed
and abuse it, if the user has not yet patched their system. This is usually
known as a 1-day vulnerability [22].

1.2.1 Goals

The goal of this thesis is to provide a set of tools which can help software developers
increase the amount of time an attacker requires to find and understand a particular
update in the patch. Two different methods will be combined to achieve this:
1. Applying obfuscation transformations to the code to render it more difficult for a third
party to understand.
2. Applying polymorphism transformations to increase the amount of differences between
the old and the new program and, thus, decrease the signal to noise ratio.
The project aims are to:
1. Provide a set of tools to allow the application of those transformations without mod-
ifying the original code during file compilation.
2. Adapt the polymorphism transformations and obfuscation transformations so they can
be applied to a compiler’s intermediate representation (or IR), particularly to LLVM
(whose IR is named LLVM IR) which uses a three way static single assignment form (or
SSA).
3. Create a proof of concept for integrating these tools into a real compiler, which would
then allow the developer to focus transformations only on the desired functions.
4. Evaluate the effectiveness of the transformations and, if possible, explain the steps
that could be taken to reverse them (or, at least, render them useless).

1.2.2 Delimitations

Every project inherently contains certain restrictions due to limited resources. In


this project, given the amount of time and other resources available for completion, the
following limits were placed upon the goals:
1. Not all of the possible transformations will be adapted or evaluated.
2. A proof of concept for the method in which an attacker could make the transformations
useless (if it is possible to do) will not be developed.

3
1.3 CHAPTER 1. INTRODUCTION

1.3 Thesis Structure

A
s seen above, this work starts with an introduction (chapter 1), containing
the project background (section 1.1), problem statement (section 1.2), project
goals (subsection 1.2.1), delimitations on those goals (subsection 1.2.2), and
thesis structure (section 1.3).
It continues with definitions (chapter 2), in which the various concepts and terms
used in this work are presented. This section also serves as an introduction for the less
experienced researcher.
From there, this paper proceeds onto explaining the different methodologies uti-
lized during the project (chapter 3). Including the process used to obtain information
(section 3.1), an overview of relevant research papers and other current work (subsec-
tion 3.1.1) , and, finally, the development techniques and conventions necessary for the
project (section 3.2).
In the next chapter (chapter 4) algorithms implemented in this project, along with
the cryptographic pseudorandom number generator (or CPRNG), and their conversion
into code, are explained.
Afterwards, the various decisions taken while designing the code are outlined (chap-
ter 5). In particular, style decisions (section 5.1), and design decisions (section 5.2).
Next, implementations of the different transformations are discussed in specifics
(chapter 6).
Following that, some proofs are provided and the developed techniques are studied
(chapter 7). A proof that the developed CPRNG has the properties desired for a pseu-
dorandom number generator (or PRNG) is provided (section 7.1) , along with proof
that CPRNG has the properties to be secure (section 7.2). Furthermore, proof that
the key generation system used is also safe is shown (section 7.3). Finally, an analysis
of how an attacker could reverse the developed transformations is given (section 7.4),
and an evaluation of the technique used for obfuscation of binary patches is shown (sec-
tion 7.5).
This paper ends with the project conclusions (chapter 8) and the expectations of
future development based on this work (chapter 9). A list of references can be found
afterwards.
In the annexes, instructions on how to use the developed tool (appendix A), an
explanation of the project aimed at the general public (appendix B), and project sources
and patches (appendix C) are found.

4
2
Definitions

2.1 Compiler

A
compiler is a tool able to perform translations from one language to another,
generally from a higher level language to a lower level one that is closer to the
language of the destination platform.
When creating a compiler two options can be chosen: making a direct translation
between each source and destination, or using a frontend to convert the language to
a common intermediate language and then a backend to convert from this common
language to the destination language.
The first alternative allows the programmer to exploit more of the expressibility of
the destination language and results in more efficient and smaller translations; however,
it requires a different compiler for each source and destination pair. The result is that the
number of different compilers required to cover a particular set of source and destination
languages is the product of the number of those languages.
The second alternative allows the programmer to maintain a common pipeline to
optimize the resulting intermediate language and may result in larger and slightly less
efficient translations; however, this alternative only requires as many frontends as source
languages and as many backends as destination languages.

2.1.1 Intermediate representation

This name is used to refer to a language or set of languages which the compiler will
translate the original code to before translating it into the desired final language. It is
also used to refer to any code written in any such language.

5
2.1 CHAPTER 2. DEFINITIONS

Generally, the generic optimization transformations are performed in the intermediate


representation, as they can then thus be applied to all of the destination languages.
IR is the abbreviation of intermediate representation and is used by LLVM to refer
to the language utilized in the optimization stages.

2.1.1.1 Basic block

A basic block, or BB, is a set of instructions with a single entry point and a single
exit point. As a result, jumps cannot point to the middle of a basic block, and basic
blocks cannot contain more than one jump instruction (which is always placed at the
end).

2.1.1.2 SSA

SSA is the acronym for static single assignment form. In general, it is used to refer to a
property of intermediate representations: that each variable is assigned only once.

2.1.1.3 PHI node

PHI Nodes are used by SSA languages to assign a variable with the appropriate value,
based on the basic block from which the node is reached. This allows SSA languages
to have some variables’ values assigned from other basic blocks when more than one
basic block jumps to a particular basic block. This is the case for loops and conditional
structures.

2.1.2 Frontend

The frontend is the part of the compiler transforming the original source language
into the first intermediate representation used by the compiler pipeline.
The responsibilities of this part of the compiler include checking the validity of the
source code, detecting and warning the user of any uncovered errors, performing any
source language specific optimizations, and extracting the metadata from the source
code for use at later stages.

2.1.3 Middle end

This is the part of the compiler that handles the transformations between interme-
diate representations either by transforming it into a different version using the same
intermediate representation or by generating code using a different intermediate repre-
sentations.

6
2.1 CHAPTER 2. DEFINITIONS

Optimizing the code returned by the frontend is the main responsibility of this part
of the compiler.

2.1.4 Backend

This is the final part of the compiler pipeline and transforms the last intermediate
representation into the desired destination language.
The main responsibilities of this part of the compiler are legalizing the code by ap-
plying transformations so that it only uses the instructions supported by the target
architecture, performing target specific optimizations, allocating the source architecture
registers to the different instructions, and emitting them.

7
2.2 CHAPTER 2. DEFINITIONS

2.2 LLVM

T
he LLVM project aims to provide a compiler framework with various native
frontends (C, C++, objective C, opencl, and Haskell, among others), a gcc in-
termediate representation code frontend (which allows support for Ada, D, and
Fortran), and backends for many CPU- and GPU-based platforms (including
ARM, x86, x86-64, MIPS, Nvidia’s PTX, and ATI’s R600).
LLVM uses the frontend to transform the source code into an SSA intermediate
representation known as IR, runs the desired optimization and transformation passes
over this code, and finally converts the final SSA to a DAG that is passed to the backend
for legalization, register allocation, and instruction emission.
A good explanation of the inner workings of the LLVM pipeline can be found at
[3].

2.2.1 Clang

Clang is a frontend for LLVM supporting C-type languages (C, C++, Objective C,
OpenCL, etc.). It is the frontend used by default when compiling those languages by
the LLVM compiler.

2.2.2 DragonEgg

DragonEgg is a frontend that allows the use of most of the gcc frontends with LLVM
and support of languages like ADA or Fortran.

2.2.3 LLVM IR

LLVM IR is the SSA language used internally by LLVM for intermediate repre-
sentations. This language can be represented by a set of memory structures during
compilation time, bitcode when stored on a disk or passed around pipes, and a human
readable assembly-like representation useful for developers.
Well formed code using this language must hold at least the properties stated by the
Verifier pass. A good description of the language can be found at [17], and the properties
held by well-formed language instances are explained in the Verifier.cpp file.

2.2.4 Evaluation pass

An evaluation pass parses the IR to generate some information about the code that
can be used by other passes.
Evaluation passes cannot modify the IR.

8
2.2 CHAPTER 2. DEFINITIONS

2.2.5 Transformation pass

A transformation pass parses the IR and generates a new IR (and, occasionally, in-
formation about the generated IR). In general, these passes must generate a functionally
equivalent IR in order to be considered correct. This is the type of optimization passes
used by LLVM, obfuscation passes, and polymorphism passes that have been created
during this project.

2.2.5.1 Optimization transformation

An optimization transformation is a transformation pass which parses the IR and


generates a new IR (and, occasionally, some information about the generated IR). These
transformations generate a functionally equivalent IR which is expected to execute using
less resources from the system.

2.2.6 DAG

DAG is the acronym for directed acyclic graph. DAGs are the intermediate repre-
sentation passed to the backends for transformation into the target instructions.

2.2.6.1 Legalization

Legalization is the process by which the backend transforms unsupported instructions


in the DAG into instructions supported by the instruction emitter. For example, this
stage would convert floating point instructions in an architecture into calls to auxiliary
functions that support them or sets of integer instructions able to perform the same
operations.

2.2.6.2 Register allocation

This is the process by which the registers available in the target architecture are
assigned to be source and destination registers for the DAG instructions. Performing
this process properly will result in significant performance differences in the resulting
code, especially for architectures with a small number of general purpose registers.
Finding the optimal allocation for registers can be reduced to graph coloring, which
is known to be an NP-Complete problem; however, a proof exists in [10] demonstrating
that register allocation can be done in polynomial time when using SSA form.

9
2.3 CHAPTER 2. DEFINITIONS

2.3 Code obfuscation

T
he procedure by which the input source code is transformed into a harder-
to-read code which is functionally equivalent to the source code is called code
obfuscation. Unlike optimization transformations, obfuscation transformations
do not necessarily produce faster code, as they only focus on making the re-
sulting code harder for humans to understand. For example the control flattening code
reduces the efficiency of the code, as it creates more difficulties for the jump predictor.
In a similar way, different constant obfuscation techniques make the processor execute a
greater number of instructions to achieve the same results.

2.3.1 Obfuscation transformation

Obfuscation transformations are transformations which perform code obfuscation.


This term will be used to refer only to those code obfuscation transformations which are
not intended to generate polymorphic code, such as constant obfuscation and control
flattening although in other literature the term “obfuscation transformations” covers a
wider variety of transformations intended to make the resulting code harder to read or
to detect.

2.3.1.1 Control flattening

The control flattening transformation was initially defined by Wang [32]. It is gen-
erally based on the idea of picking two or more basic blocks and making them jump
unconditionally to a new basic block, where the destination basic block will be chosen
depending upon the previous basic block. In this project, the transformation is applied
to all of the basic blocks inside a function, resulting in a single main basic block which
decides where to jump at its completion. The details of the algorithm implementing this
transformation with LLVM IR will be explained in the implementation section.

2.3.1.2 Constant obfuscation

The constant obfuscation transformation intends to make constants harder to read


by transforming them into a set of instructions which will produce the desired constant.
There are multiple ways of achieving this:
1. Fetching the constants from a memory position, for example an array. [32] pro-
posed using aliasing transformations over the array to make reversing the transforma-
tion more difficult (although this last part has not been implemented in this particular
project).

10
2.3 CHAPTER 2. DEFINITIONS

2. Encrypting the constants, for example with arithmetic operations, and converting
them into a cipher text and a key which when combined will result in the desired con-
stant.
3. Using an opaque predicate which will result in the desired constant. This was not
implemented in this project, either.

2.3.1.3 Opaque predicate

An opaque predicate is a predicate which will return the same value independently
of the input values. These are usually derived from mathematical identities.

2.3.2 Polymorphism transformation

This kind of transformation chooses and performs one of many possible transforma-
tions of a type which can be applied to the source code. By changing the seed of the
CPRNG used by these transformations different instances of code functionally equiva-
lent to the source code are created, which can be helpful in making the set of differences
of a patch larger.

2.3.2.1 Register swap

This transformation works by changing one general purpose register to another in


the code. This results in different instructions depending on the architectural register
being used.

2.3.2.2 Dead code insertion

Dead code insertion consists of inserting code that is unused by the resulting program.
This dead code may even be executed by the program, but the code’s results are not
used by the program.

2.3.2.3 Code reordering

Code reordering changes the order of the code inside a program, resulting in multiple
different programs depending upon how the code is reordered.

11
2.4 CHAPTER 2. DEFINITIONS

2.4 PRNG

A
PRNG or PseudoRandom Number Generator is a piece of code used to gen-
erate a series of numbers which has properties similar to those of real random
numbers. Since PRNGs generate the pseudorandom numbers by maintaining
a state derived from an initial seed (which is a small number used to initialize
the PRNG), it should be noted that they are deterministic, as they will generate the
same sequence when given the same seed and thus produce reproducible results.
Good PRNGSs follow these properties [8]:
Determinism Given the same seed the PRNG will produce the same sequence of
numbers.
Uniformity In the sequence all numbers are equally probable.
Independence Each output does not appear to depend on previous ones.

The previous properties, in turn, cause the following properties:


Good distribution The outputs are evenly distributed along the sequence.
Good dimensional distribution The outputs are evenly distributed when ana-
lyzed over many dimensions.
Appropriate distance between values The distance between the values gener-
ated is similar to that of a real random number generator.
Long period The generator requires a large amount of iterations before ending up
in the same state (and thus generating the same sequence again).
An example of PRNG is the rand() function provided by the C library (seeded by
the srand() function).

2.4.1 CPRNG

A CPRNG for Cryptographic PseudoRandom Number Generator, or CSPRNG for


Cryptographically Secure PseudoRandom Number Generator, is a PRNG with some
added properties that make it more resistant to cryptanalysis.
CPRNGs satisfy the properties of a good PRNG, while also holding the following,
stronger properties:
Satisfy the next-bit test Given the first k bits of the sequence, there is no poly-
nomial time algorithm able to predict the next bit with more than 50% accuracy.
Withstand state compromise extensions If the state of the CPRNG, or part
of it, has been compromised it should be impossible to guess the previous values
returned by the generator.

12
2.5 CHAPTER 2. DEFINITIONS

2.5 Encryption algorithm

A
n encryption algorithm is an algorithm, which given some data and a key,
merges the data with that key such that someone without the correct key
cannot read the data being encrypted with the algorithm.
Although there are some non-standard hieroglyphs carved in Egypt around 1900 BC
that were initially suspected of being the earliest cryptography, these are now considered
to be written merely as entertainment for literate individuals. Thus, the first verified
cryptography use dates to 1500 BC when an encrypted Mesopotamian clay tablet con-
taining some recipes, considered trade secrets, was written.
Despite this early start, not much serious work on cryptography and cryptanalysis was
done until the last century, when computers were used to automate the processes.
Perhaps the most robust encryption technique is the encryption algorithm known as
a one-time pad, which when used correctly is unbreakable, as the entropy provided by
the key (if truly random) is equal to the entropy of the message. Thus, it is impossible
to derive any information from the message.
There exist some ancient encryption algorithms, like Caesar or Vigénere ciphers, but
DES, AES, RSA, and RC4 are more recent examples.

2.5.1 Symmetric key encryption algorithm

A symmetric key encryption algorithm is an encryption algorithm which uses the


same key for encryption and decryption of data, so that key must be protected from
outsiders in order to protect the data.
Asymmetric key encryption algorithms are the opposite of symmetric key encryption
algorithms. With asymmetric key encryption algorithms, data is encrypted with one key
and decrypted with another, and there is no easily computable way of getting from the
encryption key to the decryption one. As a result, the encryption key can be published
and is called a public key, whilst the decryption key is kept secret by the receiver of the
message and is called a private key.
Classical ciphers, and modern ones like DES, AES, and RC4, are symmetric key
encryption algorithms.

2.5.2 Block encryption algorithm

A block encryption algorithm is an encryption algorithm which works over fixed-size


groups of bits independently.
Generally, these can be considered a pseudo random permutation, which is a function
performing a 1-to-1 mapping of an n bit input into an n bit output where the key is used
to choose one of the possible mappings.

13
2.5 CHAPTER 2. DEFINITIONS

As an encryption algorithm only able to encrypt a particular block size is quite unus-
able per se, various modes have been developed for these types of encryption algorithms
to make their use easier. The simplest such example is ECB, in which the message
is divided into blocks and then encrypted with the same key. Sadly, this is also quite
insecure, as equal blocks will be encrypted into the same output.
To solve this security problem, advanced modes like CBC (where the previous re-
sult is mixed with the plaintext before encryption) are available. Even more advanced
modes, such as those used to convert block encryption algorithms into stream ciphers
(which encrypt single bits) or authentication algorithms, or to provide authenticated
encryption (with or without authenticated extra data), also exist. The security of most
of these modes is usually based on the assumption that the algorithm is a pseudorandom
permutation.
Examples of such ciphers are AES and DES.

2.5.3 AES

Advanced Encryption Standard (or AES) is the NIST standardized version of Rijn-
dael, the winner of the AES selection process. It is a symmetric key block encryption
algorithm with a block size of 128 bits and key sizes of 128, 192, and 256 bits.
The AES competition was held to choose an appropriate successor to DES, the pre-
vious NIST standardized symmetric key block encryption algorithm which had key sizes
of 56 bits and block sizes of 64 and could be attacked by brute force.
Thanks to the standardization and widespread use of AES, efficient free software
implementations exist along with very efficient hardware implementations, including
those which use the AESNI instruction set on newer, x86 processors.

2.5.3.1 CTR mode of operation

The CTR (or counter) mode of operation converts a block encryption algorithm into
an stream encryption algorithm by using it to encrypt blocks which contain an increasing
counter and then doing an xor operation of the results with the plaintext, as would be
done by any stream encryption algorithm. This provides some advantages, such as easy
parallelization of the algorithm when applying it to various blocks, and, as with any
stream encryption algorithm, padding is not needed.
The CTR counter can be implemented in many ways (for example, by multiplying a
non-zero block number by a prime number), but the most popular method is to apply
an increment of one each time to the unsigned integer number made from the bits in
the previously used block, as this method is simple (especially when using a carry-aware
addition instruction) and still secure.

14
2.5 CHAPTER 2. DEFINITIONS

2.5.3.2 CMAC mode of operation

The Cipher based Message Authentication Code (or CMAC) mode of operation is an
authentication mode for block encryption algorithms which generates an authentication
tag for a given input. This mode of operation is similar to how a keyed hash based
authentication algorithm would work.
When used with a secret key, CMAC mode will prevent any information in the mes-
sage from being derived from the authentication tag, as long as the key is not known
and the block encryption algorithm is secure. If used with a known key, though, in some
cases CMAC mode can be reversed by the method shown in [12], but it is still useful as
a simple entropy collection algorithm for key derivation from variably sized data.

15
3
Methodology

3.1 Information gathering

I
nformation gathering has been conducted mainly by electronically searching for
published papers and other online sources that address the desired concepts, as well
as through reviewing the citations of those documents to discover other interesting,
related material.
The main focus has been on researching obfuscation transformations and polymor-
phism transformations that could be applied to this project. In this search, [1] and [33]
have been of special interest, given the outlooks they provide.
Some of the keywords used when searching for information have been Code Obfus-
cation, Control Flattening, Opaque Predicate, Binary Obfuscation, Dissasembly, 1-day
Exploit, Metamorphic Code, Deobfuscation, and Code Transformation.

3.1.1 Related work

Quite a lot of research has already been done on the topic of code obfuscation, even
though [2] proved that some functions cannot be obfuscated. One of the most relevant
papers on the topic is [32], which in Chapter 4 introduces a set of obfuscation techniques
that was later further developed by [33] into a general obfuscation method. This method
has served as the basis for the obfuscation techniques implemented in this project.
On a similar topic, [15] proposes the insertion of junk bytes before basic blocks along
with the use of opaque predicates in the added branch instructions to make it more
difficult for dissasemblers to go back to the original code by disrupting the instruction
stream.

16
3.1 CHAPTER 3. METHODOLOGY

The effectiveness of these techniques has been analyzed in papers such as [21], which
also contains an overview of some of these techniques and concludes that they can be
applied in an effective form at source code level. There has also been research on reversing
these techniques at [28, 20, 19], which introduce certain automatic and semi-automatic
tools to help in deobfuscating code generated through some common methods, including
control flattening.
Since obfuscation allows the intentions of the code being executed to be hidden, it
should not come as a surprise that one of the main focuses of obfuscation has been its
use in Malware programs in order to make such programs harder to detect, as shown by
papers such as [4, 26]. Even though the techniques explained in those works are useful for
similar purposes to that of this project, the focus of this project is very different.
Moving into more recent research, a robust review of available obfuscation techniques
can be found at [34]. One of the most recent and promising developments in this field
is the technique pointed to in [7], which proposes the use of cryptography to hide code
functionallity. Finally, some development on obfuscation using LLVM is presented by
[14, 6, 26, 23].
There exist many solutions to allow code obfuscation, but none could be found which
allow for focusing the transformations on the desired functions. Such an obfuscator
would permit the developer to control the size of the final patch. Additionally, many
tools only focus on obfuscating the resulting binary code, but in order to improve code
portability the developed tool needs to work on an upper layer. In this project the focus
will be on the intermediate representation code, resulting in a more portable tool.
The most similar project to the one described in this paper is obfuscator-llvm [13] 1 .
This project is limited by the need to identify functions by name (such approach is not
adequate in some languages, such as C++ where mangling is used); the use of a CPRNG
seeded randomly (which will result in different code on every compilation, making patch
generation more difficult); and a control flattening implementation that heavily depends
upon memory accesses (which require later optimization).

1
Code is available at https://github.com/obfuscator-llvm/obfuscator

17
3.2 CHAPTER 3. METHODOLOGY

3.2 Development

F
or this project the LLVM compiler and the Clang frontend were chosen be-
cause of the quality of their documentation (especially due to their explicative
tutorials, available at [18] and [27]).
As a result, the language used when developing this system was C++, as it is the
same language used by the above two projects. Notably, though, the AES library, which
provided the required AES modes for the CPRNG derived from Gladman’s library [9],
is written in C.
Since one of the objectives of this project is to produce code that can one day be
merged with LLVM, development has been made against the subversion sources, as was
the case for Clang2 .

2
The patches to the sources are all available for reference at http://klondike.es/programas/llvm_
obf/

18
4
Algorithm Implementation

4.1 Control Flattening

C
ontrol flattening tries to ensure that all basic blocks end up on the same basic
block, where the choice for the next basic block will be made based on the
information provided by the previous basic block (including which basic block
it was).
Depending upon the terminating instruction type, different actions will be needed on
the end of the basic block before jumping to the common basic block. Unconditional
branches should only transfer their value to a PHI node on the common basic block,
while conditional branches can transfer the value of a select instruction to that PHI
node. Advanced constructs, such as a switch, can be implemented in a similar way.
Regardless, because instructions such as indirect jumps cannot be easily handled, the
basic block may always be split before the terminator instruction so that it is processed
as an unconditional branch.
In this project, this transformation is implemented with the algorithm shown at
algorithm 4.1.
In order to improve the obfuscation generated by this transform, a pass which ran-
domly splits basic blocks can be used to make the basic blocks smaller and thus harder
to follow. Such an algorithm is defined at algorithm 4.3.

19
4.1 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.1 Control Flattening


function controlFlattening(F )
if not entryblock E of F contains only a non conditional jump then
create a block B with a non conditional jump to E
make B the entryblock of F
end if
create a block M
for all instruction I1 in F do
replaceUses(F , I1 )
end for
for all PHI node P in F do
if not P is on M then
move P to M
for all basic block B not defined on P do
make P be P on B if P is alive
end for
end if
end for
for all terminator instruction I in F do
if not I can be processed then
split the basic block containing I before I
end if
end for
create an empty map I of identifiers to basic blocks
for all basic block B in F do
if B is reachable from M then
add an unique identifier for B on I
end if
end for
create a PHI node P on M
for all basic block B in F do
if B will point to M then
make P be the appropriate value of I on B
change the terminator instruction so it points to M
end if
end for
create a switch instruction in M
make M jump depending on the value of P
end function

20
4.1 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.2 Use replacement


function replaceUses(F , I1 )
create an empty queue Q
for all instruction I2 in users of I1 do
if I2 is a PHI node then
if I2 gets the value I1 from a block not containing I1 then
queue I2 on Q
end if
else if I2 is on a different basic block than I1 then
queue I2 on Q
else if I2 is a terminator we can’t process then
queue I2 on Q
end if
end for
if Q contains elements then
create a new PHI node P in M
make P be P on blocks without I1 where I1 is alive
make P be I1 on the block with I1
for all instruction I2 in Q do
replace all uses of I1 in I2 by P
end for
end if
end function

Algorithm 4.3 Block Splitting


function blockSplitting(F , X, Y )
for all instruction I in F do
if x with Pr (x = true) = X Y then
split the basic block containing I before I
end if
end for
end function

21
4.2 CHAPTER 4. ALGORITHM IMPLEMENTATION

4.2 Constant obfuscation

C
onstant obfuscation is implemented by replacing constants with a set of in-
structions that result in the original constant. This can only be implemented
when the instruction containing the constant to be replaced is capable of us-
ing the result of other instructions in place of a constant at that particluar
position. The algorithm used for constant obfuscation is defined at algorithm 4.4.
The algorithm to obfuscate constants is implemented at algorithm 4.5. Currently,
this algorithm is only utilized for integer constants, as their arithmetic is relatively easy
to predict. For additional simplicity, some of the variables from the parent function are
not passed on to child functions in the pseudocode.

Algorithm 4.4 Finding constants to obfuscate


function constantObfuscation(F , X, Y )
create a pointer PA to the array with the constants moved to memory
for all basic block B in F do
for all PHI node P in B do
for all Constant C in P do
set IP to the terminator of the block returning C
replace C with obfuscateConstant(C, IP )
end for
end for
for all Instruction I in B do
for all Constant C in I do
replace C with obfuscateConstant(C, I)
end for
end for
end for
generate the array A with constants moved to memory
point PA to A
end function

22
4.2 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.5 Obfuscating a constant


function obfuscateConstant(C, I)
if not C is integer then
return C
end if
if size of C ≤ integer array element size then
if x with Pr (x = true) = 12 then
. Memory fetch algorithm
create a constant C1 with the current size of the constant array
push C to the end of the constant array
if x with Pr (x = true) = X Y then
replace C1 with obfuscateConstant(C1 , I)
end if
insert before I a load instruction L1 of the array address from PA
insert before I a displacement instruction D with C1 over L1
insert before I a load instruction L2 of D
return L1
end if
end if
. Equivalent arithmetic instruction algorithm
create a random constant C1
if x with Pr (x = true) = 2Y X
then
replace C1 with obfuscateConstant(C1 , I)
end if
choose randomly an operation O of xor, add or sub
create a constant C2 so C1 (O)C2 = C
if x with Pr (x = true) = 2Y X
then
replace C2 with obfuscateConstant(C2 , I)
end if
insert before I a O instruction Oi with operands C1 and C2
return Oi
end function

23
4.3 CHAPTER 4. ALGORITHM IMPLEMENTATION

4.3 Register Swap

S
ince it is heavily architecture dependent, it would be quite complicated to define
this transformation without working directly with the architectural DAG. The
reason for this is that the LLVM IR has an unlimited number of anonymous
registers, thus making it impossible to swap two registers without also swapping
the instructions, which could lead to execution order issues.
To avoid this pitfall and gain a small portion of functionality, this project implements
random swapping of the operands of binary operators, where possible. The idea behind
this is that the register allocator may decide to issue the registers in a different order
when processing the DAGs. Furthermore, the effect of this register swapping is later
improved by the code reordering transformation, which takes into account instruction
dependencies inside basic blocks to ensure properly kept ordering.
The algorithm used for this simple transformation is defined at algorithm 4.6.

Algorithm 4.6 Register Swap


function registerSwap(F )
for all instruction I in F do
if I has 2 operands and I is conmutative then
if x with Pr (x = true) = 12 then
swap operands 1 and 2 of I
end if
end if
end for
end function

24
4.4 CHAPTER 4. ALGORITHM IMPLEMENTATION

4.4 Code reordering

C
ode reordering inside functions is mainly implemented through randomly re-
ordering the basic blocks and the instructions inside the function. The algo-
rithm to do so has some peculiarities, defined in the following sections.

4.4.1 Instruction reordering

Instruction reordering requires instructions with side effects to always be executed in


the same order (as the effects can cause hidden dependencies). It also requires dependen-
cies to be executed before the instructions which use them. Reordering is applied on a per
basic block basis to reduce the scope of the pass, as jumps would make the process more
complicated. The full algorithm presented at algorithm 4.7 is dividided into PHI node
scheduling (presented at algorithm 4.8), instruction dependecy list generation (presented
at algorithm 4.9), and instruction scheduling (presented at algorithm 4.10).
PHI nodes are handled independently from the other instructions, because they have
no dependencies between them and must always be scheduled at the begining of the
basic block. Furthermore, the terminator instruction is not altered, as it must always
be at the end of the basic block.

4.4.2 Basic block reordering

Basic block reordering only requires that the entry basic block is kept the same. The
algorithm simply creates a new ordering of all of the basic blocks on the function (except
the entry basic block) and reorders them according to that arrangement. The algorithm
definition can be seen at algorithm 4.11.

Algorithm 4.7 Instruction Reordering


function instructionReordering(B)
phiScheduling(B)
store in L and M the return value of instructionDependencies(B)
instructionScheduling(B, L, M )
end function

25
4.4 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.8 Schedule PHI nodes


function phiScheduling(B)
make P1 the first PHI node in B
make L a list containing all PHI nodes in B
while size(L) > 0 do
extract a random element P2 from L
if not P2 = P1 then
swap the PHI nodes P1 with P2
end if
make P1 point to the PHI node after P1
end while
end function

Algorithm 4.9 Instruction dependency list creation


function instructionDependencies(B)
make L an empty list
make M a map of instructions to instruction lists
for all instruction I in B except PHI nodes and terminators do
if I has side effects then
for all instruction I2 in B after I do
if I2 has side effects or I2 reads memory then
append I to M [I2 ]
end if
end for
else if I reads memory then
for all instruction I2 in B after I do
if I2 has side effects then
append I to M [I2 ]
end if
end for
end if
for all operand O in I do
if O is an instruction in B before I then
append O to M [I]
end if
end for
if M [I] is empty then
append I to L
end if
end for
return L, M
end function

26
4.4 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.10 Schedule Instructions


function instructionScheduling(B, L, M )
make I1 the first intruction not being a PHI node in B
while size(L) > 0 do
extract a random element I2 from L
if not I2 = I1 then
swap the instruction I1 with I2
end if
for all dependent instruction D in I2 do
Remove I2 from M [D]
if M [D] is empty then
place D on L
end if
end for
make I1 point to the instruction after I1
end while
end function

Algorithm 4.11 Basic block reordering


function basicBlockReordering(F )
make B1 the entry block in F
make L a list containing all PHI nodes in B
remove B1 from L
while size(L) > 0 do
make B1 point to the basic block after B1
extract a random element B2 from L
if not B2 = B1 then
swap the basic blocks B1 with B2
end if
end while
end function

27
4.5 CHAPTER 4. ALGORITHM IMPLEMENTATION

4.5 CPRNG

M
any of the transformations depend upon an entropy source to make random
choices. This project uses a CPRNG for this purpouse. The CPRNG utilizes
the pad used for encryption by AES in the CTR mode (which is the same
as encrypting blocks made of 0s using CTR), skipping any remaining bits
until the end of the basic block. This requires a key and an IV. The key is derived by
adding data dependent upon the module, the function, and the transformation, in order
to prevent the state from repeating. The initial IV is simply a string of 0 bits (as the
algorithm will still be safe even if the IV is known). The pseudocode for the CPRNG is
provided at algorithm 4.13.
In order to generate the key for this process, we will first summarize the obfuscation
key by using CMAC and the key “ABADCEBADABEBEFABADAACABACABECEA”.
(This is the Spanish phrase “Abad, cebada bebe, fabada acaba, cabecea”, which trans-
lates to ”The abbot drinks barley (referring to beer), ends with the fabada (a Spanish
dish made with white beans, sausages, and pork served with the water they were boiled
in), thus nods (out of sleepiness).”) Afterwards, we will use the resulting key and CMAC
to summarize the rest of the metadata which is considered to be public knowledge. This
procedure is chosen because it makes it more difficult to retrieve the obfuscation key
even if AES is broken and allows usage of any kind of data as an obfuscation key. The
algorithm for generating this key is provided at algorithm 4.12.
A proof for the security of a CPRNG created in this way is provided later in this
work.

28
4.5 CHAPTER 4. ALGORITHM IMPLEMENTATION

Algorithm 4.12 CPRNG initialization


function CPRNGInitialization(O)
. O is the obfuscation key
make K1 be 0xABADCEBADABEBEFABADAACABACABECEA in big endian
make K2 the result of CMACAESK1 (O)
if The pass applies to a function then
make P a byte set to 1
append to P the module name
append to P a byte set to 0
append to P the function name
append to P a byte set to 0
append to P the pass identifier
append to P a byte set to 0
else if The pass applies to a module then
make P a byte set to 2
append to P the module name
append to P a byte set to 0
append to P the pass identifier
append to P a byte set to 0
else
fail as this is not implemented
end if
make K3 the result of CMACAESK2 (P )
make S be K3 as key and a string of 0s as IV
return S
end function

Algorithm 4.13 CPRNG usage


function CPRNGRandom(S)
make K the key in S
make I the IV in S
make R the result of AESK (I)
increase I by 1
store in S the new value of I
return R
end function

29
5
Code design considerations

5.1 Coding conventions

T
he following conventions apply to all of the code which was written for this
project, though certain modifications of these were required, given the nature
of the original code. Such modifications are explained later.
Code is indented using 4 spaces for each opened brace not yet closed. No new line is
inserted between keywords or expressions and opening braces.
Variables and arguments can be named as desired. In general, iterators are either
given a letter starting from “i” or defined as “i” followed by an abbreviation of the class
being iterated. This convention was chosen mainly to reduce the development time of
the PoC, in spite of the maintenance cost, and will probably be dropped if the code is
submitted upstream.

5.1.1 Transformation specific conventions

Classes, methods, and functions follow mostly LLVM’s conventions: classes use
camel case starting with an upper-case letter, and methods and functions use camel
case starting with a lower-case letter.

5.1.2 Auxiliar library conventions

Classes, methods, and functions are given names in underscore-separated characters,


with case depending upon the use of abbreviations or words. Classes start with an
upper-case letter, while methods and functions do not. This will most likely be refactored

30
5.1 CHAPTER 5. CODE DESIGN CONSIDERATIONS

to adjust to LLVM’s conventions in later iterations, although the conventions will be


kept on the AES code unless it is merged into the utility library.

31
5.2 CHAPTER 5. CODE DESIGN CONSIDERATIONS

5.2 Design choices

A
set of libraries and a framework to implement the code needed to be chosen.
For AES support, a slightly modified version of Brian Gladman’s AES library
[9] was selected. For the transformations, LLVM’s framework was chosen. In
the following subsections the implications of such choices are exposed.

5.2.1 AES implementation used

Brian Gladman’s AES implementation was adapted (by altering the CTR mode so
that it will only provide the pad) and utilized because of its liberal license and high qual-
ity, demonstrated by references to it in Intel’s documentation, amongst others [35].

5.2.2 LLVM transformations

LLVM transformations inherit from ModulePass [30] and FunctionPass [29] and are
implemented in anonymous namespaces to prevent pollution. (Common code was moved
to the Utils.cpp file and implemented in the Obf namespace.)
Transformations are declared by using the RegisterPass [31] template. Also, a per
module ID (depending on the class) is declared as it used later for pass identifica-
tion.
When possible, the transformation keeps the analysis produced and reports it to the
pass manager.

5.2.2.1 Transformation parameters

Parameters are passed by the command line and parsed through the cl [16] API in
LLVM. A specific parser for probabilities was written for this project. Probabilities are
defined as “numerator/denominator”. For example, a probability of 50% (1 in 2) would
be expressed as 1/2.

5.2.2.2 Transformation implementation

The implemented transformations depend upon the presence of an obfuscation key


in order to work. As such, the presence of this key is used to decide whether or not the
chosen transformations should be applied to a particular function or module.

32
6
Transformation implementations

6.1 Obfuscation key

T
hese transformations handle the obfuscation keys used by other transforma-
tions. Some transformations require a module key which can only be provided
with the transformations below, whilst others require function keys which can
be forced on all functions with these transformations.

6.1.1 addmodulekey

The addmodulekey transformation simply attaches the specified obfuscation key (as
named metadata) onto the module for future use by other transformations. A pseu-
docode definition is provided at algorithm 6.1.
The addmodulekey transformation is the only current means of expressing a module
obfuscation key.
The key can be defined using the modulekey parameter, followed by the string used
as the module key.

6.1.2 propagatemodulekey

This transformation propagates the module obfuscation key to all of the functions in
the current module. It will overwrite any key already in place. A pseudocode definition
is provided at algorithm 6.2.
Propagating the module obfuscation key is useful for testing, applying transforma-
tions automatically in certain cases, and as an all-or-none switch.

33
6.1 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.1 addmodulekey


function addmodulekey(Module & M , Key K)
setModuleMetadata(M ,”ObfuscationKey”,K) . Set the module key
end function

Algorithm 6.2 propagatemodulekey


function propagatemodulekey(Module & M )
String K = getModuleMetadata(M ,”ObfuscationKey”)
for all Function F in M do
setFunctionAttribute(F ,”ObfuscationKey”,K) . Set the function key
end for
end function

34
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

6.2 Obfuscation

T
hese transformations take the original code and return a new one which is
harder for humans to read, yet still functionally equivalent to the original.
They are mostly based on the ideas of [32].

6.2.1 flattencontrol

This transformation applies the control flattening algorithm, but it is quite complex
given the way in which the LLVM IR language is implemented.
Furthermore, the current implementation could benefit from more code modulariza-
tion. This was not performed, due to the time constraints of the project.
The pseudocode for the transformation is provided at algorithm 6.3.

6.2.2 obfuscateconstants

This transformation applies the constant obfuscation algorithm.


One of the main issues is that some LLVM instructions and calls to intrinsics contain
operands which must be a constant (for example, the alignment in a load instruction or
the destinations on a switch instruction). These constants cannot be replaced by code
which returns them.
The current implementation is capable of separating the different transformations into
their own modules for simplicity, but this was sacrificed in order to speed up development
of the PoC.
The move to an array method could add random data when expanding the constants
to make inferring the size more difficult, or the move could use a single byte constant
so that bigger constants would be divided into smaller ones and reassembled. Also,
randomly reordering the array would make the resulting array impossible to read.
The pseudocode for the transformation is provided at algorithm 6.12.

35
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.3 flattencontrol


function flattencontrol(Function & F )
CPRNG R = prng(F ,”flattencontrol”)
if not isNull(R) then
prepareEntriesAndExits(F )
BasicBlockList L = generateNodeList(F )
BasicBlock U = getUnreachable(F )
BasicBlock M = new BasicBlock . Create the main node
append(F ,M )
genPHINodes(F , M , L)
movePHINodes(F , M , L)
removeUnhandledTerminators(L)
BasicBlock2IntegerMap D = generateBlockIds(L, R)
handleTerminators(L, M , D)
end if
end function

Algorithm 6.4 prepareEntriesAndExits


function prepareEntriesAndExits(Function & F )
unifyFunctionExitNodes(F ) . Merge all exit points of the function
BasicBlock E = getEntryBlock(F )
Terminator T = getTerminator(E)
if size(E) 6=1 or isUnconditionalBranch(T ) then
InsertionPoint B = begin(E)
splitAt(E,B) . Make the entry block only an unconditional branch
end if
end function

Algorithm 6.5 generateNodeList


function generateNodeList(Function F )
BasicBlockList L = new BasicBlockList
for all BasicBlock B in F do
append(L,B)
end for
return L
end function

36
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.6 getUnreachable


function getUnreachable(Function & F )
for all BasicBlock B in F do
Terminator T = getTerminator(B)
if isUnreachable(T ) then
return B . Return the found block
end if
end for
BasicBlock B = new BasicBlock . Create and return a new block
Unreachable U = new Unreachable
append(B,U )
append(F ,B)
return B
end function

37
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.7 genPHINodes


function genPHINodes(Function & F , BasicBlock & M , BasicBlockList L1 )
for all BasicBlock B1 in F do
for all Instruction I in B1 do
UserList L2 = new UserList . Keep cross block uses
K = false . Shall we keep the value
for all User U in I do
if isPHINode(U ) then
if getBlockForUse(U ) 6=B1 then
K = true
append(L2 , U )
end if
else if isInstruction(U ) and getParent(U ) 6=B1 then
K = true
append(L2 , U )
else if isTerminator(U ) and not isBranch(U ) then
append(L2 , U )
end if
end for
if not empty(L2 ) then
PHINode P = new PHINode
append(P ,M )
for all BasicBlock B2 in L1 do
if B2 = getEntryBlock(F ) then
valueFrom(P ,B2 ,undefined)
else if B2 = B1 then
valueFrom(P ,B2 ,I)
else if K then
valueFrom(P ,B2 ,P )
else
valueFrom(P ,B2 ,undefined)
end if
end for
for all Use U in L2 do
replaceUseWith(U ,P )
end for
end if
end for
end for
end function

38
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.8 movePHINodes


function movePHINodes(Function & F , BasicBlock & M , BasicBlockList L)
for all BasicBlock B1 in F do
for all PHINode P1 in B do
PHINode P2 = new PHINode
append(P ,M )
for all BasicBlock B2 in L do
if hasValueFrom(P ,B2 ) then
V = getValueFrom(P ,B2 )
valueFrom(P ,B2 ,V )
else if B2 = getEntryBlock(F ) then
valueFrom(P ,B2 ,undefined)
else
valueFrom(P ,B2 ,P )
end if
end for
for all Use U in P1 do
replaceUseWith(U ,P2 )
end for
end for
end for
end function

Algorithm 6.9 removeUnhandledTerminators


function removeUnhandledTerminators(BasicBlockList L)
for all BasicBlock B in L do
T = getTerminator(B)
if not isBranch(T ) then
splitAt(B, T )
end if
end for
end function

39
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.10 generateBlockIds


function generateBlockIds(BasicBlockList L, CPRNG & R)
BasicBlockSet S = new BasicBlockSet
for all BasicBlock B in L do
T = getTerminator(B)
for all BasicBlock B in getDestinations(T ) do
add(S, B)
end for
end for
BasicBlockArray A = toArray(S)
randomizeOrder(R, A)
Integer P = 0
BasicBlock2IntegerMap D = new BasicBlock2IntegerMap
for all BasicBlock B in A do
D[B] = P
P =P +1
end for
return D
end function

40
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.11 handleTerminators


function handleTerminators(BasicBlockList L, BasicBlock & M , BasicBlock2In-
tegerMap D)
PHInode P = new PHInode
append(M , P )
for all BasicBlock B in L do
T = getTerminator(B)
remove(B, T )
if isConditionalBranch(T ) then
C = getCondition(T )
Dt = getDestinationTrue(T )
Df = getDestinationFalse(T )
Select S = new Select
setCondition(S, C)
setValueTrue(S, D[Dt ])
setValueFalse(S, D[Df ])
append(B, S) valueFrom(P ,B,S)
else . It can only be an unconditional branch
Du = getDestination(T ) valueFrom(P ,B,D[Du ])
end if
UnconditionalBranch U = new UnconditionalBranch
setDestination(U , M )
append(B, U )
end for
Switch S = new Switch
for all BasicBlock B in D do
setDestinationIfValue(S, D[B], B)
end for
append(M , S)
end function

41
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.12 obfuscateconstants


function obfuscateconstants(Module & M )
ArrayPointer P = new ArrayPointer . To be able to access the array
addGlobal(M , P )
ConstantList L = new ConstantList . Constants moved to memory go here
for all Function F in M do
CPRNG R = prng(F ,”obfuscateconstants”)
if not isNull(R) then
for all Instruction I1 in I do
if isPHINode(I1 ) then
for all BasicBlock B2 in getFrom(I1 ) do
Instruction I2 = getTerminator(B2 )
Value V = getValueFrom(P ,B2 )
obfuscateUse(R, P , L, I2 , V )
end for
else
for all Value V in I do
if canBeInstruction(V ,I) then
obfuscateUse(R, P , L, I1 , V )
end if
end for
end if
end for
end if
end for
Array A = arrayFromList(L)
addGlobal(M , A)
setValue(P )getReference(A) . Point P to A
end function

Algorithm 6.13 obfuscateUse


function obfuscateUse(CPRNG & R, ArrayPointer P , ConstantList & L, Instruc-
tion I1 , Value & V )
if isConstant(V ) then
Instruction I2 = obfuscateConstant(R, P , L, I1 , V )
if C 6=V then
obfuscatedConstants = obfuscatedConstants + 1
replace(V ,I2 )
end if
end if
end function

42
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.14 obfuscateConstant


function obfuscateConstant(CPRNG & R, ArrayPointer P , ConstantList & L,
Instruction I1 , Value & V )
Integer SV = bitlength(V )
Integer SA = bitlength(getReferencedTypeP ())
if isInteger(V ) then
if SV ≤SA and withProbability(R, 12 ) then
return moveToArray(R, P , L, I1 , V )
else
return createOperation(R, P , L, I1 , V )
end if
end if
return V
end function

Algorithm 6.15 moveToArray


function moveToArray(CPRNG & R, ArrayPointer P , ConstantList & L, Instruc-
tion I1 , Value & V )
Probability PR = getReobfuscationProbability()
Integer SV = bitlength(V )
Integer SA = bitlength(getReferencedTypeP ())
Constant C1 = size(L)
append(L,V )
if withProbability(R, PR ) then
C1 = obfuscateConstant(R, P , L, I1 , C1 )
reobfuscatedConstants = reobfuscatedConstants + 1
end if
Instruction I2 = createLoad(P )
InsertBefore(I1 ,I2 )
Instruction I3 = createGetArrayAddress(I2 ,C1 )
InsertBefore(I1 ,I3 )
Instruction I4 = createLoad(I3 )
InsertBefore(I1 ,I4 )
Instruction VR = I4
if SV < SA then
Instruction I5 = createTruncate(I4 ,bitlength(V ))
InsertBefore(I1 ,I5 )
VR = I5
end if
return VR
end function

43
6.2 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.16 createOperation


function createOperation(CPRNG & R, ArrayPointer P , ConstantList & L, In-
struction I1 , Value & V )
Probability PR = getReobfuscationProbability()
Constant C1 = getRandomInteger(R)
Operation O
if withProbability(R, P2R ) then
C1 = obfuscateConstant(R, P , L, I1 , C1 )
reobfuscatedConstants = reobfuscatedConstants + 1
end if
Constant C2
if withProbability(R, 13 ) then
C2 = V − C1
O = createAddOperation()
else if withProbability(R, 12 ) then
C2 = V + C1
O = createSubstractOperation()
else
C2 = V ⊕ C1
O = createXorOperation()
end if
if withProbability(R, P2R ) then
C2 = obfuscateConstant(R, P , L, I1 , C2 )
reobfuscatedConstants = reobfuscatedConstants + 1
end if
Instruction I2 = createOperationInstruction(O,C2 ,C1 )
InsertBefore(I1 ,I2 )
return I2
end function

44
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

6.3 Polymorphic

T
he polymorphic transformations do not aim to make the code more difficult to
read but different every time it is run, according to the results of a PRNG.
This results in smaller penalties for using the transformations but can make
the code harder to compare.

6.3.1 bbsplit

This transformation will go over all of the basic blocks of the function and, for each
basic block, decide on splitting it for each instruction (except for the PHI nodes and the
first non PHI node instruction).
As splitting can alter the basic blocks list, all of the initial basic blocks are stored on
a vector, upon which splitting is then run.
The probability of splitting a basic block at each particular point can be adjusted by
using the splitprobability parameter. Keep in mind, though, that setting the parameter
to one will result in each instruction being split.
The pseudocode for the transformation is provided at algorithm 6.17.

6.3.2 randbb

This transformation applies the basic blocks reordering algorithm to each function.
Of greatest importance in this step is keeping the entry block the same.
The pseudocode for the transformation is provided at algorithm 6.18.

6.3.3 randins

This transformation applies the instructions reordering algorithm to each basic block.
The code could be improved upon by separating the PHI node handling function from
the more complex handling of normal instructions. Again, has not been done because
of the time constraints of the project.
The pseudocode for this transformation is provided at algorithm 6.19.

6.3.4 randfun

This transformation applies the functions reordering algorithm to each module.


Although not necessarily useful for binary patch obfuscation, this transformation was
developed because of the aid it provided in code hardening at compilation time.
The pseudocode for the transformation is provided at algorithm 6.23.

45
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

6.3.5 randglb

This transformation applies the globals reordering algorithm to each module.


Again, although not of interest for binary patch obfuscation, the transformation was
developed for the assistance it provides in code hardening at compilation time.
The pseudocode for this transformation is provided at algorithm 6.24.

6.3.6 swapops

This transformation applies the operands reordering algorithm to each module, which
usually results in different registers being allocated on the ensuing assembly code.
The pseudocode for the transformation is provided at algorithm 6.25.

Algorithm 6.17 bbsplit


function bbsplit(Function & F )
CPRNG R = prng(F ,”bbsplit”)
if not isNull(R) then
BasicBlockQueue Q = new BasicBlockQueue
Probability PS = getSplitProbability()
for all BasicBlock B in F do
append(Q, B) . Queue blocks to avoid trouble
end for
for all BasicBlock B in Q do
for all Instruction I in B do
if not isPHINode(I) and not isFirstNonPhi(B,I) then
if withProbability(R, PS ) then
splitAt(I, B)
end if
end if
end for
end for
end if
end function

46
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.18 randbb


function randbb(Function & F )
CPRNG R = prng(F ,”randbb”)
if not isNull(R) then
BasicBlockArray A = new BasicBlockArray
BasicBlock I = getEntryBlock(F )
for all BasicBlock B in F do
if B 6=I then
append(A, B)
end if
end for
randomizeOrder(R, A)
for all BasicBlock B in A do
if B 6=I then
moveAfter(I, B)
I =B
end if
end for
end if
end function

Algorithm 6.19 randins


function randins(Function & F )
CPRNG R = prng(F ,”randins”)
if not isNull(R) then
for all BasicBlock B in F do
reorderPHINodes(R, B)
Instruction2InstructionSetMap & M
InstructionList & L
M , L = createDependencyMap(B)
reorderNonPHINodes(R, B, M , L)
end for
end if
end function

47
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.20 reorderPHINodes


function reorderPHINodes(CPRNG & R, BasicBlock & B)
PHINodeArray A = new PHINodeArray
for all Instruction I in B do
if isPHINode(I) then
append(A, I)
end if
end for
randomizeOrder(R, A)
PHINode I = getFirstPHINode(B)
for all PHINode P in A do
if P 6=I then
moveBefore(I, P )
else
I =P
end if
end for
end function

48
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.21 createDependencyMap


function createDependencyMap(BasicBlock & B)
Instruction2InstructionSetMap M = new Instruction2InstructionSetMap
InstructionList L = new InstructionList
for all Instruction I1 in B do
if not isPHINode(I1 ) then
if hasSideEffects(I1 ) then
for all Instruction I2 in instructionsAfterI1 ( )do
if hasSideEffects(I2 ) or readsMemory(I2 ) then
append(M [I1 ], I2 )
end if
end for
end if
if readsMemory(I1 ) then
for all Instruction I2 in instructionsAfterI1 ( )do
if hasSideEffects(I2 ) then
append(M [I1 ], I2 )
end if
end for
end if
for all Operand O in I1 do
Boolean OI = isInstruction(O)
Boolean OA = isAfter(O,I1 ) . operand must be after instruction
Boolean OP = not isPHINode(O)
Boolean OB = getBasicBlock(O) 6=B
if OI and OA and OP and OB then
append(M [I1 ], O)
end if
end for
if empty(M [I1 ]) then
append(L, I1 )
end if
end if
end for
return M , L
end function

49
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.22 reorderNonPHINodes


function reorderNonPHINodes(CPRNG & R, BasicBlock & B, Instruction2In-
structionSetMap & M , InstructionList & L)
Instruction I1 = getFirstNonPhi(B)
while not empty(L) do
Instruction I2 = extractRandomElement(R,L)
if I2 6=I1 then
moveBefore(I1 , I2 ) . move I2 before I1
else
I1 = getNext(I1 )
end if
for all User U in I2 do
if isInstruction(U ) and isOn(M ,U ) then
remove(M [U ], I2 )
if empty(M [U ]) then
remove(M [U ])
append(L, J)
end if
end if
end for
end while
end function

Algorithm 6.23 randfun


function randfun(Module & M )
CPRNG R = prng(M ,”randfun”)
if not isNull(R) then
FunctionArray A = new FunctionArray
for all Function F in M do
append(A, F )
end for
randomizeOrder(R, A)
Function I = getFirstFunction(M )
for all Function F in A do
if F 6=I then
moveAfter(I, F )
I =F
end if
end for
end if
end function

50
6.3 CHAPTER 6. TRANSFORMATION IMPLEMENTATIONS

Algorithm 6.24 randglb


function randglb(Module & M )
CPRNG R = prng(M ,”randglb”)
if not isNull(R) then
GlobalArray A = new GlobalArray
for all Global G in M do
append(A, G)
end for
randomizeOrder(R, A)
Global I = getFirstGlobal(M )
for all Global G in A do
if G 6=I then
moveAfter(I, G)
I =G
end if
end for
end if
end function

Algorithm 6.25 swapops


function swapops(Function & F )
CPRNG R = prng(F ,”swapops”)
if not isNull(R) then
for all Instruction I in F do
Boolean IC = isConmutative(I)
Boolean IB = isBinary(I)
if IC and IB and withProbability(R, PS ) then
swapOperands(I)
swappedOperands = swappedOperands + 1
end if
end for
end if
end function

51
7
Evaluation

7.1 Proof: the CPRNG is a good PRNG

A
s the CTR mode based CPRNG being used is at the heart of this project’s
transformations, it is important to prove that it follows the properties desirable
for any Pseudo-Random Number Generation in order to demonstrate that the
use of such a generator is adequate.
In the following subsections, proof is provided that the CPRNG has the properties of
determinism, uniformity, and independence. Additionally, its period is calculated.

7.1.1 Determinism

Determinism is given by the fact that no random data is used to generate the key used
by CTR mode and that the original IV is the same. Thus, as block ciphers need to be
deterministic to allow decryption on the other side, the CPRNG is deterministic.

7.1.2 Uniformity

Given that block ciphers are a one to one mapping of n-bit blocks to n-bit blocks
and that the IV is incremented by one each time, the CPRNG will cover all of the
2n possible inputs (and thus the 2n possible outputs), generating the largest possible
uniform output.

52
7.1 CHAPTER 7. EVALUATION

7.1.3 Independence

Since the mapping done by the encryption algorithm is based on the key used, all
outputs are independent from each other, as long as the encryption algorithm is a pseu-
dorandom permutation.

7.1.4 Function period

As the counter iterates over the total 2n states that are possible with its n-bits, and
as each input block is mapped to a different and unique ouput block, the period of the
CPRNG is exactly 2n , which should be large enough for any practical use.

53
7.2 CHAPTER 7. EVALUATION

7.2 Proof: the CPRNG is secure

I
n a similar way, because the CTR mode-based CPRNG used for this project can
be attacked for the purpose of determining the decisions made during the trans-
formations, it is important to prove that it follows the properties desirable for any
Cryptographic Pseudo-Random Number Generator, thus ensuring that the use of
such a generator is adequate.
In the following subsections, proof is provided that the CPRNG has the following
properties: resistance to next bit tests; impossibility of deriving the function result if the
state is known; and, based on this, resistance to the state compromise extension.

7.2.1 Next bit-test resistance

As long as the block encryption algorithm used for the CTR mode is resistant to
cryptanalysis, it will be impossible to derive the key used (and thus the state) to predict
the next block that will be generated. Inside blocks this property is held, as the cipher
is a pseudorandom permutation, and thus no bit presents a visible dependence from the
previous one.

7.2.2 Impossibility of knowing the result if only the state is known

One of the problems with the CPRNG is that the seed used for the state (the IV
of the CTR mode) is known (and is zero); however, since the attacker has no way of
knowing the key (if the obfuscation key is kept secret), it is impossible for him to know
which of all the possible blocks will be generated by AES.

7.2.3 State compromise extension resistance

Since the key used in CTR is hidden and is not part of the state (which is only the
IV), knowing the value of the IV provides no information, as long as the key is resistant
to known plaintext attacks. As a result, given the impossibility of knowing the result if
only the state is known, even if the state is known and previous and future states can be
derived, it is impossible for the attacker to know the result of the function, thus making
the algorithm secure.

54
7.3 CHAPTER 7. EVALUATION

7.3 Proof: the key derivation is secure

T
he first CMAC iteration is performed using a symmetric key encryption algo-
rithm as a hash function in order to summarize the entropy of the obfuscation
key string. Since CMAC uses the previous AES outputs to calculate the next
one, this effectively results in all of the entropy from the original key being kept
and compressed in the resulting tag (with up to the 2128 bits possible as output).
The second CMAC iteration uses the resulting key as the key to encrypt a string made
of publicly known data (an identifier depending on the function name being available or
not, the module name, and the transformation name).
Since the obfuscation key is only used as the key of the CMAC algorithm, it is
impossible for the attacker to derive it without actually breaking AES. Additionally, the
entropy provided by the key and the input string is effectively summarized by CMAC
into a smaller string which can be used as a key for CTR, as proved above.

55
7.4 CHAPTER 7. EVALUATION

7.4 Reversing the transformations

D
uring evaluation of the implemented transformations it was discovered that it
is possible to reverse each of them. The following sections describe a method
for doing so, although this method was not implemented, due to time con-
straints.
The objective is not to get back to the original code (doing so is most likely impos-
sible without breaking the CPRNG), but to gain a set of transformations that, when
applied to the original and the obfuscated assembly, will result in the same LLVM IR.
(Thus, if the obfuscated code is equal to the one previously provided, it will result in
the same IR code.) Furthermore, when the obfuscated assembly is different from the
original assembly, the resulting IR code will only be correspondingly different, allowing
an attacker to focus only on the vulnerability.
The possibility of reversing the transformations, though, depends on the possibility
of transforming the resulting assembly back into LLVM’s IR. A way to achieve this
is by modeling each instruction of the assembly language into one or more equivalent
instructions in LLVM’s IR, and then transforming register accesses into memory reads
and writes (which could be optimized later).
Currently, no known library is able to accomplish this, but it is reasonable to predict
that one may be developed in the future.

7.4.1 Defining a global ordering of values

Values can be constants or instruction results. Defining their ordering is important,


as it allows the deobfuscating program to define how to order the “contents” of an
instruction (i.e., the operands). The value ordering given the instructions I and J is
defined at algorithm 7.1.

7.4.2 Defining a global ordering of instructions

This is the core of reversing most of the code reordering transformations, as when
instructions can be ordered an ordering can also be created for the contents of basic
blocks and functions.
The instruction ordering given the instructions I and J in the same basic block is
defined at algorithm 7.2.

7.4.3 Reversing the randfun and randglb transformations

These can be reversed quite trivially by reordering globals and functions alphabeti-
cally or, when anonymous or with the same name, by their contents’ values.

56
7.4 CHAPTER 7. EVALUATION

7.4.4 Reversing the swapops transformation

This can be reversed (once an ordering for values is defined) by ordering the operands
of the instruction accordingly, if the instruction is a candidate for operand swapping.

7.4.5 Reversing the randins transformation

This transformation can be reversed by ordering the instructions according to the


global ordering in the basic block. At times, two instructions with exactly the same
contents may be found. If this happens, the instructions may be merged, resolving the
conflict.

7.4.6 Reversing the obfuscateconstants transformation

To reverse this transformation, the only thing that needs to be done is to pass a con-
stant calculation transformation, which will replace instructions by the constant values
they calculate and remove any unused global variables.

7.4.7 Reversing the flattencontrol transformation

To reverse this transformation, find the PHI node that chooses the destination of
the main basic block switch depending on the basic block which jumped to it. With
this, the unconditional jumps cand be replaced by the node chosen on the main basic
block, or, when using selection by a conditional basic block, by conditional the jumps
depending on the select condition valhe and the node that will be chosen in the main
basic block.
Afterwards, move the PHI nodes to the first basic block where they are used, accord-
ing to the CFG and a liveness analysis, and delete the main basic block.
Finally, apply the Unify Function Exit Nodes transformation to ensure both flow
graphs are equal.

7.4.8 Reversing the bbsplit transformation

To reverse this transformation, simply merge any two basic blocks, BB1 and BB2,
where BB1 has an unconditional jump to BB2, and BB2 has only BB1 as a predeces-
sor.

7.4.9 Reversing the randbb transformation

This transformation can be reversed through ordering the basic blocks by traversing
the CFG using breadth first search and choosing the basic blocks (when two or more are

57
7.4 CHAPTER 7. EVALUATION

available at the same level) according to the order in which their parents where chosen,
or, when the same parents are there, according to the contents of the basic block itself.
If the contents are the same, then the basic blocks may be merged, instead.
Since the contents may be different when reversed, this ordering may not result in
the same code on both sides, when the basic block contents are not the same. An
optimization pass can be used, though, to reduce these differences.

Algorithm 7.1 Global ordering of values


function valueOrdering(I, J)
if isConstant(I) and isConstant(J) then
return I < J . Normal ordering
else if isConstant(I) and not isConstant(J) then
return true . I goes before J
else if not isConstant(I) and isConstant(J) then
return false . J goes before I
else
return instructionOrdering(I,J) . Use the global ordering instead
end if
end function

Algorithm 7.2 Global ordering of instructions


function instructionOrdering(I, J)
if depends(I, J) or depends(J, I) then
return precedes(I, J) . Order by precedence if there are dependencies
else if operandCount(I) < operandCount(J) then
return true . Use operand count
else if operandCount(I) > operandCount(J) then
return false . Use operand count
else if opCode(I) < opCode(J) then
return true . Use opcode ordering
else if opCode(I) > opCode(J) then
return false . Use opcode ordering
else
for all IOP , JOP in pairs(getOperands(I), getOperands(J)) do
if IOP 6= JOP then
return valueOrdering(IOP , JOP ) . Order according to operands
end if
end for
end if
end function

58
7.5 CHAPTER 7. EVALUATION

7.5 Binary patch obfuscation technique evaluation

T
he proposed technique consists of obfuscating some of the functions of the code
being patched along with the patched function, choosing these extra functions
at random.
It is easy to see that if the attacker has no knowledge of which function was modified
and cannot reverse the transformations, he will need to analyze the mean of half of
the added functions before finding the changed functions, and analyze all of the added
functions before he can be certain no other functions are unmodified.
Additionally, if a function only introduces the security fix, then the probability of the
reverse engineer finding it after x attempts is inversely proportional to the number of
added functions and directly proportional to the number of attempts.
Sadly, an experiment to check how efficient the above obfuscation techniques are
could not be run, but, in theory, they should be as efficient as the original techniques
they are based upon. This lack of an experiment has made it impossible to measure the
amount of extra time that is required to analyze obfuscated functions.

59
8
Conclusions

W
e have developed a set of transformations which allow the focused obfusca-
tion of functions so that only these will be different on the resulting patch.
Such transformations have also been implemented into LLVM.
In the evaluation section, proof was provided that, given enough interest, the pro-
posed polymorphism transformations and obfuscation transformations can be reversed
or, when reversal is not possible, a similar transformation can be applied to both codes to
attain a minimal set of differences between the original and the modified code. A proof
that if the passes cannot be reversed, the difficulty of finding the security fix increases
proportionally to the number of extra obfuscated functions is also provided.
The results of the evaluation can be considered an example of the never-ending war
between researchers trying to elaborate better obfuscation techniques, and attackers
trying to reverse them. This situation will end either when a technique which cannot be
reversed is developed or when newer techniques cannot be created by developers. Sadly,
it currently appears as if the second possibility is more likely to happen than the first,
as the ways in which programs can be obfuscated are limited and human thinking can
adapt to read obfuscated code.

60
9
Future development

G
iven the time constraints of this project, many possible avenues could not be
explored in this project. The first task that should be performed in the future
is to improve the code quality of the transformations so that they can be
pushed onto the upstream LLVM.
In the constant obfuscation transformation, improvement can be made to the constant
memory fetch obfuscation by using Wang’s aliasing method and randomly reordering
the constant array. Another possible improvement to consider is that the Register Swap
could instead be performed over the resulting assembly by remapping registers (which
sadly are API-dependent). Also, A study of the efficiency of the transformations should
be run, although some preliminary tests hint of roughly a 6x slowdown. Finally, other
obfuscation techniques could be applied to render the resulting patches more difficult
for attackers to analyze.
As part of the development of this project, it was also discovered that some obfus-
cation techniques can be used to harden the resulting binaries against certain attacks,
with apparently negligible impact. In-depth research on this topic will be performed in
the near future.

61
Bibliography

[1] Arini Balakrishnan and Chloe Schulze. Code Obfuscation Literature Survey. 2005.
url: http://pages.cs.wisc.edu/~arinib/writeup.pdf.
[2] Boaz Barak et al. “On the (im)possibility of obfuscating programs”. In: Lecture
Notes in Computer Science. Springer-Verlag, 2001, pp. 1–18. url: http://citeseerx.
ist.psu.edu/viewdoc/download?doi=10.1.1.111.8717&rep=rep1&type=pdf.
[3] Eli Bendersky. Life of an instruction in LLVM. Nov. 2012. url: http://eli.
thegreenplace.net/2012/11/24/life-of-an-instruction-in-llvm/.
[4] Jean-Marie Borello and Ludovic Mé. “Code obfuscation techniques for metamor-
phic viruses”. English. In: Journal in Computer Virology 4.3 (2008), pp. 211–220.
issn: 1772-9890. doi: 10.1007/s11416-008-0084-2. url: http://dx.doi.org/
10.1007/s11416-008-0084-2.
[5] Frederick P. Brooks Jr. “No Silver Bullet Essence and Accidents of Software En-
gineering”. In: Computer 20.4 (Apr. 1987), pp. 10–19. issn: 0018-9162. doi: 10.
1109/MC.1987.1663532. url: http://dx.doi.org/10.1109/MC.1987.1663532.
[6] Chih-Fan Chen et al. CONFUSE: LLVM-based Code Obfuscation. May 2013. url:
http://www.cs.columbia.edu/~aho/cs4115_Spring-2013/lectures/13-05-
16_Team11_Confuse_Paper.pdf.
[7] S. Garg et al. “Candidate Indistinguishability Obfuscation and Functional Encryp-
tion for all Circuits”. In: Foundations of Computer Science (FOCS), 2013 IEEE
54th Annual Symposium on. Oct. 2013, pp. 40–49. doi: 10.1109/FOCS.2013.13.
url: http://eprint.iacr.org/2013/451.pdf.
[8] Anne Gille-Genest. Pseudo-Random Numbers Generators. Mar. 2012. url: https:
//quanto.inria.fr/pdf_html/mc_random_doc/#x1-40001.2.
[9] Brian Gladman. AES and Combined Encryption/Authentication Modes. 2014. url:
http://gladman.plushost.co.uk/oldsite/AES/index.php.

62
BIBLIOGRAPHY

[10] Sebastian Hack and Gerhard Goos. “Optimal Register Allocation for SSA-form
Programs in Polynomial Time”. In: Inf. Process. Lett. 98.4 (May 2006), pp. 150–
155. issn: 0020-0190. doi: 10 . 1016 / j . ipl . 2006 . 01 . 008. url: http : / /
citeseerx.ist.psu.edu/viewdoc/download;jsessionid=4E2D4510A92E0FD0AB2775F446362B18?
doi=10.1.1.204.2844&rep=rep1&type=pdf.
[11] Francisco Blas Izquierdo Riera. “Pint, herramienta de simulación basada en trazas
Pin”. MA thesis. Polytechnical University of Valencia, Dec. 2012. url: http://
hdl.handle.net/10251/18304.
[12] Francisco Blas Izquierdo Riera. The SIV mode of operation result in data leakage
with small messages (<= blocksize) when the authentication part of the key is
discovered and how to get data from CMAC. June 2011. url: http://seclists.
org/fulldisclosure/2011/Jun/382.
[13] Pascal Junod. Obfuscator-LLVM. 2013. url: https://github.com/obfuscator-
llvm/obfuscator/wiki.
[14] Pascal Junod. Obfuscator reloaded. Nov. 2012. url: http://crypto.junod.info/
asfws12_talk.pdf.
[15] Cullen Linn and Saumya Debray. “Obfuscation of Executable Code to Improve
Resistance to Static Disassembly”. In: Proceedings of the 10th ACM Conference
on Computer and Communications Security. CCS ’03. Washington D.C., USA:
ACM, 2003, pp. 290–299. isbn: 1-58113-738-9. doi: 10 . 1145 / 948109 . 948149.
url: http://doi.acm.org/10.1145/948109.948149.
[16] LLVM Project. CommandLine 2.0 Library Manual. Mar. 2014. url: http://llvm.
org/docs/CommandLine.html.
[17] LLVM Project. LLVM Language Reference Manual. 2014. url: http : / / llvm .
org/docs/LangRef.html.
[18] LLVM Project. Writing an LLVM Pass. 2014. url: http : / / llvm . org / docs /
WritingAnLLVMPass.html.
[19] Matias Madou, Ludo Van Put, and Koen De Bosschere. “LOCO: An Interactive
Code (De)Obfuscation Tool”. In: Proceedings of the 2006 ACM SIGPLAN Sympo-
sium on Partial Evaluation and Semantics-based Program Manipulation. PEPM
’06. Charleston, South Carolina: ACM, 2006, pp. 140–144. isbn: 1-59593-196-1.
doi: 10.1145/1111542.1111566. url: http://doi.acm.org/10.1145/1111542.
1111566.
[20] Matias Madou et al. Code (De)Obfuscation. June 2005. url: http : / / escher .
elis.ugent.be/publ/Edocs/DOC/P105_076.pdf.
[21] Matias Madou et al. “On the Effectiveness of Source Code Transformations for
Binary Obfuscation”. In: Proc. of the International Conference on Software En-
gineering Research and Practice (SERP06). 2006. url: https://biblio.ugent.
be/publication/374659/file/496495.pdf.

63
BIBLIOGRAPHY

[22] Jeongwook Oh. “Fight Against 1-day Exploits: Diffing Binaries vs Anti-diffing
Binaries”. In: Black Hat USA 2009 //Media Archives. Las Vegas, NE, USA, 2009,
p. 1. url: http://www.blackhat.com/presentations/bh-usa-09/OH/BHUSA09-
Oh-DiffingBinaries-PAPER.pdf.
[23] Axel ”0vercl0k” Souchet. Obfuscation of steel: meet my Kryptonite. July 2013.
url: http : / / download . tuxfamily . org / overclokblog / Obfuscation % 20of %
20steel%3a%20meet%20my%20Kryptonite/0vercl0k_Obfuscation_of_steel_
meet_kryptonite.pdf.
[24] StatCounter. Top 8 Operating Systems from Feb 2013 to Jan 2014. Feb. 2014. url:
http://gs.statcounter.com/#all-os-ww-monthly-201302-201401.
[25] StatCounter. Top 9 Browsers from Feb 2013 to Jan 2014. Feb. 2014. url: http:
//gs.statcounter.com/#all-browser-ww-monthly-201302-201401.
[26] Teja Tamboli. “Metamorphic Code Generation from LLVM IR Bytecode”. MA
thesis. San José State University, 2013. url: http://scholarworks.sjsu.edu/
cgi/viewcontent.cgi?article=1305&context=etd_projects.
[27] The Clang Team. “Clang” CFE Internals Manual. 2014. url: http : / / clang .
llvm.org/docs/InternalsManual.html#how-to-add-an-attribute.
[28] S.K. Udupa, S.K. Debray, and M. Madou. “Deobfuscation: reverse engineering
obfuscated code”. In: Reverse Engineering, 12th Working Conference on. Nov.
2005, 10 pp.–. doi: 10.1109/WCRE.2005.13. url: http://dx.doi.org/10.1109/
WCRE.2005.13.
[29] University of Illinois at Urbana-Champaign. LLVM API Documentation: llvm::Func-
tionPass Class Reference. Mar. 2014. url: http://llvm.org/docs/doxygen/
html/classllvm_1_1FunctionPass.html.
[30] University of Illinois at Urbana-Champaign. LLVM API Documentation: llvm::Mod-
ulePass Class Reference. Mar. 2014. url: http : / / llvm . org / docs / doxygen /
html/classllvm_1_1ModulePass.html.
[31] University of Illinois at Urbana-Champaign. LLVM API Documentation: llvm::Reg-
isterPass< passName > Struct Template Reference. Mar. 2014. url: http : / /
llvm.org/docs/doxygen/html/structllvm_1_1RegisterPass.html.
[32] Chenxi Wang. “A Security Architecture for Survivability Mechanisms”. PhD thesis.
Charlottesville, VA, USA: Faculty of the School of Engineering and Applied Science
at the University of Virginia, 2001. isbn: 0-493-08929-2. url: http://www.cs.
virginia.edu/~jck/publications/wangthesis.pdf.
[33] Gregory Wroblewski. “General Method of Program Code Obfuscation”. PhD thesis.
Wrocław University of Technology, 2002. url: http://citeseerx.ist.psu.edu/
viewdoc/download?doi=10.1.1.19.9052&rep=rep1&type=pdf.

64
BIBLIOGRAPHY

[34] Ilsun You and Kangbin Yim. “Malware Obfuscation Techniques: A Brief Survey”.
In: Broadband, Wireless Computing, Communication and Applications (BWCCA),
2010 International Conference on. Nov. 2010, pp. 297–300. doi: 10.1109/BWCCA.
2010.85. url: http://dx.doi.org/10.1109/BWCCA.2010.85.
[35] Dan Zimmerman. Got AES Performance? Oct. 2010. url: http://software.
intel.com/en-us/blogs/2010/10/14/got-aes-performance.

65
A
Using the tools

T
he project transformations are compiled into their own library at lib/Obf.so
and need to be loaded explicitly when using opt through the --load switch.
This is done mainly to keep the code isolated from the rest of the tools, making
it easier to integrate.
An example of how to load the library is shown at code listing A.1.

Code listing A.1 Loading the obfuscation library


1 $ opt --load ./ Release + Asserts /lib/Obf.so

opt takes as input an LLVM IR program and outputs another, transformed, one.
The transformations are applied in the order given. The following switches will add a
pass with each of the different transformations:
-addmodulekey Enables the transformation for adding a module key.
-bbsplit Enables the bbsplit transformation which randomly splits basic blocks.
-flattencontrol Enables the control flattening transformation which will flatten
the marked functions.
-obfuscateconstants Enables the constant obfuscation transformation.
-propagatemodulekey Enables the transformation for the module key propagation
to the module functions.
-randbb Enables the transformation for randomly reordering basic blocks.
-randfun Enables the transformation for randomly reordering functions.
-randglb Enables the transformation for randomly reordering globals.

66
A.0 APPENDIX A. USING THE TOOLS

-randins Enables the transformation for performing the dependence-based random


reordering of instructions.
-swapops Enables the transformation for randomly swapping instruction operands.
Additionally, some of the transformations have a set of tunable parameters, which
can be modified by using the following flags:
-modulekey string Defines the key inserted in the module by -addmodulekey.
-splitprobability probability Specifies the probability of splitting the basic
block at each instruction for -bbsplit.
-reobfuscationprobability probability Specifies the probability of reobfuscat-
ing a constant.
The order in which these transformations are run can affect the resulting code and the
effectiveness of the transformations. Thus, the following order is recommended:
1. Any optimization transformations
2. The -addmodulekey transformation
3. The -propagatemodulekey transformation
4. The -bbsplit transformation
5. The -flattencontrol transformation
6. The -obfuscateconstants transformation
7. The -randins -randbb -randfun -randglb and -swapops transformations
An example of a complete call to opt can be found at code listing A.2.

Code listing A.2 Using opt to obfuscate LLVM code


1 $ opt --load ./ Release + Asserts /lib/Obf.so -addmodulekey
-modulekey " Example ␣key" -propagatemodulekey -bbsplit
-splitprobability 1/8 -flattencontrol -obfuscateconstants
-reobfuscationprobability 1/5 -randins -randbb -randfun
-randglb -swapops < input.bc > output .bc

In order to work with clang, the -emit-llvm option and an extra call to clang for
linking must be added. The line at code listing A.3 contains the procedure to compile
and link a c file.
At code listing A.3, a pipeline from clang, to opt, and back to clang is gener-
ated.
The first call to clang compiles the provided C file, runs the level 3 standard opti-
mizations with -O3, and stops before linking with -c. It then emits the LLVM bytecode
with -emit-llvm and outputs it to the standard output with -o -.

67
A.0 APPENDIX A. USING THE TOOLS

Code listing A.3 Compiling an obfuscated binary using clang and opt
1 $ clang file.c -O3 -c -emit -llvm -o - | opt --load
./ Release + Asserts /lib/Obf.so -addmodulekey -modulekey " Example ␣
key" -propagatemodulekey -bbsplit -splitprobability 1/8
-flattencontrol -obfuscateconstants -reobfuscationprobability
1/5 -randins -randbb -randfun -randglb -swapops | clang -x ir -

The call to opt parses the generated bytecode, as explained above.


Finally, the last call to clang uses -x ir to specify that the input contains LLVM’s
IR (in bytecode form) and a single -, in order to make clang take input from the
standard input. This will link and compile the IR code generated by opt into the a.out
file. If object code is desired, then the -c option can be used, in a similar way. To specify
the desired output file, the -o option can be used.

68
B
Popularization

B.1 Programming computers

L
ike many other digital systems, computers work in binary, which is a language
where two clearly different values exist. These values are usually referred as 0
and 1, and each of them is considered a bit.
As these values by themselves allow for only two possible states, they can be grouped
to provide a larger array of possible values. For example, if two values are grouped, the
following four states can be obtained: 00, 01, 10 and 11. In a similar way, three values
yield eight different states and, in general, n values produce 2n states.
By themselves, these groups of values set to 0 or 1 are meaningless, but it is possible
to use them to encode data, such as the colors of an image or the amount of money that
you have in your bank account. This is done by providing a meaning to each of the bits
in the group, so that each of the two possible values will affect the meaning of the group
in one way or another.
You may also be aware that computers are programmable. This means that they use
some instructions to know what action they need to perform with the groups of bits they
utilize. These instructions are also encoded using groups of ones and zeros, so that the
computer knows, for example, that it needs to grab a value from a particular place or
add the values it can find in two places and put the result in a third place. The meaning
of these bit sequences is heavily dependent on the computer type, as different computer
designs interpret bits differently.

69
B.1 APPENDIX B. POPULARIZATION

B.1.1 Assembly

When humans want to give orders or transmit information to each other, they do not
say “01110010101”, but rather use words like “bring me water”. As a result of this, it is
difficult for humans to understand or speak directly in binary.
To fix this problem, assembly languages were created. An assembly language is a
compromise between the high level languages used by humans and the binary languages
used by machines. For example, on an Intel™ processor (like the one on most desktops)
the code “0000000011000011” means “add the value of the bits in the locations al and
bl and store the result in the location al”, which would be written in assembly as “add
%al,%bl”.
As stated above, though, machines have no way of understanding that “add %al,%bl”
is equivalent to “0000000011000011” without a special translation program. This pro-
gram is called an assembler. Similarly, some situations require a program to decode
“0000000011000011” into “add %al,%bl”. Such a program is known as a disassem-
bler.
Assembly languages usually allow for some small levels of abstraction, which permits
humans to more easily understand what is being coded. For example, these languages
may allow comments to be added, explaining what different parts of the code do or
adding descriptive names to different locations where information can be stored for pro-
cessing. When converting the program into a set of instructions written in binary (which
the machine can understand), this information is discarded, as it is unnecessary for the
machine (and in many cases it cannot be encoded anyways). As a result, when the disas-
sembler converts the binary instructions back into assembly language, this information
will be missing, making the code more difficult to understand.
The main advantage of assembly is that it allows the programmer to use a language
which is easier for them to understand whilst still exposing all of the capacities of the
machine. In exchange, however, some time needs to be invested in translating the
program into the language that the machine understands, although this only needs to
be done once. As mappings can be created both ways, it is also possible to convert the
original binary code back into an assembler instruction, which qualified individuals can
then more easily read.
The main disadvantages of this process is that assembly is still difficult for humans to
understand (as they need to imagine the machine performing the instructions in order
to visualize what the code actually does), and each machine uses a different assembly
language, as they each have different characteristics to expose.

B.1.2 Programming languages

In order to overcome the disadvantages of assembly languages, programming lan-


guages were created.

70
B.1 APPENDIX B. POPULARIZATION

A programming language is a construct which abstracts the details of the machine


(allowing the same code to be used on machines with different features) and attempts to
provide an interface which is easier for humans to work with. The costs of using program-
ming languages are a greater processing time and the necessity of writing more complex
programs which are capable of converting programs written using these languages into
binary programs that the machine can understand.
There are multiple paradigms which are differentiated mostly by the way in which
they allow the programmer to model the program.
As more abstraction is added, the resources provided by the computer are used less
efficiently, but the program is more easily transferable to other computers. Additionally,
a simpler interface is provided, which allows the programmer to model the program in a
language closer to that of the problem he is trying to solve. The alternate also applies,
as more and more details of the machine are provided by the language.

71
B.2 APPENDIX B. POPULARIZATION

B.2 Compilers

A
s stated above, machines utilize binary languages which tell them exactly what
to do to solve a problem. Thus, a program capable of converting the instruc-
tions provided in more abstract programming languages into a binary program
is needed. This program is known as a compiler.
In general, compilers do not perform this transformation into machine language di-
rectly, as the result would be very complex programs that would execute everything
at the same time. Instead, they go over a set of stages which convert more abstract
languages into either the same language or a less abstract one. This new program will
keep the same meaning as the original.
For example, a program compiled using the LLVM suite would first be converted
into an assembly like language that hides most of the limitations of real machines, then
optimized into a faster program in this same language, converted into a representation
where operands indicate their operators, optimized again into a faster program using this
representation, and rewritten again so that the final representation matches the limits
imposed by the destination machine. Finally, this representation would be converted
into assembly language and passed to an assembler that converts that program into
binary language.
During the compilation process, compilers, in a way similar to assemblers, discard the
information that is not needed by the machine to understand the program. This, along
with the fact that there is no direct mapping from the machine instructions to the original
abstract representation, makes it more difficult to recover the original program (minus
the discarded information) from the binary one provided. Despite these limitations, some
programs are able to recognize patterns on the code generated for different structures
by compilers. These kind of programs are called decompilers.

72
B.3 APPENDIX B. POPULARIZATION

B.3 Antireverse engineering

R
everse engineering is the process of taking apart the different pieces that make
a product in order to understand how the product works. In a similar way,
when applied to a programming context, it is the process of analyzing the
machine code contained in a program in order to understand what it does and
how it does it.
Reverse engineering has many uses: understanding how a program works, under-
standing the results it produces to interpret them or generate equivalent results with
your own program, modifying the program to override code in order to enforce license
restrictions, or even detecting flaws in the program that can be exploited.
Because of the possibility of performing reverse engineering, programmers try to make
it difficult for others to understand the machine code produced after compilation. There
are different ways of doing this.
One way of making programs harder to reverse engineer consists on removing any
unnecessary, human-understandable information which could be contained in the pro-
gram. This process is generally known as stripping, as the program is “stripped” to the
minimum necessary to be executed by the machine.
Another method is to thwart the efforts of decompilers or disassemblers by exploiting
some properties of the machine code. Regardless, neither this nor the previous technique
will stop people who can understand machine code, and thus the program.
A final method of making the resulting machine code more difficult for humans to
understand is the process known as obfuscation. This usually comes with a price, as it
may result in larger and slower programs. Also, as shown by [2], some programs cannot
be obfuscated.

B.3.1 Obfuscation techniques

As interest in the area of obfuscation grew, more and more techniques to make pro-
grams difficult to understand were developed. Similarly, more and more effective meth-
ods of reversing obfuscations were created. This has resulted in an arms race, wherein
one group develops stronger techniques and the other stronger methods to override
them.
Some of these techniques focus simply on making the program different from the
original one. These techniques are called polymorphism techniques, as they morph the
program into different shapes.
Other techniques try to modify the structure of the program to make the resulting
code more difficult to read.

73
B.3 APPENDIX B. POPULARIZATION

B.3.1.1 Control flattening

Some instructions tell the computer to continue executing instructions which are not
next in order. These instructions can be executed only when certain conditions are met
which allows for the creation of loops that will repeat the same sequence of instructions
either forever, or until a condition is met.
Control flattening works by replacing all of these instructions by a jump to the same
sequence of instructions. This sequence will then jump to the originally intended desti-
nation based on the information passed before the jump.
To explain the results of such an operation, this recipe will be obfuscated:
1. Put 4 eggs in an empty dish.
2. Add ½ glass of oil to that dish.
3. Put 500 grams of flour in an empty bowl.
4. Add 1 glass of water to the bowl.
5. Add a spoonful of yeast to the bowl.
6. Add the contents of the dish to the bowl.
7. Knead the mix.
8. If the mix is not a consistent dough, then repeat step 7.
9. If the dough has not risen, repeat step 9.
10. Start the oven at a temperature of 180º.
11. If the oven is not at 180º, repeat step 11.
12. Put the dough in the oven.
13. If the bread is not baked, repeat step 13.
14. Remove the bread from the oven.
15. Turn off the oven.
16. You are done.
As is apparent, this recipe is merely a set of simple steps to bake bread using an
oven. A computer running a program operates is similar to a human following the steps
of a recipe. Now, if the recipe was obfuscated using control flattening, it would look like
this:
1. Put 4 eggs in an empty dish.
2. Add ½ glass of oil to that dish.
3. Put 500 grams of flour in an empty bowl.
4. Add 1 glass of water to the bowl.

74
B.3 APPENDIX B. POPULARIZATION

5. Add a spoonful of yeast to the bowl.


6. Add the contents of the dish to the bowl.
7. Knead the mix.
8. If the mix is not a consistent dough, then write 1 on a paper. Otherwise write 2.
9. Go to step 21.
10. If the dough has not risen, then write 3 on a paper. Otherwise write 4.
11. Go to step 21.
12. Start the oven at a temperature of 180º.
13. If the oven is not at 180º, then write 5 on a paper. Otherwise write 6.
14. Go to step 21.
15. Put the dough in the oven.
16. If the bread is not baked, then write 7 on a paper. Otherwise write 8.
17. Go to step 21.
18. Remove the bread from the oven.
19. Turn off the oven.
20. You are done.
21. If the paper says 1 go to step 7.
22. If the paper says 2 go to step 10.
23. If the paper says 3 go to step 10.
24. If the paper says 4 go to step 12.
25. If the paper says 5 go to step 13.
26. If the paper says 6 go to step 15.
27. If the paper says 7 go to step 16.
28. If the paper says 8 go to step 18.
The result of this transformation is that it is more difficult for a reverse engineer to
understand how instructions flow inside the program, as they will first see that all of the
jumps go to the same place, and from there to the other instructions.

B.3.1.2 Constant obfuscation

Many programs need constant values to work. As an example, if you want to calculate
the price of a product including a fixed amount of taxes, you would need to know
the amount of tax in order to add it to the original price. The idea behind constant
obfuscation is to make these values harder to find.

75
B.3 APPENDIX B. POPULARIZATION

For example, imagine that you make a program which will add 4 to the value received
as input. You could do this simply by adding 4, or by adding the result of (2−1+5)×2
3 . The
second option is obviously harder to understand. Similarly, instead of directly adding 4,
you could add the result of fetching information from a particular memory address that
you know will return 4.
Using the previous recipe as an example, such a transformation appears as fol-
lows:
1. Write ½ on a paper.
2. Put (2−1+5)×2
3 eggs in an empty dish.
3. Add the amount on the paper glass of oil to that dish.
4. Put 4000 − 7 × 500 grams of flour in an empty bowl.
5. Add 8
23
glass of water to the bowl.
6. Add 24+6×2 spoonful of yeast to the
3×12
bowl.
7. Add the contents of the dish to the bowl.
8. Knead the mix.
9. If the mix is not a consistent dough, then repeat step 8.
10. If the dough has not risen, repeat step 10.
11. Start the oven with a temperature of 180º.
12. If the oven is not at 180º, repeat step 12.
13. Put the dough in the oven.
14. If the bread is not baked, repeat step 14.
15. Remove the bread from the oven.
16. Turn off the oven.
17. You are done.
This technique aims to make values which are constant during the execution of the
program more difficult to read, thus making it harder to find these points and use them
as references to understand how the program works.

B.3.1.3 Register swap

In computers, some places where binary information is stored have special meanings,
such as the place where the next instruction to be executed can be found or the color
that needs to be put in a particular place of your screen. The majority of these locations,
though, have no meaning other than the one given by the program being run.

76
B.3 APPENDIX B. POPULARIZATION

The technique known as register swapping randomly alters the meaning of pairs
of these otherwise meaningless locations within the program, resulting in a different
program.
Using the previous recipe as an example, the result would be:
1. Put 4 eggs in an empty bowl.
2. Add ½ glass of oil to the bowl.
3. Put 500 grams flour in an empty dish.
4. Add 1 glass of water to the dish.
5. Add a spoonful of yeast to the dish.
6. Add the contents of the bowl to the dish.
7. Knead the mix.
8. If the mix is not a consistent dough, then repeat step 7.
9. If the dough has not risen, repeat step 9.
10. Start the oven with a temperature of 180º.
11. If the oven is not at 180º, repeat step 11.
12. Put the dough in the oven.
13. If the bread is not baked, repeat step 13.
14. Remove the bread from the oven.
15. Turn off the oven.
16. You are done.
As you can see, the ingredients that would be in the dish were changed with those
in the bowl. The change may not seem meaningful at first, but a careful check reveals
that 6 out of the 16 instructions of the recipe were altered.
The result of this technique is code that is different every time it is compiled, thus
making it more difficult for a reverse engineer to uncover the differences.

B.3.1.4 Instruction Reordering

The last of the applied techniques consists of randomly reordering the instructions a
program executes, if they have no dependencies.
Using the same recipe from above, this transformation would appear as follows:
1. Add ½ glass of oil to an empty dish.
2. Put 4 eggs in the dish.
3. Add 1 glass of water to an empty bowl.

77
B.3 APPENDIX B. POPULARIZATION

4. Add the contents of the dish to the bowl.


5. Add a spoonful of yeast to the bowl.
6. Put 500 grams flour in the bowl.
7. Knead the mix.
8. If the mix is not a consistent dough, then repeat step 7.
9. If the dough has not risen, repeat step 9.
10. Start the oven with a temperature of 180º.
11. If If the oven is not at 180º, repeat step 11.
12. Put the dough in the oven.
13. If the bread is not baked, repeat step 13.
14. Turn off the oven.
15. Remove the bread from the oven.
16. You are done.
A similar process can be performed by reordering the sequences of instructions which
will always be executed in the same order. The recipe could then be modified to look
like this:
1. Go to step 8.
2. Remove the bread from the oven.
3. Turn off the oven.
4. You are done.
5. Start the oven with a temperature of 180º.
6. If the oven is not at 180º, repeat step 6.
7. Go to step 15.
8. Put 4 eggs in an empty dish.
9. Add ½ glass of oil to the dish.
10. Put 500 grams flour in an empty bowl.
11. Add 1 glass of water to the bowl.
12. Add a spoonful of yeast to the bowl.
13. Add the contents of the dish to the bowl.
14. Go to step 20.
15. Put the dough in the oven.
16. If the bread is not baked, repeat step 16.

78
B.3 APPENDIX B. POPULARIZATION

17. Go to step 2.
18. If the dough has not risen, repeat step 18.
19. Go to step 5.
20. Knead the mix.
21. If the mix is not a consistent dough, then repeat 20.
22. Go to step 18.
The result of this technique is a program where the order of the elements is changed
every time, thus making it more difficult to find differences from the original pro-
gram.

B.3.2 Focused obfuscation

To allow for some balance between the penalties introduced by obfuscation and the
benefits it provides by making programs harder to reverse engineer, obfuscation tech-
niques are focused in only some parts of the program.
This provides certain benefits. First, only the obfuscated parts of the program will
change. (Thus, less changes are sent to the user when he needs to update the program to
a new version). Second, only the obfuscated parts will receive the penalties introduced
by the obfuscation techniques (thus reducing the total impact). Finally, by using a secret
number to define how the techniques will be applied to different parts of the program (or
not applied at all), it is possible to always generate the same program, making updating
previously obfuscated programs easier, as the parts using the same number will remain
the same.
The main drawback of this technique is that by focusing the obfuscation on particular
parts of the program, the attacker can concentrate their efforts on those sections, as
they will expect relevant aspects of the code to be there. This problem can be solved by
choosing many irrelevant parts of the program to also be obfuscated.

B.3.3 Compiler-level obfuscation

It is possible to create compilers which will obfuscate programs as they process them.
These compilers provide certain advantages.
Firstly, such compilation simplifies the process for the user of the compiler, as they
simply need to tell the compiler to obfuscate the program.
Additionally, the obfuscation technique can be applied in a machine independent lan-
guage, so the obfuscation code can be used across machines using different designs.
Fianlly, this compiler simplifies the process of focusing the obfuscation techniques, as
the user can simply mark the structures that should be obfuscated.

79
B.3 APPENDIX B. POPULARIZATION

The problem is that some obfuscation techniques rely on features which are not mod-
eled by the language used when the obfuscation is applied. As a result, these techniques
cannot be implemented using that language, although they may be implemented in one
which is closer to the language that the computer understands.

80
B.4 APPENDIX B. POPULARIZATION

B.4 Deobfuscation

I
t is possible to reverse obfuscation techniques that have been implemented in a more
or less automated manner. Even though tools to do this do not yet exist, they may
be developed in the near future, as interest in their creation increases. Because
of this, care should be taken when looking for new developments in the field of
research before using one technique or another, as they may only introduce performance
penalties without providing any benefit.

B.4.1 Control unflattening

The idea behind this technique is to find the code block where the real control flow
of the program is decided, and then to move these decisions to the jumps to that code
block. As the decisions are constant values, it is reasonably possible to do this automat-
ically.

B.4.2 Constant deobfuscation

The idea used in this case is that, as constants will keep the same value during the
whole execution, they can be calculated once found, thus returning the original values
instead of the obfuscated ones.

B.4.3 Register swap

If you define a way to order the places where information is stored based on the
instruction being executed and the dependencies amongst instructions, you can then
order all of the possible places where information can be stored and assign them based
on the ordering you defined. By doing this on the original and the patched program you
should end with barely similar programs.

B.4.4 Instruction Reordering

By using the previously defined ordering, you can also order the instructions in both
the original and the modified program, thus reducing the amount of changes between
them.

81
B.5 APPENDIX B. POPULARIZATION

B.5 Program updates

D
evelopers may have many reasons for creating new versions of programs and
sending them to the users of that program. In some cases, they may have
added new features to the program, while in others they may have fixed errors
that were discovered. In some situations, these errors are reasonably harmless,
but when they can be used by a third party to make the program behave in an undesired
way they are considered security vulnerabilities.
Programmers can simply send the updated version of the program to its users, but
this is generally inefficient, as most of the program will remain the same, and can even
be problematic if the program is quite large. To prevent this, instructions explaining
how to create the new version of the program from the older one are sent instead. These
instructions are usually known as a patch, and they can be applied automatically by a
special program.
Using patches comes with some drawbacks. First of all, the new version of the
program has to be similar enough to the old version for the patch to be smaller than the
final program. For example, if polymorphic techniques are applied, the whole program
would change, making this process inefficient.
Another drawback is that an attacker can focus on the changes introduced by the
patch to discover what security vulnerabilities were fixed and once found, exploit them
on the users who have not yet updated the program. This kind of attack is known as
1-day exploit, as it is done after the updated program is released.

82
B.6 APPENDIX B. POPULARIZATION

B.6 Practical example

S
uppose a developer is reporting a security vulnerability in a program they devel-
oped. He fixes the issue and prepares a patch. In order to prevent an attacker
from simply looking at the changes introduced to patch the program, the devel-
oper could obfuscate the section of the program that needed to be updated.
As stated above, the attacker can still focus their efforts on the parts of the program
that were modified, even if obfuscated, so the developer then decides to also obfuscate
and modify also other parts of the program. Thus,the whole program is not changed,
but the attacker now needs to read a mean of half of the changes introduced before he
can find the one which fixes the vulnerability.
Obviously, these techniques do not prevent the attacker from eventually finding the
error being patched and potentially exploiting it in the computers of those who have not
upgraded the program, but they can delay the attacker and, by doing so, allow more
users to update the program before the attacker can abuse the vulnerability.
As you probably have inferred from the previous chapters, the use of obfuscation
techniques can make the program less efficient. Thus, the developer should release
a non-obfuscated patch after enough time for the users to upgrade the program has
passed.

83
C
Source code

C.1 Patches to LLVM

1 --- lib/ Transforms / Makefile ( revision 192535)


2 +++ lib/ Transforms / Makefile ( working copy)
3 @@ -9,6 +9,6 @@
4
5 LEVEL = ../..
6 -PARALLEL_DIRS = Utils Instrumentation Scalar InstCombine IPO
,→Vectorize Hello ObjCARC
7 + PARALLEL_DIRS = Utils Instrumentation Scalar InstCombine IPO
,→Vectorize Hello ObjCARC Obf
8
9 include $(LEVEL)/ Makefile . config

84
C.2 APPENDIX C. SOURCE CODE

C.2 Patches to Clang


1 --- tools /clang / include /clang /Basic /Attr.td ( revision 192535)
2 +++ tools /clang / include /clang /Basic /Attr.td ( working copy)
3 @@ -565,6 +565 ,12 @@
4 let Subjects = [ ParmVar ];
5 }
6
7 +def ObfKey : InheritableAttr {
8 + let Spellings = [GNU <" obfkey ">, GNU <" obfuscation_key ">,
,→GNU <" obfuscationkey ">, GNU <" ObfuscationKey ">];
9 + let Subjects = [ Function ];
10 + let Args = [ StringArgument <" Key", 1>];
11 +}
12 +
13 def ObjCException : InheritableAttr {
14 let Spellings = [GNU <" objc_exception ">];
15 }
16 --- tools /clang /lib/ CodeGen / CodeGenModule .cpp ( revision 192535)
17 +++ tools /clang /lib/ CodeGen / CodeGenModule .cpp ( working copy)
18 @@ -626,6 +626 ,10 @@
19 B. addAttribute (llvm :: Attribute :: Cold);
20 }
21
22 + // HACK: maybe there is a better way to do this
23 + if (const ObfKeyAttr *OKA = D->getAttr < ObfKeyAttr >())
24 + B. addAttribute (" ObfuscationKey ",OKA -> getKey ());
25 +
26 if (D->hasAttr < MinSizeAttr >())
27 B. addAttribute (llvm :: Attribute :: MinSize );
28
29 --- tools /clang /lib/Sema/ SemaDeclAttr .cpp ( revision 192535)
30 +++ tools /clang /lib/Sema/ SemaDeclAttr .cpp ( working copy)
31 @@ -2759 ,6 +2759 ,17 @@
32 ParmType ,
,→Attr. getLoc ()));
33 }
34
35 + static void handleObfKeyAttr (Sema &S, Decl *D, const AttributeList
,→&Attr) {
36 + // Make sure that there is a string literal as the sections 's
,→single
37 + // argument .
38 + StringRef Str;
39 + SourceLocation LiteralLoc ;
40 + if (!S. checkStringLiteralArgumentAttr (Attr , 0, Str , & LiteralLoc ))
41 + return ;
42 +
43 + D-> addAttr (:: new (S. Context ) ObfKeyAttr (Attr. getLoc () , S.Context ,

85
C.2 APPENDIX C. SOURCE CODE

,→Str , Attr. getAttributeSpellingListIndex ()));


44 +}
45 +
46 SectionAttr *Sema :: mergeSectionAttr (Decl *D, SourceRange Range ,
47 StringRef Name ,
48 unsigned AttrSpellingListIndex )
,→{
49 @@ -4646 ,6 +4657 ,7 @@
50 case AttributeList :: AT_InitPriority :
51 handleInitPriorityAttr (S, D, Attr); break;
52
53 + case AttributeList :: AT_ObfKey : handleObfKeyAttr (S, D,
,→Attr); break ;
54 case AttributeList :: AT_Packed : handlePackedAttr (S, D,
,→Attr); break;
55 case AttributeList :: AT_Section : handleSectionAttr (S, D,
,→Attr); break;
56 case AttributeList :: AT_Unavailable :

86
C.3 APPENDIX C. SOURCE CODE

C.3 Obf library

src/Obf/Utils.h
1
2 # ifndef LLVM_OBF_UTILS_H
3 # define LLVM_OBF_UTILS_H
4 # include "llvm/IR/ Function .h"
5 # include "llvm/IR/ Module .h"
6 # include "llvm/ADT/ StringRef .h"
7 # include "llvm/ Support / CommandLine .h"
8 # include <algorithm >
9 # include <cstdlib >
10 # include <cstring >
11 # include <cstdint >
12 # include "aes.h"
13
14

15 namespace Obf {
16 // Utilities for the Obfuscation transformations
17 // These involves mainly things like randomness generators and
,→vector randomization
18

19 // This is the base class of a PRNG , includes some interesting


,→functions
20 class PRNG_base {
21 protected :
22 // Minimal base implementation , generates a string of data
23 virtual void get_randoms (char *data , size_t len) = 0;
24 public :
25 virtual ~ PRNG_base () {}
26 // Get a random integer
27 template <class int_t > int_t get_randomi ( int_t end) {
28 int_t res;
29 get_randoms (( char *)&res , sizeof (int_t ));
30 res %= end;
31 // Negative modulos need to be normalized
32 res = abs(res);
33 return res;
34 }
35 template <class int_t > bool get_randomb ( int_t num , int_t den)
,→ {
36 int_t rnd = get_randomi (den);
37 bool rv = rnd < num;
38 return rv;
39 }
40 template <class int_t > int_t get_randomr ( int_t begin , int_t
,→end) {
41 return begin + ( get_randomi (end -begin));

87
C.3 APPENDIX C. SOURCE CODE

42 }
43 uint64_t rand64 () {
44 return get_randomi (( uint64_t ) UINT64_MAX );
45 }
46 // Randomly rearrange the elements of a vector , uses swaps
,→when available
47 template <class RandomAccessIterator > void randomize_vector (
,→RandomAccessIterator first , RandomAccessIterator last) {
48 RandomAccessIterator rfirst = first;
49 while (last != first) {
50 RandomAccessIterator relem = get_randomr (first ,last);
51 assert ( rfirst <= first && first < last && rfirst <=
,→relem && first <= relem && relem <= last);
52 if (first != relem)
53 std :: swap (* first , *relem);
54 first ++;
55 }
56 }
57 };
58
59 // Don 't use , it is weak!
60 class PRNG_rand : public PRNG_base {
61 protected :
62 virtual void get_randoms (char *data , size_t len);
63 public :
64 virtual ~ PRNG_rand () {}
65 };
66
67 class CPRNG_AES_CTR : public PRNG_base {
68 aes_encrypt_ctx cx;
69 unsigned char iv[ AES_BLOCK_SIZE ];
70 protected :
71 virtual void get_randoms (char *data , size_t len);
72 public :
73 CPRNG_AES_CTR (const llvm :: Function &F, llvm :: StringRef gref)
,→;
74 CPRNG_AES_CTR (const llvm :: Module &M, llvm :: StringRef gref);
75 static llvm :: StringRef get_obf_key ( const llvm :: Function &F);
76 static llvm :: StringRef get_obf_key ( const llvm :: Module &M);
77 static void set_obf_key (llvm :: Function &F, llvm :: StringRef
,→key);
78 static void set_obf_key (llvm :: Module &M, llvm :: StringRef key)
,→;
79 static bool has_obf_key ( const llvm :: Function &F) {
80 return ! get_obf_key (F).empty ();
81 }
82 static bool has_obf_key ( const llvm :: Module &M) {
83 return ! get_obf_key (M).empty ();
84 }

88
C.3 APPENDIX C. SOURCE CODE

85 virtual ~ CPRNG_AES_CTR () {}
86 };
87
88 class Probability {
89 uint64_t num;
90 uint64_t den;
91 public :
92 Probability () : num (0) , den (1) {
93 }
94 Probability ( uint64_t num , uint64_t den) : num(num), den(den)
,→{
95 }
96 inline void set( uint64_t num , uint64_t den) {
97 this ->num = num; this ->den = den;
98 }
99 inline bool roll( PRNG_base &prng) const {
100 return prng. get_randomb (num ,den);
101 }
102 inline bool rolldiv ( PRNG_base &prng , uint64_t div) const {
103 return prng. get_randomb (num ,den*div);
104 }
105 };
106 struct ProbabilityParser : public llvm ::cl:: parser < Probability > {
107 // parse - Return true on error .
108 bool parse(llvm ::cl:: Option &O, llvm :: StringRef ArgName , const
,→std :: string &ArgValue , Probability &Val);
109 };
110
111 };
112 # endif

src/Obf/Utils.cpp
1 # include <cinttypes >
2 # include <cstring >
3 # include "Utils .h"
4 # include "cmac.h"
5 # include "llvm/IR/ Metadata .h"
6 # include "llvm/IR/ Attributes .h"
7 # include "llvm/ Support / raw_ostream .h"
8

9 // Utilities for the Obfuscation transformations


10 // These involve mainly things like randomness generators and vector
,→randomization
11 namespace Obf {
12 # define emptystringref llvm :: StringRef ()
13 static const char * ObfKeyMDName = " ObfuscationKey ";
14 static const char * ObfKeyAttrName = " ObfuscationKey ";
15 static const unsigned char nchar = '\0 ';
16 // Create a key for use with the tag generation algorythm

89
C.3 APPENDIX C. SOURCE CODE

17 // This is CMAC_cmackey (kid || keydata ) where kid is the key type (1


,→ for function 2 for modules ) and keydata the keydata
18
19 static void make_cmac_key ( unsigned char kid , llvm :: StringRef
,→keydata , unsigned char*key) {
20 assert (! keydata .empty () && "The␣ obfuscation ␣key␣ shouldn 't␣be␣
,→empty ");
21 const static unsigned char cmac_key [16] = {0xab ,0xad ,0xce ,0
,→xba ,0xda ,0xbe ,0xbe ,0xfa ,0xba ,0xda ,0xac ,0xab ,0xac ,0xab ,0xec ,0 xea };
22 cmac_ctx ctx;
23 cmac_init (cmac_key ,& ctx);
24 cmac_data (&kid ,1,& ctx);
25 cmac_data (( const unsigned char *) keydata .data (),keydata .size
,→() ,&ctx);
26 cmac_end (key ,& ctx);
27 }
28
29 static void make_zero_iv ( unsigned char*iv) {
30 memset (iv ,0, AES_BLOCK_SIZE );
31 }
32
33 static void make_tag ( unsigned char tid , const unsigned char *key ,
,→ llvm :: StringRef mname , llvm :: StringRef fname , llvm :: StringRef gref
,→, unsigned char *tag) {
34 cmac_ctx ctx;
35 // Get the key
36 cmac_init (( const unsigned char *)key ,& ctx);
37 cmac_data (&tid ,1,& ctx);
38 cmac_data (( const unsigned char *) mname.data (),mname.size () ,&
,→ctx);
39 cmac_data (& nchar ,1,& ctx);
40 if (! fname.empty ()) {
41 cmac_data (( const unsigned char *) fname.data (),fname.size
,→() ,&ctx);
42 cmac_data (& nchar ,1,& ctx);
43 }
44 assert (! gref.empty () && "The␣ transformation ␣tag␣ shouldn 't␣be␣
,→empty ");
45 cmac_data (( const unsigned char *) gref.data (),gref.size () ,&
,→ctx);
46 cmac_data (& nchar ,1,& ctx);
47 cmac_end (tag ,& ctx);
48 }
49
50 llvm :: StringRef CPRNG_AES_CTR :: get_obf_key ( const llvm :: Function &
,→F) {
51 if(!F. hasFnAttribute ( ObfKeyAttrName ))
52 return emptystringref ;
53 llvm :: Attribute attr = F. getFnAttribute ( ObfKeyAttrName );

90
C.3 APPENDIX C. SOURCE CODE

54 if (! attr. isStringAttribute ())


55 return emptystringref ;
56 return attr. getValueAsString ();
57 }
58
59 void CPRNG_AES_CTR :: set_obf_key (llvm :: Function &F, llvm ::
,→StringRef key){
60 // Replace it
61 F. addFnAttr ( ObfKeyAttrName ,key);
62 }
63
64 llvm :: StringRef CPRNG_AES_CTR :: get_obf_key ( const llvm :: Module &M)
,→{
65 llvm :: NamedMDNode *nm = M. getNamedMetadata ( ObfKeyMDName );
66 if (!nm || nm -> getNumOperands () != 1)
67 return emptystringref ;
68 llvm :: MDNode *md = nm -> getOperand (0);
69 if (!md || md -> getNumOperands () != 1)
70 return emptystringref ;
71 llvm :: MDString * mds = llvm :: dyn_cast_or_null <llvm :: MDString
,→>(md -> getOperand (0));
72 if (! mds)
73 return emptystringref ;
74 return mds -> getString ();
75 }
76
77 void CPRNG_AES_CTR :: set_obf_key (llvm :: Module &M, llvm :: StringRef
,→key){
78 // First we have to delete the current metadata
79 llvm :: NamedMDNode *om = M. getNamedMetadata ( ObfKeyMDName );
80 if (om) {
81 M. eraseNamedMetadata (om);
82 }
83 llvm :: NamedMDNode *nm = M. getOrInsertNamedMetadata (
,→ObfKeyMDName );
84 assert (nm && " Named ␣ Metadata ␣node␣not␣ created ");
85 assert (nm -> getNumOperands () == 0 && " Named ␣ Metadata ␣node␣not␣
,→deleted ");
86 llvm :: MDString * mds = llvm :: MDString :: get(M. getContext (),key
,→);
87 assert (mds && " MDString ␣not␣ created ");
88 llvm :: MDNode *md = llvm :: MDNode :: get(M. getContext (),llvm ::
,→ArrayRef <llvm :: Value *>( mds));
89 assert (md && " MDNode ␣not␣ created ");
90 nm -> addOperand (md);
91 }
92
93 // Create a AES_CTR CPRNG object for use in a function
94 // The encryption key is created hashing with the CMAC algorithm :

91
C.3 APPENDIX C. SOURCE CODE

95 // 1|| ModuleName ||0|| FunctionName ||0|| ObfModuleName ||0


96 CPRNG_AES_CTR :: CPRNG_AES_CTR ( const llvm :: Function &F, llvm ::
,→StringRef gref) {
97 unsigned char nk [16];
98 unsigned char ck [16];
99 llvm :: StringRef ok = get_obf_key (F);
100 assert (!ok.empty () && "No␣ obfuscation ␣key␣ found ");
101 llvm :: StringRef mname = F. getParent () -> getModuleIdentifier ();
102 llvm :: StringRef fname = F. getName ();
103 make_cmac_key (1,ok ,ck);
104 make_tag (1,ck ,mname ,fname ,gref ,nk);
105 aes_encrypt_key128 (nk ,&cx);
106 make_zero_iv (iv);
107 }
108
109 // Create a AES_CTR CPRNG object for use in a module
110 // The encryption key is created hashing with the CMAC algorithm :
111 // 2|| ModuleName ||0|| ObfModuleName ||0
112 CPRNG_AES_CTR :: CPRNG_AES_CTR ( const llvm :: Module &M, llvm ::
,→StringRef gref) {
113 unsigned char nk [16];
114 unsigned char ck [16];
115 llvm :: StringRef ok = get_obf_key (M);
116 assert (!ok.empty () && "No␣ obfuscation ␣key␣ found ");
117 llvm :: StringRef mname = M. getModuleIdentifier ();
118 llvm :: StringRef fname = emptystringref ;
119 make_cmac_key (2,ok ,ck);
120 make_tag (2,ck ,mname ,fname ,gref ,nk);
121 aes_encrypt_key128 (nk ,&cx);
122 make_zero_iv (iv);
123 }
124
125 void PRNG_rand :: get_randoms (char *data , size_t len) {
126 for( size_t i=0; i < len ; i++) {
127 data[i]= rand ();
128 }
129 }
130
131 //We can generate up to 16 bytes of random data per call , we
,→generate only half of them to make
132 // finding the key or the plain text harder in the unlikely case
,→AES is broken
133 void CPRNG_AES_CTR :: get_randoms (char *data , size_t len) {
134 size_t i = 0;
135 while ( i < len ) {
136 unsigned char buf[ AES_BLOCK_SIZE ];
137 AES_RETURN rv;
138 rv = aes_ctr_pad (buf , iv , &cx);
139 assert (rv == EXIT_SUCCESS && " Failure ␣ generating ␣ pseudo ␣

92
C.3 APPENDIX C. SOURCE CODE

,→random ␣data");
140 if (len -i < 8)
141 memcpy (data+i,buf ,len -i);
142 else
143 memcpy (data+i,buf ,8);
144 i+=8;
145 }
146 }
147
148 bool ProbabilityParser :: parse(llvm ::cl:: Option &O, llvm ::
,→StringRef ArgName , const std :: string &ArgValue , Probability &Val) {
149 int nchars ;
150 uint64_t num , den;
151 int rv = sscanf ( ArgValue .c_str (),"%" PRIu64 "/%" PRIu64 "%n"
,→,&num ,&den ,& nchars );
152 if (rv != 2 || nchars != (int) ArgValue .size ())
153 return O.error("'" + ArgValue + "'␣used␣in␣'" + ArgName +
,→ "'␣is␣not␣a␣ valid ␣ probability !");
154 Val.set(num ,den);
155 return false ;
156 }
157 };

src/Obf/AddModuleKey.cpp
1 # define DEBUG_TYPE " addmodulekey "
2 # include "llvm/IR/ Module .h"
3 # include "llvm/ Support / CommandLine .h"
4 # include "llvm/Pass.h"
5 # include "llvm/ Support / ErrorHandling .h"
6 # include "llvm/ Support / raw_ostream .h"
7 # include "Utils .h"
8 using namespace llvm;
9 using namespace Obf;
10

11 namespace {
12 // TODO: generate a random key when none is specified
13 static cl::opt < std :: string > TheKey (" modulekey ", cl:: desc("
,→Specify ␣the␣ module ␣ obfuscation ␣key"), cl:: value_desc (" obfkey "), cl
,→:: Optional );
14 struct AddModuleKey : public ModulePass {
15 static char ID; // Pass identification , replacement for
,→typeid
16 AddModuleKey () : ModulePass (ID) {}
17 virtual bool runOnModule ( Module &M){
18 if ( TheKey . getNumOccurrences () != 1)
19 TheKey .error("This␣ option ␣has␣to␣be␣ declared ␣when␣
,→using ␣the␣ addmodulekey ␣pass");
20 if ( TheKey .empty ())
21 TheKey .error("No␣key␣(or␣an␣empty ␣key)␣was␣defined ,␣

93
C.3 APPENDIX C. SOURCE CODE

,→set␣some␣key");
22 CPRNG_AES_CTR :: set_obf_key (M, TheKey );
23 return true;
24 }
25 };
26 }
27
28 char AddModuleKey ::ID = 0;
29 static RegisterPass < AddModuleKey > X(" addmodulekey ", "Add␣the␣ desired ␣
,→obfuscation ␣key␣to␣the␣ module ␣ requires ␣the␣-modulekey ␣<modulekey >␣
,→option ␣set");

src/Obf/BBSplit.cpp
1 # define DEBUG_TYPE " bbsplit "
2 # include "llvm/ADT/ Statistic .h"
3 # include "llvm/IR/ InstrTypes .h"
4 # include "llvm/IR/ Instruction .h"
5 # include "llvm/IR/ BasicBlock .h"
6 # include "llvm/IR/ Function .h"
7 # include "llvm/IR/User.h"
8 # include "llvm/Pass.h"
9 # include "llvm/ Transforms / Utils / BasicBlockUtils .h"
10 # include "llvm/ Analysis / LoopInfo .h"
11 # include "llvm/ Analysis / Dominators .h"
12 # include "llvm/ADT/ Twine .h"
13 # include "Utils .h"
14 # include <vector >
15 using namespace llvm;
16
17 STATISTIC ( BBSplitCounter , " Number ␣of␣ basic ␣ blocks ␣ splitted ");
18
19
20 namespace {
21 static Obf :: Probability initialProbability (1 ,16);
22 static cl::opt < Obf :: Probability , false , Obf :: ProbabilityParser >
,→ splitProbability (" splitprobability ", cl:: desc(" Specify ␣the␣
,→probability ␣of␣ splitting ␣a␣BB"), cl:: value_desc (" probability "), cl
,→:: Optional , cl:: init( initialProbability ));
23 typedef std :: vector < BasicBlock *> blist;
24 struct BBSplit : public FunctionPass {
25 static char ID; // Pass identification , replacement for
,→typeid
26 BBSplit () : FunctionPass (ID) {
27 }
28
29 virtual bool runOnFunction ( Function &F) {
30 //if no module key found just leave the function alone
31 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
32 return false ;

94
C.3 APPENDIX C. SOURCE CODE

33
34 Obf :: CPRNG_AES_CTR prng(F," bbsplit ");
35 bool rval = false ;
36 blist BBlist ;
37 BBlist . reserve (F.size ());
38 // Fill the vector to prevent iterator invalidation
39 for ( Function :: iterator B = F.begin (); B != F.end (); B++)
40 BBlist . push_back (B);
41 for (blist :: iterator B = BBlist .begin (); B != BBlist .end
,→(); B++) {
42 BasicBlock * cbb = *B;
43 unsigned splitcnt =1;
44 //We go to the instruction after the first (if any)
45 for ( BasicBlock :: iterator I=((*B)->
,→getFirstInsertionPt ())++; I != cbb ->end (); I++) {
46 if ( splitProbability .roll(prng)) {
47 cbb= SplitBlock (cbb , I, this);
48 cbb -> setName (Twine ((*B)->getName (),". rsplit ")
,→+Twine( splitcnt ));
49 // Again get then next instruction after the
,→one where we did the split
50 I=(cbb -> getFirstInsertionPt ())++;
51 rval=true;
52 splitcnt ++;
53 ++ BBSplitCounter ;
54 }
55 }
56 }
57 return rval;
58 }
59 void getAnalysisUsage ( AnalysisUsage &AU) const {
60 AU. addPreserved <LoopInfo >();
61 AU. addPreserved < DominatorTree >();
62 }
63 };
64 }
65
66 char BBSplit ::ID = 0;
67 static RegisterPass <BBSplit > X(" bbsplit ", " Randomly ␣split ␣basic ␣
,→blocks ␣in␣two");

src/Obf/FlattenControl.cpp
1 # define DEBUG_TYPE " flattencontrol "
2 # include "llvm/IR/ InstrTypes .h"
3 # include "llvm/IR/ Instruction .h"
4 # include "llvm/IR/ Instructions .h"
5 # include "llvm/IR/ BasicBlock .h"
6 # include "llvm/IR/ Function .h"
7 # include "llvm/IR/User.h"

95
C.3 APPENDIX C. SOURCE CODE

8 # include "llvm/Pass.h"
9 # include "llvm/IR/ Constants .h"
10 # include "llvm/ Transforms / Utils / UnifyFunctionExitNodes .h"
11 # include "llvm/ADT/ Twine .h"
12 # include "llvm/ADT/ SmallPtrSet .h"
13 # include "llvm/ADT/ SmallVector .h"
14 # include "llvm/ADT/ DenseMap .h"
15 # include "llvm/ Transforms / Utils / BasicBlockUtils .h"
16 # include "Utils .h"
17 # include <vector >
18 # include <cstdint >
19 # include "llvm/ Support / raw_ostream .h"
20 using namespace llvm;
21 using namespace Obf;
22
23 namespace {
24 typedef SmallVector < BasicBlock *,128> bblist ;
25 typedef std :: vector <Use*> uselist ;
26 typedef SmallPtrSet < BasicBlock *,128> bbset;
27 typedef SmallDenseMap < BasicBlock *, ConstantInt *, 128> bb2id;
28 struct FlattenControl : public FunctionPass {
29 static char ID; // Pass identification , replacement for
,→typeid
30 FlattenControl () : FunctionPass (ID) {}
31 void generateBBIDs ( Function &F, bblist &sbbs , bb2id &bbids ,
,→PRNG_base &prng) const {
32 // This function generates an unique identifier for each
,→BB with incoming branches
33 // The identifiers follow a set of properties to make the
,→mainblock jump more efficient
34 //In particular they go from 0 to the number of blocks -1
,→to which jumps are possible
35 // so tables can be compact
36 // Works better if called before creating the mainblock
,→itself
37 // Step 1 create a set with all the target bbs we can
,→reach
38 bbset bbs;
39 bblist bbsv;
40 for ( bblist :: iterator B = sbbs.begin (), e = sbbs.end (); B
,→ != e; B++) {
41 TerminatorInst *t = (*B)->getTerminator ();
42 for ( unsigned i = 0, e = t-> getNumSuccessors (); i !=
,→e; i++) {
43 bbs. insert (t-> getSuccessor (i));
44 }
45 }
46 // Convert it into a vector
47 bbsv. reserve (bbs.size ());

96
C.3 APPENDIX C. SOURCE CODE

48 for (bbset :: iterator i=bbs.begin (),e=bbs.end (); i != e; i


,→++) {
49 bbsv. push_back (*i);
50 }
51 // Randomize it
52 prng. randomize_vector (bbsv.begin (),bbsv.end ());
53 // And finally associate the element position to each
,→block on the bb2id
54 int32_t id = 0;
55 for ( bblist :: iterator i=bbsv.begin (),e=bbsv.end (); i != e
,→; i++) {
56 bbids [*i]= ConstantInt :: get(F. getContext (), APInt (32,
,→id));
57 id ++;
58 }
59 return ;
60 }
61 virtual bool runOnFunction ( Function &F) {
62 //if no module key found just leave the function alone
63 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
64 return false ;
65
66 CPRNG_AES_CTR prng(F," flattencontrol ");
67 bblist BranchBlocks ;
68 // Ensure our entry point contains only the branch
,→instruction
69 BasicBlock * newentry = &(F. getEntryBlock ());
70 {
71 BasicBlock * entry = newentry ;
72 TerminatorInst *t = entry -> getTerminator ();
73 BranchInst *BI = dyn_cast <BranchInst >(t);
74 if (&*( entry -> getFirstInsertionPt ()) != t || !BI ||
,→BI -> isConditional ()) {
75 newentry = BasicBlock :: Create (F. getContext (), "
,→newentry " ,&F,entry);
76 BranchInst :: Create (entry , newentry );
77 }
78 }
79 // The blocks we will process , this ensures iterators don '
,→t break entry is included
80 BranchBlocks . reserve (F.size ()); // Number of blocks + the
,→new entry block
81 for ( Function :: iterator B = F.begin (); B != F.end (); B++)
82 BranchBlocks . push_back (B);
83 UnifyFunctionExitNodes &UFEN = getAnalysis <
,→UnifyFunctionExitNodes >();
84 // Create the unreachable block for the switch (if one isn
,→'t already there )
85 BasicBlock * unr = UFEN. getUnreachableBlock ();

97
C.3 APPENDIX C. SOURCE CODE

86 if (! unr) {
87 //If we create this node we don 't want it on the list
,→ processed by the algorithm hence the position
88 unr = BasicBlock :: Create (F. getContext (), "
,→UnifiedUnreachableBlock ", &F);
89 new UnreachableInst (F. getContext (), unr);
90 }
91 BasicBlock * main_node = BasicBlock :: Create (F. getContext ()
,→, " mainblock " ,&F ,++ Function :: iterator ( newentry ));
92 // Used later , moved here for efficiency
93 { // Keep the live of the list limited
94 uselist ul;
95 // Check all the uses of the operands , if they are
,→instructions outside of our basic block or the main block or are
,→phis
96 //we add a phi for them on the main block so uses
,→dominate users :)
97 for( bblist :: iterator Bi = BranchBlocks .begin (); Bi !=
,→ BranchBlocks .end (); Bi ++) {
98 BasicBlock *B = *Bi;
99 TerminatorInst *TI=B-> getTerminator ();
100 for( BasicBlock :: iterator It = B->begin (); It != B
,→->end (); It ++) {
101 // Generate a list of the uses we are
,→interested in
102 // These are non phi uses outside of the BB
,→and the BB terminator
103 bool keepvalues = false ;
104 ul. reserve (It -> getNumUses ()); // Ensure enough
,→ space is available
105 for (Value :: use_iterator Ut = It -> use_begin ()
,→; Ut != It -> use_end (); Ut ++) {
106 Use *U = &(Ut. getUse ());
107 Instruction *I = dyn_cast < Instruction >(U
,→-> getUser ());
108 if (I == 0) // Not a instruction so we
,→don 't care
109 continue ;
110 // It is a PHINode , these are handled in a
,→ different way
111 if (isa <PHINode >(*I)) {
112 PHINode * pn = cast <PHINode >(I);
113 //We only want to ignore it if it
,→refers to this block (so the instruction will be used instead )
114 if (pn -> getIncomingBlock (*U) == B)
115 continue ;
116 keepvalues = true;
117 } else
118 // If we refer to it from another block we

98
C.3 APPENDIX C. SOURCE CODE

,→ need to keep the phi values , for now we do this always but the
,→algorithm can be improved
119 if (I-> getParent () != B)
120 keepvalues = true;
121 else if (I != TI || isa <BranchInst >(TI))
,→// Same block and not the terminator , ignore the instruction
122 continue ;
123 ul. push_back (U);
124 }
125 // There is at least one interesting use
126 if (!ul.empty ()) {
127 Type *type = It -> getType ();
128 UndefValue *undef = UndefValue :: get(type)
,→;
129 // TODO optimize so it won 't always keep
,→the variables
130 PHINode *phi = PHINode :: Create (type ,
,→BranchBlocks .size (), Twine (It -> getName (),".uses"), main_node );
131 Value *def = undef; // The default value ,
,→if no need to keep it it will be undefined
132 if ( keepvalues )
133 def = phi; // Keep the value
134 for ( bblist :: iterator Bj = BranchBlocks .
,→begin (); Bj != BranchBlocks .end (); Bj ++) {
135 BasicBlock *Bjp = *Bj;
136 if (B == Bjp)
137 phi -> addIncoming (&*It , Bjp); //
,→Assign the value
138 else if (Bjp == newentry )
139 phi -> addIncoming (undef , Bjp); //
,→Undefined if it comes from the entry
140 else
141 phi -> addIncoming (def , Bjp); //
,→Keep or undef
142 }
143 // Replace the uses
144 for ( uselist :: iterator U = ul.begin (); U
,→!= ul.end (); U++) {
145 (*U)->set(phi);
146 }
147 ul.clear ();
148 }
149 }
150 }
151 }
152 //Go through our block list moving phis to the core block
,→.
153 // Order of the phis doesn 't matter as they always refer
,→to the predecessor block variables

99
C.3 APPENDIX C. SOURCE CODE

154 for( bblist :: iterator Bi = BranchBlocks .begin (); Bi !=


,→BranchBlocks .end (); Bi ++) {
155 BasicBlock *B = *Bi;
156 PHINode * newphi ;
157 BasicBlock :: iterator Ii=B->begin ();
158 while (isa <PHINode >(* Ii)) {
159 bbset oldbbs ;
160 PHINode * oldphi = cast <PHINode >(Ii);
161 Type *type = oldphi -> getType ();
162 newphi = PHINode :: Create (type , BranchBlocks .size
,→(), "", main_node );
163 // Take the name of the old phi
164 newphi -> takeName ( oldphi );
165 // Parse and move entries from the phi node (first
,→ pass)
166 for (User :: op_iterator O = oldphi -> op_begin (); O
,→!= oldphi -> op_end (); O++) {
167 BasicBlock *oldbb;
168 oldbb = oldphi -> getIncomingBlock (*O);
169 newphi -> addIncoming (*O,oldbb);
170 oldbbs . insert (oldbb);
171 }
172 // Fill it the rest with keeps ( second pass)
173 for ( bblist :: iterator Bj = BranchBlocks .begin ();
,→Bj != BranchBlocks .end (); Bj ++) {
174 BasicBlock *Bjp = *Bj;
175 if ( oldbbs .count(Bjp))
176 continue ;
177 // TODO: we should improve this to make undef
,→if usage is impossible
178 if (Bjp == newentry )
179 newphi -> addIncoming ( UndefValue :: get(type)
,→,Bjp); // Undefined if it comes from the entry
180 else
181 newphi -> addIncoming (newphi , Bjp); // Keep
,→the value for future references
182 }
183 ReplaceInstWithValue (B-> getInstList (), Ii , newphi
,→);
184 }
185 }
186 // Split blocks with terminators we don 't know how to
,→handle to get a br we know how to handle
187 for( bblist :: iterator Bi = BranchBlocks .begin (); Bi !=
,→BranchBlocks .end (); Bi ++) {
188 BasicBlock *B = *Bi;
189 TerminatorInst *t = B-> getTerminator ();
190 // Split the non conditional branches
191 if (!isa <BranchInst >(*t)) {

100
C.3 APPENDIX C. SOURCE CODE

192 // TODO: it is better if we get as much of the


,→flow control as possible here
193 // Use faster function since the transformation
,→breaks the analysis anyways
194 B-> splitBasicBlock (t,Twine(B-> getName (),". tflat ")
,→);
195 }
196 }
197 // Generate the list of possible destinations for our
,→blocks
198 bb2id bbids;
199 generateBBIDs (F, BranchBlocks ,bbids ,prng);
200 PHINode *phi = PHINode :: Create ( IntegerType :: get(F.
,→getContext (), 32) , BranchBlocks .size (), " mainphi ", main_node );
201 for( bblist :: iterator Bi = BranchBlocks .begin (); Bi !=
,→BranchBlocks .end (); Bi ++) {
202 Value *phiv;
203 BasicBlock *B = *Bi;
204 TerminatorInst *t = B-> getTerminator ();
205 BranchInst *BI = dyn_cast <BranchInst >(t);
206 if (BI && BI -> isConditional ()) {
207 // We associate a number with the destination
208 BasicBlock * s0 = BI -> getSuccessor (0) , *s1 = BI ->
,→getSuccessor (1);
209 assert (bbids.count(s0) && " Successor ␣0␣not␣on␣the
,→␣list");
210 assert (bbids.count(s1) && " Successor ␣1␣not␣on␣the
,→␣list");
211 phiv = SelectInst :: Create (BI -> getCondition (),
,→bbids[s0], bbids[s1], Twine(B-> getName (),". br_select "), t);
212 } else {
213 BasicBlock * s = BI -> getSuccessor (0);
214 assert (bbids.count(s) && " Successor ␣not␣on␣the␣
,→list");
215 // We add the number to the PHI in the main_node
216 phiv=bbids[s];
217 }
218 // Add the value to the phi
219 phi -> addIncoming (phiv , B);
220 //We make the block branch to the core block
221 ReplaceInstWithInst (t, BranchInst :: Create ( main_node ))
,→;
222 }
223 // Now the switch instruction
224 SwitchInst *sw = SwitchInst :: Create (phi , unr , bbids.size
,→(), main_node );
225 for (bb2id :: iterator i=bbids.begin (),e=bbids.end ();i != e
,→; i++) {
226 sw -> addCase (i->second ,i->first);

101
C.3 APPENDIX C. SOURCE CODE

227 }
228 return true;
229 }
230 void getAnalysisUsage ( AnalysisUsage &AU) const {
231 AU. addRequired < UnifyFunctionExitNodes >(); // Passing this
,→improves the resulting code a lot
232 }
233 };
234 }
235
236 char FlattenControl ::ID = 0;
237 static RegisterPass < FlattenControl > X(" flattencontrol ", " Flatten ␣all␣
,→the␣ nodes ␣to␣a␣ single ␣node␣to␣ offuscate ␣the␣code");

src/Obf/ObfuscateConstants.cpp
1 # define DEBUG_TYPE " obfuscateconstants "
2 # include "llvm/IR/ InstrTypes .h"
3 # include "llvm/IR/ Instruction .h"
4 # include "llvm/IR/ Instructions .h"
5 # include "llvm/IR/ BasicBlock .h"
6 # include "llvm/IR/ Function .h"
7 # include "llvm/IR/ Module .h"
8 # include "llvm/IR/User.h"
9 # include "llvm/Pass.h"
10 # include "llvm/ADT/ APInt .h"
11 # include "llvm/ADT/ Statistic .h"
12 # include "llvm/IR/ Constants .h"
13 # include <vector >
14 # include <cstdint >
15 # include <Utils.h>
16 # include "llvm/ Support / raw_ostream .h"
17 using namespace llvm;
18
19 STATISTIC ( ObfuscatedPHIs , " Number ␣of␣phis␣with␣ constants ␣ obfuscated ")
,→;
20 STATISTIC ( ObfuscatedIns , " Number ␣of␣ instructions ␣with␣ constants ␣
,→obfuscated ");
21 STATISTIC ( ObfuscatedUses , " Number ␣of␣ constants ␣uses␣ obfuscated ");
22 STATISTIC ( ObfuscatedCons , " Number ␣of␣ constants ␣ obfuscated ");
23 STATISTIC ( ReobfuscatedCons , " Number ␣of␣ constants ␣ reobfuscated ␣(
,→obfuscated ␣ after ␣ obfuscation )");
24
25 namespace {
26 static Obf :: Probability initialProbability (1 ,10);
27 static cl::opt < Obf :: Probability , false , Obf :: ProbabilityParser >
,→ reobfuscationProbability (" reobfuscationprobability ", cl:: desc("
,→Specify ␣the␣ probability ␣of␣ obfuscating ␣ again ␣a␣ constant "), cl::
,→value_desc (" probability "), cl:: Optional , cl:: init(
,→initialProbability ));

102
C.3 APPENDIX C. SOURCE CODE

28 // The real pass putting it here makes some code simpler


29 class DoObfuscateConstants {
30 Module &M;
31 GlobalVariable *intC;
32 IntegerType *intTy;
33 PointerType * intTyPtr ;
34 std :: vector < Constant *> intVs;
35 unsigned typelength ;
36 Obf :: CPRNG_AES_CTR *prng;
37 inline bool runOnFunction ( Function &F) {
38 //if no module key found just leave the function alone
39 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
40 return false ;
41
42 bool rval = false ;
43 prng = new Obf :: CPRNG_AES_CTR (F," obfuscateconstants ");
44 for ( Function :: iterator B = F.begin (); B != F.end (); B++)
,→ {
45 for ( BasicBlock :: iterator I=B->begin (); isa <PHINode
,→>(*I); I++)
46 if( runOnPHI (*cast <PHINode >(&*I))) {
47 ObfuscatedPHIs ++;
48 rval = true;
49 }
50 for ( BasicBlock :: iterator I=B-> getFirstInsertionPt ();
,→ I != B->end (); I++)
51 if( runOnNonPHI (*I)) {
52 ObfuscatedIns ++;
53 rval = true;
54 }
55 }
56 delete prng;
57 return rval;
58 }
59 /* obfuscate a constant by introducing instructions before the
,→ insertionPoint */
60 inline Value * obfuscateConstant ( Constant &C, Instruction *
,→insertBefore ) {
61 /* TODO:As of now we can only obfuscate these */
62 if(isa < ConstantInt >(C)) {
63 ObfuscatedCons ++;
64 ConstantInt &IC = cast < ConstantInt >(C);
65 if (IC. getType () ->getBitWidth () <= typelength && prng
,→-> get_randomb (1 ,2)) {
66 // Obfuscation technique 1: search for the
,→constant in a vector
67 ConstantInt *Cptr = ConstantInt :: get(intTy ,intVs.
,→size ());
68 intVs. push_back ( ConstantInt :: get(intTy ,IC.

103
C.3 APPENDIX C. SOURCE CODE

,→getValue (). zextOrSelf ( typelength )));


69 Value *Vptr = Cptr;
70 if ( reobfuscationProbability .roll (* prng)) {
71 ReobfuscatedCons ++;
72 Vptr = obfuscateConstant (*Cptr , insertBefore );
73 }
74 LoadInst *lic = new LoadInst (intC , "", false ,
,→insertBefore );
75 GetElementPtrInst * ptr = GetElementPtrInst ::
,→Create (lic , Vptr ,"",insertBefore );
76 LoadInst *li = new LoadInst (ptr , "", false ,
,→insertBefore );
77 if (IC. getType () ->getBitWidth () == typelength )
78 return li;
79 else return new TruncInst (li , IC. getType (), "",
,→insertBefore );
80 } else {
81 // Obfuscation technique 2: replace constant by an
,→ addition or substraction etc of two other constants
82 ConstantInt *C1 = ConstantInt :: get(IC. getType (),
,→prng -> rand64 ());
83 // Maybe keep obfuscating the new constant
84 Value *V1 = C1;
85 if ( reobfuscationProbability . rolldiv (*prng ,2)) {
86 ReobfuscatedCons ++;
87 V1 = obfuscateConstant (*C1 , insertBefore );
88 }
89 APInt VC2;
90 Instruction :: BinaryOps op;
91 // Basic example , we only use Add sub or xor since
,→ muls ands and ors are more complicated
92 switch (prng -> get_randomi (3)) {
93 case 0:
94 VC2 = IC. getValue ()-C1 -> getValue ();
95 op= Instruction :: Add;
96 assert (( VC2 + C1 -> getValue ())==IC.
,→getValue ());
97 break ;
98 case 1:
99 VC2 = IC. getValue ()+C1 -> getValue ();
100 op= Instruction :: Sub;
101 assert (( VC2 - C1 -> getValue ())==IC.
,→getValue ());
102 break ;
103 case 2:
104 VC2 = IC. getValue ()^C1 -> getValue ();
105 op= Instruction :: Xor;
106 assert (( VC2 ^ C1 -> getValue ())==IC.
,→getValue ());

104
C.3 APPENDIX C. SOURCE CODE

107 break ;
108 }
109 ConstantInt *C2 = cast < ConstantInt >( ConstantInt ::
,→get(IC. getType (),VC2));
110 Value *V2 = C2;
111 // Maybe keep obfuscating the new constant
112 if ( reobfuscationProbability . rolldiv (*prng ,2)) {
113 ReobfuscatedCons ++;
114 V2 = obfuscateConstant (*C2 , insertBefore );
115 }
116 return BinaryOperator :: Create (op , V2 , V1 , "",
,→insertBefore );
117 }
118 // TODO: obfuscation technique 3: use a formula
,→returning the constant over some previous value
119 }
120 return &C;
121 }
122 // Obfuscate an Use if it s a constant (and we want to do so)
123 // Returns true if the use was modified
124 inline bool obfuscateUse (Use &U, Instruction * insertBefore ) {
125 Constant *C = dyn_cast <Constant >(U.get ());
126 if (C == 0) return false ;// Not a constant
127 Value *NC = obfuscateConstant (*C, insertBefore );
128 if (NC == C) return false ; // The constant wasn 't modified
129 ObfuscatedUses ++;
130 U.set(NC);
131 return true;
132 }
133 /* Run on a phi instruction */
134 inline bool runOnPHI ( PHINode &phi) {
135 bool rval = false ;
136 /*If a constant is found the value must be calculated on
,→the phy node bringing us here */
137 for (User :: op_iterator O = phi. op_begin (); O != phi.
,→op_end (); O++) {
138 rval |= obfuscateUse (*O,phi. getIncomingBlock (*O)->
,→getTerminator ());
139 }
140 return rval;
141 }
142 /* Run on a non phi instruction */
143 inline bool runOnNonPHI ( Instruction &I) {
144 bool rval= false ;
145 /* Check only value (arg 1)*/
146 if(isa <SwitchInst >(I))
147 return obfuscateUse (I. getOperandUse (0) ,&I);
148 /* Check only vectors (args 1 and 2) */
149 if(isa < ShuffleVectorInst >(I))

105
C.3 APPENDIX C. SOURCE CODE

150 return obfuscateUse (I. getOperandUse (0) ,&I) |


,→obfuscateUse (I. getOperandUse (1) ,&I);
151 /* Check only struct and value (args 1 and 2) */
152 if(isa < InsertValueInst >(I))
153 return obfuscateUse (I. getOperandUse (0) ,&I) |
,→obfuscateUse (I. getOperandUse (1) ,&I);
154 /* Check only struct (arg 1) */
155 if(isa < ExtractValueInst >(I))
156 return obfuscateUse (I. getOperandUse (0) ,&I);
157 /* Check only NumElements (arg 1)*/
158 if(isa <AllocaInst >(I))
159 return obfuscateUse (I. getOperandUse (0) ,&I);
160 /* Ignore alignment */
161 if(isa <LoadInst >(I))
162 return obfuscateUse (I. getOperandUse (0) ,&I);
163 /* TODO: Ignore constants in structs */
164 if(isa < GetElementPtrInst >(I))
165 return false ;
166 /* landingpads ??? */
167 /* Intrinsics some lifetime ie give problems */
168 if(isa <CallInst >(I))
169 return false ;
170 /* Check all the values */
171 for (User :: op_iterator O = I. op_begin (); O != I. op_end ();
,→ O++) {
172 rval |= obfuscateUse (*O,&I);
173 }
174 return rval;
175 }
176 public :
177 DoObfuscateConstants ( Module &M) : M(M) {
178 }
179 bool run () {
180 // TODO:This should depend on the target type
181 typelength =64;
182 intTy = IntegerType :: get(M. getContext (), typelength );
183 intTyPtr = PointerType :: get(intTy , 0);
184
185 intC = new GlobalVariable (M,intTyPtr ,false , GlobalVariable
,→:: PrivateLinkage ,0,".data");
186 bool rval = false ;
187 for ( Module :: iterator F = M.begin (); F != M.end (); F++) {
188 if (F->empty ())
189 continue ;
190 rval |= runOnFunction (*F);
191 }
192 // TODO: check deeply the possibility of getting rid of
,→the pointer memory access by using a placeholder
193 ArrayType * ArrayTy = ArrayType :: get(intTy , intVs.size ());

106
C.3 APPENDIX C. SOURCE CODE

194 GlobalVariable *arrC = new GlobalVariable (M,ArrayTy ,false


,→, GlobalVariable :: PrivateLinkage , ConstantArray :: get(ArrayTy , intVs))
,→;
195 intC -> setInitializer ( ConstantExpr :: getGetElementPtr (arrC ,
,→ std :: vector < Constant *>(2, ConstantInt :: get(intTy ,0))));
196 return rval;
197 }
198 };
199 // Saddly we can 't keep global state if using the FunctionPass :(
200 struct ObfuscateConstants : public ModulePass {
201 static char ID; // Pass identification , replacement for
,→typeid
202 ObfuscateConstants () : ModulePass (ID) {}
203 virtual bool runOnModule ( Module &M){
204 DoObfuscateConstants obc(M);
205 return obc.run ();
206 }
207 };
208 }
209
210 char ObfuscateConstants ::ID = 0;
211 static RegisterPass < ObfuscateConstants > X(" obfuscateconstants ", "
,→Obfuscate ␣the␣code␣ constants ␣by␣ converting ␣them␣into␣ mathematical ␣
,→operations ␣and␣ dereferences ␣from␣a␣ vector ");

src/Obf/PropagateModuleKey.cpp
1 # define DEBUG_TYPE " propagatemodulekey "
2 # include "llvm/IR/ Module .h"
3 # include "llvm/IR/ Function .h"
4 # include "llvm/ Support / CommandLine .h"
5 # include "llvm/Pass.h"
6 # include "llvm/ Support / raw_ostream .h"
7 # include "Utils .h"
8 using namespace llvm;
9 using namespace Obf;
10
11 namespace {
12 // TODO: add flag to decide whether overwrite keys , merge them or
,→keep them
13 // For now we just replace
14 struct PropagateModuleKey : public FunctionPass {
15 static char ID; // Pass identification , replacement for
,→typeid
16 PropagateModuleKey () : FunctionPass (ID) {}
17 virtual bool runOnFunction ( Function &F) {
18 //if no module key found just leave it alone
19 if (! CPRNG_AES_CTR :: has_obf_key (*(F. getParent ())))
20 return false ;
21 StringRef TheKey = CPRNG_AES_CTR :: get_obf_key (*(F.

107
C.3 APPENDIX C. SOURCE CODE

,→getParent ()));
22 CPRNG_AES_CTR :: set_obf_key (F, TheKey );
23 return true;
24 }
25 };
26 }
27
28 char PropagateModuleKey ::ID = 0;
29 static RegisterPass < PropagateModuleKey > X(" propagatemodulekey ", "
,→Propagate ␣the␣ module ␣ obfuscation ␣key␣to␣the␣ function ");

src/Obf/RandBB.cpp
1 # define DEBUG_TYPE " randbb "
2 # include "llvm/IR/ BasicBlock .h"
3 # include "llvm/IR/ Function .h"
4 # include "llvm/IR/User.h"
5 # include "llvm/Pass.h"
6 # include "llvm/ADT/ SmallVector .h"
7 # include <algorithm >
8 # include "Utils .h"
9 using namespace llvm;
10

11 namespace {
12 typedef SmallVector < BasicBlock *,128> candmap ;
13 struct RandBB : public FunctionPass {
14 static char ID; // Pass identification , replacement for
,→typeid
15 RandBB () : FunctionPass (ID) {}
16
17 virtual bool runOnFunction ( Function &F) {
18 //if no module key found just leave the function alone
19 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
20 return false ;
21

22 Obf :: CPRNG_AES_CTR prng(F," randbb ");


23 candmap candidates ; // List of candidates
24 // Exclude the entry block
25 Function :: iterator src=F. getEntryBlock ();
26 candidates . reserve (F.size () -1);
27 // First pass , initialize structures
28 for ( Function :: iterator B = F.begin (), e = F.end (); B !=
,→e ; B++) {
29 if (B != src) { //DO NOT MOVE THE ENTRY POINT !
30 candidates . push_back (B);
31 }
32 }
33 prng. randomize_vector ( candidates .begin (),candidates .end ()
,→);
34 for ( candmap :: size_type i = 0; i < candidates .size (); i

108
C.3 APPENDIX C. SOURCE CODE

,→++) {
35 // Pick an element from the list
36 BasicBlock *dst= candidates [i];
37
38 if (dst != src) { // If swap is needed
39 dst -> moveAfter (src);
40 src = dst;
41 }
42 }
43 return true;
44 }
45 void getAnalysisUsage ( AnalysisUsage &AU) const {
46 AU. setPreservesCFG ();
47 }
48 };
49 }
50
51 char RandBB ::ID = 0;
52 static RegisterPass <RandBB > X(" randbb ", " Randomly ␣ rearrange ␣BBs␣
,→inside ␣ functions ␣ keeping ␣the␣entry ␣ point ");

src/Obf/RandFun.cpp
1 # define DEBUG_TYPE " randfun "
2 # include "llvm/IR/ Function .h"
3 # include "llvm/IR/ Module .h"
4 # include "llvm/Pass.h"
5 # include "llvm/ADT/ SmallVector .h"
6 # include <algorithm >
7 # include "llvm/ Support / raw_ostream .h"
8 # include "Utils .h"
9 using namespace llvm;
10
11 namespace {
12 typedef SmallVector < Function *,128> candmap ;
13 struct RandFun : public ModulePass {
14 static char ID; // Pass identification , replacement for
,→typeid
15 RandFun () : ModulePass (ID) {}
16
17 virtual bool runOnModule ( Module &M) {
18 //if no module key found just leave the function alone
19 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (M))
20 return false ;
21
22 Obf :: CPRNG_AES_CTR prng(M," randfun ");
23 candmap candidates ; // List of candidates
24 Module :: iterator src=M.begin ();
25 Module :: FunctionListType &fl=M. getFunctionList ();
26 candidates . reserve (M.size ());

109
C.3 APPENDIX C. SOURCE CODE

27 // First pass , initialize structures


28 for ( Module :: iterator F = src , e = M.end (); F != e ; F++)
,→ {
29 candidates . push_back (F);
30 }
31 prng. randomize_vector ( candidates .begin (),candidates .end ()
,→);
32 for ( candmap :: size_type i = 0; i < candidates .size (); i
,→++) {
33 // Pick an element from the list
34 Function *dst= candidates [i];
35

36 if (dst != src) { // If swap is needed


37 fl. remove (dst);
38 fl. insert (src ,dst);
39 } else {
40 src ++;
41 }
42 }
43 return true;
44 }
45 void getAnalysisUsage ( AnalysisUsage &AU) const {
46 AU. setPreservesCFG ();
47 }
48 };
49 }
50
51 char RandFun ::ID = 0;
52 static RegisterPass <RandFun > X(" randfun ", " Randomly ␣ rearrange ␣
,→functions ␣ inside ␣a␣ module ");

src/Obf/RandGlb.cpp
1 # define DEBUG_TYPE " randglb "
2 # include "llvm/IR/ GlobalVariable .h"
3 # include "llvm/IR/ Module .h"
4 # include "llvm/Pass.h"
5 # include "llvm/ADT/ SmallVector .h"
6 # include <algorithm >
7 # include "llvm/ Support / raw_ostream .h"
8 # include "Utils .h"
9 using namespace llvm;
10
11 namespace {
12 typedef SmallVector < GlobalVariable *,128> candmap ;
13 struct RandGlb : public ModulePass {
14 static char ID; // Pass identification , replacement for
,→typeid
15 RandGlb () : ModulePass (ID) {}
16

110
C.3 APPENDIX C. SOURCE CODE

17 virtual bool runOnModule ( Module &M) {


18 //if no module key found just leave the function alone
19 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (M))
20 return false ;
21
22 Obf :: CPRNG_AES_CTR prng(M," randglb ");
23 candmap candidates ; // List of candidates
24 Module :: global_iterator src=M. global_begin ();
25 Module :: GlobalListType &gl=M. getGlobalList ();
26 candidates . reserve (gl.size ());
27 // First pass , initialize structures
28 for ( Module :: global_iterator G = src , e = M. global_end ();
,→ G != e ; G++) {
29 candidates . push_back (G);
30 }
31 prng. randomize_vector ( candidates .begin (),candidates .end ()
,→);
32 for ( candmap :: size_type i = 0; i < candidates .size (); i
,→++) {
33 // Pick an element from the list
34 GlobalVariable *dst= candidates [i];
35
36 if (dst != src) { // If swap is needed
37 gl. remove (dst);
38 gl. insert (src ,dst);
39 } else {
40 src ++;
41 }
42 }
43 return true;
44 }
45 void getAnalysisUsage ( AnalysisUsage &AU) const {
46 AU. setPreservesCFG ();
47 }
48 };
49 }
50
51 char RandGlb ::ID = 0;
52 static RegisterPass <RandGlb > X(" randglb ", " Randomly ␣ rearrange ␣ global ␣
,→variables ␣ inside ␣a␣ module ");

src/Obf/RandIns.cpp
1 # define DEBUG_TYPE " randins "
2 # include "llvm/IR/ Instructions .h"
3 # include "llvm/IR/ Instruction .h"
4 # include "llvm/IR/ BasicBlock .h"
5 # include "llvm/IR/ Function .h"
6 # include "llvm/IR/User.h"
7 # include "llvm/Pass.h"

111
C.3 APPENDIX C. SOURCE CODE

8 # include "llvm/ADT/ DenseMap .h"


9 # include "llvm/ADT/ SmallVector .h"
10 # include "llvm/ADT/ SmallPtrSet .h"
11 # include <algorithm >
12 # include "Utils .h"
13 using namespace llvm;
14
15 namespace {
16 typedef SmallPtrSet < Instruction *,16> deplst ;
17 typedef SmallDenseMap < Instruction *,deplst ,256 > depmap ;
18 typedef SmallVector < Instruction *,256> candmap ;
19 struct RandIns : public FunctionPass {
20 static char ID; // Pass identification , replacement for
,→typeid
21 RandIns () : FunctionPass (ID) {}
22
23 virtual bool runOnFunction ( Function &F) {
24 //if no module key found just leave the function alone
25 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
26 return false ;
27
28 Obf :: CPRNG_AES_CTR prng(F," randins ");
29 candmap candidates ; // List of candidates
30 depmap dependencies ;
31 unsigned dsti;
32 BasicBlock :: iterator src;
33 for ( Function :: iterator B = F.begin (), e = F.end (); B !=
,→e ; B++) {
34 // First pass , initialize structures
35 // Map each element to its position on the list (and
,→the other way around )
36 candidates . reserve (B->size ());
37 src=B->begin ();
38 for ( BasicBlock :: iterator I=src , e = BasicBlock ::
,→iterator (B-> getFirstNonPHI ()); I != e; I++) {
39 candidates . push_back (I);
40 }
41 prng. randomize_vector ( candidates .begin (),candidates .
,→end ());
42 // Reorder Phi Nodes randomly
43 for ( candmap :: size_type i = 0; i < candidates .size ();
,→ i++) {
44 Instruction *dst= candidates [i];
45 if (dst != src) { //If swap is needed
46 dst -> moveBefore (src);
47 } else {
48 src ++;
49 }
50 }

112
C.3 APPENDIX C. SOURCE CODE

51 candidates .clear ();


52 candidates . reserve (B->size ());
53 dependencies .grow(B->size ());
54 // Generate the dependency map
55 src= B-> getFirstInsertionPt ();
56 for ( BasicBlock :: iterator I=src , e = BasicBlock ::
,→iterator (B-> getTerminator ()); I != e; I++) {
57 // Check the dependency map
58 deplst *lst = NULL;
59 // If the instruction may have undesired side
,→effects make it block other stuff with side effects or memory
,→accesses
60 if (I-> mayHaveSideEffects ()) for ( BasicBlock ::
,→iterator J=I; ++J != e;) {
61 if (J-> mayReadFromMemory () || J->
,→mayHaveSideEffects ())
62 dependencies [J]. insert (I);
63 }
64 if (I-> mayReadFromMemory ()) for ( BasicBlock ::
,→iterator J=I; ++J != e;) {
65 if (J-> mayHaveSideEffects ())
66 dependencies [J]. insert (I);
67 }
68 for (User :: op_iterator U = I-> op_begin (), e= I->
,→op_end (); U != e; ++U) {
69 Instruction *i = dyn_cast < Instruction >(*U);
70 if (!i)
71 continue ;
72 if (isa <PHINode >(*i))
73 continue ;
74 if (i-> getParent () != B)
75 continue ;
76 // Get the deplst
77 if (! lst)
78 lst = &( dependencies [I]);
79 // Insert use on the map
80 lst -> insert (i);
81 }
82 if (! lst) {
83 candidates . push_back (I);
84 }
85 }
86 // TODO: this should be done by the prng class given
,→the dependency list
87 // Reorder Instructions randomly
88 while (! candidates .empty ()) {
89 // Pick a random element from the list
90 dsti = prng. get_randomi ( candidates .size ());
91 Instruction *dst= candidates [dsti ];

113
C.3 APPENDIX C. SOURCE CODE

92 if (dst != src) { //If swap is needed


93 dst -> moveBefore (src);
94 } else {
95 src ++;
96 }
97 candidates [dsti ]= candidates .back ();
98 candidates . pop_back ();
99 // Remove the use from the dependencies
100 for (Value :: use_iterator It = dst -> use_begin (), e
,→ = dst -> use_end (); It != e; ++It) {
101 Instruction *i = dyn_cast < Instruction >(* It);
102 depmap :: iterator p = dependencies .find(i);
103 if ( p != dependencies .end () ) {
104 p-> second .erase(dst);
105 if (p-> second .empty ()) {
106 dependencies .erase(p);
107 candidates . push_back (i);
108 }
109 }
110 }
111 }
112 dependencies . shrink_and_clear ();
113 }
114 return true;
115 }
116 void getAnalysisUsage ( AnalysisUsage &AU) const {
117 AU. setPreservesCFG ();
118 }
119 };
120 }
121
122 char RandIns ::ID = 0;
123 static RegisterPass <RandIns > X(" randins ", " Randomly ␣ rearrange ␣
,→instructions ␣ inside ␣BBs␣ keeping ␣ dependences ");

src/Obf/SwapOps.cpp
1 # define DEBUG_TYPE " swapops "
2 # include "llvm/ADT/ Statistic .h"
3 # include "llvm/IR/ InstrTypes .h"
4 # include "llvm/IR/ Instruction .h"
5 # include "llvm/IR/ BasicBlock .h"
6 # include "llvm/IR/ Function .h"
7 # include "llvm/IR/User.h"
8 # include "llvm/Pass.h"
9 # include "Utils .h"
10 using namespace llvm;
11
12 STATISTIC ( SwapCounter , " Number ␣of␣ operands ␣ swapped ");
13

114
C.3 APPENDIX C. SOURCE CODE

14 namespace {
15 struct SwapOps : public FunctionPass {
16 static char ID; // Pass identification , replacement for
,→typeid
17 SwapOps () : FunctionPass (ID) {}
18

19 virtual bool runOnFunction ( Function &F) {


20 //if no module key found just leave the function alone
21 if (! Obf :: CPRNG_AES_CTR :: has_obf_key (F))
22 return false ;
23
24 bool rval = false ;
25 Obf :: CPRNG_AES_CTR prng(F," swapops ");
26 for ( Function :: iterator B = F.begin (); B != F.end (); B++)
,→ {
27 for ( BasicBlock :: iterator I=B-> getFirstInsertionPt ();
,→ I != B->end (); I++) {
28 if (!I-> isCommutative ())
29 continue ;
30 BinaryOperator *BO = dyn_cast < BinaryOperator >(I);
31 if (!BO)
32 continue ;
33 if (prng. get_randomb (1 ,2)) {
34 BO -> swapOperands ();
35 rval=true;
36 ++ SwapCounter ;
37 }
38 }
39 }
40 return rval;
41 }
42 void getAnalysisUsage ( AnalysisUsage &AU) const {
43 AU. setPreservesCFG ();
44 }
45 };
46 }
47
48 char SwapOps ::ID = 0;
49 static RegisterPass <SwapOps > X(" swapops ", " Randomly ␣swap␣the␣
,→operators ␣of␣ commutative ␣ binary ␣ instructions ");

115
C.4 APPENDIX C. SOURCE CODE

C.4 AES library

src/Obf/aes.h
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19
20 This file contains the definitions required to use AES in C. See
,→aesopt .h
21 for optimisation details .
22 */
23
24 # ifndef _AES_H
25 # define _AES_H
26
27 # include <stdlib .h>
28
29 /* This include is used to find 8 & 32 bit unsigned integer types
,→*/
30 # include " brg_types .h"
31
32 #if defined ( __cplusplus )
33 extern "C"
34 {
35 # endif
36

37 # define AES_128
38 # define FIXED_TABLES

116
C.4 APPENDIX C. SOURCE CODE

39
40 /* The following must also be set in assembler files if being used
,→*/
41
42 # define AES_ENCRYPT /* if support for encryption is needed
,→*/
43
44 # define AES_BLOCK_SIZE 16 /* the AES block size in bytes
,→*/
45 # define N_COLS 4 /* the number of columns in the state
,→*/
46

47 /* The key schedule length is 11, 13 or 15 16- byte blocks for 128,
,→*/
48 /* 192 or 256- bit keys respectively . That is 176, 208 or 240 bytes
,→*/
49 /* or 44, 52 or 60 32- bit words .
,→*/
50
51 # define KS_LENGTH 44
52
53 # define AES_RETURN INT_RETURN
54
55 /* the character array 'inf ' in the following structures is used
,→*/
56 /* to hold AES context information . This AES code uses cx ->inf.b[0]
,→*/
57 /* to hold the number of rounds multiplied by 16. The other three
,→*/
58 /* elements can be used by code that implements additional modes
,→*/
59
60 typedef union
61 { uint_32t l;
62 uint_8t b[4];
63 } aes_inf ;
64
65 typedef struct
66 { uint_32t ks[ KS_LENGTH ];
67 aes_inf inf;
68 } aes_encrypt_ctx ;
69

70 /* This routine must be called before first use if non - static


,→*/
71 /* tables are being used
,→*/
72
73 AES_RETURN aes_init (void);
74

117
C.4 APPENDIX C. SOURCE CODE

75 /* Key lengths in the range 16 <= key_len <= 32 are given in bytes ,
,→*/
76 /* those in the range 128 <= key_len <= 256 are given in bits
,→*/
77
78 AES_RETURN aes_encrypt_key128 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1]);
79 AES_RETURN aes_encrypt ( const unsigned char *in , unsigned char *out ,
,→const aes_encrypt_ctx cx [1]);
80
81 /* Multiple calls to the following subroutines for multiple block
,→*/
82 /* ECB , CBC , CFB , OFB and CTR mode encryption can be used to handle
,→*/
83 /* long messages incremantally provided that the context AND the iv
,→*/
84 /* are preserved between all such calls . For the ECB and CBC modes
,→*/
85 /* each individual call within a series of incremental calls must
,→*/
86 /* process only full blocks (i.e. len must be a multiple of 16) but
,→*/
87 /* the CFB , OFB and CTR mode calls can handle multiple incremental
,→*/
88 /* calls of any length . Each mode is reset when a new AES key is
,→*/
89 /* set but ECB and CBC operations can be reset without setting a
,→*/
90 /* new key by setting a new IV value . To reset CFB , OFB and CTR
,→*/
91 /* without setting the key , aes_mode_reset () must be called and the
,→*/
92 /* IV must be set. NOTE: All these calls update the IV on exit so
,→*/
93 /* this has to be reset if a new operation with the same IV as the
,→*/
94 /* previous one is required (or decryption follows encryption with
,→*/
95 /* the same IV array ).
,→*/
96
97 AES_RETURN aes_ecb_encrypt ( const unsigned char *ibuf , unsigned char *
,→obuf , int len , const aes_encrypt_ctx ctx [1]);
98 AES_RETURN aes_ctr_pad ( unsigned char *obuf , unsigned char *cbuf ,
,→aes_encrypt_ctx cx [1]);
99
100 #if defined ( __cplusplus )
101 }
102 # endif

118
C.4 APPENDIX C. SOURCE CODE

103
104 # endif

src/Obf/aes_modes.c
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4

5 The redistribution and use of this software (with or without changes )


6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10

11 binary distributions include the above copyright notice , this list


12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19

20 These subroutines implement multiple block AES modes for ECB , CBC ,
,→CFB ,
21 OFB and CTR encryption , The code provides support for the VIA
,→Advanced
22 Cryptography Engine (ACE).
23

24 NOTE: In the following subroutines , the AES contexts (ctx) must be


25 16 byte aligned if VIA ACE is being used
26 */
27
28 # include <string .h>
29 # include <assert .h>
30 # include <stdio.h>
31
32 # include " aesopt .h"
33
34 #if defined ( __cplusplus )
35 extern "C"
36 {
37 # endif
38

119
C.4 APPENDIX C. SOURCE CODE

39 #if defined ( _MSC_VER ) && ( _MSC_VER > 800 )


40 # pragma intrinsic ( memcpy )
41 # endif
42
43 # define BFR_BLOCKS 8
44

45 /* These values are used to detect long word alignment in order to */


46 /* speed up some buffer operations . This facility may not work on */
47 /* some machines so this define can be commented out if necessary */
48
49 # define FAST_BUFFER_OPERATIONS
50

51 # define lp32(x) (( uint_32t *)(x))


52
53 #if defined ( USE_VIA_ACE_IF_PRESENT )
54
55 # include " aes_via_ace .h"
56

57 # pragma pack (16)


58
59 aligned_array ( unsigned long , enc_gen_table , 12, 16) =
,→NEH_ENC_GEN_DATA ;
60 aligned_array ( unsigned long , enc_load_table , 12, 16) =
,→NEH_ENC_LOAD_DATA ;
61 aligned_array ( unsigned long , enc_hybrid_table , 12, 16) =
,→NEH_ENC_HYBRID_DATA ;
62 aligned_array ( unsigned long , dec_gen_table , 12, 16) =
,→NEH_DEC_GEN_DATA ;
63 aligned_array ( unsigned long , dec_load_table , 12, 16) =
,→NEH_DEC_LOAD_DATA ;
64 aligned_array ( unsigned long , dec_hybrid_table , 12, 16) =
,→NEH_DEC_HYBRID_DATA ;
65
66 /* NOTE: These control word macros must only be used after */
67 /* a key has been set up because they depend on key size */
68 /* See the VIA ACE documentation for key type information */
69 /* and aes_via_ace .h for non - default NEH_KEY_TYPE values */
70
71 # ifndef NEH_KEY_TYPE
72 # define NEH_KEY_TYPE NEH_HYBRID
73 # endif
74

75 #if NEH_KEY_TYPE == NEH_LOAD


76 # define kd_adr (c) (( uint_8t *)(c)->ks)
77 #elif NEH_KEY_TYPE == NEH_GENERATE
78 # define kd_adr (c) (( uint_8t *)(c)->ks + (c)->inf.b[0])
79 #elif NEH_KEY_TYPE == NEH_HYBRID
80 # define kd_adr (c) (( uint_8t *)(c)->ks + ((c)->inf.b[0] == 160 ? 160
,→: 0))

120
C.4 APPENDIX C. SOURCE CODE

81 #else
82 # error no key type defined for VIA ACE
83 # endif
84
85 #else
86

87 # define aligned_array (type , name , no , stride ) type name[no]


88 # define aligned_auto (type , name , no , stride ) type name[no]
89
90 # endif
91
92 #if defined ( _MSC_VER ) && _MSC_VER > 1200
93
94 # define via_cwd (cwd , ty , dir , len) \
95 unsigned long* cwd = (dir ##_##ty## _table + (( len - 128) >> 4))
96
97 #else
98

99 # define via_cwd (cwd , ty , dir , len) \


100 aligned_auto ( unsigned long , cwd , 4, 16); \
101 cwd [1] = cwd [2] = cwd [3] = 0; \
102 cwd [0] = neh_ ## dir ##_##ty## _key(len)
103
104 # endif
105
106 AES_RETURN aes_ecb_encrypt ( const unsigned char *ibuf , unsigned char *
,→obuf ,
107 int len , const aes_encrypt_ctx ctx [1])
108 { int nb = len >> 4;
109

110 if(len & ( AES_BLOCK_SIZE - 1))


111 return EXIT_FAILURE ;
112
113 #if defined ( USE_VIA_ACE_IF_PRESENT )
114
115 if(ctx ->inf.b[1] == 0xff)
116 { uint_8t *ksp = ( uint_8t *)(ctx ->ks);
117 via_cwd (cwd , hybrid , enc , 2 * ctx ->inf.b[0] - 192);
118
119 if( ALIGN_OFFSET ( ctx , 16 ))
120 return EXIT_FAILURE ;
121

122 if(! ALIGN_OFFSET ( ibuf , 16 ) && ! ALIGN_OFFSET ( obuf , 16 ))


123 {
124 via_ecb_op5 (ksp , cwd , ibuf , obuf , nb);
125 }
126 else
127 { aligned_auto (uint_8t , buf , BFR_BLOCKS * AES_BLOCK_SIZE ,
,→16);

121
C.4 APPENDIX C. SOURCE CODE

128 uint_8t *ip , *op;


129
130 while (nb)
131 {
132 int m = (nb > BFR_BLOCKS ? BFR_BLOCKS : nb);
133

134 ip = ( ALIGN_OFFSET ( ibuf , 16 ) ? buf : ibuf);


135 op = ( ALIGN_OFFSET ( obuf , 16 ) ? buf : obuf);
136
137 if(ip != ibuf)
138 memcpy (buf , ibuf , m * AES_BLOCK_SIZE );
139

140 via_ecb_op5 (ksp , cwd , ip , op , m);


141
142 if(op != obuf)
143 memcpy (obuf , buf , m * AES_BLOCK_SIZE );
144
145 ibuf += m * AES_BLOCK_SIZE ;
146 obuf += m * AES_BLOCK_SIZE ;
147 nb -= m;
148 }
149 }
150
151 return EXIT_SUCCESS ;
152 }
153
154 # endif
155
156 #if ! defined ( ASSUME_VIA_ACE_PRESENT )
157 while (nb --)
158 {
159 if( aes_encrypt (ibuf , obuf , ctx) != EXIT_SUCCESS )
160 return EXIT_FAILURE ;
161 ibuf += AES_BLOCK_SIZE ;
162 obuf += AES_BLOCK_SIZE ;
163 }
164 # endif
165 return EXIT_SUCCESS ;
166 }
167
168 AES_RETURN aes_ctr_pad ( unsigned char *obuf , unsigned char *cbuf ,
,→aes_encrypt_ctx ctx [1])
169 {
170 #if defined ( USE_VIA_ACE_IF_PRESENT )
171 if(ctx ->inf.b[1] == 0xff && ALIGN_OFFSET ( ctx , 16 ))
172 return EXIT_FAILURE ;
173 # endif
174 int i;
175 unsigned char acc;

122
C.4 APPENDIX C. SOURCE CODE

176
177 memcpy (obuf , cbuf , AES_BLOCK_SIZE );
178 for (acc =1,i = AES_BLOCK_SIZE -1; i >= 0; i--) {
179 cbuf[i] += acc;
180 acc &= cbuf[i] == 0;
181 }
182 return aes_ecb_encrypt (obuf , obuf , AES_BLOCK_SIZE , ctx);
183 }
184
185 #if defined ( __cplusplus )
186 }
187 # endif

src/Obf/aes_via_ace.h
1 /*
2 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
3
4 The redistribution and use of this software (with or without changes )
5 is allowed without the payment of fees or royalties provided that:
6
7 source code distributions include the above copyright notice , this
8 list of conditions and the following disclaimer ;
9
10 binary distributions include the above copyright notice , this list
11 of conditions and the following disclaimer in their documentation .
12
13 This software is provided 'as is' with no explicit or implied
,→warranties
14 in respect of its operation , including , but not limited to ,
,→correctness
15 and fitness for purpose .
16 ---------------------------------------------------------------------------
,→
17 Issue Date: 20/12/2007
18 */
19
20 # ifndef AES_VIA_ACE_H
21 # define AES_VIA_ACE_H
22

23 #if defined ( _MSC_VER )


24 # define INLINE __inline
25 #elif defined ( __GNUC__ )
26 # define INLINE static inline
27 #else
28 # error VIA ACE requires Microsoft or GNU C
29 # endif
30
31 # define NEH_GENERATE 1

123
C.4 APPENDIX C. SOURCE CODE

32 # define NEH_LOAD 2
33 # define NEH_HYBRID 3
34
35 # define MAX_READ_ATTEMPTS 1000
36
37 /* VIA Nehemiah RNG and ACE Feature Mask Values */
38
39 # define NEH_CPU_IS_VIA 0 x00000001
40 # define NEH_CPU_READ 0 x00000010
41 # define NEH_CPU_MASK 0 x00000011
42
43 # define NEH_RNG_PRESENT 0 x00000004
44 # define NEH_RNG_ENABLED 0 x00000008
45 # define NEH_ACE_PRESENT 0 x00000040
46 # define NEH_ACE_ENABLED 0 x00000080
47 # define NEH_RNG_FLAGS ( NEH_RNG_PRESENT | NEH_RNG_ENABLED )
48 # define NEH_ACE_FLAGS ( NEH_ACE_PRESENT | NEH_ACE_ENABLED )
49 # define NEH_FLAGS_MASK ( NEH_RNG_FLAGS | NEH_ACE_FLAGS )
50
51 /* VIA Nehemiah Advanced Cryptography Engine (ACE) Control Word
,→Values */
52
53 # define NEH_GEN_KEY 0 x00000000 /* generate key schedule
,→ */
54 # define NEH_LOAD_KEY 0 x00000080 /* load schedule from memory
,→ */
55 # define NEH_ENCRYPT 0 x00000000 /* encryption
,→ */
56 # define NEH_DECRYPT 0 x00000200 /* decryption
,→ */
57 # define NEH_KEY128 0 x00000000 +0 x0a /* 128 bit key
,→ */
58 # define NEH_KEY192 0 x00000400 +0 x0c /* 192 bit key
,→ */
59 # define NEH_KEY256 0 x00000800 +0 x0e /* 256 bit key
,→ */
60
61 # define NEH_ENC_GEN ( NEH_ENCRYPT | NEH_GEN_KEY )
62 # define NEH_DEC_GEN ( NEH_DECRYPT | NEH_GEN_KEY )
63 # define NEH_ENC_LOAD ( NEH_ENCRYPT | NEH_LOAD_KEY )
64 # define NEH_DEC_LOAD ( NEH_DECRYPT | NEH_LOAD_KEY )
65

66 # define NEH_ENC_GEN_DATA {\
67 NEH_ENC_GEN | NEH_KEY128 , 0, 0, 0,\
68 NEH_ENC_GEN | NEH_KEY192 , 0, 0, 0,\
69 NEH_ENC_GEN | NEH_KEY256 , 0, 0, 0 }
70
71 # define NEH_ENC_LOAD_DATA {\
72 NEH_ENC_LOAD | NEH_KEY128 , 0, 0, 0,\

124
C.4 APPENDIX C. SOURCE CODE

73 NEH_ENC_LOAD | NEH_KEY192 , 0, 0, 0,\


74 NEH_ENC_LOAD | NEH_KEY256 , 0, 0, 0 }
75
76 # define NEH_ENC_HYBRID_DATA {\
77 NEH_ENC_GEN | NEH_KEY128 , 0, 0, 0,\
78 NEH_ENC_LOAD | NEH_KEY192 , 0, 0, 0,\
79 NEH_ENC_LOAD | NEH_KEY256 , 0, 0, 0 }
80
81 # define NEH_DEC_GEN_DATA {\
82 NEH_DEC_GEN | NEH_KEY128 , 0, 0, 0,\
83 NEH_DEC_GEN | NEH_KEY192 , 0, 0, 0,\
84 NEH_DEC_GEN | NEH_KEY256 , 0, 0, 0 }
85
86 # define NEH_DEC_LOAD_DATA {\
87 NEH_DEC_LOAD | NEH_KEY128 , 0, 0, 0,\
88 NEH_DEC_LOAD | NEH_KEY192 , 0, 0, 0,\
89 NEH_DEC_LOAD | NEH_KEY256 , 0, 0, 0 }
90

91 # define NEH_DEC_HYBRID_DATA {\
92 NEH_DEC_GEN | NEH_KEY128 , 0, 0, 0,\
93 NEH_DEC_LOAD | NEH_KEY192 , 0, 0, 0,\
94 NEH_DEC_LOAD | NEH_KEY256 , 0, 0, 0 }
95
96 # define neh_enc_gen_key (x) ((x) == 128 ? ( NEH_ENC_GEN | NEH_KEY128 )
,→: \
97 (x) == 192 ? ( NEH_ENC_GEN | NEH_KEY192 ) : ( NEH_ENC_GEN |
,→NEH_KEY256 ))
98
99 # define neh_enc_load_key (x) ((x) == 128 ? ( NEH_ENC_LOAD | NEH_KEY128 )
,→ : \
100 (x) == 192 ? ( NEH_ENC_LOAD | NEH_KEY192 ) : ( NEH_ENC_LOAD |
,→NEH_KEY256 ))
101
102 # define neh_enc_hybrid_key (x) ((x) == 128 ? ( NEH_ENC_GEN |
,→NEH_KEY128 ) : \
103 (x) == 192 ? ( NEH_ENC_LOAD | NEH_KEY192 ) : ( NEH_ENC_LOAD |
,→NEH_KEY256 ))
104
105 # define neh_dec_gen_key (x) ((x) == 128 ? ( NEH_DEC_GEN | NEH_KEY128 )
,→: \
106 (x) == 192 ? ( NEH_DEC_GEN | NEH_KEY192 ) : ( NEH_DEC_GEN |
,→NEH_KEY256 ))
107
108 # define neh_dec_load_key (x) ((x) == 128 ? ( NEH_DEC_LOAD | NEH_KEY128 )
,→ : \
109 (x) == 192 ? ( NEH_DEC_LOAD | NEH_KEY192 ) : ( NEH_DEC_LOAD |
,→NEH_KEY256 ))
110

111 # define neh_dec_hybrid_key (x) ((x) == 128 ? ( NEH_DEC_GEN |

125
C.4 APPENDIX C. SOURCE CODE

,→NEH_KEY128 ) : \
112 (x) == 192 ? ( NEH_DEC_LOAD | NEH_KEY192 ) : ( NEH_DEC_LOAD |
,→NEH_KEY256 ))
113
114 #if defined ( _MSC_VER ) && ( _MSC_VER > 1200 )
115 # define aligned_auto (type , name , no , stride ) __declspec (align( stride
,→)) type name[no]
116 #else
117 # define aligned_auto (type , name , no , stride ) \
118 unsigned char _## name[no * sizeof (type) + stride ]; \
119 type *name = (type *) (16 * (((( unsigned long)(_## name)) + stride -
,→ 1) / stride ))
120 # endif
121
122 #if defined ( _MSC_VER ) && ( _MSC_VER > 1200 )
123 # define aligned_array (type , name , no , stride ) __declspec (align( stride
,→)) type name[no]
124 #elif defined ( __GNUC__ )
125 # define aligned_array (type , name , no , stride ) type name[no]
,→__attribute__ (( aligned ( stride )))
126 #else
127 # define aligned_array (type , name , no , stride ) type name[no]
128 # endif
129

130 /* VIA ACE codeword */


131
132 static unsigned char via_flags = 0;
133
134 #if defined ( _MSC_VER ) && ( _MSC_VER > 800 )
135

136 # define NEH_REKEY __asm pushfd __asm popfd


137 # define NEH_AES __asm _emit 0xf3 __asm _emit 0x0f __asm _emit 0
,→xa7
138 # define NEH_ECB NEH_AES __asm _emit 0xc8
139 # define NEH_CBC NEH_AES __asm _emit 0xd0
140 # define NEH_CFB NEH_AES __asm _emit 0xe0
141 # define NEH_OFB NEH_AES __asm _emit 0xe8
142 # define NEH_RNG __asm _emit 0x0f __asm _emit 0xa7 __asm _emit 0
,→xc0
143
144 INLINE int has_cpuid (void)
145 { char ret_value ;
146 __asm
147 { pushfd /* save EFLAGS register */
148 mov eax ,[ esp] /* copy it to eax */
149 mov edx ,0 x00200000 /* CPUID bit position */
150 xor eax ,edx /* toggle the CPUID bit */
151 push eax /* attempt to set EFLAGS to */
152 popfd /* the new value */

126
C.4 APPENDIX C. SOURCE CODE

153 pushfd /* get the new EFLAGS value */


154 pop eax /* into eax */
155 xor eax ,[ esp] /* xor with original value */
156 and eax ,edx /* has CPUID bit changed ? */
157 setne al /* set to 1 if we have been */
158 mov ret_value ,al /* able to change it */
159 popfd /* restore original EFLAGS */
160 }
161 return (int) ret_value ;
162 }
163
164 INLINE int is_via_cpu (void)
165 { char ret_value ;
166 __asm
167 { push ebx
168 xor eax ,eax /* use CPUID to get vendor */
169 cpuid /* identity string */
170 xor eax ,eax /* is it " CentaurHauls " ? */
171 sub ebx ,0 x746e6543 /* 'Cent ' */
172 or eax ,ebx
173 sub edx ,0 x48727561 /* 'aurH ' */
174 or eax ,edx
175 sub ecx ,0 x736c7561 /* 'auls ' */
176 or eax ,ecx
177 sete al /* set to 1 if it is VIA ID */
178 mov dl , NEH_CPU_READ /* mark CPU type as read */
179 or dl ,al /* & store result in flags */
180 mov [ via_flags ],dl /* set VIA detected flag */
181 mov ret_value ,al /* able to change it */
182 pop ebx
183 }
184 return (int) ret_value ;
185 }
186
187 INLINE int read_via_flags (void)
188 { char ret_value = 0;
189 __asm
190 { mov eax ,0 xC0000000 /* Centaur extended CPUID */
191 cpuid
192 mov edx ,0 xc0000001 /* >= 0 xc0000001 if support */
193 cmp eax ,edx /* for VIA extended feature */
194 jnae no_rng /* flags is available */
195 mov eax ,edx /* read Centaur extended */
196 cpuid /* feature flags */
197 mov eax , NEH_FLAGS_MASK /* mask out and save */
198 and eax ,edx /* the RNG and ACE flags */
199 or [ via_flags ],al /* present & enabled flags */
200 mov ret_value ,al /* able to change it */
201 no_rng :

127
C.4 APPENDIX C. SOURCE CODE

202 }
203 return (int) ret_value ;
204 }
205
206 INLINE unsigned int via_rng_in (void *buf)
207 { char ret_value = 0x1f;
208 __asm
209 { push edi
210 mov edi ,buf /* input buffer address */
211 xor edx ,edx /* try to fetch 8 bytes */
212 NEH_RNG /* do RNG read operation */
213 and ret_value ,al /* count of bytes returned */
214 pop edi
215 }
216 return (int) ret_value ;
217 }
218
219 INLINE void via_ecb_op5 (
220 const void *k, const void *c, const void *s, void *d, int
,→ l)
221 { __asm
222 { push ebx
223 NEH_REKEY
224 mov ebx , (k)
225 mov edx , (c)
226 mov esi , (s)
227 mov edi , (d)
228 mov ecx , (l)
229 NEH_ECB
230 pop ebx
231 }
232 }
233
234 INLINE void via_cbc_op6 (
235 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
236 { __asm
237 { push ebx
238 NEH_REKEY
239 mov ebx , (k)
240 mov edx , (c)
241 mov esi , (s)
242 mov edi , (d)
243 mov ecx , (l)
244 mov eax , (v)
245 NEH_CBC
246 pop ebx
247 }
248 }

128
C.4 APPENDIX C. SOURCE CODE

249
250 INLINE void via_cbc_op7 (
251 const void *k, const void *c, const void *s, void *d, int l,
,→void *v, void *w)
252 { __asm
253 { push ebx
254 NEH_REKEY
255 mov ebx , (k)
256 mov edx , (c)
257 mov esi , (s)
258 mov edi , (d)
259 mov ecx , (l)
260 mov eax , (v)
261 NEH_CBC
262 mov esi , eax
263 mov edi , (w)
264 movsd
265 movsd
266 movsd
267 movsd
268 pop ebx
269 }
270 }
271

272 INLINE void via_cfb_op6 (


273 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
274 { __asm
275 { push ebx
276 NEH_REKEY
277 mov ebx , (k)
278 mov edx , (c)
279 mov esi , (s)
280 mov edi , (d)
281 mov ecx , (l)
282 mov eax , (v)
283 NEH_CFB
284 pop ebx
285 }
286 }
287
288 INLINE void via_cfb_op7 (
289 const void *k, const void *c, const void *s, void *d, int l,
,→void *v, void *w)
290 { __asm
291 { push ebx
292 NEH_REKEY
293 mov ebx , (k)
294 mov edx , (c)

129
C.4 APPENDIX C. SOURCE CODE

295 mov esi , (s)


296 mov edi , (d)
297 mov ecx , (l)
298 mov eax , (v)
299 NEH_CFB
300 mov esi , eax
301 mov edi , (w)
302 movsd
303 movsd
304 movsd
305 movsd
306 pop ebx
307 }
308 }
309
310 INLINE void via_ofb_op6 (
311 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
312 { __asm
313 { push ebx
314 NEH_REKEY
315 mov ebx , (k)
316 mov edx , (c)
317 mov esi , (s)
318 mov edi , (d)
319 mov ecx , (l)
320 mov eax , (v)
321 NEH_OFB
322 pop ebx
323 }
324 }
325
326 #elif defined ( __GNUC__ )
327
328 # define NEH_REKEY asm(" pushfl \n␣ popfl \n\t")
329 # define NEH_ECB asm(".byte␣0xf3 ,␣0x0f ,␣0xa7 ,␣0xc8\n\t")
330 # define NEH_CBC asm(".byte␣0xf3 ,␣0x0f ,␣0xa7 ,␣0xd0\n\t")
331 # define NEH_CFB asm(".byte␣0xf3 ,␣0x0f ,␣0xa7 ,␣0xe0\n\t")
332 # define NEH_OFB asm(".byte␣0xf3 ,␣0x0f ,␣0xa7 ,␣0xe8\n\t")
333 # define NEH_RNG asm(".byte␣0x0f ,␣0xa7 ,␣0xc0\n\t");
334
335 INLINE int has_cpuid (void)
336 { int val;
337 asm(" pushfl \n\t");
338 asm("movl␣␣0(% esp) ,%eax\n\t");
339 asm("xor␣␣␣ $0x00200000 ,% eax\n\t");
340 asm(" pushl ␣%eax\n\t");
341 asm(" popfl \n\t");
342 asm(" pushfl \n\t");

130
C.4 APPENDIX C. SOURCE CODE

343 asm("popl␣␣%eax\n\t");
344 asm("xorl␣␣0(% esp) ,%edx\n\t");
345 asm("andl␣␣ $0x00200000 ,% eax\n\t");
346 asm("movl␣␣%%eax ,%0\n\t" : "=m" (val));
347 asm(" popfl \n\t");
348 return val ? 1 : 0;
349 }
350
351 INLINE int is_via_cpu (void)
352 { int val;
353 asm(" pushl ␣%ebx\n\t");
354 asm("xorl␣%eax ,% eax\n\t");
355 asm(" cpuid \n\t");
356 asm("xorl␣%eax ,% eax\n\t");
357 asm("subl␣ $0x746e6543 ,% ebx\n\t");
358 asm("orl␣␣%ebx ,% eax\n\t");
359 asm("subl␣ $0x48727561 ,% edx\n\t");
360 asm("orl␣␣%edx ,% eax\n\t");
361 asm("subl␣ $0x736c7561 ,% ecx\n\t");
362 asm("orl␣␣%ecx ,% eax\n\t");
363 asm("movl␣%%eax ,%0\n\t" : "=m" (val));
364 asm("popl␣%ebx\n\t");
365 val = (val ? 0 : 1);
366 via_flags = (val | NEH_CPU_READ );
367 return val;
368 }
369
370 INLINE int read_via_flags (void)
371 { unsigned char val;
372 asm("movl␣ $0xc0000000 ,% eax\n\t");
373 asm(" cpuid \n\t");
374 asm("movl␣ $0xc0000001 ,% edx\n\t");
375 asm("cmpl␣%edx ,% eax\n\t");
376 asm(" setae ␣%al\n\t");
377 asm("movb␣%%al ,%0\n\t" : "=m" (val));
378 if(! val) return 0;
379 asm("movl␣ $0xc0000001 ,% eax\n\t");
380 asm(" cpuid \n\t");
381 asm("movb␣%%dl ,%0\n\t" : "=m" (val));
382 val &= NEH_FLAGS_MASK ;
383 via_flags |= val;
384 return (int) val;
385 }
386
387 INLINE int via_rng_in (void *buf)
388 { int val;
389 asm(" pushl ␣%edi\n\t");
390 asm("movl␣%0 ,%% edi\n\t" : : "m" (buf));
391 asm("xorl␣%edx ,% edx\n\t");

131
C.4 APPENDIX C. SOURCE CODE

392 NEH_RNG
393 asm("andl␣ $0x0000001f ,% eax\n\t");
394 asm("movl␣%%eax ,%0\n\t" : "=m" (val));
395 asm("popl␣%edi\n\t");
396 return val;
397 }
398
399 INLINE volatile void via_ecb_op5 (
400 const void *k, const void *c, const void *s, void *d, int
,→ l)
401 {
402 asm(" pushl ␣%ebx\n\t");
403 NEH_REKEY ;
404 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));
405 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
406 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
407 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
408 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));
409 NEH_ECB ;
410 asm("popl␣%ebx\n\t");
411 }
412
413 INLINE volatile void via_cbc_op6 (
414 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
415 {
416 asm(" pushl ␣%ebx\n\t");
417 NEH_REKEY ;
418 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));
419 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
420 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
421 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
422 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));
423 asm("movl␣%0,␣%% eax\n\t" : : "m" (v));
424 NEH_CBC ;
425 asm("popl␣%ebx\n\t");
426 }
427
428 INLINE volatile void via_cbc_op7 (
429 const void *k, const void *c, const void *s, void *d, int l,
,→void *v, void *w)
430 {
431 asm(" pushl ␣%ebx\n\t");
432 NEH_REKEY ;
433 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));
434 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
435 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
436 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
437 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));

132
C.4 APPENDIX C. SOURCE CODE

438 asm("movl␣%0,␣%% eax\n\t" : : "m" (v));


439 NEH_CBC ;
440 asm("movl␣%eax ,% esi\n\t");
441 asm("movl␣%0,␣%% edi\n\t" : : "m" (w));
442 asm(" movsl ;␣movsl ;␣ movsl ;␣movsl \n\t");
443 asm("popl␣%ebx\n\t");
444 }
445
446 INLINE volatile void via_cfb_op6 (
447 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
448 {
449 asm(" pushl ␣%ebx\n\t");
450 NEH_REKEY ;
451 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));
452 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
453 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
454 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
455 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));
456 asm("movl␣%0,␣%% eax\n\t" : : "m" (v));
457 NEH_CFB ;
458 asm("popl␣%ebx\n\t");
459 }
460

461 INLINE volatile void via_cfb_op7 (


462 const void *k, const void *c, const void *s, void *d, int l,
,→void *v, void *w)
463 {
464 asm(" pushl ␣%ebx\n\t");
465 NEH_REKEY ;
466 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));
467 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
468 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
469 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
470 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));
471 asm("movl␣%0,␣%% eax\n\t" : : "m" (v));
472 NEH_CFB ;
473 asm("movl␣%eax ,% esi\n\t");
474 asm("movl␣%0,␣%% edi\n\t" : : "m" (w));
475 asm(" movsl ;␣movsl ;␣ movsl ;␣movsl \n\t");
476 asm("popl␣%ebx\n\t");
477 }
478
479 INLINE volatile void via_ofb_op6 (
480 const void *k, const void *c, const void *s, void *d, int
,→ l, void *v)
481 {
482 asm(" pushl ␣%ebx\n\t");
483 NEH_REKEY ;

133
C.4 APPENDIX C. SOURCE CODE

484 asm("movl␣%0,␣%% ebx\n\t" : : "m" (k));


485 asm("movl␣%0,␣%% edx\n\t" : : "m" (c));
486 asm("movl␣%0,␣%% esi\n\t" : : "m" (s));
487 asm("movl␣%0,␣%% edi\n\t" : : "m" (d));
488 asm("movl␣%0,␣%% ecx\n\t" : : "m" (l));
489 asm("movl␣%0,␣%% eax\n\t" : : "m" (v));
490 NEH_OFB ;
491 asm("popl␣%ebx\n\t");
492 }
493
494 #else
495 # error VIA ACE is not available with this compiler
496 # endif
497
498 INLINE int via_ace_test (void)
499 {
500 return has_cpuid () && is_via_cpu () && (( read_via_flags () &
,→NEH_ACE_FLAGS ) == NEH_ACE_FLAGS );
501 }
502
503 # define VIA_ACE_AVAILABLE ((( via_flags & NEH_ACE_FLAGS ) ==
,→NEH_ACE_FLAGS ) \
504 || ( via_flags & NEH_CPU_READ ) && ( via_flags & NEH_CPU_IS_VIA ) ||
,→via_ace_test ())
505
506 INLINE int via_rng_test (void)
507 {
508 return has_cpuid () && is_via_cpu () && (( read_via_flags () &
,→NEH_RNG_FLAGS ) == NEH_RNG_FLAGS );
509 }
510
511 # define VIA_RNG_AVAILABLE ((( via_flags & NEH_RNG_FLAGS ) ==
,→NEH_RNG_FLAGS ) \
512 || ( via_flags & NEH_CPU_READ ) && ( via_flags & NEH_CPU_IS_VIA ) ||
,→via_rng_test ())
513

514 INLINE int read_via_rng (void *buf , int count)


515 { int nbr , max_reads , lcnt = count;
516 unsigned char *p, *q;
517 aligned_auto ( unsigned char , bp , 64, 16);
518
519 if(! VIA_RNG_AVAILABLE )
520 return 0;
521
522 do
523 {
524 max_reads = MAX_READ_ATTEMPTS ;
525 do
526 nbr = via_rng_in (bp);

134
C.4 APPENDIX C. SOURCE CODE

527 while
528 (nbr == 0 && --max_reads );
529
530 lcnt -= nbr;
531 p = ( unsigned char *) buf; q = bp;
532 while (nbr --)
533 *p++ = *q++;
534 }
535 while
536 (lcnt && max_reads );
537
538 return count - lcnt;
539 }
540
541 # endif

src/Obf/aescrypt.c
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19 */
20
21 # include " aesopt .h"
22 # include " aestab .h"
23
24 #if defined ( __cplusplus )
25 extern "C"
26 {
27 # endif

135
C.4 APPENDIX C. SOURCE CODE

28
29 # define si(y,x,k,c) (s(y,c) = word_in (x, c) ^ (k)[c])
30 # define so(y,x,c) word_out (y, c, s(x,c))
31
32 #if defined ( ARRAYS )
33 # define locals (y,x) x[4],y[4]
34 #else
35 # define locals (y,x) x##0,x##1,x##2,x##3,y##0,y##1,y##2,y##3
36 # endif
37
38 # define l_copy (y, x) s(y ,0) = s(x ,0); s(y ,1) = s(x ,1); \
39 s(y ,2) = s(x ,2); s(y ,3) = s(x ,3);
40 # define state_in (y,x,k) si(y,x,k ,0); si(y,x,k ,1); si(y,x,k ,2); si(y,x
,→,k ,3)
41 # define state_out (y,x) so(y,x ,0); so(y,x ,1); so(y,x ,2); so(y,x ,3)
42 # define round(rm ,y,x,k) rm(y,x,k ,0); rm(y,x,k ,1); rm(y,x,k ,2); rm(y,x
,→,k ,3)
43

44 #if ( FUNCS_IN_C & ENCRYPTION_IN_C )


45
46 /* Visual C++ .Net v7.1 provides the fastest encryption code when
,→using
47 Pentium optimiation with small code but this is poor for
,→decryption
48 so we need to control this with the following VC++ pragmas
49 */
50
51 #if defined ( _MSC_VER ) && ! defined ( _WIN64 )
52 # pragma optimize ( "s", on )
53 # endif
54
55 /* Given the column (c) of the output state variable , the following
56 macros give the input state variables which are needed in its
57 computation for each row (r) of the state . All the alternative
58 macros give the same end values but expand into different ways
59 of calculating these values . In particular the complex macro
60 used for dynamically variable block sizes is designed to expand
61 to a compile time constant whenever possible but will expand to
62 conditional clauses on some branches (I am grateful to Frank
63 Yellin for this construction )
64 */
65

66 # define fwd_var (x,r,c)\


67 ( r == 0 ? ( c == 0 ? s(x ,0) : c == 1 ? s(x ,1) : c == 2 ? s(x ,2) : s
,→(x ,3))\
68 : r == 1 ? ( c == 0 ? s(x ,1) : c == 1 ? s(x ,2) : c == 2 ? s(x ,3) : s
,→(x ,0))\
69 : r == 2 ? ( c == 0 ? s(x ,2) : c == 1 ? s(x ,3) : c == 2 ? s(x ,0) : s
,→(x ,1))\

136
C.4 APPENDIX C. SOURCE CODE

70 : ( c == 0 ? s(x ,3) : c == 1 ? s(x ,0) : c == 2 ? s(x ,1) : s


,→(x ,2)))
71
72 #if defined ( FT4_SET )
73 # undef dec_fmvars
74 # define fwd_rnd (y,x,k,c) (s(y,c) = (k)[c] ^ four_tables (x,t_use(f,
,→n),fwd_var ,rf1 ,c))
75 #elif defined ( FT1_SET )
76 # undef dec_fmvars
77 # define fwd_rnd (y,x,k,c) (s(y,c) = (k)[c] ^ one_table (x,upr ,t_use(
,→f,n),fwd_var ,rf1 ,c))
78 #else
79 # define fwd_rnd (y,x,k,c) (s(y,c) = (k)[c] ^ fwd_mcol ( no_table (x,
,→t_use(s,box),fwd_var ,rf1 ,c)))
80 # endif
81
82 #if defined ( FL4_SET )
83 # define fwd_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ four_tables (x,t_use(f,
,→l),fwd_var ,rf1 ,c))
84 #elif defined ( FL1_SET )
85 # define fwd_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ one_table (x,ups ,t_use(
,→f,l),fwd_var ,rf1 ,c))
86 #else
87 # define fwd_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ no_table (x,t_use(s,box
,→),fwd_var ,rf1 ,c))
88 # endif
89
90 AES_RETURN aes_encrypt ( const unsigned char *in , unsigned char *out ,
,→const aes_encrypt_ctx cx [1])
91 { uint_32t locals (b0 , b1);
92 const uint_32t *kp;
93 #if defined ( dec_fmvars )
94 dec_fmvars ; /* declare variables for fwd_mcol () if needed */
95 # endif
96
97 if( cx ->inf.b[0] != 10 * 16 && cx ->inf.b[0] != 12 * 16 && cx ->inf
,→.b[0] != 14 * 16 )
98 return EXIT_FAILURE ;
99
100 kp = cx ->ks;
101 state_in (b0 , in , kp);
102

103 #if ( ENC_UNROLL == FULL)


104
105 switch (cx ->inf.b[0])
106 {
107 case 14 * 16:
108 round(fwd_rnd , b1 , b0 , kp + 1 * N_COLS );
109 round(fwd_rnd , b0 , b1 , kp + 2 * N_COLS );

137
C.4 APPENDIX C. SOURCE CODE

110 kp += 2 * N_COLS ;
111 case 12 * 16:
112 round(fwd_rnd , b1 , b0 , kp + 1 * N_COLS );
113 round(fwd_rnd , b0 , b1 , kp + 2 * N_COLS );
114 kp += 2 * N_COLS ;
115 case 10 * 16:
116 round(fwd_rnd , b1 , b0 , kp + 1 * N_COLS );
117 round(fwd_rnd , b0 , b1 , kp + 2 * N_COLS );
118 round(fwd_rnd , b1 , b0 , kp + 3 * N_COLS );
119 round(fwd_rnd , b0 , b1 , kp + 4 * N_COLS );
120 round(fwd_rnd , b1 , b0 , kp + 5 * N_COLS );
121 round(fwd_rnd , b0 , b1 , kp + 6 * N_COLS );
122 round(fwd_rnd , b1 , b0 , kp + 7 * N_COLS );
123 round(fwd_rnd , b0 , b1 , kp + 8 * N_COLS );
124 round(fwd_rnd , b1 , b0 , kp + 9 * N_COLS );
125 round(fwd_lrnd , b0 , b1 , kp +10 * N_COLS );
126 }
127

128 #else
129
130 #if ( ENC_UNROLL == PARTIAL )
131 { uint_32t rnd;
132 for(rnd = 0; rnd < (cx ->inf.b[0] >> 5) - 1; ++ rnd)
133 {
134 kp += N_COLS ;
135 round(fwd_rnd , b1 , b0 , kp);
136 kp += N_COLS ;
137 round(fwd_rnd , b0 , b1 , kp);
138 }
139 kp += N_COLS ;
140 round(fwd_rnd , b1 , b0 , kp);
141 #else
142 { uint_32t rnd;
143 for(rnd = 0; rnd < (cx ->inf.b[0] >> 4) - 1; ++ rnd)
144 {
145 kp += N_COLS ;
146 round(fwd_rnd , b1 , b0 , kp);
147 l_copy (b0 , b1);
148 }
149 # endif
150 kp += N_COLS ;
151 round(fwd_lrnd , b0 , b1 , kp);
152 }
153 # endif
154
155 state_out (out , b0);
156 return EXIT_SUCCESS ;
157 }
158

138
C.4 APPENDIX C. SOURCE CODE

159 # endif
160
161 #if ( FUNCS_IN_C & DECRYPTION_IN_C )
162
163 /* Visual C++ .Net v7.1 provides the fastest encryption code when
,→using
164 Pentium optimiation with small code but this is poor for
,→decryption
165 so we need to control this with the following VC++ pragmas
166 */
167
168 #if defined ( _MSC_VER ) && ! defined ( _WIN64 )
169 # pragma optimize ( "t", on )
170 # endif
171
172 /* Given the column (c) of the output state variable , the following
173 macros give the input state variables which are needed in its
174 computation for each row (r) of the state . All the alternative
175 macros give the same end values but expand into different ways
176 of calculating these values . In particular the complex macro
177 used for dynamically variable block sizes is designed to expand
178 to a compile time constant whenever possible but will expand to
179 conditional clauses on some branches (I am grateful to Frank
180 Yellin for this construction )
181 */
182
183 # define inv_var (x,r,c)\
184 ( r == 0 ? ( c == 0 ? s(x ,0) : c == 1 ? s(x ,1) : c == 2 ? s(x ,2) : s
,→(x ,3))\
185 : r == 1 ? ( c == 0 ? s(x ,3) : c == 1 ? s(x ,0) : c == 2 ? s(x ,1) : s
,→(x ,2))\
186 : r == 2 ? ( c == 0 ? s(x ,2) : c == 1 ? s(x ,3) : c == 2 ? s(x ,0) : s
,→(x ,1))\
187 : ( c == 0 ? s(x ,1) : c == 1 ? s(x ,2) : c == 2 ? s(x ,3) : s
,→(x ,0)))
188

189 #if defined ( IT4_SET )


190 # undef dec_imvars
191 # define inv_rnd (y,x,k,c) (s(y,c) = (k)[c] ^ four_tables (x,t_use(i,
,→n),inv_var ,rf1 ,c))
192 #elif defined ( IT1_SET )
193 # undef dec_imvars
194 # define inv_rnd (y,x,k,c) (s(y,c) = (k)[c] ^ one_table (x,upr ,t_use(
,→i,n),inv_var ,rf1 ,c))
195 #else
196 # define inv_rnd (y,x,k,c) (s(y,c) = inv_mcol ((k)[c] ^ no_table (x,
,→t_use(i,box),inv_var ,rf1 ,c)))
197 # endif
198

139
C.4 APPENDIX C. SOURCE CODE

199 #if defined ( IL4_SET )


200 # define inv_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ four_tables (x,t_use(i,
,→l),inv_var ,rf1 ,c))
201 #elif defined ( IL1_SET )
202 # define inv_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ one_table (x,ups ,t_use(
,→i,l),inv_var ,rf1 ,c))
203 #else
204 # define inv_lrnd (y,x,k,c) (s(y,c) = (k)[c] ^ no_table (x,t_use(i,box
,→),inv_var ,rf1 ,c))
205 # endif
206
207 /* This code can work with the decryption key schedule in the */
208 /* order that is used for encrytpion (where the 1st decryption */
209 /* round key is at the high end ot the schedule ) or with a key */
210 /* schedule that has been reversed to put the 1st decryption */
211 /* round key at the low end of the schedule in memory (when */
212 /* AES_REV_DKS is defined ) */
213

214 # ifdef AES_REV_DKS


215 # define key_ofs 0
216 # define rnd_key (n) (kp + n * N_COLS )
217 #else
218 # define key_ofs 1
219 # define rnd_key (n) (kp - n * N_COLS )
220 # endif
221
222 AES_RETURN aes_decrypt ( const unsigned char *in , unsigned char *out ,
,→const aes_decrypt_ctx cx [1])
223 { uint_32t locals (b0 , b1);
224 #if defined ( dec_imvars )
225 dec_imvars ; /* declare variables for inv_mcol () if needed */
226 # endif
227 const uint_32t *kp;
228
229 if( cx ->inf.b[0] != 10 * 16 && cx ->inf.b[0] != 12 * 16 && cx ->inf
,→.b[0] != 14 * 16 )
230 return EXIT_FAILURE ;
231
232 kp = cx ->ks + ( key_ofs ? (cx ->inf.b[0] >> 2) : 0);
233 state_in (b0 , in , kp);
234
235 #if ( DEC_UNROLL == FULL)
236
237 kp = cx ->ks + ( key_ofs ? 0 : (cx ->inf.b[0] >> 2));
238 switch (cx ->inf.b[0])
239 {
240 case 14 * 16:
241 round(inv_rnd , b1 , b0 , rnd_key ( -13));
242 round(inv_rnd , b0 , b1 , rnd_key ( -12));

140
C.4 APPENDIX C. SOURCE CODE

243 case 12 * 16:


244 round(inv_rnd , b1 , b0 , rnd_key ( -11));
245 round(inv_rnd , b0 , b1 , rnd_key ( -10));
246 case 10 * 16:
247 round(inv_rnd , b1 , b0 , rnd_key (-9));
248 round(inv_rnd , b0 , b1 , rnd_key (-8));
249 round(inv_rnd , b1 , b0 , rnd_key (-7));
250 round(inv_rnd , b0 , b1 , rnd_key (-6));
251 round(inv_rnd , b1 , b0 , rnd_key (-5));
252 round(inv_rnd , b0 , b1 , rnd_key (-4));
253 round(inv_rnd , b1 , b0 , rnd_key (-3));
254 round(inv_rnd , b0 , b1 , rnd_key (-2));
255 round(inv_rnd , b1 , b0 , rnd_key (-1));
256 round(inv_lrnd , b0 , b1 , rnd_key ( 0));
257 }
258
259 #else
260

261 #if ( DEC_UNROLL == PARTIAL )


262 { uint_32t rnd;
263 for(rnd = 0; rnd < (cx ->inf.b[0] >> 5) - 1; ++ rnd)
264 {
265 kp = rnd_key (1);
266 round(inv_rnd , b1 , b0 , kp);
267 kp = rnd_key (1);
268 round(inv_rnd , b0 , b1 , kp);
269 }
270 kp = rnd_key (1);
271 round(inv_rnd , b1 , b0 , kp);
272 #else
273 { uint_32t rnd;
274 for(rnd = 0; rnd < (cx ->inf.b[0] >> 4) - 1; ++ rnd)
275 {
276 kp = rnd_key (1);
277 round(inv_rnd , b1 , b0 , kp);
278 l_copy (b0 , b1);
279 }
280 # endif
281 kp = rnd_key (1);
282 round(inv_lrnd , b0 , b1 , kp);
283 }
284 # endif
285
286 state_out (out , b0);
287 return EXIT_SUCCESS ;
288 }
289
290 # endif
291

141
C.4 APPENDIX C. SOURCE CODE

292 #if defined ( __cplusplus )


293 }
294 # endif

src/Obf/aeskey.c
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19 */
20
21 # include " aesopt .h"
22 # include " aestab .h"
23
24 # ifdef USE_VIA_ACE_IF_PRESENT
25 # include " aes_via_ace .h"
26 # endif
27
28 #if defined ( __cplusplus )
29 extern "C"
30 {
31 # endif
32
33 /* Initialise the key schedule from the user supplied key. The key
34 length can be specified in bytes , with legal values of 16, 24
35 and 32, or in bits , with legal values of 128, 192 and 256. These
36 values correspond with Nk values of 4, 6 and 8 respectively .
37
38 The following macros implement a single cycle in the key
39 schedule generation process . The number of cycles needed

142
C.4 APPENDIX C. SOURCE CODE

40 for each cx ->n_col and nk value is:


41
42 nk = 4 5 6 7 8
43 ------------------------------
44 cx ->n_col = 4 10 9 8 7 7
45 cx ->n_col = 5 14 11 10 9 9
46 cx ->n_col = 6 19 15 12 11 11
47 cx ->n_col = 7 21 19 16 13 14
48 cx ->n_col = 8 29 23 19 17 14
49 */
50
51 #if defined ( REDUCE_CODE_SIZE )
52 # define ls_box ls_sub
53 uint_32t ls_sub (const uint_32t t, const uint_32t n);
54 # define inv_mcol im_sub
55 uint_32t im_sub (const uint_32t x);
56 # ifdef ENC_KS_UNROLL
57 # undef ENC_KS_UNROLL
58 # endif
59 # ifdef DEC_KS_UNROLL
60 # undef DEC_KS_UNROLL
61 # endif
62 # endif
63

64 #if ( FUNCS_IN_C & ENC_KEYING_IN_C )


65
66 #if defined ( AES_128 ) || defined ( AES_VAR )
67
68 # define ke4(k,i) \
69 { k[4*(i)+4] = ss [0] ^= ls_box (ss [3] ,3) ^ t_use(r,c)[i]; \
70 k[4*(i)+5] = ss [1] ^= ss [0]; \
71 k[4*(i)+6] = ss [2] ^= ss [1]; \
72 k[4*(i)+7] = ss [3] ^= ss [2]; \
73 }
74
75 AES_RETURN aes_encrypt_key128 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1])
76 { uint_32t ss [4];
77
78 cx ->ks [0] = ss [0] = word_in (key , 0);
79 cx ->ks [1] = ss [1] = word_in (key , 1);
80 cx ->ks [2] = ss [2] = word_in (key , 2);
81 cx ->ks [3] = ss [3] = word_in (key , 3);
82
83 # ifdef ENC_KS_UNROLL
84 ke4(cx ->ks , 0); ke4(cx ->ks , 1);
85 ke4(cx ->ks , 2); ke4(cx ->ks , 3);
86 ke4(cx ->ks , 4); ke4(cx ->ks , 5);
87 ke4(cx ->ks , 6); ke4(cx ->ks , 7);

143
C.4 APPENDIX C. SOURCE CODE

88 ke4(cx ->ks , 8);


89 #else
90 { uint_32t i;
91 for(i = 0; i < 9; ++i)
92 ke4(cx ->ks , i);
93 }
94 # endif
95 ke4(cx ->ks , 9);
96 cx ->inf.l = 0;
97 cx ->inf.b[0] = 10 * 16;
98
99 # ifdef USE_VIA_ACE_IF_PRESENT
100 if( VIA_ACE_AVAILABLE )
101 cx ->inf.b[1] = 0xff;
102 # endif
103 return EXIT_SUCCESS ;
104 }
105

106 # endif
107
108 #if defined ( AES_192 ) || defined ( AES_VAR )
109
110 # define kef6(k,i) \
111 { k[6*(i)+ 6] = ss [0] ^= ls_box (ss [5] ,3) ^ t_use(r,c)[i]; \
112 k[6*(i)+ 7] = ss [1] ^= ss [0]; \
113 k[6*(i)+ 8] = ss [2] ^= ss [1]; \
114 k[6*(i)+ 9] = ss [3] ^= ss [2]; \
115 }
116
117 # define ke6(k,i) \
118 { kef6(k,i); \
119 k[6*(i)+10] = ss [4] ^= ss [3]; \
120 k[6*(i)+11] = ss [5] ^= ss [4]; \
121 }
122
123 AES_RETURN aes_encrypt_key192 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1])
124 { uint_32t ss [6];
125
126 cx ->ks [0] = ss [0] = word_in (key , 0);
127 cx ->ks [1] = ss [1] = word_in (key , 1);
128 cx ->ks [2] = ss [2] = word_in (key , 2);
129 cx ->ks [3] = ss [3] = word_in (key , 3);
130 cx ->ks [4] = ss [4] = word_in (key , 4);
131 cx ->ks [5] = ss [5] = word_in (key , 5);
132
133 # ifdef ENC_KS_UNROLL
134 ke6(cx ->ks , 0); ke6(cx ->ks , 1);
135 ke6(cx ->ks , 2); ke6(cx ->ks , 3);

144
C.4 APPENDIX C. SOURCE CODE

136 ke6(cx ->ks , 4); ke6(cx ->ks , 5);


137 ke6(cx ->ks , 6);
138 #else
139 { uint_32t i;
140 for(i = 0; i < 7; ++i)
141 ke6(cx ->ks , i);
142 }
143 # endif
144 kef6(cx ->ks , 7);
145 cx ->inf.l = 0;
146 cx ->inf.b[0] = 12 * 16;
147

148 # ifdef USE_VIA_ACE_IF_PRESENT


149 if( VIA_ACE_AVAILABLE )
150 cx ->inf.b[1] = 0xff;
151 # endif
152 return EXIT_SUCCESS ;
153 }
154
155 # endif
156
157 #if defined ( AES_256 ) || defined ( AES_VAR )
158
159 # define kef8(k,i) \
160 { k[8*(i)+ 8] = ss [0] ^= ls_box (ss [7] ,3) ^ t_use(r,c)[i]; \
161 k[8*(i)+ 9] = ss [1] ^= ss [0]; \
162 k[8*(i)+10] = ss [2] ^= ss [1]; \
163 k[8*(i)+11] = ss [3] ^= ss [2]; \
164 }
165

166 # define ke8(k,i) \


167 { kef8(k,i); \
168 k[8*(i)+12] = ss [4] ^= ls_box (ss [3] ,0); \
169 k[8*(i)+13] = ss [5] ^= ss [4]; \
170 k[8*(i)+14] = ss [6] ^= ss [5]; \
171 k[8*(i)+15] = ss [7] ^= ss [6]; \
172 }
173
174 AES_RETURN aes_encrypt_key256 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1])
175 { uint_32t ss [8];
176

177 cx ->ks [0] = ss [0] = word_in (key , 0);


178 cx ->ks [1] = ss [1] = word_in (key , 1);
179 cx ->ks [2] = ss [2] = word_in (key , 2);
180 cx ->ks [3] = ss [3] = word_in (key , 3);
181 cx ->ks [4] = ss [4] = word_in (key , 4);
182 cx ->ks [5] = ss [5] = word_in (key , 5);
183 cx ->ks [6] = ss [6] = word_in (key , 6);

145
C.4 APPENDIX C. SOURCE CODE

184 cx ->ks [7] = ss [7] = word_in (key , 7);


185
186 # ifdef ENC_KS_UNROLL
187 ke8(cx ->ks , 0); ke8(cx ->ks , 1);
188 ke8(cx ->ks , 2); ke8(cx ->ks , 3);
189 ke8(cx ->ks , 4); ke8(cx ->ks , 5);
190 #else
191 { uint_32t i;
192 for(i = 0; i < 6; ++i)
193 ke8(cx ->ks , i);
194 }
195 # endif
196 kef8(cx ->ks , 6);
197 cx ->inf.l = 0;
198 cx ->inf.b[0] = 14 * 16;
199
200 # ifdef USE_VIA_ACE_IF_PRESENT
201 if( VIA_ACE_AVAILABLE )
202 cx ->inf.b[1] = 0xff;
203 # endif
204 return EXIT_SUCCESS ;
205 }
206
207 # endif
208
209 #if defined ( AES_VAR )
210
211 AES_RETURN aes_encrypt_key ( const unsigned char *key , int key_len ,
,→aes_encrypt_ctx cx [1])
212 {
213 switch ( key_len )
214 {
215 case 16: case 128: return aes_encrypt_key128 (key , cx);
216 case 24: case 192: return aes_encrypt_key192 (key , cx);
217 case 32: case 256: return aes_encrypt_key256 (key , cx);
218 default : return EXIT_FAILURE ;
219 }
220 }
221
222 # endif
223
224 # endif
225
226 #if ( FUNCS_IN_C & DEC_KEYING_IN_C )
227
228 /* this is used to store the decryption round keys */
229 /* in forward or reverse order */
230

231 # ifdef AES_REV_DKS

146
C.4 APPENDIX C. SOURCE CODE

232 # define v(n,i) ((n) - (i) + 2 * ((i) & 3))


233 #else
234 # define v(n,i) (i)
235 # endif
236
237 #if DEC_ROUND == NO_TABLES
238 # define ff(x) (x)
239 #else
240 # define ff(x) inv_mcol (x)
241 #if defined ( dec_imvars )
242 # define d_vars dec_imvars
243 # endif
244 # endif
245
246 #if defined ( AES_128 ) || defined ( AES_VAR )
247
248 # define k4e(k,i) \
249 { k[v(40 ,(4*(i))+4)] = ss [0] ^= ls_box (ss [3] ,3) ^ t_use(r,c)[i]; \
250 k[v(40 ,(4*(i))+5)] = ss [1] ^= ss [0]; \
251 k[v(40 ,(4*(i))+6)] = ss [2] ^= ss [1]; \
252 k[v(40 ,(4*(i))+7)] = ss [3] ^= ss [2]; \
253 }
254
255 #if 1
256
257 # define kdf4(k,i) \
258 { ss [0] = ss [0] ^ ss [2] ^ ss [1] ^ ss [3]; \
259 ss [1] = ss [1] ^ ss [3]; \
260 ss [2] = ss [2] ^ ss [3]; \
261 ss [4] = ls_box (ss[(i+3) % 4], 3) ^ t_use(r,c)[i]; \
262 ss[i % 4] ^= ss [4]; \
263 ss [4] ^= k[v(40 ,(4*(i)))]; k[v(40 ,(4*(i))+4)] = ff(ss [4]); \
264 ss [4] ^= k[v(40 ,(4*(i))+1) ]; k[v(40 ,(4*(i))+5)] = ff(ss [4]); \
265 ss [4] ^= k[v(40 ,(4*(i))+2) ]; k[v(40 ,(4*(i))+6)] = ff(ss [4]); \
266 ss [4] ^= k[v(40 ,(4*(i))+3) ]; k[v(40 ,(4*(i))+7)] = ff(ss [4]); \
267 }
268
269 # define kd4(k,i) \
270 { ss [4] = ls_box (ss[(i+3) % 4], 3) ^ t_use(r,c)[i]; \
271 ss[i % 4] ^= ss [4]; ss [4] = ff(ss [4]); \
272 k[v(40 ,(4*(i))+4)] = ss [4] ^= k[v(40 ,(4*(i)))]; \
273 k[v(40 ,(4*(i))+5)] = ss [4] ^= k[v(40 ,(4*(i))+1) ]; \
274 k[v(40 ,(4*(i))+6)] = ss [4] ^= k[v(40 ,(4*(i))+2) ]; \
275 k[v(40 ,(4*(i))+7)] = ss [4] ^= k[v(40 ,(4*(i))+3) ]; \
276 }
277
278 # define kdl4(k,i) \
279 { ss [4] = ls_box (ss[(i+3) % 4], 3) ^ t_use(r,c)[i]; ss[i % 4] ^= ss
,→[4]; \

147
C.4 APPENDIX C. SOURCE CODE

280 k[v(40 ,(4*(i))+4)] = (ss [0] ^= ss [1]) ^ ss [2] ^ ss [3]; \


281 k[v(40 ,(4*(i))+5)] = ss [1] ^ ss [3]; \
282 k[v(40 ,(4*(i))+6)] = ss [0]; \
283 k[v(40 ,(4*(i))+7)] = ss [1]; \
284 }
285

286 #else
287
288 # define kdf4(k,i) \
289 { ss [0] ^= ls_box (ss [3] ,3) ^ t_use(r,c)[i]; k[v(40 ,(4*(i))+ 4)] =
,→ff(ss [0]); \
290 ss [1] ^= ss [0]; k[v(40 ,(4*(i))+ 5)] = ff(ss [1]); \
291 ss [2] ^= ss [1]; k[v(40 ,(4*(i))+ 6)] = ff(ss [2]); \
292 ss [3] ^= ss [2]; k[v(40 ,(4*(i))+ 7)] = ff(ss [3]); \
293 }
294
295 # define kd4(k,i) \
296 { ss [4] = ls_box (ss [3] ,3) ^ t_use(r,c)[i]; \
297 ss [0] ^= ss [4]; ss [4] = ff(ss [4]); k[v(40 ,(4*(i))+ 4)] = ss [4] ^=
,→ k[v(40 ,(4*(i)))]; \
298 ss [1] ^= ss [0]; k[v(40 ,(4*(i))+ 5)] = ss [4] ^= k[v(40 ,(4*(i))+ 1)
,→]; \
299 ss [2] ^= ss [1]; k[v(40 ,(4*(i))+ 6)] = ss [4] ^= k[v(40 ,(4*(i))+ 2)
,→]; \
300 ss [3] ^= ss [2]; k[v(40 ,(4*(i))+ 7)] = ss [4] ^= k[v(40 ,(4*(i))+ 3)
,→]; \
301 }
302
303 # define kdl4(k,i) \
304 { ss [0] ^= ls_box (ss [3] ,3) ^ t_use(r,c)[i]; k[v(40 ,(4*(i))+ 4)] =
,→ss [0]; \
305 ss [1] ^= ss [0]; k[v(40 ,(4*(i))+ 5)] = ss [1]; \
306 ss [2] ^= ss [1]; k[v(40 ,(4*(i))+ 6)] = ss [2]; \
307 ss [3] ^= ss [2]; k[v(40 ,(4*(i))+ 7)] = ss [3]; \
308 }
309

310 # endif
311
312 AES_RETURN aes_decrypt_key128 ( const unsigned char *key ,
,→aes_decrypt_ctx cx [1])
313 { uint_32t ss [5];
314 #if defined ( d_vars )
315 d_vars ;
316 # endif
317 cx ->ks[v(40 ,(0))] = ss [0] = word_in (key , 0);
318 cx ->ks[v(40 ,(1))] = ss [1] = word_in (key , 1);
319 cx ->ks[v(40 ,(2))] = ss [2] = word_in (key , 2);
320 cx ->ks[v(40 ,(3))] = ss [3] = word_in (key , 3);
321

148
C.4 APPENDIX C. SOURCE CODE

322 # ifdef DEC_KS_UNROLL


323 kdf4(cx ->ks , 0); kd4(cx ->ks , 1);
324 kd4(cx ->ks , 2); kd4(cx ->ks , 3);
325 kd4(cx ->ks , 4); kd4(cx ->ks , 5);
326 kd4(cx ->ks , 6); kd4(cx ->ks , 7);
327 kd4(cx ->ks , 8); kdl4(cx ->ks , 9);
328 #else
329 { uint_32t i;
330 for(i = 0; i < 10; ++i)
331 k4e(cx ->ks , i);
332 #if !( DEC_ROUND == NO_TABLES )
333 for(i = N_COLS ; i < 10 * N_COLS ; ++i)
334 cx ->ks[i] = inv_mcol (cx ->ks[i]);
335 # endif
336 }
337 # endif
338 cx ->inf.l = 0;
339 cx ->inf.b[0] = 10 * 16;
340
341 # ifdef USE_VIA_ACE_IF_PRESENT
342 if( VIA_ACE_AVAILABLE )
343 cx ->inf.b[1] = 0xff;
344 # endif
345 return EXIT_SUCCESS ;
346 }
347
348 # endif
349
350 #if defined ( AES_192 ) || defined ( AES_VAR )
351

352 # define k6ef(k,i) \


353 { k[v(48 ,(6*(i))+ 6)] = ss [0] ^= ls_box (ss [5] ,3) ^ t_use(r,c)[i]; \
354 k[v(48 ,(6*(i))+ 7)] = ss [1] ^= ss [0]; \
355 k[v(48 ,(6*(i))+ 8)] = ss [2] ^= ss [1]; \
356 k[v(48 ,(6*(i))+ 9)] = ss [3] ^= ss [2]; \
357 }
358
359 # define k6e(k,i) \
360 { k6ef(k,i); \
361 k[v(48 ,(6*(i))+10)] = ss [4] ^= ss [3]; \
362 k[v(48 ,(6*(i))+11)] = ss [5] ^= ss [4]; \
363 }
364
365 # define kdf6(k,i) \
366 { ss [0] ^= ls_box (ss [5] ,3) ^ t_use(r,c)[i]; k[v(48 ,(6*(i))+ 6)] =
,→ff(ss [0]); \
367 ss [1] ^= ss [0]; k[v(48 ,(6*(i))+ 7)] = ff(ss [1]); \
368 ss [2] ^= ss [1]; k[v(48 ,(6*(i))+ 8)] = ff(ss [2]); \
369 ss [3] ^= ss [2]; k[v(48 ,(6*(i))+ 9)] = ff(ss [3]); \

149
C.4 APPENDIX C. SOURCE CODE

370 ss [4] ^= ss [3]; k[v(48 ,(6*(i))+10)] = ff(ss [4]); \


371 ss [5] ^= ss [4]; k[v(48 ,(6*(i))+11)] = ff(ss [5]); \
372 }
373
374 # define kd6(k,i) \
375 { ss [6] = ls_box (ss [5] ,3) ^ t_use(r,c)[i]; \
376 ss [0] ^= ss [6]; ss [6] = ff(ss [6]); k[v(48 ,(6*(i))+ 6)] = ss [6] ^=
,→ k[v(48 ,(6*(i)))]; \
377 ss [1] ^= ss [0]; k[v(48 ,(6*(i))+ 7)] = ss [6] ^= k[v(48 ,(6*(i))+ 1)
,→]; \
378 ss [2] ^= ss [1]; k[v(48 ,(6*(i))+ 8)] = ss [6] ^= k[v(48 ,(6*(i))+ 2)
,→]; \
379 ss [3] ^= ss [2]; k[v(48 ,(6*(i))+ 9)] = ss [6] ^= k[v(48 ,(6*(i))+ 3)
,→]; \
380 ss [4] ^= ss [3]; k[v(48 ,(6*(i))+10)] = ss [6] ^= k[v(48 ,(6*(i))+ 4)
,→]; \
381 ss [5] ^= ss [4]; k[v(48 ,(6*(i))+11)] = ss [6] ^= k[v(48 ,(6*(i))+ 5)
,→]; \
382 }
383
384 # define kdl6(k,i) \
385 { ss [0] ^= ls_box (ss [5] ,3) ^ t_use(r,c)[i]; k[v(48 ,(6*(i))+ 6)] =
,→ss [0]; \
386 ss [1] ^= ss [0]; k[v(48 ,(6*(i))+ 7)] = ss [1]; \
387 ss [2] ^= ss [1]; k[v(48 ,(6*(i))+ 8)] = ss [2]; \
388 ss [3] ^= ss [2]; k[v(48 ,(6*(i))+ 9)] = ss [3]; \
389 }
390
391 AES_RETURN aes_decrypt_key192 ( const unsigned char *key ,
,→aes_decrypt_ctx cx [1])
392 { uint_32t ss [7];
393 #if defined ( d_vars )
394 d_vars ;
395 # endif
396 cx ->ks[v(48 ,(0))] = ss [0] = word_in (key , 0);
397 cx ->ks[v(48 ,(1))] = ss [1] = word_in (key , 1);
398 cx ->ks[v(48 ,(2))] = ss [2] = word_in (key , 2);
399 cx ->ks[v(48 ,(3))] = ss [3] = word_in (key , 3);
400
401 # ifdef DEC_KS_UNROLL
402 cx ->ks[v(48 ,(4))] = ff(ss [4] = word_in (key , 4));
403 cx ->ks[v(48 ,(5))] = ff(ss [5] = word_in (key , 5));
404 kdf6(cx ->ks , 0); kd6(cx ->ks , 1);
405 kd6(cx ->ks , 2); kd6(cx ->ks , 3);
406 kd6(cx ->ks , 4); kd6(cx ->ks , 5);
407 kd6(cx ->ks , 6); kdl6(cx ->ks , 7);
408 #else
409 cx ->ks[v(48 ,(4))] = ss [4] = word_in (key , 4);
410 cx ->ks[v(48 ,(5))] = ss [5] = word_in (key , 5);

150
C.4 APPENDIX C. SOURCE CODE

411 { uint_32t i;
412
413 for(i = 0; i < 7; ++i)
414 k6e(cx ->ks , i);
415 k6ef(cx ->ks , 7);
416 #if !( DEC_ROUND == NO_TABLES )
417 for(i = N_COLS ; i < 12 * N_COLS ; ++i)
418 cx ->ks[i] = inv_mcol (cx ->ks[i]);
419 # endif
420 }
421 # endif
422 cx ->inf.l = 0;
423 cx ->inf.b[0] = 12 * 16;
424
425 # ifdef USE_VIA_ACE_IF_PRESENT
426 if( VIA_ACE_AVAILABLE )
427 cx ->inf.b[1] = 0xff;
428 # endif
429 return EXIT_SUCCESS ;
430 }
431
432 # endif
433
434 #if defined ( AES_256 ) || defined ( AES_VAR )
435
436 # define k8ef(k,i) \
437 { k[v(56 ,(8*(i))+ 8)] = ss [0] ^= ls_box (ss [7] ,3) ^ t_use(r,c)[i]; \
438 k[v(56 ,(8*(i))+ 9)] = ss [1] ^= ss [0]; \
439 k[v(56 ,(8*(i))+10)] = ss [2] ^= ss [1]; \
440 k[v(56 ,(8*(i))+11)] = ss [3] ^= ss [2]; \
441 }
442
443 # define k8e(k,i) \
444 { k8ef(k,i); \
445 k[v(56 ,(8*(i))+12)] = ss [4] ^= ls_box (ss [3] ,0); \
446 k[v(56 ,(8*(i))+13)] = ss [5] ^= ss [4]; \
447 k[v(56 ,(8*(i))+14)] = ss [6] ^= ss [5]; \
448 k[v(56 ,(8*(i))+15)] = ss [7] ^= ss [6]; \
449 }
450
451 # define kdf8(k,i) \
452 { ss [0] ^= ls_box (ss [7] ,3) ^ t_use(r,c)[i]; k[v(56 ,(8*(i))+ 8)] =
,→ff(ss [0]); \
453 ss [1] ^= ss [0]; k[v(56 ,(8*(i))+ 9)] = ff(ss [1]); \
454 ss [2] ^= ss [1]; k[v(56 ,(8*(i))+10)] = ff(ss [2]); \
455 ss [3] ^= ss [2]; k[v(56 ,(8*(i))+11)] = ff(ss [3]); \
456 ss [4] ^= ls_box (ss [3] ,0); k[v(56 ,(8*(i))+12)] = ff(ss [4]); \
457 ss [5] ^= ss [4]; k[v(56 ,(8*(i))+13)] = ff(ss [5]); \
458 ss [6] ^= ss [5]; k[v(56 ,(8*(i))+14)] = ff(ss [6]); \

151
C.4 APPENDIX C. SOURCE CODE

459 ss [7] ^= ss [6]; k[v(56 ,(8*(i))+15)] = ff(ss [7]); \


460 }
461
462 # define kd8(k,i) \
463 { ss [8] = ls_box (ss [7] ,3) ^ t_use(r,c)[i]; \
464 ss [0] ^= ss [8]; ss [8] = ff(ss [8]); k[v(56 ,(8*(i))+ 8)] = ss [8] ^=
,→ k[v(56 ,(8*(i)))]; \
465 ss [1] ^= ss [0]; k[v(56 ,(8*(i))+ 9)] = ss [8] ^= k[v(56 ,(8*(i))+ 1)
,→]; \
466 ss [2] ^= ss [1]; k[v(56 ,(8*(i))+10)] = ss [8] ^= k[v(56 ,(8*(i))+ 2)
,→]; \
467 ss [3] ^= ss [2]; k[v(56 ,(8*(i))+11)] = ss [8] ^= k[v(56 ,(8*(i))+ 3)
,→]; \
468 ss [8] = ls_box (ss [3] ,0); \
469 ss [4] ^= ss [8]; ss [8] = ff(ss [8]); k[v(56 ,(8*(i))+12)] = ss [8] ^=
,→ k[v(56 ,(8*(i))+ 4)]; \
470 ss [5] ^= ss [4]; k[v(56 ,(8*(i))+13)] = ss [8] ^= k[v(56 ,(8*(i))+ 5)
,→]; \
471 ss [6] ^= ss [5]; k[v(56 ,(8*(i))+14)] = ss [8] ^= k[v(56 ,(8*(i))+ 6)
,→]; \
472 ss [7] ^= ss [6]; k[v(56 ,(8*(i))+15)] = ss [8] ^= k[v(56 ,(8*(i))+ 7)
,→]; \
473 }
474

475 # define kdl8(k,i) \


476 { ss [0] ^= ls_box (ss [7] ,3) ^ t_use(r,c)[i]; k[v(56 ,(8*(i))+ 8)] =
,→ss [0]; \
477 ss [1] ^= ss [0]; k[v(56 ,(8*(i))+ 9)] = ss [1]; \
478 ss [2] ^= ss [1]; k[v(56 ,(8*(i))+10)] = ss [2]; \
479 ss [3] ^= ss [2]; k[v(56 ,(8*(i))+11)] = ss [3]; \
480 }
481
482 AES_RETURN aes_decrypt_key256 ( const unsigned char *key ,
,→aes_decrypt_ctx cx [1])
483 { uint_32t ss [9];
484 #if defined ( d_vars )
485 d_vars ;
486 # endif
487 cx ->ks[v(56 ,(0))] = ss [0] = word_in (key , 0);
488 cx ->ks[v(56 ,(1))] = ss [1] = word_in (key , 1);
489 cx ->ks[v(56 ,(2))] = ss [2] = word_in (key , 2);
490 cx ->ks[v(56 ,(3))] = ss [3] = word_in (key , 3);
491
492 # ifdef DEC_KS_UNROLL
493 cx ->ks[v(56 ,(4))] = ff(ss [4] = word_in (key , 4));
494 cx ->ks[v(56 ,(5))] = ff(ss [5] = word_in (key , 5));
495 cx ->ks[v(56 ,(6))] = ff(ss [6] = word_in (key , 6));
496 cx ->ks[v(56 ,(7))] = ff(ss [7] = word_in (key , 7));
497 kdf8(cx ->ks , 0); kd8(cx ->ks , 1);

152
C.4 APPENDIX C. SOURCE CODE

498 kd8(cx ->ks , 2); kd8(cx ->ks , 3);


499 kd8(cx ->ks , 4); kd8(cx ->ks , 5);
500 kdl8(cx ->ks , 6);
501 #else
502 cx ->ks[v(56 ,(4))] = ss [4] = word_in (key , 4);
503 cx ->ks[v(56 ,(5))] = ss [5] = word_in (key , 5);
504 cx ->ks[v(56 ,(6))] = ss [6] = word_in (key , 6);
505 cx ->ks[v(56 ,(7))] = ss [7] = word_in (key , 7);
506 { uint_32t i;
507
508 for(i = 0; i < 6; ++i)
509 k8e(cx ->ks , i);
510 k8ef(cx ->ks , 6);
511 #if !( DEC_ROUND == NO_TABLES )
512 for(i = N_COLS ; i < 14 * N_COLS ; ++i)
513 cx ->ks[i] = inv_mcol (cx ->ks[i]);
514 # endif
515 }
516 # endif
517 cx ->inf.l = 0;
518 cx ->inf.b[0] = 14 * 16;
519
520 # ifdef USE_VIA_ACE_IF_PRESENT
521 if( VIA_ACE_AVAILABLE )
522 cx ->inf.b[1] = 0xff;
523 # endif
524 return EXIT_SUCCESS ;
525 }
526
527 # endif
528
529 #if defined ( AES_VAR )
530
531 AES_RETURN aes_decrypt_key ( const unsigned char *key , int key_len ,
,→aes_decrypt_ctx cx [1])
532 {
533 switch ( key_len )
534 {
535 case 16: case 128: return aes_decrypt_key128 (key , cx);
536 case 24: case 192: return aes_decrypt_key192 (key , cx);
537 case 32: case 256: return aes_decrypt_key256 (key , cx);
538 default : return EXIT_FAILURE ;
539 }
540 }
541
542 # endif
543
544 # endif
545

153
C.4 APPENDIX C. SOURCE CODE

546 #if defined ( __cplusplus )


547 }
548 # endif

src/Obf/aesopt.h
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19
20 This file contains the compilation options for AES ( Rijndael ) and
,→code
21 that is common across encryption , key scheduling and table
,→generation .
22

23 OPERATION
24
25 These source code files implement the AES algorithm Rijndael
,→designed by
26 Joan Daemen and Vincent Rijmen . This version is designed for the
,→standard
27 block size of 16 bytes and for key sizes of 128, 192 and 256 bits
,→(16, 24
28 and 32 bytes ).
29
30 This version is designed for flexibility and speed using operations
,→on
31 32- bit words rather than operations on bytes . It can be compiled
,→with
32 either big or little endian internal byte order but is faster when

154
C.4 APPENDIX C. SOURCE CODE

,→the
33 native byte order for the processor is used.
34
35 THE CIPHER INTERFACE
36
37 The cipher interface is implemented as an array of bytes in which
,→lower
38 AES bit sequence indexes map to higher numeric significance within
,→bytes .
39
40 uint_8t (an unsigned 8-bit type)
41 uint_32t (an unsigned 32- bit type)
42 struct aes_encrypt_ctx ( structure for the cipher encryption
,→context )
43 struct aes_decrypt_ctx ( structure for the cipher decryption
,→context )
44 AES_RETURN the function return type
45

46 C subroutine calls :
47
48 AES_RETURN aes_encrypt_key128 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1]);
49 AES_RETURN aes_encrypt_key192 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1]);
50 AES_RETURN aes_encrypt_key256 ( const unsigned char *key ,
,→aes_encrypt_ctx cx [1]);
51 AES_RETURN aes_encrypt ( const unsigned char *in , unsigned char *out ,
52 const
,→aes_encrypt_ctx cx [1]);
53

54 AES_RETURN aes_decrypt_key128 ( const unsigned char *key ,


,→aes_decrypt_ctx cx [1]);
55 AES_RETURN aes_decrypt_key192 ( const unsigned char *key ,
,→aes_decrypt_ctx cx [1]);
56 AES_RETURN aes_decrypt_key256 ( const unsigned char *key ,
,→aes_decrypt_ctx cx [1]);
57 AES_RETURN aes_decrypt ( const unsigned char *in , unsigned char *out ,
58 const
,→aes_decrypt_ctx cx [1]);
59
60 IMPORTANT NOTE: If you are using this C interface with dynamic
,→tables make sure that
61 you call aes_init () before AES is used so that the tables are
,→initialised .
62
63 C++ aes class subroutines :
64
65 Class AESencrypt for encryption
66

155
C.4 APPENDIX C. SOURCE CODE

67 Construtors :
68 AESencrypt (void)
69 AESencrypt (const unsigned char *key) - 128 bit key
70 Members :
71 AES_RETURN key128 ( const unsigned char *key)
72 AES_RETURN key192 ( const unsigned char *key)
73 AES_RETURN key256 ( const unsigned char *key)
74 AES_RETURN encrypt ( const unsigned char *in , unsigned char *
,→out) const
75
76 Class AESdecrypt for encryption
77 Construtors :
78 AESdecrypt (void)
79 AESdecrypt (const unsigned char *key) - 128 bit key
80 Members :
81 AES_RETURN key128 ( const unsigned char *key)
82 AES_RETURN key192 ( const unsigned char *key)
83 AES_RETURN key256 ( const unsigned char *key)
84 AES_RETURN decrypt ( const unsigned char *in , unsigned char *
,→out) const
85 */
86
87 #if ! defined ( _AESOPT_H )
88 # define _AESOPT_H
89
90 #if defined ( __cplusplus )
91 # include " aescpp .h"
92 #else
93 # include "aes.h"
94 # endif
95
96 /* PLATFORM SPECIFIC INCLUDES */
97
98 # include " brg_endian .h"
99
100 /* CONFIGURATION - THE USE OF DEFINES
101
102 Later in this section there are a number of defines that control
,→the
103 operation of the code. In each section , the purpose of each
,→define is
104 explained so that the relevant form can be included or excluded
,→by
105 setting either 1's or 0's respectively on the branches of the
,→related
106 #if clauses . The following local defines should not be changed .
107 */
108

109 # define ENCRYPTION_IN_C 1

156
C.4 APPENDIX C. SOURCE CODE

110 # define DECRYPTION_IN_C 2


111 # define ENC_KEYING_IN_C 4
112 # define DEC_KEYING_IN_C 8
113
114 # define NO_TABLES 0
115 # define ONE_TABLE 1
116 # define FOUR_TABLES 4
117 # define NONE 0
118 # define PARTIAL 1
119 # define FULL 2
120
121 /* --- START OF USER CONFIGURED OPTIONS --- */
122
123 /* 1. BYTE ORDER WITHIN 32 BIT WORDS
124
125 The fundamental data processing units in Rijndael are 8-bit bytes
,→. The
126 input , output and key input are all enumerated arrays of bytes in
,→ which
127 bytes are numbered starting at zero and increasing to one less
,→than the
128 number of bytes in the array in question . This enumeration is
,→only used
129 for naming bytes and does not imply any adjacency or order
,→relationship
130 from one byte to another . When these inputs and outputs are
,→considered
131 as bit sequences , bits 8*n to 8*n+7 of the bit sequence are
,→mapped to
132 byte[n] with bit 8n+i in the sequence mapped to bit 7-i within
,→the byte.
133 In this implementation bits are numbered from 0 to 7 starting at
,→the
134 numerically least significant end of each byte (bit n represents
,→2^n).
135

136 However , Rijndael can be implemented more efficiently using 32-


,→bit
137 words by packing bytes into words so that bytes 4*n to 4*n+3 are
,→placed
138 into word[n]. While in principle these bytes can be assembled
,→into words
139 in any positions , this implementation only supports the two
,→formats in
140 which bytes in adjacent positions within words also have adjacent
,→ byte
141 numbers . This order is called big - endian if the lowest numbered
,→bytes
142 in words have the highest numeric significance and little - endian

157
C.4 APPENDIX C. SOURCE CODE

,→if the
143 opposite applies .
144
145 This code can work in either order irrespective of the order used
,→ by the
146 machine on which it runs. Normally the internal byte order will
,→be set
147 to the order of the processor on which the code is to be run but
,→this
148 define can be used to reverse this in special situations
149
150 WARNING : Assembler code versions rely on PLATFORM_BYTE_ORDER
,→being set.
151 This define will hence be redefined later (in section 4) if
,→necessary
152 */
153
154 #if 1
155 # define ALGORITHM_BYTE_ORDER PLATFORM_BYTE_ORDER
156 #elif 0
157 # define ALGORITHM_BYTE_ORDER IS_LITTLE_ENDIAN
158 #elif 0
159 # define ALGORITHM_BYTE_ORDER IS_BIG_ENDIAN
160 #else
161 # error The algorithm byte order is not defined
162 # endif
163
164 /* 2. VIA ACE SUPPORT */
165
166 #if defined ( __GNUC__ ) && defined ( __i386__ ) \
167 || defined ( _WIN32 ) && defined ( _M_IX86 ) \
168 && !( defined ( _WIN64 ) || defined ( _WIN32_WCE ) || defined ( _MSC_VER
,→ ) && ( _MSC_VER <= 800 ))
169 # define VIA_ACE_POSSIBLE
170 # endif
171

172 /* Define this option if support for the VIA ACE is required . This
,→uses
173 inline assembler instructions and is only implemented for the
,→Microsoft ,
174 Intel and GCC compilers . If VIA ACE is known to be present , then
,→ defining
175 ASSUME_VIA_ACE_PRESENT will remove the ordinary encryption /
,→decryption
176 code. If USE_VIA_ACE_IF_PRESENT is defined then VIA ACE will be
,→used if
177 it is detected (both present and enabled ) but the normal AES code
,→ will
178 also be present .

158
C.4 APPENDIX C. SOURCE CODE

179
180 When VIA ACE is to be used , all AES encryption contexts MUST be
,→16 byte
181 aligned ; other input / output buffers do not need to be 16 byte
,→aligned
182 but there are very large performance gains if this can be
,→arranged .
183 VIA ACE also requires the decryption key schedule to be in
,→reverse
184 order ( which later checks below ensure ).
185 */
186

187 #if 1 && defined ( VIA_ACE_POSSIBLE ) && ! defined (


,→USE_VIA_ACE_IF_PRESENT )
188 # define USE_VIA_ACE_IF_PRESENT
189 # endif
190
191 #if 0 && defined ( VIA_ACE_POSSIBLE ) && ! defined (
,→ASSUME_VIA_ACE_PRESENT )
192 # define ASSUME_VIA_ACE_PRESENT
193 # endif
194
195 /* 3. ASSEMBLER SUPPORT
196

197 This define ( which can be on the command line) enables the use of
,→ the
198 assembler code routines for encryption , decryption and key
,→scheduling
199 as follows :
200

201 ASM_X86_V1C uses the assembler ( aes_x86_v1 .asm) with large tables
,→ for
202 encryption and decryption and but with key scheduling
,→ in C
203 ASM_X86_V2 uses assembler ( aes_x86_v2 .asm) with compressed
,→tables for
204 encryption , decryption and key scheduling
205 ASM_X86_V2C uses assembler ( aes_x86_v2 .asm) with compressed
,→tables for
206 encryption and decryption and but with key scheduling
,→ in C
207 ASM_AMD64_C uses assembler ( aes_amd64 .asm) with compressed tables
,→ for
208 encryption and decryption and but with key scheduling
,→ in C
209
210 Change one 'if 0' below to 'if 1' to select the version or define
211 as a compilation option .
212 */

159
C.4 APPENDIX C. SOURCE CODE

213
214 #if 0 && ! defined ( ASM_X86_V1C )
215 # define ASM_X86_V1C
216 #elif 0 && ! defined ( ASM_X86_V2 )
217 # define ASM_X86_V2
218 #elif 0 && ! defined ( ASM_X86_V2C )
219 # define ASM_X86_V2C
220 #elif 0 && ! defined ( ASM_AMD64_C )
221 # define ASM_AMD64_C
222 # endif
223
224 #if ( defined ( ASM_X86_V1C ) || defined ( ASM_X86_V2 ) || defined (
,→ASM_X86_V2C )) \
225 && ! defined ( _M_IX86 ) || defined ( ASM_AMD64_C ) && ! defined (
,→_M_X64 )
226 # error Assembler code is only available for x86 and AMD64 systems
227 # endif
228

229 /* 4. FAST INPUT / OUTPUT OPERATIONS .


230
231 On some machines it is possible to improve speed by transferring
,→the
232 bytes in the input and output arrays to and from the internal 32-
,→bit
233 variables by addressing these arrays as if they are arrays of 32-
,→bit
234 words . On some machines this will always be possible but there
,→may
235 be a large performance penalty if the byte arrays are not aligned
,→ on
236 the normal word boundaries . On other machines this technique will
237 lead to memory access errors when such 32- bit word accesses are
,→not
238 properly aligned . The option SAFE_IO avoids such problems but
,→will
239 often be slower on those machines that support misaligned access
240 ( especially so if care is taken to align the input and output
,→byte
241 arrays on 32- bit word boundaries ). If SAFE_IO is not defined it
,→is
242 assumed that access to byte arrays as if they are arrays of 32-
,→bit
243 words will not cause problems when such accesses are misaligned .
244 */
245 #if 1 && ! defined ( _MSC_VER )
246 # define SAFE_IO
247 # endif
248

249 /* 5. LOOP UNROLLING

160
C.4 APPENDIX C. SOURCE CODE

250
251 The code for encryption and decrytpion cycles through a number of
,→ rounds
252 that can be implemented either in a loop or by expanding the code
,→ into a
253 long sequence of instructions , the latter producing a larger
,→program but
254 one that will often be much faster . The latter is called loop
,→unrolling .
255 There are also potential speed advantages in expanding two
,→iterations in
256 a loop with half the number of iterations , which is called
,→partial loop
257 unrolling . The following options allow partial or full loop
,→unrolling
258 to be set independently for encryption and decryption
259 */
260 #if 1
261 # define ENC_UNROLL FULL
262 #elif 0
263 # define ENC_UNROLL PARTIAL
264 #else
265 # define ENC_UNROLL NONE
266 # endif
267
268 #if 1
269 # define DEC_UNROLL FULL
270 #elif 0
271 # define DEC_UNROLL PARTIAL
272 #else
273 # define DEC_UNROLL NONE
274 # endif
275
276 #if 1
277 # define ENC_KS_UNROLL
278 # endif
279
280 #if 1
281 # define DEC_KS_UNROLL
282 # endif
283
284 /* 6. FAST FINITE FIELD OPERATIONS
285
286 If this section is included , tables are used to provide faster
,→finite
287 field arithmetic (this has no effect if FIXED_TABLES is defined ).
288 */
289 #if 1
290 # define FF_TABLES

161
C.4 APPENDIX C. SOURCE CODE

291 # endif
292
293 /* 7. INTERNAL STATE VARIABLE FORMAT
294
295 The internal state of Rijndael is stored in a number of local 32-
,→bit
296 word varaibles which can be defined either as an array or as
,→individual
297 names variables . Include this section if you want to store these
,→local
298 varaibles in arrays . Otherwise individual local variables will be
,→ used.
299 */
300 #if 1
301 # define ARRAYS
302 # endif
303
304 /* 8. FIXED OR DYNAMIC TABLES
305
306 When this section is included the tables used by the code are
,→compiled
307 statically into the binary file. Otherwise the subroutine
,→aes_init ()
308 must be called to compute them before the code is first used.
309 */
310 #if 1 && !( defined ( _MSC_VER ) && ( _MSC_VER <= 800 ))
311 # define FIXED_TABLES
312 # endif
313
314 /* 9. MASKING OR CASTING FROM LONGER VALUES TO BYTES
315
316 In some systems it is better to mask longer values to extract
,→bytes
317 rather than using a cast. This option allows this choice .
318 */
319 #if 0
320 # define to_byte (x) (( uint_8t )(x))
321 #else
322 # define to_byte (x) ((x) & 0xff)
323 # endif
324
325 /* 10. TABLE ALIGNMENT
326
327 On some sytsems speed will be improved by aligning the AES large
,→lookup
328 tables on particular boundaries . This define should be set to a
,→power of
329 two giving the desired alignment . It can be left undefined if
,→alignment

162
C.4 APPENDIX C. SOURCE CODE

330 is not needed . This option is specific to the Microsft VC++


,→compiler -
331 it seems to sometimes cause trouble for the VC ++ version 6
,→compiler .
332 */
333

334 #if 1 && defined ( _MSC_VER ) && ( _MSC_VER >= 1300 )


335 # define TABLE_ALIGN 32
336 # endif
337
338 /* 11. REDUCE CODE AND TABLE SIZE
339

340 This replaces some expanded macros with function calls if


,→AES_ASM_V2 or
341 AES_ASM_V2C are defined
342 */
343
344 #if 1 && ( defined ( ASM_X86_V2 ) || defined ( ASM_X86_V2C ))
345 # define REDUCE_CODE_SIZE
346 # endif
347
348 /* 12. TABLE OPTIONS
349
350 This cipher proceeds by repeating in a number of cycles known as
,→'rounds '
351 which are implemented by a round function which can optionally be
,→ speeded
352 up using tables . The basic tables are each 256 32- bit words ,
,→with either
353 one or four tables being required for each round function
,→depending on
354 how much speed is required . The encryption and decryption round
,→functions
355 are different and the last encryption and decrytpion round
,→functions are
356 different again making four different round functions in all.
357
358 This means that:
359 1. Normal encryption and decryption rounds can each use either
,→0, 1
360 or 4 tables and table spaces of 0, 1024 or 4096 bytes each.
361 2. The last encryption and decryption rounds can also use
,→either 0, 1
362 or 4 tables and table spaces of 0, 1024 or 4096 bytes each.
363
364 Include or exclude the appropriate definitions below to set the
,→number
365 of tables used by this implementation .
366 */

163
C.4 APPENDIX C. SOURCE CODE

367
368 #if 1 /* set tables for the normal encryption round */
369 # define ENC_ROUND FOUR_TABLES
370 #elif 0
371 # define ENC_ROUND ONE_TABLE
372 #else
373 # define ENC_ROUND NO_TABLES
374 # endif
375
376 #if 1 /* set tables for the last encryption round */
377 # define LAST_ENC_ROUND FOUR_TABLES
378 #elif 0
379 # define LAST_ENC_ROUND ONE_TABLE
380 #else
381 # define LAST_ENC_ROUND NO_TABLES
382 # endif
383
384 #if 1 /* set tables for the normal decryption round */
385 # define DEC_ROUND FOUR_TABLES
386 #elif 0
387 # define DEC_ROUND ONE_TABLE
388 #else
389 # define DEC_ROUND NO_TABLES
390 # endif
391
392 #if 1 /* set tables for the last decryption round */
393 # define LAST_DEC_ROUND FOUR_TABLES
394 #elif 0
395 # define LAST_DEC_ROUND ONE_TABLE
396 #else
397 # define LAST_DEC_ROUND NO_TABLES
398 # endif
399
400 /* The decryption key schedule can be speeded up with tables in the
,→same
401 way that the round functions can. Include or exclude the
,→following
402 defines to set this requirement .
403 */
404 #if 1
405 # define KEY_SCHED FOUR_TABLES
406 #elif 0
407 # define KEY_SCHED ONE_TABLE
408 #else
409 # define KEY_SCHED NO_TABLES
410 # endif
411
412 /* ---- END OF USER CONFIGURED OPTIONS ---- */
413

164
C.4 APPENDIX C. SOURCE CODE

414 /* VIA ACE support is only available for VC++ and GCC */
415
416 #if ! defined ( _MSC_VER ) && ! defined ( __GNUC__ )
417 # if defined ( ASSUME_VIA_ACE_PRESENT )
418 # undef ASSUME_VIA_ACE_PRESENT
419 # endif
420 # if defined ( USE_VIA_ACE_IF_PRESENT )
421 # undef USE_VIA_ACE_IF_PRESENT
422 # endif
423 # endif
424
425 #if defined ( ASSUME_VIA_ACE_PRESENT ) && ! defined (
,→USE_VIA_ACE_IF_PRESENT )
426 # define USE_VIA_ACE_IF_PRESENT
427 # endif
428
429 #if defined ( USE_VIA_ACE_IF_PRESENT ) && ! defined ( AES_REV_DKS )
430 # define AES_REV_DKS
431 # endif
432
433 /* Assembler support requires the use of platform byte order */
434
435 #if ( defined ( ASM_X86_V1C ) || defined ( ASM_X86_V2C ) || defined (
,→ASM_AMD64_C ) ) \
436 && ( ALGORITHM_BYTE_ORDER != PLATFORM_BYTE_ORDER )
437 # undef ALGORITHM_BYTE_ORDER
438 # define ALGORITHM_BYTE_ORDER PLATFORM_BYTE_ORDER
439 # endif
440
441 /* In this implementation the columns of the state array are each
,→held in
442 32- bit words . The state array can be held in various ways: in an
,→array
443 of words , in a number of individual word variables or in a number
,→of
444 processor registers . The following define maps a variable name x
,→and
445 a column number c to the way the state array variable is to be
,→held.
446 The first define below maps the state into an array x[c] whereas
,→the
447 second form maps the state into a number of individual variables
,→x0 ,
448 x1 , etc. Another form could map individual state colums to
,→machine
449 register names .
450 */
451

452 #if defined ( ARRAYS )

165
C.4 APPENDIX C. SOURCE CODE

453 # define s(x,c) x[c]


454 #else
455 # define s(x,c) x##c
456 # endif
457
458 /* This implementation provides subroutines for encryption ,
,→decryption
459 and for setting the three key lengths ( separately ) for encryption
460 and decryption . Since not all functions are needed , masks are set
461 up here to determine which will be implemented in C
462 */
463

464 #if ! defined ( AES_ENCRYPT )


465 # define EFUNCS_IN_C 0
466 #elif defined ( ASSUME_VIA_ACE_PRESENT ) || defined ( ASM_X86_V1C ) \
467 || defined ( ASM_X86_V2C ) || defined ( ASM_AMD64_C )
468 # define EFUNCS_IN_C ENC_KEYING_IN_C
469 #elif ! defined ( ASM_X86_V2 )
470 # define EFUNCS_IN_C ( ENCRYPTION_IN_C | ENC_KEYING_IN_C )
471 #else
472 # define EFUNCS_IN_C 0
473 # endif
474
475 #if ! defined ( AES_DECRYPT )
476 # define DFUNCS_IN_C 0
477 #elif defined ( ASSUME_VIA_ACE_PRESENT ) || defined ( ASM_X86_V1C ) \
478 || defined ( ASM_X86_V2C ) || defined ( ASM_AMD64_C )
479 # define DFUNCS_IN_C DEC_KEYING_IN_C
480 #elif ! defined ( ASM_X86_V2 )
481 # define DFUNCS_IN_C ( DECRYPTION_IN_C | DEC_KEYING_IN_C )
482 #else
483 # define DFUNCS_IN_C 0
484 # endif
485
486 # define FUNCS_IN_C ( EFUNCS_IN_C | DFUNCS_IN_C )
487

488 /* END OF CONFIGURATION OPTIONS */


489
490 # define RC_LENGTH (5 * ( AES_BLOCK_SIZE / 4 - 2))
491
492 /* Disable or report errors on some combinations of options */
493

494 #if ENC_ROUND == NO_TABLES && LAST_ENC_ROUND != NO_TABLES


495 # undef LAST_ENC_ROUND
496 # define LAST_ENC_ROUND NO_TABLES
497 #elif ENC_ROUND == ONE_TABLE && LAST_ENC_ROUND == FOUR_TABLES
498 # undef LAST_ENC_ROUND
499 # define LAST_ENC_ROUND ONE_TABLE
500 # endif

166
C.4 APPENDIX C. SOURCE CODE

501
502 #if ENC_ROUND == NO_TABLES && ENC_UNROLL != NONE
503 # undef ENC_UNROLL
504 # define ENC_UNROLL NONE
505 # endif
506

507 #if DEC_ROUND == NO_TABLES && LAST_DEC_ROUND != NO_TABLES


508 # undef LAST_DEC_ROUND
509 # define LAST_DEC_ROUND NO_TABLES
510 #elif DEC_ROUND == ONE_TABLE && LAST_DEC_ROUND == FOUR_TABLES
511 # undef LAST_DEC_ROUND
512 # define LAST_DEC_ROUND ONE_TABLE
513 # endif
514
515 #if DEC_ROUND == NO_TABLES && DEC_UNROLL != NONE
516 # undef DEC_UNROLL
517 # define DEC_UNROLL NONE
518 # endif
519
520 #if defined ( bswap32 )
521 # define aes_sw32 bswap32
522 #elif defined ( bswap_32 )
523 # define aes_sw32 bswap_32
524 #else
525 # define brot(x,n) ((( uint_32t )(x) << n) | (( uint_32t )(x) >> (32
,→- n)))
526 # define aes_sw32 (x) (( brot ((x) ,8) & 0 x00ff00ff ) | (brot ((x) ,24) & 0
,→xff00ff00 ))
527 # endif
528

529 /* upr(x,n): rotates bytes within words by n positions , moving


,→bytes to
530 higher index positions with wrap around into low
,→positions
531 ups(x,n): moves bytes by n positions to higher index positions
,→in
532 words but without wrap around
533 bval(x,n): extracts a byte from a word
534
535 WARNING : The definitions given here are intended only for use
,→with
536 unsigned variables and with shift counts that are
,→compile
537 time constants
538 */
539
540 #if ( ALGORITHM_BYTE_ORDER == IS_LITTLE_ENDIAN )
541 # define upr(x,n) ((( uint_32t )(x) << (8 * (n))) | (( uint_32t )(x
,→) >> (32 - 8 * (n))))

167
C.4 APPENDIX C. SOURCE CODE

542 # define ups(x,n) (( uint_32t ) (x) << (8 * (n)))


543 # define bval(x,n) to_byte ((x) >> (8 * (n)))
544 # define bytes2word (b0 , b1 , b2 , b3) \
545 ((( uint_32t )(b3) << 24) | (( uint_32t )(b2) << 16) | (( uint_32t
,→)(b1) << 8) | (b0))
546 # endif
547
548 #if ( ALGORITHM_BYTE_ORDER == IS_BIG_ENDIAN )
549 # define upr(x,n) ((( uint_32t )(x) >> (8 * (n))) | (( uint_32t )(x
,→) << (32 - 8 * (n))))
550 # define ups(x,n) (( uint_32t ) (x) >> (8 * (n)))
551 # define bval(x,n) to_byte ((x) >> (24 - 8 * (n)))
552 # define bytes2word (b0 , b1 , b2 , b3) \
553 ((( uint_32t )(b0) << 24) | (( uint_32t )(b1) << 16) | (( uint_32t
,→)(b2) << 8) | (b3))
554 # endif
555
556 #if defined ( SAFE_IO )
557 # define word_in (x,c) bytes2word ((( const uint_8t *)(x)+4*c)[0], ((
,→const uint_8t *)(x)+4*c)[1] , \
558 (( const uint_8t *)(x)+4*c)[2], ((
,→const uint_8t *)(x)+4*c)[3])
559 # define word_out (x,c,v) { (( uint_8t *)(x)+4*c)[0] = bval(v ,0); ((
,→uint_8t *)(x)+4*c)[1] = bval(v ,1); \
560 (( uint_8t *)(x)+4*c)[2] = bval(v ,2); ((
,→uint_8t *)(x)+4*c)[3] = bval(v ,3); }
561 #elif ( ALGORITHM_BYTE_ORDER == PLATFORM_BYTE_ORDER )
562 # define word_in (x,c) (*(( uint_32t *)(x)+(c)))
563 # define word_out (x,c,v) (*(( uint_32t *)(x)+(c)) = (v))
564 #else
565 # define word_in (x,c) aes_sw32 (*(( uint_32t *)(x)+(c)))
566 # define word_out (x,c,v) (*(( uint_32t *)(x)+(c)) = aes_sw32 (v))
567 # endif
568
569 /* the finite field modular polynomial and elements */
570

571 # define WPOLY 0x011b


572 # define BPOLY 0x1b
573
574 /* multiply four bytes in GF (2^8) by 'x' {02} in parallel */
575
576 # define gf_c1 0 x80808080
577 # define gf_c2 0 x7f7f7f7f
578 # define gf_mulx (x) ((((x) & gf_c2) << 1) ^ ((((x) & gf_c1) >> 7) *
,→BPOLY))
579
580 /* The following defines provide alternative definitions of gf_mulx
,→that might
581 give improved performance if a fast 32- bit multiply is not

168
C.4 APPENDIX C. SOURCE CODE

,→available . Note
582 that a temporary variable u needs to be defined where gf_mulx is
,→used.
583
584 # define gf_mulx (x) (u = (x) & gf_c1 , u |= (u >> 1) , ((x) & gf_c2 ) <<
,→1) ^ ((u >> 3) | (u >> 6))
585 # define gf_c4 (0 x01010101 * BPOLY )
586 # define gf_mulx (x) (u = (x) & gf_c1 , ((x) & gf_c2 ) << 1) ^ ((u - (u
,→>> 7)) & gf_c4 )
587 */
588
589 /* Work out which tables are needed for the different options */
590
591 #if defined ( ASM_X86_V1C )
592 # if defined ( ENC_ROUND )
593 # undef ENC_ROUND
594 # endif
595 # define ENC_ROUND FOUR_TABLES
596 # if defined ( LAST_ENC_ROUND )
597 # undef LAST_ENC_ROUND
598 # endif
599 # define LAST_ENC_ROUND FOUR_TABLES
600 # if defined ( DEC_ROUND )
601 # undef DEC_ROUND
602 # endif
603 # define DEC_ROUND FOUR_TABLES
604 # if defined ( LAST_DEC_ROUND )
605 # undef LAST_DEC_ROUND
606 # endif
607 # define LAST_DEC_ROUND FOUR_TABLES
608 # if defined ( KEY_SCHED )
609 # undef KEY_SCHED
610 # define KEY_SCHED FOUR_TABLES
611 # endif
612 # endif
613

614 #if ( FUNCS_IN_C & ENCRYPTION_IN_C ) || defined ( ASM_X86_V1C )


615 # if ENC_ROUND == ONE_TABLE
616 # define FT1_SET
617 # elif ENC_ROUND == FOUR_TABLES
618 # define FT4_SET
619 # else
620 # define SBX_SET
621 # endif
622 # if LAST_ENC_ROUND == ONE_TABLE
623 # define FL1_SET
624 # elif LAST_ENC_ROUND == FOUR_TABLES
625 # define FL4_SET
626 # elif ! defined ( SBX_SET )

169
C.4 APPENDIX C. SOURCE CODE

627 # define SBX_SET


628 # endif
629 # endif
630
631 #if ( FUNCS_IN_C & DECRYPTION_IN_C ) || defined ( ASM_X86_V1C )
632 # if DEC_ROUND == ONE_TABLE
633 # define IT1_SET
634 # elif DEC_ROUND == FOUR_TABLES
635 # define IT4_SET
636 # else
637 # define ISB_SET
638 # endif
639 # if LAST_DEC_ROUND == ONE_TABLE
640 # define IL1_SET
641 # elif LAST_DEC_ROUND == FOUR_TABLES
642 # define IL4_SET
643 # elif ! defined ( ISB_SET )
644 # define ISB_SET
645 # endif
646 # endif
647
648 #if !( defined ( REDUCE_CODE_SIZE ) && ( defined ( ASM_X86_V2 ) ||
,→defined ( ASM_X86_V2C )))
649 # if (( FUNCS_IN_C & ENC_KEYING_IN_C ) || ( FUNCS_IN_C &
,→DEC_KEYING_IN_C ))
650 # if KEY_SCHED == ONE_TABLE
651 # if ! defined ( FL1_SET ) && ! defined ( FL4_SET )
652 # define LS1_SET
653 # endif
654 # elif KEY_SCHED == FOUR_TABLES
655 # if ! defined ( FL4_SET )
656 # define LS4_SET
657 # endif
658 # elif ! defined ( SBX_SET )
659 # define SBX_SET
660 # endif
661 # endif
662 # if ( FUNCS_IN_C & DEC_KEYING_IN_C )
663 # if KEY_SCHED == ONE_TABLE
664 # define IM1_SET
665 # elif KEY_SCHED == FOUR_TABLES
666 # define IM4_SET
667 # elif ! defined ( SBX_SET )
668 # define SBX_SET
669 # endif
670 # endif
671 # endif
672

673 /* generic definitions of Rijndael macros that use tables */

170
C.4 APPENDIX C. SOURCE CODE

674
675 # define no_table (x,box ,vf ,rf ,c) bytes2word ( \
676 box[bval(vf(x,0,c),rf(0,c))], \
677 box[bval(vf(x,1,c),rf(1,c))], \
678 box[bval(vf(x,2,c),rf(2,c))], \
679 box[bval(vf(x,3,c),rf(3,c))])
680
681 # define one_table (x,op ,tab ,vf ,rf ,c) \
682 ( tab[bval(vf(x,0,c),rf(0,c))] \
683 ^ op(tab[bval(vf(x,1,c),rf(1,c))],1) \
684 ^ op(tab[bval(vf(x,2,c),rf(2,c))],2) \
685 ^ op(tab[bval(vf(x,3,c),rf(3,c))],3))
686
687 # define four_tables (x,tab ,vf ,rf ,c) \
688 ( tab [0][ bval(vf(x,0,c),rf(0,c))] \
689 ^ tab [1][ bval(vf(x,1,c),rf(1,c))] \
690 ^ tab [2][ bval(vf(x,2,c),rf(2,c))] \
691 ^ tab [3][ bval(vf(x,3,c),rf(3,c))])
692
693 # define vf1(x,r,c) (x)
694 # define rf1(r,c) (r)
695 # define rf2(r,c) ((8+r-c)&3)
696
697 /* perform forward and inverse column mix operation on four bytes in
,→long word x in */
698 /* parallel . NOTE: x must be a simple variable , NOT an expression in
,→these macros . */
699
700 #if !( defined ( REDUCE_CODE_SIZE ) && ( defined ( ASM_X86_V2 ) ||
,→defined ( ASM_X86_V2C )))
701
702 #if defined ( FM4_SET ) /* not currently used */
703 # define fwd_mcol (x) four_tables (x,t_use(f,m),vf1 ,rf1 ,0)
704 #elif defined ( FM1_SET ) /* not currently used */
705 # define fwd_mcol (x) one_table (x,upr ,t_use(f,m),vf1 ,rf1 ,0)
706 #else
707 # define dec_fmvars uint_32t g2
708 # define fwd_mcol (x) (g2 = gf_mulx (x), g2 ^ upr ((x) ^ g2 , 3) ^
,→ upr ((x), 2) ^ upr ((x), 1))
709 # endif
710
711 #if defined ( IM4_SET )
712 # define inv_mcol (x) four_tables (x,t_use(i,m),vf1 ,rf1 ,0)
713 #elif defined ( IM1_SET )
714 # define inv_mcol (x) one_table (x,upr ,t_use(i,m),vf1 ,rf1 ,0)
715 #else
716 # define dec_imvars uint_32t g2 , g4 , g9
717 # define inv_mcol (x) (g2 = gf_mulx (x), g4 = gf_mulx (g2), g9 =
,→(x) ^ gf_mulx (g4), g4 ^= g9 , \

171
C.4 APPENDIX C. SOURCE CODE

718 (x) ^ g2 ^ g4 ^ upr(g2 ^ g9 , 3) ^ upr(g4 ,


,→ 2) ^ upr(g9 , 1))
719 # endif
720
721 #if defined ( FL4_SET )
722 # define ls_box (x,c) four_tables (x,t_use(f,l),vf1 ,rf2 ,c)
723 #elif defined ( LS4_SET )
724 # define ls_box (x,c) four_tables (x,t_use(l,s),vf1 ,rf2 ,c)
725 #elif defined ( FL1_SET )
726 # define ls_box (x,c) one_table (x,upr ,t_use(f,l),vf1 ,rf2 ,c)
727 #elif defined ( LS1_SET )
728 # define ls_box (x,c) one_table (x,upr ,t_use(l,s),vf1 ,rf2 ,c)
729 #else
730 # define ls_box (x,c) no_table (x,t_use(s,box),vf1 ,rf2 ,c)
731 # endif
732
733 # endif
734

735 #if defined ( ASM_X86_V1C ) && defined ( AES_DECRYPT ) && ! defined (


,→ISB_SET )
736 # define ISB_SET
737 # endif
738
739 # endif

src/Obf/aestab.c
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007

172
C.4 APPENDIX C. SOURCE CODE

19 */
20
21 # define DO_TABLES
22
23 # include "aes.h"
24 # include " aesopt .h"
25
26 #if defined ( FIXED_TABLES )
27
28 # define sb_data (w) {\
29 w(0 x63), w(0 x7c), w(0 x77), w(0 x7b), w(0 xf2), w(0 x6b), w(0 x6f), w
,→(0 xc5) ,\
30 w(0 x30), w(0 x01), w(0 x67), w(0 x2b), w(0 xfe), w(0 xd7), w(0 xab), w
,→(0 x76) ,\
31 w(0 xca), w(0 x82), w(0 xc9), w(0 x7d), w(0 xfa), w(0 x59), w(0 x47), w
,→(0 xf0) ,\
32 w(0 xad), w(0 xd4), w(0 xa2), w(0 xaf), w(0 x9c), w(0 xa4), w(0 x72), w
,→(0 xc0) ,\
33 w(0 xb7), w(0 xfd), w(0 x93), w(0 x26), w(0 x36), w(0 x3f), w(0 xf7), w
,→(0 xcc) ,\
34 w(0 x34), w(0 xa5), w(0 xe5), w(0 xf1), w(0 x71), w(0 xd8), w(0 x31), w
,→(0 x15) ,\
35 w(0 x04), w(0 xc7), w(0 x23), w(0 xc3), w(0 x18), w(0 x96), w(0 x05), w
,→(0 x9a) ,\
36 w(0 x07), w(0 x12), w(0 x80), w(0 xe2), w(0 xeb), w(0 x27), w(0 xb2), w
,→(0 x75) ,\
37 w(0 x09), w(0 x83), w(0 x2c), w(0 x1a), w(0 x1b), w(0 x6e), w(0 x5a), w
,→(0 xa0) ,\
38 w(0 x52), w(0 x3b), w(0 xd6), w(0 xb3), w(0 x29), w(0 xe3), w(0 x2f), w
,→(0 x84) ,\
39 w(0 x53), w(0 xd1), w(0 x00), w(0 xed), w(0 x20), w(0 xfc), w(0 xb1), w
,→(0 x5b) ,\
40 w(0 x6a), w(0 xcb), w(0 xbe), w(0 x39), w(0 x4a), w(0 x4c), w(0 x58), w
,→(0 xcf) ,\
41 w(0 xd0), w(0 xef), w(0 xaa), w(0 xfb), w(0 x43), w(0 x4d), w(0 x33), w
,→(0 x85) ,\
42 w(0 x45), w(0 xf9), w(0 x02), w(0 x7f), w(0 x50), w(0 x3c), w(0 x9f), w
,→(0 xa8) ,\
43 w(0 x51), w(0 xa3), w(0 x40), w(0 x8f), w(0 x92), w(0 x9d), w(0 x38), w
,→(0 xf5) ,\
44 w(0 xbc), w(0 xb6), w(0 xda), w(0 x21), w(0 x10), w(0 xff), w(0 xf3), w
,→(0 xd2) ,\
45 w(0 xcd), w(0 x0c), w(0 x13), w(0 xec), w(0 x5f), w(0 x97), w(0 x44), w
,→(0 x17) ,\
46 w(0 xc4), w(0 xa7), w(0 x7e), w(0 x3d), w(0 x64), w(0 x5d), w(0 x19), w
,→(0 x73) ,\
47 w(0 x60), w(0 x81), w(0 x4f), w(0 xdc), w(0 x22), w(0 x2a), w(0 x90), w
,→(0 x88) ,\
48 w(0 x46), w(0 xee), w(0 xb8), w(0 x14), w(0 xde), w(0 x5e), w(0 x0b), w

173
C.4 APPENDIX C. SOURCE CODE

,→(0 xdb) ,\
49 w(0 xe0), w(0 x32), w(0 x3a), w(0 x0a), w(0 x49), w(0 x06), w(0 x24), w
,→(0 x5c) ,\
50 w(0 xc2), w(0 xd3), w(0 xac), w(0 x62), w(0 x91), w(0 x95), w(0 xe4), w
,→(0 x79) ,\
51 w(0 xe7), w(0 xc8), w(0 x37), w(0 x6d), w(0 x8d), w(0 xd5), w(0 x4e), w
,→(0 xa9) ,\
52 w(0 x6c), w(0 x56), w(0 xf4), w(0 xea), w(0 x65), w(0 x7a), w(0 xae), w
,→(0 x08) ,\
53 w(0 xba), w(0 x78), w(0 x25), w(0 x2e), w(0 x1c), w(0 xa6), w(0 xb4), w
,→(0 xc6) ,\
54 w(0 xe8), w(0 xdd), w(0 x74), w(0 x1f), w(0 x4b), w(0 xbd), w(0 x8b), w
,→(0 x8a) ,\
55 w(0 x70), w(0 x3e), w(0 xb5), w(0 x66), w(0 x48), w(0 x03), w(0 xf6), w
,→(0 x0e) ,\
56 w(0 x61), w(0 x35), w(0 x57), w(0 xb9), w(0 x86), w(0 xc1), w(0 x1d), w
,→(0 x9e) ,\
57 w(0 xe1), w(0 xf8), w(0 x98), w(0 x11), w(0 x69), w(0 xd9), w(0 x8e), w
,→(0 x94) ,\
58 w(0 x9b), w(0 x1e), w(0 x87), w(0 xe9), w(0 xce), w(0 x55), w(0 x28), w
,→(0 xdf) ,\
59 w(0 x8c), w(0 xa1), w(0 x89), w(0 x0d), w(0 xbf), w(0 xe6), w(0 x42), w
,→(0 x68) ,\
60 w(0 x41), w(0 x99), w(0 x2d), w(0 x0f), w(0 xb0), w(0 x54), w(0 xbb), w
,→(0 x16) }
61
62 # define isb_data (w) {\
63 w(0 x52), w(0 x09), w(0 x6a), w(0 xd5), w(0 x30), w(0 x36), w(0 xa5), w
,→(0 x38) ,\
64 w(0 xbf), w(0 x40), w(0 xa3), w(0 x9e), w(0 x81), w(0 xf3), w(0 xd7), w
,→(0 xfb) ,\
65 w(0 x7c), w(0 xe3), w(0 x39), w(0 x82), w(0 x9b), w(0 x2f), w(0 xff), w
,→(0 x87) ,\
66 w(0 x34), w(0 x8e), w(0 x43), w(0 x44), w(0 xc4), w(0 xde), w(0 xe9), w
,→(0 xcb) ,\
67 w(0 x54), w(0 x7b), w(0 x94), w(0 x32), w(0 xa6), w(0 xc2), w(0 x23), w
,→(0 x3d) ,\
68 w(0 xee), w(0 x4c), w(0 x95), w(0 x0b), w(0 x42), w(0 xfa), w(0 xc3), w
,→(0 x4e) ,\
69 w(0 x08), w(0 x2e), w(0 xa1), w(0 x66), w(0 x28), w(0 xd9), w(0 x24), w
,→(0 xb2) ,\
70 w(0 x76), w(0 x5b), w(0 xa2), w(0 x49), w(0 x6d), w(0 x8b), w(0 xd1), w
,→(0 x25) ,\
71 w(0 x72), w(0 xf8), w(0 xf6), w(0 x64), w(0 x86), w(0 x68), w(0 x98), w
,→(0 x16) ,\
72 w(0 xd4), w(0 xa4), w(0 x5c), w(0 xcc), w(0 x5d), w(0 x65), w(0 xb6), w
,→(0 x92) ,\
73 w(0 x6c), w(0 x70), w(0 x48), w(0 x50), w(0 xfd), w(0 xed), w(0 xb9), w
,→(0 xda) ,\

174
C.4 APPENDIX C. SOURCE CODE

74 w(0 x5e), w(0 x15), w(0 x46), w(0 x57), w(0 xa7), w(0 x8d), w(0 x9d), w
,→(0 x84) ,\
75 w(0 x90), w(0 xd8), w(0 xab), w(0 x00), w(0 x8c), w(0 xbc), w(0 xd3), w
,→(0 x0a) ,\
76 w(0 xf7), w(0 xe4), w(0 x58), w(0 x05), w(0 xb8), w(0 xb3), w(0 x45), w
,→(0 x06) ,\
77 w(0 xd0), w(0 x2c), w(0 x1e), w(0 x8f), w(0 xca), w(0 x3f), w(0 x0f), w
,→(0 x02) ,\
78 w(0 xc1), w(0 xaf), w(0 xbd), w(0 x03), w(0 x01), w(0 x13), w(0 x8a), w
,→(0 x6b) ,\
79 w(0 x3a), w(0 x91), w(0 x11), w(0 x41), w(0 x4f), w(0 x67), w(0 xdc), w
,→(0 xea) ,\
80 w(0 x97), w(0 xf2), w(0 xcf), w(0 xce), w(0 xf0), w(0 xb4), w(0 xe6), w
,→(0 x73) ,\
81 w(0 x96), w(0 xac), w(0 x74), w(0 x22), w(0 xe7), w(0 xad), w(0 x35), w
,→(0 x85) ,\
82 w(0 xe2), w(0 xf9), w(0 x37), w(0 xe8), w(0 x1c), w(0 x75), w(0 xdf), w
,→(0 x6e) ,\
83 w(0 x47), w(0 xf1), w(0 x1a), w(0 x71), w(0 x1d), w(0 x29), w(0 xc5), w
,→(0 x89) ,\
84 w(0 x6f), w(0 xb7), w(0 x62), w(0 x0e), w(0 xaa), w(0 x18), w(0 xbe), w
,→(0 x1b) ,\
85 w(0 xfc), w(0 x56), w(0 x3e), w(0 x4b), w(0 xc6), w(0 xd2), w(0 x79), w
,→(0 x20) ,\
86 w(0 x9a), w(0 xdb), w(0 xc0), w(0 xfe), w(0 x78), w(0 xcd), w(0 x5a), w
,→(0 xf4) ,\
87 w(0 x1f), w(0 xdd), w(0 xa8), w(0 x33), w(0 x88), w(0 x07), w(0 xc7), w
,→(0 x31) ,\
88 w(0 xb1), w(0 x12), w(0 x10), w(0 x59), w(0 x27), w(0 x80), w(0 xec), w
,→(0 x5f) ,\
89 w(0 x60), w(0 x51), w(0 x7f), w(0 xa9), w(0 x19), w(0 xb5), w(0 x4a), w
,→(0 x0d) ,\
90 w(0 x2d), w(0 xe5), w(0 x7a), w(0 x9f), w(0 x93), w(0 xc9), w(0 x9c), w
,→(0 xef) ,\
91 w(0 xa0), w(0 xe0), w(0 x3b), w(0 x4d), w(0 xae), w(0 x2a), w(0 xf5), w
,→(0 xb0) ,\
92 w(0 xc8), w(0 xeb), w(0 xbb), w(0 x3c), w(0 x83), w(0 x53), w(0 x99), w
,→(0 x61) ,\
93 w(0 x17), w(0 x2b), w(0 x04), w(0 x7e), w(0 xba), w(0 x77), w(0 xd6), w
,→(0 x26) ,\
94 w(0 xe1), w(0 x69), w(0 x14), w(0 x63), w(0 x55), w(0 x21), w(0 x0c), w
,→(0 x7d) }
95
96 # define mm_data (w) {\
97 w(0 x00), w(0 x01), w(0 x02), w(0 x03), w(0 x04), w(0 x05), w(0 x06), w
,→(0 x07) ,\
98 w(0 x08), w(0 x09), w(0 x0a), w(0 x0b), w(0 x0c), w(0 x0d), w(0 x0e), w
,→(0 x0f) ,\
99 w(0 x10), w(0 x11), w(0 x12), w(0 x13), w(0 x14), w(0 x15), w(0 x16), w

175
C.4 APPENDIX C. SOURCE CODE

,→(0 x17) ,\
100 w(0 x18), w(0 x19), w(0 x1a), w(0 x1b), w(0 x1c), w(0 x1d), w(0 x1e), w
,→(0 x1f) ,\
101 w(0 x20), w(0 x21), w(0 x22), w(0 x23), w(0 x24), w(0 x25), w(0 x26), w
,→(0 x27) ,\
102 w(0 x28), w(0 x29), w(0 x2a), w(0 x2b), w(0 x2c), w(0 x2d), w(0 x2e), w
,→(0 x2f) ,\
103 w(0 x30), w(0 x31), w(0 x32), w(0 x33), w(0 x34), w(0 x35), w(0 x36), w
,→(0 x37) ,\
104 w(0 x38), w(0 x39), w(0 x3a), w(0 x3b), w(0 x3c), w(0 x3d), w(0 x3e), w
,→(0 x3f) ,\
105 w(0 x40), w(0 x41), w(0 x42), w(0 x43), w(0 x44), w(0 x45), w(0 x46), w
,→(0 x47) ,\
106 w(0 x48), w(0 x49), w(0 x4a), w(0 x4b), w(0 x4c), w(0 x4d), w(0 x4e), w
,→(0 x4f) ,\
107 w(0 x50), w(0 x51), w(0 x52), w(0 x53), w(0 x54), w(0 x55), w(0 x56), w
,→(0 x57) ,\
108 w(0 x58), w(0 x59), w(0 x5a), w(0 x5b), w(0 x5c), w(0 x5d), w(0 x5e), w
,→(0 x5f) ,\
109 w(0 x60), w(0 x61), w(0 x62), w(0 x63), w(0 x64), w(0 x65), w(0 x66), w
,→(0 x67) ,\
110 w(0 x68), w(0 x69), w(0 x6a), w(0 x6b), w(0 x6c), w(0 x6d), w(0 x6e), w
,→(0 x6f) ,\
111 w(0 x70), w(0 x71), w(0 x72), w(0 x73), w(0 x74), w(0 x75), w(0 x76), w
,→(0 x77) ,\
112 w(0 x78), w(0 x79), w(0 x7a), w(0 x7b), w(0 x7c), w(0 x7d), w(0 x7e), w
,→(0 x7f) ,\
113 w(0 x80), w(0 x81), w(0 x82), w(0 x83), w(0 x84), w(0 x85), w(0 x86), w
,→(0 x87) ,\
114 w(0 x88), w(0 x89), w(0 x8a), w(0 x8b), w(0 x8c), w(0 x8d), w(0 x8e), w
,→(0 x8f) ,\
115 w(0 x90), w(0 x91), w(0 x92), w(0 x93), w(0 x94), w(0 x95), w(0 x96), w
,→(0 x97) ,\
116 w(0 x98), w(0 x99), w(0 x9a), w(0 x9b), w(0 x9c), w(0 x9d), w(0 x9e), w
,→(0 x9f) ,\
117 w(0 xa0), w(0 xa1), w(0 xa2), w(0 xa3), w(0 xa4), w(0 xa5), w(0 xa6), w
,→(0 xa7) ,\
118 w(0 xa8), w(0 xa9), w(0 xaa), w(0 xab), w(0 xac), w(0 xad), w(0 xae), w
,→(0 xaf) ,\
119 w(0 xb0), w(0 xb1), w(0 xb2), w(0 xb3), w(0 xb4), w(0 xb5), w(0 xb6), w
,→(0 xb7) ,\
120 w(0 xb8), w(0 xb9), w(0 xba), w(0 xbb), w(0 xbc), w(0 xbd), w(0 xbe), w
,→(0 xbf) ,\
121 w(0 xc0), w(0 xc1), w(0 xc2), w(0 xc3), w(0 xc4), w(0 xc5), w(0 xc6), w
,→(0 xc7) ,\
122 w(0 xc8), w(0 xc9), w(0 xca), w(0 xcb), w(0 xcc), w(0 xcd), w(0 xce), w
,→(0 xcf) ,\
123 w(0 xd0), w(0 xd1), w(0 xd2), w(0 xd3), w(0 xd4), w(0 xd5), w(0 xd6), w
,→(0 xd7) ,\

176
C.4 APPENDIX C. SOURCE CODE

124 w(0 xd8), w(0 xd9), w(0 xda), w(0 xdb), w(0 xdc), w(0 xdd), w(0 xde), w
,→(0 xdf) ,\
125 w(0 xe0), w(0 xe1), w(0 xe2), w(0 xe3), w(0 xe4), w(0 xe5), w(0 xe6), w
,→(0 xe7) ,\
126 w(0 xe8), w(0 xe9), w(0 xea), w(0 xeb), w(0 xec), w(0 xed), w(0 xee), w
,→(0 xef) ,\
127 w(0 xf0), w(0 xf1), w(0 xf2), w(0 xf3), w(0 xf4), w(0 xf5), w(0 xf6), w
,→(0 xf7) ,\
128 w(0 xf8), w(0 xf9), w(0 xfa), w(0 xfb), w(0 xfc), w(0 xfd), w(0 xfe), w
,→(0 xff) }
129
130 # define rc_data (w) {\
131 w(0 x01), w(0 x02), w(0 x04), w(0 x08), w(0 x10),w(0 x20), w(0 x40), w(0
,→x80) ,\
132 w(0 x1b), w(0 x36) }
133
134 # define h0(x) (x)
135

136 # define w0(p) bytes2word (p, 0, 0, 0)


137 # define w1(p) bytes2word (0, p, 0, 0)
138 # define w2(p) bytes2word (0, 0, p, 0)
139 # define w3(p) bytes2word (0, 0, 0, p)
140
141 # define u0(p) bytes2word (f2(p), p, p, f3(p))
142 # define u1(p) bytes2word (f3(p), f2(p), p, p)
143 # define u2(p) bytes2word (p, f3(p), f2(p), p)
144 # define u3(p) bytes2word (p, p, f3(p), f2(p))
145
146 # define v0(p) bytes2word (fe(p), f9(p), fd(p), fb(p))
147 # define v1(p) bytes2word (fb(p), fe(p), f9(p), fd(p))
148 # define v2(p) bytes2word (fd(p), fb(p), fe(p), f9(p))
149 # define v3(p) bytes2word (f9(p), fd(p), fb(p), fe(p))
150
151 # endif
152
153 #if defined ( FIXED_TABLES ) || ! defined ( FF_TABLES )
154
155 # define f2(x) ((x<<1) ^ (((x>>7) & 1) * WPOLY))
156 # define f4(x) ((x<<2) ^ (((x>>6) & 1) * WPOLY) ^ (((x>>6) & 2) *
,→WPOLY))
157 # define f8(x) ((x<<3) ^ (((x>>5) & 1) * WPOLY) ^ (((x>>5) & 2) *
,→WPOLY) \
158 ^ (((x>>5) & 4) * WPOLY))
159 # define f3(x) (f2(x) ^ x)
160 # define f9(x) (f8(x) ^ x)
161 # define fb(x) (f8(x) ^ f2(x) ^ x)
162 # define fd(x) (f8(x) ^ f4(x) ^ x)
163 # define fe(x) (f8(x) ^ f4(x) ^ f2(x))
164

177
C.4 APPENDIX C. SOURCE CODE

165 #else
166
167 # define f2(x) ((x) ? pow[log[x] + 0x19] : 0)
168 # define f3(x) ((x) ? pow[log[x] + 0x01] : 0)
169 # define f9(x) ((x) ? pow[log[x] + 0xc7] : 0)
170 # define fb(x) ((x) ? pow[log[x] + 0x68] : 0)
171 # define fd(x) ((x) ? pow[log[x] + 0xee] : 0)
172 # define fe(x) ((x) ? pow[log[x] + 0xdf] : 0)
173
174 # endif
175
176 # include " aestab .h"
177
178 #if defined ( __cplusplus )
179 extern "C"
180 {
181 # endif
182

183 #if defined ( FIXED_TABLES )


184
185 /* implemented in case of wrong call for fixed tables */
186
187 AES_RETURN aes_init (void)
188 {
189 return EXIT_SUCCESS ;
190 }
191
192 #else /* Generate the tables for the dynamic table option */
193
194 #if defined ( FF_TABLES )
195
196 # define gf_inv (x) ((x) ? pow[ 255 - log[x]] : 0)
197
198 #else
199
200 /* It will generally be sensible to use tables to compute finite
201 field multiplies and inverses but where memory is scarse this
202 code might sometimes be better . But it only has effect during
203 initialisation so its pretty unimportant in overall terms .
204 */
205
206 /* return 2 ^ (n - 1) where n is the bit number of the highest bit
207 set in x with x in the range 1 < x < 0 x00000200 . This form is
208 used so that locals within fi can be bytes rather than words
209 */
210
211 static uint_8t hibit( const uint_32t x)
212 { uint_8t r = ( uint_8t )((x >> 1) | (x >> 2));
213

178
C.4 APPENDIX C. SOURCE CODE

214 r |= (r >> 2);


215 r |= (r >> 4);
216 return (r + 1) >> 1;
217 }
218
219 /* return the inverse of the finite field element x */
220
221 static uint_8t gf_inv ( const uint_8t x)
222 { uint_8t p1 = x, p2 = BPOLY , n1 = hibit(x), n2 = 0x80 , v1 = 1, v2
,→= 0;
223
224 if(x < 2)
225 return x;
226
227 for( ; ; )
228 {
229 if(n1)
230 while (n2 >= n1) /* divide polynomial p2 by p1
,→ */
231 {
232 n2 /= n1; /* shift smaller polynomial
,→left */
233 p2 ^= (p1 * n2) & 0xff; /* and remove from larger one
,→ */
234 v2 ^= v1 * n2; /* shift accumulated value
,→and */
235 n2 = hibit(p2); /* add into result
,→ */
236 }
237 else
238 return v1;
239
240 if(n2) /* repeat with values swapped
,→ */
241 while (n1 >= n2)
242 {
243 n1 /= n2;
244 p1 ^= p2 * n1;
245 v1 ^= v2 * n1;
246 n1 = hibit(p1);
247 }
248 else
249 return v2;
250 }
251 }
252
253 # endif
254

255 /* The forward and inverse affine transformations used in the S-box

179
C.4 APPENDIX C. SOURCE CODE

,→*/
256 uint_8t fwd_affine (const uint_8t x)
257 { uint_32t w = x;
258 w ^= (w << 1) ^ (w << 2) ^ (w << 3) ^ (w << 4);
259 return 0x63 ^ ((w ^ (w >> 8)) & 0xff);
260 }
261
262 uint_8t inv_affine (const uint_8t x)
263 { uint_32t w = x;
264 w = (w << 1) ^ (w << 3) ^ (w << 6);
265 return 0x05 ^ ((w ^ (w >> 8)) & 0xff);
266 }
267
268 static int init = 0;
269
270 AES_RETURN aes_init (void)
271 { uint_32t i, w;
272

273 #if defined ( FF_TABLES )


274
275 uint_8t pow [512] , log [256];
276
277 if(init)
278 return EXIT_SUCCESS ;
279 /* log and power tables for GF (2^8) finite field with
280 WPOLY as modular polynomial - the simplest primitive
281 root is 0x03 , used here to generate the tables
282 */
283
284 i = 0; w = 1;
285 do
286 {
287 pow[i] = ( uint_8t )w;
288 pow[i + 255] = ( uint_8t )w;
289 log[w] = ( uint_8t )i++;
290 w ^= (w << 1) ^ (w & 0x80 ? WPOLY : 0);
291 }
292 while (w != 1);
293
294 #else
295 if(init)
296 return EXIT_SUCCESS ;
297 # endif
298
299 for(i = 0, w = 1; i < RC_LENGTH ; ++i)
300 {
301 t_set(r,c)[i] = bytes2word (w, 0, 0, 0);
302 w = f2(w);
303 }

180
C.4 APPENDIX C. SOURCE CODE

304
305 for(i = 0; i < 256; ++i)
306 { uint_8t b;
307
308 b = fwd_affine ( gf_inv (( uint_8t )i));
309 w = bytes2word (f2(b), b, b, f3(b));
310
311 #if defined ( SBX_SET )
312 t_set(s,box)[i] = b;
313 # endif
314
315 #if defined ( FT1_SET ) /* tables for a normal
,→encryption round */
316 t_set(f,n)[i] = w;
317 # endif
318 #if defined ( FT4_SET )
319 t_set(f,n)[0][i] = w;
320 t_set(f,n)[1][i] = upr(w ,1);
321 t_set(f,n)[2][i] = upr(w ,2);
322 t_set(f,n)[3][i] = upr(w ,3);
323 # endif
324 w = bytes2word (b, 0, 0, 0);
325
326 #if defined ( FL1_SET ) /* tables for last encryption round
,→ (may also */
327 t_set(f,l)[i] = w; /* be used in the key schedule )
,→ */
328 # endif
329 #if defined ( FL4_SET )
330 t_set(f,l)[0][i] = w;
331 t_set(f,l)[1][i] = upr(w ,1);
332 t_set(f,l)[2][i] = upr(w ,2);
333 t_set(f,l)[3][i] = upr(w ,3);
334 # endif
335
336 #if defined ( LS1_SET ) /* table for key schedule if
,→t_set (f,l) above is*/
337 t_set(l,s)[i] = w; /* not of the required form
,→ */
338 # endif
339 #if defined ( LS4_SET )
340 t_set(l,s)[0][i] = w;
341 t_set(l,s)[1][i] = upr(w ,1);
342 t_set(l,s)[2][i] = upr(w ,2);
343 t_set(l,s)[3][i] = upr(w ,3);
344 # endif
345
346 b = gf_inv ( inv_affine (( uint_8t )i));
347 w = bytes2word (fe(b), f9(b), fd(b), fb(b));

181
C.4 APPENDIX C. SOURCE CODE

348
349 #if defined ( IM1_SET ) /* tables for the inverse mix
,→ column operation */
350 t_set(i,m)[b] = w;
351 # endif
352 #if defined ( IM4_SET )
353 t_set(i,m)[0][b] = w;
354 t_set(i,m)[1][b] = upr(w ,1);
355 t_set(i,m)[2][b] = upr(w ,2);
356 t_set(i,m)[3][b] = upr(w ,3);
357 # endif
358

359 #if defined ( ISB_SET )


360 t_set(i,box)[i] = b;
361 # endif
362 #if defined ( IT1_SET ) /* tables for a normal
,→decryption round */
363 t_set(i,n)[i] = w;
364 # endif
365 #if defined ( IT4_SET )
366 t_set(i,n)[0][i] = w;
367 t_set(i,n)[1][i] = upr(w ,1);
368 t_set(i,n)[2][i] = upr(w ,2);
369 t_set(i,n)[3][i] = upr(w ,3);
370 # endif
371 w = bytes2word (b, 0, 0, 0);
372 #if defined ( IL1_SET ) /* tables for last decryption
,→ round */
373 t_set(i,l)[i] = w;
374 # endif
375 #if defined ( IL4_SET )
376 t_set(i,l)[0][i] = w;
377 t_set(i,l)[1][i] = upr(w ,1);
378 t_set(i,l)[2][i] = upr(w ,2);
379 t_set(i,l)[3][i] = upr(w ,3);
380 # endif
381 }
382 init = 1;
383 return EXIT_SUCCESS ;
384 }
385
386 # endif
387
388 #if defined ( __cplusplus )
389 }
390 # endif

src/Obf/aestab.h
1 /*

182
C.4 APPENDIX C. SOURCE CODE

2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19
20 This file contains the code for declaring the tables needed to
,→implement
21 AES. The file aesopt .h is assumed to be included before this header
,→file.
22 If there are no global variables , the definitions here can be used
,→to put
23 the AES tables in a structure so that a pointer can then be added to
,→ the
24 AES context to pass them to the AES routines that need them. If
,→this
25 facility is used , the calling program has to ensure that this
,→pointer is
26 managed appropriately . In particular , the value of the t_dec (in ,it)
,→ item
27 in the table structure must be set to zero in order to ensure that
,→the
28 tables are initialised . In practice the three code sequences in
,→aeskey .c
29 that control the calls to aes_init () and the aes_init () routine
,→itself will
30 have to be changed for a specific implementation . If global
,→variables are
31 available it will generally be preferable to use them with the
,→precomputed
32 FIXED_TABLES option that uses static global tables .
33

183
C.4 APPENDIX C. SOURCE CODE

34 The following defines can be used to control the way the tables
35 are defined , initialised and used in embedded environments that
36 require special features for these purposes
37
38 the 't_dec ' construction is used to declare fixed table arrays
39 the 't_set ' construction is used to set fixed table values
40 the 't_use ' construction is used to access fixed table values
41
42 256 byte tables :
43
44 t_xxx (s,box) => forward S box
45 t_xxx (i,box) => inverse S box
46
47 256 32- bit word OR 4 x 256 32- bit word tables :
48
49 t_xxx (f,n) => forward normal round
50 t_xxx (f,l) => forward last round
51 t_xxx (i,n) => inverse normal round
52 t_xxx (i,l) => inverse last round
53 t_xxx (l,s) => key schedule table
54 t_xxx (i,m) => key schedule table
55
56 Other variables and tables :
57

58 t_xxx (r,c) => the rcon table


59 */
60
61 #if ! defined ( _AESTAB_H )
62 # define _AESTAB_H
63

64 #if defined ( __cplusplus )


65 extern "C" {
66 # endif
67
68 # define t_dec(m,n) t_##m##n
69 # define t_set(m,n) t_##m##n
70 # define t_use(m,n) t_##m##n
71
72 #if defined ( FIXED_TABLES )
73 # if ! defined ( __GNUC__ ) && ( defined ( __MSDOS__ ) || defined (
,→__WIN16__ ))
74 /* make tables far data to avoid using too much DGROUP space (PG)
,→*/
75 # define CONST const far
76 # else
77 # define CONST const
78 # endif
79 #else
80 # define CONST

184
C.4 APPENDIX C. SOURCE CODE

81 # endif
82
83 #if defined ( DO_TABLES )
84 # define EXTERN
85 #else
86 # define EXTERN extern
87 # endif
88
89 #if defined ( _MSC_VER ) && defined ( TABLE_ALIGN )
90 # define ALIGN __declspec (align( TABLE_ALIGN ))
91 #else
92 # define ALIGN
93 # endif
94
95 #if defined ( __WATCOMC__ ) && ( __WATCOMC__ >= 1100 )
96 # define XP_DIR __cdecl
97 #else
98 # define XP_DIR
99 # endif
100
101 #if defined ( DO_TABLES ) && defined ( FIXED_TABLES )
102 # define d_1(t,n,b,e) EXTERN ALIGN CONST XP_DIR t n[256] =
,→b(e)
103 # define d_4(t,n,b,e,f,g,h) EXTERN ALIGN CONST XP_DIR t n [4][256] = {
,→b(e), b(f), b(g), b(h) }
104 EXTERN ALIGN CONST uint_32t t_dec(r,c)[ RC_LENGTH ] = rc_data (w0);
105 #else
106 # define d_1(t,n,b,e) EXTERN ALIGN CONST XP_DIR t n[256]
107 # define d_4(t,n,b,e,f,g,h) EXTERN ALIGN CONST XP_DIR t n [4][256]
108 EXTERN ALIGN CONST uint_32t t_dec(r,c)[ RC_LENGTH ];
109 # endif
110
111 #if defined ( SBX_SET )
112 d_1(uint_8t , t_dec(s,box), sb_data , h0);
113 # endif
114 #if defined ( ISB_SET )
115 d_1(uint_8t , t_dec(i,box), isb_data , h0);
116 # endif
117
118 #if defined ( FT1_SET )
119 d_1(uint_32t , t_dec(f,n), sb_data , u0);
120 # endif
121 #if defined ( FT4_SET )
122 d_4(uint_32t , t_dec(f,n), sb_data , u0 , u1 , u2 , u3);
123 # endif
124
125 #if defined ( FL1_SET )
126 d_1(uint_32t , t_dec(f,l), sb_data , w0);
127 # endif

185
C.4 APPENDIX C. SOURCE CODE

128 #if defined ( FL4_SET )


129 d_4(uint_32t , t_dec(f,l), sb_data , w0 , w1 , w2 , w3);
130 # endif
131
132 #if defined ( IT1_SET )
133 d_1(uint_32t , t_dec(i,n), isb_data , v0);
134 # endif
135 #if defined ( IT4_SET )
136 d_4(uint_32t , t_dec(i,n), isb_data , v0 , v1 , v2 , v3);
137 # endif
138
139 #if defined ( IL1_SET )
140 d_1(uint_32t , t_dec(i,l), isb_data , w0);
141 # endif
142 #if defined ( IL4_SET )
143 d_4(uint_32t , t_dec(i,l), isb_data , w0 , w1 , w2 , w3);
144 # endif
145

146 #if defined ( LS1_SET )


147 #if defined ( FL1_SET )
148 # undef LS1_SET
149 #else
150 d_1(uint_32t , t_dec(l,s), sb_data , w0);
151 # endif
152 # endif
153
154 #if defined ( LS4_SET )
155 #if defined ( FL4_SET )
156 # undef LS4_SET
157 #else
158 d_4(uint_32t , t_dec(l,s), sb_data , w0 , w1 , w2 , w3);
159 # endif
160 # endif
161
162 #if defined ( IM1_SET )
163 d_1(uint_32t , t_dec(i,m), mm_data , v0);
164 # endif
165 #if defined ( IM4_SET )
166 d_4(uint_32t , t_dec(i,m), mm_data , v0 , v1 , v2 , v3);
167 # endif
168
169 #if defined ( __cplusplus )
170 }
171 # endif
172
173 # endif

src/Obf/brg_endian.h
1 /*

186
C.4 APPENDIX C. SOURCE CODE

2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19 */
20

21 # ifndef _BRG_ENDIAN_H
22 # define _BRG_ENDIAN_H
23
24 # define IS_BIG_ENDIAN 4321 /* byte 0 is most significant ( mc68k )
,→ */
25 # define IS_LITTLE_ENDIAN 1234 /* byte 0 is least significant (i386)
,→ */
26
27 /* Include files where endian defines and byteswap functions may
,→reside */
28 #if defined ( __sun )
29 # include <sys/ isa_defs .h>
30 #elif defined ( __FreeBSD__ ) || defined ( __OpenBSD__ ) || defined (
,→__NetBSD__ )
31 # include <sys/ endian .h>
32 #elif defined ( BSD ) && ( BSD >= 199103 ) || defined ( __APPLE__ ) ||
,→\
33 defined ( __CYGWIN32__ ) || defined ( __DJGPP__ ) || defined (
,→__osf__ )
34 # include <machine / endian .h>
35 #elif defined ( __linux__ ) || defined ( __GNUC__ ) || defined (
,→__GNU_LIBRARY__ )
36 # if ! defined ( __MINGW32__ ) && ! defined ( _AIX )
37 # include <endian .h>
38 # if ! defined ( __BEOS__ )

187
C.4 APPENDIX C. SOURCE CODE

39 # include <byteswap .h>


40 # endif
41 # endif
42 # endif
43
44 /* Now attempt to set the define for platform byte order using any
,→*/
45 /* of the four forms SYMBOL , _SYMBOL , __SYMBOL & __SYMBOL__ , which
,→*/
46 /* seem to encompass most endian symbol definitions
,→*/
47

48 #if defined ( BIG_ENDIAN ) && defined ( LITTLE_ENDIAN )


49 # if defined ( BYTE_ORDER ) && BYTE_ORDER == BIG_ENDIAN
50 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
51 # elif defined ( BYTE_ORDER ) && BYTE_ORDER == LITTLE_ENDIAN
52 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
53 # endif
54 #elif defined ( BIG_ENDIAN )
55 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
56 #elif defined ( LITTLE_ENDIAN )
57 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
58 # endif
59

60 #if defined ( _BIG_ENDIAN ) && defined ( _LITTLE_ENDIAN )


61 # if defined ( _BYTE_ORDER ) && _BYTE_ORDER == _BIG_ENDIAN
62 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
63 # elif defined ( _BYTE_ORDER ) && _BYTE_ORDER == _LITTLE_ENDIAN
64 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
65 # endif
66 #elif defined ( _BIG_ENDIAN )
67 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
68 #elif defined ( _LITTLE_ENDIAN )
69 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
70 # endif
71

72 #if defined ( __BIG_ENDIAN ) && defined ( __LITTLE_ENDIAN )


73 # if defined ( __BYTE_ORDER ) && __BYTE_ORDER == __BIG_ENDIAN
74 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
75 # elif defined ( __BYTE_ORDER ) && __BYTE_ORDER == __LITTLE_ENDIAN
76 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
77 # endif
78 #elif defined ( __BIG_ENDIAN )
79 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
80 #elif defined ( __LITTLE_ENDIAN )
81 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
82 # endif
83

84 #if defined ( __BIG_ENDIAN__ ) && defined ( __LITTLE_ENDIAN__ )

188
C.4 APPENDIX C. SOURCE CODE

85 # if defined ( __BYTE_ORDER__ ) && __BYTE_ORDER__ == __BIG_ENDIAN__


86 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
87 # elif defined ( __BYTE_ORDER__ ) && __BYTE_ORDER__ ==
,→__LITTLE_ENDIAN__
88 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
89 # endif
90 #elif defined ( __BIG_ENDIAN__ )
91 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
92 #elif defined ( __LITTLE_ENDIAN__ )
93 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
94 # endif
95

96 /* if the platform byte order could not be determined , then try to


,→*/
97 /* set this define using common machine defines
,→*/
98 #if ! defined ( PLATFORM_BYTE_ORDER )
99

100 #if defined ( __alpha__ ) || defined ( __alpha ) || defined ( i386 )


,→ || \
101 defined ( __i386__ ) || defined ( _M_I86 ) || defined ( _M_IX86
,→) || \
102 defined ( __OS2__ ) || defined ( sun386 ) || defined (
,→__TURBOC__ ) || \
103 defined ( vax ) || defined ( vms ) || defined ( VMS )
,→ || \
104 defined ( __VMS ) || defined ( _M_X64 )
105 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
106
107 #elif defined ( AMIGA ) || defined ( applec ) || defined (
,→__AS400__ ) || \
108 defined ( _CRAY ) || defined ( __hppa ) || defined ( __hp9000
,→ ) || \
109 defined ( ibm370 ) || defined ( mc68000 ) || defined ( m68k )
,→ || \
110 defined ( __MRC__ ) || defined ( __MVS__ ) || defined (
,→__MWERKS__ ) || \
111 defined ( sparc ) || defined ( __sparc ) || defined (
,→SYMANTEC_C ) || \
112 defined ( __VOS__ ) || defined ( __TIGCC__ ) || defined ( __TANDEM
,→ ) || \
113 defined ( THINK_C ) || defined ( __VMCMS__ ) || defined ( _AIX )
114 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
115
116 #elif 0 /* **** EDIT HERE IF NECESSARY **** */
117 # define PLATFORM_BYTE_ORDER IS_LITTLE_ENDIAN
118 #elif 0 /* **** EDIT HERE IF NECESSARY **** */
119 # define PLATFORM_BYTE_ORDER IS_BIG_ENDIAN
120 #else

189
C.4 APPENDIX C. SOURCE CODE

121 # error Please edit lines 126 or 128 in brg_endian .h to set the
,→platform byte order
122 # endif
123
124 # endif
125

126 # endif

src/Obf/brg_types.h
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 20/12/2007
19
20 The unsigned integer types defined here are of the form uint_ <nn >t
,→where
21 <nn > is the length of the type; for example , the unsigned 32- bit
,→type is
22 'uint_32t '. These are NOT the same as the 'C99 integer types ' that
,→are
23 defined in the inttypes .h and stdint .h headers since attempts to use
,→ these
24 types have shown that support for them is still highly variable .
,→However ,
25 since the latter are of the form uint <nn >_t , a regular expression
,→search
26 and replace (in VC ++ search on 'uint_ {:z}t' and replace with 'uint \1
,→_t ')
27 can be used to convert the types used here to the C99 standard types
,→.

190
C.4 APPENDIX C. SOURCE CODE

28 */
29
30 # ifndef _BRG_TYPES_H
31 # define _BRG_TYPES_H
32
33 #if defined ( __cplusplus )
34 extern "C" {
35 # endif
36
37 # include <limits .h>
38
39 #if defined ( _MSC_VER ) && ( _MSC_VER >= 1300 )
40 # include <stddef .h>
41 # define ptrint_t intptr_t
42 #elif defined ( __ECOS__ )
43 # define intptr_t unsigned int
44 # define ptrint_t intptr_t
45 #elif defined ( __GNUC__ ) && ( __GNUC__ >= 3 )
46 # include <stdint .h>
47 # define ptrint_t intptr_t
48 #else
49 # define ptrint_t int
50 # endif
51

52 # ifndef BRG_UI8
53 # define BRG_UI8
54 # if UCHAR_MAX == 255u
55 typedef unsigned char uint_8t ;
56 # else
57 # error Please define uint_8t as an 8-bit unsigned integer type in
,→ brg_types .h
58 # endif
59 # endif
60
61 # ifndef BRG_UI16
62 # define BRG_UI16
63 # if USHRT_MAX == 65535u
64 typedef unsigned short uint_16t ;
65 # else
66 # error Please define uint_16t as a 16- bit unsigned short type in
,→brg_types .h
67 # endif
68 # endif
69
70 # ifndef BRG_UI32
71 # define BRG_UI32
72 # if UINT_MAX == 4294967295 u
73 # define li_32(h) 0x##h##u
74 typedef unsigned int uint_32t ;

191
C.4 APPENDIX C. SOURCE CODE

75 # elif ULONG_MAX == 4294967295 u


76 # define li_32(h) 0x##h##ul
77 typedef unsigned long uint_32t ;
78 # elif defined ( _CRAY )
79 # error This code needs 32- bit data types , which Cray machines do
,→not provide
80 # else
81 # error Please define uint_32t as a 32- bit unsigned integer type
,→in brg_types .h
82 # endif
83 # endif
84

85 # ifndef BRG_UI64
86 # if defined ( __BORLANDC__ ) && ! defined ( __MSDOS__ )
87 # define BRG_UI64
88 # define li_64(h) 0x##h## ui64
89 typedef unsigned __int64 uint_64t ;
90 # elif defined ( _MSC_VER ) && ( _MSC_VER < 1300 ) /* 1300 == VC ++
,→ 7.0 */
91 # define BRG_UI64
92 # define li_64(h) 0x##h## ui64
93 typedef unsigned __int64 uint_64t ;
94 # elif defined ( __sun ) && defined ( ULONG_MAX ) && ULONG_MAX == 0
,→xfffffffful
95 # define BRG_UI64
96 # define li_64(h) 0x##h## ull
97 typedef unsigned long long uint_64t ;
98 # elif defined ( __MVS__ )
99 # define BRG_UI64
100 # define li_64(h) 0x##h## ull
101 typedef unsigned int long long uint_64t ;
102 # elif defined ( UINT_MAX ) && UINT_MAX > 4294967295 u
103 # if UINT_MAX == 18446744073709551615 u
104 # define BRG_UI64
105 # define li_64(h) 0x##h##u
106 typedef unsigned int uint_64t ;
107 # endif
108 # elif defined ( ULONG_MAX ) && ULONG_MAX > 4294967295 u
109 # if ULONG_MAX == 18446744073709551615 ul
110 # define BRG_UI64
111 # define li_64(h) 0x##h##ul
112 typedef unsigned long uint_64t ;
113 # endif
114 # elif defined ( ULLONG_MAX ) && ULLONG_MAX > 4294967295 u
115 # if ULLONG_MAX == 18446744073709551615 ull
116 # define BRG_UI64
117 # define li_64(h) 0x##h## ull
118 typedef unsigned long long uint_64t ;
119 # endif

192
C.4 APPENDIX C. SOURCE CODE

120 # elif defined ( ULONG_LONG_MAX ) && ULONG_LONG_MAX > 4294967295 u


121 # if ULONG_LONG_MAX == 18446744073709551615 ull
122 # define BRG_UI64
123 # define li_64(h) 0x##h## ull
124 typedef unsigned long long uint_64t ;
125 # endif
126 # endif
127 # endif
128
129 #if ! defined ( BRG_UI64 )
130 # if defined ( NEED_UINT_64T )
131 # error Please define uint_64t as an unsigned 64 bit type in
,→brg_types .h
132 # endif
133 # endif
134
135 # ifndef RETURN_VALUES
136 # define RETURN_VALUES
137 # if defined ( DLL_EXPORT )
138 # if defined ( _MSC_VER ) || defined ( __INTEL_COMPILER )
139 # define VOID_RETURN __declspec ( dllexport ) void __stdcall
140 # define INT_RETURN __declspec ( dllexport ) int __stdcall
141 # elif defined ( __GNUC__ )
142 # define VOID_RETURN __declspec ( __dllexport__ ) void
143 # define INT_RETURN __declspec ( __dllexport__ ) int
144 # else
145 # error Use of the DLL is only available on the Microsoft , Intel
,→ and GCC compilers
146 # endif
147 # elif defined ( DLL_IMPORT )
148 # if defined ( _MSC_VER ) || defined ( __INTEL_COMPILER )
149 # define VOID_RETURN __declspec ( dllimport ) void __stdcall
150 # define INT_RETURN __declspec ( dllimport ) int __stdcall
151 # elif defined ( __GNUC__ )
152 # define VOID_RETURN __declspec ( __dllimport__ ) void
153 # define INT_RETURN __declspec ( __dllimport__ ) int
154 # else
155 # error Use of the DLL is only available on the Microsoft , Intel
,→ and GCC compilers
156 # endif
157 # elif defined ( __WATCOMC__ )
158 # define VOID_RETURN void __cdecl
159 # define INT_RETURN int __cdecl
160 # else
161 # define VOID_RETURN void
162 # define INT_RETURN int
163 # endif
164 # endif
165

193
C.4 APPENDIX C. SOURCE CODE

166 /* These defines are used to detect and set the memory alignment
,→ of pointers .
167 Note that offsets are in bytes .
168
169 ALIGN_OFFSET (x,n) return the positive or zero
,→offset of
170 the memory addressed by the pointer '
,→x'
171 from an address that is aligned on an
172 'n' byte boundary ('n' is a power of
,→2)
173

174 ALIGN_FLOOR (x,n) return a pointer that points


,→to memory
175 that is aligned on an 'n' byte
,→boundary
176 and is not higher than the memory
,→address
177 pointed to by 'x' ('n' is a power of
,→2)
178
179 ALIGN_CEIL (x,n) return a pointer that
,→ points to memory
180 that is aligned on an 'n' byte
,→boundary
181 and is not lower than the memory
,→address
182 pointed to by 'x' ('n' is a power of
,→2)
183 */
184
185 # define ALIGN_OFFSET (x,n) ((( ptrint_t )(x)) & ((n) - 1))
186 # define ALIGN_FLOOR (x,n) (( uint_8t *)(x) - ( (( ptrint_t )(x)) &
,→((n) - 1)))
187 # define ALIGN_CEIL (x,n) (( uint_8t *)(x) + (-(( ptrint_t )(x)) &
,→((n) - 1)))
188
189 /* These defines are used to declare buffers in a way that allows
190 faster operations on longer variables to be used. In all these
191 defines 'size ' must be a power of 2 and >= 8. NOTE that the
192 buffer size is in bytes but the type length is in bits
193

194 UNIT_TYPEDEF (x,size) declares a variable 'x' of length


195 'size ' bits
196
197 BUFR_TYPEDEF (x,size ,bsize ) declares a buffer 'x' of length '
,→bsize '
198 bytes defined as an array of
,→variables

194
C.4 APPENDIX C. SOURCE CODE

199 each of 'size ' bits ( bsize must be a


200 multiple of size / 8)
201
202 UNIT_CAST (x,size) casts a variable to a type of
203 length 'size ' bits
204

205 UPTR_CAST (x,size) casts a pointer to a pointer to a


206 varaiable of length 'size ' bits
207 */
208
209 # define UI_TYPE (size) uint_ ## size ##t
210 # define UNIT_TYPEDEF (x,size) typedef UI_TYPE (size) x
211 # define BUFR_TYPEDEF (x,size ,bsize) typedef UI_TYPE (size) x[bsize / (
,→size >> 3)]
212 # define UNIT_CAST (x,size) (( UI_TYPE (size) )(x))
213 # define UPTR_CAST (x,size) (( UI_TYPE (size)*)(x))
214
215 #if defined ( __cplusplus )
216 }
217 # endif
218
219 # endif

src/Obf/cmac.c
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4
5 The redistribution and use of this software (with or without changes )
6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10
11 binary distributions include the above copyright notice , this list
12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 6/10/2008
19 */
20

195
C.4 APPENDIX C. SOURCE CODE

21 # include "cmac.h"
22
23 # define BLK_ADR_MASK ( BLOCK_SIZE - 1)
24
25 void cmac_init ( const unsigned char key [], cmac_ctx ctx [1])
26 {
27 memset (ctx , 0, sizeof ( cmac_ctx ));
28 aes_encrypt_key128 (key , ctx ->aes);
29 }
30
31 void cmac_data ( const unsigned char buf [], unsigned long len , cmac_ctx
,→ ctx [1])
32 { uint_32t cnt = 0, b_pos = ctx -> txt_cnt & BLK_ADR_MASK ;
33
34 if(! len)
35 return ;
36
37 if (!(( buf - ( UI8_PTR (ctx -> txt_cbc ) + b_pos)) & BUF_ADRMASK ))
38 {
39 while (cnt < len && (b_pos & BUF_ADRMASK ))
40 UI8_PTR (ctx -> txt_cbc )[b_pos ++] ^= buf[cnt ++];
41
42 while (cnt + BLOCK_SIZE <= len)
43 {
44 while (cnt + BUF_INC <= len && b_pos <= BLOCK_SIZE -
,→BUF_INC )
45 {
46 * UNIT_PTR ( UI8_PTR (ctx -> txt_cbc ) + b_pos) ^= * UNIT_PTR
,→(buf + cnt);
47 cnt += BUF_INC ; b_pos += BUF_INC ;
48 }
49
50 while (cnt + BLOCK_SIZE <= len)
51 {
52 aes_ecb_encrypt ( UI8_PTR (ctx -> txt_cbc ), UI8_PTR (ctx ->
,→txt_cbc ), AES_BLOCK_SIZE , ctx ->aes);
53 xor_block_aligned (ctx ->txt_cbc , ctx ->txt_cbc , buf +
,→cnt);
54 cnt += BLOCK_SIZE ;
55 }
56 }
57 }
58 else
59 {
60 while (cnt < len && b_pos < BLOCK_SIZE )
61 UI8_PTR (ctx -> txt_cbc )[b_pos ++] ^= buf[cnt ++];
62
63 while (cnt + BLOCK_SIZE <= len)
64 {

196
C.4 APPENDIX C. SOURCE CODE

65 aes_ecb_encrypt ( UI8_PTR (ctx -> txt_cbc ), UI8_PTR (ctx ->


,→txt_cbc ), AES_BLOCK_SIZE , ctx ->aes);
66 xor_block (ctx ->txt_cbc , ctx ->txt_cbc , buf + cnt);
67 cnt += BLOCK_SIZE ;
68 }
69 }
70
71 while (cnt < len)
72 {
73 if(b_pos == BLOCK_SIZE )
74 {
75 aes_ecb_encrypt ( UI8_PTR (ctx -> txt_cbc ), UI8_PTR (ctx ->
,→txt_cbc ), AES_BLOCK_SIZE , ctx ->aes);
76 b_pos = 0;
77 }
78 UI8_PTR (ctx -> txt_cbc )[b_pos ++] ^= buf[cnt ++];
79 }
80

81 ctx -> txt_cnt += cnt;


82 }
83
84 static const unsigned char c_xor [4] = { 0x00 , 0x87 , 0x0e , 0x89 };
85
86 static void gf_mulx ( uint_8t pad[ BLOCK_SIZE ])
87 { int i, t = pad [0] >> 7;
88
89 for(i = 0; i < BLOCK_SIZE - 1; ++i)
90 pad[i] = (pad[i] << 1) | (pad[i + 1] >> 7);
91 pad[ BLOCK_SIZE - 1] = (pad[ BLOCK_SIZE - 1] << 1) ^ c_xor[t];
92 }
93
94 void gf_mulx2 ( uint_8t pad[ BLOCK_SIZE ])
95 { int i, t = pad [0] >> 6;
96
97 for(i = 0; i < BLOCK_SIZE - 1; ++i)
98 pad[i] = (pad[i] << 2) | (pad[i + 1] >> 6);
99 pad[ BLOCK_SIZE - 2] ^= (t >> 1);
100 pad[ BLOCK_SIZE - 1] = (pad[ BLOCK_SIZE - 1] << 2) ^ c_xor[t];
101 }
102
103 void cmac_end ( unsigned char auth_tag [], cmac_ctx ctx [1])
104 { buf_type pad;
105 int i;
106
107 memset (pad , 0, sizeof (pad));
108 aes_ecb_encrypt ( UI8_PTR (pad), UI8_PTR (pad), AES_BLOCK_SIZE , ctx ->
,→aes);
109 i = ctx -> txt_cnt & BLK_ADR_MASK ;
110 if(ctx -> txt_cnt == 0 || i)

197
C.4 APPENDIX C. SOURCE CODE

111 {
112 UI8_PTR (ctx -> txt_cbc )[i] ^= 0x80;
113 gf_mulx2 ( UI8_PTR (pad));
114 }
115 else
116 gf_mulx ( UI8_PTR (pad));
117
118 xor_block_aligned (pad , pad , ctx -> txt_cbc );
119 aes_ecb_encrypt ( UI8_PTR (pad), UI8_PTR (pad), AES_BLOCK_SIZE , ctx ->
,→aes);
120
121 for(i = 0; i < BLOCK_SIZE ; ++i)
122 auth_tag [i] = UI8_PTR (pad)[i];
123 }

src/Obf/cmac.h
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4

5 The redistribution and use of this software (with or without changes )


6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10

11 binary distributions include the above copyright notice , this list


12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 6/10/2008
19 */
20
21 # ifndef CMAC_AES_H
22 # define CMAC_AES_H
23
24 #if ! defined ( UNIT_BITS )
25 # if 1
26 # define UNIT_BITS 64
27 # elif 0
28 # define UNIT_BITS 32

198
C.4 APPENDIX C. SOURCE CODE

29 # else
30 # define UNIT_BITS 8
31 # endif
32 # endif
33
34 # include <string .h>
35 # include "aes.h"
36 # include " mode_hdr .h"
37
38 UNIT_TYPEDEF (buf_unit , UNIT_BITS );
39 BUFR_TYPEDEF (buf_type , UNIT_BITS , AES_BLOCK_SIZE );
40

41 #if defined ( __cplusplus )


42 extern "C"
43 {
44 # endif
45
46 # define BLOCK_SIZE AES_BLOCK_SIZE
47
48 typedef struct
49 {
50 buf_type txt_cbc ;
51 aes_encrypt_ctx aes [1]; /* AES encryption context
,→ */
52 uint_32t txt_cnt ;
53 } cmac_ctx ;
54
55 void cmac_init ( const unsigned char key [], /* the encryption key
,→ */
56 cmac_ctx ctx [1] ); /* the OMAC context
,→ */
57
58 void cmac_data ( const unsigned char buf [], /* the data buffer
,→ */
59 unsigned long len , /* the length of this
,→block ( bytes ) */
60 cmac_ctx ctx [1] ); /* the OMAC context
,→ */
61
62 void cmac_end ( unsigned char auth_tag [], /* the encryption key
,→ */
63 cmac_ctx ctx [1] ); /* the OMAC context
,→ */
64
65 #if defined ( __cplusplus )
66 }
67 # endif
68

69 # endif

199
C.4 APPENDIX C. SOURCE CODE

src/Obf/mode_hdr.h
1 /*
2 ---------------------------------------------------------------------------
,→
3 Copyright (c) 1998 -2010 , Brian Gladman , Worcester , UK. All rights
,→reserved .
4

5 The redistribution and use of this software (with or without changes )


6 is allowed without the payment of fees or royalties provided that:
7
8 source code distributions include the above copyright notice , this
9 list of conditions and the following disclaimer ;
10

11 binary distributions include the above copyright notice , this list


12 of conditions and the following disclaimer in their documentation .
13
14 This software is provided 'as is' with no explicit or implied
,→warranties
15 in respect of its operation , including , but not limited to ,
,→correctness
16 and fitness for purpose .
17 ---------------------------------------------------------------------------
,→
18 Issue Date: 07/10/2010
19

20 This header file is an INTERNAL file which supports mode


,→implementation
21 */
22
23 # ifndef _MODE_HDR_H
24 # define _MODE_HDR_H
25
26 # include <string .h>
27 # include <limits .h>
28
29 # include " brg_endian .h"
30

31 /* This define sets the units in which buffers are processed . This
,→code
32 can provide significant speed gains if buffers can be processed
,→in
33 32 or 64 bit chunks rather than in bytes . This define sets the
,→units
34 in which buffers will be accessed if possible
35 */
36 #if ! defined ( UNIT_BITS )
37 # if PLATFORM_BYTE_ORDER == IS_BIG_ENDIAN
38 # if 0
39 # define UNIT_BITS 32

200
C.4 APPENDIX C. SOURCE CODE

40 # elif 1
41 # define UNIT_BITS 64
42 # endif
43 # elif defined ( _WIN64 )
44 # define UNIT_BITS 64
45 # else
46 # define UNIT_BITS 32
47 # endif
48 # endif
49
50 #if UNIT_BITS == 64 && ! defined ( NEED_UINT_64T )
51 # define NEED_UINT_64T
52 # endif
53
54 # include " brg_types .h"
55
56 /* Use of inlines is preferred but code blocks can also be expanded
,→inline
57 using 'defines '. But the latter approach will typically generate
,→ a LOT
58 of code and is not recommended .
59 */
60 #if 1 && ! defined ( USE_INLINING )
61 # define USE_INLINING
62 # endif
63
64 #if defined ( _MSC_VER )
65 # if _MSC_VER >= 1400
66 # include <stdlib .h>
67 # include <intrin .h>
68 # pragma intrinsic ( memset )
69 # pragma intrinsic ( memcpy )
70 # define rotl32 _rotl
71 # define rotr32 _rotr
72 # define rotl64 _rotl64
73 # define rotr64 _rotl64
74 # define bswap_16 (x) _byteswap_ushort (x)
75 # define bswap_32 (x) _byteswap_ulong (x)
76 # define bswap_64 (x) _byteswap_uint64 (x)
77 # else
78 # define rotl32 _lrotl
79 # define rotr32 _lrotr
80 # endif
81 # endif
82
83 #if defined ( USE_INLINING )
84 # if defined ( _MSC_VER )
85 # define mh_decl __inline
86 # elif defined ( __GNUC__ ) || defined ( __GNU_LIBRARY__ )

201
C.4 APPENDIX C. SOURCE CODE

87 # define mh_decl static inline


88 # else
89 # define mh_decl static
90 # endif
91 # endif
92

93 #if defined ( __cplusplus )


94 extern "C" {
95 # endif
96
97 # define UI8_PTR (x) UPTR_CAST (x, 8)
98 # define UI16_PTR (x) UPTR_CAST (x, 16)
99 # define UI32_PTR (x) UPTR_CAST (x, 32)
100 # define UI64_PTR (x) UPTR_CAST (x, 64)
101 # define UNIT_PTR (x) UPTR_CAST (x, UNIT_BITS )
102
103 # define UI8_VAL (x) UNIT_CAST (x, 8)
104 # define UI16_VAL (x) UNIT_CAST (x, 16)
105 # define UI32_VAL (x) UNIT_CAST (x, 32)
106 # define UI64_VAL (x) UNIT_CAST (x, 64)
107 # define UNIT_VAL (x) UNIT_CAST (x, UNIT_BITS )
108
109 # define BUF_INC ( UNIT_BITS >> 3)
110 # define BUF_ADRMASK (( UNIT_BITS >> 3) - 1)
111
112 # define rep2_u2 (f,r,x) f( 0,r,x); f( 1,r,x)
113 # define rep2_u4 (f,r,x) f( 0,r,x); f( 1,r,x); f( 2,r,x); f( 3,r,x)
114 # define rep2_u16 (f,r,x) f( 0,r,x); f( 1,r,x); f( 2,r,x); f( 3,r,x);
,→ \
115 f( 4,r,x); f( 5,r,x); f( 6,r,x); f( 7,r,x);
,→ \
116 f( 8,r,x); f( 9,r,x); f(10,r,x); f(11,r,x);
,→ \
117 f(12,r,x); f(13,r,x); f(14,r,x); f(15,r,x)
118
119 # define rep2_d2 (f,r,x) f( 1,r,x); f( 0,r,x)
120 # define rep2_d4 (f,r,x) f( 3,r,x); f( 2,r,x); f( 1,r,x); f( 0,r,x)
121 # define rep2_d16 (f,r,x) f(15,r,x); f(14,r,x); f(13,r,x); f(12,r,x);
,→ \
122 f(11,r,x); f(10,r,x); f( 9,r,x); f( 8,r,x);
,→ \
123 f( 7,r,x); f( 6,r,x); f( 5,r,x); f( 4,r,x);
,→ \
124 f( 3,r,x); f( 2,r,x); f( 1,r,x); f( 0,r,x)
125
126 # define rep3_u2 (f,r,x,y,c) f( 0,r,x,y,c); f( 1,r,x,y,c)
127 # define rep3_u4 (f,r,x,y,c) f( 0,r,x,y,c); f( 1,r,x,y,c); f( 2,r,x,y,
,→c); f( 3,r,x,y,c)
128 # define rep3_u16 (f,r,x,y,c) f( 0,r,x,y,c); f( 1,r,x,y,c); f( 2,r,x,y,

202
C.4 APPENDIX C. SOURCE CODE

,→c); f( 3,r,x,y,c); \
129 f( 4,r,x,y,c); f( 5,r,x,y,c); f( 6,r,x,y,
,→c); f( 7,r,x,y,c); \
130 f( 8,r,x,y,c); f( 9,r,x,y,c); f(10,r,x,y,
,→c); f(11,r,x,y,c); \
131 f(12,r,x,y,c); f(13,r,x,y,c); f(14,r,x,y,
,→c); f(15,r,x,y,c)
132
133 # define rep3_d2 (f,r,x,y,c) f( 1,r,x,y,c); f( 0,r,x,y,c)
134 # define rep3_d4 (f,r,x,y,c) f( 3,r,x,y,c); f( 2,r,x,y,c); f( 1,r,x,y,
,→c); f( 0,r,x,y,c)
135 # define rep3_d16 (f,r,x,y,c) f(15,r,x,y,c); f(14,r,x,y,c); f(13,r,x,y,
,→c); f(12,r,x,y,c); \
136 f(11,r,x,y,c); f(10,r,x,y,c); f( 9,r,x,y,
,→c); f( 8,r,x,y,c); \
137 f( 7,r,x,y,c); f( 6,r,x,y,c); f( 5,r,x,y,
,→c); f( 4,r,x,y,c); \
138 f( 3,r,x,y,c); f( 2,r,x,y,c); f( 1,r,x,y,
,→c); f( 0,r,x,y,c)
139
140 /* function pointers might be used for fast XOR operations */
141
142 typedef void (* xor_function )(void* r, const void* p, const void* q);
143

144 /* left and right rotates on 32 and 64 bit variables */


145
146 #if ! defined ( rotl32 ) /* NOTE: 0 <= n <= 32 ASSUMED */
147 mh_decl uint_32t rotl32 ( uint_32t x, int n)
148 {
149 return (((x) << n) | ((x) >> (32 - n)));
150 }
151 # endif
152
153 #if ! defined ( rotr32 ) /* NOTE: 0 <= n <= 32 ASSUMED */
154 mh_decl uint_32t rotr32 ( uint_32t x, int n)
155 {
156 return (((x) >> n) | ((x) << (32 - n)));
157 }
158 # endif
159
160 #if UNIT_BITS == 64 && ! defined ( rotl64 ) /* NOTE: 0 <= n <= 64
,→ASSUMED */
161 mh_decl uint_64t rotl64 ( uint_64t x, int n)
162 {
163 return (((x) << n) | ((x) >> (64 - n)));
164 }
165 # endif
166

167 #if UNIT_BITS == 64 && ! defined ( rotr64 ) /* NOTE: 0 <= n <= 64

203
C.4 APPENDIX C. SOURCE CODE

,→ASSUMED */
168 mh_decl uint_64t rotr64 ( uint_64t x, int n)
169 {
170 return (((x) >> n) | ((x) << (64 - n)));
171 }
172 # endif
173
174 /* byte order inversions for 16, 32 and 64 bit variables */
175
176 #if ! defined ( bswap_16 )
177 mh_decl uint_16t bswap_16 ( uint_16t x)
178 {
179 return ( uint_16t )((x >> 8) | (x << 8));
180 }
181 # endif
182
183 #if ! defined ( bswap_32 )
184 mh_decl uint_32t bswap_32 ( uint_32t x)
185 {
186 return (( rotr32 ((x), 24) & 0 x00ff00ff ) | ( rotr32 ((x), 8) & 0
,→xff00ff00 ));
187 }
188 # endif
189

190 #if UNIT_BITS == 64 && ! defined ( bswap_64 )


191 mh_decl uint_64t bswap_64 ( uint_64t x)
192 {
193 return bswap_32 (( uint_32t )(x >> 32)) | (( uint_64t ) bswap_32 ((
,→uint_32t )x) << 32);
194 }
195 # endif
196
197 /* support for fast aligned buffer move , xor and byte swap operations
,→ -
198 source and destination buffers for move and xor operations must
,→not
199 overlap , those for byte order revesal must either not overlap or
200 must be identical
201 */
202 # define f_copy (n,p,q) p[n] = q[n]
203 # define f_xor(n,r,p,q,c) r[n] = c(p[n] ^ q[n])
204

205 mh_decl void copy_block (void* p, const void* q)


206 {
207 memcpy (p, q, 16);
208 }
209
210 mh_decl void copy_block_aligned (void *p, const void *q)
211 {

204
C.4 APPENDIX C. SOURCE CODE

212 #if UNIT_BITS == 8


213 memcpy (p, q, 16);
214 #elif UNIT_BITS == 32
215 rep2_u4 (f_copy , UNIT_PTR (p),UNIT_PTR (q));
216 #else
217 rep2_u2 (f_copy , UNIT_PTR (p),UNIT_PTR (q));
218 # endif
219 }
220
221 mh_decl void xor_block (void *r, const void* p, const void* q)
222 {
223 rep3_u16 (f_xor , UI8_PTR (r), UI8_PTR (p), UI8_PTR (q), UI8_VAL );
224 }
225
226 mh_decl void xor_block_aligned (void *r, const void *p, const void *q)
227 {
228 #if UNIT_BITS == 8
229 rep3_u16 (f_xor , UNIT_PTR (r), UNIT_PTR (p), UNIT_PTR (q), UNIT_VAL );
230 #elif UNIT_BITS == 32
231 rep3_u4 (f_xor , UNIT_PTR (r), UNIT_PTR (p), UNIT_PTR (q), UNIT_VAL );
232 #else
233 rep3_u2 (f_xor , UNIT_PTR (r), UNIT_PTR (p), UNIT_PTR (q), UNIT_VAL );
234 # endif
235 }
236
237 /* byte swap within 32- bit words in a 16 byte block ; don 't move 32-
,→bit words */
238 mh_decl void bswap32_block (void *d, const void* s)
239 {
240 #if UNIT_BITS == 8
241 uint_8t t;
242 t = UNIT_PTR (s)[ 0]; UNIT_PTR (d)[ 0] = UNIT_PTR (s)[ 3]; UNIT_PTR (
,→d)[ 3] = t;
243 t = UNIT_PTR (s)[ 1]; UNIT_PTR (d)[ 1] = UNIT_PTR (s)[ 2]; UNIT_PTR (
,→d)[ 2] = t;
244 t = UNIT_PTR (s)[ 4]; UNIT_PTR (d)[ 4] = UNIT_PTR (s)[ 7]; UNIT_PTR (
,→d)[ 7] = t;
245 t = UNIT_PTR (s)[ 5]; UNIT_PTR (d)[ 5] = UNIT_PTR (s)[ 6]; UNIT_PTR (
,→d) [6] = t;
246 t = UNIT_PTR (s)[ 8]; UNIT_PTR (d)[ 8] = UNIT_PTR (s)[11]; UNIT_PTR (
,→d)[12] = t;
247 t = UNIT_PTR (s)[ 9]; UNIT_PTR (d)[ 9] = UNIT_PTR (s)[10]; UNIT_PTR (
,→d)[10] = t;
248 t = UNIT_PTR (s)[12]; UNIT_PTR (d)[12] = UNIT_PTR (s)[15]; UNIT_PTR (
,→d)[15] = t;
249 t = UNIT_PTR (s)[13]; UNIT_PTR (d)[ 3] = UNIT_PTR (s)[14]; UNIT_PTR (
,→d)[14] = t;
250 #elif UNIT_BITS == 32
251 UNIT_PTR (d)[0] = bswap_32 ( UNIT_PTR (s)[0]); UNIT_PTR (d)[1] =

205
C.4 APPENDIX C. SOURCE CODE

,→bswap_32 ( UNIT_PTR (s)[1]);


252 UNIT_PTR (d)[2] = bswap_32 ( UNIT_PTR (s)[2]); UNIT_PTR (d)[3] =
,→bswap_32 ( UNIT_PTR (s)[3]);
253 #else
254 UI32_PTR (d)[0] = bswap_32 ( UI32_PTR (s)[0]); UI32_PTR (d)[1] =
,→bswap_32 ( UI32_PTR (s)[1]);
255 UI32_PTR (d)[2] = bswap_32 ( UI32_PTR (s)[2]); UI32_PTR (d)[3] =
,→bswap_32 ( UI32_PTR (s)[3]);
256 # endif
257 }
258
259 /* byte swap within 64- bit words in a 16 byte block ; don 't move 64-
,→bit words */
260 mh_decl void bswap64_block (void *d, const void* s)
261 {
262 #if UNIT_BITS == 8
263 uint_8t t;
264 t = UNIT_PTR (s)[ 0]; UNIT_PTR (d)[ 0] = UNIT_PTR (s)[ 7]; UNIT_PTR (
,→d)[ 7] = t;
265 t = UNIT_PTR (s)[ 1]; UNIT_PTR (d)[ 1] = UNIT_PTR (s)[ 6]; UNIT_PTR (
,→d)[ 6] = t;
266 t = UNIT_PTR (s)[ 2]; UNIT_PTR (d)[ 2] = UNIT_PTR (s)[ 5]; UNIT_PTR (
,→d)[ 5] = t;
267 t = UNIT_PTR (s)[ 3]; UNIT_PTR (d)[ 3] = UNIT_PTR (s)[ 3]; UNIT_PTR (
,→d) [3] = t;
268 t = UNIT_PTR (s)[ 8]; UNIT_PTR (d)[ 8] = UNIT_PTR (s)[15]; UNIT_PTR (
,→d)[15] = t;
269 t = UNIT_PTR (s)[ 9]; UNIT_PTR (d)[ 9] = UNIT_PTR (s)[14]; UNIT_PTR (
,→d)[14] = t;
270 t = UNIT_PTR (s)[10]; UNIT_PTR (d)[10] = UNIT_PTR (s)[13]; UNIT_PTR (
,→d)[13] = t;
271 t = UNIT_PTR (s)[11]; UNIT_PTR (d)[11] = UNIT_PTR (s)[12]; UNIT_PTR (
,→d)[12] = t;
272 #elif UNIT_BITS == 32
273 uint_32t t;
274 t = bswap_32 ( UNIT_PTR (s)[0]); UNIT_PTR (d)[0] = bswap_32 ( UNIT_PTR (
,→s)[1]); UNIT_PTR (d)[1] = t;
275 t = bswap_32 ( UNIT_PTR (s)[2]); UNIT_PTR (d)[2] = bswap_32 ( UNIT_PTR (
,→s)[2]); UNIT_PTR (d)[3] = t;
276 #else
277 UNIT_PTR (d)[0] = bswap_64 ( UNIT_PTR (s)[0]); UNIT_PTR (d)[1] =
,→bswap_64 ( UNIT_PTR (s)[1]);
278 # endif
279 }
280
281 mh_decl void bswap128_block (void *d, const void* s)
282 {
283 #if UNIT_BITS == 8
284 uint_8t t;

206
C.4 APPENDIX C. SOURCE CODE

285 t = UNIT_PTR (s)[0]; UNIT_PTR (d)[0] = UNIT_PTR (s)[15]; UNIT_PTR (d)


,→[15] = t;
286 t = UNIT_PTR (s)[1]; UNIT_PTR (d)[1] = UNIT_PTR (s)[14]; UNIT_PTR (d)
,→[14] = t;
287 t = UNIT_PTR (s)[2]; UNIT_PTR (d)[2] = UNIT_PTR (s)[13]; UNIT_PTR (d)
,→[13] = t;
288 t = UNIT_PTR (s)[3]; UNIT_PTR (d)[3] = UNIT_PTR (s)[12]; UNIT_PTR (d)
,→[12] = t;
289 t = UNIT_PTR (s)[4]; UNIT_PTR (d)[4] = UNIT_PTR (s)[11]; UNIT_PTR (d)
,→[11] = t;
290 t = UNIT_PTR (s)[5]; UNIT_PTR (d)[5] = UNIT_PTR (s)[10]; UNIT_PTR (d)
,→[10] = t;
291 t = UNIT_PTR (s)[6]; UNIT_PTR (d)[6] = UNIT_PTR (s)[ 9]; UNIT_PTR (d)
,→[ 9] = t;
292 t = UNIT_PTR (s)[7]; UNIT_PTR (d)[7] = UNIT_PTR (s)[ 8]; UNIT_PTR (d)
,→[ 8] = t;
293 #elif UNIT_BITS == 32
294 uint_32t t;
295 t = bswap_32 ( UNIT_PTR (s)[0]); UNIT_PTR (d)[0] = bswap_32 ( UNIT_PTR (
,→s)[3]); UNIT_PTR (d)[3] = t;
296 t = bswap_32 ( UNIT_PTR (s)[1]); UNIT_PTR (d)[1] = bswap_32 ( UNIT_PTR (
,→s)[2]); UNIT_PTR (d)[2] = t;
297 #else
298 uint_64t t;
299 t = bswap_64 ( UNIT_PTR (s)[0]); UNIT_PTR (d)[0] = bswap_64 ( UNIT_PTR (
,→s)[1]); UNIT_PTR (d)[1] = t;
300 # endif
301 }
302
303 /* platform byte order to big or little endian order for 16, 32 and
,→64 bit variables */
304
305 #if PLATFORM_BYTE_ORDER == IS_BIG_ENDIAN
306
307 # define uint_16t_to_le (x) (x) = bswap_16 ((x))
308 # define uint_32t_to_le (x) (x) = bswap_32 ((x))
309 # define uint_64t_to_le (x) (x) = bswap_64 ((x))
310 # define uint_16t_to_be (x)
311 # define uint_32t_to_be (x)
312 # define uint_64t_to_be (x)
313
314 #else
315
316 # define uint_16t_to_le (x)
317 # define uint_32t_to_le (x)
318 # define uint_64t_to_le (x)
319 # define uint_16t_to_be (x) (x) = bswap_16 ((x))
320 # define uint_32t_to_be (x) (x) = bswap_32 ((x))
321 # define uint_64t_to_be (x) (x) = bswap_64 ((x))

207
C.4 APPENDIX C. SOURCE CODE

322
323 # endif
324
325 #if defined ( __cplusplus )
326 }
327 # endif
328
329 # endif

208

You might also like