CryptDB: Confidentiality for Database Applications with Encrypted Query Processing
Raluca Ada Popa, Catherine Redfield, Nickolai Zeldovich, and Hari Balakrishnan
MIT CSAIL
Berkeley Cloud Computing Seminar, 2011
Problem: Confidential Data Leaks
curious DB administrators
User 1
User 2
User 3
Application
SQL
DB Server
Threats: hackers, curious cloud/employees, physical attacks
Both on private clouds and public clouds
Regulatory laws
CryptDB
Goal: protect confidentiality of data
Threat 1: passive attacks on DB server
Threat 2: active/passive attacks on all servers
user password
User 1 User 2 User 3
Application
Proxy
SQL
DB Server
1. Process SQL queries on encrypted data
2. Capture and cryptographically enforce access control in SQL: chain keys from user passwords to data items
Threat Model
Consider attacks on any part of the servers
We do not consider integrity attacks: an attacker can affect data integrity, but not confidentiality
Threat 1: Passive attacks to DB Server
Trusted
application queries unencrypted
Under attack
SQL
Proxy
DB Server
Proxy:
Stores schema, master key
Decrypts results
No query execution
Perform SQL query processing on encrypted data
1. Support standard SQL queries on encrypted data
2. Process queries completely at the DB server
3. No change to existing DBMS
Application
Example
table1 (emp)
Application to proxy: SELECT * FROM emp WHERE salary = 100
Proxy to DB server: SELECT * FROM table1 WHERE col3 = x5a8c34
[Figure: encrypted table1 with columns col1/rank, col2/name, col3/salary. The plaintext salaries are 60, 100, 800, 100; the server stores only ciphertexts such as x934bc1, x4be219, x95c623, and returns the matching encrypted rows to the proxy.]
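The rewriting step in this example can be sketched in Python with an in-memory SQLite database standing in for the DB server. This is only an illustration under stated assumptions: `KEY`, `det_encrypt`, `det_decrypt` and `select_salary_equals` are hypothetical names, and the toy deterministic cipher (HMAC-SHA256 keystream with a plaintext-derived IV) is a stand-in, not the cipher CryptDB uses.

```python
import hashlib
import hmac
import sqlite3

KEY = b"hypothetical-proxy-key"  # assumption: one key held only by the proxy


def _keystream(key: bytes, iv: bytes, n: int) -> bytes:
    """Toy keystream: HMAC-SHA256 over iv || counter."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, iv + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def det_encrypt(key: bytes, value: int) -> str:
    """Deterministic toy cipher: the IV is a MAC of the plaintext (SIV style)."""
    pt = str(value).encode()
    iv = hmac.new(key, b"iv:" + pt, hashlib.sha256).digest()[:8]
    ct = bytes(a ^ b for a, b in zip(pt, _keystream(key, iv, len(pt))))
    return (iv + ct).hex()


def det_decrypt(key: bytes, token: str) -> int:
    raw = bytes.fromhex(token)
    iv, ct = raw[:8], raw[8:]
    pt = bytes(a ^ b for a, b in zip(ct, _keystream(key, iv, len(ct))))
    return int(pt.decode())


# "DB server": stores and compares ciphertexts only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE table1 (col3 TEXT)")
for salary in (60, 100, 800, 100):
    db.execute("INSERT INTO table1 VALUES (?)", (det_encrypt(KEY, salary),))


def select_salary_equals(salary: int) -> list:
    """Proxy: encrypt the constant, run the query, decrypt the results."""
    rows = db.execute(
        "SELECT col3 FROM table1 WHERE col3 = ?", (det_encrypt(KEY, salary),)
    ).fetchall()
    return [det_decrypt(KEY, row[0]) for row in rows]


print(select_salary_equals(100))  # [100, 100]
```

The server evaluates the equality predicate without ever seeing the value 100; it only learns which rows share a ciphertext.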
Two techniques
1. SQL-aware encryption strategy
Observation: the set of SQL operators is limited
Different encryption schemes provide different functionality
2. Adjustable query-based encryption
Adapt encryption of data based on user queries
1. SQL-aware encryption
Scheme   Operation   Details                                               Examples
RND      none        AES in UFE
HOM      +, *        e.g., Paillier
DET      equality    AES in CTR                                            =, !=, GROUP BY, IN, COUNT, DISTINCT
JOIN     join        new
SEARCH   ILIKE       Amanatidis et al. 07
OPE      order       Boldyreva et al. 09; first practical implementation   >, <, ORDER BY, SORT, MAX, MIN
Security: highest at the top (RND)
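To make the HOM row concrete, here is a minimal textbook Paillier sketch showing how a server could add encrypted integers without decrypting them. The parameters are assumptions chosen for readability (two small Mersenne primes, far too small for real use), and the function names are hypothetical; it covers addition only.

```python
import math
import secrets

# Toy parameters: two known Mersenne primes. Real deployments use much larger primes.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is g = N + 1


def encrypt(m: int) -> int:
    """Enc(m) = (1 + N)^m * r^N mod N^2, with random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def add(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2


def decrypt(c: int) -> int:
    return (pow(c, LAM, N2) - 1) // N * MU % N


# Server side: sum a salary column while seeing only ciphertexts.
encrypted_salaries = [encrypt(s) for s in (60, 100, 800, 100)]
encrypted_sum = encrypted_salaries[0]
for c in encrypted_salaries[1:]:
    encrypted_sum = add(encrypted_sum, c)

print(decrypt(encrypted_sum))  # 1060
```

Encryption is randomized, so two encryptions of the same salary differ, which is why this layer reveals nothing about equality.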
Onions of encryptions
Significant confidentiality and space savings
Onion 1: RND, DET, SEARCH, JOIN layers around any value
Onion 2: RND, OPE, OPE-JOIN layers around any value
Onion 3: HOM around an int value
Each column has the same key in a given layer of an onion
2. Adjustable query-based encryption
Start out the database with the most secure encryption scheme
Adjust encryption dynamically
Strip off levels of the onions: proxy gives key to server using a UDF
Example
emp: rank name salary
Onion 1 of salary: RND, DET, SEARCH, JOIN layers around any value
Application query:
SELECT * FROM emp WHERE salary = 100
Proxy issues:
UPDATE table1 SET col3onion1 = DecryptRND(key, col3onion1)
SELECT * FROM table1 WHERE col3onion1 = x5a8c34
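The onion-stripping step can be tried out with SQLite, whose Python binding lets the caller register a user-defined function. The sketch below is a toy under stated assumptions: `K_DET` and `K_RND` are hypothetical layer keys, the ciphers are HMAC-based stand-ins for short values, and only the UDF name `DecryptRND` and the column name come from the slide.

```python
import hashlib
import hmac
import secrets
import sqlite3

K_DET = b"hypothetical DET layer key"
K_RND = b"hypothetical RND layer key"


def _xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 pad (toy cipher, inputs up to 32 bytes)."""
    pad = hmac.new(key, iv, hashlib.sha256).digest()
    assert len(data) <= len(pad)
    return bytes(a ^ b for a, b in zip(data, pad))


def det_encrypt(key: bytes, pt: bytes) -> bytes:
    iv = hmac.new(key, b"iv:" + pt, hashlib.sha256).digest()[:8]  # plaintext-derived IV
    return iv + _xor(key, iv, pt)


def rnd_encrypt(key: bytes, pt: bytes) -> bytes:
    iv = secrets.token_bytes(8)  # fresh random IV per value
    return iv + _xor(key, iv, pt)


def rnd_decrypt(key: bytes, blob: bytes) -> bytes:
    return _xor(key, blob[:8], blob[8:])


db = sqlite3.connect(":memory:")
db.create_function("DecryptRND", 2, rnd_decrypt)  # UDF installed at the "server"
db.execute("CREATE TABLE table1 (col3onion1 BLOB)")
for salary in (60, 100, 800, 100):
    onion = rnd_encrypt(K_RND, det_encrypt(K_DET, str(salary).encode()))
    db.execute("INSERT INTO table1 VALUES (?)", (onion,))

constant = det_encrypt(K_DET, b"100")
query = "SELECT COUNT(*) FROM table1 WHERE col3onion1 = ?"

before = db.execute(query, (constant,)).fetchone()[0]  # RND layer hides equality
# Proxy hands the RND key to the server, which strips the outer layer in place.
db.execute("UPDATE table1 SET col3onion1 = DecryptRND(?, col3onion1)", (K_RND,))
after = db.execute(query, (constant,)).fetchone()[0]

print(before, after)  # 0 2
```

After the UPDATE the column sits at the DET layer, so later equality queries need no further decryption at the server.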
JOIN needs new crypto
Challenge: do not know which columns will be joined
[Figure: the proxy gives the DB server a join key for the column pair Col1-Col2.]
Data items not revealed, cannot join without join key
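One way such a join key could work is sketched below: each column's tokens are group elements keyed per column, and the proxy releases a single "delta" that lets the server re-key one column to match another. This is an assumption-laden toy, not necessarily the construction the talk refers to: it uses a tiny multiplicative group (P = 2039) instead of a cryptographically sized group, and every name is hypothetical.

```python
import hashlib
import hmac

# Toy group: P = 2*Q + 1 with P, Q prime; G = 4 generates the subgroup of order Q.
P, Q, G = 2039, 1019, 4

K0 = b"hypothetical shared PRF key"
K_COL1, K_COL2 = 123, 877  # hypothetical per-column join keys in [1, Q-1]


def _exponent(value: str) -> int:
    """Map a value to a nonzero exponent in [1, Q-1] with a keyed hash."""
    digest = hmac.new(K0, value.encode(), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest, "big") % (Q - 1)


def join_token(col_key: int, value: str) -> int:
    """Token stored at the server: G^(col_key * PRF(value)) mod P."""
    return pow(G, (col_key * _exponent(value)) % Q, P)


def join_delta(from_key: int, to_key: int) -> int:
    """Computed by the proxy: the join key for one pair of columns."""
    return (to_key * pow(from_key, -1, Q)) % Q


def adjust(token: int, delta: int) -> int:
    """Run by the server: re-key a token without learning the value."""
    return pow(token, delta, P)


t1 = join_token(K_COL1, "alice")  # value as stored in Col1
t2 = join_token(K_COL2, "alice")  # same value as stored in Col2
delta = join_delta(K_COL1, K_COL2)

print(t1 == t2)                 # False: columns are not joinable by default
print(adjust(t1, delta) == t2)  # True: joinable once the join key is released
```

Without the delta for a given pair, equal values in the two columns carry different tokens, which matches the slide's point that the server cannot join without the join key.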
Other queries
Various others supported:
Inserts, updates, deletes, nested queries
Indexes
Transactions, auto-increments
Not supported: A.a + A.b > B.c
Security converges
Onion levels stripped only when new operations needed
Steady state: no decryptions at server
Practical: typical SQL processing on enlarged tuples
Confidentiality Guarantees
Formal security definition and proof
Implications, for emp (rank, name, salary):
If the query has                 the server learns
equality predicate on name       repeats
order predicate on name          order
aggregation on salary            nothing
no filter on a column            nothing
Never reveal plaintext
Server cannot compute queries requiring unrequested relationships
Picture so far
User 1 User 2 User 3
Proxy (under attack)
DB Server (under attack)
Application
SQL
Threat 2: arbitrary confidentiality attacks on any servers
Each user password gives access to data allowed by access control policy of application
Problem: data sharing
1. How to capture read access policy of application at SQL granularity?
   Annotations: app. policy -> SQL policy
2. How to enforce access control cryptographically?
   Key chaining from password to data item in DB
3. How to execute queries?
   Process on encrypted data as before!
Key chaining to user passwords
Enforce access control graph cryptographically
All key chaining operations done at proxy, keys stored encrypted at DB server
Principals and key chain:
Alice (username: Alice, password: amplab): SK_a = password; E_SK_a[SK_u1] is stored for userid 1
Bob (username: Bob, password: cloud): SK_b = password; E_SK_b[SK_u2] is stored for userid 2
userid 1 (key SK_u1): E_SK_u1[SK_m5] is stored for msgid 5
userid 2 (key SK_u2): E_SK_u2[SK_m5] is stored for msgid 5
msgid 5 (key SK_m5): encrypts "secret message"
Also use public key pair
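The chain from password to data item can be sketched as follows. This is an illustrative toy, not the system's actual key management: the key derivation (PBKDF2), the HMAC-based `wrap`/`unwrap` helpers, the `stored` dictionary standing in for the DB server, and the function `read_msg5` are all assumptions introduced here.

```python
import hashlib
import hmac
import secrets


def key_from_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def wrap(wrapping_key: bytes, secret: bytes) -> bytes:
    """Toy authenticated wrap for secrets of at most 32 bytes."""
    assert len(secret) <= 32
    nonce = secrets.token_bytes(16)
    pad = hmac.new(wrapping_key, b"enc" + nonce, hashlib.sha256).digest()
    ct = bytes(a ^ b for a, b in zip(secret, pad))
    tag = hmac.new(wrapping_key, b"tag" + nonce + ct, hashlib.sha256).digest()
    return nonce + tag + ct


def unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    nonce, tag, ct = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(wrapping_key, b"tag" + nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key")
    pad = hmac.new(wrapping_key, b"enc" + nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(ct, pad))


# Keys for the principals userid 1, userid 2 and msgid 5.
sk_u1, sk_u2, sk_m5 = (secrets.token_bytes(32) for _ in range(3))
salts = {"Alice": secrets.token_bytes(16), "Bob": secrets.token_bytes(16)}

# What the DB server stores: only wrapped keys and ciphertext.
stored = {
    "Alice->user1": wrap(key_from_password("amplab", salts["Alice"]), sk_u1),
    "Bob->user2": wrap(key_from_password("cloud", salts["Bob"]), sk_u2),
    "user1->msg5": wrap(sk_u1, sk_m5),
    "user2->msg5": wrap(sk_u2, sk_m5),
    "msg5": wrap(sk_m5, b"secret message"),
}


def read_msg5(username: str, password: str, userid: int) -> bytes:
    """Proxy: follow the chain password -> user key -> message key -> data."""
    sk_login = key_from_password(password, salts[username])
    sk_user = unwrap(sk_login, stored[f"{username}->user{userid}"])
    sk_msg = unwrap(sk_user, stored[f"user{userid}->msg5"])
    return unwrap(sk_msg, stored["msg5"])


print(read_msg5("Bob", "cloud", 2))  # b'secret message'
```

Without a valid password the first `unwrap` fails, so nothing further down the chain can be decrypted from the stored material alone.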
Annotations
Observation: each row in certain tables naturally specifies
1. permission flow between principals
2. how data should be encrypted
privmsgs_to:
msgid  senderid  recipientid
5      1         2
6      9         6
privmsgs:
msgid  msgtext
5      secret message
6      hello world
Annotations
1. Principals
2. ENCRYPT_FOR
3. HAS_ACCESS_TO
Securing phpBB private messages:
PRINC TYPES physical_user EXTERNAL;
PRINC TYPES user, msg;

CREATE TABLE privmsgs (
    msgid int,
    subject varchar(255) ENCRYPT_FOR PRINC msgid TYPE msg,
    msgtext text ENCRYPT_FOR PRINC msgid TYPE msg
);

CREATE TABLE privmsgs_to (
    msgid int,
    rcpt_id int,
    sender_id int,
    PRINC sender_id TYPE user HAS_ACCESS_TO PRINC msgid TYPE msg,
    PRINC rcpt_id TYPE user HAS_ACCESS_TO PRINC msgid TYPE msg
);

CREATE TABLE users (
    userid int,
    username varchar(255),
    PRINC username TYPE physical_user HAS_ACCESS_TO PRINC userid TYPE user
);
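The effect of the HAS_ACCESS_TO annotations can be pictured as a graph that each table row extends. The sketch below is a hypothetical model in Python, not how the proxy is implemented: it reuses the sample rows shown earlier in the talk (messages 5 and 6, users Alice and Bob) and checks reachability from a logged-in user to a message principal.

```python
from collections import deque

# Sample rows from the privmsgs_to and users tables.
privmsgs_to = [
    {"msgid": 5, "sender_id": 1, "rcpt_id": 2},
    {"msgid": 6, "sender_id": 9, "rcpt_id": 6},
]
users = [
    {"userid": 1, "username": "Alice"},
    {"userid": 2, "username": "Bob"},
]

# Each annotated row contributes edges "principal A HAS_ACCESS_TO principal B".
edges = {}


def grant(src: tuple, dst: tuple) -> None:
    edges.setdefault(src, set()).add(dst)


for row in users:
    grant(("physical_user", row["username"]), ("user", row["userid"]))
for row in privmsgs_to:
    grant(("user", row["sender_id"]), ("msg", row["msgid"]))
    grant(("user", row["rcpt_id"]), ("msg", row["msgid"]))


def can_access(start: tuple, target: tuple) -> bool:
    """Breadth-first search along HAS_ACCESS_TO edges."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(can_access(("physical_user", "Alice"), ("msg", 5)))  # True
print(can_access(("physical_user", "Alice"), ("msg", 6)))  # False
```

In the cryptographic version each edge corresponds to one wrapped key, so a path in this graph is exactly a chain of keys a user can unlock.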
Security
Protects data readable only by users who are not logged in during an attack
Leaking the data of logged-in users seems unavoidable because applications may perform arbitrary computations on it
Example: protection holds even when the adversary changes annotations recorded at the proxy
Implementation
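[Figure: the application sends a query to the CryptDB proxy through a SQL interface; the proxy sends an encrypted query to the server, an unmodified DBMS extended with CryptDB PK tables and CryptDB UDFs (user-defined functions); encrypted results return to the proxy, which passes results to the application.]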
No change to the DBMS
Portable: from Postgres to MySQL with 86 lines
One-key: no change to applications
Multi-user keys: annotations and login/logout
Evaluation
Multi-key CryptDB: phpBB, HotCRP, MIT grad admissions
Encrypted sensitive fields
Supports all queries on sensitive fields
Annotations can express read access control
One-key CryptDB: TPC-C
Encrypted all fields
Supports all queries on all data
Application changes
400,000 lines of code
Confidentiality in the DB
All the most sensitive fields remained at RND
Fields at OPE were either semi-sensitive or not sensitive
Importance of adjustable query-based encryption to confidentiality
Low overhead
TPC-C: throughput loss of 27%
phpBB: throughput loss of 13%
Encrypted DBMS is practical
Related work
Theoretical approaches ([Gentry 10], [Gennaro et al. 10])
Inefficient
Search on encrypted data (e.g., [Song et al. 00])
Restricted set of queries, inefficient
Systems proposals (e.g., [Hacigumus et al. 02])
Lower degree of security, rewrite the DBMS, client-side processing
Software checks (e.g., PQL, UrFlow, Resin)
No protection against adversaries with complete access to servers
Conclusions
CryptDB:
1. The first practical DBMS for running most standard queries on encrypted data
   Secures the DB server against attacks to any part
   One-key solution is standalone
2. Protects data of logged out users even when all servers are compromised
3. Modest overhead and minimal app. changes
Thanks!