Let’s explore a new AI security risk: model inversion attacks.
What is it?
It’s an attack in which an adversary tries to infer personal information about a data subject by exploiting the outputs of a machine learning model. A model’s output is the answer the computer gives after it has learned something: you ask it a question, and the output is its answer.
For example, if we teach a computer to recognise pictures of cats, the output might be whether or not a given picture contains a cat.
How does a model inversion attack work?
In this attack, the attacker first trains a separate machine learning model, known as an inversion model, on the outputs of the target model (your model). The inversion model’s task is to predict the input data, that is, the data that was fed into the target model.
By analysing the inversion model’s predictions, the attacker can learn information about the data subject you didn’t intend for the target model to reveal.
An example
Suppose you train a machine learning model to predict whether a person has heart disease based on their medical history. Then, an attacker who doesn’t have access to the person’s medical history could use a model inversion attack to infer it. How? By using the machine learning model’s output (the heart-disease prediction) as the input to an inversion model.
The attacker then trains the inversion model to predict the person’s medical history. Ultimately, the attacker can infer personal information about the person’s health.
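The heart-disease example above can be sketched in a few lines. Everything here is hypothetical: the target model’s weights, the probe records, and the use of plain least squares as the inversion model are illustrative assumptions, not a real attack recipe. A real attacker would use a far richer inversion model and many more queries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target model: maps a 5-feature medical record to a
# heart-disease risk score. The weights are made up for illustration.
W_target = rng.normal(size=(5, 1))

def target_model(x):
    return 1.0 / (1.0 + np.exp(-x @ W_target))  # risk score in (0, 1)

# The attacker queries the target model with probe records...
X_probe = rng.normal(size=(200, 5))
y_out = target_model(X_probe)

# ...then fits an inversion model mapping output -> input. Least
# squares keeps the sketch short; the mapping is underdetermined,
# so this recovers only a crude estimate of the input features.
W_inv, *_ = np.linalg.lstsq(y_out, X_probe, rcond=None)

# Given only a victim's risk score, estimate their medical features.
victim = rng.normal(size=(1, 5))
reconstruction = target_model(victim) @ W_inv
```

The point of the sketch is the direction of the arrows: the attacker never sees the victim’s medical history, only the target model’s output, and trains a second model to run that output backwards.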
How to prevent these attacks
These attacks can be prevented through various security techniques such as differential privacy, federated learning, and secure multi-party computation:
- Differential privacy. It’s a technique that involves adding noise to data to protect individual privacy. This technique ensures that the output of a computation does not reveal any personal information about any individual’s data points. By adding noise to the data, differential privacy provides a statistical guarantee that the privacy of data subjects is protected.
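To make the noise-adding idea concrete, here is a minimal sketch of the Laplace mechanism applied to a count query. The patient records and the `laplace_count` helper are illustrative, not from any real library:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_count(data, predicate, epsilon):
    """Release a count under the Laplace mechanism.

    A count query has sensitivity 1 (adding or removing one person
    changes it by at most 1), so Laplace noise with scale 1/epsilon
    gives epsilon-differential privacy for the released count.
    """
    true_count = sum(1 for row in data if predicate(row))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical patient records: (age, has_heart_disease)
records = [(54, True), (61, False), (47, True), (70, True)]
noisy = laplace_count(records, lambda r: r[1], epsilon=0.5)
```

A smaller epsilon means more noise and stronger privacy; the released count stays useful in aggregate while no single record can be confidently inferred from it.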
- Federated learning. This security technique enables multiple devices to collaboratively train a machine learning model without sharing their data with a central server. In federated learning, the devices send encrypted updates of their local models to a central server, which aggregates the updates and sends back an updated global model. This approach ensures that the data remains on the devices and is not accessible by the central server, thereby protecting the privacy of the data.
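The train-locally-then-aggregate loop can be sketched as follows. This is a toy version of federated averaging with a linear model and made-up device data; real deployments add encryption of the updates, client sampling, and many more rounds:

```python
import numpy as np

rng = np.random.default_rng(1)

def local_update(w, X, y, lr=0.1):
    # One gradient step on the device's own data (MSE loss).
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

# Three devices, each holding a private dataset that never leaves it.
devices = [(rng.normal(size=(20, 3)), rng.normal(size=20))
           for _ in range(3)]

w_global = np.zeros(3)
for _ in range(10):
    # Devices train locally and send back only model parameters...
    local_weights = [local_update(w_global, X, y) for X, y in devices]
    # ...which the server averages into a new global model.
    w_global = np.mean(local_weights, axis=0)
```

Only the weight vectors cross the network; the raw `(X, y)` data stays on each device, which is the privacy property the bullet above describes.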
- Secure multi-party computation. It’s a technique that enables multiple parties to jointly compute a function on their private data without revealing their data to each other. In secure multi-party computation, each party encrypts their data and shares it with the other parties. The parties then perform computations on the encrypted data to obtain the desired result without revealing personal information.
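One classic building block for this is additive secret sharing, sketched below. The three hospitals and their case counts are hypothetical; each party splits its value into random shares so that only the joint sum is recoverable:

```python
import random

# Field modulus for the shares (a Mersenne prime).
P = 2**61 - 1

def share(secret, n_parties):
    """Split a secret into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Three hospitals jointly compute their total case count without
# any hospital learning another's individual count.
secrets = [120, 75, 200]
all_shares = [share(s, 3) for s in secrets]

# Each party receives one share of every secret and sums them locally.
partial_sums = [sum(col) % P for col in zip(*all_shares)]

# Combining the partial sums reveals only the total.
total = sum(partial_sums) % P  # 395
```

Each individual share is a uniformly random number, so no single party learns anything about another’s input; only when the shares are combined does the agreed-upon result (the total) emerge.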
Actions you can take next
- Manage the data protection risks of your AI projects by asking us to join our data protection programme.
- Understand how AI systems impact your data protection compliance efforts by asking us to draft a legal opinion.
- Explore how your organisation could comply by consulting with us regarding artificial intelligence and data protection laws.
- Take steps to disclose your use of AI systems by engaging us to help you draft a new or updated privacy policy.