
COMPANION GUIDE ON SECURING AI SYSTEMS
OCTOBER 2024
This document is meant as a community-driven resource with
contribution from the AI and cybersecurity practitioner communities. It
puts together available and practical mitigation measures and
practices. This document is intended for informational purposes only
and is not mandatory, prescriptive nor exhaustive.

System owners should refer to this Companion Guide as a resource, alongside other available resources, in observing the Cyber Security Agency of Singapore's (CSA) Guidelines on Securing AI Systems. This Companion Guide is a living document that will be continually updated to address material developments in this space.

DEVELOPED IN CONSULTATION WITH


This document is published by the CSA, developed with partners
across the AI and Cyber communities, including:

Accenture
Artificial Intelligence Technical Committee, Information Technology Standards Committee (AITC, ITSC)
Association of Information Security Professionals (AiSP)’s Artificial Intelligence Special Interest Group (AI SIG)
Alibaba Cloud (Singapore) Pte Ltd
Amazon Web Services Singapore
Amaris.AI
BSA | The Software Alliance
Ensign InfoSecurity Pte Ltd
F5
Google Asia Pacific Pte Ltd
Huawei International Pte Ltd
Information Technology Industry Council (ITI)
Kaspersky Lab Singapore Pte Ltd
KPMG in Singapore
Microsoft Singapore

Pricewaterhouse Coopers Risk Services Pte Ltd


Rajah & Tann Cybersecurity Pte. Ltd.
Rajah & Tann Technologies Pte. Ltd.
Resaro.AI
US-ASEAN Business Council
AI & Cyber practitioners across the Singapore Government

DISCLAIMER
These organisations provided views and suggestions on the security controls, descriptions of the
security control(s), and technical implementations included in this Companion Guide. CSA and its
partners shall not be liable for any inaccuracies, errors and/or omissions contained herein nor for
any losses or damages of any kind (including any loss of profits, business, goodwill, or reputation,
and/or any special, incidental, or consequential damages) in connection with any use of this
Companion Guide. Organisations are advised to consider how to apply the controls within this Guide to their specific circumstances, alongside any additional measures relevant to their needs.

VERSION HISTORY

VERSION DATE RELEASED REMARKS

0.1 29 July 2024 Draft release of Companion Guide

1.0 15 Oct 2024 First release


TABLE OF
CONTENTS
1. INTRODUCTION ...................................................................................................................................... 8
1.1. PURPOSE AND SCOPE .................................................................................................................. 9
2. USING THE COMPANION GUIDE ........................................................................................................... 10
2.1. START WITH A RISK ASSESSMENT ................................................................................................ 11
2.2. IDENTIFY THE RELEVANT MEASURES/CONTROLS ........................................................................ 12
2.2.1. PLANNING AND DESIGN ...................................................................................................... 13
2.2.2. DEVELOPMENT .................................................................................................................... 16
2.2.3. DEPLOYMENT ...................................................................................................................... 31
2.2.4. OPERATIONS AND MAINTENANCE ....................................................................................... 39
2.2.5. END OF LIFE ......................................................................................................................... 42
3. USE CASE EXAMPLES ............................................................................................................................ 44
3.1. DETAILED WALKTHROUGH EXAMPLE .......................................................................................... 44
3.1.1. RISK ASSESSMENT EXAMPLE ................................................................................................ 45
3.1.2. WALKTHROUGH OF TABULATED MEASURES/CONTROLS ..................................................... 46
3.2. STREAMLINED IMPLEMENTATION EXAMPLE ................................................................................ 56
3.2.1. RISK ASSESSMENT EXAMPLE – EXTRACT ON PATCH ATTACK ................................................ 57
3.2.2. RELEVANT TREATMENT CONTROLS FROM COMPANION GUIDE ........................................... 58
GLOSSARY ..................................................................................................................................................... 59
ANNEX A ........................................................................................................................................................ 63
LIST OF AI TESTING TOOLS ........................................................................................................................ 66
OFFENSIVE AI TESTING TOOLS ......................................................................................................... 67
DEFENSIVE AI TESTING TOOLS ......................................................................................................... 70


AI GOVERNANCE TESTING TOOLS .................................................................................................... 71
ANNEX B ........................................................................................................................................................ 74
REFERENCES ................................................................................................................................................. 80

QUICK
REFERENCE TABLE
Stakeholders in specific roles may use the following table to quickly reference relevant controls in section "2.2 IDENTIFY THE RELEVANT MEASURES/CONTROLS".

The roles defined below are included to guide understanding of this document and are not
intended to be authoritative.

Decision Makers:
Responsible for overseeing the strategic and operational aspects of AI implementation for
the AI system. They are responsible for setting the vision and goals for AI initiatives,
defining product requirements, allocating resources, ensuring compliance, and
evaluating risks and benefits.
Roles Included: Product Manager, Project Manager

AI Practitioners:
Responsible for the practical application (i.e. designing, developing, and implementing AI
models and solutions) across the life cycle. This includes collecting or procuring and
analysing data that goes into systems, building the AI system architecture and
infrastructure, building and optimising the AI system to deliver the required functions, as
well as conducting rigorous testing and validation of AI models to ensure their accuracy,
reliability, and performance. In cases where the AI system utilizes a third-party AI system,
AI Practitioners include the third-party provider responsible for these activities, e.g. as
contracted through a Service Level Agreement (SLA). AI practitioners would be in charge
of implementing the required controls across the entire system.
Roles Included: AI/ML Developer, AI/ML Engineer, Data Scientist

Cybersecurity Practitioners:
Responsible for ensuring the security and integrity of AI systems. This includes implementing security measures to protect AI systems in collaboration with AI Practitioners, monitoring for potential threats, and ensuring compliance with cybersecurity regulations.
Roles Included: IT Security Practitioner, Cybersecurity Expert

The following sections may be relevant to Decision Makers:
• 1.1 Team competency on threats and risks
• 1.2 Conduct security risk assessment
• 2.1 Secure the supply chain
• 3.1 Secure the deployment infrastructure and environment
• 3.2 Have well-developed incident management procedures
• 3.3 Release AI responsibly
• 4.4 Vulnerability disclosure process
• 5.1 Proper data and model disposal

The following sections may be relevant to AI Practitioners:
• 1.1 Team competency on threats and risks
• 1.2 Conduct security risk assessment
• 2.1 Secure the supply chain
• 2.2 Model development
• 2.3 Identify, track and protect assets
• 2.4 Secure the AI development environment
• 3.1 Secure the deployment infrastructure and environment
• 3.2 Have well-developed incident management procedures
• 3.3 Release AI responsibly
• 4.1 Monitor system inputs
• 4.2 Monitor system outputs and behaviour
• 4.3 Have a secure-by-design approach to updates and continuous learning
• 4.4 Vulnerability disclosure process
• 5.1 Proper data and model disposal

The following sections may be relevant to Cybersecurity Practitioners:
• 1.1 Team competency on threats and risks
• 1.2 Conduct security risk assessment
• 2.1 Secure the supply chain
• 2.3 Identify, track and protect assets
• 2.4 Secure the AI development environment
• 3.1 Secure the deployment infrastructure and environment
• 3.2 Have well-developed incident management procedures
• 3.3 Release AI responsibly
• 4.1 Monitor system inputs
• 4.2 Monitor system outputs and behaviour
• 4.4 Vulnerability disclosure process
• 5.1 Proper data and model disposal

Table 1: User Quick Reference Table

1. INTRODUCTION
Artificial Intelligence (AI) offers benefits for the economy, society, and national security. It has the potential to drive efficiency and innovation in almost every sector – from commerce and healthcare to transportation and cybersecurity.

To reap the benefits, users must have confidence that the AI will behave as designed and that outcomes are produced in a safe, secure, and responsible manner. However, in addition to safety risks,
AI systems can be vulnerable to adversarial attacks, where malicious actors intentionally
manipulate or deceive the AI system. The adoption of AI can introduce or exacerbate
existing cybersecurity risks to enterprise systems. These can lead to risks such as
data leakage or data breaches, or result in harmful, unfair, or otherwise undesired
model outcomes. As such, the Cyber Security Agency of Singapore (CSA) has released
the Guidelines on Securing AI Systems to advise system owners on securing their
adoption of AI.

Nonetheless, AI security is a developing field of study, and understanding of the security risks associated with AI continues to evolve internationally. As such, government agencies, our industry partners, and AI and cybersecurity practitioners have put together this Companion Guide on Securing AI Systems. The Companion Guide is a community-driven resource. It puts together available and practical mitigation measures and practices, drawing from industry and academia, as well as key resources such as the MITRE ATLAS database and the OWASP Top 10 for Machine Learning and for Generative AI. System owners can refer to this Companion Guide as a resource, alongside other available resources, in observing the Guidelines. This document is intended for informational purposes only and is not mandatory, prescriptive nor exhaustive. It should not be construed as comprehensive guidance or definitive recommendations.

This Companion Guide is a living document that will be continually updated to address
material developments in this space.

1.1. PURPOSE AND SCOPE
Purpose
This Companion Guide curates practical treatment measures and controls that system
owners of AI systems may consider to secure their adoption of AI systems. These
measures/controls are voluntary, and not all the treatment measures/controls
listed in this Companion Guide will be directly applicable to all organisations or
environments. Organisations may also be at different stages of development and
release (e.g. POC, pilot, beta release). Organisations should consider relevance to their
use cases/applications.

The Companion Guide is also meant as a resource to support system owners in addressing CSA's Guidelines on Securing AI Systems.

Scope
The controls within the Companion Guide primarily address the cybersecurity risks to AI systems. They do not address AI safety, or other common attendant considerations for AI such as fairness, transparency or inclusion, or cybersecurity risks introduced by AI systems, although some of the recommended controls may overlap. The Guide also does not cover the misuse of AI in cyberattacks (AI-enabled malware), mis/disinformation, and scams (deepfakes).

2. USING THE
COMPANION GUIDE
The Companion Guide puts together potential treatment measures/
controls that can support secure adoption of AI. However, not all of
these controls might apply to your organisation.

Our goal is to put together a comprehensive set of treatment measures that system
owners can consider for their respective use cases across the AI system lifecycle. These
span the categories of People, Process and Technology.

There are two categories of measures/controls: (1) those based on classical cybersecurity practices, which continue to be relevant to AI systems; and (2) others unique to AI systems. Measures/controls marked with an asterisk (*) next to their number are unique to AI systems.

Each measure/control is designed to be used independently, to offer flexibility in customising which measures to evaluate and what mitigations to adopt, based on the specific needs of your organisation.

2.1. START WITH A RISK
ASSESSMENT
As in CSA’s Guidelines for Securing AI Systems, system owners should consider starting
with a risk assessment. This will enable organisations to identify potential risks, priorities,
and subsequently, the appropriate risk management strategies (including what measures
and controls are appropriate).

You can consider the following four steps to tailor a systematic defence plan that best
addresses your organisation’s highest priority risks – protecting the things you care about
the most.

STEP 1
Conduct risk assessment, focusing on security risks to AI systems

Conduct a risk assessment, focusing on security risks related to AI systems, either based
on best practices or your organisation’s existing Enterprise Risk Assessment/Management
Framework.
Risk assessment can be done with reference to CSA published guides, if applicable:
▪ Guide To Cyber Threat Modelling
▪ Guide To Conducting Cybersecurity Risk Assessment for Critical Information
Infrastructure

STEP 2
Prioritise areas to address based on risk/impact/resources

Prioritise which risks to address, based on risk level, impact, and available resources.

STEP 3
Identify and implement the relevant actions to secure the AI system

Identify relevant actions and control measures to secure the AI system, such as by referencing those outlined in this Companion Guide on Securing AI Systems, and implement these across the AI life cycle.

STEP 4
Evaluate residual risks for mitigation or acceptance

Evaluate the residual risk after implementing security measures for the AI system to inform
decisions about accepting or addressing residual risks.

2.2. IDENTIFY THE RELEVANT
MEASURES/CONTROLS
Based on the risk assessment, system owners can identify the relevant
measures/controls from the following tables. Each treatment measure/ control
plays a different role, and should be assessed for relevance and priority in
addressing the security risks specific to your AI system and context (Refer to
section “2.1 START WITH A RISK ASSESSMENT”).

A Yes / No / NA status is included for each measure/control to help users of this document keep track of which measures/controls are applicable, and have (or have not) been implemented.

Related risks and Associated MITRE ATLAS Techniques1 indicated serve as examples and are not
exhaustive. They might differ based on your organisation’s use case.

Example implementations are included for each measure/control as a more tangible elaboration
on how they can be applied. These are also not exhaustive.

Additional references and resources are provided for users of this document to obtain further
details on applying the treatment measure/control if required.

Asterisks (*) indicate measures/controls that are unique to AI systems (those without an
asterisk indicate more classical cyber practices).

1 The MITRE ATLAS Framework offers a structured way to understand cyber threats in relation to AI systems (see Annex A).

2.2.1. PLANNING AND DESIGN

1.1 Raise awareness and competency on security risks
Security is everyone's responsibility. Staff are provided with proper training and guidance.
Suggested treatment measures/controls for consideration (mark each as Yes / No / NA):

1.1.1* Ensure system owners and senior leaders understand threats to secure AI and their mitigations.
Responsible parties: Decision Makers
Related risks: Incidents occurring due to poor cyber hygiene and/or knowledge.
Example implementation: Attend seminars on AI threats, policies, and compliance, and get exposed to case studies to appreciate the potential of AI and its associated risks.

1.1.2* Provide guidance to staff on Security by Design and Security by Default principles, as well as unique AI security risks and failure modes, as part of InfoSec training (e.g. LLM security matters, common AI weaknesses and attacks).
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Example implementation: Internal workshops and eLearning courses can inform employees on AI basics, responsible use, and relevant regulations. Integrate regular security training as part of the company's AI innovation training for a balanced approach. Online resources, e.g. electronic newsletters and YouTube videos, can provide a means to track AI security developments, which are emerging almost daily.

1.1.3 Train developers in secure coding practices and good practices for the AI lifecycle.
Responsible parties: Decision Makers, AI Practitioners
Related risks: Code vulnerabilities that could be exploited.
Example implementation: Keep documentary evidence that team members have relevant security knowledge and training. This can include, where applicable: training records, attendance records, assessments, and certifications. Establish the right cross-functional team to ensure that security, risk, and compliance considerations are included from the start.

Reference or resource: Principles for the Security of Machine Learning (UK NCSC); Secure by Design - Shifting the Balance of Cybersecurity Risk: Principles and Approaches for Secure by Design Software; Failure Modes in Machine Learning (Microsoft); OWASP AI Exchange; Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems.
1.2 Conduct security risk assessments
Apply a holistic process to model threats to the system.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

1.2.1* Understand AI governance and legal requirements, and the impact to the system, users and organisation if an AI component is compromised, behaves unexpectedly, or is subject to an attack that affects AI privacy. Plan for an attack and its mitigation, using the principles of confidentiality, integrity and availability (CIA).
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: No triage, leading to confusion and locked or overloaded resources in the event of an AI security incident; slow incident response, leading to large damage done; slow remediation, leading to prolonged operational outage; slow response, which means that attackers could do more damage and cover their tracks, e.g. using anti-forensics.
Example implementation: Perform a security risk assessment to determine the consequences and impact to the various stakeholders if the AI component does not behave as intended. Understand the AI inventory of systems used and their implications and interactions.

1.2.2* Assess AI-related attacks and implement mitigating steps.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Example implementation: Have or develop a playbook and AI incident handling procedures that will shorten the time to remediate and reduce resources wasted on unnecessary steps. Document the decision-making process of assessing potential AI threats and possible attack surfaces, as well as the steps to mitigate these threats; this can be done through a threat risk assessment. Project risks may extend beyond security, e.g. newer AI models could obsolete the entire use case and business assumptions.

1.2.3 Conduct a risk assessment in accordance with the relevant industry standards/best practices.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Failure to comply with industry standards/best practices may lead to insufficient, inefficient or ineffective mitigations.
Example implementation: Refer to the relevant industry standards and best practices when performing the risk assessment.

Reference or resource: The case studies in this document; Singapore Model Governance Framework for Generative AI; NIST AI Risk Management Framework; ISO 31000: Risk Management; MITRE ATLAS; NCSC Risk Management Guidance; OWASP Threat Modelling; OWASP Machine Learning Security Top Ten; Threats to AI using Microsoft STRIDE; Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems; Model Artificial Intelligence Governance Framework.
2.2.2. DEVELOPMENT

2.1 Secure the Supply Chain
Assess and monitor the security of the supply chain across the system's life cycle.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

2.1.1 Implement a secure coding and development lifecycle.
Responsible parties: Decision Makers, AI Practitioners
Related risks: Introduction of bugs, vulnerabilities or unwanted and malicious active content, such as AI poisoning and model backdoors.
Associated MITRE ATLAS techniques: AML.T0018.000 Backdoor ML Model; AML.T0020.000 Poison Training Data.
Example implementation: Adopt Security by Design. Apply a software development lifecycle (SDLC) process. Use software development tools to check for insecure coding practices. Consider implementing zero trust principles in system design.

2.1.2 Supply chain security: ensure data, models, compilers, software libraries, developer tools and applications come from trusted sources.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Associated MITRE ATLAS techniques: AML.T0010 ML Supply Chain Compromise.
Example implementation: If procuring any AI system or component from a vendor, check/ensure that suppliers adhere to policy and to security standards equivalent to your organisation's. This could be done by establishing a Service Level Agreement (SLA) with the vendor. If that is not feasible, consider using software components only from trusted sources. Verify object integrity, e.g. hashes, before using, opening, or running any files.
Associated MITRE mitigations: AML.M0016 Vulnerability Scanning; AML.M0013 Code Signing; AML.M0007 Sanitize Training Data; AML.M0014 Verify ML Artifacts; AML.M0008 Validate ML Model.

Reference or resource (2.1.1–2.1.2): CSA Critical Information Infrastructure Supply Chain Programme; NCSC Supply Chain Guidance; Supply-chain Levels for Software Artifacts (SLSA); MITRE Supply Chain Security Framework; OWASP Top 10 for LLM Applications; NIST Secure Software Development Framework for Generative AI and for Dual Use Foundation Models Virtual Workshop.
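As a concrete illustration of the integrity check in 2.1.2, the minimal sketch below computes a SHA-256 digest of a downloaded artefact and compares it against the hash published by the supplier. The file path and expected digest are placeholders, not values from this Guide.

```python
import hashlib
from pathlib import Path

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder values: substitute the artefact you downloaded and the
# hash published by the model or dataset supplier.
ARTIFACT = "downloads/model-v1.bin"
PUBLISHED_SHA256 = "expected-hex-digest-from-supplier"

if sha256_of(ARTIFACT) != PUBLISHED_SHA256:
    raise SystemExit(f"Integrity check failed for {ARTIFACT}; do not load this file.")
print(f"{ARTIFACT} matches the published hash.")
```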
2.1.3* Protect the integrity of data that will be used for training the model.
Responsible parties: AI Practitioners
Related risks: Data poisoning attacks; exposure of sensitive and classified data in the AI training data.
Associated MITRE ATLAS techniques: AML.T0020.000 Poison Training Data; AML.T0019 Publish Poisoned Datasets.
Example implementation: Use automated data discovery tools to identify sensitive data across various environments, including databases, data lakes, and cloud storage. Implement secure workflows and data flows to ensure the integrity of the data used. When viable, have humans look at each data input and generate notifications where labels differ. Use statistical and automated methods to check for abnormalities.
Associated MITRE mitigations: AML.M0007 Sanitize Training Data; AML.M0014 Verify ML Artifacts.
Reference or resource: ETSI AI Data Supply Chain Security; DSTL Machine Learning with Limited Data.
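One lightweight way to apply the "statistical and automated methods" suggested in 2.1.3 is to compare the label distribution of a newly received data batch against the trusted training set and flag large shifts for human review. A minimal sketch, assuming labels are simple strings and using an illustrative tolerance:

```python
from collections import Counter

def label_distribution(labels):
    """Return each label's share of the dataset."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def flag_label_drift(trusted_labels, new_labels, tolerance=0.10):
    """Report labels whose share changed by more than `tolerance` (absolute)."""
    base, new = label_distribution(trusted_labels), label_distribution(new_labels)
    drifted = {}
    for label in set(base) | set(new):
        delta = abs(base.get(label, 0.0) - new.get(label, 0.0))
        if delta > tolerance:
            drifted[label] = delta
    return drifted

# Illustrative data only.
trusted = ["spam"] * 50 + ["ham"] * 50
incoming = ["spam"] * 80 + ["ham"] * 20
suspicious = flag_label_drift(trusted, incoming)
if suspicious:
    print("Review this batch before training; drifted labels:", suspicious)
```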

2.1.4* Consider the trade-offs when deciding to use an untrusted third-party model (with or without fine-tuning).
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Model backdoors; remote code execution.
Associated MITRE ATLAS techniques: AML.T0018 Backdoor ML Model; AML.T0043 Craft Adversarial Data; AML.T0050 Command and Scripting Interpreter.
Example implementation: Untrusted third-party models are models obtained from public or private repositories whose publisher's origins cannot be verified. While there are benefits to relying on third-party models, possible risks include less control over, and visibility of, model development. This reduced visibility may allow backdoors to be injected by malicious actors. Consider the trade-offs based on your application's requirements.
Associated MITRE mitigations: AML.M0018 User Training; AML.M0013 Code Signing.

2.1.5* Consider sandboxing untrusted models or serialised weight files where relevant.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Example implementation: Run the model within a virtual machine or an isolated environment away from the production environment.
Associated MITRE mitigations: AML.M0008 Validate ML Model; AML.M0018 User Training; AML.M0013 Code Signing.
2.1.6* Scan models or serialised weight files.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Example implementation: Use scanning tools such as Picklescan or Modelscan on model files obtained from an external source, on a separate platform/system from the one the production system runs on.
Associated MITRE mitigations: AML.M0016 Vulnerability Scanning; AML.M0008 Validate ML Model.
Reference or resource: Pickle Scanning (Hugging Face); Stable Diffusion Pickle Scanner GUI; also see Annex A – Technical Testing and System Validation.
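Tools such as Picklescan work by inspecting the opcodes in a pickle stream for constructs that can import or execute code when the file is loaded. The sketch below shows the underlying idea using only the Python standard library; it is a simplified illustration, not a replacement for a maintained scanner, and the file name is a placeholder.

```python
import pickletools

# Opcodes that cause callables to be imported or invoked on load; their
# presence in an untrusted model file warrants review before loading.
SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def scan_pickle(path: str):
    """Return the suspicious opcodes found in a pickle file."""
    findings = []
    with open(path, "rb") as f:
        for opcode, arg, pos in pickletools.genops(f):
            if opcode.name in SUSPICIOUS_OPCODES:
                findings.append((pos, opcode.name, arg))
    return findings

# Placeholder path to a model file obtained from an external source.
hits = scan_pickle("downloads/untrusted_model.pkl")
if hits:
    print("Do not load on production systems; suspicious opcodes found:")
    for pos, name, arg in hits:
        print(f"  offset {pos}: {name} {arg!r}")
```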
2.1.7 Consider the trade-offs associated with using sensitive data for model training or inference.
Responsible parties: Decision Makers, AI Practitioners
Related risks: Data leaks; compromised privacy.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0057 LLM Data Leakage; AML.T0056 LLM Meta Prompt Extraction; AML.T0040 ML Model Inference API Access; AML.T0047 ML-Enabled Product or Service; AML.T0049 Exploit Public-Facing Application.
Example implementation: Check that uploaded data is non-sensitive or protected before submitting it to an external model, in accordance with enterprise data protection policies/requirements. Organisations may explore various risk mitigation measures to secure their non-public sensitive data, such as anonymisation and privacy-enhancing technologies, before deciding on the use of sensitive data for model training. Pay specific attention to supplier policies on the confidentiality of user data; most notably, ensure that suppliers commit that user inputs and model outputs are not subsequently used for model training. If necessary, consider techniques such as anonymisation before deciding to use sensitive data for training.
Associated MITRE mitigations: AML.M0012 Encrypt Sensitive Information; AML.M0016 Vulnerability Scanning.
Reference or resource: Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems.

2.1.8 Apply appropriate controls for data being sent out of the organisation.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data leaks; compromised privacy.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0057 LLM Data Leakage; AML.T0056 LLM Meta Prompt Extraction; AML.T0040 ML Model Inference API Access; AML.T0047 ML-Enabled Product or Service; AML.T0049 Exploit Public-Facing Application.
Example implementation: Implement automated Data Loss Prevention (DLP), exfiltration countermeasures, alert triggers and, where needed, human intervention, e.g. added confirmation via login and input confirmation.
Associated MITRE mitigations: AML.M0012 Encrypt Sensitive Information; AML.M0004 Restrict Number of ML Model Queries; AML.M0019 Control Access to ML Models and Data in Production.
2.1.9 Consider evaluating dependent software libraries and open-source models and, when possible, run code checking.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Insecure or vulnerable libraries, which can introduce unexpected attack surfaces; model subversion.
Associated MITRE ATLAS techniques: AML.T0016 Obtain Capabilities.
Example implementation: For example, ensure that a library does not perform arbitrary code execution when being imported or used. This can be done by using AI code checking, a vulnerability scanning tool, or checking against a database of vulnerability information.
Associated MITRE mitigations: AML.M0008 Validate ML Model; AML.M0011 Restrict Library Loading; AML.M0004 Restrict Number of ML Model Queries; AML.M0014 Verify ML Artifacts.

2.1.10 Use software and libraries that do not have known vulnerabilities.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Insecure or vulnerable libraries, which can introduce unexpected attack surfaces; model subversion.
Associated MITRE ATLAS techniques: AML.T0016 Obtain Capabilities.
Example implementation: Update to the latest secure patch in a timely manner.
Associated MITRE mitigations: AML.M0008 Validate ML Model; AML.M0014 Verify ML Artifacts.

Reference or resource (2.1.9–2.1.10): CVE List; Open Source Insights; OSS Insight.
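To operationalise 2.1.9/2.1.10 in a Python project, a dependency audit can be wired into the build so that packages with known CVEs fail the pipeline. The sketch below shells out to the pip-audit tool, which is assumed to be installed separately; the requirements file name is a placeholder, and any non-zero exit is treated conservatively as a failure (vulnerabilities found or the audit itself failing).

```python
import subprocess
import sys

def audit_dependencies(requirements: str = "requirements.txt") -> bool:
    """Run pip-audit against a requirements file; return True only on a clean run."""
    result = subprocess.run(
        ["pip-audit", "-r", requirements],
        capture_output=True,
        text=True,
    )
    print(result.stdout)
    if result.returncode != 0:
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0

if __name__ == "__main__":
    if not audit_dependencies():
        sys.exit("Known vulnerabilities (or audit errors) found; update the affected libraries before building.")
```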
2.2 Consider security benefits and trade-offs when selecting the appropriate model to use
Treatment measures/controls for consideration (mark each as Yes / No / NA):

2.2.1* Assess the need to use sensitive data for training the model, or for data directly referenced by the model.
Responsible parties: AI Practitioners
Related risks: Privacy compromise; attackers may be able to extract data used for training, or data held in vector stores, via malicious queries and prompt injections.
Associated MITRE ATLAS techniques: AML.T0057 LLM Data Leakage.
Example implementation: Classify your organisation's data based on sensitivity and/or enterprise data policy. Consider whether PII or sensitive data is needed to generate vector databases that will be referenced by the model, e.g. when using Retrieval Augmented Generation (RAG). Consider the trade-offs associated with using sensitive data for model training. Organisations may wish to explore various risk mitigation measures to secure their non-public sensitive data, such as anonymisation and privacy-enhancing technologies, before they decide whether to use such sensitive data for model training.
Associated MITRE mitigations: AML.M0018 User Training.
Reference or resource: Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems (PDPC); Generative AI Scoping Matrix; OWASP Machine Learning Security Top 10 (2023 edition, draft release v0.3); OWASP Top 10 for Large Language Model Applications.
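Where 2.2.1 points to anonymisation before sensitive records are used for training or placed in a RAG vector store, a simple starting point is to replace direct identifiers with keyed pseudonyms so the raw values never enter the model pipeline. A minimal sketch with an assumed record layout; the key handling and field list are placeholders.

```python
import hashlib
import hmac

# In practice the key would come from a secrets manager, not source code.
PSEUDONYM_KEY = b"replace-with-a-managed-secret"
DIRECT_IDENTIFIERS = {"name", "email", "phone"}  # assumed field names

def pseudonymise(record: dict) -> dict:
    """Replace direct identifiers with stable keyed pseudonyms before indexing."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS and value:
            token = hmac.new(PSEUDONYM_KEY, str(value).encode(), hashlib.sha256).hexdigest()[:12]
            cleaned[field] = f"{field}_{token}"
        else:
            cleaned[field] = value
    return cleaned

record = {"name": "Tan Ah Kow", "email": "ahkow@example.com", "query": "billing dispute"}
print(pseudonymise(record))  # identifiers are replaced before embedding/indexing
```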
2.2.2* Consider model hardening, if appropriate.
Responsible parties: AI Practitioners
Related risks: Input-based attacks; prompt injection; adversarial attacks; model overfitting; privacy compromise.
Associated MITRE ATLAS techniques: AML.T0043 Craft Adversarial Data; AML.T0015 Evade ML Model; AML.T0024 Exfiltration via ML Inference API; AML.T0051 LLM Prompt Injection; AML.T0057 LLM Data Leakage; AML.T0054 LLM Jailbreak.
Example implementation: Apply data augmentation and adversarial training to reduce the effect of adversarial attacks. Adversarial training: inject adversarial text or image transformations (e.g. random flips, crops, rotation); this might impact the effectiveness of the model. For LLMs, prompt engineering best practices, such as the use of guardrails and wrapping instructions in a single pair of salted sequence tags, can further ground the model. Note that overfitting can increase the chance of adversarial attacks through model inversion.
Associated MITRE mitigations: AML.M0003 Model Hardening; AML.M0006 Use Ensemble Methods; AML.M0010 Input Restoration; AML.M0015 Adversarial Input Detection; AML.M0004 Restrict Number of ML Model Queries.
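The adversarial training idea in 2.2.2 can be sketched with a basic FGSM loop in PyTorch: perturb each batch in the direction of the loss gradient and train on a mix of clean and perturbed examples. This is a minimal illustration with a toy model and random data, not a hardening recipe; epsilon, the mixing ratio and the attack itself would be chosen per use case.

```python
import torch
import torch.nn as nn

def fgsm_batch(model, x, y, epsilon=0.03):
    """Craft FGSM-perturbed inputs for one batch (inputs assumed scaled to [0, 1])."""
    x_adv = x.clone().detach().requires_grad_(True)
    nn.functional.cross_entropy(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimiser, x, y, epsilon=0.03):
    """One training step on a 50/50 mix of clean and FGSM-perturbed examples."""
    model.train()
    x_adv = fgsm_batch(model, x, y, epsilon)  # leaves stale gradients on the model...
    optimiser.zero_grad()                     # ...which are cleared here
    loss = 0.5 * nn.functional.cross_entropy(model(x), y) + \
           0.5 * nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    optimiser.step()
    return loss.item()

# Illustrative usage with a toy classifier and random stand-in data.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimiser = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.rand(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print("loss:", adversarial_training_step(model, optimiser, x, y))
```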
2.2.3* Consider implementing techniques to strengthen/harden the system, apart from strengthening the model itself.
Responsible parties: AI Practitioners
Related risks: Adversarial attacks on the model; infrastructure attacks; attacker reconnaissance activities.
Associated MITRE ATLAS techniques: AML.T0015 Evade ML Model; AML.T0029 Denial of ML Service; AML.TA0002 Reconnaissance (ATLAS tactic).
Example implementation (supporting countermeasures):
• Cyber threat intelligence to analyse and predict attacks.
• Involve beta users (better red teaming) to test, exploiting the wisdom of the crowd.
• Anti-reconnaissance measures via hiding, disinformation and deception (honeypots).
• High-quality datasets to improve model performance.
• Data security controls for data collection, data storage, data processing and data use, as well as code and model security.
• For LLMs, implement guardrails or input validation.
• Implement endpoint security.
• Consider implementing zero trust principles for the system.
Associated MITRE mitigations: AML.M0003 Model Hardening; AML.M0006 Use Ensemble Methods; AML.M0010 Input Restoration; AML.M0015 Adversarial Input Detection; AML.M0004 Restrict Number of ML Model Queries; AML.M0019 Control Access to ML Models and Data in Production.
2.3 Identify, track and protect AI-related assets
Understand the value of AI-related assets, including models, data, prompts, logs, and assessments. Have processes to track, authenticate, version control, and secure assets.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

2.3.1 Establish a data lineage and software license management process. This includes documenting the data, code, test cases and model, including any changes made and by whom.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Loss of data integrity; unauthorised changes to data, model or system; insider threats; ransomware attacks; loss of intellectual property.
Associated MITRE ATLAS techniques: AML.T0018.000 Backdoor ML Model; AML.T0020.000 Poison Training Data; AML.T0011 User Execution.
Example implementation: Model cards, data cards, and Software Bills of Materials (SBOMs) may be used.
Associated MITRE mitigations: AML.M0016 Vulnerability Scanning; AML.M0013 Code Signing; AML.M0007 Sanitize Training Data; AML.M0014 Verify ML Artifacts; AML.M0008 Validate ML Model; AML.M0005 Control Access to ML Models and Data at Rest; AML.M0018 User Training.
Reference or resource: Cybersecurity Code of Practice for Critical Information Infrastructure (CSA); ISO 27001: Information security, cybersecurity and privacy protection.

2.3.2 Secure data at rest and data in transit.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data loss and leaks; loss of data integrity; ransomware encryption.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0054 LLM Jailbreak.
Example implementation: Store sensitive data (e.g. model weights and Python code) encrypted, and transfer it with proper encryption protocols and secure key management. Consider saving model weights in secure formats such as safetensors.
Associated MITRE mitigations: AML.M0012 Encrypt Sensitive Information; AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.
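On the suggestion in 2.3.2 to prefer secure serialisation formats, the sketch below saves and reloads PyTorch weights with the safetensors library, which stores raw tensor data and does not execute code on load (unlike pickle-based checkpoints). File names are placeholders; encryption at rest and key management would sit around this.

```python
import torch
import torch.nn as nn
from safetensors.torch import save_file, load_file

model = nn.Linear(16, 4)  # stand-in for a real model

# Save the state dict as safetensors: plain tensor data plus metadata,
# with no arbitrary-code-execution path when the file is later loaded.
save_file(model.state_dict(), "model.safetensors")

# Loading returns a plain dict of tensors, which is then applied to a
# freshly constructed model of the same architecture.
restored = nn.Linear(16, 4)
restored.load_state_dict(load_file("model.safetensors"))
print("weights restored:", torch.allclose(model.weight, restored.weight))
```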
2.3.3 Have regular backups in the event of compromise.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Example implementation: Identify essential data to back up more frequently. Implement a regular backup schedule. Have redundancy to ensure availability.
Associated MITRE mitigations: AML.M0014 Verify ML Artifacts; AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.

2.3.4* Implement controls to limit what the AI can access and generate, based on the sensitivity of the data.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data leaks; privacy attacks.
Associated MITRE ATLAS techniques: AML.T0036 Data from Information Repositories; AML.T0037 Data from Local System; AML.T0057 LLM Data Leakage.
Example implementation: For sensitive data such as PII, explore various risk mitigation measures to secure non-public sensitive data, such as data anonymisation and privacy-enhancing techniques, before it is input into the AI. Have filters at the output to prevent sensitive information from being leaked.
Associated MITRE mitigations: AML.M0012 Encrypt Sensitive Information; AML.M0019 Control Access to ML Models and Data in Production; AML.M0014 Verify ML Artifacts; AML.M0005 Control Access to ML Models and Data at Rest.
Reference or resource: Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems.

2.3.5 For very private data, consider whether privacy-enhancing technologies may be used.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data leaks.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API.
Example implementation: Examples include using a Trusted Execution Environment, differential privacy or homomorphic encryption.
Associated MITRE mitigations: AML.M0012 Encrypt Sensitive Information.
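For the output filter suggested in 2.3.4, a first line of defence is to redact obvious sensitive patterns from model responses before they are returned to the caller. The sketch below is deliberately simple; the patterns (including an NRIC-like token) are assumptions to adapt to local requirements, and regexes alone will not catch every leak.

```python
import re

# Illustrative patterns only; tune to the data classes your policy covers.
REDACTION_RULES = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[REDACTED EMAIL]"),
    (re.compile(r"\b[STFG]\d{7}[A-Z]\b"), "[REDACTED ID]"),     # NRIC/FIN-like token
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[REDACTED CARD]"),
]

def filter_model_output(text: str) -> str:
    """Redact sensitive-looking substrings from a model response before returning it."""
    for pattern, replacement in REDACTION_RULES:
        text = pattern.sub(replacement, text)
    return text

raw = "Contact the customer at jane.tan@example.com, NRIC S1234567D."
print(filter_model_output(raw))
# -> "Contact the customer at [REDACTED EMAIL], NRIC [REDACTED ID]."
```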
2.4 Secure the AI development environment
Apply good infrastructure security principles.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

2.4.1 Implement appropriate access controls to APIs, models, data, logs, and the environments that they are in.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Unauthorised access to systems, data and models; data breaches; model/system compromise; loss of intellectual property.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0036 Data from Information Repositories; AML.T0037 Data from Local System; AML.T0012 Valid Accounts; AML.T0057 LLM Data Leakage; AML.T0053 LLM Plugin Compromise; AML.T0054 LLM Jailbreak; AML.T0044 Full ML Model Access; AML.T0055 Unsecured Credentials; AML.T0013 Discover ML Model Ontology; AML.T0014 Discover ML Model Family; AML.T0007 Discover ML Artifacts; AML.T0035 ML Artifact Collection.
Example implementation: Have secure authentication processes. Apply rule- and role-based access controls to the development environment, based on the principle of least privilege. Conduct periodic reviews for role conflicts or violations of segregation of duties, and retain documentation, including remediation actions. Access should be promptly revoked for terminated users or when an employee no longer requires it.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production; AML.M0012 Encrypt Sensitive Information; AML.M0014 Verify ML Artifacts.
Reference or resource: Cybersecurity Code of Practice for Critical Information Infrastructure (CSA); ISO 27001: Information security, cybersecurity and privacy protection; Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems.
2.4.2 Implement access logging and monitoring.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Anomalies and suspicious activities going undetected; failed compliance and audits; poor transparency and accountability; insider threats.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0040 ML Model Inference API Access; AML.T0020.000 Poison Training Data.
Example implementation: Log access with timestamps. Track changes to the data and model, as well as configuration changes. Protect logs from being attacked (deleted or tampered with).
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.
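A minimal sketch of the timestamped access logging described in 2.4.2, using the standard logging module: every read of a model artefact is recorded with who accessed it and when, to a log that would in practice be shipped to a protected, tamper-evident store. The user lookup and file path are illustrative assumptions.

```python
import getpass
import logging
from datetime import datetime, timezone

# In production, handlers would forward to a centralised, tamper-evident log store.
logging.basicConfig(
    filename="model_access.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
access_log = logging.getLogger("model_access")

def load_model_artifact(path: str) -> bytes:
    """Read a model artefact, recording who accessed it and when."""
    user = getpass.getuser()
    access_log.info("user=%s action=read artifact=%s utc=%s",
                    user, path, datetime.now(timezone.utc).isoformat())
    with open(path, "rb") as f:
        return f.read()

# Example call; the artefact path is a placeholder.
# load_model_artifact("models/classifier-v3.safetensors")
```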
2.4.3 Segregate production and development environments.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data integrity and confidentiality being compromised; risk of disruptions or conflicts between different functions/models; insider attacks. Segregation also limits the impact of potential attacks.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means.
Example implementation: Consider keeping different project environments separate from each other, e.g. development separated from production. If you are using cloud services, consider compartmentalising your projects using VPCs, VMs, VPNs, enclaves, and containers.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.

2.4.4 Ensure configurations are secure by default.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Unauthorised access and data breaches; insider threats.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means.
Example implementation: The default option should be secure against common threats, e.g. implicitly deny access to sensitive data.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.
2.2.3. DEPLOYMENT

3.1 Secure the deployment infrastructure and environment of AI systems
Apply good infrastructure security principles.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

3.1.1 Ensure contingency plans are in place to mitigate disruption or failure of AI services.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Extended downtime affecting availability.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service.
Example implementation: Have a manual or secondary system as a fail-over/fail-safe if the AI service becomes unavailable.
Reference or resource (3.1): Cybersecurity Code of Practice for Critical Information Infrastructure (CSA); ISO 27001: Information security, cybersecurity and privacy protection; Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems; NSA Guidance for Strengthening AI System Security.

3.1.2 Implement appropriate access controls to APIs, models, data, logs, configuration files and the environments that they are in.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Unauthorised access to sensitive AI models and data; data breaches; loss of model integrity; loss of intellectual property.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0040 ML Model Inference API Access; AML.T0020.000 Poison Training Data.
Example implementation: Have secure authentication processes. Apply rule- and role-based access controls to the deployment environment, based on the principle of least privilege. Conduct periodic reviews for role conflicts or violations of segregation of duties, and retain documentation, including remediation actions. Access should be removed in a timely manner for terminated users or when an employee no longer requires it.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.
3.1.3 Implement access logging, monitoring and policy management.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Unauthorised access to the deployment infrastructure and environment; undetected anomalies and suspicious activities; non-adherence to compliance and audit requirements; data integrity and accountability issues; insider threats.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0040 ML Model Inference API Access.
Example implementation: Keep a record of access to the model, inputs to the model, and the output behaviour of the model. If necessary, track all AI applications, models and data. Have the ability to discover all AI apps, models, and data across the system, and who they are used by. Define and enforce data security policies across these environments.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.

3.1.4 Implement segregation of environments.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data integrity and confidentiality being compromised; risk of disruptions or conflicts between different functions/models. Segregation also limits the impact of potential attacks.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service; AML.T0025 Exfiltration via Cyber Means; AML.T0031 Erode ML Model Integrity.
Example implementation: Keep different project environments separate from each other, e.g. when working on the cloud, have a separate VPC. Keep the development and operational environments apart.
Associated MITRE mitigations: AML.M0019 Control Access to ML Models and Data in Production.

3.1.5 Ensure configurations are secure by default.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Vulnerability exploitation, unauthorised access and data breaches; insider threats.
Associated MITRE ATLAS techniques: AML.T0024 Exfiltration via ML Inference API; AML.T0025 Exfiltration via Cyber Means; AML.T0031 Erode ML Model Integrity.
Example implementation: The default option should be secure against common threats, e.g. implicitly deny access to sensitive data.
Associated MITRE mitigations: AML.M0019 Control Access to ML Models and Data in Production.

3.1.6 Consider implementing firewalls.
Responsible parties: Cybersecurity Practitioners
Related risks: Unauthorised access to AI systems, models, and data; network-based attacks, such as denial-of-service (DoS) attacks; malware and intrusion attempts; unauthorised access to specific components of the AI systems.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service; AML.T0046 Spamming ML System with Chaff Data.
Example implementation: Consider implementing firewalls if the model is accessible to users online.
Associated MITRE mitigations: AML.M0005 Control Access to ML Models and Data at Rest; AML.M0019 Control Access to ML Models and Data in Production.

3.1.7 Implement any other relevant security controls based on cybersecurity best practice that have not been stated above.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Example implementation: Implement any other relevant security control based on best practice, such as ISO 27001.
3.2 Establish incident management procedures
Ensure proper incident response, escalation, and remediation plans.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

3.2.1 Have plans to address different attack and outage scenarios. Implement measures to assist investigation.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Failed incident response; disruption to business continuity.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service.
Example implementation: Have different incident response plans that address different types of outages and potential attack scenarios, which may be blended with DoS. Implement forensics support and protect against erasure of evidence. Use cyber threat intelligence to support investigation.
Associated MITRE mitigations: AML.M0018 User Training.
Reference or resource (3.2): CSA Incident Response Checklist.

3.2.2 Regularly reassess incident response plans as the system changes.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Failed incident response; disruption to business continuity.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service.
Example implementation: Assess how changes to the system and the AI will affect the attack surfaces.
Associated MITRE mitigations: AML.M0018 User Training.

3.2.3 Have regular backups in the event of compromise.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Data loss; ransomware attacks; operational disruptions; loss of data integrity.
Associated MITRE ATLAS techniques: AML.T0029 Denial of ML Service; AML.T0031 Erode ML Model Integrity.
Example implementation: Store critical data assets in offline backups.
Associated MITRE mitigations: AML.M0014 Verify ML Artifacts.

3.2.4 When an alert has been raised or an investigation has confirmed an incident, report to the relevant stakeholders.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related risks: Regulatory non-compliance; increased cost and damages to the enterprise.
Example implementation: Use threat hunting to determine the full extent of the attack and investigate attribution.
3.3 Release AI systems responsibly
Release models, applications, or systems only after subjecting them to appropriate and effective security checks and evaluation.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

3.3.1* Verify models with hashes/signatures of model files and datasets before deployment or periodically, according to enterprise policy.
Responsible parties: AI Practitioners
Related risks: Model tampering/poisoning; data poisoning; backdoor/Trojan models.
Associated MITRE ATLAS techniques: AML.T0018.000 Backdoor ML Model; AML.T0020.000 Poison Training Data.
Example implementation: Compute and share model and dataset hashes/signatures when creating new models or data, and update the relevant documentation, e.g. model cards.
Associated MITRE mitigations: AML.M0014 Verify ML Artifacts.
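A minimal sketch of how the hashes called for in 3.3.1 could be recorded at release time: digest every artefact in a release directory into a JSON manifest that accompanies the model card, and re-check the manifest before deployment. Paths and the manifest layout are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path

def file_sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def build_manifest(release_dir: str, manifest_path: str = "release_manifest.json") -> dict:
    """Record SHA-256 digests of every artefact (weights, datasets, configs) in a release."""
    manifest = {str(p): file_sha256(p) for p in sorted(Path(release_dir).rglob("*")) if p.is_file()}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest

def verify_manifest(manifest_path: str = "release_manifest.json") -> list:
    """Return the artefacts whose current digest no longer matches the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [p for p, digest in manifest.items()
            if not Path(p).is_file() or file_sha256(Path(p)) != digest]

# Typical flow: build_manifest("releases/v1.0") at release time, then
# verify_manifest() before (or periodically after) deployment.
```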
3.3.2* Benchmark and test the AI models before release.
Responsible parties: AI Practitioners
Related risks: Failure to achieve trust and reliability; adversarial attacks; lack of accountability; poor model robustness.
Associated MITRE ATLAS techniques: AML.T0048 External Harms; AML.T0043 Craft Adversarial Data; AML.T0031 Erode ML Model Integrity.
Example implementation: Ensure models have been validated and have achieved performance targets before deployment. Consider using an adversarial test set to validate model robustness, where possible. Conduct AI red-teaming.
Associated MITRE mitigations: AML.M0008 Validate ML Model; AML.M0014 Verify ML Artifacts.
Reference or resource: Adversarial Robustness Toolbox (IBM); CleverHans (University of Toronto); TextAttack (University of Virginia); PromptBench (Microsoft); Counterfit (Microsoft); AI Verify (Infocomm Media Development Authority, Singapore); Moonshot (Infocomm Media Development Authority, Singapore).
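A very small illustration of the robustness benchmarking in 3.3.2: compare accuracy on clean inputs against the same inputs under a simple perturbation (here, additive Gaussian noise) and gate the release on a minimum perturbed accuracy. The threshold, model and data are toy placeholders; dedicated toolkits such as those listed above go much further.

```python
import torch
import torch.nn as nn

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Toy stand-ins for a trained model and a held-out evaluation set (inputs in [0, 1]).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).eval()
x_eval = torch.rand(256, 1, 28, 28)
y_eval = torch.randint(0, 10, (256,))

# Perturbed copy of the evaluation set as a cheap robustness probe.
x_noisy = (x_eval + 0.1 * torch.randn_like(x_eval)).clamp(0.0, 1.0)

clean_acc = accuracy(model, x_eval, y_eval)
robust_acc = accuracy(model, x_noisy, y_eval)
print(f"clean accuracy={clean_acc:.3f}, perturbed accuracy={robust_acc:.3f}")

ROBUSTNESS_TARGET = 0.5  # illustrative release threshold
if robust_acc < ROBUSTNESS_TARGET:
    print("Robustness target not met; investigate before release.")
```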
3.3.3 Consider the need to conduct security testing on the AI systems.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Security vulnerabilities.
Associated MITRE ATLAS techniques: AML.T0048 External Harms; AML.T0031 Erode ML Model Integrity.
Example implementation: Perform VAPT/security testing on AI systems. Prioritise and focus on the most realistic and practical attacks, based on the risk assessment carried out during the planning phase. The system owner and project teams should follow up on findings from security testing/red teaming by assessing the criticality of the vulnerabilities uncovered, applying additional measures and, if necessary, seeking approval from the relevant entity (e.g. the CISO) for acceptance of residual risks, according to their enterprise risk management/cybersecurity policies. Create a feedback loop to maximise the impact of the findings from security tests.
Associated MITRE mitigations: AML.M0003 Model Hardening; AML.M0006 Use Ensemble Methods; AML.M0016 Vulnerability Scanning.
Reference or resource: OWASP Top 10 for Large Language Model Applications; Web LLM attacks (PortSwigger).
2.2.4. OPERATIONS AND MAINTENANCE

4.1 Monitor AI system inputs
Monitor and log inputs to the system, such as queries, prompts and requests. Proper logging allows for compliance, audit, investigation and remediation.
Treatment measures/controls for consideration (mark each as Yes / No / NA):

4.1.1* Validate/monitor inputs to the model and system for possible attacks and suspicious activity.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related risks: Adversarial attacks; data exfiltration.
Associated MITRE ATLAS techniques: AML.T0043 Craft Adversarial Data; AML.T0025 Exfiltration via Cyber Means.
Example implementation: AI system owners may consider monitoring and validating input prompts, queries or API requests for attempts to access, modify or exfiltrate information deemed confidential by the organisation. Consider the use of classifiers to detect malicious inputs and log them for future review to identify potential vulnerabilities. Note: implementers should consider the current privacy regulations/guidelines when logging inputs.
Associated MITRE mitigations: AML.M0015 Adversarial Input Detection.
Reference or resource (4.1): Introduction to Logging for Security Purposes (NCSC); OpenAI usage policies; Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems (PDPC).
4.1.2 Monitor/Limit the rate of queries. • Denial of Service If possible, prevent users from continuously
(DoS) Attacks querying the model with a high frequency e.g.
Responsible parties: API throttling.
AI Practitioners, Cybersecurity Associated MITRE This mitigates the potential for membership-
Practitioners ATLAS Techniques: inference and extraction attacks.
• AML.T0029 Denial Associated MITRE Mitigations:
of ML Service • AML.M0004 Restrict Number of ML
• AML.T0034 Cost Model Queries
Harvesting
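The minimal sketch below illustrates how controls 4.1.1 and 4.1.2 might be combined in practice: every prompt is logged for audit, a simple pattern check flags suspicious inputs, and a token-bucket limiter throttles query rates. The pattern list, log file name and thresholds are illustrative placeholders; a production system would more likely rely on a trained classifier or guardrail service and on gateway-level throttling.

import re
import time
import logging

logging.basicConfig(filename="ai_input_audit.log", level=logging.INFO)

# Illustrative patterns that may indicate prompt injection or data exfiltration attempts.
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"system prompt",
    r"api[_ ]?key",
]

class TokenBucket:
    """Simple per-client token bucket used to throttle query rates (control 4.1.2)."""
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate, self.capacity = rate_per_sec, burst
        self.tokens, self.last = float(burst), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def screen_input(client_id: str, prompt: str, bucket: TokenBucket) -> bool:
    """Log every prompt (mind privacy rules), enforce the rate limit and flag suspicious inputs."""
    logging.info("client=%s prompt=%r", client_id, prompt)
    if not bucket.allow():
        logging.warning("client=%s rate limit exceeded", client_id)
        return False
    if any(re.search(p, prompt, re.IGNORECASE) for p in SUSPICIOUS_PATTERNS):
        logging.warning("client=%s suspicious prompt flagged for review", client_id)
        return False
    return True

# Usage sketch: bucket = TokenBucket(rate_per_sec=1.0, burst=5); screen_input("user-42", "Hello", bucket)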

4.2 Monitor AI system outputs and behaviour
Monitor for anomalous behaviour that might indicate intrusions or compromise.
4.2.1* Monitor model outputs and model performance.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related Risks: Adversarial Attacks; Operational Impact
Associated MITRE ATLAS Techniques: AML.T0031 Erode ML Model Integrity; AML.T0020.000 Poison Training Data; AML.T0029 Denial of ML Service; AML.T0048 External Harms
Example Implementation: Implement an alert system that monitors for anomalous or unwanted output, e.g. a customer-facing chatbot that is safe for work begins to output profanity instead.
Associated MITRE Mitigations: AML.M0008 Validate ML Model

4.2.2* Ensure adequate human oversight to verify model output, when viable or appropriate.
Responsible parties: AI Practitioners, Cybersecurity Practitioners
Related Risks: False Positives from the model; Misinterpretation of Context; Adverse Impact on Operations
Associated MITRE ATLAS Techniques: AML.T0029 Denial of ML Service; AML.T0048 External Harms
Example Implementation: Manual investigation of unusual or anomalous alert notifications. For critical systems, ensure human oversight to verify decisions recommended by the model.
Associated MITRE Mitigations: AML.M0018 User Training; AML.M0015 Adversarial Input Detection
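As a sketch of control 4.2.1, the following hypothetical monitor flags unwanted chatbot output for human review and escalates when the rate of flagged responses spikes. The blocked-term list, window size and alert threshold are assumptions for illustration; real deployments would typically use a moderation model or guardrail service.

import logging
from collections import deque

logging.basicConfig(level=logging.INFO)

# Illustrative terms the chatbot should never emit; real systems would use a moderation model.
BLOCKED_TERMS = {"badword1", "badword2"}

class OutputMonitor:
    """Flags unwanted model output and alerts when the flag rate spikes (control 4.2.1)."""
    def __init__(self, window: int = 100, alert_ratio: float = 0.05):
        self.recent = deque(maxlen=window)   # rolling record of flagged/clean outputs
        self.alert_ratio = alert_ratio

    def check(self, response: str) -> bool:
        flagged = any(term in response.lower() for term in BLOCKED_TERMS)
        self.recent.append(flagged)
        if flagged:
            logging.warning("Unwanted model output flagged for human review: %r", response)
        if len(self.recent) == self.recent.maxlen and \
                sum(self.recent) / len(self.recent) >= self.alert_ratio:
            logging.critical("Anomalous output rate exceeded threshold; escalate per incident response plan")
        return not flagged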

4.3 Adopt a secure-by-design approach to updates and continuous learning.
Ensure risks associated with model updates have been considered. Changes to the data and model can lead to changes in behaviour.

4.3.1* Treat major updates as new versions and integrate software updates with model updates and renewal.
Responsible parties: AI Practitioners
Related Risks: Model Tampering/Poisoning; Backdoor/Trojan model
Associated MITRE ATLAS Techniques: AML.T0020.000 Poison Training Data; AML.T0018.000 Backdoor ML Model; AML.T0031 Erode ML Model Integrity; AML.T0010 ML Supply Chain Compromise
Example Implementation: New models are to be validated, benchmarked and tested before release.
Associated MITRE Mitigations: AML.M0008 Validate ML Model; AML.M0014 Verify ML Artifacts
Reference or Resource: Principles for the Security of Machine Learning (UK NCSC)

4.3.2* Treat new input data used for training as new data.
Responsible parties: AI Practitioners
Related Risks: Data Poisoning; Poison/Backdoor/Trojan model
Associated MITRE ATLAS Techniques: AML.T0020.000 Poison Training Data; AML.T0018.000 Backdoor ML Model; AML.T0010 ML Supply Chain Compromise
Example Implementation: Subject new input data to the same verification and validation as the original training data.
Associated MITRE Mitigations: AML.M0007 Sanitize Training Data
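A minimal sketch of control 4.3.1 is shown below: before a new model version is promoted, it is benchmarked against a held-out 'golden' dataset and rejected if it regresses beyond an agreed tolerance. The predict() interface, baseline accuracy and tolerance are assumptions for illustration.

def validate_new_model_version(candidate_model, golden_inputs, golden_labels,
                               baseline_accuracy, tolerance=0.01):
    """Benchmark a candidate model against the golden dataset before promoting it.

    candidate_model is assumed to expose a predict(x) method returning a label.
    """
    correct = sum(1 for x, y in zip(golden_inputs, golden_labels)
                  if candidate_model.predict(x) == y)
    accuracy = correct / len(golden_labels)
    print(f"Candidate accuracy on golden dataset: {accuracy:.3f} (baseline {baseline_accuracy:.3f})")
    # Promote only if the new version does not regress beyond the agreed tolerance.
    return accuracy >= baseline_accuracy - tolerance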

4.4 Establish a vulnerability disclosure process
Have a feedback process for users to share any findings of concern, which might uncover potential vulnerabilities to the system.
4.4.1 Maintain open lines of communication.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related Risks: Regulatory non-compliance
Example Implementation: Set up channels to allow users to provide feedback on security and usage.
Reference or Resource: SingCERT Vulnerability Disclosure Policy (CSA); UK NCSC Vulnerability Disclosure Toolkit; CVE List; AI CWE List

4.4.2 Share findings with appropriate stakeholders.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related Risks: Regulatory non-compliance
Example Implementation: Share discoveries of vulnerabilities with relevant stakeholders such as the company CISO.
Reference or Resource: ATLAS Case Studies

2.2.5. END OF LIFE

5.1 Ensure proper data and model disposal

5.1.1 Ensure proper and secure disposal/destruction of data and models in accordance with data privacy standards and/or relevant rules and regulations.
Responsible parties: Decision Makers, AI Practitioners, Cybersecurity Practitioners
Related Risks: Regulatory non-compliance; Sensitive data loss
Example Implementation: Examples include crypto-shredding or degaussing.
Reference or Resource: Personal Data Protection Act (PDPA); Advisory Guidelines on use of Personal Data in AI Recommendation and Decision Systems
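The sketch below illustrates the crypto-shredding idea mentioned above, using the third-party Python cryptography package: data is only ever stored encrypted under a data-encryption key, so securely destroying that key (normally held in a KMS/HSM) renders the stored records unrecoverable. The key handling here is simplified for illustration.

from cryptography.fernet import Fernet

# Crypto-shredding sketch: records are persisted only in encrypted form, so
# destroying the data-encryption key renders them unrecoverable at end of life.
key = Fernet.generate_key()          # in practice, generated and held in a KMS/HSM
cipher = Fernet(key)

stored_record = cipher.encrypt(b"customer FAQ conversation ...")   # what is actually persisted

# End of life: securely destroy the key material instead of locating and wiping
# every copy of the data; without the key the ciphertext cannot be decrypted.
key = None
cipher = None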

3. USE CASE EXAMPLES

3.1. DETAILED WALKTHROUGH EXAMPLE
Case Study: Implementing Companion Guide on LLM-based Chatbot

• Company A is currently testing out an LLM to implement as their customer service chatbot, known as SuperResponder.

• The model is an LLM that is downloaded from an open-source model hosting website (Hugging Face) and further developed in-house on a cloud environment.

• The data is sourced from manually curated FAQs from customer service conversations, which will be converted to a vector database to implement Retrieval Augmented Generation (RAG) with the downloaded LLM model.

Supply Chain Attacks

In this example, Company A relies heavily on third party software components to develop SuperResponder.

The integrity and security of AI supply chains are essential for ensuring the reliability and trustworthiness of AI systems. AI vulnerabilities in the supply chain refer to weaknesses or exploitable points within the processes of acquiring, integrating, and deploying AI technologies. These vulnerabilities can stem from malicious or compromised components, including datasets, models, algorithms, and software libraries, which may introduce security risks and threats to AI systems2.

2 https://vulcan.io/blog/understanding-the-hugging-face-backdoor-threat/
Figure 2. LLM Context-Based Chatbot System Architecture

3.1.1. RISK ASSESSMENT EXAMPLE

Company A performed a risk assessment to identify and address potential risks to the confidentiality, integrity and availability of their AI system. If the risks are not mitigated, there is the potential for an attacker to exploit the list of vulnerabilities, causing SuperResponder to be compromised. This could result in widespread customer dissatisfaction and damage to the company's reputation.

The hypothetical risk assessment* is as follows:


Risk Scenario: Prompt injection attack. Crafted input can be executed to instruct the LLM to retrieve private customer information.
Impact: Confidentiality: High. Confidential information such as PII data of customers may be leaked.
Likelihood: Medium. The chatbot interface is public facing; the attack can be performed easily without privileged access and can be repeated continuously.
Proposed Mitigations: Use automated tools to remove PIIs from the datasets used. In addition, use data protection measures and output sanitisation mechanisms.
Risk Level: Initial Risk Level: Medium; Residual Risk Level: Low

Risk Scenario: Supply chain vulnerabilities. Use of a compromised pre-trained LLM can introduce other vulnerabilities such as a model backdoor.
Impact: Integrity: High. The chatbot may be prompted to regularly output the wrong answer or advice to customers.
Likelihood: Medium. It is possible to upload compromised models onto public model hosting platforms; these models are downloaded and used to develop the chatbot.
Proposed Mitigations: Scanning the model. Sandboxing the model. Download models from trusted model developers or sources.
Risk Level: Initial Risk Level: Medium; Residual Risk Level: Low

Risk Scenario: Model Denial of Service. The chatbot is at risk of volumetric and continuous querying, consuming a large amount of resources.
Impact: Availability: Medium. The chatbot service can be overwhelmed by a large volume of requests and become unavailable to other users.
Likelihood: Medium. Volumetric and continuous querying of the chatbot can be performed with some scripting knowledge or automated tools.
Proposed Mitigations: API throttling.
Risk Level: Initial Risk Level: Medium; Residual Risk Level: Low

* The above table is not exhaustive and is meant as an example of a risk assessment done.

3.1.2. WALKTHROUGH OF TABULATED MEASURES/CONTROLS

Following the risk assessment, Company A promptly referenced the CSA Guidelines on Securing AI Systems and the Companion Guide to mitigate the risks. The list of implemented actions is as follows:

3.1.2.1. PLANNING AND DESIGN STAGE

1.1 Raise awareness and competency on security risks

1.1.1 Ensure system owners and senior leaders understand threats to secure AI and their mitigations. ✓
Implementation: System owners have attended seminars on AI security and understood the potential risks associated with AI systems.

1.1.2 Provide guidance to staff on Security by Design and Security by Default principles as well as unique AI security risks and failure modes as part of InfoSec training, e.g. LLM security matters, common AI weaknesses and attacks. ✓
Implementation: Trained staff on AI security and risks, e.g. attack vectors, and countermeasures (practical defence strategies). Developers were sent to attend a 3-day course on AI & Cybersecurity covering adversarial machine learning at a local tertiary institution. They also referred to online courses from Udemy on AI security essentials and AI risk management.

1.1.3 Train developers in secure coding practices and good practices for the AI lifecycle. ✓
Implementation: Developers have attended certified workshops on how to maintain secure coding practices when developing the model.
1.2 Conduct security risk assessments

1.2.1 Understand AI governance and legal requirements, and the impact to the system, users and organisation if an AI component is compromised, has unexpected behaviour, or there is an attack that affects AI privacy. Plan for an attack and its mitigation, using the principles of CIA. ✓
Implementation: Understood the AI Verify framework, the PDPA Guidance for AI and User Data, and the Model Governance Framework for Generative AI (IMDA).

1.2.2 Assess AI-related attacks and implement mitigating steps. ✓
Implementation: Threat modelling: identified and assessed potential attack vectors from adversarial attacks such as prompt injection, membership inference, data poisoning and backdoor attacks using the MITRE ATLAS framework.

1.2.3 Conduct risk assessment in accordance with the relevant industry standards/best practices. ✓
Implementation: Risk assessment was conducted in accordance with company risk management policy.
3.1.2.2. DEVELOPMENT

2.1 Secure the Supply Chain


2.1.1 Implement Secure Coding and Development Lifecycle. ✓
Implementation: Attended secure coding courses and adopted secure coding practices for development of the LLM.

2.1.2 Supply Chain Security: Ensure data, models, compilers, software libraries, developer tools and applications are from trusted sources. ✓
Implementation: For the pre-trained third-party LLM model, applied source verification to ensure data and models obtained are from trusted and reputable sources. Verified the authenticity and integrity of the sources before incorporating them into the system (digital signatures).

2.1.3 Protect the integrity of data that will be used for training the model. ✓
Implementation: Data to support Retrieval Augmented Generation (RAG) is sourced from the company's own customer service conversations and internal FAQ documents.

2.1.4 Consider the trade-offs when deciding to use an untrusted third-party model (with or without fine-tuning). ✓
Implementation: Examples of risks considered: data breaches, data privacy leakage, service disruptions, model backdoor. Compensatory measures such as prompt filters and prompt engineering are used to mitigate adversarial attacks.

2.1.5 Consider sandboxing untrusted models or serialised weight files where relevant. ✓
Implementation: Implemented virtual machines (VMs) to isolate and restrict the execution environment of these components.

2.1.6 Scan models or serialised weight files. ✓
Implementation: Scanned model files with Picklescan.

2.1.7 Consider the trade-offs associated with using sensitive data for model training or inference. ✓
Implementation: Not using external APIs during development.

2.1.8 Apply appropriate controls for data being sent out of the organisation. ✓
Implementation: Not required as the model is hosted locally and is not a SaaS.

2.1.9 Consider evaluation of dependent software libraries, open-source models and, where possible, run code checking. ✓
Implementation: Used a vulnerability scanner to ensure safety of third-party libraries from known CVEs.

2.1.10 Use software and libraries that do not have known vulnerabilities. ✓
Implementation: Use of updated software and libraries with no known vulnerabilities, in accordance with company IT policy.

2.2 Consider security benefits and trade-offs when selecting the appropriate model to use

2.2.1 Assess the need to use sensitive data for training the model, or directly referenced by the model. ✓
Implementation: Sensitive data is not used for the vector database. Training data has been carefully sanitised with sensitive data redaction methods to counter inference attacks.

2.2.2 Consider model hardening if appropriate. ✓
Implementation: Prompt engineering to prevent the model from producing output beyond what is intended. Implemented guardrails to ensure sensitive data is not disclosed.

2.2.3 Consider implementing techniques to strengthen/harden the system apart from strengthening the model itself. ✓
Implementation: Added input prompt filters and output filters for unwanted topics, to mitigate against prompt injections.
2.3 Identify, track and protect AI-related assets

2.3.1 Establish a data lineage and software license management process. This includes documenting the data, code, test cases and model, including any changes made and by whom. ✓
Implementation: Maintained documentation of the changes made to newer model versions on model cards and verified it.

2.3.2 Secure data at rest, and data in transit. ✓
Implementation: Encryption algorithms approved by the enterprise security policy are used for data at rest and in transit.

2.3.3 Have regular backups in the event of compromise. ✓
Implementation: Used git to maintain version control of the codebase and model artifacts.

2.3.4 Implement controls to limit what the AI can access and generate, based on the sensitivity of the data. ✓
Implementation: Prompt engineering to ensure that the model is less likely to generate any unwanted topics.

2.3.5 For very private data, privacy-enhancing technologies may be used. ✓
Implementation: No private data used.

2.4 Secure the AI development environment

2.4.1 Appropriate access controls to APIs, models and data, logs, and the environments that they are in. ✓
Implementation: Rule- and role-based access controls implemented for developers.

2.4.2 Implement access logging and monitoring. ✓
Implementation: Turned on cloud-native logging.

2.4.3 Segregate production and development environments. ✓
Implementation: The developer environment is in a different VPC from the deployment environment.

2.4.4 Ensure configurations are secure by default. ✓
Implementation: Implicit deny access for unauthorised users via cloud-native identity and access management.
3.1.2.3. DEPLOYMENT

3.1 Secure the deployment infrastructure and environment of AI systems

3.1.1 Ensure contingency plans are in place to mitigate disruption or failure of AI services. ✓
Implementation: Deployed a backup availability zone to ensure availability of the service.

3.1.2 Implement appropriate access controls to APIs, models and data, logs, configuration files and the environments that they are in. ✓
Implementation: General users only have access to the LLM interface via the frontend chatbot, with no access to the backend environment.

3.1.3 Implement access logging, monitoring and policy management. ✓
Implementation: Turned on cloud-native logging.

3.1.4 Implement segregation of environments. ✓
Implementation: The deployment environment is in a different VPC from the development environment.

3.1.5 Ensure configurations are secure by default. ✓
Implementation: Implicit deny access for unauthorised users via cloud-native identity and access management.

3.1.6 Consider implementing firewalls. ✓
Implementation: Configured firewalls between access to the environment and the model.

3.1.7 Implement any other relevant security controls based on cybersecurity best practice which have not been stated above. ✓
Implementation: Current controls are in line with company cybersecurity policy.
3.2 Establish incident management procedures

3.2.1 Have plans to depict different attack and outage scenarios. Implement measures to assist investigation. ✓
Implementation: Conducted an exercise to simulate an outage of the AI chatbot and failover to another availability zone.

3.2.2 Regularly reassess incident response plans as the system changes. ✓
Implementation: Will reassess the system every 12 months or whenever there is an update to the system, according to company cybersecurity policy.

3.2.3 Have regular backups in the event of compromise. ✓
Implementation: Weekly backups in place, according to company IT policy.

3.2.4 When an alert has been raised or an investigation has confirmed an incident, report to the relevant stakeholders. ✓
Implementation: Procedure in place to report to the CISO, in accordance with the incident response standard operating procedure.
3.3 Release AI systems responsibly

3.3.1 Verify models with hashes/signatures of model files and datasets before deployment or periodically, according to enterprise policy. ✓
Implementation: Models are validated with hashes before deployment.

3.3.2 Benchmark and test the AI models before release. ✓
Implementation: Prepared a golden dataset to validate and benchmark the model. Conducted red teaming on the LLM model before release, incorporating test cases on prompt injection and supply chain attacks, which were identified during the security risk assessment.

3.3.3 Consider the need to conduct security testing on the AI systems. ✓
Implementation: Performed VAPT/security testing on the LLM systems. The system owner followed up on findings from the red team, assessed the criticality of vulnerabilities uncovered, applied additional measures, and sought approval from the CISO for acceptance of vulnerabilities that cannot be rectified.
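As an illustration of control 3.3.1, the sketch below computes the SHA-256 digest of a model artifact and compares it with the digest recorded at benchmarking time (e.g. in the model card or a signed manifest). The file name and expected digest are hypothetical placeholders.

import hashlib

def sha256_of(path):
    """Compute the SHA-256 digest of a model or dataset file."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical values: the artifact name and the digest recorded in the model card.
EXPECTED_DIGEST = "<digest recorded at benchmarking time>"

if sha256_of("superresponder_model.bin") != EXPECTED_DIGEST:
    raise RuntimeError("Model artifact failed integrity check; do not deploy")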
3.1.2.4. OPERATIONS AND MAINTENANCE

4.1 Monitor AI system inputs

4.1.1 Validate/monitor inputs to the model and system for possible attacks and suspicious activity. ✓
Implementation: All inputs to the LLM that have triggered guardrails are logged for future review and to identify potential vulnerabilities in prompt design.

4.1.2 Monitor/limit the rate of queries. ✓
Implementation: API throttling is in place to limit the rate of queries to the model.

4.2 Monitor AI system outputs and behaviour

4.2.1 Monitor model outputs and model performance. ✓
Implementation: Implemented a monitoring system to detect anomalous behaviour or outputs from the LLM system that could indicate an attack or vulnerability.

4.2.2 Ensure adequate human oversight to verify model output, when viable or appropriate. ✓
Implementation: Manually investigate unusual, automated processes that are flagged as anomalous.

4.3 Adopt a secure-by-design approach to updates and continuous learning.

4.3.1 Treat major updates as new versions and integrate software updates with model updates and renewal. ✓
Implementation: To validate and benchmark new models and updates against a 'golden dataset'.

4.3.2 Treat new input data used for training as new data. ✓
Implementation: New data used for fine-tuning will be validated as if it were new data.
4.4 Establish a vulnerability disclosure process

4.4.1 Maintain open lines of communication. ✓
Implementation: Established a vulnerability disclosure program (bounty program, etc.) to encourage responsible reporting and handling of security vulnerabilities by users.

4.4.2 Share findings with appropriate stakeholders. ✓
Implementation: New findings will be shared with the company CISO.

3.1.2.5. END OF LIFE

5.1 Ensure proper data and model disposal

5.1.1 Ensure proper and secure disposal/destruction of data and models in accordance with data privacy standards and/or relevant rules and regulations. ✓
Implementation: All data related to the chatbot and the vector database will be deleted through the CSP data disposal process, in line with company data policy.
3.2. STREAMLINED IMPLEMENTATION EXAMPLE
Case Study: Patch attacks on image recognition surveillance system

• Company B has recently implemented an advanced AI-driven facial recognition gantry system at all access points at their office.

• The system is part of enhanced security measures to identify individuals and to streamline employee flow by reducing dependence on manual checks.

• Facial recognition systems utilise deep learning algorithms to identify individuals, by analysing visual data captured through cameras.

Patch Attacks

A patch attack is a type of attack that disrupts object classification in a camera's visual field by introducing a specific pattern or object. This disruption can lead to misinterpretation or evasion attacks.

In this example, the system owner has identified patch attacks as a possible attack vector for this system.

Figure 3. AI Facial Recognition Gantry System Architecture
3.2.1. RISK ASSESSMENT EXAMPLE – EXTRACT ON PATCH ATTACK

The following is an extract from a security risk assessment, specific to an image patch attack.

Risk Scenario: Image patch evasion attack. An attacker can use adversarial patches to compromise physical security measures, leading to unauthorised access and potential security breaches.
Impact: Integrity: High. The integrity of the AI facial recognition system will be impacted, allowing unauthorised personnel to access the gantry.
Likelihood: Low. Threat actors need to know how the facial recognition AI model works in order to generate a malicious patch that is effective.
Proposed Mitigations: Adversarial training; Ensemble model; Multiple sensors; Input filtering
Risk Level: Initial Risk Level: High; Residual Risk Level: Low
3.2.2. RELEVANT TREATMENT CONTROLS FROM COMPANION GUIDE

To avoid repetition from section 3.1, we outline only the essential controls related to the Patch Attack scenario.

2.2 Consider security benefits and trade-offs when selecting the appropriate model to use

2.2.2 Consider model hardening if appropriate. ✓
Implementation: Adversarial training is implemented. Ensemble model: utilised ensemble approaches that combine multiple facial recognition algorithms. These measures can enhance robustness and resilience against image patch attacks, mitigating the impact of individual vulnerabilities.

2.2.3 Consider implementing techniques to strengthen/harden the system apart from strengthening the model itself. ✓
Implementation: Multi-sensor fusion: multiple cameras and lasers are used to detect the face.

4.1 Monitor AI system inputs

4.1.1 Validate/monitor inputs to the model and system for possible attacks and suspicious activity. ✓
Implementation: An additional input filtering layer to detect if abnormal patches are present, with a staff member verifying when one is detected.
GLOSSARY
Term Brief description

AI system Artificial Intelligence.


A machine-based system that for explicit or implicit objectives, infers, from the
input it receives, how to generate outputs such as predictions, content,
recommendations, or decisions that can influence physical or virtual
environments. Different AI systems vary in their levels of autonomy and
adaptiveness after deployment.

Adversarial The process of extracting information about the behaviour and characteristics
Machine of an ML system and/or learning how to manipulate the inputs into an ML
Learning system in order to obtain a preferred outcome.

Anomaly The identification of observations, events or data points that deviate from what
Detection is usual, standard, or expected, making them inconsistent with the rest of data.

API Application Programming Interface.


A set of protocols that determine how two software applications will interact
with each other.

Backdoor A backdoor attack is when an attacker subtly alters AI models during training,
attack causing unintended behaviour under certain triggers.

Chatbot A software application that is designed to imitate human conversation through


text or voice commands

Computer An interdisciplinary field of science and technology that focuses on how


Vision computers can gain understanding from images and videos.

Data Breach A Data Breach occurs when a threat actor gains unauthorised access to sensitive/confidential data.

Data Integrity The property that data has not been altered in an unauthorised manner. Data
integrity covers data in storage, during processing, and while in transit.

Data Leakage Unintentional exposure of sensitive, protected, or confidential information
outside its intended environment.

Data Loss A system’s ability to identify, monitor, and protect data in use (e.g., endpoint
Prevention actions), data in motion (e.g., network actions), and data at rest (e.g., data
storage) through deep packet content inspection, and contextual security
analysis of transaction (e.g., attributes of originator, data object, medium,
timing, recipient/destination, etc.) within a centralised management
framework.

Data Control a model with training data modifications.


Poisoning

Data Science An interdisciplinary field of technology that uses algorithms and processes to
gather and analyse large amounts of data to uncover patterns and insights that
inform business decisions.

Deep Learning A function of AI that imitates the human brain by learning from how it structures
and processes information to make decisions. Instead of relying on an
algorithm that can only perform one specific task, this subset of machine
learning can learn from unstructured data without supervision.

Defence-in- Defence in depth is a strategy that leverages multiple security measures to


Depth protect an organization's assets. The thinking is that if one line of defence is
compromised, additional layers exist as a backup to ensure that threats are
stopped along the way.

Evasion attack Crafting input to AI in order to mislead it into performing its task incorrectly.

Extraction Copy or steal an AI model by appropriately sampling the input space and
attack observing outputs to build a surrogate model that behaves similarly.

Generative AI A type of machine learning that focuses on creating new data, including text, video, code and images. A generative AI system is trained using large amounts of data, so that it can find patterns for generating new content.

Guardrails Restrictions and rules placed on AI systems to make sure that they handle data
appropriately and don't generate unethical content.

Hallucination An incorrect response from an AI system, or false information in an output that
is presented as factual information.

Image Image recognition is the process of identifying an object, person, place, or text
Recognition in an image or video.

LLM Large Language Model.


A type of AI model that processes and generates human-like text. LLMs are
specifically trained on large data sets of natural language to generate human-
like output.

ML Machine Learning.
A subset of AI that incorporates aspects of computer science, mathematics,
and coding. Machine learning focuses on developing algorithms and models
that can learn from data, and make predictions and decisions about new data.

Membership Data privacy attacks to determine if a data sample was part of the
Inference training set of a machine learning model.
attack

NLP Natural Language Processing.


A subset of AI that enables computers to understand spoken and written
human language. NLP enables features like text and speech recognition on
devices.

Neural A deep learning technique designed to resemble the human brain’s structure.
Network Neural networks require large data sets to perform calculations and create
outputs, which enables features like speech and vision recognition.

Overfitting Occurs in machine learning training when the algorithm can only work on
specific examples within the training data. A typical functioning AI model
should be able to generalise patterns in the data to tackle new tasks.

Prompt A prompt is a natural language input that a user feeds to an AI system in order to get a result or output.

Reinforcement A type of machine learning in which an algorithm learns by interacting with its
Learning environment and then is either rewarded or penalised based on its actions.

SDLC Software Development Life Cycle

The process of integrating security considerations and practices into the


various stages of software development. This integration is essential to ensure
that software is secure from the design phase through deployment and
maintenance.

Training data Training data is the information or examples given to an AI system to enable it to learn, find patterns, and create new content.
ANNEX A
Technical Testing and System Validation

Efficient testing is an essential component for Security by Design and


Privacy by Design, ensuring that AI systems meet the needs and
expectations of end-users, deliver value, solve real-world problems,
and are safe, reliable, accurate, and beneficial for intended users and
purposes.

AI systems can be vulnerable to adversarial attacks where malicious actors


manipulate inputs to cause the system to malfunction. Testing helps expose these
vulnerabilities and implement safeguards to mitigate them. Repeated iterations can
improve the design lifecycle and lead to a deeper understanding of how individual AI
components are interacting with each other in an eco-system, which should be secured
in its totality.

TYPES OF TESTING

There are three main categories of AI testing, each with varying levels of access to the
internal workings of the AI system:

White-Box Testing: In white-box testing, you have complete access to the source code, model weights, and internal logic of the AI system. This allows for very detailed testing, focusing on specific algorithms and code sections. However, it requires significant expertise in the underlying technology and can be time-consuming.

Grey-Box Testing: Grey-box testing provides partial access to the AI system. You might
have knowledge of the algorithms used but not the specific implementation details. This
allows for testing specific functionalities without getting bogged down in the intricate
code.
Black-Box Testing: Black-box testing treats the AI system as a complete unit, with no
knowledge of its internal workings. This is similar to how a user would interact with the
system. Testers focus on inputs, outputs, and expected behaviours.

PROS AND CONS OF BLACK BOX TESTING FOR AI

Black-box testing offers several advantages, particularly for securing sensitive


information:

Protects Intellectual Property: By not requiring access to source code or model


weights, black box testing safeguards proprietary information and trade secrets.

Focus on User Experience: It prioritises real-world functionality from a user's


perspective, ensuring the AI delivers the intended results.

Reduced Expertise Needed: Testers do not need in-depth knowledge of the underlying
algorithms, making it more accessible for broader testing teams.

However, it is important to note that black box testing alone might not be sufficient for
the most comprehensive form of AI testing, because:

Limited Visibility into Issues: Without understanding the internal workings, it can be
difficult to pinpoint the root cause of errors or unexpected behaviours.

Challenges in Debugging: Debugging issues becomes more complex as you cannot


isolate problems within the specific algorithms or code sections.
CHALLENGES OF AI TESTING

Despite considerable research to uncover the best methods for enhancing robustness, many countermeasures fail when subjected to stronger adversarial attacks. The recommended approach is to subject the AI system iteratively to robustness testing against different defences, using a comprehensive testing tool or system, much like running a penetration test.

Such a platform would subject the system under test not only to multiple attacks that scale up progressively, but would also manage the testing cycles with the knowledge needed to optimise the attack evaluation process, e.g. black-box attacks that do not need the help of insiders. In addition, project teams can also test the robustness of their AI systems against the full set of known and, importantly, unknown adversarial attacks.

Other challenges are:

Non-determinism: resulting from self-learning, i.e. AI-based systems may evolve over
time and therefore security properties may degrade.

Test oracle problem: where assigning a test verdict is different and more difficult for AI-
based systems, since not all expected results are known a priori.

Data-driven paradigm: AI algorithms, where in contrast to traditional systems, (training)


data will predominately determine the output behaviour of the AI.

Developing diverse test datasets: Creating datasets that represent various languages,
modalities (text, image, audio), and potential attack vectors.

Evaluating performance across modalities: Measuring the effectiveness of attacks and


model robustness across different data types.
Limited testing tools: The need for specialised tools to handle the complexities of
blended AI models.

LIST OF AI TESTING TOOLS

AI testing is extremely complex, and the tools listed here will not always be able to reduce its complexity and difficulty.

The list of tools for AI model testing will be split into three categories: Offensive AI Testing
Tools, Defensive AI Testing Tools, and Governance AI Testing Tools, based on the primary
purpose and functionality of the tools.

Offensive AI Testing Tools are designed to identify vulnerabilities and weaknesses in AI systems by simulating adversarial attacks or malicious inputs. These tools help evaluate the robustness and security of AI models against various types of attacks, such as adversarial examples, data poisoning, and model extraction.

Defensive AI Testing Tools, on the other hand, focus on enhancing the robustness and resilience of AI systems against potential threats and vulnerabilities. These tools aim to detect and mitigate the impact of adversarial attacks, natural noises, or other forms of corrupted inputs, ensuring that AI models maintain their intended behaviour and performance. Tools that have both offensive and defensive elements are listed under Offensive Testing.

Governance AI Testing Tools are broader in scope and are primarily concerned with assessing the trustworthiness, fairness, and transparency of AI systems. These tools provide frameworks, guidelines, and resources to evaluate and ensure that AI systems align with principles of responsible AI development, deployment, and governance.
Note: The tools mentioned in these tables are often open-source projects or research prototypes that are still under active
development. As such, their functionality, performance, and capabilities may change over time, and they might not always
work as intended or as described. It is essential to regularly check for updates, documentation, and community support for
these tools, as their features and effectiveness may evolve rapidly. Additionally, some tools might have limited support or
documentation, requiring users to have a certain level of expertise and familiarity with the underlying concepts and
technologies. Therefore, it is crucial to thoroughly evaluate and validate these tools in a controlled environment before
deploying them in production or critical systems. Using highly automated settings may result in violations of cybersecurity
misuse legislation that forbids any form of scanning or vulnerability scanning unless permission has been granted. For open-
source tools, their long-term maintenance, ease of use, other tools integration, reporting and community adoption may be
a concern, especially compared to commercial or enterprise-backed AI security solutions.

OFFENSIVE AI TESTING TOOLS
Gymnasium3 (formerly Gym): Malware environment for single-agent reinforcement learning environments, with popular reference environments and related utilities.
License type: Open-source. Model type: Various.
Pros: Provides a toolkit for developing and comparing reinforcement learning algorithms. This makes it possible to write agents that learn to manipulate PE files (e.g., malware) to achieve some objective (e.g., bypass AV) based on a reward provided by taking specific manipulation actions.
Cons: Limited to the malware domain.

Deep-Pwning4: Metasploit for Machine Learning.
License type: Open-source. Model type: Various.
Pros: Comprehensive framework for evaluating the robustness of ML models against adversarial attacks. Offers flexibility and customisation options, allowing testers to fine-tune attack parameters and strategies to suit their specific testing requirements.
Cons: Requires expertise in adversarial machine learning.

Garak5: LLM Vulnerability Scanner.
License type: Open-source. Model type: LLMs, Hugging Face models and public ones.
Pros: Specifically designed for testing LLMs for vulnerabilities, i.e. probes for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses.
Cons: Limited to LLMs; relatively new tool.

Adversarial Robustness Toolbox (ART)6: Library that helps developers and researchers improve the security of machine learning models.
License type: Open-source. Model type: Various, but not LLMs.
Pros: Originated from IBM. Was part of a DARPA project called Guaranteeing AI Robustness Against Deception (GARD). Good for research, with modules for attacks, defences, metrics, estimators, and other functionalities to help secure machine learning pipelines against adversarial threats.
Cons: Donated by IBM to the Linux Foundation AI & Data Foundation in 2020 and has lost steam, with no version updates since 2020 and little new activity. Does not directly address LLM security issues like prompt injection.

CleverHans7: A Python library for creating and evaluating adversarial examples, benchmarking machine learning models against adversarial examples.
License type: Open-source. Model type: Various, and developing LLM attacks as well.
Pros: Good educational and research library, offering a wide range of attack and defence methods via a modular design. Offers a comprehensive set of tools for generating and analysing adversarial examples; these are carefully crafted inputs designed to deceive machine learning models, helping researchers and developers identify weaknesses in their systems. Provides various evaluation metrics that go beyond standard accuracy measurements, including metrics like robustness, resilience, and adversarial success rates, giving a more comprehensive understanding of a model's performance.
Cons: Requires a steep learning curve for beginners to understand all the concepts and utilise it effectively. Being a static framework, it may not inherently keep pace with the rapidly evolving landscape of adversarial attacks and defence strategies. Documentation and tutorials are focused on computer vision models. While CleverHans offers implementations for popular machine learning frameworks like TensorFlow and PyTorch, it may not support all existing frameworks or the latest updates.

Foolbox8: A Python toolbox for creating adversarial examples that fool machine learning models.
License type: Open-source. Model type: Various, but not LLMs.
Pros: Open-source Python library that offers a wide variety of adversarial attack methods, including gradient-based, score-based, and decision-based attacks, hence more feature-rich compared to the ART toolkit. Also provides defences against these attacks. Provides in-depth tools and techniques for analysing adversarial attacks and security in the context of computer vision tasks.
Cons: Specialised focus on image classification models; does not cover other areas well.

Advertorch9: A PyTorch library for generating adversarial examples and enhancing the robustness of deep neural networks.
License type: Open-source. Model type: Various, but not LLMs.
Pros: Offers a broader set of attack and defence techniques compared to the ART toolkit, such as universal adversarial perturbations and ensemble-based defences. Allows users to seamlessly apply adversarial attacks and defences to PyTorch models.
Cons: Steep learning curve. Specifically designed for PyTorch models, which may limit its applicability to frameworks or models from different libraries.

Adversarial Attacks and Defences in Machine Learning (AAD) Framework10: Python framework for defending machine learning models from adversarial examples.
License type: Open-source. Model type: Various, but not LLMs.
Pros: Provides a comprehensive set of tools for evaluating and defending against adversarial attacks on machine learning models, including a wider range of attack and defence techniques compared to the ART toolkit, covering areas like evasion, poisoning, and model extraction attacks. Defence techniques include adversarial training, defensive distillation, input transformations, and model ensembles.
Cons: High complexity.

3 https://github.com/Farama-Foundation/Gymnasium
4 https://github.com/cchio/deep-pwning
5 https://github.com/leondz/garak/
6 https://github.com/Trusted-AI/adversarial-robustness-toolbox
7 https://github.com/cleverhans-lab/cleverhans
8 https://github.com/bethgelab/foolbox
9 https://github.com/BorealisAI/advertorch
10 https://github.com/changx03/adversarial_attack_defence

DEFENSIVE AI TESTING TOOLS
CNN Explainer11: Visualisation tool for explaining CNN decisions.
License type: Open-source. Model type: CNN.
Pros: Helps understand and validate CNN model decisions. A good visualisation system to educate new users.
Cons: Limited to CNNs only; does not cover any other AI vision model.

Nvidia NeMo12: A framework for generative AI.
License type: Open-source. Model type: LLM.
Pros: Includes guardrails specifically designed for LLM security, e.g. monitoring and controlling LLM behaviour during inference, ensuring that generated responses adhere to predefined constraints. It provides mechanisms for detecting and mitigating harmful or inappropriate content, enforcing ethical guidelines, and maintaining user privacy. Guardrails are customisable and adaptable to different use cases and regulatory requirements.
Cons: Complex and GPU intensive, thus expensive, and affects latency.

AllenAI's AllenNLP13: An Apache 2.0 NLP research library, built on PyTorch, for developing deep learning models on a wide variety of linguistic tasks.
License type: Open-source. Model type: LLM.
Pros: NLP library that includes guardrails for LLM security: tools for bias detection, fairness assessment, and data governance, helping users build and deploy LLMs responsibly. Designed to be flexible and adaptable to different use cases.
Cons: Steep learning curve, complex setup, and heavily focused on research and experimentation; some of its features might be more geared towards academic research rather than production-level applications. No new features are to be added; the tool is only maintained.

11 https://poloclub.github.io/cnn-explainer/
12 https://github.com/NVIDIA/NeMo
13 https://github.com/allenai/allennlp

AI GOVERNANCE TESTING TOOLS

Assessment List for Trustworthy AI14: Self-assessment tool for trustworthiness of AI systems.
License type: Open-source. Model type: Various.
Pros: Fairly comprehensive framework for evaluating trustworthiness.
Cons: Not an automated tool; requires manual assessment.

OECD AI System Classification15: Classification and tools for developing trustworthy AI systems.
License type: Open-source. Model type: Various.
Pros: Provides guidelines and resources for trustworthy AI development.
Cons: Not a specific testing tool, more of a framework.

Charcuterie16: Collection of tools for data science and machine learning.
License type: Open-source. Model type: Various.
Pros: Provides a variety of tools for data analysis and model development.
Cons: Not specifically focused on testing, more of an assistance tool.

LangKit17: Open-source text metrics toolkit for monitoring language models.
License type: Open-source. Model type: LLM.
Pros: Helps monitor and evaluate LLM performance, safety, and security.
Cons: Limited to LLMs; may not cover broader AI system governance.

AI Verify (IMDA)18: AI governance testing framework and software toolkit that validates the performance of AI systems through standardised tests.
License type: Open-source. Model type: Various.
Pros: A comprehensive tool designed for AI governance and responsible AI practices. It offers a range of features to support organisations in managing and evaluating their AI systems throughout their lifecycle. Provides guidance on bias detection and mitigation, fairness assessments, and stakeholder engagement.
Cons: Does not cover LLMs.

Project Moonshot19 (IMDA): An LLM Evaluation Toolkit designed to integrate benchmarking, red teaming, and testing baselines. It helps developers, compliance teams, and AI system owners manage LLM deployment risks by providing a seamless way to evaluate their applications' performance, both pre- and post-deployment. This open-source tool is hosted on GitHub and is currently in beta.
License type: Open-source. Model type: LLM.
Pros: Moonshot provides intuitive results, so testing unveils the quality and safety of a model or application in an easily understood manner, even for a non-technical user.
Cons: Does not cover LLM system security.

threat-composer (AWS Labs): A simple threat modelling tool to help humans reduce time-to-value when threat modelling.
License type: Open-source. Model type: Various.
Pros: Identify security issues in the context of your own AI system. Provides insights on how to improve.

14 https://digital-strategy.ec.europa.eu/en/library/assessment-list-trustworthy-artificial-intelligence-altai-self-assessment
15 https://www.oecd.org/digital/ieconomy/artificial-intelligence-machine-learning-and-big-data/trusted-ai-systems/
16 https://github.com/moohax/Charcuterie
17 https://github.com/whylabs/langkit
18 https://aiverifyfoundation.sg/what-is-ai-verify/
CSA does not endorse any commercial product or service. CSA does not attest to the
suitability or effectiveness of these services and resources for any particular use case.
Any reference to specific commercial products, processes, or services by service mark,
trademark, manufacturer, or otherwise, does not constitute or imply their endorsement,
recommendation, or favouring by CSA.
19 https://aiverifyfoundation.sg/project-moonshot/
REFERENCES

Articles
1. LinkedIn: How can you design test AI Systems Safely?20
2. Elinext: How to test your medical AI for safety21
3. Mathworks: The Road to AI Certification: The importance of Verification and Validation in AI22
4. Techforgood Institute: AI Verify Foundation: Shaping the AI landscape of tomorrow23
5. FPF.org: Explaining the Crosswalk Between Singapore's AI Verify Testing Framework and The U.S. NIST AI Risk Management Framework24
6. FPF.org: AI Verify: Singapore's AI Governance Testing Initiative Explained25
7. Data Protection Report: Singapore proposes Governance Framework for Generative AI26

Standard / Regulatory Bodies
8. NIST: Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence27
9. ETSI: Securing Artificial Intelligence Introduction28
10. CSA: Guidelines for Auditing Critical Information Infrastructure, January 202029
11. IMDA: Singapore launches AI Verify Foundation to shape the future of international AI standards through collaboration30

20 https://www.linkedin.com/advice/1/how-can-you-design-test-ai-systems-safety
21 https://www.elinext.com/industries/healthcare/trends/step-by-step-guide-how-to-test-your-medical-ai-for-safety
22 https://blogs.mathworks.com/deep-learning/2023/07/11/the-road-to-ai-certification-the-importance-of-verification-and-validation-in-ai/
23 https://techforgoodinstitute.org/blog/articles/ai-verify-foundation-shaping-the-ai-landscape-of-tomorrow/
24 https://fpf.org/blog/explaining-the-crosswalk-between-singapores-ai-verify-testing-framework-and-the-u-s-nist-ai-risk-management-framework/
25 https://fpf.org/blog/ai-verify-singapores-ai-governance-testing-initiative-explained/
26 https://www.dataprotectionreport.com/2024/02/singapore-proposes-governance-framework-for-generative-ai/
27 https://www.nist.gov/artificial-intelligence/executive-order-safe-secure-and-trustworthy-artificial-intelligence/test
28 https://portal.etsi.org/Portals/0/TBpages/SAI/Docs/2021-12-ETSI_SAI_Introduction.pdf
29 https://www.csa.gov.sg/docs/default-source/csa/documents/legislation_supplementary_references/guidelines_for_auditing_critical_information_infrastructure.pdf?sfvrsn=8fe3dab7_0
30 https://www.imda.gov.sg/resources/press-releases-factsheets-and-speeches/press-releases/2023/singapore-launches-ai-verify-foundation-to-shape-the-future-of-international-ai-standards-through-collaboration
ANNEX B
AI Security Defences and their trade-offs

As AI becomes a cornerstone of innovation and national security,


protecting its core components becomes paramount. Implementing a
multi-layered, dynamically adaptive approach that combines
technical safeguards (encryption, air gapping) with robust security
protocols (access control, monitoring) and a culture of cyber
awareness within organisations is crucial to safeguarding these new
"crown jewels" of the digital age.

DEFENDING AI MODELS
Importantly, the models themselves are “fragile” and can be easily attacked using image
or text adversarial robustness attacks, or the LLMs could be attacked using malicious
prompts.

The table below gives a short summary of techniques to defend AI systems (non-LLM) from examples of adversarial attacks.

Adversarial Training: Train the AI model using adversarial samples.

Ensemble Models: Utilise blended models to perform a task, and compare their results.

Defensive Distillation: Train the AI model using class probabilities, instead of discrete class labels, to learn more information about the data.

Adversarial Detection: Attempt to identify whether an input is an adversarial sample.

Counter Image Attacks: Compression; blurring.

Explainability: Identify which part of the input had the highest impact in producing the resulting classification, to discover how and why the attack is happening and what makes it work.

Table C1: Countermeasures with description

ADVERSARIAL TRAINING

The most viable method is to introduce adversarial training into the training dataset and retrain the system, i.e. to simply generate adversarial examples and then incorporate them into the training process. There are toolsets to do this. In addition, some of the latest image object recognition algorithms, e.g. YOLOv5, incorporate adversarial training within this workflow when running training. This will improve model robustness but may not eliminate the vulnerability.

Hence, the main goal of Adversarial Training is to make a model more robust against adversarial attacks by adding adversarial samples into the model's training dataset, much like adding augmented samples, such as mirrored or cropped images, to improve generalisation. An existing attack algorithm is used to generate these adversarial samples, and there are several variants that utilise different algorithms to generate the adversarial samples for training. Adversarial Training can also be thought of as a brute-force approach, which aims to widen the input distribution of the model so that the boundaries between classes become more accurate.
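The following is a minimal PyTorch sketch of FGSM-based adversarial training, assuming an image classifier with inputs normalised to [0, 1]; toolkits such as the Adversarial Robustness Toolbox provide more complete implementations and stronger attacks (e.g. PGD).

import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Generate FGSM adversarial examples by perturbing inputs along the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Clamp assumes image inputs normalised to [0, 1].
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step on a 50/50 mix of clean and adversarial samples."""
    model.train()
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()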

LIMITATIONS OF ADVERSARIAL TRAINING

Adversarial Training requires additional time to train the model using adversarial samples. Iterative attack algorithms such as Projected Gradient Descent (PGD) require a much larger time cost, making them difficult to use for training with massive datasets.

Adversarial Training is mainly effective against the adversarial samples the model was trained against. Note that models, even with Adversarial Training, remain susceptible to black-box attacks that utilise a locally trained substitute model to generate adversarial samples. Another technique, proposed in "Ensemble Adversarial Training: Attacks and Defences" by Tramèr et al.31, adds random perturbations to an input before running the adversarial attacks on the perturbed input, successfully bypassing the Adversarial Training defence.
31 https://arxiv.org/abs/1705.07204
ENSEMBLE MODELS

Another intuitive approach to enhance model robustness would be to use multiple models (ideally handling different aspects of the recognition problem) to either detect an attack or to prevent a bypass attack. For example, as depicted in the diagram below, if there were a second AI head detector, then even though the person detector is fooled by the physical logo on the attacker's shirt, the head detector would not be fooled. Additionally, if there are multiple recognition models, the combined results of the different AI systems could still be functional, despite one model being successfully attacked.
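A minimal sketch of the ensemble idea is shown below: several independently trained models vote on each input, and low agreement can be treated as a signal that the input may be adversarial. The predict() interface and the agreement threshold are assumptions for illustration.

from collections import Counter

def ensemble_predict(models, x):
    """Majority vote across independently trained models; low agreement may signal an adversarial input."""
    votes = [m.predict(x) for m in models]          # each model is assumed to expose predict()
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(models)
    return label, agreement

# Usage sketch (hypothetical detectors):
# label, agreement = ensemble_predict([person_detector, head_detector], frame)
# if agreement < 0.67:
#     flag_for_human_review(frame)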

LIMITATIONS OF ENSEMBLE MODELS

As multiple models are run on each input, ensemble methods require additional resources:
more memory and computational power for each classification. Ensembles of models may also
require more development time and be more difficult to use in scenarios where fast,
real-time predictions are required.

DEFENSIVE DISTILLATION

Distillation, also known as Teacher-Student Models, is a procedure which uses knowledge
obtained from a trained 'teacher' Deep Neural Network (DNN) to train a second 'student'
DNN. The classes of the labelled training data are known as hard labels, while the output
classifications of the 'teacher' DNN are known as soft labels, which capture probability
distributions indicating how confident the model is in each class. The 'student' DNN is
trained using these soft labels; the softer predictions make the student, which has learnt
a more nuanced representation of the dataset, harder to fool. This makes the DNN more
robust to adversarial attacks.
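
The sketch below shows the core of this procedure, assuming a trained teacher network, an
untrained student network and an optimiser (all hypothetical objects). The student is
trained to match the teacher's temperature-softened probabilities rather than the hard
class labels; the temperature value is illustrative.

import torch
import torch.nn.functional as F

def distillation_step(teacher, student, optimizer, x, temperature=20.0):
    with torch.no_grad():
        # Soft labels: the teacher's class probabilities, softened by a high temperature.
        soft_labels = F.softmax(teacher(x) / temperature, dim=-1)
    # The student learns the teacher's probability distribution instead of hard labels.
    student_log_probs = F.log_softmax(student(x) / temperature, dim=-1)
    loss = F.kl_div(student_log_probs, soft_labels, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()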

LIMITATIONS OF DEFENSIVE DISTILLATION

However, defensively distilled models are still vulnerable to various black-box attacks,
due to the strong transferability of adversarial samples generated by these attacks.
Modified versions of existing attack algorithms, such as the modified Papernot's attack,
have also successfully bypassed defensive distillation.

UTILISING EXPLAINABILITY

A different approach to Adversarial Detection involves incorporating Explainable AI (XAI)
techniques, which 'explain' the reasons that led to the AI model's prediction. XAI is an
emerging field in machine learning that aims to explain predictions made by AI models in
order to improve accuracy and fairness, while also aiding the detection of possible
anomalies or adversarial attacks. To help humans understand the complex black boxes that
AI models are, XAI is expected to provide explanations that are interpretable by humans,
with clear and simple visualisations.

The main strength of this method is its ability to gain insights into weaknesses present in
the model, such as when the reasons leading to the resultant prediction are incorrect. A
local interpreter is built to explain the factors that cause adversarial samples to be
wrongly classified by the target model.

Furthermore, adversarial samples that exploit these weaknesses can then be generated
for use in adversarial training, allowing the model to overcome them. In addition, as the
interpretation technique is general to all classifiers, this method can be applied to
improve any type of model that supports XAI techniques.

Finally, AI Explainability techniques can be applied to suspected adversarial inputs,
providing human operators with visualisations explaining why these inputs are potentially
malicious. The operators can then determine whether the detection was a false positive and
work on improving the detection model. If the detection was accurate, problems with the
defended model can potentially be identified, and the appropriate countermeasures can be
applied.
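
As an illustrative example, the snippet below computes a simple gradient-based saliency
map that can be visualised for a human operator. The model object is an assumed image
classifier; established XAI libraries (e.g. SHAP, LIME or Captum) provide more
sophisticated explanation methods.

import torch

def saliency_map(model, x):
    # Gradient-based saliency: how strongly does each input pixel influence
    # the score of the predicted class?
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    scores = model(x)
    top_class = scores.argmax(dim=-1)
    # Back-propagate the predicted class score to the input pixels.
    scores.gather(1, top_class.unsqueeze(1)).sum().backward()
    # Per-pixel importance: absolute gradient, maximised over colour channels.
    return x.grad.abs().amax(dim=1)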

A COMBINATION OF TECHNIQUES

Multiple countermeasures can be used to complement one another, creating a defence-in-depth
approach. At a higher level, using ensemble defences with differently configured AI models
would ensure even stronger robustness against adversarial attacks.
DEFENDING YOUR AI SYSTEMS
BEYOND THE MODELS
Even after the AI models themselves have been defended, it is still possible to subvert,
poison and tamper with the AI system. Enhanced infrastructural security measures therefore
have to be added to counter the offensive TTPs identified during the risk assessment. The
key areas to focus on include:

Continuous monitoring and threat intelligence: Staying informed about the latest threats
and vulnerabilities through threat intelligence feeds and security monitoring tools, and
monitoring the AI system's own behaviour for anomalies (a minimal sketch is shown after
this list).

Implementing security best practices: This includes basic hygiene measures such as
patching vulnerabilities, using strong passwords and implementing multi-factor
authentication, as well as increasing system segregation and isolation using containers,
VMs, air gaps, firewalls, etc.

User awareness training: Educating employees about social engineering tactics and how
to identify and avoid phishing attacks.

Security testing and vulnerability assessments: Regularly testing systems for
vulnerabilities and implementing security controls to mitigate risks.

Investing in Security Automation: Utilise automation tools to streamline security
processes and improve efficiency.
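
As one illustration of continuous monitoring of the AI system itself, the sketch below
tracks the rolling average of the model's prediction confidence and raises an alert when
it drops well below an expected baseline, which can indicate drift, data poisoning or an
ongoing evasion attempt. The window size and thresholds are purely illustrative and would
be tuned to the system's normal behaviour.

from collections import deque

class ConfidenceMonitor:
    # Tracks recent prediction confidence and alerts on a sharp drop
    # relative to an expected baseline (illustrative thresholds).
    def __init__(self, window=1000, baseline=0.90, tolerance=0.10):
        self.history = deque(maxlen=window)
        self.baseline = baseline
        self.tolerance = tolerance

    def observe(self, confidence: float) -> bool:
        self.history.append(confidence)
        rolling_avg = sum(self.history) / len(self.history)
        # Return True (alert) when confidence falls well below the baseline.
        return rolling_avg < self.baseline - self.tolerance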

By staying proactive and adapting to an evolving threat landscape that now includes
powerful AI-armed APT intruders, organisations can build stronger defences around their AI
crown jewels and mitigate the impact of cyberattacks. Remember, cybersecurity is an ongoing
process, not a one-time fix.

AI SECURITY DEFENCES AND
THEIR TRADE-OFFS
It is prudent to start implementing countermeasures to protect AI models against attacks
early, even though unknowns and trade-offs remain:

• No single method or countermeasure can reliably defend against all attacks.

• There is limited awareness and know-how in understanding and operationalising
adversarial countermeasures. This is exacerbated by the complexity of AI models, which
also makes it difficult to prove how, and which, defence methods will work against a
given subset of attacks.

• As with other changes to the AI model and system, modifications to the model to enhance
defences can have an impact on model and system performance.

Regardless, traditional security practices continue to be relevant and provide a good
foundation for securing cutting-edge technologies like AI, even as work in this space
continues to evolve.
REFERENCES

Standards / Regulatory Bodies

1. NIST
a. NIST: NIST Secure Software Development Framework for Generative AI and for Dual Use Foundation Models Virtual Workshop (https://www.nist.gov/news-events/events/nist-secure-software-development-framework-generative-ai-and-dual-use-foundation)
b. NIST: Executive Order 14110 on Safe, Secure, and Trustworthy Artificial Intelligence (October 2023) (https://www.nist.gov/artificial-intelligence/executive-order-safe-secure-and-trustworthy-artificial-intelligence)
c. NIST: AI RMF Knowledge Base (https://airc.nist.gov/AI_RMF_Knowledge_Base/AI_RMF)
d. NIST: USAISI Workshop: Collaboration to Enable Safe and Trustworthy AI (https://www.nist.gov/news-events/events/usaisi-workshop-collaboration-enable-safe-and-trustworthy-ai)
e. NIST: USAISI Workshop Slides (https://www.nist.gov/system/files/documents/noindex/2023/11/20/USAISI-workshop-slides%20%28combined%20final%29.pdf)
f. NIST: Artificial Intelligence (https://www.nist.gov/artificial-intelligence/artificial-intelligence-safety-institute)
g. NIST: Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety (https://www.nist.gov/news-events/news/2024/02/biden-harris-administration-announces-first-ever-consortium-dedicated-ai)

2. ENISA
a. ENISA: Artificial Intelligence (https://www.enisa.europa.eu/topics/iot-and-smart-infrastructures/artificial_intelligence)
b. ENISA: EU Elections at Risk with Rise of AI-Enabled Information Manipulation (https://www.enisa.europa.eu/news/eu-elections-at-risk-with-rise-of-ai-enabled-information-manipulation)
c. ENISA: Multilayer Framework for Good Cybersecurity Practices for AI (https://www.enisa.europa.eu/publications/multilayer-framework-for-good-cybersecurity-practices-for-ai)
d. ENISA: Cybersecurity of AI and Standardisation Report (https://www.enisa.europa.eu/publications/cybersecurity-of-ai-and-standardisation/@@download/fullReport)
e. ENISA: Artificial Intelligence Cybersecurity Challenges (https://www.enisa.europa.eu/publications/artificial-intelligence-cybersecurity-challenges)
f. ENISA: Is Secure and Trusted AI Possible? The EU Leads the Way (https://www.enisa.europa.eu/news/is-secure-and-trusted-ai-possible-the-eu-leads-the-way)
g. ENISA: Cybersecurity and privacy in AI - Medical imaging diagnosis (https://www.enisa.europa.eu/publications/cybersecurity-and-privacy-in-ai-medical-imaging-diagnosis)
h. ENISA: Cybersecurity and privacy in AI - Forecasting demand on electricity grids

3. NCSC
a. NCSC: Guidelines for secure AI system development (https://www.ncsc.gov.uk/collection/guidelines-secure-ai-system-development; PDF: https://www.ncsc.gov.uk/files/Guidelines-for-secure-AI-system-development.pdf)

4. NSA
a. NSA: Deploying AI Systems Securely: Best Practices for Deploying Secure and Resilient AI Systems (https://www.nsa.gov/Press-Room/Press-Releases-Statements/Press-Release-View/Article/3741371/nsa-publishes-guidance-for-strengthening-ai-system-security/)

5. Singapore
a. CSA: Codes of Practice (https://www.csa.gov.sg/legislation/Codes-of-Practice)
b. PDPC: Primer for 2nd Edition of AI Gov Framework (https://www.pdpc.gov.sg/-/media/files/pdpc/pdf-files/resource-for-organisation/ai/primer-for-2nd-edition-of-ai-gov-framework.pdf)
c. Gov.SG: ISAGO (http://go.gov.sg/isago)
d. PDPC: Advisory Guidelines on Use of Personal Data in AI Recommendation and Decision Systems (https://www.pdpc.gov.sg/guidelines-and-consultation/2024/02/advisory-guidelines-on-use-of-personal-data-in-ai-recommendation-and-decision-systems)

6. Standards and Guides
a. ISO/IEC 42001:2023 Information Technology: Artificial Intelligence: Management System (https://www.iso.org/standard/81230.html)
b. ISO/IEC 23894:2023 Information Technology: Artificial Intelligence: Guidance on Risk Management (https://www.iso.org/standard/77304.html)
c. OWASP AI Security and Privacy Guide (https://owasp.org/www-project-ai-security-and-privacy-guide/)

7. Others
a. Partnership on AI (https://partnershiponai.org/)
b. AJL (https://www.ajl.org)
c. International Telecommunication Union (ITU) (https://www.itu.int/)
d. OECD Artificial Intelligence (https://www.oecd.org/digital/artificial-intelligence/)
8. GitHub Repositories
a. Privacy Library of Threats 4 Artificial Intelligence (https://plot4.ai/)
b. Guardrails.AI (https://github.com/guardrails-ai/guardrails)
c. PyDP: Differential Privacy (https://github.com/OpenMined/PyDP)
d. IBM Differential Privacy Library (https://github.com/IBM/differential-privacy-library)
e. TenSEAL: Encrypting Tensors with Microsoft SEAL (https://github.com/OpenMined/TenSEAL)
f. SyMPC: Extends PySyft with SMPC Support (https://github.com/OpenMined/SyMPC)
g. PyVertical: Privacy-preserving vertical federated learning using PySyft (https://github.com/OpenMined/PyVertical)

9. Articles
a. CSO: NIST releases expanded 2.0 version of the Cybersecurity Framework (https://www.csoonline.com/article/1310046/nist-releases-expanded-2-0-version-of-the-cybersecurity-framework.html)
b. Technology Law Dispatch: ENISA Releases Comprehensive Framework for Ensuring Cybersecurity in the Lifecycle of AI Systems (https://www.technologylawdispatch.com/2023/06/data-cyber-security/enisa-releases-comprehensive-framework-for-ensuring-cybersecurity-in-the-lifecycle-of-ai-systems/)
c. DataGuidance: ENISA releases four reports on AI and Cybersecurity (https://www.dataguidance.com/news/eu-enisa-releases-four-reports-ai-and-cybersecurity)
d. Korea Times: Korea issues first AI ethics checklist (https://m.koreatimes.co.kr/pages/article.asp?newsIdx=352971)
e. Dig.watch: South Korea to boost trust in AI with watermarking initiative (https://dig.watch/updates/south-korea-to-boost-trust-in-ai-with-watermarking-initiative)
f. Kim & Chang: South Korea: Legislation on Artificial Intelligence to Make Significant Progress (https://www.kimchang.com/en/insights/detail.kc?sch_section=4&idx=26935)
g. MetaNews: South Korean Government Says No Copyright for AI Content (https://metanews.com/south-korean-government-says-no-copyright-for-ai-content/)
h. East Asia Forum: The future of AI policy in China (https://eastasiaforum.org/2023/09/27/the-future-of-ai-policy-in-china/)
i. Reuters: China approves over 40 AI models for public use in past six months (https://www.reuters.com/technology/china-approves-over-40-ai-models-public-use-past-six-months-2024-01-29/)
j. DataNami: Artificial Intelligence Leaders Partner with Cloud Security Alliance to Launch the AI Safety Initiative (https://www.datanami.com/this-just-in/artificial-intelligence-leaders-partner-with-cloud-security-alliance-to-launch-the-ai-safety-initiative/)
k. World Economic Forum: Why we need to care about responsible AI in the age of the algorithm (https://www.weforum.org/agenda/2023/03/why-businesses-should-commit-to-responsible-ai/)
l. Anjuna: What is Confidential Computing? | Data Security in Cloud Computing (https://www.anjuna.io/blog/confidential-computing-a-new-paradigm-for-complete-cloud-security)
m. NVIDIA Blog: What is Confidential Computing? (https://docs.nvidia.com/nvtrust/index.html)
Advisory and Cloud Providers

10. Google
a. Google: Introducing Google's Secure AI Framework (https://blog.google/technology/safety-security/introducing-googles-secure-ai-framework/)
b. Google: OCISO Securing AI: Similar or Different? (https://services.google.com/fh/files/misc/ociso_securing_ai_different_similar.pdf)

11. Microsoft
a. Microsoft: Azure Platform (https://www.microsoft.com/en-us/ai/ai-platform)
b. Microsoft: Introduction to Azure Security (https://learn.microsoft.com/en-us/azure/security/fundamentals/overview)
c. Microsoft: Responsible AI (https://www.microsoft.com/en-us/ai/responsible-ai)
d. Microsoft: Responsible AI Principles and Approach (https://www.microsoft.com/en-us/ai/principles-and-approach)
e. Microsoft: AI Fairness Checklist (https://www.microsoft.com/en-us/research/project/ai-fairness-checklist/)
f. Microsoft: AI Lab project: Responsible AI dashboard (https://www.microsoft.com/en-us/ai/ai-lab-responsible-ai-dashboard)
g. Microsoft: Our commitments to advance safe, secure, and trustworthy AI (https://blogs.microsoft.com/on-the-issues/2023/07/21/commitment-safe-secure-ai/)

12. Amazon Web Services
a. AWS: Secure approach to generative AI (https://aws.amazon.com/ai/generative-ai/security/)
Consultancies

13. Deloitte
a. Deloitte: Trustworthy AI (https://www2.deloitte.com/us/en/pages/deloitte-analytics/solutions/ethics-of-ai-framework.html)
b. Deloitte: Omnia AI (https://www2.deloitte.com/ca/en/pages/deloitte-analytics/articles/omnia-artificial-intelligence.html)
c. Deloitte AI Institute: The State of Generative AI in the Enterprise: Now Decides Next (https://www2.deloitte.com/us/en/pages/deloitte-analytics/articles/advancing-human-ai-collaboration.html#)

14. EY
a. EY: How to navigate generative AI use at work (https://www.ey.com/en_us/consulting/video-how-to-navigate-generative-ai-use-at-work)
b. EY: EY's commitment to developing and using AI ethically and responsibly (https://www.ey.com/en_gl/ai/principles-for-ethical-and-responsible-ai)
c. EY: Making Artificial Intelligence and Machine Learning trustworthy and ethical (https://assets.ey.com/content/dam/ey-sites/ey-com/en_gl/topics/consulting/ey-making-artificial-intelligence-and-machine-learning-trustworthy-and-ethical.pdf)

15. KPMG
a. KPMG: AI security framework design (https://kpmg.com/us/en/capabilities-services/advisory-services/cyber-security-services/cyber-strategy-governance/security-framework.html)
b. KPMG: Trust in Artificial Intelligence (https://kpmg.com/xx/en/home/insights/2023/09/trust-in-artificial-intelligence.html)

16. PwC
a. PwC: Balancing Power and Protection: AI in Cybersecurity and Cybersecurity in AI (https://www.pwc.com/m1/en/publications/balancing-power-protection-ai-cybersecurity.html)
b. PwC: What is Responsible AI (https://www.pwc.com/gx/en/issues/data-and-analytics/artificial-intelligence/what-is-responsible-ai.html)

17. Research Papers
a. TASRA: A Taxonomy and Analysis of Societal-Scale Risks from AI (https://arxiv.org/abs/2306.06924)
b. China Academy of Information and Communications Technology: Whitepaper on Trustworthy Artificial Intelligence (http://www.caict.ac.cn/english/research/whitepapers/202110/P020211014399666967457.pdf)
c. Trustworthy AI: From Principles to Practices (https://arxiv.org/abs/2110.01167)
