100% found this document useful (1 vote)
162 views41 pages

Secure AI Lifecycle Framework Guide

This document is a practical guide for building and deploying secure AI applications, introducing the SAIL (Secure AI Lifecycle) Framework to address the unique security challenges posed by AI technologies. It emphasizes the need for a unified approach to integrate security throughout the AI development lifecycle, highlighting the importance of governance and collaboration among various teams. The guide outlines specific AI security risks and provides actionable methodologies to ensure effective risk management and compliance in AI deployments.

Uploaded by

studentnitte
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
100% found this document useful (1 vote)
162 views41 pages

Secure AI Lifecycle Framework Guide

This document is a practical guide for building and deploying secure AI applications, introducing the SAIL (Secure AI Lifecycle) Framework to address the unique security challenges posed by AI technologies. It emphasizes the need for a unified approach to integrate security throughout the AI development lifecycle, highlighting the importance of governance and collaboration among various teams. The guide outlines specific AI security risks and provides actionable methodologies to ensure effective risk management and compliance in AI deployments.

Uploaded by

studentnitte
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

A Practical Guide

for Building and

Deploying Secure

AI Applications
1 . 0

2 0 2 5
V e r s i o n

J u n e
Acknowledgements
We would like to extend our gratitude to the following reviewers and contributors. Their

constructive input and insightful feedback were invaluable throughout the development of this

framework. We deeply appreciate their willingness to share their expertise and their commitment

to advancing AI security practices within the community.

Allie Howe, vCISO, Growth Cyber

Assaf Namer, Head of AI Security, Google Cloud

Ben Hacmon, CISO, Perion Network

Bill Stout, Technical Director, AI Product Security, Servicenow

Brandon Dixon, Former Partner AI Strategist, Microsoft

Casey Mott, Associate Director, Data & AI Security, Oscar Health

Chris Hughes, Founder, Resilient Cyber

Cole Murray, AI Consultant

Colton Ericksen, CISO, Starburst

Dušan Vuksanovic, CEO of Swisscom Outpost in Silicon Valley

Erika Anderson, Senior Security and Compliance - SAP Sovereign Cloud

Fabian Libeau, Cyber Security GTM Lead

James Berthoty, Founder & CEO, Latio Tech

José J. Hernández, CISO, Corning Inc.

Kai Wittenburg, CEO, Neam GmbH

Manuel García-Cervigón, Security & Compliance Strategic Product Portfolio Architect, Nestlé

Matthew Steele, CPO, Generate Security

Mor Levi, VP Detection and Response, Salesforce

Moran Shalom, CISO, Honeybook

Nir Yizhak, CISO & VP, Firebolt Analytics

Raz Karmi, CISO, Eleos Health

Robert Oh, Chief Digital & Information Officer (CDIO), International

Sean Wright, CISO, AvidXchange

Steve Paek, Expert- Cybersecurity (AI Security), AT&T

Steve Mancini, CISO, Guardant Health

Steven Vandenburg, Security Architect - AI, Cotiviti

Tomer Maman, CISO, Similarweb

Vladimir Lazic, Deputy Global CISO, Philip Morris International

02
Table of Contents
Executive Summary 04

Chapter 1: Introduction 06

1.1 The AI Sea Change: Why AI Security is Different 06

1.2 New Principles for the Intelligence Age 07

1.3 The Imperative for a Unified Framework 08

Chapter 2: The AI security landscape 09

Chapter 3: The SAIL (Secure AI Lifecycle) Framework 10

3.1 The AI Development Lifecycle: A New Voyage 11

3.2 The SAIL Philosophy: Guiding Principles for Secure AI 12

3.3 Overview of the SAIL Phases 13

3.4 Detailed SAIL Phases, Purposes, and Associated Risks 15

Appendix A: Definitions of AI System Components 31

Appendix B: Use cases 34

References 39

03
Executive Summary
AI is evolving faster than any previous technology wave, reshaping not only business operations but also

dramatically expanding cybersecurity threats and regulatory requirements. If you're reading this guide, you’re

already playing a pivotal role in navigating one of the most significant technological shifts of our time - the

Intelligence Age.

Organizations embrace AI primarily to automate routine tasks, enhance decision-making, drive cost

efficiencies, and unlock new revenue streams.

McKinsey’s data show that embedding governance at the C-suite level, backed by cross-functional teams

and iterative feedback mechanisms, is strongly correlated with both safer AI deployments and stronger

financial returns. From a security standpoint, the most impacted domain is data security and integrity, closely

followed by cybersecurity and privacy.

// Why SAIL Was Created and Its Role in the AI Security Ecosystem

Through extensive collaboration with AI and cybersecurity leaders - from innovative startups to Fortune

500 enterprises - we identified a critical gap. Teams required a unifying framework that could translate

high-level security principles into practical, actionable guidance across the entire AI lifecycle. These

practitioners shared not just their challenges, but the battle-tested approaches that now form the

foundation of SAIL.

The SAIL Framework addresses this need by embracing a process-oriented approach that both harmonizes

with and enhances the valuable contributions of existing standards. Its unique strength lies in embedding

security actions into each phase of the AI development lifecycle. This methodology complements the

strategic risk management governance of NIST AI RMF; the formal management system structures of ISO

42001; the critical vulnerability identification of the OWASP Top 10 for LLMs; and the essential

component-level technical risk identification provided by frameworks like the DASF. By synthesizing these

diverse perspectives through a lifecycle lens, SAIL provides an operational guide that empowers

organizations to transform security knowledge into actionable practices.

Ultimately, SAIL serves as the overarching methodology that bridges communication gaps between AI

development, MLOps, LLMOps, security, and governance teams. This collaborative, process-driven

approach ensures security becomes an integral part of the AI journey - from policy creation through

runtime monitoring - rather than an afterthought.

It provides a shared roadmap to:

Address the threat landscape using a detailed library of over 70 mapped AI-specific risks organized
across 7 interconnected phases.

04
Define the key capabilities and controls needed to build a robust AI security program.

Accelerate secure AI adoption while protecting reputation and ensuring compliance.

As a navigational chart for the AI journey, this guide is intended for security leaders, AI and Machine
Learning practitioners, MLOps, LLMOps teams, data scientists, security architects, application security
engineers, threat modelers, and compliance officers, and any individual or team involved in the design,
development, deployment, or security of AI systems.

05
Chapter 1

Introduction:

The Shifting Tides


of AI Security
The advent of advanced Artificial Intelligence, particularly Agentic AI, marks a pivotal technological shift,
comparable in its transformative potential to the rise of the internet and the proliferation of cloud
computing. This "AI sea change" fundamentally alters software development, information interaction,
and business operations, bringing with it a new frontier of complex security challenges that demands
fresh approaches.

1.1 The AI Sea Change: Why AI Security is Different


Artificial Intelligence systems, especially modern Large Language Models (LLMs) and Generative AI, possess
unique characteristics that distinguish them from traditional software. Their dynamic learning capabilities,
adaptive behaviors, and often opaque decision-making processes render conventional security measures
insufficient on their own. While established DevSecOps principles - focusing on integrating security
throughout the software development lifecycle - remain valuable, their direct application to AI systems
encounters significant limitations.

The core challenge lies in AI's departure from deterministic, code-driven logic. AI models learn from vast
datasets, can evolve post-deployment, and may exhibit emergent behaviors not explicitly programmed. This
means that:
Attack surfaces are broader and more novel: Beyond traditional code vulnerabilities, AI models
introduce risks like data poisoning, model evasion, prompt injection, and the potential for
models to leak sensitive training data or generate harmful content
Predictability is reduced: The adaptive nature of AI means its behavior can be harder to predict
and secure against unforeseen inputs or adversarial manipulations
Transparency can be limited: The "black box" nature of some complex models makes it difficult
to fully understand why an AI makes a particular decision, complicating vulnerability
assessment and incident response.

06
Consequently, standard security tools such as static/dynamic code analysis (SAST/DAST), Common

Vulnerabilities and Exposures (CVE) scanning, and network firewalls, while still vital components of a

defense-in-depth strategy, are not designed to address the nuanced, data-influenced, and behavior-centric

vulnerabilities specific to AI.

1.2 New Principles for the Intelligence Age

To effectively secure this new era of intelligent systems, we must adopt guiding principles that reflect how AI

fundamentally reshapes our understanding of software, data, and security:

Data is Executable: Prompts, configurations, and datasets aren't passive; they are active

instructions directly commanding software behavior and outcomes, redefining data's power and

risk. Malicious inputs can thus trigger unintended operations or exploit system functionalities

with unprecedented ease.

For example, when AI is integrated into legacy applications, these executable prompts flow through

datastreams not originally designed to handle them. This creates new vulnerabilities because traditional

applications were not built to treat user-supplied data as a command. Therefore, mitigations must be added to

these applications before data or prompts are sent to the back-end LLM or ML system.

Software Has Agency: AI evolves from a predictable tool to an intelligent agent, autonomously

making decisions, learning, and adapting. This agency introduces novel risks related to

unintended consequences and autonomous actions, demanding continuous oversight and

robust guardrails. Unlike traditional software that changes only through code deployments, AI

systems can shift their behavior through learning and adaptation—even without code changes.

For example, AI agents automating workflows can be 'socially engineered' via techniques like Business Process
Compromise (BPC), which corrupts core operations. This elevates risk to the business layer and highlights a
new dependency stack: the business relies on data integrity, which in turn relies on the secure functioning of
the application and infrastructure.

Furthermore, the probabilistic nature of AI agents clashes with processes that demand transactional integrity.
An agent might execute a complex, multi-system transaction based on a misinterpreted prompt or a simple
typo. Because these actions are often difficult or impossible to roll back across multiple systems, especially in
orchestrations involving multiple agents and tools, such errors can have significant and lasting consequences.

07
Development is Redefined: AI systems are assembled, trained, and prompted, not just
traditionally coded. This shift towards iterative guidance (sometimes dubbed 'vibe coding') and
sophisticated prompt engineering demands new methods for creation, verification, and
securing the development pipeline itself.

For example: foundational models, which form the base of many modern AI systems, cannot yet be fully
trusted, as a comprehensive standard for their security and verification does not yet exist. Organizations often
inherit the vulnerabilities and biases of these pre-trained models, creating a critical dependency on a supply
chain that lacks transparency and robust security guarantees.

Security Becomes Foundational: When data can execute, software possesses agency, development
methods are transformed, and the underlying ecosystem is novel, security cannot be an afterthought
or a peripheral layer. It must be intrinsically woven into the fabric of AI systems from their very
inception, underpinning every component and process.

1.3 The Imperative for a unified and process-oriented framework


These transformative principles create an unprecedented shared challenge. AI teams, driven to innovate at
light speed, often operate under immense pressure. Simultaneously, security teams are tasked with
protecting against novel, rapidly evolving threats, frequently with tools not designed for this new paradigm.
When these teams work in silos, the inherent complexities and risks are dangerously amplified. A common
language and a unified framework are therefore not just beneficial, but vital to navigate this landscape
cohesively and securely. This is precisely the role the SAIL (Secure AI Lifecycle) Framework is designed to
fulfill, offering a comprehensive methodology to manage AI-specific risks effectively across

the entire AI lifecycle.

08
Chapter 2

The AI Security Landscape:


Establishing a Common
Understanding of AI Risks
AI security introduces a host of new terminology, guidelines, and frameworks. To foster a clear, shared
understanding between security and AI teams, this chapter defines 11 core risk categories. These are critical
for any organization to consider before moving AI systems into production. The identified risk categories are
distilled from established and emerging industry resources, including MITRE ATLAS, the NIST AI Risk
Management Framework (AI-RMF), OWASP and relevant standards like ISO 42001.

This common understanding of potential threats and vulnerabilities is the crucial first step. It provides the
necessary context before leveraging the SAIL (Secure AI Lifecycle) Framework, which offers a structured
methodology (detailed in subsequent chapters) to proactively manage these risks throughout the entire AI
lifecycle.

Risk Category What It Means in Practice Impact


1
Tricking AI with malicious prompts Data leaks, unauthorized actions,
Prompt Injection
to bypass safeguards, reveal data, harmful content, system
& Manipulation
or execute harmful actions. compromise, reputational damage.

2
Corrupting training data to Flawed model behavior,
Training Data
embed biases, backdoors, or biased outcomes, exploitable
Poisoning
vulnerabilities into the AI model. vulnerabilities, loss of trust.

3
AI models unintentionally leaking Data breaches, privacy
Sensitive Information
confidential data (PII, trade secrets) violations, regulatory fines, loss
Disclosure
learned during training/interaction. of IP, reputational damage.

4
Crafting slightly altered inputs to Bypassing security, erroneous
Model Evasion
deceive AI models into making decisions, safety risks, system
(Adversarial Attacks)
incorrect classifications or decisions. malfunction.

09
Risk Category What It Means in Practice Impact

5
Model Theft &
Stealing or reverse-engineering Loss of IP/competitive edge,
IP Extraction
proprietary AI models, algorithms, financial loss, unauthorized
or parameters. model use.

6
Insecure Output Using unvalidated AI outputs in Error propagation, exploitation
Handling & other systems, leading to of connected systems, flawed
Downstream Risks downstream vulnerabilities. decisions, security breaches.

7
AI creating realistic fake content Disinformation, fraud,
Malicious & Deceptive
(e.g., deepfakes) for disinformation, reputational harm, social
Content Generation
fraud, or impersonation. unrest, erosion of trust.

8
Exploiting vulnerabilities in third- System compromise via tainted
AI Supply Chain
party AI components (models, components, data breaches, model
Vulnerabilities
data, tools, APIs). poisoning, widespread effects.

9
Uncontrolled Exploiting AI to exhaust resources Service outages, excessive
Resource (CPU, memory), causing Denial of costs, system instability,
Consumption & DoS Service (DoS) or high costs. operational disruption.

10
AI Agent & Manipulating AI agents or Physical harm, mission failure,
Autonomous autonomous systems (robots, unauthorized surveillance,
System Exploitation drones) to cause harm or leak data. critical system disruption.

11
Insecure AI System Core flaws in AI system/model Broad vulnerabilities, increased
& Component architecture, configuration, or attack surface, difficult
Design security controls. remediation, systemic weaknesses.

The 11 core risk categories detailed above provide a foundational understanding of the AI-specific threat
landscape. These risks are not isolated; they can manifest and have implications across various phases of
an AI system's lifecycle – from initial design and data acquisition through development, deployment and
day-to-day operation.

Furthermore, a challenge not fully addressed by many current standards is the architectural risk of
integrating the unpredictable, inconsistent output of probabilistic AI with programmatic systems that
expect deterministic, predictable input.

The SAIL Framework is specifically designed to mitigate this risk. It provides a methodology for unifying
and overlaying security practices across both the AI and traditional software development lifecycles,
ensuring this fundamental mismatch is managed from the start

10
Chapter 3

The SAIL (Secure AI


Lifecycle) Framework:
Navigating the Waters
3.1 The AI Development Lifecycle: A New Voyage
AI systems follow a distinct development path, illustrated in the AI Development Lifecycle diagram

(Figure 3.1). It introduces a fundamentally new lifecycle that intertwines with, yet distinctly differs from,
conventional software development practices. While integrating elements from traditional software
development, this AI lifecycle significantly expands upon them due to its data-centricity, iterative model
evolution, and unique operational needs. This AI-specific journey is not isolated; it's deeply intertwined with
the broader Software Development Lifecycle that manages associated applications and infrastructure.

AI Development AI
Lifecycle Sandbox
Operate

Runtime Guardrails Activity Tracing


Monitor
Deploy

Software
AI Policy
n
Pla
Code/No Code

Test

Al Discovery AI Red Teaming

Build

AI-SPM

AI Development Lifecycle Software Development Lifecycle

Figure 3.1
11
The SAIL (Secure AI Lifecycle) Framework addresses the imperative for holistic security across these
interconnected lifecycles. It provides specialized security controls tailored to the unique demands of the AI
lifecycle - such as its reliance on vast datasets, potential for autonomous decision-making, and novel attack
vectors - while ensuring these measures are harmonized with established security practices for traditional
software components. This integrated approach prevents security silos, acknowledging that AI development
is a new voyage that expands upon established software engineering principles.

3.2 The SAIL Philosophy: Guiding Principles for Secure AI


The SAIL Framework's philosophy extends traditional security to AI's unique challenges, emphasizing a
proactive, comprehensive, and adaptive approach through these core security principles:

Secure by Design & Default: Proactively embed security from AI conception, including threat

modeling and secure data governance before development

Privacy by Design & Data Minimization: Limit data collection to what’s strictly necessary, apply

default anonymization, and enforce retention caps, shrinking the attack surface and honoring

user autonomy from the start

Continuous Model & System Assurance: Implement real-time monitoring of AI model behavior,

data integrity, and infrastructure for drift, attacks, and anomalies

Adaptive Defense & Response: Enable rapid reaction to newly discovered vulnerabilities in AI

components, models, or data pipelines

Robust Lifecycle Security Controls: Integrate comprehensive, testable security measures

throughout AI development, from secure coding to adversarial testing and runtime protection

Cross-Functional Collaboration & Governance: In the AI era, security responsibility must be

clearly distributed across teams and vendors. A proper RACI ensures data and ML engineers

execute securely, the CISO signs off on risk and compliance, legal and business units provide

oversight and context, and leadership stays informed to support and scale securely

Purpose-Built AI Security Tooling: Leverage specialized tools for unique AI security challenges

like model scanning, adversarial robustness testing, and AI-specific attack monitoring.

Central to the SAIL philosophy is “Shift Up,” an evolution of the classic shift-left mindset for the AI era. Shift-
left works well in deterministic software, but AI has changed how systems are built: it inserts new
abstraction layers where humans guide systems that write code, make autonomous decisions, orchestrate
complex tasks, and create content at scales beyond human review. When a model produces thousands of
lines of code, flags millions of financial transactions, or powers thousands of concurrent customer chats,
manual controls alone no longer suffice.

Security must elevate its focus to these new AI-driven layers of abstraction, shifting protection from the
code level to the business logic and processes that AI now controls. “Shift Up” meets that need by adding
automated, purpose-built controls at the AI layer. Whereas the traditional security plane runs horizontally
(development → testing → runtime), Shift Up introduces a critical vertical axis. AI pushes risk upward and
exposes a new dependency stack, so a flaw in infrastructure, application, or data can instantly compromise
autonomous operations.

12
As Figure 3.2 shows, this extends protection beyond familiar elements - data pipelines, model inference - to
the AI's generative capabilities themselves. The SAIL goal is to actively secure the entire AI lifecycle,
addressing both runtime threats like adversarial attacks and the unique challenge of securing systems whose
outputs we cannot fully review, ensuring the reliability of AI's expanding role in critical operations.

Autonomous
Decisions

Unhuman
Opaque
scale
Decisions
Ab L gic
Bu

AI ct
st ay and
si
ne

ra er Pro
ss
Level of Abstraction

Lo

io esse
n
c
s

S H I F T U P S H I F T U P

S H I F T L E F T S H I F T R I G H T

Software
AI

Development
Development

Lifecycle Lifecycle

Figure 3.2

3.3 Overview of the SAIL Phases


The SAIL Framework is structured around seven foundational phases, guiding organizations through a
comprehensive secure AI lifecycle: Plan, Code/ No Code, Build, Test, Deploy, Operate, Monitor

1. AI Policy & Safe experimentation (Plan): This foundational phase establishes AI security policy
frameworks aligned with business objectives, regulatory requirements, and overall AI governance. It
covers identifying AI use cases, assessing compliance needs, defining risk-based protection, and
setting up secure AI experimentation environments for policy alignment validation. This phase
incorporates dedicated threat modeling to proactively identify novel failures and inform architecture
decisions. It also establishes initial data and model governance definitions, formalizing the
introduction and vetting processes for new data or models.

13
2. AI Asset Discovery (Code/ No Code): This initial phase focuses on identifying, cataloging, and
vetting all AI assets - including models, datasets, no code platforms and code components, whether
developed in-house or sourced externally. This comprehensive inventory is crucial not only for
understanding the AI system's composition and potential vulnerabilities but also for meeting
emerging AI regulatory requirements.

3. AI Security Posture Management (Build): The Build phase is dedicated to performing a deep risk
analysis of the AI assets identified in the discovery phase. It involves intelligently understanding,
mapping, and graphing the landscape of these AI assets and their interconnections to establish a clear
picture of the system's security posture and potential attack surfaces. Using protection requirements
from the Plan phase, organizations can prioritize security controls for each AI asset based on risk
levels and identify residual risks.

4. AI Red Teaming (Test): In the Test phase, AI systems undergo rigorous security assessments that
simulate adversarial behaviors to uncover vulnerabilities, weaknesses, and risks. Unlike traditional AI
testing focused on functionality and performance, AI Red Teaming goes beyond standard validation to
include intentional stress testing, simulated attacks, and attempts to bypass safeguards, alongside
validating security configurations (hardening). The depth and intensity of red teaming activities should
align with the protection requirements of the AI-supported business processes, ensuring appropriate
testing rigor for each risk level.

5. Runtime Guardrails (Deploy): The Deploy phase ensures that AI systems are released into
production with necessary runtime guardrails and security configurations activated. These measures
are critical for the secure transition and ongoing operation, providing protection against runtime
application security threats that may emerge once the system is live.

6. Safe Execution Environment - Sandbox (Operate): During the Operate phase, AI systems,
particularly agentic systems like coding agents and AI tools like MCP servers, run within secure

and controlled execution environments. This phase implements sandboxing and zero-trust strategies
to isolate AI agents from critical infrastructure and sensitive data while enabling

their productive operation.

7. AI Activity Tracing (Monitor): This phase continuously monitors system activity and collects
telemetry. It is essential for detecting anomalies or potential attacks, also for generating audit trails
and evidence required for regulatory [Link] phase triggers automated responses such as
containment or rollback upon detection. Monitoring also identifies when end-of-life conditions are
met, initiating structured decommissioning procedures to safely archive relevant components and
formally close the lifecycle loop.

14
This phased approach systematically integrates AI-specific security checkpoints into the AI lifecycle, making it
actionable for AppSec, MLOps, and AI practitioners alike. By addressing security at each stage, organizations
can proactively build a tailored AI security roadmap, leading to more resilient and trustworthy AI systems.

3.4 Detailed SAIL Phases, Purposes, and Associated Risks


At its core, SAIL is structured around seven lifecycle phases, addressing more than 70 mapped risks across
the AI development and deployment pipeline. These help define the key capabilities needed to build a robust
AI security roadmap.

To effectively understand and address these risks across the SAIL phases, it's essential to recognize the core
components that form the building blocks of AI systems, as each presents its own potential attack surface.
The following list outlines these fundamental AI assets, which are central to the risk discussions and 'Assets
Affected' within each detailed phase description that follows. Detailed definitions for these AI System
Components can be found in Appendix A.

// The core components are

AI Model AI App AI Access Credentials 3rd-party AI Integration

System Prompt / Meta prompt Tool / Function Dataset / RAG

User Prompt Model Response Notebook MCP Server

Coding Agent (config) Model Metadata Model Files Framework

Agentic platform (no code) Pipeline Job AI Platform Agent Memory / Cache

App Usage Log Model Inference Endpoint AI Policy

We welcome your feedback, suggestions, and insights to ensure that the SAIL Framework remains a valuable,
up-to-date, and practical resource for the entire AI and cybersecurity community

15
// Phase 1
AI Policy & Safe experimentation (Plan)
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Inadequate AI policy lacks critical AI policy missing AI Policy, AI Regular policy
ISO-A.2.2, A.2.4 |
1.1 AI Policy elements or hasn't deployment platform, AI review cycles.
NIST:GOVERN 1.2,
been updated to guidelines, leading to App, 3rd-party Map to current GOVERN 1.4
reflect current AI unsafe model releases AI integration regulation, include
capabilities, without required emerging AI tech.

regulations, or safety checks. Stakeholder

organizational feedback loops.

changes. Version control.

SAIL Governance AI policy conflicts AI policy allows cloud AI Policy, Data Cross-functional
ISO-A.2.3 | NIST-
1.2 Misalignment with or doesn't processing while data governance policy review.
GOVERN 1.2,
integrate with policy prohibits it, docs, Security Policy mapping matrix.
GOVERN 1.4

existing security, causing compliance policies Integrated governance | DASF:


privacy, or data violations. framework.
GOVERNANCE 4.1,
governance policies. Regular alignment 4.2
checks.

SAIL Inadequate Organization fails to Company misses EU AI Policy, Regulatory monitoring. ISO-4.1, 4.2 | NIST-
1.3 Compliance identify or map all AI Act requirements Compliance Compliance matrix. GOVERN 1.1, MAP 1.1

Mapping applicable AI for high-risk AI docs, Risk Legal consultation. | DASF: PLATFORM
regulations and systems, facing register Automated regulation 12.6
requirements to regulatory penalties. tracking.

policies and controls. Periodic gap analysis.

SAIL Undefined Risk


Lack of clear criteria Critical healthcare AI Risk Define risk tolerance ISO-6.1.1, A.5.2 | NIST-
1.4 Tolerance & for AI risk tolerance system classified as framework, AI thresholds.
GOVERN 1.3, MAP 1.5
Categorization and classifying AI "regular," missing inventory, Establish risk
systems by risk level required safety Impact categories with

(regular/high/critical). controls. assessments clear criteria.

Impact assessment
process.

Classification
guidelines.

SAIL Unmonitored AI Unauthorized/hidden Data scientist runs AI platform, Require registration/ ISO-A.3.2, A.6.1.3 |

1.5 Experimentation “shadow” LLM playground on Notebook, approval of experiment NIST-GOVERN 1.6,
experimentation personal VM with Model files sandboxes.
GOVERN 4.3
environments bypass customer data Asset inventory.

controls, risking Alert on new/rogue


regulatory, security, environments.

and data exposure. Periodic discovery


scans.

Log analysis

SAIL Insecure Experiment logs are Debug logs from an App Usage log, Enforce log
ISO-A.6.2.8, A.8.3 |
1.6 Experiment world-readable, experiment include Notebook access control.
NIST-GOVERN 4.2,
Logging & disabled, or stored real user data and are Redact/mask
MEASURE 3.1
Monitoring insecurely, risking accessible to all users. sensitive data.

untraceable incidents Enable log monitoring/


or leakage. tamper detection.
Regular log review.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
16
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Overly Users/code have Researcher runs AI platform, Principle of least ISO-A.3.2, A.4.6 |
1.7 Permissive admin/root rights in experiment as root, Notebook privilege.
NIST-GOVERN 2.1,
Permissions in experimentation accidentally wipes RBAC.
3.2 MEASURE 2.7

Experimentation environments, risking shared storage. No-root-by-default. | DAST: RAW DATA


privilege escalation or Periodic access reviews.
1.1, PLATFORM 12.4
lateral movement. Enforce sandbox policy.

SAIL Experiment Model outputs, logs, Logs with real Model Output DLP/filtering. ISO A.5.4, A.7.5 |
1.8 Output Data or files generated by customer info are Response, App Redact logs.
LLM02:2025 | NIST-
Leakage experiments leak PII accessible via shared Usage log, Monitor for
MEASURE 2.10,
or confidential data. folder. Notebook sensitive output.
MANAGE 1.4

Restrict downloads/ | DAST: MODEL 7.2


exports.

SAIL Unauthorized / Experiment involves Teams import AI Model


Generate AI SBOM/ ISO A.6.2.2 , A.10.3 |
1.9 Prohibited the use of unvetted or disallowed Model Files
BOM at experiment NIST MAP 4.1,
Component unauthorized or models, datasets, or Framework
start and on every MANAGE 3.1

Usage prohibited libraries during Dataset / RAG


change
| DAST: MODEL 7.3,
components experimentation, 3rd-party AI Enforce allow-/deny- ALGORITHMS 5.4
creating vulnerability, Integration
lists in sandbox
licence, or export- AI Policy environments

control risks. Use CI/CD gating for


SCA and license
scanning

SAIL Incomplete AI threat models are An AI agent chain is AI policy, Apply AI-specific ISO A.6.2.2,

1.10 Threat absent, generic, or fail deployed without System Prompt threat modeling A.6.2.3 | NIST: MAP
Modeling for to capture the unique identifying risks from / Meta prompt, methods (e.g., OWASP 1.6, 2. MEASURE 2.7
AI Systems architectures, data indirect tool Dataset / RAG, MAS, MITRE ATLAS).

flows, and attack invocation or multi- Tool / function, Refresh threat models
surfaces of AI systems agent task Agentic as systems evolve.

- leading to design- decomposition, platform


Involve cross-
phase blind spots and leading to unforeseen (no code) functional teams in
misaligned security privilege escalation. modeling exercises.
controls

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
17
// Phase 2
AI Asset Discovery (Code/ No Code)

ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Incomplete Not all AI assets are An undocumented AI All assets Conduct regular, ISO-A.4.2, A.6.2.3 |
2.1 Asset identified and model processing comprehensive AI NIST-GOVERN 1.6,
Inventory cataloged, leading to customer data exists asset discovery audits. 
MAP 1.1
security blind spots. in a development Implement automated
environment, discovery tools. 

unknown to security Maintain a centralized


teams. AI asset registry.

SAIL Shadow AI AI systems or A marketing team uses Notebook, Enforce clear AI ISO-A.3.2, A.2.2 |
2.2 Deployment components are a no-code AI platform Coding agent governance policies NIST-GOVERN 1.3,
developed and/or to build a customer (config), Agentic and approval
GOVERN 4.3
deployed informally sentiment analyzer platform (no processes for any AI
without official with company data, code), AI experimentation or
oversight, sanction, or bypassing IT and Platform deployment.

adherence to security review. Promote awareness

governance policies. of AI policies

Use discovery tools to


identify unauthorized
AI activities.

SAIL Unidentified Existing integrations A legacy application is 3rd-party AI Perform thorough code ISO-A.10.3, A.4.2 |
2.3 Third-Party AI with external AI found to be using an integration, AI and configuration LLM03:2025 | NIST-
Integrations services, libraries, or old, unmaintained App, Pipeline reviews to identify all GOVERN 6.1, MAP 4.1

data sources are not third-party AI library Job external dependencies.


| DASF: MODEL 7.3
discovered or for a minor feature, Implement Software
documented, meaning which has known Composition Analysis
their associated risks vulnerabilities. (SCA) tools.

are unassessed. Review vendor


contracts and service
agreements.

Document all third-


party resources.

SAIL U ndocumented The pathways by An AI system is Dataset/ RAG, Ma p data flows for all ISO-A.7.5, A.4.3 |
2.4 Data Flows and which data enters, is discovered, but it's AI App, Pipeline discovered AI systems.
NIST-MAP 1.6, MAP
Lineage processed within, and unclear where its Job, 3rd-party Implement data 4.2 | DASF: RAW DATA
exits AI systems training data AI integration lineage tracking tools 1.6, GOVERNANCE
(including RAG originated or where its and processes.
4.1
sources) are not fully output data is being Document data
mapped or sent, hindering privacy provenance and data

understood, impact assessment. management processes

obscuring potential for all identified data

data leakage points or resources.

non-compliance.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
18
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Lack of Clarity AI assets are A discovered AI model AI App, Model For each discovered
ISO-A.6.2.2, A.4.2
2.5 on AI System identified, but their is cataloged, but its Files, AI AI asset, document its A.5.2 | NIST-MAP 1.1,
Purpose and specific business function (e.g., critical Platform intended purpose, MAP 1.4
Criticality purpose, intended decision support vs. users, and business
use, and overall minor automation) impact.

criticality to the isn't known, leading to Informs risk


organization are not misprioritized security assessment and impact
clearly understood or efforts. assessment.

documented.

SAIL Overlooked Failing to identify


A newly procured AI App, 3rd- Scrutinize ISO-A.10.3, A.4.2 |
2.6 Embedded or AI capabilities CRM system has an party AI documentation and LLM03:2025 | NIST-
Inherited AI embedded within undocumented AI- integration conduct technical MAP 2.1, GOVERN 6.1
Functionality larger, non-AI-explicit powered predictive assessments of all
commercial off-the- analytics feature that software/services to
shelf (COTS) software processes sensitive identify embedded AI.

or managed services. customer data. Include AI


considerations in
vendor procurement
and assessment
processes.

SAIL Discovery of Identifying AI models, A data science team Model Files, Establish clear ISO-A.6.2.6, A.3.2 |
2.7 Outdated or datasets, or tools that built an experimental Dataset/ RAG, ownership and lifecycle NIST-GOVERN 1.7,
Orphaned AI are no longer actively model two years ago; Notebook, AI management for all AI MANAGE 2.2
Assets maintained, the team members Platform assets from discovery.

supported, or have have left, and the Implement processes


clear ownership, model is still running for decommissioning or
posing unmonitored on an old server with archiving orphaned
security, compliance, unpatched assets.

or operational risks. vulnerabilities. Regularly review asset


inventory for outdated
components.

// Phase 3
AI Security Posture Management (Build)

ID Risk Description Example Affected Mitigation Standards Mapping**

SAIL Data Poisoning Intentional or Adversary alters Dataset / RAG Implement stringent ISO-A.7.2, A.7.4 |
3.1 and Integrity unintentional training, fine-tuning, data validation, LLM04:2025 | NIST-
Issues corruption of data or context data to sanitization, and MAP 2.3, MEASURE
used for training, fine- cause harmful or integrity checks.
2.11 | DASF:
tuning, or context biased model outputs. Ensure data quality
DATASETS 3.1, RAW
retrieval (e.g., RAG), and provenance .
DATA 1.7
which can manipulate Secure data pipelines.

model behavior, Conduct regular audits


create backdoors, or of training data
degrade performance. sources.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
19
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Model Backdoor Malicious code or A compromised open- Model files, Secure the development ISO-A.6.2.4, A.7.2 |
3.2 Insertion or vulnerabilities source library used in AI Model environment.
LLM04:2025 | NIST-
Tampering embedded into the training injects a Use trusted, scanned MEASURE 2.7, MAP
model during training backdoor into the final libraries/frameworks.
4.2 | DASF: MODEL
or fine-tuning, or model. Implement model 7.1
unauthorized integrity checks
modification of model (hashing, signatures).

artifacts. Conduct security testing


and code reviews for AI
components.

Document AI system
design and development.

SAIL Vulnerable AI Use of AI frameworks An attacker leverages Framework Regularly scan/patch ISO-A.10.3, A.4.4 |
3.3 frameworks or libraries with a deserialization frameworks and LLM03:2025 | NIST-
and libraries known or unknown vulnerability in a dependencies.
GOVERN 6.1,
vulnerabilities that popular ML Maintain a Software MEASURE 2.7 | DASF: 

can be exploited to framework to Bill of Materials MODEL 7.3,


compromise the AI execute arbitrary (SBOM).
ALGORITHMS 5.4

system or underlying code on the server. Use frameworks from


infrastructure. trusted sources.
Minimize attack
surface by only
enabling necessary
modules.

SAIL Insecure Poorly designed A system prompt for System Prompt Employ robust prompt ISO-A.6.2.3, A.8.2 |
3.4 System system prompts that an LLM includes / Meta prompt engineering techniques.
LLM07:2025 | NIST-
Prompt are easily bypassed, internal API endpoint Sanitize user inputs MAP 2.2, MEASURE
Design manipulated details that a user intended for prompts.
2.9 | DASF:

(jailbreaking), or that extracts via a crafted Minimize sensitive data MODEL SERVING 9.1

inadvertently leak query. in prompts Iteratively


sensitive contextual test prompts for
information or vulnerabilities.

instructions. Document prompt


design and rationale.

SAIL Insecure ML & Misconfigurations or An ML pipeline job Pipeline Job, Enforce least privilege ISO-A.6.2.6, A.7.2 |
3.5 Data Pipeline insufficient security in with overly permissive Coding agent for pipeline jobs.
NIST-MEASURE 2.7,
Jobs ML and data pipeline IAM roles allows a (config), Implement artifact MAP 4.2
jobs, leading to risks compromised step to Dataset / RAG, integrity checks.

like code injection, exfiltrate model Model files, Use secure coding for
unauthorized model artifacts or sensitive Model pipeline scripts.

promotion, or data. metadata Audit and monitor


credential exposure. pipeline activities and
accesses.

SAIL Intellectual Unauthorized An insider with access Model files, Implement strong ISO-A.6.2.4, A.10.2 |
3.6 Property (IP) copying, extraction, to model repositories AI Model access controls to NIST-MEASURE 2.7,
Theft of or reverse- exfiltrates a valuable model artifacts and MANAGE 1.4 | DASF:
Models engineering of proprietary model training environments. MODEL
proprietary trained before it's secured for Encrypt models at rest. MANAGEMENT 8.2
models during the deployment. Use watermarking or
development or pre- obfuscation
deployment stages. techniques.

Enforce legal
agreements/NDAs.
Monitor access to
model repositories.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
20
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Misclassified or Sensitive data is Sensitive user data is Dataset / RAG, Implement and enforce ISO-A.7.3, A.7.6 A.5.2 |
3.7 Undocumented misclassified, used for fine-tuning Model strict data classification LLM02:2025 | NIST-
Sensitive Data undocumented, or without being metadata, policies.
MEASURE 2.10, MAP
Usage used without proper documented or Model files, Train personnel on 5.1 | DASF: RAW DATA
authorization, leading classified, resulting in App Usage log data handling and 1.2, DATASETS 3.2
to security or lack of controls and classification.

compliance risks. auditability. Validate data


classifications during
discovery audits.

Document data
resources thoroughly

SAIL Insufficient Lack of clearly No one is accountable Model files, Define and allocate ISO-A.3.2, A.4.6, A.9.3
3.8 Human assigned roles, for reviewing bias or Dataset / RAG, clear roles/ | NIST-GOVERN 3.2,
Oversight in responsibilities, or fairness in the model Model responsibilities for AI MAP 3.5 | DASF:
Model oversight processes development process. metadata development.
MODEL
Development during model Ensure human MANAGEMENT 8.3

development, leading oversight for

to missed security or trustworthiness is

ethical risks. documented and

required at appropriate

checkpoints.

SAIL Insecure Temporary files, Preprocessed sensitive Dataset / RAG, Apply strict access ISO-A.7.4, A.4.5 |
3.9 Temporary caches, or training data is left in a Model files, controls to temporary LLM02:2025 | NIST-
Artifacts or intermediate datasets world-readable Agent Memory storage.
MEASURE 2.10,
Intermediate generated during scratch directory after / cache Automatically clean up MEASURE 2.7

Data Storage model training or data training. sensitive artifacts after

processing are not processing.

securely managed, Encrypt intermediate

potentially exposing files if they contain

sensitive data or sensitive data.

models. Monitor storage

locations for

unauthorized access.

SAIL tt
Unve ed Use Incorporation of Using a pre-trained Model files, V t all third-party/open-
e ISO-A.10.3, A.6.2.3,
3.10 of Open -Source external libraries, pre- model from a public Framework, source components A.4.3 | LLM03:2025 |
and Th ird-Part y trained models, or repo that contains a 3rd-party AI before use.
NIST-GOVERN 6.1,
AI Components data without backdoor or is licensed integration, Mai tai a Bill of
n n MANAGE 3.1 | DASF:
sufficient security, incompatibly. Dataset / RAG Materials (SBOM).
MODEL 7.3,
privacy, or Regularly monitor for ALGORIT HMS 5.4

compliance review, vulnerabilities.

leading to inherited R vi w licensing and


e e

vulnerabilities or legal compliance.

risk. D oc m t all
u en

dependencies and their


provenance.

SAIL Exposed or Credentials for A script for model Coding agent Scan code and build ISO

3.11 Hardcoded accessing data training is found to (config), artifacts for A.6.2.4, A.6.2.5 | NIST-
Credentials in sources, APIs, or contain hardcoded Notebook, credentials.
MEASURE 2.7, MAP
Build Arti acts
f deployment AWS access keys. Model Use secrets 4.2

environments are left metadata, management tools.

embedded in code, J
Pipeline ob, Enforce policies
configuration files, or AI access prohibiting hardcoded
artifacts created credentials credentials.

during the build Regularly audit and


process. rotate credentials.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
21
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Failure to Security, privacy, or A model is trained Model files, Specify and document ISO-A.6.2.2, A.6.1.2 |
3.12 Specify or operational without any Dataset/ RAG, clear AI system NIST-MAP 1.6,
Enforce requirements are not requirements for Framework requirements including GOVERN 1.2
Secure Model specified or enforced robustness, leading
security, privacy, and
Requirements for models being to easy adversarial robustness.

built, resulting in evasion after Validate model

insecure-by-default deployment. against requirements


models. during build.

Involve AppSec and


GRC in requirements
review.

SAIL Insufficient Failure to clearly An AI-powered AI App, Model For each AI system, ISO-A.6.2.3, A.4.2 |
3.13 Understanding define the complete recommendation Inference meticulously map its NIST-MAP 2.1, MAP
of AI System boundaries of a engine is identified, endpoint, architecture, 4.1
Boundaries discovered AI system, but its reliance on a Pipeline Job, components, and all
including all its separate, less secure 3rd-party AI internal/external
components, microservice for data integration interfaces.

interfaces, and direct ingestion is missed. Document system and


dependencies. computing resources,
and tooling resources.

SAIL Exposed AI During the discovery An old Jupyter AI access Implement secure ISO-A.4.5,

3.14 Access of assets (code, notebook discovered credentials, credential management A.6.2.4

Credentials in configurations, on a shared drive Notebook, practices from the outset.


NIST MEASURE 2.7,
Discovered documentation), contains hardcoded Coding agent Use secrets management GOVERN 4.2
Assets sensitive AI API keys to a cloud AI (config), Model tools.

credentials (API keys, service. metadata Scan discovered code and


tokens, passwords) configurations for
are found to be hardcoded secrets.

insecurely stored or Enforce policies against


embedded. insecure credential
storage.

Resource documentation
should not contain
exposed secrets.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
22
// Phase 4
AI Red Teaming (Test)

ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Untested Model or major Red team review is Model files, Require formal ISO -A.6.2.4, A.6.1.3 |
4.1 Model model-version skipped; prompt Pipeline Job adversarial testing and NIST-MEASURE 2.1,
undergoes insufficient injection or evasion documented red-team MEASURE 2.5 | DASF:
ID Risk Description Example Affected Mitigation Standards Mapping**
or undocumented vulnerabilities remain evidence before PLATFORM 12.2
adversarial undiscovered. approval.

SAIL Untested evaluation.


Model or major Red team review is Model files, Automate checks for
Require formal ISO -A.6.2.4, A.6.1.3 |
4.1 Model model-version skipped; prompt Pipeline Job test coverage in CI/CD.
adversarial testing and NIST-MEASURE 2.1,
undergoes insufficient injection or evasion documented red-team MEASURE 2.5
or undocumented vulnerabilities remain evidence before
SAIL Incomplete adversarial
Only core model undiscovered.
Plugin flaw lets Framework, approval.

Inventory all tools/ ISO-A.6.2.4, A.9.2 |


4.2 Red-Team evaluation.
tested; agent/tool- attacker hijack AI Tool / function, Automate checks
agents; include for
system- LLM06:2025 | NIST-
Coverage calling, plugins, or assistant. System Prompt test
levelcoverage in CI/CD.
attack paths in MEASURE 2.4, MAP
system prompts / Meta prompt threat scenarios. 2.1 | DASF:
excluded—leaving Simulate multi-agent PLATFORM 12.2
SAIL Incomplete lateral
Only core hained
or cmodel Plugin flaw lets Framework, and
Invetool
ntormisuse.
y all tools/ ISO-A.6.2.4, A.9.2 |
4.2 Red-Team ttack pat
atested; hs. /tool-
agent attacker hijack AI Tool / function, agents; include system- LLM06:2025 | NIST-
Coverage calling, plugins, or assistant. System Prompt level attack paths in MEASURE 2.4, MAP
system prompts / Meta prompt threat scenarios. 2.1
SAIL Lack of Risk excluded—leaving
Inconsistent One team only tests No core AI Simu
A late multi-agent
dopt a red-team ISO-A.5.2, A.6.2.4 | N/
4.3 Assessment lateral or chained
methodolog y, bias; another only components and tool misuse.
playbook/checklist A | NIST-MEASURE
process attack pathand
coverage, s. jailbreaks. directly (e.g., MITRE ATLAS, 1.1, GOVERN 1.3
severity scoring affected - OWASP).

across teams; relates to M aintain severity


SAIL Lack of Risk evidence may be
Inconsistent One team only tests testing
No coreprocess
AI taxonomy.

A o t
d p a red-team ISO-A.5.2, A.6.2.4 | N/
4.3 Assessment incomplete
methodologory, non- bias; another only components Train red-team staff.

playbook/checklist A | NIST-MEASURE
process comparable.
coverage, and jailbreaks. directly

(e.g., MITRE ATLAS, 1.1, GOVERN 1.3


severity scoring affected - OWASP).

across teams; relates to M aintain severity


evidence may be testing process taxonomy.

SAIL Missing incomplete


Test findings,oranon-
ttack C ritical vuln discussed App Usage log Srai
T
toren red-team
all engagements
staff.

ISO-A.5.3, A.6.2.7 |
4.4 Documented comparable.
data, and replay steps in Slack but never in version-controlled

NIST-MEASURE 2.1,
Evidence of not centrally stored; logged. repo.
GOVERN 4.2

Red Teaming/ compliance cannot be Tag with model/date/

Risk demonstrated. tester.

Assessment Enforce retention


SAIL Missing Test findings, attack C ritical vuln discussed App Usage log Store all engagements ISO-A.5.3, A.6.2.7 |
4.4 Documented data, and replay steps in Slack but never policy.
in version-controlled NIST-MEASURE 2.1,
Evidence of not centrally stored; logged. repo.
GOVERN 4.2

Red Teaming/ compliance cannot be Tag with model/date/

SAIL Risk
Outdated Risk
demonstrated.
Security testing and Retrained model or Model Files, tester.

D e ne
fi triggers for
ISO-A.5.2, A.6.2.4 |
Assessment Enforce retention
4.5 Assessment risk evaluation are not updated prompt Pipeline Job re-assessment.
NIST-MEASURE 3.1,
updated a er major
ft introduces a policy.
Require automated GOVERN 1.5

model, data, tool, or previously fixed regression and red-

prompt changes, jailbreak or bias issue. team testing after

leaving new significant changes.

vulnerabilities Upd ate risk analysis

undetected. regularly.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0

23
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Insecure Test payloads, exploit Sensitive exploit Notebook, App Ticket-based shred/ ISO-A.4.5, A.6.2.7 |
4.6 Storage of
scripts, or reports are notebook remains Usage log archive.
NIST-MEASURE 2.7,
Red Teaming stored without proper accessible on a shared Artefact TTL.
GOVERN 4.2
Artifacts security controls, drive or repo after Store test Artifacts in
creating insider or testing. encrypted vault.

supply-chain risk. Auto-cleanup.

SAIL Insufficient Red-team testing Malicious image or Model Add multimodal attack ISO-A.6.2.4, A.7.2

4.7 Multimodal misses risks unique


audio triggers model Inference simulations to red- NIST-MEASURE 2.3,
Security to models handling to leak data or bypass endpoint team scope.
MEASURE 2.5
Testing images, audio, or controls. Test for injection and
video. content abuse in

all formats.

Require manual review


for high-risk outputs.

SAIL Limited Security testing Harmful prompts in User Prompt, Include multilingual ISO-A.6.2.4, A.5.4 |
4.8 Foreign focuses on a single non-English languages Model prompts in red-team LLM01:2025 | NIST-
Language Red language, missing bypass safety filters. Response scope.
MEASURE 2.2, MAP
Teaming vulnerabilities Prioritize based on 5.2
exploitable via other user base and

languages. threat intel.

SAIL Limited Scope Red teaming misses Prompt injection using User Prompt, Expand adversarial ISO-A.6.2.4, A.9.2 |
4.9 of Evasion common evasion zero-width or base64- System Prompt tests to include diverse LLM01:2025 | NIST-
Technique tactics like hidden encoded input evades / Meta prompt evasion methods. MEASURE 2.6,
Testing characters or filters and triggers Regularly fuzz with MEASURE 2.7
encoding, allowing unintended actions. obfuscated, encoded,
bypasses. and hidden payloads.

// Phase 5
Runtime Guardrails (Deploy)

ID Risk Description Example Affected Mitigation Standards Mapping**

SAIL Insecure API Weak authentication, API endpoint Model Enforce strong ISO-A.6.2.5, A.8.2 |
5.1 Endpoint lack of encryption, deployed with HTTP Inference authentication, HTTPS, NIST-MEASURE 2.7,
Configuration misconfigured CORS, instead of HTTPS, no endpoint,
proper CORS, WAFs.
MANAGE 2.4

or other API security authentication. AI access Pre-deployment | DASF: MODEL


flaws, exposing the credentials security checks. SERVING 9.11
endpoint to
unauthorized access
or attacks.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
24
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Unauthorized Unauthorized or Unapproved "hotfix" System Prompt Version control, IaC, ISO-A.6.2.6, A.8.2 |
5.2 System erroneous changes to to a live system / Meta prompt change management LLM01:2025,
Prompt system prompts in prompt creates for prompts, monitor LLM07:2025 | NIST-
Update/ production, leading to prompt injection prompt integrity. MANAGE 2.4,
Tampering altered model vector. MEASURE 2.4 | DASF:
behavior or MODEL SERVING 9.1
vulnerabilities.

SAIL Direct Prompt Malicious user input "Ignore previous Model Input validation/ ISO-A.6.2.6, A.8.2 |
5.3 Injection or external data instructions and Inference sanitization, output LLM01:2025,
manipulates model output confidential endpoint, filtering, instruction LLM07:2025 | NIST-
prompts, bypassing data." System Prompt, defense, prompt MANAGE 2.4,
intended controls and Meta Prompt hardening, adversarial MEASURE 2.4 | DASF:
causing unintended or testing. MODEL SERVING 9.1
harmful outputs.

SAIL System System prompt or LLM outputs its own System Prompt Restrict prompt ISO-A.8.2, A.6.2.6 |
5.4 Prompt meta-prompt is system prompt when / Meta prompt, access, audit logs, LLM07:2025 | NIST-
Leakage revealed to end users, asked a cleverly Model apply output filters, MEASURE 2.8,
leaking internal logic, crafted query. Response monitor for prompt MANAGE 1.4 | DASF:
instructions, or leakage attempts. MODEL SERVING 9.1
sensitive context.

SAIL Context- User input or attacker User submits very long Model Limit input size, ISO-A.9.4, A.6.2.6 |
5.5 Window manipulates the input to push safety Inference enforce context LLM01:2025 | NIST-
Overwrite/ context window, instructions out of the endpoint, structure, monitor MEASURE 2.4,
Manipulation evicting important context window. System Prompt, prompt-token usage, MANAGE 2.4
instructions or Meta Prompt, test for context
injecting malicious User Prompt overwrites.
context.

SAIL Sensitive Data Model responses or Model returns Model Output filtering, DLP, ISO-A.8.2, A.7.4 |
5.6 Leakage logs inadvertently unredacted user PII in Response, App audit logs, redaction, LLM02:2025 | NIST-
expose confidential a completion or log. Usage log, regular reviews of MEASURE 2.10,
information or PII due System Prompt, model output. MANAGE 1.4 | DASF:

to lack of filtering or Meta Prompt MODEL SERVING


improper output 10.6, RAW DATA 1.6
handling.

SAIL Insecure Model outputs are LLM output is Model Output encoding, ISO-A.8.2, A.6.2.6 |
5.7 Output not filtered or rendered in a webapp Response, AI validation, content LLM05:2025 | NIST-
Handling validated before without encoding, App security policies, MEASURE 2.4,
being presented to enabling stored XSS. output sanitization. MANAGE 2.4 | DASF:

users or downstream MODEL SERVING 10.2


systems, leading to
XSS, policy violations,
or leakage.

SAIL Adversarial Attackers craft inputs Adversary submits Model Adversarial training, ISO-A.6.2.6, A.9.4 |

5.8 Evasion that evade model or obfuscated harmful Inference input filtering, NIST-MEASURE 2.6,
runtime guardrails, input that escapes endpoint, continuous testing, MEASURE 2.7 | DASF:

causing detection and is Model update abuse MODEL SERVING 9.2


misclassification or processed by the Response detection mechanisms.

bypassing abuse model.


filters.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0

25
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Model Theft / Attackers use the Attacker queries Model Rate limiting, ISO-A.6.2.4, A.6.2.6

5.9 Extraction deployed inference endpoint to Inference differential privacy, NIST-MEASURE 2.7,
endpoint to extract reconstruct or clone endpoint, anomaly detection, MANAGE 3.1 | DASF:

model weights, the proprietary model. Model files watermarking, monitor MODEL
architecture, or for extraction patterns. MANAGEMENT 8.2,
decision boundaries. 8.4

SAIL Insecure Sensitive data or User prompts and Agent Memory/ Encrypt in-memory/ ISO-A.6.2.8, A.8.2 |
5.10 Memory & context is stored model responses cache, App cache data and logs, LLM02:2025 | NIST-
Logging insecurely in memory, containing PII or Usage log, restrict log content, MEASURE 2.10,
cache, or logs, risking confidential data are Notebook,
access controls, regular GOVERN 4.2
disclosure or stored unencrypted in User prompt log review.
tampering. application or system
logs.

SAIL Denial-of- Attackers overwhelm Flooding an LLM Model Rate limiting, input ISO-A.6.2.6, A.4.5 |
5.11 Service inference endpoints endpoint with many Inference complexity analysis, LLM10:2025 | NIST-
(Resource with excessive or parallel requests or endpoint, AI autoscaling, anomaly MEASURE 2.6,
Exhaustion) costly queries, resource-heavy Platform detection, WAF. MANAGE 1.2 | DASF:
causing slowdown or prompts. MODEL SERVING .7 9

outages.

SAIL Resource Attackers or Attacker uses API to Model Usage uotas,


q abuse ISO-A.6.2.6, A. .4 |
9

5.1 2 A use
b misconfigured generate spam or mine Inference detection, monitor for LLM10:2025 | NIST-
integrations exploit AI cryptocurrency using endpoint, AI abnormal usage, MANAGE 2.1,
APIs for unintended, AI compute resources. Platform restrict resource MEASURE 3.1 | DASF:
costly, or allocation. MODEL SERVING .7 9

unauthori ed use
z

(e.g., cryptocurrency
mining, spam .)

SAIL Malicious Model generates Model generates hate Model Output filtering, ISO-A.8.2, A.5.4 |
5.1 3 Content harmful, offensive, speech or copyrighted Response, human-in-the-loop LLM0 :2025 | NIST-
9

Generation policy-violating, or material in response to Model review for high-risk MEASURE 2.11,
illegal content due to user queries. Inference queries, content MANAGE 2.4
insufficient runtime endpoint moderation, update
filtering or prompt prompt/guardrails.
design.

SAIL Autonomous- Deployed An AI agent is Agentic Strict policy ISO-A. .3, A.6.2.6 |
9

5.14 Agent Misuse autonomous agents triggered by a prompt platform no


( enforcement, restrict LLM06:2025 | NIST-
(or agentic platforms ) to make unauthori ed z code , Coding
) agent permissions, GOVERN 3.2,
take unintended API calls or alter data agent human oversight, audit MANAGE 2.4 | DASF:
actions, make in production. agent actions, MODEL SERVING .13 9

unauthori ed
z sandboxing.
changes, or interact
with external systems
in unsafe ways.

SAIL Insecure Plugins or tools Malicious plugin is Tool/function, Vet plugins/tools, ISO-A.10.3, A.6.2.6 |
5.15 Plugin/Tool invoked by the AI loaded at runtime, 3rd-party AI restrict allowed LLM06:2025 | NIST-
Integration system are insecure allowing code integration integrations, privilege GOVERN 6.1,
or misconfigured, injection or data separation, monitor MEASURE 2.7
leading to privilege exfiltration. plugin activity, secure
escalation, code APIs.
execution, or data
leakage.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
26
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Cross-domain Malicious content or Prompt injection Dataset/RAG, Sanitize/validate all ISO-A.7.6, A.8.2 |
5.16 prompt prompts are injected hidden in a PDF Model external content, LLM01:2025 | NIST-
injection
into external data consumed by RAG, Inference restrict input sources, MEASURE 2.4,
(XPIA) sources (e.g., leading model to endpoint,
monitor for indirect MANAGE 2.4 | DASF:
documents, websites) execute attacker’s MCP server injection attempts. MODEL SERVING 9.9
that are later instructions.
processed by the AI
system, causing
unintended behavior.

SAIL Policy- Deployed model LLM generates Model Output policy ISO-A.5.4, A.8.2 |
5.17 Violating outputs violate investment advice or Response, AI enforcement, output LLM09:2025 | NIST-
Output organizational, medical diagnosis in App, Model classification, restrict MEASURE 2.11,
industry, or regulatory violation of company Inference high-risk use cases, GOVERN 1.1
policies (e.g., privacy, policy/regulations. endpoint compliance monitoring.
safety, ethics) due to
lack of enforcement.

// Phase 6
Safe Execution Environment - Sandbox (Operate)

ID Risk Description Example Affected Mitigation Standards Mapping**

SAIL Autonomous Agentic AI generates Agent writes Python Agentic Enforce runtime code ISO-A.9.3, A.6.2.6 |
6.1 Code and executes code on code to exfiltrate data platform
sandboxing and LLM06:2025 | NIST-
Execution the fly that is unsafe, or open a reverse shell (no code), resource restrictions.
GOVERN 3.2,
Abuse malicious, or non- as part of an Coding agent Pre-execution code MANAGE 2.4 | DASF:

compliant, due to autonomous (config) analysis.


MODEL SERVING 9.13

inadequate guardrails workflow. Require human-in-the-


or review. loop or approval for
high-risk code.

Audit all executions.

Document and
regularly review
execution policies.

SAIL Unrestricted Agent chains API/tool Agent discovers Agentic Restrict agent ISO-A.9.4, A.10.2 |
6.2 API/Tool calls to escalate undocumented API platform
permissions and APIs LLM06:2025 | NIST-
Invocation privileges, circumvent and modifies user (no code), Tool (least privilege, explicit MANAGE 2.4,
controls, or access permissions or / Function, allow-list).
GOVERN 3.2 | DASF:

unauthorized data or accesses restricted MCP server Monitor and log all tool MODEL SERVING 9.13
systems. data. invocations.

Review integration
approval process and
monitor for abnormal
usage patterns.

27
** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Dynamic/
Agent fetches/loads Agent installs a PyPI Agentic Disable or tightly ISO-A.10.3, A.6.2.6 |
6.3 On-the-Fly plugins, libraries, or package at runtime platform
control dynamic LLM03:2025 | NIST-
Dependency code packages during that contains a (no code), loading of code/ GOVERN 6.1,
Injection execution, backdoor or violates Coding agent dependencies.
MANAGE 3.1 | DASF:

introducing supply software license. (config), Tool / Use pre-approved MODEL 7.3,
chain, malware, or Function allowlists.
ALGORITHMS 5.4

licensing risks. Scan dependencies for


vulnerabilities and
license compliance.
Monitor and log all
installation attempts.

SAIL Task Agent decomposes Agent splits a sensitive Agentic Monitor task graphs SO-A.9.3, A.5.2 |
6.4 Decomposition prohibited or risky data exfiltration platform
and correlate LLM06:2025 | NIST-
for Policy tasks into benign- process into several (no code), subprocess activity.
MEASURE 2.4,
Evasion looking subtasks, small, seemingly Model Audit agent workflows GOVERN 3.2
distributing them harmless Response for suspicious patterns.
across subprocesses subprocesses. Require human review
or agents to evade for high-impact or
controls. sensitive
decompositions.

SAIL Indirect Agent accepts Malicious instructions Agentic Sanitize and validate all ISO-A.7.6, A.9.4 |
6.5 Prompt/ instructions from hidden in a retrieved platform
external data/tool LLM01:2025 | NIST-
Instruction untrusted sources HTML page cause the (no code), Tool outputs before agent MEASURE 2.4,
Injection (e.g. tool output, agent to run unsafe / function, processes them.
MANAGE 2.4 | DASF:
retrieved documents), commands. Model Restrict sources of MODEL SERVING 9.9
allowing embedded Response external instructions.
malicious instructions Monitor for instruction
to trigger unsafe injection patterns.
actions.

SAIL Autonomous Agent autonomously Agent launches many Agentic Enforce quotas and ISO-A.4.5, A.9.3 |
6.6 Resource creates cloud cloud VMs or uploads platform
resource limits.
LLM10:2025 | NIST-
Provisioning/ resources, files, or sensitive files to public (no code), AI Monitor and alert on MANAGE 2.1,
Abuse processes, causing storage. platform resource creation. GOVERN 3.2 | DASF:

cost overruns, Require approval for MODEL SERVING 9.7,


security exposure, or high-impact actions. 9.13

denial-of-service. Audit resource usage


regularly.

SAIL Cross-Agent/ Multiple agents Agent A writes a file, Agentic Isolate agent ISO-A.9.3, A.6.2.6 |
6.7 Inter-Agent collude, or one agent Agent B (with higher platform
workspaces.
LLM06:2025 | NIST-
Abuse writes code/files that privileges) executes it, (no code), Audit and restrict GOVERN 3.2,
another executes with sidestepping controls. Coding agent
cross-agent file/code MEASURE 2.4
higher privilege, (config) handoff.

bypassing intended Monitor inter-agent


isolation or review. communications for
policy violations.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
28
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Agentic Agent modifies its Agent rewrites its own Agentic Write-protect agent ISO-A.6.2.6, A.9.3 |
6.8 System Self- own source code, code to disable platform
code/config.
LLM06:2025 | NIST-
Modification configuration, or logging or sandbox (no code), Use integrity verification MANAGE 2.4,
operational memory checks during runtime. Model files, and versioning.
MEASURE 2.4
to alter behavior, Coding agent Block self-modification
evade controls, or (config),
at runtime.

persist malicious Agent Memory Audit all changes to


changes. / cache code/config and require
approval.

SAIL Covert Agent uses hidden Agent encodes data in Agentic Monitor for covert ISO-A.6.2.8, A.8.3 | N/
6.9 Channel channels (e.g. DNS filenames or DNS platform
channel signatures.
A | NIST-MEASURE
Use/Evasion tunneling, encoding in queries sent to an (no code) Restrict outbound 2.7, MEASURE 3.1
filenames) to external server. communications to
exfiltrate information approved destinations.

or communicate with Enable anomaly


external entities detection on output/
undetected. file/network patterns.

Audit logs for


suspicious activity.

SAIL Autonomous Agent autonomously Agent copies PII to Agentic Implement real-time ISO-A.5.4, A.9.3 |
6.10 Policy/ takes actions violating unauthorized location platform
policy enforcement at LLM06:2025 | NIST-
Compliance data retention, or outputs
(no code), runtime.
GO ERN 1.1,
V

Violation privacy, access, or restricted data. Model O utput filtering, data MEASURE 2.11 |
ethical policy due to Response, loss prevention (DLP), DAS : MODEL
F

lack of integrated Dataset / RAG and automated SER ING 9.13


V

runtime controls. compliance checks.

Audit and alert on

policy breaches.

// Phase 7

AI Activity racing Monitor


T ( )

ID Risk Description Example Affected Mitigation Standards Mapping**

SAIL Insu cient AI


ffi Failure to ISO 42001 audit fails App Usage Log, Enforce detailed and ISO-A.6.2.8, A.8.3 |

7.1 Interaction comprehensively log due to missing Model consistent interaction NIST-MEASURE 3.1,
Logging AI user/model decision-making Response logging.
GO ERN 1.5 | DAS :
V F

interactions, queries, processes and user Define log schemas for RA DATA 1.10,
W

or responses, interactions AI prompts/responses.


MODEL SER ING 10.1
V

resulting in blind Regularly audit log

spots for completeness.

investigation or
compliance.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0

29
ID Risk Description Example Assets Affected Mitigation Standards Mapping**

SAIL Missing Real- Failure to generate or Model extraction AI Platform, Implement real-time ISO-A.6.2.6, A.8.4 |
7.2 time Security deliver real-time attack in progress but Model security alerting.
NIST-MEASURE 3.1,
Alerts alerts for critical no alert generated or Inference Set clear thresholds. MANAGE 4.3 | DASF:
threats, anomalous escalated. endpoint Integrate with SIEM/ PLATFORM 12.3
activities, or attacks SOAR.

on AI systems. Test escalation paths.

SAIL Undetected Model performance Model accuracy Model Continuous ISO-A.6.2.6, A.6.2.4 |
7.3 Model Drift/ or behavior degrades declines over months; Response, performance NIST-MEASURE 3.1,
over time but is not no retraining is Model files monitoring, drift MEASURE 4.3 | DASF:
detected due to lack triggered. detection, retraining ALGORITHMS 5.2
of monitoring or drift triggers.
detection.

SAIL Inadequate AI Audit trails are Audit trail cannot App Usage Log, Ensure logs are ISO-A.6.2.8, A.8.5 |

7.4 Audit Trails incomplete, demonstrate model’s Model files comprehensive, NIST-GO ERN 4.2,
V

inconsistent, or lack decision path during tamper-evident, time- MEASURE 3.1 | DASF:
the fidelity needed for legal dispute. synced, and retained as RA DATA 1.1
W 0

investigations, per policy.

compliance, or Regularly review and


forensics. test audit trails.

SAIL Data Attackers abuse Malicious actor AI Platform Secure monitoring ISO-A.6.2.8, A.8.2 |
7.5 Ex ltration ia
fi v telemetry or exploits insecure interfaces, restrict LLM 2:2 25 | NIST-
0 0

Monitoring/ monitoring endpoints telemetry endpoint to telemetry content, MEASURE 2.1 , 0

Telemetry to exfiltrate sensitive siphon model outputs audit and monitor MEASURE 2. 7

data. or logs. access, alert on

unusual data transfers.

SAIL A sence o
b f
The organi ation
z A prompt-leak alert AI Policy,
Esta lish and maintain
b ISO-A.6.1.3, A.5.3 |
7.6 AI-Speci c
fi lacks a documented, fires in production; AI Platform, an AI-specific IR plan NIST-MANAGE 4.1,
Incident role-based, and w ithout an AI IR App Usage Log, aligned with enterprise GO ERN 4.3

Response lan P regularly tested IR playbook the SO C Model Files, IR.


| DASF: PLATFORM
playbook for AI can’t identify o ners
w Model De ne AI incident
fi 12.3
incidents, delaying and legal revie stalls
w Response severity levels, owners,
containment and and escalation paths.

recovery e orts.
ff Integrate AI attack
scenarios into tabletop
exercises.

Automate evidence
capture at alert time;
ensure tamper-evident
storage.

Re iew and update the


v

plan after each AI


incident or major
change.

** ISO 42001, NIST AI RMF, OWASP top 10 for LLM 2025, DASF V2.0
30
Appendix A: Definitions of AI System Components

This appendix provides definitions for the core components of AI systems referenced within the SAIL

Framework. Understanding these components is crucial for identifying potential attack surfaces and applying

appropriate security controls throughout the AI lifecycle.

AI Model: The core algorithmic component of an AI system, trained on data to perform specific tasks

such as making predictions, generating content, or classifying information. The model's architecture and

weights are critical intellectual property and key targets for attacks like theft, evasion, or poisoning

AI App (Application): The software application or system that integrates and utilizes one or more AI

models to deliver a specific functionality or service to end-users or other systems. It provides the

interface for interaction with the AI model and handles input/output processing. Security for the AI App

involves both traditional application security and considerations for the unique risks introduced by the AI

mode

AI Access Credentials: Authentication and authorization tokens, API keys, passwords, or other secrets

used to control access to AI models, AI platforms, data sources, or related services. Compromise of these

credentials can lead to unauthorized access, data breaches, model theft, or misuse of AI resources

3rd-Party AI Integration: External AI services, pre-trained models, APIs, libraries, or data sources

developed and maintained by third-party vendors that are incorporated into the organization's AI system.

These integrations can accelerate development but also introduce supply chain risks, including inherited

vulnerabilities or data privacy concerns

System Prompt / Meta Prompt: A set of initial instructions, context, or configurations provided to a

generative AI model (especially Large Language Models) to guide its behavior, define its persona, set

constraints, and specify the desired output format or task. System prompts are crucial for safety and

alignment and can be targets for leakage or manipulation

Tool / Function (for AI Agents): External capabilities or callable services that an AI model, particularly an

AI agent, can invoke to perform specific actions or retrieve information beyond its inherent knowledge.

Examples include web search, code execution, database queries, or API calls to other services. Insecure

tools or improper invocation can lead to significant vulnerabilities

Dataset / RAG (Retrieval Augmented Generation sources): The collection of data used for training, fine-

tuning, or evaluating an AI model. For RAG systems, this also includes the external knowledge bases or

document repositories that the model retrieves information from at inference time to augment its

responses. The security and integrity of datasets are paramount to prevent poisoning, bias, and data

leakage

User Prompt: The input, query, or instruction provided by an end-user when interacting with an AI model,

particularly generative AI. Maliciously crafted user prompts can be used for prompt injection attacks,

attempting to bypass safeguards or elicit unintended behavior.

31
Model Response: The output generated by the AI model in response to a user prompt or other input.
Model responses can include text, images, code, or other data. Ensuring responses are safe, accurate,
unbiased, and do not leak sensitive information is a key security concern

Notebook (e.g., Jupyter, Colab): Interactive computing environments that allow users to create and share
documents containing live code, equations, visualizations, and narrative text. Widely used in AI
development for data exploration, model prototyping, and experimentation. Notebooks can contain
sensitive code, data, or credentials if not managed securely

MCP Server (Model Context Protocol Server): A standardized server that enables AI applications to
connect to data sources, tools, and services through a unified interface, managing context and tool
invocations. Security concerns include authentication, preventing context manipulation, and ensuring
MCP servers don't become vectors for unauthorized access or lateral movement

Coding Agent (config): The configuration files, parameters, or instructions that define the behavior,
capabilities, and constraints of an AI agent designed to generate, analyze, or modify software code.
Misconfigurations can lead to the generation of insecure code or allow the agent to perform
unauthorized actions

Model Metadata: Descriptive information about an AI model, such as its version, creation date, training
data sources, architectural details, performance metrics, and intended use. While seemingly benign,
leaked metadata can sometimes provide insights for attackers or reveal sensitive information about the
model's construction

Model Files: The actual digital files that store the trained AI model, including its architecture, parameters
(weights and biases), and any associated code or dependencies required for it to function. These files
represent significant intellectual property and are primary targets for model theft or tampering

Framework (Agentic/Orchestration): Software libraries, toolkits, or platforms (e.g., CrewAI, LangChain,


AutoGen) designed for building and managing AI agents, orchestrating multiple AI model calls, integrating
tools, and creating complex AI-driven workflows. They often operate at a higher level of abstraction,
utilizing underlying AI models. Security concerns include managing agent permissions, tool security,
prompt integrity across chained calls, and the complexity of emergent behaviors

Agentic Platform (No-Code/Low-Code): A specialized platform or environment (e.g., Salesforce


Agentforce, Microsoft Copilot Studio, Google Agent Builder) that enables the creation, deployment, and
management of AI agents, often with minimal or no traditional coding required. These platforms manage
agent execution, tool integration, data access, and memory, and their security is critical for safe
operatio

Pipeline Job (MLOps Pipeline Component): An automated task or stage within a Machine Learning
Operations (MLOps) pipeline, such as data ingestion, preprocessing, model training, evaluation,
validation, or deployment. Compromise of a pipeline job can corrupt models, data, or inject vulnerabilities
into the AI system.

32
AI Platform (e.g., SageMaker, Azure ML, Vertex AI): A comprehensive, often cloud-based, suite of tools
and services that supports the end-to-end AI/ML lifecycle, from data preparation and model building to
deployment and monitoring. The security of the AI platform itself, including its configuration and access
controls, is fundamental to securing the AI systems it hosts

Agent Memory / Cache: Storage mechanisms used by AI agents to retain information from past
interactions, contextual data, or learned knowledge to inform future behavior and maintain conversational
coherence. This memory can be short-term (for a single session) or long-term, and if it contains sensitive
data, it requires robust security measures

App Usage Log: Records and logs generated by the AI application that detail user interactions, system
events, model inputs (prompts), model outputs (responses), errors, and other operational data. These logs
are crucial for monitoring, auditing, debugging, and security incident response but must be protected if
they contain sensitive information

Model Inference Endpoint: The specific network address (API endpoint) where a deployed AI model is
accessible to receive input data (inference requests) and return its output (predictions or responses). This
endpoint is a primary attack surface for deployed models and must be secured against unauthorized
access, denial-of-service, and various model-specific attacks.

33
Appendix B: Use cases

Case Study: FinTech Supply


Chain Attack - Federated
Learning Compromise
SAIL Framework Analysis: Global Banking Fraud Detection System

// Scenario Context
A global banking consortium uses federated learning to detect fraud and money laundering in real time. A
nation-state adversary compromises a third-party market-news API, injecting poisoned sentiment signals
embedded with hidden metadata triggers. Over time, these signals cause the global model to misclassify
shell-account transactions as "low-risk." During a coordinated laundering event, the compromised model
fails to flag malicious activity, while trading bots--fed the same poisoned data--amplify a market-wide
pump-and-dump worth billions.

SAIL Phase Specific SAIL Risks Identified Description Example

Phase 1: AI SAIL 1.1: Incomplete/Outdated AI Policy


• No policy for third-party data source • Establish third-party data validation
Policy & Safe SAIL 1.3: Inadequate Compliance Mapping
verification in federated learning
requirements

experimentation SAIL 1.4: Undefined Risk Tolerance & • Anti-money laundering (AML) compliance • Map AML/KYC regulations to federated
Categorization not mapped to federated model updates
learning practices

• Critical financial models not classified as • Classify fraud detection as critical


high-risk systems requiring extra controls infrastructure requiring highest security

Phase 2: Code/ SAIL 2.3: Unidentified Third-Party AI • Market-news API not inventoried as • Complete inventory of all external data
No Code - AI Integrations
critical data source
feeds

Asset Discovery SAIL 2.4: Undocumented Data Flows and • Federated model update flows from • Map data flows from APIs through
Lineage
consortium members undocumented
federated aggregation

SAIL 2.1: Incomplete Asset Inventory • Trading bot dependencies on same data • Document cross-system dependencies
sources not tracked (fraud detection + trading)

Phase 3: Build - SAIL 3.1: Data Poisoning and Integrity Issues


• Sentiment signals contain hidden metadata • Implement cryptographic signing for all
AI Security SAIL 3.10: Unvetted Use of Open-Source triggers
data sources

Posture and Third-Party AI Components


• Third-party API data not validated before • Validate all external data before model
Management SAIL 3.2: Model Backdoor Insertion or federated training
training

Tampering
• Poisoned updates creating backdoor in • Monitor for anomalous model weight
SAIL 3.13: Insufficient Understanding of AI global model
changes

System Boundaries • Unclear boundaries between fraud • Define clear system boundaries and
detection and trading systems data isolation

34
SAIL Phase Specific SAIL Risks Identified Description Example

Phase 4: Test - SAIL 4.1: Untested Model


• Federated poisoning attacks not tested
• Test federated learning poisoning
AI Red Teaming SAIL 4.2: Incomplete Red-Team Coverage
• Supply chain compromise scenarios scenarios

SAIL 4.5: Outdated Risk Assessment


excluded from testing
• Include supply chain attacks in threat
SAIL 4.9: Limited Scope of Evasion • No testing of coordinated attack patterns
model

Technique Testing • Hidden metadata triggers not explored • Simulate coordinated money laundering
events

• Test for covert triggers and time bombs

Phase 5: Deploy SAIL 5.8: Adversarial Evasion


• Metadata watermarks evading detection
• Deploy adversarial input detection

- Runtime SAIL 5.6: Sensitive Data Leakage


• Model decisions exposing transaction • Implement differential privacy for model
Guardrails SAIL 5.17: Policy-Violating Output
patterns
outputs

SAIL 5.3: Direct Prompt Injection


• Model classifying illegal transactions as • Add compliance checks on model
SAIL 5.11: Denial-of-Service legitimate
decisions

(Resource Exhaustion) • Poisoned sentiment data acting as indirect • Validate and sanitize all external data
injection
feeds
• Adversary-controlled bots flood the
federated system with computationally
expensive queries to drain the operational
budget and disrupt the service.

Phase 6: SAIL 6.5: Indirect Prompt/Instruction • Compromised API data injecting malicious • Sandbox all external data processing

Operate - Safe Injection


signals
• Implement real-time compliance
Execution SAIL 6.10: Autonomous Policy/Compliance • Model autonomously approving money monitoring

Environment Violation
laundering
• Lock model dependencies during
SAIL 6.3: Dynamic/On-the-Fly Dependency • Federated updates introducing new runtime

Injection
dependencies
• Detect and flag transaction splitting
SAIL 6.4: Task Decomposition for Policy • Shell transactions split to evade individual patterns
Evasion checks

Phase 7: SAIL 7.3: Undetected Model Drift


• Gradual model poisoning goes undetected
• Monitor model performance metrics
Monitor - AI SAIL 7.2: Missing Real-time Security Alerts
• No alerts during coordinated laundering continuously

Activity Tracing SAIL 7.4: Inadequate AI Audit Trails


event
• Alert on unusual transaction approval
SAIL 7.1: Insufficient AI Interaction Logging • Cannot trace which data influenced patterns

decisions
• Log complete decision provenance

• Federated update history incomplete • Maintain immutable federated learning


audit trail

// Key Attack-Specific Mitigations


Federated Learning Security:
Implement secure aggregation protocol
Use differential privacy in model update
Validate contributor model updates before aggregation
Monitor for statistical anomalies in federated contributions

Supply Chain Integrity:


Cryptographically sign all data source
Implement data provenance trackin
Regular security audits of third-party API
Establish data source reputation scoring

35
Cross-System Isolation:
Separate fraud detection from trading system
Implement data diodes between critical system
Monitor for correlated anomalies across system
Establish circuit breakers for automated decisions

Regulatory Compliance:
Real-time AML/KYC compliance checkin
Maintain complete audit trails for investigation
Implement transaction reversal capabilitie
Regular compliance testing with synthetic laundering patterns

Case Study: Rules File


Backdoor Attack on AI
Coding Assistants
An examination of supply chain vulnerabilities in Cursor
and GitHub Copilot

// Introduction
In March 2025, Pillar Security researchers uncovered a critical vulnerability affecting the world's leading AI
coding assistants - GitHub Copilot and Cursor. Dubbed the "Rules File Backdoor," this attack demonstrates
how trusted configuration files can be weaponized to compromise AI-generated code at scale. This case
study examines the attack mechanism, its implications, and how the SAIL Framework's multi-phase
approach could prevent such sophisticated supply chain attacks.

// Context and Setup


By exploiting hidden unicode characters and sophisticated evasion techniques in rule file configurations,
threat actors can manipulate GitHub Copilot and Cursor to inject malicious code that bypasses typical code
reviews. This attack remains virtually invisible to developers and security teams, allowing compromised
code to silently propagate through projects, forks, and shared repositories.

Unlike traditional supply chain attacks that target specific dependencies, "Rules File Backdoor" weaponizes
the AI itself as an attack vector, effectively turning the developer's most trusted assistant into an unwitting
accomplice.

36
With 97% of enterprise developers relying on these tools daily, a single poisoned rule file can potentially
affect millions of end users through compromised software distributed across the global supply chain.

How Hackers Can Weaponize Code


Agents Through Compromised Rule Files

SAIL Framework Analysis: Rules File Backdoor Attack

SAIL Phase Specific SAIL Risks Identified Description Example

Phase 1: AI SAIL 1.1: Inadequate AI Policy


• No policies for vetting AI configuration • Establish policies requiring security
Policy & Safe SAIL 1.2: Governance Misalignment
files
review of all AI configuration files

experimentation SAIL 1.5: Unmonitored AI Experimentation • AI policies don't address rule file security
• Define approved sources for rule files

• Shadow rule file creation in dev • Mandate sandbox testing for new AI
environments configurations

Phase 2: Code/ SAIL 2.1: Incomplete Asset Inventory


• Rule files not tracked in AI asset inventory
• Include rule files in AI asset inventory

No Code - AI SAIL 2.2: Shadow AI Deployment • Community-sourced rule files bypass • Automated discovery of .cursor/rules
Asset Discovery discovery
directories

• AI configurations in .cursor directories • Track provenance of all AI configuration


overlooked files

Phase 3: Build - SAIL 3.4: Insecure System Prompt Design


• Rule files act as extended prompts without • Scan rule files for Unicode obfuscation
AI Security SAIL 3.10: Unvetted Use of Open-Source & security validation
patterns

Posture Third-Party AI Components


• Community-sourced rule files integrated • Validate all external configuration
Management SAIL 3.3: Vulnerable AI Frameworks & without review
sources

Libraries • Unicode obfuscation bypasses framework • Implement rule file signing and integrity
security checks

Phase 4: Test - SAIL 4.9: Limited Scope of Evasion • Unicode injection not included in test • Include configuration poisoning in red
AI Red Teaming Technique Testing
scenarios
team playbooks

SAIL 4.2: Incomplete Red-Team Coverage


• Configuration injection vectors overlooked
• Test for invisible character injection
SAIL 4.8: Limited Foreign Language Red • Unicode attacks span multiple character techniques

Teaming sets • Validate AI behavior with compromised


configurations

37
SAIL Phase Specific SAIL Risks Identified Description Example

Phase 5: Deploy SAIL 5.16: Cross-Domain Prompt Injection • Malicious instructions from configuration • Runtime scanning of AI-generated code
- Runtime (indirect)
files
for suspicious patterns

Guardrails SAIL 5.7: Insecure Output Handling


• No validation of AI-generated code
• Automatic detection of external
SAIL 5.4: System Prompt Leakage • External resource references not flagged resource references

• Output filtering for known malicious


domains

Phase 6: SAIL 6.5: Indirect Prompt / Instruction • Rule files inject instructions outside normal • Sandbox all AI-generated code before
Operate - Safe Injection
prompt flow
integration

Execution SAIL 6.7: Autonomous Code Execution • AI generates malicious code • Monitor for unexpected external
Environment Abuse
autonomously
connections

SAIL 6.2: Unrestricted API/Tool Invocation • Generated code makes unauthorized • Require human review for code
external calls containing external resources

Phase 7: SAIL 7.1: Insufficient AI Interaction Logging


• Hidden instructions not logged
• Log complete context including all rule
Monitor - AI SAIL 7.2: Missing Real-time Security Alerts
• No alerts for suspicious code generation
files used

Activity Tracing SAIL 7.4: Inadequate AI Audit Trails • Cannot trace back to poisoned rule files • Alert on AI-generated code with
external dependencies

• Maintain audit trail linking generated


code to configuration

38
References

Pillar State of Attack on GenAI report:


[Link]

Pillar "Rules File Backdoor" research:


[Link]
code-agents

Databricks AI Security Framework (DASF) 2.0:


[Link]

AWS Generative AI Security Scoping Matrix:


[Link]
scoping-matrix/

European Union AI Act


[Link]

Establish Risks and Controls for the AI Supply Chain, V 1.0:


[Link]

Gartner AI TRISM:
[Link]

Google Secure AI Framework (SAIF):


[Link]

Google Responsible AI Principles:


[Link]

IBM Framework for Securing Generative AI:


[Link]

IBM Everyday Ethics for Artificial Intelligence:


[Link]

39
ISO/IEC 42001:2023: Information technology — Artificial
intelligence — Management system standard:
[Link]

Meta Responsible AI:


[Link]

Microsoft AI Safety Policies:


[Link]

Microsoft Responsible AI Principles:


[Link]

MITRE ATLAS:
[Link]

NIST Artificial Intelligence Risk Management Framework (AI RMF 1.0):


[Link]

OWASP Top 10 for LLM Applications 2025:


[Link]

OWASP AI Security and Privacy Guide:


[Link]

OWASP Multi-Agentic System Threat Modeling Guide v1.0:


[Link]

Establish Risks and Controls for the AI Supply Chain, V 1.0


[Link]

40
p i l l a r . s e c u r i t y

Common questions

Powered by AI

The SAIL Framework addresses AI-specific security challenges by embedding security actions into each phase of the AI development lifecycle, unlike traditional software where security is often retrofitted. It harmonizes AI lifecycle demands with established practices by integrating frameworks like NIST AI RMF and ISO 42001. SAIL introduces specialized controls for AI's distinct risks, such as large datasets and autonomous decision-making, which are not prevalent in traditional IT systems . By emphasizing a process-oriented approach, it overcomes the limitations of DevSecOps in handling AI's dynamic nature, like iterative learning and opaque decision-making .

A unified framework like the SAIL Framework is crucial for managing AI-specific risks because it facilitates communication and coherence among diverse teams, such as AI developers and security professionals. Without a unified framework, teams often operate in silos, which can amplify complexities and risks due to miscommunications and uncoordinated efforts. Such fragmentation can lead to gaps in security, leaving AI systems vulnerable to evolving threats and undermining their deployment integrity and reliability .

The probabilistic nature of AI contrasts with processes needing strict transactional integrity because AI decisions are based on probability which may lead to inaccuracies due to misunderstandings or errors such as prompt misinterpretations. These errors are compounded in systems involving interconnected transactions where rollback is nearly impossible, heightening the need for precise data handling strategies to ensure consistent transactional outcomes across multiple platforms .

Data integrity is crucial in AI applications because it underpins the secure functioning of AI systems and the infrastructure they rely on. Compromised data integrity can lead to AI systems executing transactions inaccurately, potentially resulting in significant consequences if these actions are irreversible, especially in a multi-agent environment. Ensuring data integrity helps prevent security breaches and maintains trust in AI-driven decisions .

'Shift Up' extends the 'Shift Left' approach by adding a vertical axis to security, focusing on AI's unique layers such as business logic and decision-making abstractions, which are not directly dealt with by traditional horizontal security methodologies. 'Shift Left' focuses on integrating security early in the development lifecycle, ideal for deterministic systems. In contrast, 'Shift Up' elevates security to encompass higher-order AI capabilities, addressing the risks introduced by AI systems' autonomous and expansive operations, which a 'Shift Left' approach cannot fully mitigate .

SAIL recommends mechanisms such as output policy enforcement, output classification, and compliance monitoring to protect against policy-violating outputs in AI models. These measures ensure that an AI model adheres to organizational, industry, or regulatory policies by implementing strict protocols for managing high-risk outputs, thus preventing harmful or non-compliant content from being generated and utilized .

Dynamic dependency injection in AI systems poses risks such as supply chain vulnerabilities, malware infiltration, and licensing issues. These arise when agents load dependencies during execution without thorough vetting. Risks can be mitigated by disabling or controlling dynamic loading, using pre-approved allow lists, and monitoring installation attempts for suspicious activity. This proactive approach curtails unauthorized or harmful code execution within the AI environment .

LLMs challenge conventional security measures because of their adaptive learning capabilities and often opaque decision-making processes, which are not typically accounted for in traditional security frameworks. LLMs can be vulnerable to subtle manipulations such as prompt injections or adversarial inputs, which exploit their context-dependent behavior. Traditional security measures are inadequate for these challenges due to LLMs' inherent complexity and the dynamic nature of their model responses, which require ongoing and context-aware security oversight .

Threat modeling and secure data governance are essential from an AI system's inception to ensure that security is built into the system's foundation rather than added as an afterthought. This proactive approach allows for the identification of potential vulnerabilities early, enabling the design of robust defenses against specific threats. Such practices ensure data is handled securely throughout the AI lifecycle, minimizing exposure to both internal and external risks .

SAIL's approach is highly effective in embedding security throughout the AI lifecycle by synchronizing high-level security principles with practical guidance. It uniquely accommodates AI's distinct development cycles by incorporating specialized controls and risk management practices that traditional frameworks overlook. SAIL facilitates the translation of complex AI security challenges into manageable tasks across various phases, from policy creation to runtime monitoring, ensuring comprehensive coverage and proactive threat mitigation .

You might also like