Dynamic Constitutional AI Runtime Enforcement: An Execution-Time Control Layer for AI Agent Security

From static constitutional constraints to dynamic runtime adaptation: exploring a new architecture for AI agent security

March 26, 2026 · 7 min read · Beginner

Memory Security Orchestration Infrastructure Governance

This article is one route in OpenClaw's external narrative arc.

“Traditional security asks: Is this person authorized to run this code? Runtime AI agent security asks: Even if this agent is authorized, should this code run?”

Core issue: static constitution vs dynamic runtime

Most of the current AI security frameworks are based on static constraints:

Predefined security rules and constraints
Output filtering after generation
Static permission configuration

But real AI agents are highly dynamic at runtime:

Multi-step workflows
Real-time environment adaptation
Complex interactions of tool calls
Behavioral drift during execution

This creates a fundamental mismatch between static constitutional constraints and dynamic runtime behavior.

AI runtime infrastructure: a new execution-time layer

AI Runtime Infrastructure is a distinct execution-time layer that sits above the model and below the application, actively observing, reasoning about, and intervening in agent behavior while the agent is running.

Core Features

Execution-time intervention
- Intervenes actively during execution rather than filtering before or after it
- Can modify agent inputs, control flow, and execution state
- Recovers or rolls back while the agent is still running
Long-horizon state awareness
- Monitors execution history across multiple steps
- Tracks intermediate decisions, memory usage, and tool results
- Reasons about cumulative failure modes
Closed-loop control
- Agent output → execution signals → runtime-layer evaluation → control signals
- Continuous observe-act feedback loop
- Dynamic adaptation rather than predetermined paths
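
The closed loop described above can be sketched in a few lines. This is a minimal illustration, not the design from the referenced paper: the `RuntimeLayer` class, the `Control` signals, the error-streak threshold, and the step budget are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Control(Enum):
    CONTINUE = "continue"
    ROLLBACK = "rollback"
    HALT = "halt"


@dataclass
class RuntimeLayer:
    """Hypothetical runtime layer: sees every step, returns a control signal."""
    max_consecutive_errors: int = 2  # assumed threshold
    max_steps: int = 10              # assumed step budget
    history: List[dict] = field(default_factory=list)

    def evaluate(self, signal: dict) -> Control:
        self.history.append(signal)
        if len(self.history) >= self.max_steps:
            return Control.HALT
        recent = self.history[-self.max_consecutive_errors:]
        if len(recent) == self.max_consecutive_errors and all(s["error"] for s in recent):
            return Control.ROLLBACK
        return Control.CONTINUE


def run(agent_step: Callable[[int], dict], runtime: RuntimeLayer) -> List[Control]:
    """Drive the agent step by step, consulting the runtime layer after each one."""
    controls: List[Control] = []
    step = 0
    while True:
        signal = agent_step(step)           # agent output -> execution signal
        control = runtime.evaluate(signal)  # runtime evaluation -> control signal
        controls.append(control)
        if control is not Control.CONTINUE:
            return controls
        step += 1


def flaky_agent(step: int) -> dict:
    # Toy agent: succeeds on step 0, fails on every later step.
    return {"step": step, "error": step >= 1}


controls = run(flaky_agent, RuntimeLayer())
```

The point of the sketch is that the decision uses accumulated history, not just the latest output: a single failed step continues, while two in a row trigger a rollback signal.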

Architecture positioning

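┌─────────────────────────────────────────────────┐
│                Application Layer                │
│  Task goals, user interaction, domain logic     │
└─────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│         AI Runtime Infrastructure Layer         │
│  • Runtime observation, reasoning, intervention │
│  • State monitoring, fault detection, recovery  │
│  • Execution-time policy enforcement            │
└─────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────┐
│   Model Serving and Inference Infrastructure    │
│  • Batching, caching, hardware scheduling       │
└─────────────────────────────────────────────────┘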

Core Design Principles

Execution time intervention
- Must be able to modify input, control flow, or state while the agent is running
- Static orchestration logic cannot handle execution-time failures
Long-horizon state awareness
- Execution history tracking across multiple steps
- Detection of cumulative errors and efficiency losses
Closed loop control
- Continuous feedback loop of observation and action
- Dynamic adjustment based on real-time execution signals
Model-agnostic operation
- Does not rely on a specific model architecture
- General-purpose execution control interface

Three-layer governance model

According to recent research, modern AI agent security adopts a three-layer governance model:

First layer: Deterministic Governance

Base layer: What can the agent access?

Identity-based access control
Static permission configuration
Role-based access control (RBAC)
Principle of least privilege

Features:

Predefined policy engine
Static access permissions
Predictable behavior

Limitations:

Agents may still do “unexpected” things within the scope of permission
Unpredictability of non-deterministic behavior
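
A deterministic check of this kind reduces to a table lookup. The sketch below is illustrative only; the role names, permission strings, and the `is_allowed` helper are hypothetical and not taken from any particular product.

```python
from typing import Dict, FrozenSet

# Hypothetical static role-to-permission table (least privilege: list only what is needed).
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "report-agent": frozenset({"db:read", "report:write"}),
    "deploy-agent": frozenset({"deploy:run", "deploy:rollback"}),
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are rejected."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())
```

The answer depends only on the role and the permission, which is what makes this layer predictable. It also shows the limitation noted above: `is_allowed("report-agent", "db:read")` is true whether the agent reads one row or the whole customer table.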

Second layer: Non-Deterministic Behavioral Analysis

Visibility layer: What is the agent doing, and why?

Continuous behavior monitoring
Anomaly detection
Real-time observability
Dynamic calculation of risk scores

Features:

Intent drift detection
Behavioral pattern analysis
Visibility into the current execution state

Output:

Real-time risk scoring
Intent analysis results
Behavior deviation alerts
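
One simple way to turn behavior monitoring into a dynamic risk score is sketched below. It is an assumption-laden toy: deviation is measured as the share of tool calls outside an expected set, and the score is smoothed with an exponential moving average. The function names and the smoothing factor `alpha` are invented for the example.

```python
from typing import Iterable, Set


def deviation(tool_calls: Iterable[str], expected_tools: Set[str]) -> float:
    """Fraction of tool calls that fall outside the expected tool set."""
    calls = list(tool_calls)
    if not calls:
        return 0.0
    unexpected = sum(1 for call in calls if call not in expected_tools)
    return unexpected / len(calls)


def update_risk(previous: float, observed: float, alpha: float = 0.5) -> float:
    """Exponential moving average, so one odd step does not dominate the score."""
    return (1 - alpha) * previous + alpha * observed


observed = deviation(
    ["search", "search", "delete_table", "send_email"],
    expected_tools={"search", "summarize"},
)
risk = update_risk(previous=0.0, observed=observed)
```

A real system would combine more signals than tool names, but the shape is the same: per-step observations feed a running score that the governance layer can act on.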

Third layer: Non-Deterministic Governance

Execution layer: Should the agent continue to run?

Dynamic decision-making based on real-time risk scores
Intent verification rather than authorization verification
Real-time escalation or human-review triggers

Core capabilities:

Intent-based authorization
- Verifies not only the agent’s identity but also its intent
- The same API call carries a different risk when a human initiates it than when an agent decides on it autonomously
Dynamic control and escalation
- Immediate decisions based on real-time risk signals
- Escalation or human review triggered by risk rather than on a fixed schedule
- Continuous behavioral evaluation throughout execution
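
The decision step at this layer can be sketched as a function of the risk score and of who initiated the action. The thresholds, the fixed penalty for autonomous actions, and the `decide` function itself are assumptions chosen for illustration, not values from the cited research.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REVIEW = "human_review"
    BLOCK = "block"


def decide(risk: float, initiated_by_human: bool,
           review_at: float = 0.4, block_at: float = 0.8) -> Decision:
    """Map a risk score in [0, 1] to allow / human review / block."""
    # Assumption: an autonomous action is treated as riskier than the same
    # call made at a human's explicit request, so it gets a fixed penalty.
    effective = risk if initiated_by_human else min(1.0, risk + 0.2)
    if effective >= block_at:
        return Decision.BLOCK
    if effective >= review_at:
        return Decision.REVIEW
    return Decision.ALLOW
```

With these assumed thresholds, a score of 0.3 passes when a human initiated the call but is escalated to review when the agent chose it on its own, which is the asymmetry the second bullet describes.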

Key Challenges:

False Positive: Mistaking legitimate behavior for a threat
False Negative: Failure to detect a real security risk
Trust erosion: Excessive blocking makes the agent less usable

Constraint manifold: safe execution-time projection

The Auton Agentic AI Framework proposes the Constraint Manifold formalism:

Problem: probabilistic output vs deterministic requirements

LLMs produce probabilistic, unstructured output
Backend systems require deterministic, schema-conforming input
Together these constitute the “Integration Paradox”

Solution: policy projection instead of post-filtering

Constraint manifold:

Defines a safe subspace of the behavior space
Projects the agent’s policy onto that subspace
Excludes unsafe behavior by construction rather than filtering it by detection

Advantages:

Privilege escalation and dangerous operations are excluded by construction
No need for detection after generation
An earlier point of defense
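
To make "projection instead of post-filtering" concrete, here is a deliberately simple stand-in: mask every action outside a safe set and renormalize the remaining probabilities. This is not the Auton framework's formalism, which the article does not spell out; the `project` function and the action names are hypothetical.

```python
from typing import Dict, Set


def project(policy: Dict[str, float], safe_actions: Set[str]) -> Dict[str, float]:
    """Restrict an action distribution to the safe set and renormalize it."""
    kept = {action: p for action, p in policy.items()
            if action in safe_actions and p > 0}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no safe action has non-zero probability")
    return {action: p / total for action, p in kept.items()}


policy = {"read_file": 0.5, "sudo_rm": 0.25, "list_dir": 0.25}
projected = project(policy, safe_actions={"read_file", "list_dir"})
```

After projection the unsafe action cannot be sampled at all, so there is nothing left for a post-generation filter to catch. The relative preference among the safe actions is preserved.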

Implementation levels

Language layer: Define security constraints using formal specifications
Code layer: Encode constraints into executable checkpoints
Execution layer: Projection verification at runtime

Adaptive Focus Memory (AFM) and VIGIL

Adaptive Focus Memory

Core Concept:

Active memory management at the runtime layer
Dynamically compress and allocate context information
Memory focus based on task importance

Mechanism:

Track execution history and context
Identify key decision points
Dynamically adjust memory compression strategy
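
The idea of importance-based memory focus can be sketched as follows. This is a toy under stated assumptions, not the AFM algorithm: importance scores are assumed to come from elsewhere, cost is counted in words rather than tokens, and compressed items are reduced to a short stub that is not charged against the budget.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryItem:
    text: str
    importance: float  # assumed to be scored elsewhere, in [0, 1]


def focus(items: List[MemoryItem], budget_words: int, stub_words: int = 3) -> List[str]:
    """Keep the most important items verbatim within a word budget; stub the rest."""
    remaining = budget_words
    keep_full = set()
    ranked = sorted(range(len(items)), key=lambda i: items[i].importance, reverse=True)
    for idx in ranked:
        cost = len(items[idx].text.split())
        if cost <= remaining:
            keep_full.add(idx)
            remaining -= cost
    context = []
    for idx, item in enumerate(items):  # original order is preserved
        if idx in keep_full:
            context.append(item.text)
        else:
            context.append(" ".join(item.text.split()[:stub_words]) + " [compressed]")
    return context


history = [
    MemoryItem("user asked for a quarterly revenue report", 0.9),
    MemoryItem("tool returned a long table of unrelated log lines", 0.1),
    MemoryItem("constraint: never email raw customer data", 1.0),
]
context = focus(history, budget_words=13)
```

Here the safety constraint and the user request stay verbatim, while the low-importance tool output is reduced to a stub.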

VIGIL

Monitoring system:

Long-horizon execution state monitoring
Detection of cumulative failure modes
Timely triggering of recovery mechanisms

Features:

Continuous monitoring while the agent is running
Identify early signs of execution deviation
Trigger necessary recovery or rollback
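
Cumulative failure detection with a rollback trigger can be illustrated with a sliding window over step outcomes. This sketch does not reproduce VIGIL; the `DriftMonitor` class, the window size, and the failure threshold are assumptions for the example.

```python
from collections import deque
from typing import Deque, Optional


class DriftMonitor:
    """Hypothetical monitor: counts failures over a sliding window of steps."""

    def __init__(self, window: int = 5, max_failures: int = 3) -> None:
        self.outcomes: Deque[bool] = deque(maxlen=window)
        self.max_failures = max_failures
        self.checkpoint: Optional[dict] = None

    def save_checkpoint(self, state: dict) -> None:
        # Shallow copy: enough for flat state, not for nested mutable values.
        self.checkpoint = dict(state)

    def record(self, ok: bool) -> bool:
        """Record one step outcome; return True when a rollback should be triggered."""
        self.outcomes.append(ok)
        failures = sum(1 for outcome in self.outcomes if not outcome)
        return failures >= self.max_failures


monitor = DriftMonitor(window=5, max_failures=3)
monitor.save_checkpoint({"step": 0})
triggers = [monitor.record(ok) for ok in [True, False, True, False, False]]
```

No single failure trips the monitor; the rollback signal appears only once failures accumulate within the window, at which point the caller would restore `monitor.checkpoint`.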

Practical application scenarios

Scenario 1: Data Analysis Agent

Task: Query multiple data sources, integrate the results, and generate a report

Runtime Challenge:

Unpredictable changes in data sources
Accumulation of errors in intermediate queries
Context window limits
Security constraints for report generation

Runtime layer intervention:

Monitor the response time and data quality of each query
Detect query pattern anomalies (such as SQL injection)
Dynamically adjust context window usage
Trigger report rewrites when data contamination is discovered

Scenario 2: Customer Service Agent

Task: Handle user inquiries, find information, and provide support

Runtime Challenge:

Complexity and variety of user prompts
Intent drift and misunderstanding
Handling of sensitive information
Context-sensitive responses

Runtime layer intervention:

Real-time monitoring of user interaction patterns
Detect leaks of sensitive information
Redirect the conversation when a deviation in intent is detected
Dynamically adjust the agent’s knowledge base access
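
As an illustration of the sensitive-information check, the sketch below redacts e-mail addresses and card-like digit runs from an outgoing reply. The two regular expressions are simplified assumptions and would miss many real-world formats; production systems typically use dedicated detectors.

```python
import re
from typing import Tuple

# Simplified, assumed patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def redact(text: str) -> Tuple[str, int]:
    """Replace matches with placeholders and report how many were found."""
    text, n_email = EMAIL.subn("[EMAIL]", text)
    text, n_card = CARD.subn("[CARD]", text)
    return text, n_email + n_card


reply, hits = redact("Contact jane@example.com, card 4111 1111 1111 1111.")
```

A non-zero `hits` count is the kind of signal the runtime layer could use to block the reply or raise the agent's risk score.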

Scenario 3: DevOps Agent

Task: Automated deployment, configuration management and system monitoring

Runtime Challenge:

Real-time changes in system status
Cascading effects of deployment failures
Rapid propagation of configuration errors
Dynamic updates of security policies

Runtime layer intervention:

Real-time monitoring of system metrics and logs
Detect configuration anomalies and deployment failures
Trigger rollback when problems are discovered
Dynamically adjust deployment strategies
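
The rollback trigger in this scenario can be sketched as a deploy step guarded by a health check. Everything here is hypothetical: the `deploy_with_rollback` helper, the version list, and the `healthy` callback stand in for whatever the real deployment tooling provides, and the version history is assumed to be non-empty.

```python
from typing import Callable, List


def deploy_with_rollback(versions: List[str], new_version: str,
                         healthy: Callable[[str], bool]) -> str:
    """Deploy new_version; revert to the previous one if the health check fails.

    Returns the version that is active afterwards.
    """
    previous = versions[-1]       # assumes at least one version is already deployed
    versions.append(new_version)  # "deploy"
    if not healthy(new_version):
        versions.pop()            # "rollback"
        return previous
    return new_version


history = ["v1"]
active_after_bad = deploy_with_rollback(history, "v2", healthy=lambda v: v != "v2")
active_after_good = deploy_with_rollback(history, "v3", healthy=lambda v: True)
```

The failed release never stays in the history, so a later deployment builds on the last known-good version.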

Key implementation points

1. Observation Layer

Monitoring content:

Intermediate model output
Tool call results
Memory usage
Policy constraints

Technology:

Behavior tracing
Performance metric collection
Logging and tracing systems

2. Reasoning Layer

Analysis capabilities:

Anomaly detection
Intent analysis
Risk assessment
Predictive analytics

Algorithms:

Machine learning models
Rule engines
Statistical analysis

3. Intervention Layer

Methods of intervention:

Modify agent input
Adjust control flow
Trigger recovery
Enforce policy

Technology:

Control plane
Rollback mechanism
Policy engine
State management

4. Evaluation Layer

Evaluation metrics:

Success rate
Execution latency
Token usage
Number of security incidents
User satisfaction
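
Most of these metrics can be aggregated directly from per-run records. The sketch below assumes a hypothetical `RunRecord` shape; user satisfaction is left out because it usually comes from a separate survey or feedback channel.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class RunRecord:
    success: bool
    latency_s: float
    tokens: int
    security_incidents: int


def summarize(runs: List[RunRecord]) -> Dict[str, float]:
    """Aggregate per-run records into the evaluation metrics listed above."""
    return {
        "success_rate": sum(r.success for r in runs) / len(runs),
        "mean_latency_s": mean(r.latency_s for r in runs),
        "total_tokens": sum(r.tokens for r in runs),
        "security_incidents": sum(r.security_incidents for r in runs),
    }


summary = summarize([
    RunRecord(success=True, latency_s=1.0, tokens=100, security_incidents=0),
    RunRecord(success=False, latency_s=3.0, tokens=300, security_incidents=1),
])
```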

Feedback Loop:

Continuous monitoring and learning
Policy optimization
Self-improvement

Future Directions

1. Self-learning runtime policies

Learn from historical execution
Dynamically adjust policy strictness
Weight adjustment based on real-time data

2. Multi-agent collaboration security

Runtime coordination across agents
Joint risk assessment
Dynamic permission assignment

3. Explainable runtime decisions

Interpretability of risk scores
Transparency of decision-making processes
Visibility of human intervention points

4. Dynamic constraint compilation

Just-in-time compilation of security constraints
Dynamic constraints based on execution context
Constraint optimization

Conclusion

The future of AI agent security lies not in more powerful models, but in a smarter runtime control layer.

From static constraints to dynamic runtime adaptation, we are experiencing a fundamental shift in security frameworks:

From authorization verification to intent assessment
From predefined rules to dynamic strategies
From post-filtering to execution-time intervention

This is not only technological progress but also an evolution in security philosophy: from “What can be done?” to “What should be done?”

AI Runtime Infrastructure represents the infrastructure requirements for the next generation of production-grade AI agents, while Dynamic Constitutional AI is the key path to achieving trustworthy, reliable, and safe AI agents.

References

AI Runtime Infrastructure (arXiv:2603.00495) - February 2026
The Auton Agentic AI Framework (arXiv:2602.23720) - February 2026
Runtime Security for AI Agents: An Identity Governance Perspective - March 2026