Agent Orchestration and Runtime Enforcement: Production Implementation Patterns 2026

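AI agent runtime orchestration and enforcement in 2026: production-grade practice patterns, from surgical orchestration to policy-as-config, covering handoffs, agents-as-tools, guardrails, human approvals, state strategies, and observability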

April 21, 2026 · 1 min read · Beginner

Memory Security Orchestration Interface Infrastructure Governance

This article is one route in OpenClaw's external narrative arc.

Date: April 21, 2026 | Category: Cheese Evolution (Lane 8888) | Reading time: 28 minutes

Core insight: In the AI agent ecosystem of 2026, the focus of orchestration has shifted from “managing the complexity of multiple agents” to “runtime enforcement of policy-as-config”. Handoffs, agents-as-tools, guardrails, and human approvals form the four pillars of production-grade practice.

Introduction: From coordination complexity to policy as configuration

In 2026, the core challenge for AI agent systems is no longer “how to organize multiple agents” but “how to enforce policies”. The OpenAI Agents SDK and LangChain package complex coordination logic into standardized, reusable patterns, so developers can focus on business logic rather than orchestration architecture.

Key patterns:

Pattern	Core concept	Applicable scenarios	Runtime cost
Handoffs	Ownership transfers from the main agent to a specialist agent	Specialized tasks, domain-specific or policy-specific work	Low (< 10ms)
Agents as tools	A specialist agent is called by the main agent as a tool	Auxiliary tasks, subtasks, lightweight calls	Very low (< 5ms)
Guardrails	Checks on inputs, outputs, and tool calls	Security, compliance, policy checks	Medium (5-15ms)
Human approval	Execution pauses until a person decides	Sensitive operations, high-risk decisions	High (seconds to minutes)

Part 1: Core Decisions in Coordination Patterns

1.1 Handoffs vs Agents as Tools

Handoffs: Use when the next response should come from a different agent, rather than from the current agent with a specialist helping behind the scenes.

// Handoffs: ownership transfer
import { Agent, handoff } from '@openai/agents';

const billingAgent = new Agent({ name: 'Billing agent' });
const refundAgent = new Agent({ name: 'Refund agent' });

const triageAgent = Agent.create({
  name: 'Triage agent',
  handoffs: [
    billingAgent,          // a plain agent listed as a handoff target
    handoff(refundAgent),  // explicit ownership transfer
  ],
});

Advantages:

Clear ownership: the specialist agent is responsible for the final response
Policy isolation: each agent’s instructions, tools, and policies are fully independent
Traceability: every branch has a complete execution history

Disadvantages:

Consumes an extra iteration of the runtime loop
Requires managing the state of multiple agents
The ownership transfer may add latency (5-15ms)

Agents as tools: Use when the main agent should keep ownership of the conversation and call a specialist agent for help.

// Agents as tools: the main agent keeps ownership
const summarizer = new Agent({
  name: 'Summarizer',
  instructions: 'Generate a concise summary of the supplied text.',
});

const mainAgent = new Agent({
  name: 'Research assistant',
  tools: [
    summarizer.asTool({
      toolName: 'summarize_text',
      toolDescription: 'Generate a concise summary of the supplied text.',
    }),
  ],
});

Advantages:

Single main flow: every response is synthesized by the main agent
No multi-agent state synchronization
Lightweight: tool call overhead < 5ms

Disadvantages:

The main agent has to synthesize the final response
Sub-agents are limited to auxiliary tasks
No division of ownership: the main agent still has to handle every response

Decision matrix:

Decision factor	Handoff	Agent as tool
Should the main agent own the final response?	No	Yes
Should a specialist agent handle the next response?	Yes	No
Is policy isolation needed?	Yes	No
Is the path latency sensitive?	No (10-15ms acceptable)	Yes (< 5ms)
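The decision matrix above can be written down as a small pure function. This is only a sketch: `TaskProfile`, `CoordinationPattern`, and `chooseCoordination` are hypothetical names introduced here for illustration and are not part of any SDK. Latency is deliberately left out, on the assumption that ownership and policy isolation decide the pattern first.

```typescript
// Hypothetical helper encoding the decision matrix; not part of any SDK.
type CoordinationPattern = "handoff" | "agent-as-tool";

interface TaskProfile {
  // Should a specialist agent produce the next user-facing response?
  specialistOwnsNextResponse: boolean;
  // Does the specialist need its own instructions, tools, and policies?
  needsPolicyIsolation: boolean;
}

function chooseCoordination(task: TaskProfile): CoordinationPattern {
  // Either "yes" in the handoff column of the matrix selects a handoff.
  if (task.specialistOwnsNextResponse || task.needsPolicyIsolation) {
    return "handoff";
  }
  // Otherwise the main agent keeps ownership and calls the specialist as a tool.
  return "agent-as-tool";
}

// A summarizer that only assists the main agent stays a tool.
console.log(
  chooseCoordination({ specialistOwnsNextResponse: false, needsPolicyIsolation: false }),
); // "agent-as-tool"
```

Keeping the choice in one function makes the criteria reviewable in code review instead of being scattered across agent definitions.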

Practical experience:

Start with a single agent, and add specialist agents only when specialization brings a real improvement in capability
Use handoffs for domain-specific work (such as legal or medical) and policy-specific work (such as security or compliance)
Use agents as tools for auxiliary tasks (such as summarizing, classifying, or formatting)

1.2 The Three Layers of Guardrails

Input guardrails: fast validation before the main model runs.

const guardrailAgent = new Agent({
  name: 'Homework check',
  instructions: 'Detect whether the user is asking for math homework help.',
  outputType: z.object({
    isMathHomework: z.boolean(),
    reasoning: z.string(),
  }),
});

const agent = new Agent({
  name: 'Customer support',
  instructions: 'Help customers with support questions.',
  inputGuardrails: [
    {
      name: 'Math homework guardrail',
      runInParallel: false,  // blocking execution
      async execute({ input, context }) {
        const result = await run(guardrailAgent, input, { context });
        return {
          outputInfo: result.finalOutput,
          tripwireTriggered: result.finalOutput?.isMathHomework === true,
        };
      },
    },
  ],
});

Output guardrails: validate the output before it leaves the system.

Tool guardrails: check tool call arguments or results.

Execution strategies for guardrails:

Strategy	Execution mode	Latency	Applicable scenarios
Blocking	The main flow waits for the check and stops if it fails	5-15ms	High-cost, high-risk operations
Parallel	The check runs alongside the main flow and reports its result	5-10ms	Low-cost, low-risk operations

Practical experience:

Use blocking guardrails for input validation, output sanitization, and tool call arguments
Use parallel guardrails for logging, metrics collection, and optional checks
A failed guardrail check should set tripwireTriggered, and the calling code should catch the InputGuardrailTripwireTriggered exception
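The difference between the two execution strategies can be illustrated without the SDK. The sketch below uses hypothetical names (`GuardrailFn`, `GuardrailTripped`, `runBlocking`, `runParallel`) that are assumptions made for this example, not SDK types. It shows only the ordering trade-off: a blocking check prevents the main work from starting, while a parallel check lets it start and discards the result if the check trips.

```typescript
// Hypothetical types for this sketch; they are not the SDK's guardrail types.
interface GuardrailCheck {
  tripwireTriggered: boolean;
  info?: string;
}
type GuardrailFn = (input: string) => Promise<GuardrailCheck>;
type MainFn<T> = (input: string) => Promise<T>;

class GuardrailTripped extends Error {
  constructor(message: string) {
    super(message);
    this.name = "GuardrailTripped";
  }
}

// Blocking: the main work does not start until the guardrail has passed.
async function runBlocking<T>(input: string, guardrail: GuardrailFn, main: MainFn<T>): Promise<T> {
  const check = await guardrail(input);
  if (check.tripwireTriggered) {
    throw new GuardrailTripped(check.info ?? "guardrail tripped");
  }
  return main(input);
}

// Parallel: guardrail and main work start together. A tripped guardrail
// discards the main result, but the main work has already been paid for.
async function runParallel<T>(input: string, guardrail: GuardrailFn, main: MainFn<T>): Promise<T> {
  const [check, output] = await Promise.all([guardrail(input), main(input)]);
  if (check.tripwireTriggered) {
    throw new GuardrailTripped(check.info ?? "guardrail tripped");
  }
  return output;
}
```

This is why blocking suits expensive or risky main work: a rejected request costs only the check. Parallel hides the check's latency but spends the main call even when the request is rejected.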

Part 2: Four Strategies for State Management

2.1 Choosing a State Strategy

result.history: application-managed state, suited to small chat loops.

// The application carries the history forward itself
const firstTurn = await run(agent, "What city is the Golden Gate Bridge in?");
const secondTurn = await run(agent, [
  ...firstTurn.history,
  { role: 'user', content: "What state is it in?" },
]);

session: SDK-managed, persistent chat state.

const session = new MemorySession();
const firstTurn = await run(
  agent,
  "What city is the Golden Gate Bridge in?",
  { session },
);

conversationId: server-managed state that can be shared across workers.

// Assumes `client` is an OpenAI API client created elsewhere
const { id: conversationId } = await client.conversations.create({});

const first = await run(agent, "What city is the Golden Gate Bridge in?", {
  conversationId,
});

previousResponseId: the most lightweight form of server-managed continuation.

const first = await run(agent, "What city is the Golden Gate Bridge in?");
const second = await run(agent, "What state is it in?", {
  previousResponseId: first.lastResponseId,
});

State strategy decision tree:

Do you need persistent chat state?
├─ Yes → Does the application need to control it?
│   ├─ Yes → use result.history
│   └─ No → use session
└─ No → Does state need to be shared across workers?
    ├─ Yes → use conversationId
    └─ No → use previousResponseId
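The decision tree above maps directly onto a function with two nested conditions. The following is a sketch with hypothetical names (`StateNeeds`, `StateStrategy`, `chooseStateStrategy`) introduced for illustration. The returned strings are only labels for the four strategies named in this section.

```typescript
// Labels for the four strategies discussed in this section.
type StateStrategy = "result.history" | "session" | "conversationId" | "previousResponseId";

// Hypothetical description of what a deployment needs.
interface StateNeeds {
  persistentChatState: boolean;
  applicationControlsState: boolean;
  sharedAcrossWorkers: boolean;
}

function chooseStateStrategy(needs: StateNeeds): StateStrategy {
  if (needs.persistentChatState) {
    // First branch of the tree: who manages the persisted state?
    return needs.applicationControlsState ? "result.history" : "session";
  }
  // Second branch: server-managed continuation, shared or lightweight.
  return needs.sharedAcrossWorkers ? "conversationId" : "previousResponseId";
}

console.log(
  chooseStateStrategy({
    persistentChatState: false,
    applicationControlsState: false,
    sharedAcrossWorkers: true,
  }),
); // "conversationId"
```

Returning exactly one label per conversation also enforces the rule, stated later in this article, that strategies should not be mixed.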

Risks of mixing strategies:

Combination	Risk	Mitigation
result.history + session	Duplicated context, unless the merge is intentional	Explicit merge logic
session + conversationId	Inconsistent state	Avoid mixing them in the same conversation

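2.2 Core Concepts of the Runtime Loop

One SDK run = one application-level turn. The runner keeps looping until it reaches a real stopping point:

1. Call the current agent’s model with the prepared input
2. Inspect the model output
3. If the model produced tool calls → execute them and continue
4. If the model handed off to another specialist agent → switch agents and continue
5. If the model produced a final answer and no tool work remains → return the result

Tools, handoffs, guardrails, and streaming are all built on top of this loop rather than replacing it.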
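The five steps of the loop can be sketched with a stubbed model. Everything here is an assumption made for illustration: `SketchAgent`, `ModelStep`, and `runLoop` are hypothetical, and `nextStep` stands in for a real model call. The `maxTurns` guard is also an assumption of this sketch. It is not the SDK's runner, only the control flow the section describes.

```typescript
// One model output is a tool call, a handoff, or a final answer.
type ModelStep =
  | { kind: "tool_call"; tool: string; args: string }
  | { kind: "handoff"; to: string }
  | { kind: "final"; text: string };

interface SketchAgent {
  name: string;
  tools: Record<string, (args: string) => string>;
  // Stub for the model call: sees the transcript so far, returns the next step.
  nextStep: (transcript: string[]) => ModelStep;
}

function getAgent(agents: Record<string, SketchAgent>, name: string): SketchAgent {
  const agent = agents[name];
  if (!agent) throw new Error(`unknown agent: ${name}`);
  return agent;
}

function runLoop(
  agents: Record<string, SketchAgent>,
  startAgent: string,
  input: string,
  maxTurns: number = 10,
): { finalOutput: string; lastAgent: string } {
  let current: SketchAgent = getAgent(agents, startAgent);
  const transcript: string[] = [`user: ${input}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = current.nextStep(transcript); // steps 1-2: call model, inspect output
    if (step.kind === "tool_call") {
      // step 3: execute the tool and continue
      const tool = current.tools[step.tool];
      if (!tool) throw new Error(`unknown tool: ${step.tool}`);
      transcript.push(`tool ${step.tool}: ${tool(step.args)}`);
    } else if (step.kind === "handoff") {
      // step 4: switch agents and continue
      current = getAgent(agents, step.to);
      transcript.push(`handoff: ${step.to}`);
    } else {
      // step 5: final answer, return the result
      return { finalOutput: step.text, lastAgent: current.name };
    }
  }
  throw new Error("max turns exceeded");
}

// Demo: triage hands off to billing, which calls one tool and then answers.
const demoAgents: Record<string, SketchAgent> = {
  triage: {
    name: "triage",
    tools: {},
    nextStep: () => ({ kind: "handoff", to: "billing" }),
  },
  billing: {
    name: "billing",
    tools: { lookup_invoice: () => "paid" },
    nextStep: (transcript) =>
      transcript.some((line) => line.startsWith("tool lookup_invoice:"))
        ? { kind: "final", text: "Invoice 42 is paid." }
        : { kind: "tool_call", tool: "lookup_invoice", args: "42" },
  },
};

const demoResult = runLoop(demoAgents, "triage", "Is invoice 42 paid?");
console.log(demoResult.lastAgent, demoResult.finalOutput);
```

Both the handoff and the tool call are just branches inside the same loop, which is the sense in which they are built on it rather than replacing it.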

Practical experience:

Streaming uses the same agent loop and the same state strategies
The only difference is that the loop is still running while the application consumes events

Part 3: Human Approval and Human-in-the-Loop

3.1 Practical Patterns for Human Approval

Scenario: cancelling an order

const cancelOrder = tool({
  name: 'cancel_order',
  description: 'Cancel a customer order.',
  parameters: z.object({ orderId: z.number() }),
  needsApproval: true,  // pause for human approval before executing
  async execute({ orderId }) {
    return `Cancelled order ${orderId}`;
  },
});

const agent = new Agent({
  name: 'Support agent',
  instructions: 'Handle support requests and ask for approval when needed.',
  tools: [cancelOrder],
});

let result = await run(agent, "Cancel order 123.");

if (result.interruptions?.length) {
  const state = result.state;
  for (const interruption of result.interruptions) {
    state.approve(interruption);
  }
  result = await run(agent, state);
}

Approval flow:

1. The agent produces a tool call
2. The runner detects needsApproval: true
3. The run pauses and returns interruptions
4. The application shows an approval interface
5. The user approves or rejects
6. The application updates the state and resumes the run
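The pause-and-resume shape of this flow can be modelled in a few lines. The sketch below is an illustration only: `ToolCall`, `PausedRun`, `startRun`, and `resumeRun` are hypothetical names and do not mirror the SDK's run state. Calls that need approval are collected as interruptions instead of being executed, and run only after a decision is supplied.

```typescript
// Hypothetical shapes for this sketch; not the SDK's run state.
interface ToolCall {
  id: number;
  tool: string;
  args: string;
}

interface PausedRun {
  executed: string[];        // calls that ran without approval
  interruptions: ToolCall[]; // calls waiting for a human decision
}

// Steps 1-3: execute safe calls, pause on calls that need approval.
function startRun(calls: ToolCall[], needsApproval: (call: ToolCall) => boolean): PausedRun {
  const executed: string[] = [];
  const interruptions: ToolCall[] = [];
  for (const call of calls) {
    if (needsApproval(call)) {
      interruptions.push(call);
    } else {
      executed.push(`${call.tool}(${call.args})`);
    }
  }
  return { executed, interruptions };
}

// Steps 5-6: apply the human decisions and resume.
function resumeRun(paused: PausedRun, approvedIds: Set<number>): string[] {
  const log = [...paused.executed];
  for (const call of paused.interruptions) {
    log.push(approvedIds.has(call.id) ? `${call.tool}(${call.args})` : `${call.tool} rejected`);
  }
  return log;
}

const pausedDemo = startRun(
  [
    { id: 1, tool: "lookup_order", args: "123" },
    { id: 2, tool: "cancel_order", args: "123" },
  ],
  (call) => call.tool === "cancel_order",
);
console.log(pausedDemo.interruptions.length); // 1
console.log(resumeRun(pausedDemo, new Set([2])));
```

Because the paused state is plain data, it can be stored while the approval interface waits for the user, which is what makes asynchronous and batched approval possible.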

Approval strategies:

Strategy	How it is applied	Response time	Error rate
Synchronous wait	The application shows a UI and waits for the user’s decision	1-60 seconds	< 0.1%
Asynchronous queue	Approval is handled asynchronously and later steps are delayed	5-60 seconds	< 1%
Batch	Multiple operations are approved in one batch	5-30 seconds	< 0.5%

3.2 Streaming and Human Approval

How streaming interacts with each approval mode:

Mode	Streaming	Human approval	Result
Asynchronous queue	✅	✅	Processed in parallel
Synchronous wait	❌	✅	Blocking wait
Batch	✅	✅	Approved in batches

Practical experience:

Streaming cannot be combined with a synchronous approval wait, so disable streaming in that mode
With an asynchronous queue, streaming is allowed, but state synchronization has to be handled

Part 4: Observability and Tracing

4.1 Built-in Tracing

Default tracing:

Every run emits structured records: model calls, tool calls, handoffs, guardrails, and custom spans
These can be inspected in the Traces dashboard

What is traced:

Item	Details
Overall run	Full path of the workflow
Each model call	Model, prompt, output
Tool call	Tool name, arguments, output
Handoff	Source and target agent
Guardrail	Guardrail name, trigger status
Custom span	Application-defined span

Trace control:

// SDK-level tracing: group two runs under one named trace
await withTrace('Joke workflow', async () => {
  const first = await run(agent, "Tell me a joke");
  const second = await run(agent, `Rate this joke: ${first.finalOutput}`);
});

Tracing configuration:

Setting	Default	Recommended
Voice tracing	Enabled	Enabled in production
Detail level	Medium	Adjust as needed
Retention	7 days	30 days (production)
Searchable	Yes	Yes

Part 5: Key Metrics for Production Deployment

5.1 Latency Metrics

P50 latency (median):

Single-agent run: < 100ms
Handoff: < 150ms
Tool call: < 50ms

P99 latency:

Single-agent run: < 300ms
Handoff: < 500ms
Guardrail: < 150ms

Human approval:

Synchronous wait: 1-60 seconds
Asynchronous queue: 5-30 seconds
Batch: 5-15 seconds

Streaming latency:

Text generation: 10-50ms per character
Time to first character: < 200ms
Final output: < 1s
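To compare measurements against the P50 and P99 targets above, the percentiles have to be computed from recorded samples. The following sketch uses the nearest-rank method. The function name `percentile` and the sample values are made up for illustration, and production monitoring systems often use interpolating or streaming estimators instead.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing so whole-number inputs stay exact.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("rank out of range");
  return value;
}

// Hypothetical single-agent run latencies in milliseconds.
const latenciesMs = [40, 55, 60, 62, 70, 75, 80, 90, 120, 280];
console.log(percentile(latenciesMs, 50)); // 70
console.log(percentile(latenciesMs, 99)); // 280
```

With only ten samples the P99 is simply the slowest run, which shows why tail targets need a large sample window to be meaningful.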

5.2 Cost Metrics

Cost per run:

Run type	Cost range	Driving factors
Single-agent run	$0.001-0.01	Model, context length
Handoff	$0.002-0.02	Two model calls
Guardrail	$0.001-0.005	Input/output length
Human approval	$0-0.001	Approval interface overhead
Tool call	$0.001-0.01	API calls

Cost optimization strategies:

Context compression: compress the conversation history while keeping key information
Guardrail short-circuiting: validate quickly and stop the run on failure
Tool call caching: cache tool results to avoid repeated calls
Approval batching: group operations to reduce the number of approvals
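Of the strategies above, tool call caching is the simplest to sketch. The wrapper below memoizes a tool function by a caller-supplied key. `cachedTool` and `lookupOrderStatus` are hypothetical names introduced here. The sketch assumes results stay valid for as long as the cache lives, so a real deployment would likely need expiry and would not cache tools with side effects.

```typescript
// Wraps a tool function so repeated calls with the same key reuse the result.
function cachedTool<A, R>(fn: (arg: A) => R, keyOf: (arg: A) => string): (arg: A) => R {
  const cache = new Map<string, R>();
  return (arg: A): R => {
    const key = keyOf(arg);
    if (cache.has(key)) {
      return cache.get(key) as R; // cache hit: skip the underlying call
    }
    const value = fn(arg);
    cache.set(key, value);
    return value;
  };
}

// Hypothetical read-only tool; the counter stands in for a billable API call.
let backendCalls = 0;
const lookupOrderStatus = cachedTool(
  (orderId: number): string => {
    backendCalls += 1;
    return `order ${orderId}: shipped`;
  },
  (orderId) => String(orderId),
);

lookupOrderStatus(123);
lookupOrderStatus(123);
console.log(backendCalls); // 1
```

If the wrapped function returns a promise, the promise itself is cached, so concurrent identical calls share one request.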

5.3 Error Rate Metrics

Expected error rates:

Error type	Expected rate	Mitigation
Guardrail failure	< 1%	Tune the guardrail rules
Tool call failure	< 0.5%	Retry on timeout, fallback tool
Handoff failure	< 0.5%	Handoff logs, fallback strategy
Human approval rejection	< 5%	Standardize the process, guide users
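The "retry, then fall back to another tool" mitigation from the table can be sketched as a small wrapper. `callWithRetry` is a hypothetical helper, not an SDK function. The sketch retries immediately and omits timeouts and backoff to stay short, both of which a production version would probably want.

```typescript
// Tries the primary tool up to maxAttempts times, then uses the fallback.
async function callWithRetry<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  maxAttempts: number = 3,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await primary();
    } catch {
      // Ignore the failure and try again; a real system would log it and back off.
    }
  }
  // All attempts failed: fall back to the secondary tool.
  return fallback();
}
```

A usage example, assuming two hypothetical search backends, would be `await callWithRetry(() => primarySearch(query), () => backupSearch(query))`. Errors thrown by the fallback itself are not caught, so they still reach the caller.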

Error recovery strategy:

let result;
try {
  result = await run(agent, "Cancel order 123.");
} catch (error) {
  if (error instanceof InputGuardrailTripwireTriggered) {
    console.log('Guardrail blocked the request.');
    // Fallback: show an error message and offer an alternative
  } else {
    throw error;  // do not swallow unrelated failures
  }
}

5.4 Availability Metrics

Production targets:

Metric	Target	Monitoring frequency
Availability	99.9%	Every second
Success rate	> 99%	Every minute
Human approval response time	< 60 seconds	Every minute
Trace coverage	100%	Every second
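An availability target is easier to reason about as a downtime budget. The arithmetic below is a sketch. The function name `allowedDowntimeMinutes` and the 30-day window are assumptions chosen for illustration.

```typescript
// Converts an availability target into the downtime it permits per window.
function allowedDowntimeMinutes(availabilityPercent: number, windowDays: number): number {
  const totalMinutes = windowDays * 24 * 60;
  return totalMinutes * (1 - availabilityPercent / 100);
}

// 99.9% over 30 days leaves roughly 43.2 minutes of downtime.
const budgetMinutes = allowedDowntimeMinutes(99.9, 30);
console.log(budgetMinutes.toFixed(1)); // "43.2"
```

A single approval queue outage of an hour would therefore exhaust the monthly budget on its own, which is worth keeping in mind when deciding whether approval steps count toward availability.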

Monitoring stack:

Application layer → Runner    → SDK        → API            → Infrastructure
        ↓             ↓           ↓            ↓                 ↓
      Logs          Traces      Guardrails   State             Cost
      Metrics       Streaming   Tools        State strategy    Monitoring

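Part 6: Production Practice Patterns

6.1 Typical Workflow: Customer Support

Architecture:

User request
  ↓
[Input guardrail] - validate input safety
  ↓
[Main agent] - handle the request, possibly calling tools
  ├─→ [Tool calls] - query databases and APIs
  └─→ [Output guardrail] - validate output safety
  ↓
[Human approval] - sensitive operations require approval
  ↓
[Streaming] - respond to the user in real time
  ↓
[Tracing] - record the complete run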

Implementation:

// Guardrail classifier
const safetyGuardrail = new Agent({
  name: 'Safety check',
  instructions: 'Validate input/output for safety and policy compliance.',
  outputType: z.object({
    safe: z.boolean(),
    reason: z.string(),
  }),
});

// Sensitive tool: requires human approval before it executes
const sensitiveAction = tool({
  name: 'cancel_order',
  description: 'Cancel a customer order.',
  parameters: z.object({ orderId: z.number() }),
  needsApproval: true,
  async execute({ orderId }) {
    return `Cancelled order ${orderId}`;
  },
});

// Main agent
const supportAgent = new Agent({
  name: 'Customer support',
  instructions: 'Help customers with support questions.',
  tools: [sensitiveAction],  // without the tool attached, the approval path is never reached
  inputGuardrails: [{
    name: 'Safety guardrail',
    runInParallel: false,
    async execute({ input, context }) {
      const result = await run(safetyGuardrail, input, { context });
      return {
        outputInfo: result.finalOutput,
        tripwireTriggered: !result.finalOutput?.safe,
      };
    },
  }],
});

// Run
let result = await run(supportAgent, "Cancel order 123.");

if (result.interruptions?.length) {
  const state = result.state;
  for (const interruption of result.interruptions) {
    state.approve(interruption);  // in production, approve only after a person decides
  }
  result = await run(supportAgent, state);
}

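6.2 Typical Workflow: Content Generation

Architecture:

User request
  ↓
[Input guardrail] - validate input safety
  ↓
[Main agent] - handle the generation request
  ├─→ [Tool calls] - search, retrieval, formatting
  └─→ [Output guardrail] - validate output quality
  ↓
[Streaming] - generate content in real time
  ↓
[Tracing] - record the generation process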

Implementation:

const searchTool = tool({
  name: 'search',
  description: 'Search the web.',
  parameters: z.object({ query: z.string() }),
  async execute({ query }) {
    // Placeholder: call a real search backend here
    const results = `Search results for: ${query}`;
    return results;
  },
});

const formatterTool = tool({
  name: 'format',
  description: 'Format content.',
  parameters: z.object({ content: z.string() }),
  async execute({ content }) {
    // Placeholder: apply real formatting here
    const formattedContent = content.trim();
    return formattedContent;
  },
});

const contentAgent = new Agent({
  name: 'Content generator',
  instructions: 'Generate content based on user requests.',
  tools: [searchTool, formatterTool],
});

const result = await run(contentAgent, "Write a blog post about AI.");

Part 7: Common Mistakes and Solutions

7.1 Mistake 1: Guardrails Are Too Strict

Problem:

Guardrails block legitimate requests
User experience degrades
Error rate rises

Solution:

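// Narrow the guardrail so it flags only clear-cut cases
const guardrailAgent = new Agent({
  name: 'Homework check',
  instructions:
    'Detect whether the user is asking for math homework help. ' +
    'Flag only explicit requests to solve homework; when unsure, answer false.',
  outputType: z.object({
    isMathHomework: z.boolean(),
    reasoning: z.string(),
  }),
});

// The guardrail wiring is unchanged; only the classifier instructions were narrowed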
const agent = new Agent({
  name: 'Customer support',
  instructions: 'Help customers with support questions.',
  inputGuardrails: [{
    name: 'Homework check',
    runInParallel: false,
    async execute({ input, context }) {
      const result = await run(guardrailAgent, input, { context });
      return {
        outputInfo: result.finalOutput,
        tripwireTriggered: result.finalOutput?.isMathHomework === true,
      };
    },
  }],
});

7.2 Mistake 2: Too Many Handoffs

Problem:

Every small task becomes a handoff
Runtime cost rises
State management becomes complex

Solution:

// Avoid excessive handoffs: expose the helper as a tool instead
const summarizer = new Agent({
  name: 'Summarizer',
  instructions: 'Generate a concise summary of the supplied text.',
});

const mainAgent = new Agent({
  name: 'Research assistant',
  tools: [
    summarizer.asTool({
      toolName: 'summarize_text',
      toolDescription: 'Generate a concise summary of the supplied text.',
    }),
  ],
});

// Use a handoff only when it brings a real improvement in capability

7.3 Mistake 3: Inconsistent State Management

Problem:

result.history and session are mixed
State becomes inconsistent
Runtime errors appear

Solution:

// Pick a single state strategy
const session = new MemorySession();

const firstTurn = await run(
  agent,
  "What city is the Golden Gate Bridge in?",
  { session },
);

// Keep using the same session instead of mixing in another strategy
const result = await run(agent, "What state is it in?", {
  session,
});

Part 8: Summary and Best Practices

8.1 Best Practice Checklist

Coordination patterns:

✅ Start with a single agent
✅ Add specialist agents only when specialization brings a real improvement in capability
✅ Use handoffs for domain-specific and policy-specific work
✅ Use agents as tools for auxiliary tasks

Guardrails:

✅ Use blocking guardrails for security and compliance
✅ Use parallel guardrails for logging and metrics
✅ A failed guardrail check sets tripwireTriggered
✅ The calling code catches InputGuardrailTripwireTriggered

State management:

✅ One state strategy per conversation
✅ result.history for application-managed state
✅ session for persistent chat state
✅ conversationId for sharing state across workers
✅ previousResponseId for the most lightweight continuation

Human approval:

✅ Disable streaming during a synchronous wait
✅ Allow streaming with an asynchronous queue
✅ Handle interruptions correctly in the approval flow
✅ Provide a fallback when an approval is rejected

Observability:

✅ Enable voice tracing
✅ Record model calls, tool calls, handoffs, and guardrails
✅ Wrap complex workflows with the trace API
✅ Retain traces for 30 days (production)

8.2 Common Pitfalls

Pitfall	Problem	Misconception
Excessive handoffs	Every task becomes a handoff	Wrong: the more agents, the better
Unhandled guardrail failures	User requests are blocked with no handling	Wrong: assuming guardrails always pass
Mixed state	result.history and session are mixed	Wrong: assuming mixing has no effect
Blocked human approval	Streaming is used during a synchronous wait	Wrong: assuming streaming is always available
Excessive tracing	Everything is recorded	Wrong: assuming tracing is always free

Part 9: Conclusion

In the AI agent ecosystem of 2026, orchestration and enforcement have shifted from “art” to “engineering.” By combining handoffs, agents as tools, guardrails, human approvals, state strategies, and observability, developers can build production-grade AI agent systems.

Core principles:

Simplicity first: start with a single agent and add specialist agents only when they are really needed
Handoffs: use them only when they bring clear ownership and policy isolation
Guardrails: validate quickly, stop on failure, and do not block legitimate requests
State: pick a single state strategy and avoid mixing
Human approval: disable streaming during a synchronous wait, allow it with an asynchronous queue
Observability: enable voice tracing and record complete runs

Production guidelines:

Guideline	Practice
Latency	P99 < 500ms, human approval < 60s
Cost	< $0.05 per run
Error rate	Guardrails < 1%, tool calls < 0.5%
Availability	> 99.9%

Next steps:

Choose the appropriate coordination pattern (handoffs vs agents as tools)
Design the guardrail strategy (blocking vs parallel)
Choose a state strategy (result.history, session, conversationId, previousResponseId)
Design the human approval flow (synchronous wait, asynchronous queue, batch)
Enable tracing and configure the retention policy
Design monitoring and alerting

Deploying AI agents to production takes more than technical capability; it takes an understanding of these patterns and practice in applying them. With standardized coordination patterns, developers can focus on business logic rather than orchestration architecture, and deliver value faster.

Key metrics:

Metric	Target
P99 latency	< 500ms
Cost per run	< $0.05
Guardrail error rate	< 1%
Tool call error rate	< 0.5%
System availability	> 99.9%
Human approval response time	< 60s