
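SAGE Self-Improving Agent System Implementation Guide: From Prompt to Production Software

SAGE (Self-improving Autonomous Generation Engine) is a LangGraph-based orchestrator that routes natural-language prompts through specialized agents (planner, coder, reviewer, test engineer) and a model router to produce production-grade code, tests, and verification.

April 23, 2026 · 4 min read · Beginner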


This article is one route in OpenClaw's external narrative arc.


Architecture Overview

SAGE (Self-improving Autonomous Generation Engine) is an orchestrator architecture built on LangGraph. It uses specialized agents (planner, coder, reviewer, test engineer) and a model router to turn natural-language prompts into production-grade code, tests, and verification.

Core components and flow

1. Interactive shell and command system

SAGE's default UX is a persistent REPL built on Python Prompt Toolkit for a reliable terminal experience:

# Start the interactive shell
sage  # opens the slash-command shell

# Common commands
/run "Add JWT auth to the API"  # run the full pipeline
/research  # interactive plan review
/auto  # fewer interactive checkpoints
/silent  # autonomous mode, skips failed tasks
/plan-only  # output the plan only, run no tools
/dry-run  # do not apply file patches

Environment variables that control the shell:

SAGE_SHELL_SIMPLE_INPUT=1 - disables prompt_toolkit and uses plain input()
SAGE_SHELL_NO_STATUSBAR=1 - hides the bottom status bar
SAGE_RUN_OUTPUT=full - prints detailed verification labels and metric lines
SAGE_VERIFY_TIMEOUT_S=30 - timeout for the plan verification subprocess, in seconds
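As a rough illustration of how a tool might consume the variables listed above, here is a minimal Python sketch. The function name `read_shell_settings`, the dictionary keys, and the fallback values (`"default"` for the output mode, 30 seconds for the timeout) are assumptions made for this example and are not taken from the SAGE source.

```python
import os


def read_shell_settings(env=None):
    """Collect the four documented variables into a plain dict.

    Hypothetical helper: the key names and fallback values are assumptions.
    """
    env = os.environ if env is None else env
    return {
        # Flags are treated as enabled only when set to the literal "1".
        "simple_input": env.get("SAGE_SHELL_SIMPLE_INPUT") == "1",
        "hide_statusbar": env.get("SAGE_SHELL_NO_STATUSBAR") == "1",
        # "default" is an assumed placeholder for the unset case.
        "run_output": env.get("SAGE_RUN_OUTPUT", "default"),
        # 30 is assumed as the fallback, mirroring the example value above.
        "verify_timeout_s": float(env.get("SAGE_VERIFY_TIMEOUT_S", "30")),
    }


settings = read_shell_settings(
    {"SAGE_SHELL_SIMPLE_INPUT": "1", "SAGE_VERIFY_TIMEOUT_S": "45"}
)
print(settings)
```

Passing a dict instead of reading `os.environ` directly keeps the sketch easy to test without touching the real environment.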

2. Model routing and configuration

Model routing is based on the models.yaml configuration file:

# ~/.config/sage/models.yaml
primary: gpt-4o
fallback: gpt-4o-mini
test_profile:
  primary: gpt-4o-mini
  fallback: gpt-4o-mini

Role configuration:

Each agent role has a primary and a fallback model.

Local models are supported through Ollama:

ollama pull qwen2.5-coder:1.5b
ollama pull nomic-embed-text

Environment variables:

SAGE_MODEL_PROFILE=test - forces the test profile (small local models)
SAGE_MODELS_YAML=~/.config/sage/models.yaml - custom path to the configuration file
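The primary/fallback idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the config is written as a dict that mirrors the YAML above, the mapping from `SAGE_MODEL_PROFILE=test` to the `test_profile` key is a guess, and `resolve_models`, `call_with_fallback`, and `fake_invoke` are hypothetical names rather than SAGE's actual router API.

```python
# Dict mirroring the models.yaml example above (loaded by hand for the sketch).
CONFIG = {
    "primary": "gpt-4o",
    "fallback": "gpt-4o-mini",
    "test_profile": {"primary": "gpt-4o-mini", "fallback": "gpt-4o-mini"},
}


def resolve_models(config, profile=None):
    """Return [primary, fallback] for the given profile (assumed key scheme)."""
    section = config.get(f"{profile}_profile", config) if profile else config
    return [section["primary"], section["fallback"]]


def call_with_fallback(models, invoke):
    """Try each model in order and return (model_name, result) on first success."""
    last_error = None
    for name in models:
        try:
            return name, invoke(name)
        except RuntimeError as exc:  # treat RuntimeError as a model failure
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def fake_invoke(name):
    """Stand-in for a real model call; simulates an outage of the primary."""
    if name == "gpt-4o":
        raise RuntimeError("simulated outage")
    return f"ok from {name}"


used, reply = call_with_fallback(resolve_models(CONFIG), fake_invoke)
print(used, reply)
```

In this sketch the router falls through to the fallback only when the primary raises, which keeps the common path on the stronger model.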

3. Agent pipeline

Full pipeline:

Planner - parses the goal and generates a DAG
Coder - implements the code
Reviewer - checks code quality
Test engineer - runs the tests
Verification - validates the output
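To make the planner's DAG concrete, the sketch below orders a small hypothetical plan with Python's standard `graphlib` module. The task names are invented for illustration, and this is not SAGE's internal plan format (which the article does not show); it only demonstrates how a dependency graph yields a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "write_middleware": set(),
    "wire_into_api": {"write_middleware"},
    "update_tests": {"write_middleware"},
    "run_tests": {"wire_into_api", "update_tests"},
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(plan).static_order())
print(order)
```

Tasks with no dependency between them (here `wire_into_api` and `update_tests`) may appear in either order, which is what would let an orchestrator run them in parallel.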

Interactive review (default mode):

sage run "Add JWT auth to the API" --research
# When a plan is ready, SAGE shows:
# [a]pprove  [r]eject  [e]dit .sage/last_plan.json

Automated execution:

sage run "Add JWT auth to the API" --auto  # fewer interactions
sage run "Add JWT auth to the API" --silent  # autonomous mode

Implementation workflow

Step 1: Install SAGE

# Clone the SAGE repository
git clone https://github.com/Sris945/SAGE.git
cd SAGE

# Set up the environment
bash startup.sh
source .venv/bin/activate

# Install dependencies
pip install -e ".[dev,tui]"

Step 2: Project initialization

cd ~/myproject
sage init  # create the project folders

Structure created:

myproject/
├── .sage/
│   ├── rules.md
│   ├── last_run_metrics.json
│   └── last_plan.json
├── memory/
│   └── weekly_digest.md
├── pytest.ini
└── .gitignore

Step 3: Health check

sage doctor  # checks Python, venv, Ollama, models.yaml

Step 4: Run the full pipeline

# Interactive mode (default)
sage run "Add JWT auth to the API"

# Automated mode
sage run "Add JWT auth to the API" --auto

# Deep verification mode
sage run "Add JWT auth to the API" --research --full

Output structure:

Goal: Add JWT auth to the API
Plan: [DAG visualization]
Actions: [Code changes, test runs]
Outcome: [Pass/Fail + metrics]

Step 5: View metrics

cat .sage/last_run_metrics.json
{
  "session_id": "...",
  "task_counts": {"planning": 1, "coding": 3, "review": 1, "testing": 2},
  "models_used": ["gpt-4o", "gpt-4o-mini"],
  "human_checkpoints_reached": 1,
  "local_vs_cloud_ratio": 0.75
}
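A small script can turn that file into a one-line summary. The sketch below embeds the sample JSON from above as a string so it runs on its own; in practice you would read `.sage/last_run_metrics.json` instead. The `summarize` helper and its output keys are hypothetical, and the meaning of `local_vs_cloud_ratio` is passed through as-is because the article does not define it.

```python
import json

# Sample taken from the example above; a real run would read the file on disk.
SAMPLE = """
{
  "session_id": "...",
  "task_counts": {"planning": 1, "coding": 3, "review": 1, "testing": 2},
  "models_used": ["gpt-4o", "gpt-4o-mini"],
  "human_checkpoints_reached": 1,
  "local_vs_cloud_ratio": 0.75
}
"""


def summarize(metrics):
    """Reduce a metrics dict to a few headline numbers (hypothetical helper)."""
    return {
        "total_tasks": sum(metrics["task_counts"].values()),
        "busiest_stage": max(metrics["task_counts"], key=metrics["task_counts"].get),
        "models": sorted(metrics["models_used"]),
        "checkpoints": metrics["human_checkpoints_reached"],
        "local_vs_cloud_ratio": metrics["local_vs_cloud_ratio"],
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```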

Key design decisions and trade-offs

Interactive vs autonomous execution

Interactive mode (--research):

Advantages: a human reviews the plan, which reduces errors
Disadvantages: longer waits and lower throughput
Best for: first-time implementations and complex tasks

Autonomous mode (--silent):

Advantages: fast execution, suitable for CI/CD
Disadvantages: failed tasks are skipped, which can hide problems
Best for: day-to-day development and repetitive tasks

Choosing a mode:

First-time implementation: use --research and review step by step
CI/CD: use --auto to reduce waiting
High-risk tasks: use --research plus manual review

Model routing strategy

Local models (Ollama):

# Coding model
ollama pull qwen2.5-coder:1.5b

# Embedding model
ollama pull nomic-embed-text

Hybrid deployment:

Complex tasks: use GPT-4o (cloud)
Repetitive tasks: use local models (offline)
Test verification: use small models (fast)

Memory layers

SAGE provides multiple layers of memory:

Project memory - .sage/memory/ (project-specific)
Session memory - .sage/chat_sessions/ (interactive chat)
System memory - .sage/system_state.json (system state)
Periodic digest - memory/weekly_digest.md

Metrics and deployment scenarios

Measurable metrics

Cold start time:

Local model: ~50-100 ms
Cloud model: ~200-500 ms
Interactive mode: +100 ms (waiting for review)

Session throughput:

Autonomous mode: ~10 tasks/hour
Interactive mode: ~5 tasks/hour

Error rate:

Planning stage: <1%
Coding stage: 2-5% (depending on complexity)
Verification stage: <1%

Human intervention points:

Per task: 1-3 review points
Average time: 30-90 seconds

Deployment scenarios

Scenario 1: Local development environment

# Developer workstation
export SAGE_MODEL_PROFILE=test
sage run "Add JWT auth to the API" --auto

Local models, fast iteration
Frequent interaction, step-by-step review

Scenario 2: CI/CD Pipeline

# GitHub Actions
- name: Test with SAGE
  run: |
    source .venv/bin/activate
    sage run "Run unit tests" --silent --auto

Autonomous mode, less waiting
Set the environment variable SAGE_NON_INTERACTIVE=1

Scenario 3: Production deployment

# Default configuration
sage run "Add JWT auth to the API" --research
# Review the plan, then approve it to run

Interactive review to ensure quality
Metrics are recorded in .sage/last_run_metrics.json

Security and Governance

Rule system

Global rules: ~/.sage/rules.md
Project rules: .sage/rules.md
Agent rules: .sage/rules.coder.md

# .sage/rules.md
- Every file must include a docstring
- Hard-coded API keys are not allowed
- Test coverage must be >= 80%
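The three rule files suggest a layered lookup. The Python sketch below shows one way such layering could work: bullet lines are parsed from each file's text and combined in the order global, project, agent, with duplicates dropped. The merge order, the dedupe behavior, and the helper names `parse_rules` and `layer_rules` are assumptions for illustration; the article does not describe how SAGE actually resolves conflicts between layers.

```python
def parse_rules(markdown_text):
    """Extract the text of each '- ' bullet from a rules.md-style string."""
    return [
        line[2:].strip()
        for line in markdown_text.splitlines()
        if line.startswith("- ")
    ]


def layer_rules(*layers):
    """Merge rule layers in the given order, keeping the first copy of each rule."""
    seen, merged = set(), []
    for text in layers:
        for rule in parse_rules(text):
            if rule not in seen:
                seen.add(rule)
                merged.append(rule)
    return merged


# Hypothetical file contents standing in for the three rule files.
global_rules = "# ~/.sage/rules.md\n- Hard-coded API keys are not allowed\n"
project_rules = (
    "# .sage/rules.md\n"
    "- Every file must include a docstring\n"
    "- Hard-coded API keys are not allowed\n"
)
coder_rules = "# .sage/rules.coder.md\n- Prefer small, focused functions\n"

effective = layer_rules(global_rules, project_rules, coder_rules)
print(effective)
```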

Validating rules:

sage rules validate  # health check
sage rules add "No hard-coded environment variables"  # add a rule

Session management

Session snapshots:

sage status  # view the last session snapshot
sage session reset  # reset the session
sage session handoff  # manual handoff

Memory handoff:

sage memory digest  # summarize the session logs

Advanced features

Training and Evaluation

Offline RL:

sage rl collect-synth --rows 650  # collect synthetic data
sage rl export --output datasets/routing_v1.jsonl  # export the data
sage rl analyze-rewards --data datasets/routing_v1.jsonl  # analyze rewards
sage rl train-bc --data datasets/routing_v1.jsonl  # train BC
sage rl train-cql --data datasets/routing_v1.jsonl  # train CQL

Simulator:

sage sim generate --count 1000 --out datasets/sim_tasks.jsonl  # generate tasks
sage sim run --tasks datasets/sim_tasks.jsonl --workers 4  # run the simulation

Implementation examples

Example 1: JWT authentication

sage run "Add JWT auth to the API"
# Plan:
# - modify src/api.py
# - add middleware/jwt.py
# - update tests/test_auth.py

Example 2: Refactoring legacy code

sage run "Refactor legacy code with type hints"
# Plan:
# - src/models.py (add type annotations)
# - src/services.py (add type annotations)
# - tests/test_models.py

Example 3: Adding unit tests

sage run "Add unit tests for user authentication"
# Plan:
# - tests/test_auth.py (new tests)
# - update pytest.ini

Error handling and debugging

Common problems:

Model not found:

sage doctor  # check the Ollama installation
ollama list  # list available models

Environment variable not set:
```
export SAGE_MODEL_PROFILE=test
```

Session does not resume as expected:

sage run "your goal" --fresh  # ignore memory

Comparison with other systems

SAGE vs Agent Infrastructure

Feature	SAGE	Agent Infrastructure
Architecture	Orchestrator pipeline	Modular infrastructure
Interactivity	Slash-command shell	Documentation-first
Model routing	Built-in models.yaml	Must be customized
Rule system	Layered overrides	Static rules
RL support	Full pipeline	Modules only

Recommendations:

For interactive development: choose SAGE
For modular governance: choose Agent Infrastructure

SAGE vs LiteSwarm

Feature	SAGE	LiteSwarm
Heavyweight orchestrator	Yes	No (lightweight)
Planning DAG	Yes	No (dynamic switching)
Interactive review	Yes	No (API only)
Model routing	Built-in	Via litellm
Complex tasks	Stronger	Average

Recommendations:

Complex tasks: SAGE
Rapid prototyping: LiteSwarm

Summary

SAGE provides a complete pipeline from prompt to production software. Key features include:

Interactive review - balances speed and quality
Model routing - flexible model configuration
Multi-layer memory - minimizes lost context
Measurable process - tracked in .sage/last_run_metrics.json
Production-ready - suitable for CI/CD

Deployment recommendations:

First-time implementation: interactive mode plus manual review
CI/CD: autonomous mode plus automation
Production environment: mixed mode plus metrics monitoring

Next steps:

Read ARCHITECTURE_STATUS.md - compares the specification with the implementation
Read docs/ - the full documentation index
Browse examples/ - implementation examples