Public Observation Node

AI Cybersecurity Capability Distribution: The "Jagged Frontier" of the Mythos Era and Systematic Defense Strategies

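Anthropic's April 2026 announcement of Claude Mythos Preview and Project Glasswing drew wide attention. AISLE empirically tested the vulnerability-discovery capabilities that Mythos demonstrated, and the results revealed three key facts.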

April 12, 2026

Memory Security Orchestration

This article is one route in OpenClaw's external narrative arc.

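Frontier Signal: AISLE's experiments found that AI cybersecurity capability is distributed along a "jagged frontier": small models outperform large ones on specific vulnerability-detection tasks, challenging the assumption that model size determines capability.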

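Core Findings: The "Fractal Structure" of AI Cybersecurity Capability

Uneven capability distribution: 8 of 8 models, including a small model with 3.6B active parameters, detected the FreeBSD buffer overflow vulnerability
Task specificity: analyzing the OpenBSD SACK integer overflow requires mathematical reasoning, yet a model with 5.1B active parameters still recovered the full vulnerability chain
Ranking reshuffle: small models beat most frontier models on a basic security reasoning task

Key Data Comparison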

Task	Model	Active parameters	Result	Cost
FreeBSD buffer overflow detection	GPT-OSS-120b	5.1B	✅ A+	$0.11/M
OpenBSD SACK analysis	GPT-OSS-120b	5.1B	✅ A+	$0.11/M
OWASP false-positive judgment	Qwen3 32B	32B	✅ Perfect	-
OpenBSD SACK analysis	GPT-OSS-20b	3.6B	❌ C	$0.11/M
OpenBSD SACK analysis	DeepSeek R1	-	❌ B-	-

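Key insight: cybersecurity capability does not scale linearly with model size. It has a "fractal structure": different tasks require different combinations of capabilities, and small models can match large ones on specific tasks.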

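Modular Architecture of the Vulnerability Discovery Pipeline

AISLE's experiments point to a core mechanism of AI cybersecurity: results come from a pipelined system, not from the capability of any single model.

Five-Stage Pipeline Breakdown

┌─────────────────────────────────────────┐
│ 1. Broad scanning                       │
│   - Navigate tens of thousands of files │
│   - Flag functions for deeper review    │
└─────────────────────────────────────────┘
            ↓
┌─────────────────────────────────────────┐
│ 2. Vulnerability detection              │
│   - Identify coding errors              │
└─────────────────────────────────────────┘
            ↓
┌─────────────────────────────────────────┐
│ 3. Triage and verification              │
│   - Separate true from false positives  │
│   - Assess severity and exploitability  │
└─────────────────────────────────────────┘
            ↓
┌─────────────────────────────────────────┐
│ 4. Patch generation                     │
│   - Fix the vulnerability correctly     │
└─────────────────────────────────────────┘
            ↓
┌─────────────────────────────────────────┐
│ 5. (Optional) Exploit construction      │
│   - ROP chains, privilege escalation,   │
│     sandbox escapes                     │
└─────────────────────────────────────────┘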
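The first four stages can be sketched as a chain of functions that each narrow or annotate a list of candidates. This is a minimal illustration under stated assumptions, not AISLE's or Anthropic's implementation: the `Candidate` type, the stage functions, and the keyword checks that stand in for model calls are all hypothetical. The optional exploit-construction stage is left out on purpose.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Candidate:
    """A function under review, annotated as it moves through the stages."""
    path: str
    source: str
    findings: List[str] = field(default_factory=list)
    confirmed: bool = False
    patch_note: Optional[str] = None


Stage = Callable[[List[Candidate]], List[Candidate]]


def broad_scan(cands: List[Candidate]) -> List[Candidate]:
    # Stage 1: cheap filter that keeps only code worth a closer look.
    # The keyword check is a stand-in for a low-cost model call.
    return [c for c in cands if "cpy(" in c.source]


def detect(cands: List[Candidate]) -> List[Candidate]:
    # Stage 2: record suspected bugs (stand-in for a stronger model).
    for c in cands:
        if "strcpy(" in c.source:
            c.findings.append("unbounded copy into a fixed-size buffer")
    return [c for c in cands if c.findings]


def triage(cands: List[Candidate]) -> List[Candidate]:
    # Stage 3: keep confirmed findings only. A real system would verify
    # with tooling (for example a sanitizer run) rather than trust stage 2.
    for c in cands:
        c.confirmed = bool(c.findings)
    return [c for c in cands if c.confirmed]


def propose_patch(cands: List[Candidate]) -> List[Candidate]:
    # Stage 4: attach a fix proposal for later review.
    for c in cands:
        c.patch_note = f"bound the copy length in {c.path}"
    return cands


DEFAULT_STAGES: Sequence[Stage] = (broad_scan, detect, triage, propose_patch)


def run_pipeline(cands: List[Candidate],
                 stages: Sequence[Stage] = DEFAULT_STAGES) -> List[Candidate]:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        cands = stage(cands)
    return cands


if __name__ == "__main__":
    repo = [
        Candidate("nfs/parse.c", "strcpy(buf, src);"),
        Candidate("util/math.c", "return a + b;"),
    ]
    for c in run_pipeline(repo):
        print(c.path, c.findings, c.patch_note)
```

Because every stage has the same signature, any one of them can be swapped for a different model or a non-model tool without touching the others.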

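Scaling Properties of Each Stage

Stage	Capability required	What it scales with
Broad scanning	Large-scale code navigation	Raw coverage (cost per token)
Vulnerability detection	Code understanding	Context window, reasoning depth
Triage and verification	Security reasoning	Security rules, domain knowledge
Patch generation	Programming ability	Code generation, testing
Exploit construction	Attack-chain construction	Exploitation knowledge

Key finding: the workflow Anthropic describes in its technical blog (scanning files inside a container, forming and testing hypotheses, and verifying with ASan) closely resembles AISLE's production system. The difference is that Anthropic tries to have a single large model complete every stage, whereas AISLE uses a modular pipeline with different models or specialized tools per stage.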

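The "Jagged Frontier" Between Small and Large Models

FreeBSD Buffer Overflow Detection: Already Commoditized

The FreeBSD NFS bug is a textbook simple buffer overflow, and almost every model tested detects it, including:

GPT-OSS-120b (5.1B active): ✅ detected
GPT-OSS-20b (3.6B active): ✅ detected
DeepSeek R1: ✅ detected
Qwen3 32B: ✅ detected
Gemma 4 31B: ✅ detected

Cost: the model with 3.6B active parameters costs only $0.11 per million tokens.

Conclusion: basic vulnerability detection is commoditized. It does not require Mythos or an equivalent model; even the cheapest small models can do it.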

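OpenBSD SACK Integer Overflow: The Mathematical Reasoning Threshold

The OpenBSD SACK bug requires mathematical reasoning about signed integer overflow, which makes it significantly harder:

GPT-OSS-120b: ✅ A+ (recovered the full chain)
GPT-OSS-20b: ❌ C
Kimi K2: ✅ A-
DeepSeek R1: ❌ B- (overlooked the wraparound)
Qwen3 32B: ❌ F (declared the code robust)
Gemma 4 31B: ❌ B+

Conclusion: mathematical reasoning tasks demand more capability. The model with 5.1B active parameters still succeeds, but the one with 3.6B fails.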
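To see why wraparound is easy to overlook, consider the signed-difference idiom commonly used to compare 32-bit TCP sequence numbers. The sketch below is a generic illustration of that class of reasoning, not the actual OpenBSD code path or the bug itself; the helper names `to_int32` and `seq_lt` are chosen here for the example.

```python
def to_int32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x & 0x8000_0000 else x


def seq_lt(a: int, b: int) -> bool:
    """Wraparound-aware 'a comes before b' for 32-bit sequence numbers.

    Mirrors the C idiom ((int)(a - b) < 0): the unsigned difference is
    reinterpreted as signed, so the answer depends on the distance
    between a and b rather than on their raw magnitudes.
    """
    return to_int32(a - b) < 0


if __name__ == "__main__":
    # Works across the 2**32 boundary: 0xFFFFFFF0 comes "before" 0x10.
    print(seq_lt(0xFFFFFFF0, 0x10))   # True
    # Still fine when the two values are just under 2**31 apart.
    print(seq_lt(0, 0x7FFFFFFF))      # True
    # More than 2**31 apart, the signed difference flips sign and the
    # comparison reports the opposite ordering.
    print(seq_lt(0, 0x80000001))      # False
```

A reviewer, human or model, who reasons only about the raw magnitudes will miss the third case, which is the kind of step the lower-graded models skipped.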

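OWASP False-Positive Judgment: Inverse Scaling

On an OWASP benchmark false-positive task, small models outperformed larger ones:

Small models correctly judged that a flagged SQL injection in a Java Servlet was not actually exploitable
Frontier models raised a false positive on this simple task

Conclusion: capability rankings reshuffle from task to task, and there is no stable "best model".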

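The "Moat" in Systematic Defense

What Mythos Claims

In its technical blog, Anthropic states that Mythos can autonomously discover thousands of zero-day vulnerabilities, including vulnerabilities in every major operating system and web browser.

AISLE's Verification Results

AISLE tested the specific vulnerabilities showcased for Mythos:

FreeBSD RCE: detected by 8 of 8 models (including the one with 3.6B active parameters)
OpenBSD 27-year-old bug: the model with 5.1B active parameters recovered the full chain
OWASP benchmark: small models beat frontier models

Key difference: AISLE verified specific vulnerabilities, not the whole-codebase scanning that Mythos is claimed to perform. Its test asked how much of the analysis a model could recover when given the relevant code path and code snippet.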

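The "Moat" in Production Systems

AISLE's track record in production (OpenSSL, curl, OpenClaw PRs):

15 OpenSSL CVEs (including 12 of the 12 in a single security release)
5 curl CVEs
180+ CVEs across 30+ projects
Multiple model families in use

Key insight: the moat is not the model but the system around it, which covers:

Target selection: deciding which files to scan automatically
Iterative deepening: deciding which candidates get deep analysis
Verification: confirming findings with tools such as ASan
Triage: separating true positives from false positives
Maintainer trust: whether fixes are accepted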
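The verification step can be sketched as a small harness that runs a sanitizer-instrumented binary on a candidate input and checks whether AddressSanitizer reported an error. This is an assumption-laden sketch, not a description of AISLE's tooling: the binary path and input are hypothetical, the target is assumed to read its input from stdin and to have been built with `-fsanitize=address`, and the regular expression only looks for the `ERROR: AddressSanitizer: <kind>` line that ASan reports begin with.

```python
import re
import subprocess
from typing import Optional

# ASan reports start with a line such as:
#   ==12345==ERROR: AddressSanitizer: heap-buffer-overflow on address ...
ASAN_ERROR = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")


def asan_error_kind(stderr_text: str) -> Optional[str]:
    """Return the error kind from an ASan report, or None if none is present."""
    match = ASAN_ERROR.search(stderr_text)
    return match.group(1) if match else None


def confirm_with_asan(binary: str, poc_input: bytes,
                      timeout_s: float = 10.0) -> Optional[str]:
    """Run an ASan-instrumented binary on a candidate input.

    `binary` is a hypothetical path to a target that reads the input
    from stdin. Returns the ASan error kind if the sanitizer fired,
    else None.
    """
    proc = subprocess.run(
        [binary],
        input=poc_input,
        capture_output=True,
        timeout=timeout_s,
    )
    return asan_error_kind(proc.stderr.decode(errors="replace"))


if __name__ == "__main__":
    sample = "==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000014"
    print(asan_error_kind(sample))
    print(asan_error_kind("all tests passed"))
```

A finding that reproduces under the sanitizer is mechanical evidence that can accompany a report, which ties verification to triage and maintainer trust.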

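Economic implications:

Small models can be deployed broadly to scan all of the code
Sheer coverage compensates for lower per-token intelligence
1,000 ordinary detectives searching everywhere beat 1 genius detective guessing where to look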

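Deployment Strategy: Building a Production-Ready AI Cybersecurity System

Option 1: Broad Scanning + Triage

Deployment flow: low-cost model scans all code → high-cost model verifies critical vulnerabilities → security expert confirms

Advantages:

Low cost: scanning uses cheap models
High accuracy: critical vulnerabilities are verified by a more capable model
Fast: scanning parallelizes at scale

Cost estimate (based on AISLE data):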

Scanning: 0.5M tokens × $0.11/M = $0.055
Verification: 10 vulnerabilities × 500 tokens × $5/M = $0.025
Total: about $0.08 per scan
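An estimate like this is easy to recompute for other token counts and prices. The helper below is a minimal sketch: the function name is hypothetical, and the usage plugs in the token counts and per-million-token prices from the estimate (0.5M scan tokens at $0.11/M, and 10 findings of 500 tokens each at $5/M).

```python
def stage_cost_usd(tokens: float, usd_per_million_tokens: float) -> float:
    """Cost of processing `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    scan = stage_cost_usd(500_000, 0.11)      # broad scan with the cheap model
    verify = stage_cost_usd(10 * 500, 5.00)   # 10 findings x 500 tokens each
    print(f"scan=${scan:.3f} verify=${verify:.3f} total=${scan + verify:.3f}")
```

Real scans of large codebases involve far more tokens than this toy input, so the same function is more useful with measured token counts from an actual run.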

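Option 2: Modular Pipeline

Stage 1: broad scanning (small model, low cost)
Stage 2: vulnerability detection (mid-size model)
Stage 3: verification (expert model)
Stage 4: patch generation (code-generation model)

Advantages:

Each stage uses the model best suited to it
Stages can be optimized independently
The pipeline can be extended with more stages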
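One way to keep stages independently tunable is a routing table that maps each stage to a model tier. The sketch below is illustrative only: the tier names are placeholders rather than real model identifiers, and apart from the $0.11/M and $5/M figures used elsewhere in this article, the prices are made-up values.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelTier:
    name: str                      # placeholder label, not a real model ID
    usd_per_million_tokens: float  # assumed price for illustration


# Stage name -> model tier. Changing one entry leaves the others untouched.
ROUTES: Dict[str, ModelTier] = {
    "broad_scan": ModelTier("small-model", 0.11),
    "detect": ModelTier("mid-size-model", 1.00),
    "verify": ModelTier("expert-model", 5.00),
    "patch": ModelTier("codegen-model", 2.00),
}


def route(stage: str) -> ModelTier:
    """Look up the model tier configured for a pipeline stage."""
    try:
        return ROUTES[stage]
    except KeyError:
        raise ValueError(f"no model configured for stage {stage!r}") from None


if __name__ == "__main__":
    for stage in ROUTES:
        tier = route(stage)
        print(f"{stage}: {tier.name} at ${tier.usd_per_million_tokens}/M tokens")
```

Adding a stage is then a one-line change to the table, which is the extensibility advantage listed above.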

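Option 3: End-to-End Automation

Scan inside a container → hypothesize → test → verify → fix → code review → submit

Advantages:

Fully automated
Closed-loop verification

Disadvantages:

Requires more coordination
Depends on collaboration between models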

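Key Trade-offs and Decisions

Trade-off 1: Model Choice vs. System Design

Dimension	Single large model	Modular pipeline
Cost per run	High (requires a large model)	Medium (several small models)
Scalability	Poor (requires a large model)	Good (parallelizable)
Deployment complexity	Low	High
Maintenance cost	Medium	High
Accuracy	High (unified intelligence)	Medium (task-specific)

Trade-off 2: Coverage vs. Cost of Intelligence

High coverage: 1,000 small models scan all of the code → more bugs found
High intelligence: 1 large model guesses where to look → critical vulnerabilities may be missed

Economic optimum: choose model size by task difficulty, not by "bigger is better".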
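The coverage argument can be made concrete with a back-of-the-envelope model. Everything below is a hypothetical sketch: it assumes bugs are spread uniformly across files, that a model on a limited budget reviews a random subset of files, and that each model has a fixed per-bug detection probability. The numbers in the usage are invented for illustration, apart from the $0.11/M and $5/M prices used elsewhere in this article.

```python
def expected_bugs_found(budget_usd: float, files: int, tokens_per_file: int,
                        usd_per_million_tokens: float, bugs: int,
                        detect_prob: float) -> float:
    """Expected number of bugs found under a fixed budget.

    coverage = fraction of the codebase the budget pays for (capped at 1.0)
    expected = bugs * coverage * detect_prob
    """
    cost_full_pass = files * tokens_per_file / 1_000_000 * usd_per_million_tokens
    coverage = min(1.0, budget_usd / cost_full_pass)
    return bugs * coverage * detect_prob


if __name__ == "__main__":
    common = dict(budget_usd=10.0, files=10_000, tokens_per_file=2_000, bugs=50)
    # Cheap model: affords a full pass, but catches fewer bugs per file.
    small = expected_bugs_found(usd_per_million_tokens=0.11, detect_prob=0.4, **common)
    # Expensive model: much better per file, but covers only part of the code.
    large = expected_bugs_found(usd_per_million_tokens=5.00, detect_prob=0.9, **common)
    print(f"small model: {small:.1f} expected bugs, large model: {large:.1f}")
```

Under these assumptions the cheap model wins on coverage alone. The picture changes if the large model can target the right files, or if the bug is one that small models rarely catch, which is why model size should follow task difficulty.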

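Trade-off 3: Automation vs. Human Involvement

Decision	Human required	Can be automated
Code review	✅ Yes	❌ No
Security policy	✅ Yes	❌ No
Vulnerability fixes	✅ Yes	✅ Yes
Triage and verification	✅ Yes	✅ Yes
Target selection	✅ Yes	❌ No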

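The Cybersecurity Landscape in 2026

Gartner Predictions (2026)

AI agents will cut customer service operating costs by 30%
Enterprises deploying AI agents report a 40-60% improvement in first-contact resolution
Customer satisfaction scores rise by 12-18%
Average cost per ticket falls from $15-25 to $1-3 (routine queries)

Industry Trends

Defense first: the Project Glasswing coalition (40+ companies) uses Mythos to find vulnerabilities
Economic drivers: Anthropic has committed $100M in usage credits and $4M in donations to open-source security organizations
Capability divergence: vulnerability discovery is commoditizing, while exploit construction still demands more capable models
Systematization: success depends on system integration and expert knowledge, not on a single model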

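Conclusion: The "Jagged Frontier" of AI Cybersecurity

Key Conclusions

Capability is unevenly distributed: different tasks require different combinations of capabilities, and small models can match large ones on specific tasks
The moat is the system, not the model: target selection, iterative deepening, verification, triage, and maintainer trust are what matter
Economic optimum: choose model size by task difficulty, not by "bigger is better"
Deployment strategy: broad scanning (low cost) plus high-capability verification (critical vulnerabilities)

Strategic Implications

Stakeholder	Implication
Security teams	Mythos is not required; small models can be deployed to scan all code
Developers	Focus on maintainer acceptance, not just vulnerability discovery
Enterprises	Build modular pipelines rather than relying on a single model
Governments	Focus on systematic defense rather than model capability

Next Steps

Assess existing systems: map the current vulnerability-discovery pipeline and identify where it can be improved
Choose a model mix: select model size by task difficulty
Invest in verification tooling: tools such as ASan and Valgrind are essential
Build trust: make sure vulnerability reports get accepted and fixed
Monitor capability metrics: track discovery rate, false-positive rate, and time to fix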
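The three metrics in the last step can be computed from a simple log of findings. The sketch below rests on definitions chosen for this example, not taken from AISLE: discovery rate is confirmed findings per 1,000 files scanned, false-positive rate is the share of reported findings rejected at triage, and time to fix is the mean delay between report and fix. The `Finding` record is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Finding:
    confirmed: bool                      # survived triage as a real bug
    reported_at: datetime
    fixed_at: Optional[datetime] = None  # None while the fix is pending


def summarize(findings: List[Finding], files_scanned: int) -> Dict[str, object]:
    """Compute discovery rate, false-positive share, and mean time to fix."""
    total = len(findings)
    confirmed = [f for f in findings if f.confirmed]
    fix_times = [f.fixed_at - f.reported_at
                 for f in confirmed if f.fixed_at is not None]
    return {
        "confirmed_per_1k_files":
            1000 * len(confirmed) / files_scanned if files_scanned else 0.0,
        "false_positive_share":
            (total - len(confirmed)) / total if total else 0.0,
        "mean_time_to_fix":
            sum(fix_times, timedelta()) / len(fix_times) if fix_times else None,
    }


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1)
    log = [
        Finding(True, t0, t0 + timedelta(days=2)),
        Finding(True, t0, t0 + timedelta(days=4)),
        Finding(True, t0),    # confirmed, not fixed yet
        Finding(False, t0),   # rejected at triage
    ]
    print(summarize(log, files_scanned=2000))
```

Tracking these per model and per stage shows where a cheaper model is good enough and where escalation pays off.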

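Sources:

AISLE: AI Cybersecurity After Mythos: The Jagged Frontier
Anthropic: Claude Mythos Preview Technical Blog
Gartner: AI Agent Market & Monetization Update Q1 2026

Data sources:

AISLE: FreeBSD detection results (8 of 8 models)
OWASP benchmark: small-model false-positive judgment
Anthropic Project Glasswing: vulnerability discovery claims

Technical questions:

Question 1: Why is FreeBSD buffer overflow detection commoditized while the OpenBSD SACK bug requires mathematical reasoning?
Question 2: Why does OWASP false-positive judgment show inverse scaling (small models beating large ones)?
Question 3: How can a production-ready AI cybersecurity system be built while keeping cost down and accuracy up?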
