# AI Cybersecurity Capability Distribution: The “Jagged Frontier” and Systematic Defense Strategies in the Mythos Era
Frontier Signal: AISLE's experiments found that AI cybersecurity capability follows a “jagged frontier.” Small models outperform large models on specific vulnerability-detection tasks, challenging the assumption that model size determines capability.
Core finding: the “fractal structure” of AI cybersecurity capability
Anthropic's April 2026 announcement of Claude Mythos Preview and Project Glasswing drew wide attention. AISLE empirically tested the vulnerability-discovery capabilities showcased for Mythos, and the results point to three key facts:
- Capability is unevenly distributed: 8 of 8 models tested (including a small model with 3.6B active parameters) detected the FreeBSD buffer overflow vulnerability
- Tasks are specific: analyzing the OpenBSD SACK integer overflow requires mathematical reasoning, yet a model with 5.1B active parameters still recovered the full vulnerability chain
- Rankings reshuffle: small models beat most frontier models on basic security-reasoning tasks
Key data comparison
| Task | Model | Parameters (active) | Result | Cost |
|---|---|---|---|---|
| FreeBSD buffer overflow detection | GPT-OSS-120b | 5.1B | ✅ A+ | - |
| FreeBSD buffer overflow detection | GPT-OSS-20b | 3.6B | ✅ Detected | $0.11/M tokens |
| OpenBSD SACK analysis | GPT-OSS-120b | 5.1B | ✅ A+ | - |
| OpenBSD SACK analysis | GPT-OSS-20b | 3.6B | ❌ C | - |
| OpenBSD SACK analysis | DeepSeek R1 | - | ❌ B- | - |
| OWASP false-positive judgment | Qwen3 32B | 32B | ✅ Perfect | - |
Core insight: cybersecurity capability does not scale linearly. It has a “fractal structure”: different tasks require different combinations of abilities, and small models can match large models on specific tasks.
The modular architecture of the vulnerability-discovery pipeline
AISLE's experiments expose the core mechanism of AI cybersecurity: results come from a pipelined system, not from the capability of a single model.
The five-stage pipeline
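┌──────────────────────────────────────────────────────────┐
│ 1. Broad-spectrum scanning                               │
│    - Navigate the codebase (tens of thousands of files)  │
│    - Identify functions that need deeper inspection      │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 2. Vulnerability detection                               │
│    - Identify bugs in the code                           │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 3. Triage and verification                               │
│    - Separate true positives from false positives        │
│    - Assess severity and exploitability                  │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 4. Patch generation                                      │
│    - Fix the vulnerability correctly                     │
└──────────────────────────────────────────────────────────┘
                             ↓
┌──────────────────────────────────────────────────────────┐
│ 5. (Optional) Exploit construction                       │
│    - ROP chains, privilege escalation, sandbox escape    │
└──────────────────────────────────────────────────────────┘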
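The first four stages can be sketched as a chain of interchangeable functions. This is a minimal illustration under stated assumptions, not AISLE's or Anthropic's implementation: the `Finding` type, `run_pipeline`, and the toy stage functions are hypothetical names invented here, and stage 5 is left out because it is optional.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    path: str
    description: str
    verified: bool = False
    patch: str = ""


def run_pipeline(
    files: Dict[str, str],
    scan: Callable[[str, str], bool],
    detect: Callable[[str, str], List[Finding]],
    triage: Callable[[Finding], bool],
    make_patch: Callable[[Finding], str],
) -> List[Finding]:
    """Chain stages 1-4; each callable can be backed by a different model or tool."""
    # Stage 1: broad scan keeps only files worth a closer look.
    candidates = {p: src for p, src in files.items() if scan(p, src)}
    # Stage 2: detection proposes findings for each candidate file.
    findings = [f for p, src in candidates.items() for f in detect(p, src)]
    # Stage 3: triage drops false positives.
    confirmed = [f for f in findings if triage(f)]
    # Stage 4: patches are generated for confirmed findings only.
    for f in confirmed:
        f.verified = True
        f.patch = make_patch(f)
    return confirmed


# Toy stand-ins (hypothetical); real stages would call models, fuzzers, or ASan.
demo_files = {
    "net/parse.c": "memcpy(buf, pkt, pkt_len);",
    "util/log.c": "printf(\"ok\");",
}
results = run_pipeline(
    demo_files,
    scan=lambda path, src: "memcpy" in src,
    detect=lambda path, src: [Finding(path, "unchecked copy length")],
    triage=lambda f: f.path.startswith("net/"),
    make_patch=lambda f: "add bounds check before memcpy",
)
for r in results:
    print(r.path, "->", r.patch)
```

Because every stage is just a callable, a cheap model can back `scan` while a stronger model or a sanitizer-based harness backs `triage`, which is the property the modular design relies on.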
Scaling properties of each stage
| Stage | Capability required | Scales with |
|---|---|---|
| Broad-spectrum scanning | Large-scale code navigation | Raw coverage (cost per token) |
| Vulnerability detection | Code understanding | Context window, reasoning depth |
| Triage and verification | Security reasoning | Security rules, domain knowledge |
| Patch generation | Programming ability | Code generation, testing |
| Exploit construction | Attack-chain construction | Exploitation knowledge |
Key finding: the workflow Anthropic describes in its technical blog (scanning files inside a container, forming and testing hypotheses, verifying with ASan) closely resembles AISLE's production system. The difference is that Anthropic tries to complete every stage with a single large model, while AISLE uses a modular pipeline with different models or specialized tools at each stage.
The “jagged frontier” between small and large models
FreeBSD buffer overflow detection: commoditized
The FreeBSD NFS bug is a textbook simple buffer overflow, and almost every model tested detects it, including:
- GPT-OSS-120b (5.1B active): ✅ Detected
- GPT-OSS-20b (3.6B active): ✅ Detected
- DeepSeek R1: ✅ Detected
- Qwen3 32B: ✅ Detected
- Gemma 4 31B: ✅ Detected
Cost: the 3.6B-active model costs only $0.11 per million tokens.
Conclusion: basic vulnerability detection is commoditized. It does not require Mythos or an equivalent model; even the cheapest small models can do it.
OpenBSD SACK integer overflow: the mathematical-reasoning threshold
The OpenBSD SACK bug hinges on signed integer overflow, so finding it requires mathematical reasoning and is significantly harder:
- GPT-OSS-120b: ✅ A+ (recovers the full chain)
- GPT-OSS-20b: ❌ C
- Kimi K2: ✅ A-
- DeepSeek R1: ❌ B- (overlooks the wraparound)
- Qwen3 32B: ❌ F (declares the code robust)
- Gemma 4 31B: ❌ B+
Conclusion: mathematical-reasoning tasks demand more capability. The 5.1B-active model still succeeds, but the 3.6B-active model fails.
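To see why this kind of bug needs arithmetic reasoning rather than pattern matching, consider how sequence numbers are often compared through a signed 32-bit difference. The sketch below is a generic illustration of that wraparound behavior, not the OpenBSD code (which the article does not show); `to_int32` and `seq_lt` are names chosen for this example.

```python
def to_int32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed two's-complement integer."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def seq_lt(a: int, b: int) -> bool:
    """Serial-number comparison: a is 'before' b if the signed 32-bit difference is negative."""
    return to_int32(a - b) < 0


# Nearby sequence numbers compare as expected, even across the 2**32 wrap.
assert seq_lt(100, 200)
assert seq_lt(0xFFFFFFF0, 0x00000010)

# Once two values are more than 2**31 apart, the sign of the difference flips:
# 0 is numerically smaller than 0x80000001, yet it no longer compares as "before" it.
a, b = 0, 0x80000001
print(seq_lt(a, b), seq_lt(b, a))
```

A range check built from comparisons like these looks correct for ordinary inputs, so a model has to reason about values that sit about half the number space apart to notice that the check can be bypassed.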
OWASP false-positive judgment: inverse scaling
On the OWASP benchmark, small models outperformed larger ones:
- Small models correctly judged that a flagged SQL injection in a Java servlet was not actually vulnerable
- Frontier models raised false positives on the same simple task
Conclusion: capability rankings reshuffle from task to task; there is no “stably best model.”
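A false positive of this kind typically looks like string concatenation into SQL where the user-controlled value never actually reaches the query. The article does not reproduce the specific test case, so the following is a hypothetical Python analogue of such a pattern; `build_query` and the table name are invented for illustration.

```python
def build_query(user_input: str) -> str:
    """Looks injectable at a glance, but the user value is never selected."""
    values = ["safe", user_input, "moresafe"]
    values.pop(0)      # list is now [user_input, "moresafe"]
    bar = values[1]    # always the constant "moresafe"
    return "SELECT * FROM users WHERE name = '" + bar + "'"


payload = "x' OR '1'='1"
query = build_query(payload)
print(query)
assert payload not in query  # the payload never reaches the SQL string
```

Judging this correctly means tracing the data flow through the list operations instead of reacting to the concatenation, which is the distinction the false-positive task measures.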
The “moat” in systematic defense
The Mythos claim
In its technical blog, Anthropic claims that Mythos can “autonomously discover thousands of zero-day vulnerabilities, including vulnerabilities in every major operating system and web browser.”
AISLE's verification results
AISLE tested the specific vulnerabilities showcased for Mythos:
- FreeBSD RCE: 8 of 8 models detected it (including the 3.6B-active model)
- 27-year-old OpenBSD bug: the 5.1B-active model recovered the full chain
- OWASP benchmark: small models beat frontier models
Key difference: AISLE verified specific vulnerabilities rather than the “full codebase scan” claimed for Mythos. Its test asks how much of the analysis a model can recover when given the relevant code path and code snippets.
The “moat” of a production system
AISLE's track record in production (OpenSSL, curl, OpenClaw PRs):
- 15 OpenSSL CVEs (including 12 of 12 in a single security release)
- 5 curl CVEs
- 180+ CVEs across 30+ projects
- Multiple model families in use
Core insight: the moat is not the model itself but the system around it, including:
- Target selection: deciding which files to scan automatically
- Iterative deepening: deciding which candidates get in-depth analysis
- Verification: confirming findings with tools such as ASan
- Triage: separating true positives from false positives
- Maintainer trust: whether fixes are accepted
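As one small example of the verification step, a harness can treat a candidate as confirmed only when a sanitizer actually reports an error. The sketch below assumes the captured stderr of an ASan-instrumented run is available as text and extracts the bug type from the report header; the sample line uses made-up PID and address values, and `asan_bug_type` is a hypothetical helper.

```python
import re
from typing import Optional

# ASan report headers contain "ERROR: AddressSanitizer: <bug-type>".
ASAN_HEADER = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")


def asan_bug_type(stderr_text: str) -> Optional[str]:
    """Return the reported bug type, or None if no ASan error is present."""
    match = ASAN_HEADER.search(stderr_text)
    return match.group(1) if match else None


# Illustrative sample only; PID, address, and pc are invented values.
sample = (
    "==4242==ERROR: AddressSanitizer: heap-buffer-overflow "
    "on address 0x602000000014 at pc 0x0000004f1e2a"
)
print(asan_bug_type(sample))
print(asan_bug_type("all tests passed"))
```

Gating on a tool result like this keeps a model's unverified guesses out of the reports that maintainers see.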
Economic implications:
- Small models can be deployed widely to scan all code
- Sheer coverage compensates for lower per-token intelligence
- 1,000 adequate detectives searching everywhere beat one genius detective guessing where to look
Deployment strategy: how to build a production-ready AI cybersecurity system
Option 1: Broad-spectrum scanning + triage
Deployment strategy: a low-cost model scans all code → a high-cost model verifies critical vulnerabilities → a security expert confirms
Advantages:
- Low cost: scanning uses cheap models
- High accuracy: critical vulnerabilities are verified by a more capable model
- Fast: scanning can run in parallel at large scale
Cost estimate (based on AISLE data):
- Scanning phase: 0.5M tokens × $0.11/M = $0.055
- Verification phase: 10 vulnerabilities × 500 tokens × $5/M = $0.025
- Total: $0.08 per scan
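Each phase's cost is its token count divided by one million, multiplied by the per-million-token price. A minimal sketch that computes this from the token counts and prices quoted above; `phase_cost` is a hypothetical helper, and the inputs are the article's illustrative figures rather than measured workloads.

```python
def phase_cost(tokens: float, usd_per_million: float) -> float:
    """Cost in USD for processing `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


# Scanning: 0.5M tokens at $0.11 per million tokens.
scan_cost = phase_cost(500_000, 0.11)
# Verification: 10 vulnerabilities x 500 tokens each at $5 per million tokens.
verify_cost = phase_cost(10 * 500, 5.00)
total = scan_cost + verify_cost

print(f"scan=${scan_cost:.3f} verify=${verify_cost:.3f} total=${total:.3f}")
```

Changing the token counts shows how quickly the cheap scanning phase dominates the bill once the codebase grows to hundreds of millions of tokens.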
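Option 2: Modular pipeline
Stage 1: Broad-spectrum scanning (small model, low cost)
Stage 2: Vulnerability detection (mid-size model)
Stage 3: Verification (expert model)
Stage 4: Patch generation (code-generation model)
Advantages:
- Each stage uses the model best suited to it
- Stages can be optimized independently
- The pipeline can be extended with more stages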
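In configuration terms, this option amounts to a routing table from stage to model tier. The sketch below is one possible shape for it; the stage keys, tier names, and model identifiers are all placeholders invented for illustration, not real model names.

```python
from typing import Dict, List, Tuple

# Stage -> model tier, following the four stages listed above.
STAGE_TO_TIER: Dict[str, str] = {
    "broad_scan": "small",
    "detection": "mid",
    "verification": "expert",
    "patch_generation": "codegen",
}

# Tier -> model identifier (hypothetical placeholders).
TIER_TO_MODEL: Dict[str, str] = {
    "small": "example-small-model",
    "mid": "example-mid-model",
    "expert": "example-expert-model",
    "codegen": "example-codegen-model",
}


def plan(stages: List[str]) -> List[Tuple[str, str]]:
    """Resolve each stage to the model that should run it."""
    resolved = []
    for stage in stages:
        if stage not in STAGE_TO_TIER:
            raise ValueError(f"unknown stage: {stage}")
        resolved.append((stage, TIER_TO_MODEL[STAGE_TO_TIER[stage]]))
    return resolved


for stage, model in plan(list(STAGE_TO_TIER)):
    print(f"{stage:>16} -> {model}")
```

Keeping the mapping in data rather than code means a stage can be re-pointed at a cheaper or stronger model without touching the pipeline logic, and a new stage is one more entry.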
Option 3: End-to-end automation
In-container scanning → hypothesis → testing → verification → fix → code review → commit
Advantages:
- Fully automated
- Closed-loop verification
Disadvantages:
- Requires more coordination
- Depends on cooperation between models
Key trade-offs and decisions
Trade-off 1: Model selection vs. system design
| Dimension | Single large model | Modular pipeline |
|---|---|---|
| Cost per run | High (requires a large model) | Medium (multiple small models) |
| Scalability | Poor (requires a large model) | Good (parallelizable) |
| Deployment complexity | Low | High |
| Maintenance cost | Medium | High |
| Accuracy | High (unified intelligence) | Medium (task-specific) |
Trade-off 2: Coverage vs. intelligence cost
- High coverage: 1,000 small-model instances scan all code → more bugs found
- High intelligence: one large model guesses where to look → may miss critical vulnerabilities
Economic optimum: choose model size according to task difficulty rather than assuming “bigger is better.”
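The coverage argument can be made concrete with a back-of-the-envelope model: under a fixed budget, expected bugs found is roughly total bugs × fraction of code covered × per-bug detection probability. The sketch below assumes bugs are spread uniformly and that files are scanned without any smart targeting; every number in it (bug count, codebase size, budget, detection rates, the $5/M price) is hypothetical.

```python
def expected_found(
    total_bugs: int,
    codebase_tokens: float,
    budget_usd: float,
    usd_per_million: float,
    detect_prob: float,
) -> float:
    """Expected bugs found when a fixed budget limits how much code gets scanned."""
    tokens_affordable = budget_usd / usd_per_million * 1_000_000
    coverage = min(1.0, tokens_affordable / codebase_tokens)
    return total_bugs * coverage * detect_prob


# Hypothetical scenario: 20 bugs hidden in a 500M-token codebase, $100 budget.
small = expected_found(20, 500e6, 100, usd_per_million=0.11, detect_prob=0.5)
large = expected_found(20, 500e6, 100, usd_per_million=5.00, detect_prob=0.9)
print(f"small model: {small:.2f} bugs, large model: {large:.2f} bugs")
```

In this toy setup the cheap model covers the entire codebase while the expensive one reaches only 4% of it, so the weaker detector still finds far more. Good target selection for the large model would narrow the gap, which is why targeting is part of the system-level moat.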
Trade-off 3: Automation vs. human involvement
| Decision | Human involvement | Automation |
|---|---|---|
| Code review | ✅ Required | ❌ Not fully automatable |
| Security policy | ✅ Required | ❌ Not fully automatable |
| Vulnerability fixes | ✅ Required | ✅ Can be automated |
| Triage and verification | ✅ Required | ✅ Can be automated |
| Target selection | ✅ Required | ❌ Not fully automatable |
The cybersecurity landscape in 2026
Gartner forecasts (2026)
- AI agents will cut customer-service operating costs by 30%
- Enterprises deploying AI agents report a 40-60% improvement in first-contact resolution rates
- Customer satisfaction scores improve by 12-18%
- Average cost per ticket falls from $15-25 to $1-3 (routine inquiries)
Industry trends
- Defense first: the Project Glasswing coalition (40+ companies) uses Mythos to find vulnerabilities
- Economic drivers: Anthropic has committed $100M in usage credits and $4M in donations to open-source security organizations
- Capability divergence: vulnerability discovery is commoditized, while exploit construction requires more capable models
- Systematization: success depends on system integration and expert knowledge rather than on any single model
Conclusion: the “jagged frontier” of AI cybersecurity
Core conclusions
- Capability is unevenly distributed: different tasks require different combinations of abilities, and small models can match large models on specific tasks
- The moat is the system, not the model: target selection, iterative deepening, verification, triage, and maintainer trust are what matter
- Economic optimum: choose model size according to task difficulty rather than assuming “bigger is better”
- Deployment strategy: broad-spectrum scanning (low cost) plus high-capability verification (critical vulnerabilities)
Strategic implications
| Stakeholder | Implication |
|---|---|
| Security teams | Mythos is not required; small models can be deployed to scan all code |
| Developers | Focus on maintainer acceptance, not just vulnerability discovery |
| Enterprises | Build modular pipelines instead of relying on a single model |
| Governments | Focus on systematic defense rather than model capability |
Next steps
- Assess existing systems: analyze the current vulnerability-discovery pipeline and identify room for optimization
- Choose a model mix: select model size according to task difficulty
- Invest in verification tooling: tools such as ASan and Valgrind are essential
- Build trust: make sure vulnerability reports are accepted and fixed
- Monitor capability metrics: track discovery rate, false-positive rate, and time to fix
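The metrics in the last step can be computed from a simple log of submitted reports. A minimal sketch, assuming each report records whether it was a true positive and when it was reported and fixed; the `Report` type, the field names, and the choice to express discovery rate per 1,000 files scanned are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Report:
    true_positive: bool
    reported: date
    fixed: Optional[date] = None


def summarize(reports: List[Report], files_scanned: int) -> Dict[str, Optional[float]]:
    """Compute discovery rate, false-positive rate, and mean time to fix."""
    confirmed = [r for r in reports if r.true_positive]
    fixed = [r for r in confirmed if r.fixed is not None]
    days = [(r.fixed - r.reported).days for r in fixed]
    return {
        # Confirmed findings per 1,000 files scanned (one possible definition).
        "discoveries_per_1k_files": 1000 * len(confirmed) / files_scanned,
        "false_positive_rate": (len(reports) - len(confirmed)) / len(reports),
        "mean_days_to_fix": sum(days) / len(days) if days else None,
    }


reports = [
    Report(True, date(2026, 1, 1), date(2026, 1, 5)),
    Report(True, date(2026, 1, 10), date(2026, 1, 20)),
    Report(True, date(2026, 2, 1)),   # confirmed but not yet fixed
    Report(False, date(2026, 2, 3)),  # rejected as a false positive
]
stats = summarize(reports, files_scanned=2000)
print(stats)
```

Tracking these numbers per model and per stage makes it possible to see whether a cheaper model is actually holding up on a given task.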
Sources:
- AISLE: AI Cybersecurity After Mythos: The Jagged Frontier
- Anthropic: Claude Mythos Preview Technical Blog
- Gartner: AI Agent Market & Monetization Update Q1 2026
Data sources:
- AISLE: FreeBSD detection results (8 of 8 models)
- OWASP benchmark: small-model false-positive judgment
- Anthropic Project Glasswing: vulnerability-discovery claims
Technical questions:
- Question 1: Why is FreeBSD buffer overflow detection commoditized, while the OpenBSD SACK bug requires mathematical reasoning?
- Question 2: Why does OWASP false-positive judgment show “inverse scaling” (small models outperforming large ones)?
- Question 3: How can a production-ready AI cybersecurity system be built while keeping both cost and accuracy under control?