Public Observation Node
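MCP Server Implementation Patterns Tutorial: Production Practice in 2026
A complete tutorial on Model Context Protocol server implementation patterns, covering architecture patterns, tool patterns, resource patterns, observability, and deployment scenarios, with concrete metrics and risk analysis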
This article is one route in OpenClaw's external narrative arc.
Preface: From reference implementation to production-ready
A Model Context Protocol (MCP) server connects an LLM to external tools, data sources, and systems. The official reference implementations (modelcontextprotocol/servers) are a good starting point, but production environments need additional attention to reliability, observability, and security.
This tutorial covers:
- Architecture patterns (tools, resources, prompts)
- Observability and monitoring
- Deployment scenarios (containerization, edge, microservices)
- Risks and safeguards
Part 1: Architecture Patterns
1.1 Tool Patterns (Tools)
Core concept: Tools are functions that the LLM can call directly; each returns a structured JSON response.
Pattern A: Synchronous tool
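import type { CallToolRequest, CallToolResult, Tool } from "@modelcontextprotocol/sdk/types.js";

// The SDK's Tool type carries only metadata, so the handler is attached alongside it.
export type ToolWithHandler = Tool & {
  handler: (call: CallToolRequest) => Promise<CallToolResult>;
};

export function syncTool<T>(
  name: string,
  description: string,
  // JSON Schema is a runtime value; TypeScript types are erased at compile
  // time, so the property definitions are passed in explicitly.
  properties: Record<string, { type: "string" | "number" | "boolean" }>,
  execute: (args: Record<string, unknown>) => T
): ToolWithHandler {
  return {
    name,
    description,
    inputSchema: {
      type: "object",
      properties,
      required: []
    },
    handler: async (call) => {
      // Tool arguments arrive under params.arguments and may be omitted
      const result = execute(call.params.arguments ?? {});
      return {
        content: [{ type: "text", text: JSON.stringify(result, null, 2) }]
      };
    }
  };
}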
Pattern B: Asynchronous tool
- Suited to network requests and long-running tasks
- Handle Promise rejections and enforce a timeout
- Mark the returned Tool with isAsync: true (an application-level flag; the MCP Tool schema does not define it)
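The timeout control mentioned above could be sketched with `Promise.race`. This is a minimal illustration rather than anything provided by the MCP SDK; `withTimeout` and `TimeoutError` are hypothetical names introduced here.

```typescript
// Hypothetical error type raised when the time budget is exceeded.
export class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms}ms`);
    this.name = "TimeoutError";
  }
}

// Hypothetical helper: rejects with TimeoutError if `work` has not settled within `ms`.
export function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared either way.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Note that this only stops waiting; the underlying work keeps running unless it is also given a cancellation signal such as an `AbortSignal`.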
Pattern C: Composite tool
- Combines several sub-tools and returns an aggregated result
- Example: search tool → summarization tool → sorting tool
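One way to picture the search → summarize → sort chain is as a composition of async steps. The sketch below uses stub sub-tools over an in-memory array; `Doc`, `pipe`, `corpus`, and the three step functions are all hypothetical and stand in for real services.

```typescript
// Hypothetical document shape used only for this illustration.
interface Doc { title: string; body: string; score: number }
type Step<I, O> = (input: I) => Promise<O>;

// Composes two async steps into one.
function pipe<A, B, C>(first: Step<A, B>, second: Step<B, C>): Step<A, C> {
  return async (input) => second(await first(input));
}

// Stub data standing in for a real search backend.
const corpus: Doc[] = [
  { title: "Intro", body: "mcp basics and setup", score: 0.4 },
  { title: "Deploy", body: "mcp on kubernetes", score: 0.9 },
  { title: "Other", body: "unrelated notes", score: 0.7 }
];

// Sub-tool 1: keep documents whose body mentions the query.
const search: Step<string, Doc[]> = async (query) =>
  corpus.filter((d) => d.body.includes(query));
// Sub-tool 2: "summarize" by truncating the body to 10 characters.
const summarize: Step<Doc[], Doc[]> = async (docs) =>
  docs.map((d) => ({ ...d, body: d.body.slice(0, 10) }));
// Sub-tool 3: sort by descending score without mutating the input.
const sortByScore: Step<Doc[], Doc[]> = async (docs) =>
  [...docs].sort((a, b) => b.score - a.score);

export const searchSummarizeSort = pipe(pipe(search, summarize), sortByScore);
```

Keeping each step a separate function means each can be unit-tested on its own, which helps avoid the "overly complex tool" anti-pattern.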
1.2 Resource Patterns (Resources)
Core concept: Resources are static or dynamic content that the LLM can read.
Pattern A: Static resource
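import type { Resource, TextResourceContents } from "@modelcontextprotocol/sdk/types.js";

// Resource is the descriptor listed to clients; the body itself belongs in
// the resource contents, so both are built together here.
export function staticResource<T>(
  uri: string,
  name: string,
  mimeType: string,
  content: T
): { resource: Resource; contents: TextResourceContents } {
  return {
    resource: { uri, name, mimeType, description: `Read-only resource: ${name}` },
    contents: { uri, mimeType, text: JSON.stringify(content, null, 2) }
  };
}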
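Pattern B: Dynamic resource
- Suited to data fetched from an API on each read
- Return the payload in the text or blob field of the resource contents
- Error handling: report a failed read as a protocol error; isError: true is a field of tool results, not of ResourceContents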
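A dynamic resource read might look roughly like the sketch below. The `TextContents` interface is a local stand-in for the text variant of resource contents, and `load` is a hypothetical data source (for example a wrapper around an HTTP call); neither comes from the SDK.

```typescript
// Local stand-in for text resource contents (uri, mimeType, text).
interface TextContents { uri: string; mimeType: string; text: string }

// Reads a resource on demand. `load` is a hypothetical fetcher supplied by the caller.
export async function readDynamicResource(
  uri: string,
  load: (uri: string) => Promise<unknown>
): Promise<TextContents> {
  try {
    const data = await load(uri);
    return { uri, mimeType: "application/json", text: JSON.stringify(data, null, 2) };
  } catch (cause) {
    // Re-throw with context so the server layer can turn it into an error response.
    const reason = cause instanceof Error ? cause.message : String(cause);
    throw new Error(`Failed to read resource ${uri}: ${reason}`);
  }
}
```

Throwing keeps failure handling in one place: the server layer decides how the error is reported to the client.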
1.3 Prompt Patterns (Prompts)
Core concept: Prompts are predefined prompt templates that can be customized with arguments.
Pattern A: Template prompt
- Use a template engine (such as Handlebars)
- Example: Hello {{user}}, your balance is {{amount}}
Pattern B: Parameterized prompt
- Supports multiple parameters
- Example: {{topic}} analysis report with {{date}}
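To make the placeholder substitution concrete, here is a minimal stand-in for a template engine. It is not Handlebars and supports only plain `{{name}}` placeholders; `renderPrompt` is a hypothetical helper.

```typescript
// Replaces {{name}} placeholders with values from `args`; throws if one is missing.
export function renderPrompt(template: string, args: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, key: string) => {
    const value = args[key];
    if (value === undefined) throw new Error(`Missing prompt argument: ${key}`);
    return value;
  });
}

const greeting = renderPrompt("Hello {{user}}, your balance is {{amount}}", {
  user: "Ada",
  amount: "42"
});
// greeting is "Hello Ada, your balance is 42"
```

Failing loudly on a missing argument is a deliberate choice here, so that a half-filled prompt never reaches the model.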
Part 2: Observability and Monitoring
2.1 Request Tracing
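Pattern A: Standard headers
import { v4 as uuidv4 } from "uuid";

// Request headers that let a call be correlated across services and logs
const headers = new Headers();
headers.set("X-Request-ID", uuidv4());
headers.set("X-Server-Time", Date.now().toString());
headers.set("X-Model-Used", "gpt-4-turbo");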
Pattern B: Distributed tracing
- Propagate X-Trace-ID and X-Span-ID headers
- Integrate OpenTelemetry: const span = trace.getTracer("mcp-server").startSpan("mcp-server.handler")
- Record attributes: span.setAttribute("tool.name", toolName)
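The header propagation above can be sketched without any tracing library. The ID sizes follow the W3C Trace Context convention (16-byte trace ID, 8-byte span ID); `TraceContext`, `nextSpan`, and `traceHeaders` are hypothetical names, and a real deployment would more likely rely on the OpenTelemetry SDK's propagators.

```typescript
import { randomBytes } from "node:crypto";

export interface TraceContext { traceId: string; spanId: string }

// Starts a new trace, or a child span that inherits the parent's trace ID.
export function nextSpan(parent?: TraceContext): TraceContext {
  return {
    traceId: parent?.traceId ?? randomBytes(16).toString("hex"),
    spanId: randomBytes(8).toString("hex")
  };
}

// Maps a context onto the custom headers named in this section.
export function traceHeaders(ctx: TraceContext): Record<string, string> {
  return { "X-Trace-ID": ctx.traceId, "X-Span-ID": ctx.spanId };
}
```

An incoming request with an `X-Trace-ID` would be turned into a parent context, and each outgoing call would carry the headers of a fresh child span.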
2.2 Request Metrics
Pattern A: Metrics collection
// Prometheus-style metric definitions, kept in memory
export const metrics = {
  requests_total: { type: "counter", value: 0 },
  requests_duration_ms: { type: "histogram", buckets: [10, 50, 100, 500, 1000] },
  tools_invoked: { type: "gauge", value: 0 },
  errors_total: { type: "counter", value: 0 }
};
// Update the metrics after each request; `error` is the caught error, if any
metrics.requests_total.value += 1;
metrics.errors_total.value += error ? 1 : 0;
Pattern B: Custom dashboards
- Grafana dashboard panels: tool_latency_p50, tool_latency_p99, error_rate
- Alert rules: error_rate > 1% or p99_latency > 500ms
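The alert rule above reduces to two numbers: a latency percentile and an error ratio. The sketch below computes both from raw samples using the nearest-rank method; in practice Prometheus and Grafana would evaluate this from histogram data, so `percentile` and `shouldAlert` are illustrative helpers only.

```typescript
// Nearest-rank percentile over recorded latencies (milliseconds).
export function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Mirrors the rule "error_rate > 1% or p99_latency > 500ms".
export function shouldAlert(latenciesMs: number[], errors: number, total: number): boolean {
  const errorRate = total === 0 ? 0 : errors / total;
  return errorRate > 0.01 || percentile(latenciesMs, 99) > 500;
}
```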
2.3 Structured Logging
Pattern A: JSON logging
const log = {
  timestamp: new Date().toISOString(),
  level: "info",
  requestId: request.headers.get("X-Request-ID"),
  tool: tool.name,
  duration: `${duration}ms`,
  // A runtime value is needed here, not a union type
  status: error ? "error" : "success",
  errorMessage: error?.message,
  stackTrace: error?.stack
};
logger.info(JSON.stringify(log));
Pattern B: Contextual logging
- Attach the operation context: user_id, session_id, organization_id
- Use the Logstash format: {"@timestamp": "...", "@message": "...", "@fields": {...}}
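Contextual logging can be sketched as a logger factory that binds the context once and stamps it on every entry. The `@timestamp` / `@message` / `@fields` layout follows the format shown above; `createLogger` and `LogContext` are hypothetical names, and the `write` parameter exists only so the output can be redirected.

```typescript
interface LogContext { user_id?: string; session_id?: string; organization_id?: string }
type Level = "info" | "warn" | "error";

// Returns a log function that merges the bound context into every entry.
export function createLogger(
  context: LogContext,
  write: (line: string) => void = console.log
) {
  return (level: Level, message: string, fields: Record<string, unknown> = {}): void => {
    write(JSON.stringify({
      "@timestamp": new Date().toISOString(),
      "@message": message,
      "@fields": { level, ...context, ...fields }
    }));
  };
}
```

A request handler would typically create one logger per request, so every line it emits carries the same user and session identifiers.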
Part 3: Deployment scenarios
3.1 Containerized Deployment
Pattern A: Docker multi-stage build
# Build stage: installs devDependencies too, since the build step needs them
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage: production dependencies and compiled output only
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
Pattern B: Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  namespace: ai
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
        - name: server
          image: registry.example.com/mcp-server:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          env:
            - name: MCP_SERVER_PORT
              value: "3000"
            - name: LOG_LEVEL
              value: "info"
3.2 Edge Deployment
Pattern A: Serverless functions
- AWS Lambda / Cloudflare Workers
- Advantages: automatic scaling, on-demand execution
- Disadvantages: cold-start latency, execution time limits
Pattern B: Edge containers
- Use ko (Knative) or KEDA
- Deploy to edge nodes (CDN, IoT devices)
- Advantages: low latency, close to users
- Risks: limited resources, unstable network
3.3 Microservices Deployment
Pattern A: Independent servers
- Each MCP server is deployed independently
- Communicate over gRPC or HTTP
- Advantages: independent scaling, fault isolation
Pattern B: Proxy pattern
- Use an API gateway (such as Kong or Envoy)
- Proxy to multiple MCP servers
- Advantages: single entry point, load balancing
Part 4: Risks and Protection
4.1 Security Patterns
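Pattern A: Input validation
export function validateToolInput<T extends Record<string, unknown>>(
  tool: Tool,
  args: Record<string, unknown>
): args is T {
  const properties = (tool.inputSchema.properties ?? {}) as Record<string, { type?: string }>;
  const required = tool.inputSchema.required ?? [];
  // Required keys are listed on the schema itself, not on each property
  for (const key of required) {
    if (args[key] === undefined) return false;
  }
  // Check the type of every supplied argument the schema knows about.
  // typeof only lines up with JSON Schema for string, number, and boolean.
  for (const [key, actual] of Object.entries(args)) {
    const expected = properties[key];
    if (!expected) continue;
    if (expected.type && typeof actual !== expected.type) return false;
  }
  return true;
}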
Pattern B: Permission control
- Use a @scope decorator: @scope("admin")
- Check user permissions: if (!user.hasPermission("admin")) throw new Error("Forbidden")
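The same check can be expressed as a higher-order function instead of a decorator, which avoids depending on decorator support in the build. Everything here (`User`, `Handler`, `ForbiddenError`, `requireScope`) is a hypothetical sketch, not an API from the MCP SDK.

```typescript
interface User { id: string; permissions: string[] }
type Handler<A, R> = (user: User, args: A) => Promise<R>;

export class ForbiddenError extends Error {
  constructor(scope: string) {
    super(`Forbidden: missing scope "${scope}"`);
    this.name = "ForbiddenError";
  }
}

// Wraps a handler so it only runs when the user holds the required scope.
export function requireScope<A, R>(scope: string, handler: Handler<A, R>): Handler<A, R> {
  return async (user, args) => {
    if (!user.permissions.includes(scope)) throw new ForbiddenError(scope);
    return handler(user, args);
  };
}

// Usage: protect a (stub) file-writing tool with the "file:write" scope.
export const writeFileTool = requireScope(
  "file:write",
  async (_user: User, args: { path: string }) => `wrote ${args.path}`
);
```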
Pattern C: Rate limiting
- Enforce limits with express-rate-limit and verify them under load with k6
- Example: rateLimit({ windowMs: 60*1000, max: 100 })
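To show what the `windowMs` and `max` settings mean, here is a small in-memory fixed-window limiter. It is an illustration of the idea, not the algorithm or API of express-rate-limit; `FixedWindowLimiter` is a hypothetical class, and the injectable clock exists only to make it testable.

```typescript
// Allows at most `max` calls per key within each window of `windowMs` milliseconds.
export class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly windowMs: number;
  private readonly max: number;
  private readonly now: () => number;

  constructor(windowMs: number, max: number, now: () => number = Date.now) {
    this.windowMs = windowMs;
    this.max = max;
    this.now = now;
  }

  allow(key: string): boolean {
    const t = this.now();
    let current = this.windows.get(key);
    // Start a fresh window on first use or once the old one has elapsed.
    if (!current || t - current.start >= this.windowMs) {
      current = { start: t, count: 0 };
      this.windows.set(key, current);
    }
    if (current.count >= this.max) return false;
    current.count += 1;
    return true;
  }
}
```

A single-process map like this does not survive restarts or span replicas; with several replicas, the counters would need to live in a shared store.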
4.2 Error Handling Patterns
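Pattern A: Error categorization
export type ErrorCategory = "validation" | "network" | "permission" | "unknown";
// Application-defined error classes used for classification
export class ValidationError extends Error {}
export class NetworkError extends Error {}
export class PermissionError extends Error {}

export function classifyError(error: unknown): ErrorCategory {
  if (error instanceof ValidationError) return "validation";
  if (error instanceof NetworkError) return "network";
  if (error instanceof PermissionError) return "permission";
  return "unknown";
}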
Pattern B: Fallback strategy
- When the primary tool fails, try a backup tool
- Example: primary search tool fails → file search tool
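A fallback chain can be sketched as a loop over candidate tools that returns the first success. `withFallback` is a hypothetical helper; the stub tools in the usage lines stand in for a web search and a file search.

```typescript
// Tries each tool in order and returns the first successful result.
export async function withFallback<A, R>(
  args: A,
  tools: Array<(args: A) => Promise<R>>
): Promise<R> {
  const errors: unknown[] = [];
  for (const tool of tools) {
    try {
      return await tool(args);
    } catch (e) {
      errors.push(e); // remember the failure and move on to the next tool
    }
  }
  throw new Error(`All ${tools.length} tools failed: ${errors.map(String).join("; ")}`);
}

// Usage with stubs: the primary search fails, the file search answers.
const primarySearch = async (_q: string): Promise<string[]> => { throw new Error("upstream down"); };
const fileSearch = async (q: string): Promise<string[]> => [`local:${q}`];
export const searchResult = withFallback("mcp", [primarySearch, fileSearch]);
```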
4.3 Resource Management Patterns
Pattern A: Caching strategy
- Cache tool results in Redis
- Cache key: mcp:tool:${toolName}:${argsHash}
- Cache expiry: 3600 seconds (EXPIRE key 3600)
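The `argsHash` part of the key only works if equal arguments always produce the same hash, regardless of property order. One possible sketch, assuming SHA-256 truncated to 16 hex characters (a choice made here, not stated in the article), with `stableStringify` and `cacheKey` as hypothetical helpers:

```typescript
import { createHash } from "node:crypto";

// Serializes with sorted object keys so that equal arguments hash identically.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Builds a key in the mcp:tool:<toolName>:<argsHash> layout.
export function cacheKey(toolName: string, args: unknown): string {
  const argsHash = createHash("sha256")
    .update(stableStringify(args))
    .digest("hex")
    .slice(0, 16);
  return `mcp:tool:${toolName}:${argsHash}`;
}
```

The resulting key would then be written to Redis together with the expiry, so stale results drop out on their own.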
Pattern B: Streaming
- Use Server-Sent Events (SSE) or WebSocket
- Example: streaming the results of a long-running search
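On the wire, an SSE message is a few `field: value` lines followed by a blank line. The sketch below formats one message; `formatSseEvent` is a hypothetical helper, and writing the string to the HTTP response is left to whichever server framework is in use.

```typescript
// Formats one Server-Sent Events message. Multi-line data is split into
// one "data:" line per line, and a blank line terminates the message.
export function formatSseEvent(event: string, data: string, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${event}`);
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  return lines.join("\n") + "\n\n";
}

// Usage: turn a batch of partial search results into SSE messages.
const partialResults = [{ title: "Doc A" }, { title: "Doc B" }];
export const sseMessages = partialResults.map((r, i) =>
  formatSseEvent("result", JSON.stringify(r), String(i))
);
```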
Part 5: Case Studies
Case 1: Filesystem MCP Server
Requirement: let the LLM read and write files.
Implementation notes:
- Use the fs.promises API
- Add a permission check: if (!user.hasPermission("file:write")) throw Error
- Use caching: cache stat results to avoid repeated system calls
Metrics:
- Latency: p50: 15ms, p95: 45ms, p99: 120ms
- Misuse rate: < 0.1% (with validation)
- Request load: 200 QPS
Case 2: Search MCP Server
Requirement: let the LLM search web pages and the file system.
Implementation notes:
- Integrate the Brave Search API
- Use caching: cache search results for 1 hour
- Use streaming: return search results incrementally
Metrics:
- Search time: p50: 200ms, p95: 500ms, p99: 1200ms
- API error rate: < 0.5%
- Cache hit rate: 85%
Case 3: Database MCP Server
Requirement: let the LLM query a database.
Implementation notes:
- Use a connection pool (such as pg-pool or mysql-pool)
- Parameterize SQL to prevent SQL injection
- Query monitoring: record query_time, rows_returned
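Parameterization means values travel separately from the SQL text. The sketch below builds a query object with PostgreSQL-style `$1, $2, ...` placeholders; `selectWhere` and `ParameterizedQuery` are hypothetical, and the identifier allow-list is an extra assumption added because table and column names cannot be bound as parameters.

```typescript
interface ParameterizedQuery { text: string; values: unknown[] }

// Builds a SELECT whose filter values never appear in the SQL text.
export function selectWhere(
  table: string,
  columns: string[],
  filters: Record<string, unknown>
): ParameterizedQuery {
  if (columns.length === 0) throw new Error("At least one column is required");
  const keys = Object.keys(filters);
  // Identifiers cannot be parameterized, so restrict them to a safe pattern.
  const identifier = /^[a-z_][a-z0-9_]*$/i;
  for (const name of [table, ...columns, ...keys]) {
    if (!identifier.test(name)) throw new Error(`Invalid identifier: ${name}`);
  }
  const where = keys.map((k, i) => `${k} = $${i + 1}`).join(" AND ");
  return {
    text: `SELECT ${columns.join(", ")} FROM ${table}` + (where ? ` WHERE ${where}` : ""),
    values: keys.map((k) => filters[k])
  };
}
```

The resulting `text` and `values` pair is what a driver's query call would receive, with the driver binding the values on the database side.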
Metrics:
- Query time: p50: 30ms, p95: 80ms, p99: 200ms
- SQL injection protection: 100% (parameterized)
- Connection pool: max: 20, idle_timeout: 300
Part 6: Best Practices and Anti-Patterns
6.1 Best Practices
- Tool naming: start with a verb, such as listFiles, searchDocuments
- Schema design: derive the schema from TypeScript types to keep the two consistent
- Error messages: clear, specific, and actionable
- Monitoring: record at least request_duration, error_count
- Testing: unit-test the tools, integration-test the monitoring, load-test for latency
6.2 Anti-Patterns
- Overly complex tools: one tool does too much and becomes hard to test
- Missing input validation: opens the door to injection attacks
- No error handling: the service crashes
- No monitoring: problems are hard to trace
- Poor cache policy: expiry times that are too short or too long
Part 7: Metrics
7.1 Performance Metrics
| Metric | Target | Description |
|---|---|---|
| p50_latency_ms | < 50ms | Latency within which 50% of requests complete |
| p95_latency_ms | < 200ms | Latency within which 95% of requests complete |
| p99_latency_ms | < 500ms | Latency within which 99% of requests complete |
| error_rate | < 1% | Share of requests that fail |
| throughput_qps | > 100 | Requests per second |
7.2 Availability Metrics
| Metric | Target | Description |
|---|---|---|
| uptime_24h | > 99.9% | 24-hour availability |
| uptime_7d | > 99.9% | 7-day availability |
| max_latency_ms | < 2000ms | Maximum latency |
7.3 Safety Metrics
| Metric | Target | Description |
|---|---|---|
| injection_rate | < 0.01% | Share of requests that are injection attempts |
| permission_denied_rate | < 0.1% | Share of requests denied for lack of permission |
| rate_limit_violation | < 0.5% | Share of requests that exceed the rate limit |
Part 8: Summary
Production deployment of MCP servers requires:
- Architecture patterns: a clear separation of tools, resources, and prompts
- Observability: tracing, metrics, structured logging
- Deployment patterns: containerized, edge, microservices
- Risk protection: input validation, permission control, rate limiting
- Case studies: filesystem, search, and database servers
Key metrics:
- Latency: p50 < 50ms, p95 < 200ms, p99 < 500ms
- Availability: uptime > 99.9%
- Protection: error_rate < 1%
Next steps:
- Use the modelcontextprotocol/servers reference implementations as a starting point
- Add monitoring, logging, and security
- Deploy to Kubernetes
- Choose the appropriate deployment pattern (containerized, edge, microservices)