【編程干貨】本地用 Ollama + LLaMA 3 實現 Model Context Protocol（MCP）對話服務

模型上下文協議（MCP）本身提供的是一套標準化的通信規則和接口，簡化了客戶端應用的開發。

MCP 實際上是一套規范，官方把整套協議分成「傳輸層 + 協議層 + 功能層」三大塊，并對初始化握手、能力協商、數據/工具暴露、安全約束等均給出了約束性 JSON-Schema。

基于 JSON-RPC 2.0 的開放協議，目標是像 USB-C 一樣為 LLM 應用提供統一“插槽”。它定義三方角色——Host（應用）／Client（適配層）／Server（能力提供者），通過標準消息把資源、工具、提示模板等上下文按需送進 LLM，同時保留“人-機共管”審批流程，實現安全、可組合的本地或遠程代理體系。

但是，需要注意，不要連接不信任的 server，MCP 沒有在數據安全上提供任何保障。

下面我們可以實現一個最簡的 MCP 服務。

0 先決條件

軟件	版本建議	作用
操作系統	Windows 10/11、macOS 13+ 或任何較新的 Linux	僅需能運行 Ollama
Ollama	≥ 0.1.34	本地大語言模型后端
Python	3.10 – 3.12	運行 FastAPI 服務
Git	最新	克隆或管理代碼
CUDA 12+（可選）	若使用 NVIDIA GPU	提升推理速度

CPU 也可運行，但 LLaMA 3-8B 在 GPU 上體驗最佳。

1 安裝 Ollama 并拉取 LLaMA 3

# 1.1 安裝（macOS / Linux）
curl -fsSL https://ollama.com/install.sh | sh
# Windows 請下載 .msi 并雙擊安裝# 1.2 啟動后臺（首次安裝會自動啟動）
ollama serve# 1.3 拉取 8B 模型（約 4.7 GB）
ollama pull llama3:8b

2 創建最小可用的 MCP 后端

2.1 目錄結構

mcp-ollama/
│
├── app/
│   ├── main.py # FastAPI 入口
│   ├── mcp.py # 協議封裝
│   └── store.py # 簡易上下文存儲
└── requirements.txt

2.2 requirements.txt

fastapi==0.110.*
uvicorn[standard]==0.29.*
httpx==0.27.*
python-dotenv==1.0.*

2.3 MCP 協議封裝（app/mcp.py）

from typing importList, Dictdefbuild_prompt(history: List[Dict[str, str]], user_msg: str) -> str:"""把歷史消息與當前輸入拼接成大模型 prompt。MCP 規定 key 為 role / content，與 OpenAI messages 一致。"""lines = []for msg in history:lines.append(f"{msg['role'].upper()}: {msg['content']}")lines.append(f"USER: {user_msg}")return"\n".join(lines)SYSTEM_PROMPT = ("You are an AI assistant that follows instructions precisely. ""Answer in the same language as the question."
)

2.4 上下文存儲（app/store.py）

from collections import defaultdict, deque_MAX_TURNS = 8 # 只保留最近 8 輪
_history = defaultdict(lambda: deque(maxlen=_MAX_TURNS))def add_msg(session_id: str, role: str, content: str):_history[session_id].append({"role": role, "content": content})def get_history(session_id: str):return list(_history[session_id])

2.5 FastAPI 入口（app/main.py）

import os, uuid, httpx
from fastapi import FastAPI
from pydantic import BaseModel
from app.mcp import build_prompt, SYSTEM_PROMPT
from app.store import add_msg, get_historyOLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")classChatReq(BaseModel):session_id: str | None = Nonemessage: strclassChatResp(BaseModel):session_id: stranswer: strapp = FastAPI(title="MCP Demo")@app.post("/chat", response_model=ChatResp) async def chat(req: ChatReq):sid = req.session_id orstr(uuid.uuid4())history = get_history(sid)prompt = f"{SYSTEM_PROMPT}\n{build_prompt(history, req.message)}"payload = {"model": "llama3:8b", "prompt": prompt, "stream": False}asyncwith httpx.AsyncClient() as client:r = await client.post(f"{OLLAMA_URL}/api/generate", json=payload, timeout=120)r.raise_for_status()ans = r.json()["response"]add_msg(sid, "user", req.message)add_msg(sid, "assistant", ans)return ChatResp(session_id=sid, answer=ans)

2.6 運行服務

pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000

3 驗證接口

# 第一次調用
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-d '{"message":"Hello, what is MCP?"}' # 記錄返回的 session_id# 帶歷史上下文的第二次調用
curl -X POST http://localhost:8000/chat \
-H "Content-Type: application/json" \
-d '{"session_id":"<上一步返回的id>","message":"Explain in Chinese."}'

你將看到回答中自動繼承了第一輪對話，上下文邏輯即生效。

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/bicheng/79907.shtml
繁體地址，請注明出處：http://hk.pswp.cn/bicheng/79907.shtml
英文地址，請注明出處：http://en.pswp.cn/bicheng/79907.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！