GBrain server LLM suggestion

Aiko · May 13, 2026, 3:34am

我的建議：GBrain server 的主力模型用 anthropic/claude-sonnet-4.6 ；便宜任務用 anthropic/claude-haiku-4.5 ；深度/疑難任務才升級到 anthropic/claude-opus-4.7 。原因是 GBrain 自己的模型分層預設就是 utility → haiku-4-5 、reasoning → sonnet-4-6 、deep → opus-4-7 、subagent → sonnet-4-6 ，而且 subagent 路徑對非 Anthropic provider 有保護與 fallback。

優先級	用途	推薦 OpenRouter 模型 ID	價格，約每 1M tokens	Context	建議
1	GBrain 主力 / 日常 reasoning / agent workflow	`anthropic/claude-sonnet-4.6`	$3 input / $15 output	1M	最推薦作為 default。 Sonnet 4.6 在 OpenRouter 標示為 coding、agents、professional work，適合 GBrain 這種記憶、檢索、工具呼叫與長流程 agent server。
2	便宜任務 / query expansion / 分類 / 摘要 / 輕量 subtask	`anthropic/claude-haiku-4.5`	$1 input / $5 output	200K	適合大量小任務，成本比 Sonnet 低。GBrain 的 utility tier 也對應 Haiku。限制是 context 比 Sonnet/Opus 小。
3	深度分析 / 大型 codebase / 長任務 orchestration / 高風險決策	`anthropic/claude-opus-4.7`	$5 input / $25 output	1M	只建議用在 deep tier，不要全站預設都用 Opus，否則成本會上升。Opus 4.7 被 OpenRouter 描述為適合 long-running asynchronous agents、多階段 debugging、end-to-end orchestration。
4	成本敏感但仍要 1M context 的非核心任務	`deepseek/deepseek-v4-flash`	$0.14 input / $0.28 output	1M	很適合作為大量批次分析、草稿、低風險資料整理的替代模型。價格低，但我不建議拿它取代 GBrain subagent tier，因為 GBrain 對 subagent 有 Anthropic-only 設計。
5	成本/能力平衡的非 Claude 備援	`deepseek/deepseek-v4-pro`	$0.435 input / $0.87 output	1M	比 Flash 貴，但仍遠低於 Claude / GPT；適合 full-codebase analysis、多步自動化、大型資訊綜合的備援。
6	OpenAI 生態 / structured output / 工具呼叫穩定性偏好	`openai/gpt-5.5`	$5 input / $30 output	1M	可作為 Claude 之外的高品質 cross-check 或 review 模型；但對 GBrain 原生 subagent 不一定是最佳主路徑。
7	低延遲 multimodal / PDF / audio / video 輕量處理	`google/gemini-3.1-flash-lite`	$0.25 input / $1.50 output	1M	適合資料抽取、PDF/multimodal 輕量任務；不是我對 GBrain 主 reasoning 的首選，但可作旁路工具。

GBrain routing 建議	建議設定方向	原因
Default / reasoning	`anthropic/claude-sonnet-4.6`	最平衡。GBrain 的 reasoning 與 subagent 預設都偏向 Sonnet 4.6，且 Sonnet 4.6 在 OpenRouter 的 weekly rank 很高，context 1M。
Utility	`anthropic/claude-haiku-4.5`	適合便宜、快速、高頻任務，例如分類、query rewrite、簡短摘要。
Deep	`anthropic/claude-opus-4.7`	只在複雜任務升級使用，例如大型 repo 分析、長任務修復、多階段規劃。
Budget mode	`deepseek/deepseek-v4-flash` 或 `deepseek/deepseek-v4-pro`	適合非核心、可重跑、低風險任務；不要直接替代 subagent。
Cross-check / evaluator	`openai/gpt-5.5` 或 `google/gemini-3.1-flash-lite`	用不同模型家族做 review，比單一模型自我驗證更穩。

實務 insight	建議
不要 production 直接用 `latest` alias	OpenRouter 有 latest alias，但 production 建議 pin 明確模型 ID，例如 `anthropic/claude-sonnet-4.6`。OpenRouter 文件也提到可 pin specific model versions，避免模型變動造成不可預期差異。
開啟 fallback，但主模型要清楚	OpenRouter 預設會在 provider 間 routing/fallback 以提高 uptime；有 fallback 時，只對成功的 run 計費。這適合 GBrain server，但要觀察不同 provider 的 latency 與輸出差異。
有工具呼叫/JSON output 時，要求參數支援	OpenRouter 的 provider routing 支援 `require_parameters`，可限制只用支援你 request 參數的 provider。GBrain 這類 agent server 很常需要 tools / structured output，這點很重要。
免費模型不要當 production 主力	OpenRouter 免費層有每日與 RPM 限制；付費帳戶對 paid models 沒有相同平台級限制。GBrain server 如果常駐跑 cron / memory / subagent，不適合依賴免費模型。
若處理敏感資料，設定 provider data policy	OpenRouter 的 provider routing 有 `data_collection` 與 `zdr` 欄位；如果 GBrain 會吃 email、meeting、CRM、私人筆記，建議把 data retention policy 納入 routing。
Embedding 與 chat model 分開選	GBrain 文件提到 embedding providers 可用 OpenAI、Voyage、Google Gemini、Azure OpenAI、MiniMax、DashScope、Zhipu、Ollama、llama.cpp、LiteLLM 等；chat/reasoning 模型不一定要跟 embedding provider 相同。

我的最終配置建議：

GBrain 層級	模型
`models.default` / main chat	`anthropic/claude-sonnet-4.6`
`models.tier.utility`	`anthropic/claude-haiku-4.5`
`models.tier.reasoning`	`anthropic/claude-sonnet-4.6`
`models.tier.deep`	`anthropic/claude-opus-4.7`
`models.tier.subagent`	`anthropic/claude-sonnet-4.6`
budget fallback	`deepseek/deepseek-v4-flash`
evaluator / second opinion	`openai/gpt-5.5` 或 `deepseek/deepseek-v4-pro`

結論：不要只選一個模型跑全部。 對 GBrain server 最穩的做法是「Claude Sonnet 4.6 當主力、Haiku 4.5 承接便宜小任務、Opus 4.7 只處理 deep tasks、DeepSeek 作低成本旁路」。這樣比較符合 GBrain 的模型分層設計，也能控制成本與延遲。