FAQ
A collection of API issues encountered across MaaS providers.
tip
- Tool call caching actually caches the tool schema, the descriptions, and so on.
Anthropic on Bedrock needs a thinking block when thinking is enabled
Expected `thinking` or `redacted_thinking`, but found `tool_use`.
When `thinking` is enabled, a final `assistant` message must start with a thinking block
- GCP Vertex AI is less strict about this.
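The error above can be caught before the request is sent. The following is a minimal sketch, assuming Anthropic-style messages where `content` is a list of typed blocks; the function name `final_assistant_starts_with_thinking` is hypothetical and not part of any SDK.

```python
from typing import Any

# Block types that may open an assistant message when thinking is enabled.
THINKING_TYPES = {"thinking", "redacted_thinking"}


def final_assistant_starts_with_thinking(messages: list[dict[str, Any]]) -> bool:
    """Return True if the last assistant message opens with a thinking block.

    Hypothetical helper: it only mirrors the rule quoted in the error text.
    """
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        # Plain-string content or an empty list cannot start with a thinking block.
        if not isinstance(content, list) or not content:
            return False
        return content[0].get("type") in THINKING_TYPES
    return True  # no assistant message yet, nothing to check
```

Running this check only when thinking is enabled keeps it from rejecting ordinary conversations.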
Thinking encryption
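- Closed-source models encrypt their thinking content to prevent distillation.
- They may provide a summary of the thinking content instead.
- Encrypting the thinking content produces a signature.
- With interleaved thinking, tool calls also carry thinking information so that the reasoning state is preserved.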
Vertex AI
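- A thought_signature on non-function-call parts is not strictly required, but including it is recommended.
  - It helps the model keep its reasoning quality high.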
{
  "content": {
    "role": "model",
    "parts": [
      {
        "functionCall": {
          "name": "check_flight",
          "args": {
            "flight": "AA100"
          }
        },
        "thoughtSignature": "<SIGNATURE_A>"
      }
    ]
  }
}
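A quick pre-flight check for the payload shape above might look like the sketch below. It assumes that only the first `functionCall` part of a model turn has to carry a `thoughtSignature` (later parallel calls may not have one); that reading, and the helper name `first_function_call_has_signature`, are assumptions rather than anything confirmed here.

```python
from typing import Any


def first_function_call_has_signature(content: dict[str, Any]) -> bool:
    """Check that the first functionCall part carries a thoughtSignature.

    Assumption: the signature is required on the first functionCall part
    of a model turn. Turns without function calls pass trivially.
    """
    for part in content.get("parts", []):
        if "functionCall" in part:
            return bool(part.get("thoughtSignature"))
    return True
```

A history that fails this check is a likely candidate for the "Missing thought_signature in function call" error.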
Anthropic
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "redacted_thinking",
      "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpP..."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
- Streaming delta type: signature_delta
- redacted_thinking
  - Sonnet 3.7
- signature
  - Claude 4+
  - Returns summarized thinking content
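When displaying a response like the one above, it can help to separate readable thinking, redacted thinking, and the final answer. This is a minimal sketch over the block shapes shown in the example; `split_blocks` is a hypothetical helper, and the original block list should still be sent back unchanged on the next turn.

```python
from typing import Any


def split_blocks(content: list[dict[str, Any]]) -> dict[str, Any]:
    """Group Anthropic-style content blocks for display purposes only."""
    thinking = [b["thinking"] for b in content if b.get("type") == "thinking"]
    # Redacted blocks carry only opaque data, so we just count them.
    redacted = sum(1 for b in content if b.get("type") == "redacted_thinking")
    text = "".join(b["text"] for b in content if b.get("type") == "text")
    return {"thinking": thinking, "redacted_count": redacted, "text": text}
```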
Bedrock special test prompt
ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB
role developer vs system
- Introduced by OpenAI with o1-2024-12-17 and later models.
- The developer role carries more weight than system.
- developer
  - Emphasizes rules
- system
  - Emphasizes the role (persona)
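A gateway that talks to both older and newer models may need to rename the instruction role per target. The sketch below assumes the caller already knows whether the target model accepts `developer`; the function name and the `supports_developer` flag are hypothetical.

```python
from typing import Any


def normalize_instruction_role(
    messages: list[dict[str, Any]], supports_developer: bool
) -> list[dict[str, Any]]:
    """Rename system -> developer (or the reverse) without mutating the input."""
    if supports_developer:
        source, target = "system", "developer"
    else:
        source, target = "developer", "system"
    return [
        {**m, "role": target} if m.get("role") == source else dict(m)
        for m in messages
    ]
```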
AI_APICallError: Error while downloading [URL REDACTED].
OpenAI-related endpoints seem to reject images sourced from Wikimedia.
Output Speed
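| Reference | TPS |
|---|---|
| Reading aloud / audiobooks | 3-4 |
| Normal silent reading | 5-10 |
| Fast skimming | 15-25 |

| Model | TPS |
|---|---|
| Claude Sonnet 4.5 | 40 |
| gemini-3-flash-preview | 80-100 |

| Tier | TPS | Typical use cases |
|---|---|---|
| Ultra-fast (Instant) | 800-1200 | Real-time voice assistants, search suggestions |
| Fast | 150-250 | Simple translation, summarization, simple dialogue |
| Standard | 70-100 | Complex instructions, code generation, subtitles |
| Heavy | 20-50 | In-depth writing, complex logical reasoning |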
- Prefill speed
  - Typically > 2000 tokens/s
  - Context caching speeds up prefill
- TPS: tokens per second
- Thinking affects speed
- The thinking budget affects thinking depth
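The figures above combine into a rough end-to-end latency estimate: prefill time plus decode time. The sketch below is back-of-the-envelope arithmetic only. It assumes cached prompt tokens cost no prefill time (a simplification) and that thinking tokens are counted as output tokens; the function and parameter names are hypothetical.

```python
def estimate_latency_seconds(
    prompt_tokens: int,
    output_tokens: int,
    output_tps: float,
    prefill_tps: float = 2000.0,  # "typically > 2000 tokens/s" from the notes
    cached_tokens: int = 0,       # assumption: cached tokens skip prefill entirely
) -> float:
    """Estimate latency as prefill time plus decode time."""
    prefill = max(prompt_tokens - cached_tokens, 0) / prefill_tps
    decode = output_tokens / output_tps  # include thinking tokens here
    return prefill + decode
```

For example, an 8000-token prompt with 400 output tokens at 40 TPS gives 8000 / 2000 + 400 / 40 = 14 seconds; caching 6000 of the prompt tokens brings the estimate down to 11 seconds.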
Gemini
Missing thought_signature in function call
Please ensure that the number of function response parts is equal to the number of function call parts of the function call turn.
Unable to submit request because thinking_budget and thinking_level are not supported together
Gemini limitations
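Two of the Gemini errors above are easy to detect locally. The sketch below assumes Gemini-style turns with a `parts` list, and a thinking config dict that uses the snake_case keys named in the error message; both helper names are hypothetical.

```python
from typing import Any


def check_function_turns(
    call_turn: dict[str, Any], response_turn: dict[str, Any]
) -> list[str]:
    """Compare functionCall parts with functionResponse parts of the next turn."""
    calls = sum(1 for p in call_turn.get("parts", []) if "functionCall" in p)
    responses = sum(1 for p in response_turn.get("parts", []) if "functionResponse" in p)
    if calls != responses:
        return [f"{calls} function call parts but {responses} function response parts"]
    return []


def check_thinking_config(config: dict[str, Any]) -> list[str]:
    """Flag configs that set both thinking_budget and thinking_level."""
    if "thinking_budget" in config and "thinking_level" in config:
        return ["set only one of thinking_budget or thinking_level"]
    return []
```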
Anthropic
Claude: temperature and top_p cannot be passed together
- Claude Sonnet 4.5 and Claude Haiku 4.5 only support specification of one of temperature or top_p parameters, but cannot handle both.
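- Thinking is incompatible with modifications to temperature, top_p, or top_k, and also with forced tool use.
- With thinking enabled, you cannot prefill the response.
- Changing the thinking budget invalidates cached prompt prefixes that include messages. Cached system prompts and tool definitions, however, keep working when thinking parameters change.
- Reference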
max_tokens must be greater than thinking.budget_tokens
Input should be greater than or equal to 1024
- budget_tokens has a minimum of 1024.
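The constraints listed in this section can be collected into one local check. The sketch below is deliberately conservative: it flags temperature plus top_p for every model, although the notes only name Sonnet 4.5 and Haiku 4.5, and it flags any sampling parameter when thinking is enabled. The parameter layout follows the Anthropic Messages API shape (`thinking.type`, `thinking.budget_tokens`, `tool_choice.type`); the function name is hypothetical.

```python
from typing import Any


def validate_claude_params(params: dict[str, Any]) -> list[str]:
    """Return a list of likely request problems; empty means none found."""
    problems: list[str] = []
    if "temperature" in params and "top_p" in params:
        problems.append("pass only one of temperature or top_p")

    thinking = params.get("thinking") or {}
    if thinking.get("type") != "enabled":
        return problems

    budget = thinking.get("budget_tokens", 0)
    if budget < 1024:
        problems.append("budget_tokens must be at least 1024")
    if params.get("max_tokens", 0) <= budget:
        problems.append("max_tokens must be greater than thinking.budget_tokens")
    for key in ("temperature", "top_p", "top_k"):
        if key in params:
            problems.append(f"{key} should not be set while thinking is enabled")
    # Assumption: tool_choice types "any" and "tool" force tool use.
    if (params.get("tool_choice") or {}).get("type") in ("any", "tool"):
        problems.append("forced tool use is not compatible with thinking")
    return problems
```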
thinking or redacted_thinking blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response
The context was lost.
Invalid signature in thinking block
The signature in the message is invalid.
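Both errors above usually mean the thinking blocks sent back differ from what the model returned. A minimal sketch for catching this in tests, assuming you still hold the original response content; `thinking_blocks_unchanged` is a hypothetical helper.

```python
from typing import Any

THINKING_TYPES = ("thinking", "redacted_thinking")


def thinking_blocks_unchanged(
    original: list[dict[str, Any]], replayed: list[dict[str, Any]]
) -> bool:
    """True if thinking blocks are identical, in order, in both block lists."""
    def pick(blocks: list[dict[str, Any]]) -> list[dict[str, Any]]:
        return [b for b in blocks if b.get("type") in THINKING_TYPES]

    return pick(original) == pick(replayed)
```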
Moonshot
- The protocol is strict; Kimi follows restrictions similar to Anthropic's.
tool_call_id is not found
The tool_calls entry is missing, but a tool-role message with a tool_call_id is present.
thinking is enabled but reasoning_content is missing in assistant tool call message at index
The assistant tool_call message is missing reasoning_content.
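Both Moonshot errors above can be reproduced with a local walk over the message history. The sketch assumes OpenAI-style chat messages (`tool_calls` on assistant messages, `tool_call_id` on tool messages) plus the `reasoning_content` field named in the error; the function name is hypothetical.

```python
from typing import Any


def validate_tool_messages(
    messages: list[dict[str, Any]], thinking_enabled: bool = True
) -> list[str]:
    """Report tool messages without a matching call and tool calls without reasoning."""
    problems: list[str] = []
    known_ids: set[Any] = set()
    for index, message in enumerate(messages):
        role = message.get("role")
        if role == "assistant":
            tool_calls = message.get("tool_calls") or []
            known_ids.update(call.get("id") for call in tool_calls)
            if tool_calls and thinking_enabled and not message.get("reasoning_content"):
                problems.append(f"reasoning_content missing at index {index}")
        elif role == "tool" and message.get("tool_call_id") not in known_ids:
            problems.append(f"tool_call_id not found at index {index}")
    return problems
```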
Bedrock
reasoning: Extra inputs are not permitted
The protocol is very strict; extra fields are not allowed.
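One way to avoid this class of error is to drop unknown keys before forwarding a message. The allow-list below is an assumption for illustration, not Bedrock's actual schema; adjust it to the fields your target endpoint accepts.

```python
from typing import Any

# Assumed allow-list for illustration only; not taken from Bedrock documentation.
DEFAULT_ALLOWED = frozenset({"role", "content", "name", "tool_calls", "tool_call_id"})


def strip_extra_fields(
    message: dict[str, Any], allowed: frozenset[str] = DEFAULT_ALLOWED
) -> dict[str, Any]:
    """Return a copy of the message containing only allowed keys."""
    return {key: value for key, value in message.items() if key in allowed}
```

With this default, a message carrying a `reasoning` key is forwarded without it.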
Access to Bedrock models is not allowed for this account.
Request a quota increase from: https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/create?issueType=service-limit-increase