FAQ

A collection of API issues encountered across MaaS providers.

Tip
  • Tool call caching actually caches the tool schema, description, and so on

Anthropic on Bedrock needs a thinking block when thinking is enabled

Expected `thinking` or `redacted_thinking`, but found `tool_use`.
When `thinking` is enabled, a final `assistant` message must start with a thinking block
  • GCP Vertex AI is less strict about this
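
The error above can be caught before the request is sent. The following is a minimal sketch, assuming Anthropic-style messages where `content` is a list of typed blocks; the helper name `final_assistant_starts_with_thinking` is hypothetical and not part of any SDK.

```python
THINKING_TYPES = ("thinking", "redacted_thinking")

def final_assistant_starts_with_thinking(messages):
    """Return True if the last assistant message starts with a thinking block.

    Assumes Anthropic-style messages whose content is a list of typed blocks.
    """
    for msg in reversed(messages):
        if msg.get("role") != "assistant":
            continue
        content = msg.get("content")
        if not isinstance(content, list) or not content:
            return False
        return content[0].get("type") in THINKING_TYPES
    return True  # no assistant message, so there is nothing to check
```

Running this check only when thinking is enabled keeps it from rejecting histories that were produced without thinking.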

Thinking encryption

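  • Closed-source models encrypt their thinking content to prevent distillation
  • They may provide a summary of the thinking content instead
  • Encrypting the thinking content yields a signature
  • With interleaved thinking, tool calls also carry thinking information to preserve the reasoning state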

Vertex AI

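  • A thought_signature on non-function-call parts is not strictly required, but including it is recommended
    • It helps keep the model's reasoning quality high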
{
  "content": {
    "role": "model",
    "parts": [
      {
        "functionCall": {
          "name": "check_flight",
          "args": {
            "flight": "AA100"
          }
        },
        "thoughtSignature": "<SIGNATURE_A>"
      }
    ]
  }
}
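
One way to avoid losing the signature is to copy the model turn into the history verbatim instead of rebuilding it field by field. This is a minimal sketch that assumes the content shape shown above; `append_model_turn` and `function_call_signatures` are hypothetical helper names.

```python
import copy

def append_model_turn(history, model_content):
    """Append the model's content to history without dropping any part fields."""
    history.append(copy.deepcopy(model_content))
    return history

def function_call_signatures(content):
    """List (function name, thoughtSignature or None) for each functionCall part."""
    return [
        (part["functionCall"]["name"], part.get("thoughtSignature"))
        for part in content.get("parts", [])
        if "functionCall" in part
    ]
```

Inspecting `function_call_signatures(history[-1])` before the next request shows whether a signature was dropped somewhere in the pipeline.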

Anthropic

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "redacted_thinking",
      "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpP..."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
  • type signature_delta
  • redacted_thinking
    • Claude Sonnet 3.7
  • signature
    • Claude 4+
    • Returns summarized thinking content
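
When handling a response like the one above, the reasoning blocks usually need to be kept intact for the next request while only the text is shown to the user. A minimal sketch, assuming the block shapes from the sample; `split_thinking` is a hypothetical helper.

```python
def split_thinking(content):
    """Split content blocks into (reasoning blocks, visible text).

    The reasoning blocks are returned as-is so they can be sent back unmodified.
    """
    reasoning = [
        block for block in content
        if block.get("type") in ("thinking", "redacted_thinking")
    ]
    text = "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )
    return reasoning, text
```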

Bedrock special test prompt

ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB

role developer vs system

  • Introduced by OpenAI starting with o1-2024-12-17
  • developer carries more weight than system
  • developer
    • Emphasizes rules
  • system
    • Emphasizes the role

AI_APICallError: Error while downloading [URL REDACTED].

OpenAI-related providers do not seem to allow images sourced from Wikimedia.

Output Speed

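| Reference | TPS |
| --- | --- |
| Reading aloud / audiobook | 3-4 |
| Normal silent reading | 5-10 |
| Fast skimming | 15-25 |

| Model | TPS |
| --- | --- |
| Claude Sonnet 4.5 | 40 |
| gemini-3-flash-preview | 80-100 |

| Tier | TPS | Typical use cases |
| --- | --- | --- |
| Instant | 800-1200 | Real-time voice assistants, search suggestions |
| Fast | 150-250 | Simple translation, summarization, simple dialogue |
| Standard | 70-100 | Complex instructions, code generation, subtitles |
| Heavy | 20-50 | In-depth writing, complex logical reasoning |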
  • Prefill speed
    • Typically > 2000 t/s
    • Context caching speeds up prefill
  • TPS = tokens per second
  • Thinking affects speed
    • The thinking budget affects the thinking depth
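
As a rough illustration of how these numbers combine, total latency can be approximated as prefill time plus decode time. This is a simplified sketch that ignores network latency, queueing, and thinking tokens; the function names are illustrative.

```python
def tokens_per_second(output_tokens, seconds):
    """Output speed: generated tokens divided by wall-clock generation time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return output_tokens / seconds

def estimated_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Approximate latency as prefill time plus decode time."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# A 2000-token prompt at 2000 t/s prefill plus 400 output tokens at 40 TPS:
print(estimated_seconds(2000, 400, 2000, 40))  # 11.0
```

With these figures the decode phase dominates: 1 second of prefill against 10 seconds of generation.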

Gemini

Missing thought_signature in function call

Please ensure that the number of function response parts is equal to the number of function call parts of the function call turn.
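
This mismatch can be detected locally by counting parts before sending. A minimal sketch, assuming Gemini-style turns with a `parts` list whose entries use `functionCall` and `functionResponse` keys; the helper names are hypothetical.

```python
def count_parts(content, key):
    """Count the parts in a turn that contain the given key."""
    return sum(1 for part in content.get("parts", []) if key in part)

def responses_match_calls(model_turn, response_turn):
    """True if every function call part has a corresponding function response part."""
    calls = count_parts(model_turn, "functionCall")
    responses = count_parts(response_turn, "functionResponse")
    return calls == responses
```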

Unable to submit request because thinking_budget and thinking_level are not supported together
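
A small guard can reject this combination before the request goes out. This sketch assumes the thinking settings live in a plain dict keyed by the names in the error message; `check_thinking_config` is a hypothetical helper.

```python
def check_thinking_config(thinking_config):
    """Raise ValueError if both thinking_budget and thinking_level are set."""
    has_budget = thinking_config.get("thinking_budget") is not None
    has_level = thinking_config.get("thinking_level") is not None
    if has_budget and has_level:
        raise ValueError("set either thinking_budget or thinking_level, not both")
    return thinking_config
```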

Gemini limits

Anthropic

Claude: temperature and top_p cannot be passed together

max_tokens must be greater than thinking.budget_tokens

Input should be greater than or equal to 1024

  • The minimum budget_tokens is 1024
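
The three parameter constraints above (temperature with top_p, max_tokens against budget_tokens, and the 1024 minimum) can be checked together before a request is sent. A minimal sketch, assuming a request dict with a `thinking` object of the form `{"type": "enabled", "budget_tokens": N}`; `check_anthropic_params` is a hypothetical helper, not an SDK function.

```python
MIN_BUDGET_TOKENS = 1024

def check_anthropic_params(params):
    """Return a list of problems found in the request parameters."""
    problems = []
    if "temperature" in params and "top_p" in params:
        problems.append("temperature and top_p cannot both be set")
    thinking = params.get("thinking") or {}
    if thinking.get("type") == "enabled":
        budget = thinking.get("budget_tokens", 0)
        if budget < MIN_BUDGET_TOKENS:
            problems.append("budget_tokens must be at least 1024")
        if params.get("max_tokens", 0) <= budget:
            problems.append("max_tokens must be greater than budget_tokens")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.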

thinking or redacted_thinking blocks in the latest assistant message cannot be modified These blocks must remain as they were in the original response

The context was lost.

Invalid signature in thinking block

The signature in the message is invalid.

Moonshot

  • The protocol is strict; Kimi follows restrictions similar to Anthropic's

tool_call_id is not found

tool_calls is missing, but there is a message with the tool role and a tool_call_id.
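
Orphaned tool messages of this kind can be found by scanning the history. A minimal sketch, assuming OpenAI-style chat messages where assistant messages carry `tool_calls` entries with an `id`; `orphan_tool_call_ids` is a hypothetical helper.

```python
def orphan_tool_call_ids(messages):
    """Return tool_call_ids of tool messages with no earlier matching tool call."""
    known = set()
    orphans = []
    for msg in messages:
        role = msg.get("role")
        if role == "assistant":
            for call in msg.get("tool_calls") or []:
                known.add(call["id"])
        elif role == "tool" and msg.get("tool_call_id") not in known:
            orphans.append(msg.get("tool_call_id"))
    return orphans
```

This typically happens when history truncation drops the assistant message but keeps the tool result that answered it.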

thinking is enabled but reasoning_content is missing in assistant tool call message at index

The tool call message is missing reasoning_content.
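
The index reported in the error can be reproduced locally. A minimal sketch, assuming OpenAI-style messages where the reasoning is stored under a `reasoning_content` key on the assistant message; whether an empty string counts as missing is not covered here, and `tool_calls_missing_reasoning` is a hypothetical helper.

```python
def tool_calls_missing_reasoning(messages):
    """Return indices of assistant tool call messages without reasoning_content."""
    return [
        index
        for index, msg in enumerate(messages)
        if msg.get("role") == "assistant"
        and msg.get("tool_calls")
        and "reasoning_content" not in msg
    ]
```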

Bedrock

reasoning: Extra inputs are not permitted

The protocol is very strict; extra fields are not allowed.
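
One defensive option is to drop unknown keys before the payload is sent. This is a minimal sketch; the allowed key set below is an assumption for illustration only and should be replaced with the fields the target API actually accepts. `strip_extra_fields` is a hypothetical helper.

```python
# Assumed allow-list for illustration; not taken from the Bedrock documentation.
ALLOWED_MESSAGE_KEYS = {"role", "content"}

def strip_extra_fields(payload, allowed_keys=ALLOWED_MESSAGE_KEYS):
    """Return a copy of payload containing only the allowed top-level keys."""
    return {key: value for key, value in payload.items() if key in allowed_keys}
```

For example, a message carried over from another provider with a `reasoning` field would have that field removed while `role` and `content` are kept.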

Access to Bedrock models is not allowed for this account.

Access to Bedrock models is not allowed for this account.
Request a quota increase from: https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/create?issueType=service-limit-increase