fix(dashscope): enhance usage parsing robustness to prevent VSCode cr… #2425

Open

magucas wants to merge 1 commit into farion1231:main from magucas:main

Conversation

@magucas magucas commented Apr 28, 2026

PR: Fix DashScope Usage Parsing to Prevent VSCode Extension Crashes

Summary

Enhanced build_anthropic_usage_from_responses() and streaming SSE handlers to handle null, missing, empty, and partial usage fields gracefully. This prevents VSCode Extension crashes with "Cannot read properties of null (reading 'output_tokens')" when connecting to DashScope (Alibaba Cloud Bailian) models.

Background: the DashScope (Alibaba Cloud Bailian) Responses API may return malformed usage objects in streaming responses:

  • usage: null - entirely absent
  • usage: {} - an empty object
  • usage: {"input_tokens": 100} - only partial fields present
  • OpenAI field-name variants (prompt_tokens/completion_tokens)

These cases crash the VSCode Extension when it accesses usage.output_tokens.

Solution: apply a defensive parsing strategy that returns a valid Anthropic-compatible usage structure in every malformed case.

Related Issue

Fixes #2422

Technical Details

Changes

1. Non-streaming Responses API (transform_responses.rs)

// Before: accessed fields directly; could crash
let input = u.get("input_tokens").and_then(|v| v.as_u64()).unwrap_or(0);
let output = u.get("output_tokens").and_then(|v| v.as_u64()).unwrap_or(0);

// After: defensive parsing + OpenAI field-name fallback + empty-object detection
if u.as_object().map(|obj| obj.is_empty()).unwrap_or(false) {
    log::warn!("[Responses] Empty usage object received, using defaults");
    return json!({"input_tokens": 0, "output_tokens": 0});
}

let input = u.get("input_tokens")
    .or_else(|| u.get("prompt_tokens"))  // OpenAI fallback
    .and_then(|v| v.as_u64())
    .unwrap_or(0);

Key improvements

  • ✅ Added null check: !v.is_null() && v.is_object()
  • ✅ Added empty-object detection: obj.is_empty() → use defaults
  • ✅ Added OpenAI field-name fallback: prompt_tokens/completion_tokens
  • ✅ Added warning logs for every malformed case
  • ✅ Preserved cache token fields even when input/output are missing

2. Streaming Responses API (streaming_responses.rs)

// Before: accessed a usage that may be null
let usage_json = chunk.usage.as_ref().map(|u| {
    json!({
        "input_tokens": u.input_tokens,
        "output_tokens": u.output_tokens
    })
});

// After: ensure usage_json is always Some(Value)
let usage_json = chunk.usage.as_ref().map(|u| {
    // ... build usage ...
}).or_else(|| {
    log::warn!("Missing usage, using defaults");
    Some(json!({"input_tokens": 0, "output_tokens": 0}))
});

3. Streaming SSE Handler (streaming.rs) - Merged with upstream

Combines the upstream caching strategy (defer sending message_delta until [DONE]) with defensive usage handling:

// Strategy: cache message_delta → wait for complete usage → ensure usage is non-null
let safe_usage_json = usage_json.or_else(|| {
    log::warn!("[Claude/OpenRouter] finish_reason chunk has no usage, using latest_usage or defaults");
    latest_usage.clone()
}).unwrap_or_else(|| {
    log::warn!("[Claude/OpenRouter] No usage available, using defaults to prevent null usage");
    json!({"input_tokens": 0, "output_tokens": 0})
});
pending_message_delta = Some((stop_reason, Some(safe_usage_json)));

Modified Files

  • src-tauri/src/proxy/providers/transform_responses.rs

    • Enhanced build_anthropic_usage_from_responses() with defensive parsing
    • Added field name resolution priority (Anthropic → OpenAI → default)
    • Added comprehensive logging for malformed usage scenarios
  • src-tauri/src/proxy/providers/streaming_responses.rs

    • Fixed SSE event handlers with null-safe usage access
    • Ensured usage_json is always Some(Value), never None
  • src-tauri/src/proxy/providers/streaming.rs (merged with upstream)

    • Combined upstream caching strategy with defensive usage handling
    • Ensures safe usage fallback chain: usage_json → latest_usage → defaults
  • CHANGELOG.md

    • Added bug fix documentation under "Unreleased"

Test Coverage

Verified scenarios

  • usage: null → returns {"input_tokens": 0, "output_tokens": 0}
  • usage: {} → returns {"input_tokens": 0, "output_tokens": 0}
  • usage: {"input_tokens": 100} → returns {"input_tokens": 100, "output_tokens": 0}
  • ✅ OpenAI field names → automatically converted to the Anthropic format
  • ✅ Cache tokens preserved → kept even when input/output are missing
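The verified scenarios above can be captured in a minimal, self-contained sketch (`parse_usage` is a hypothetical stand-in for build_anthropic_usage_from_responses(); a HashMap models the serde_json object used in the real code):

```rust
use std::collections::HashMap;

// Hypothetical stand-in for build_anthropic_usage_from_responses():
// None models `usage: null`; an empty map models `usage: {}`.
fn parse_usage(usage: Option<&HashMap<&str, u64>>) -> (u64, u64) {
    let Some(u) = usage else { return (0, 0) };
    let input = u
        .get("input_tokens")
        .or_else(|| u.get("prompt_tokens")) // OpenAI fallback
        .copied()
        .unwrap_or(0);
    let output = u
        .get("output_tokens")
        .or_else(|| u.get("completion_tokens")) // OpenAI fallback
        .copied()
        .unwrap_or(0);
    (input, output)
}

fn main() {
    // usage: null → defaults
    assert_eq!(parse_usage(None), (0, 0));
    // usage: {} → defaults
    assert_eq!(parse_usage(Some(&HashMap::new())), (0, 0));
    // Partial fields → the missing side defaults to 0
    let partial = HashMap::from([("input_tokens", 100)]);
    assert_eq!(parse_usage(Some(&partial)), (100, 0));
    // OpenAI field names → mapped to Anthropic semantics
    let openai = HashMap::from([("prompt_tokens", 10), ("completion_tokens", 5)]);
    assert_eq!(parse_usage(Some(&openai)), (10, 5));
    println!("ok");
}
```

Each branch returns a well-formed tuple, so a caller reading output_tokens can never hit a null.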

Rebase verification

  • ✅ Successfully rebased onto origin/main (21e2d68)
  • ✅ Merged the upstream message_delta caching strategy
  • ✅ Preserved the defensive usage handling logic
  • ✅ All existing streaming tests pass

Screenshots

Before: VSCode Extension crash: Cannot read properties of null (reading 'output_tokens'); white screen / connection refused in the proxy.
After: Stable: always returns a valid usage structure; the proxy handles malformed usage gracefully.

Impact Analysis

Benefits

  1. Improved stability: prevents crashes caused by DashScope and other compatible providers
  2. Better compatibility: supports both OpenAI and Anthropic field-name variants
  3. Easier debugging: complete logging makes malformed usage cases straightforward to diagnose
  4. Backward compatible: does not change behavior for providers that already return well-formed usage

Potential Risks

  • ⚠️ More log output: malformed usage produces warning logs (controllable via log level)
  • ⚠️ Default-value impact: missing usage falls back to 0, which may reduce billing-statistics accuracy
    • Mitigation: warning logs clearly flag these cases so the data can be reconciled later

Compatibility

  • ✅ Fully backward compatible; well-behaved providers are unaffected
  • ✅ Supports the Anthropic Responses API standard
  • ✅ Supports OpenAI field-name variants (compatibility layer)
  • ✅ Works with all OpenAI-compatible providers (DashScope, DeepSeek, Moonshot, etc.)

Implementation Notes

Field Name Resolution Priority

input_tokens:
  1. Anthropic: input_tokens
  2. OpenAI: prompt_tokens
  3. Default: 0

output_tokens:
  1. Anthropic: output_tokens
  2. OpenAI: completion_tokens
  3. Default: 0

cache_read_input_tokens:
  1. Direct field: cache_read_input_tokens
  2. Nested: input_tokens_details.cached_tokens
  3. Nested: prompt_tokens_details.cached_tokens
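The cache-token priority chain above can be sketched with plain Options standing in for the three JSON lookup paths (the struct and function names here are illustrative, not the actual code):

```rust
// Illustrative model of the cache_read_input_tokens resolution order;
// each Option stands in for one JSON lookup path.
struct CacheFields {
    direct: Option<u64>,                // cache_read_input_tokens
    input_details_cached: Option<u64>,  // input_tokens_details.cached_tokens
    prompt_details_cached: Option<u64>, // prompt_tokens_details.cached_tokens
}

fn resolve_cache_read(f: &CacheFields) -> Option<u64> {
    f.direct
        .or(f.input_details_cached)
        .or(f.prompt_details_cached)
}

fn main() {
    // The direct field wins over nested variants.
    let f = CacheFields { direct: Some(7), input_details_cached: Some(3), prompt_details_cached: None };
    assert_eq!(resolve_cache_read(&f), Some(7));
    // A nested OpenAI-style location is used when the direct field is absent.
    let f = CacheFields { direct: None, input_details_cached: None, prompt_details_cached: Some(3) };
    assert_eq!(resolve_cache_read(&f), Some(3));
    println!("ok");
}
```

Note the result stays Option: cache fields are preserved when present but never invented, matching the "preserve cache tokens even if input/output are missing" improvement.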

Error Handling Strategy

// Three-layer defense chain:
1. Primary: usage_json (from current chunk)
2. Fallback: latest_usage (from previous chunks)
3. Default: {"input_tokens": 0, "output_tokens": 0}

// Each layer is logged:
- Layer 1 success: debug log
- Layer 2 fallback: warn log "using latest_usage"
- Layer 3 default: warn log "using defaults"
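The three-layer chain maps directly onto Option combinators; a minimal sketch (String stands in for the serde_json::Value used in streaming.rs, and the function name is hypothetical):

```rust
// Layer 1: usage from the current chunk; Layer 2: latest_usage cached from
// earlier chunks; Layer 3: hard default so usage is never null.
fn safe_usage(current: Option<String>, latest: Option<String>) -> String {
    current
        .or(latest)
        .unwrap_or_else(|| r#"{"input_tokens": 0, "output_tokens": 0}"#.to_string())
}

fn main() {
    // Layer 1: current chunk has usage.
    assert_eq!(safe_usage(Some("a".into()), Some("b".into())), "a");
    // Layer 2: fall back to the cached latest_usage.
    assert_eq!(safe_usage(None, Some("b".into())), "b");
    // Layer 3: nothing available, emit the default structure.
    assert_eq!(
        safe_usage(None, None),
        r#"{"input_tokens": 0, "output_tokens": 0}"#
    );
    println!("ok");
}
```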

Logging Strategy

// Empty object {} → WARN
log::warn!("[Responses] Empty usage object received, using defaults");

// OpenAI field fallback → DEBUG
log::debug!("[Responses] Using OpenAI field name fallback: prompt_tokens");

// Usage entirely missing → WARN
log::warn!("[Claude/OpenRouter] finish_reason chunk has no usage, using latest_usage or defaults");

Checklist

  • Verified fix resolves VSCode Extension crash issue
  • Tested with DashScope (Alibaba Cloud Bailian) models
  • Rebased to latest origin/main successfully
  • Merged upstream improvements (message_delta caching strategy)
  • Preserved defensive usage handling logic
  • Updated CHANGELOG.md with bug fix documentation
  • Added comprehensive logging for debugging
  • pnpm typecheck passes
  • pnpm format:check passes
  • cargo clippy passes (if Rust code changed)
  • Updated i18n files if user-facing text changed

Note: Checklist items marked with [ ] should be run before final merge.

Additional Context

Related Commits

  • Upstream commit 21e2d68: fix(proxy): preserve scoped reasoning_content for tool calls
  • Upstream commit 6441bc5: fix(proxy): dedupe streaming message_delta
  • This commit 1d679cf: fix(dashscope): enhance usage parsing robustness

Testing Environment

  • Provider: DashScope (Alibaba Cloud Bailian)
  • Model: qwen-max, qwen-plus
  • Client: VSCode Extension with Claude support
  • Error observed: Cannot read properties of null (reading 'output_tokens')

Future Improvements

  1. Billing backfill: later chunks may complete the usage data; the completeness of the resulting statistics needs verification
  2. Provider-specific handling: tune log levels for the usage patterns of specific providers
  3. Expanded unit tests: add test cases for malformed usage objects

Commit Details:

  • Commit ID: 1d679cf
  • Author: MaGang magangucas@126.com
  • Files changed: 4
  • Lines changed: +187, -12
  • Rebase status: Successfully merged with origin/main

Commit message: fix(dashscope): enhance usage parsing robustness to prevent VSCode crashes

Enhanced build_anthropic_usage_from_responses() to handle null, missing, empty,
and partial usage fields gracefully. This prevents VSCode Extension crashes with
"Cannot read properties of null (reading 'output_tokens')" when connecting to
DashScope (Alibaba Cloud Bailian) models.

Changes:
- Added defensive null checks and empty object detection
- Implemented OpenAI field name fallbacks (prompt_tokens/completion_tokens)
- Added comprehensive logging for malformed usage scenarios
- Fixed streaming SSE event handlers with null-safe usage access
- Preserved cache token fields even when input/output tokens are missing

This ensures the proxy never crashes on malformed Responses API usage objects,
returning valid Anthropic-compatible usage structures (input_tokens/output_tokens)
in all cases.
@magucas magucas (Author) commented Apr 28, 2026

fix bug for ISSUE: #2422

@farion1231 farion1231 (Owner) commented:
@codex review

@chatgpt-codex-connector commented:
Codex Review: Didn't find any major issues. Keep them coming!

