Claude Opus 4 In-Depth Review: Code Generation Comprehensively Surpasses GPT-4o

B.News

2026-03-29 · 31135 reads

Anthropic's latest flagship model posts breakthrough results on programming benchmarks.

Anthropic's Claude Opus 4 scored 72.5% on the SWE-bench Verified benchmark, well ahead of GPT-4o's 61.2%, a gap of 11.3 percentage points. On the HumanEval coding test, it reached 96.3% accuracy.