Feat/prompt tuning v3 data driven #44
Analyzed a real-world PR (off-by-one arithmetic bug in a distributed render chunk plan) to validate and refine the prompts:
- SYSTEM_ROLE: add '算术误差(off-by-one/舍入)' (arithmetic errors: off-by-one/rounding) to the correctness focus area
- OUTPUT_FORMAT: add 'Arithmetic' to risk types
- CHUNK_CROSS_CHUNK_HINT: add item 6, a test-implementation alignment check
- GlobalPromptBuilder: add '实现与测试的对应性' (correspondence between implementation and tests) to the L3 reasoning focus; add rule 5 to REASONING_RULES for test coverage risk detection
- Tests: update assertions for new content

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
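For readers unfamiliar with this class of bug, here is a minimal sketch of an off-by-one in a chunk-count calculation. The analyzed PR's code is not shown on this page, so the class `ChunkPlanSketch`, its method names, and the frame/chunk framing are all hypothetical; the sketch only illustrates the kind of arithmetic error the new focus area is meant to catch.

```java
// Hypothetical illustration only; this is NOT the code from the analyzed PR.
final class ChunkPlanSketch {
    private ChunkPlanSketch() {}

    // Buggy variant: integer division truncates, so a partial final chunk
    // is silently dropped (10 frames in chunks of 4 yields 2, not 3).
    static int chunkCountTruncating(int totalFrames, int chunkSize) {
        return totalFrames / chunkSize;
    }

    // Corrected variant: ceiling division that counts the partial final chunk.
    // Written with a remainder test to avoid overflow in (a + b - 1) / b.
    static int chunkCount(int totalFrames, int chunkSize) {
        if (totalFrames < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalFrames must be >= 0 and chunkSize > 0");
        }
        return totalFrames / chunkSize + (totalFrames % chunkSize == 0 ? 0 : 1);
    }
}
```

A test that only exercises totals evenly divisible by the chunk size would pass against both variants, which is the sort of gap a test-implementation alignment check is meant to surface.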
…t results

Based on a 10-run test against TheAlgorithms/Java#7427 (QR Decomposition).

Fixes applied:
1. SYSTEM_ROLE: add 3 mandatory checks for numerical computing review (numerical stability, dimension bounds, floating-point comparison)
2. OUTPUT_FORMAT: add risk rating rules (must-use table) and a summary template ('本质,影响范围': essence, scope of impact)
3. Task list: add item 5 for the matrix dimension boundary (m < n) check
4. Global OUTPUT_FORMAT: expand the issueType enum with NUMERICAL_ACCURACY, ALGORITHM_CHOICE, PERFORMANCE, SECURITY_VULNERABILITY
5. REASONING_RULES: constrain architectureSuggestions to PR scope only

Tests updated for renumbered items (5->6/7) and new enum values.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
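As a rough illustration of what two of the mandatory checks look for, the sketch below shows a dimension guard and a tolerance-based floating-point comparison. The class `NumericChecksSketch`, its method names, and the tolerance used in the example are assumptions made for illustration; none of them come from the reviewed PR or from the prompts themselves.

```java
// Hypothetical helpers illustrating the review checks; not code from the PR.
final class NumericChecksSketch {
    private NumericChecksSketch() {}

    // Dimension bound: with m rows and n columns, m < n means the n columns
    // are necessarily linearly dependent, so Gram-Schmidt would end up
    // normalising by a zero norm. Reject that case up front.
    static void requireTallOrSquare(double[][] a) {
        if (a == null || a.length == 0 || a[0] == null || a[0].length == 0) {
            throw new IllegalArgumentException("matrix must be non-empty");
        }
        int m = a.length;
        int n = a[0].length;
        for (double[] row : a) {
            if (row == null || row.length != n) {
                throw new IllegalArgumentException("matrix must be rectangular");
            }
        }
        if (m < n) {
            throw new IllegalArgumentException("expected rows >= columns, got " + m + "x" + n);
        }
    }

    // Floating-point comparison: compare within a tolerance scaled by the
    // magnitude of the operands instead of using ==.
    static boolean nearlyEqual(double x, double y, double eps) {
        double scale = Math.max(1.0, Math.max(Math.abs(x), Math.abs(y)));
        return Math.abs(x - y) <= eps * scale;
    }
}
```

For example, `0.1 + 0.2 == 0.3` is false in double arithmetic, while `nearlyEqual(0.1 + 0.2, 0.3, 1e-12)` is true.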
…ITHM_CHOICE, PERFORMANCE etc.

The Java IssueType enum now matches the expanded issueType list in the prompts. New types: INTERFACE_INCONSISTENCY, REPEAT_LOGIC, SECURITY_VULNERABILITY, NUMERICAL_ACCURACY, ALGORITHM_CHOICE, PERFORMANCE. Old types are preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
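A minimal sketch of what such an enum might look like follows. It lists only the six new constants named in this commit message; the older constants kept for backward compatibility are not named on this page and are therefore omitted. The name `IssueTypeSketch` and the `fromLabel` helper are hypothetical and only show one way to map a model-emitted issueType string to a constant without throwing on unknown values.

```java
import java.util.Locale;
import java.util.Optional;

// Sketch only: the real IssueType enum also keeps its older constants,
// which are not listed in this PR text and are left out here.
enum IssueTypeSketch {
    INTERFACE_INCONSISTENCY,
    REPEAT_LOGIC,
    SECURITY_VULNERABILITY,
    NUMERICAL_ACCURACY,
    ALGORITHM_CHOICE,
    PERFORMANCE;

    // Hypothetical helper: tolerant lookup so an unexpected label in model
    // output yields Optional.empty() instead of an exception.
    static Optional<IssueTypeSketch> fromLabel(String label) {
        if (label == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(valueOf(label.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }
}
```

Keeping the prompt's issueType list and the enum constants in sync matters because a label that the prompt allows but the enum lacks would otherwise fail at parse time.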
Caution: Review failed. The pull request is closed.
Run configuration: defaults. Review profile: CHILL. Plan: Pro Plus. Files selected for processing: 6.
Walkthrough
This PR enhances the AI code review system by expanding issue type categories from 6 to 12, adding test-coverage correspondence verification to global analysis, and introducing strict numeric stability and matrix dimension boundary validation rules at the chunk-review level.
Estimated code review effort: 2 (Simple), ~12 minutes
Title
feat: data-driven prompt tuning v3 — 5 targeted fixes from 10-run test results + IssueType enum alignment
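Description
Based on problems in the model output exposed by a 10-run real-world test (PR: TheAlgorithms/Java#7427, QR Decomposition Gram-Schmidt), this PR makes 5 targeted prompt optimizations and aligns the IssueType enum definition to match.
Issues found in testing:
Implementation approach
Testing
./mvnw test: all 296 test cases pass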