feat(agent): add structured category and severity fields to review output #16

@Fanduzi

Description

Problem Statement

The machine-readable output of ocr review currently exposes each finding's text, location, and suggestion, but does not appear to expose structured category and severity fields for each finding.

This makes CI action and script integrations harder to implement cleanly:

  • Findings cannot be reliably sorted, grouped, or filtered by type or importance.
  • CI scripts cannot render consistent category or severity labels without re-interpreting natural-language comment text.
  • Build gating based on severity (e.g., fail on critical issues) requires fragile text parsing.
  • This is especially relevant for GitHub Actions and GitLab CI integrations that parse ocr output and publish structured PR/MR comments.

Similar structured metadata is common in code review tools and makes automated PR/MR comments significantly easier to scan and act on.

Proposed Solution

Add optional category and severity fields to each finding in JSON and agent output.

Suggested severity values:

| Value | Meaning |
| --- | --- |
| critical | Security vulnerability, data loss risk, system crash |
| high | Significant bug, functional failure, performance regression |
| medium | Moderate issue, edge-case problem, maintainability concern |
| low | Style, readability, minor best-practice suggestion |
| info | Informational, no action required |
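
To illustrate how a consumer might compare these levels, here is a minimal Python sketch. The `Severity` enum and the `parse_severity` helper are hypothetical names invented for this example, not part of ocr; the numeric ranks are an assumption chosen only so that more severe levels compare greater.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Suggested severity levels; a higher number means more severe (assumed ranking)."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def parse_severity(value):
    """Map a string such as "high" to a Severity, or None if missing or unrecognized."""
    if not isinstance(value, str):
        return None
    # Enum member names are upper-case, so normalize before the lookup.
    return Severity.__members__.get(value.strip().upper())
```

Because the field is proposed as optional, the helper returns `None` rather than raising, so a caller can decide how to treat findings that carry no severity.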

Suggested category values:

| Value | Meaning |
| --- | --- |
| bug | Correctness issue, logic error |
| security | Security vulnerability, unsafe pattern |
| performance | Performance regression, resource concern |
| maintainability | Readability, complexity, refactoring opportunity |
| test | Missing or inadequate test coverage |
| style | Formatting, naming, convention |
| documentation | Missing or incorrect documentation |
| other | Does not fit the above categories |
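
One possible way for a consumer to handle category values is sketched below in Python. `KNOWN_CATEGORIES` and `normalize_category` are hypothetical names for this example; folding unrecognized values into `other` is an assumption about how a lenient consumer might behave, not a decision made in this issue.

```python
# The category values suggested in the table above.
KNOWN_CATEGORIES = frozenset({
    "bug", "security", "performance", "maintainability",
    "test", "style", "documentation", "other",
})


def normalize_category(value):
    """Return a known category, "other" for unrecognized strings, or None if absent."""
    if not isinstance(value, str):
        return None  # the field is optional, so absence is not an error
    cleaned = value.strip().lower()
    return cleaned if cleaned in KNOWN_CATEGORIES else "other"
```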

Backward compatibility:

  • New fields are optional at first; existing fields remain unchanged.
  • Downstream consumers that do not use category or severity are unaffected.

JSON output example:

{
  "path": "internal/auth/handler.go",
  "content": "Missing input validation on user-supplied email field.",
  "existing_code": "email := r.FormValue(\"email\")",
  "suggestion_code": "email := r.FormValue(\"email\")\nif !isValidEmail(email) { ... }",
  "start_line": 42,
  "end_line": 42,
  "category": "security",
  "severity": "high"
}
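
As a rough illustration of the backward-compatibility point, the following Python sketch parses a finding whether or not the new fields are present. The `Finding` dataclass and `parse_finding` are hypothetical consumer-side names; treating `path`, `content`, `start_line`, and `end_line` as always present is an assumption based only on the example above.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Finding:
    # Assumed to be always present, based on the JSON example.
    path: str
    content: str
    start_line: int
    end_line: int
    existing_code: Optional[str] = None
    suggestion_code: Optional[str] = None
    # Proposed optional fields: None when the producer does not emit them.
    category: Optional[str] = None
    severity: Optional[str] = None


def parse_finding(raw: str) -> Finding:
    """Build a Finding from a JSON object, ignoring keys this consumer does not know."""
    data = json.loads(raw)
    known_names = {f.name for f in fields(Finding)}
    return Finding(**{k: v for k, v in data.items() if k in known_names})
```

Ignoring unknown keys means a consumer written this way would keep working if further fields (for example a later `confidence` field) were added.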

Comment rendering example:

A CI action or script could render these fields as badges in PR/MR comments:

![category](https://img.shields.io/badge/category-security-red)
![severity](https://img.shields.io/badge/severity-high-orange)

**Issue:** Missing input validation on user-supplied email field.
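
A CI script could generate that markdown along the following lines. This is only a sketch: `badge`, `render_comment`, and the color mappings are hypothetical and chosen to reproduce the example above. It relies on the shields.io static badge URL form `label-message-color`, in which a literal dash or underscore in the text is doubled.

```python
from urllib.parse import quote

# Illustrative color choices (assumptions); unlisted values fall back to a default.
SEVERITY_COLORS = {
    "critical": "red", "high": "orange", "medium": "yellow",
    "low": "blue", "info": "lightgrey",
}
CATEGORY_COLORS = {"security": "red"}


def badge(label, value, color):
    """Return markdown for a shields.io static badge."""
    def esc(text):
        # shields.io treats "-" and "_" specially, so double them, then URL-encode.
        return quote(text.replace("-", "--").replace("_", "__"), safe="")
    return f"![{label}](https://img.shields.io/badge/{esc(label)}-{esc(value)}-{color})"


def render_comment(finding):
    """Render one finding (a dict) as a comment body, with badges when fields exist."""
    lines = []
    category = finding.get("category")
    severity = finding.get("severity")
    if category:
        lines.append(badge("category", category, CATEGORY_COLORS.get(category, "blue")))
    if severity:
        lines.append(badge("severity", severity, SEVERITY_COLORS.get(severity, "lightgrey")))
    if lines:
        lines.append("")  # blank line between the badges and the text
    lines.append(f"**Issue:** {finding['content']}")
    return "\n".join(lines)
```

A finding without the new fields renders as the plain `**Issue:**` line, so the same renderer works on older output.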

CI integrations could also:

  • Sort findings by severity (critical first, info last)
  • Group findings by category in a summary table
  • Fail or warn the build when critical or high findings are present
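
The three operations above could look roughly like this in a Python CI script. All function names are hypothetical, findings are assumed to be plain dicts loaded from the JSON output, and the `uncategorized` bucket and the default `fail_on` levels are illustrative assumptions.

```python
from collections import defaultdict

# Most severe first, matching "critical first, info last".
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def sort_by_severity(findings):
    """Sort findings most-severe first; missing or unknown severities go last."""
    def rank(finding):
        sev = finding.get("severity")
        return SEVERITY_ORDER.index(sev) if sev in SEVERITY_ORDER else len(SEVERITY_ORDER)
    return sorted(findings, key=rank)  # sorted() is stable, so ties keep their order


def group_by_category(findings):
    """Group findings by category; findings without one share a separate bucket."""
    groups = defaultdict(list)
    for finding in findings:
        groups[finding.get("category") or "uncategorized"].append(finding)
    return dict(groups)


def gate_exit_code(findings, fail_on=("critical", "high")):
    """Return 1 if any finding has a severity listed in fail_on, otherwise 0."""
    return 1 if any(f.get("severity") in fail_on for f in findings) else 0
```

A script would typically finish with `sys.exit(gate_exit_code(findings))`, or print a warning instead of exiting when it should only warn.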

Alternatives Considered

  • Keep output free-text only. This avoids schema changes but forces every CI integration to re-parse comment text to extract classification, leading to inconsistent rendering across tools.
  • Add only severity without category. This is simpler but loses the ability to filter by issue type (e.g., surface only security findings).
  • Add confidence alongside category/severity. This could be useful but adds scope; it can be a separate follow-up.
  • Let each CI integration classify findings independently. This pushes complexity to every consumer and risks inconsistent classification across the ecosystem.

Affected Area

  • Output / Formatting
  • Review Agent / LLM interaction
  • CI / Integration
  • Documentation

Additional Context

The maintainers suggested opening a separate issue for structured category and severity fields, and PR #11 shows the current CI integration approach.

Design questions for discussion:

  1. Should values be strict enums, or should the schema allow extensible strings?
  2. Should critical and info be included from the start, or should OCR begin with high | medium | low and expand later?
  3. Should CI integrations be responsible for rendering, filtering, and failing builds based on these fields, or should the CLI also expose filtering flags (e.g., --severity critical,high)?
  4. Should a separate confidence field be considered in a future iteration?
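
To make questions 1 and 3 more concrete, here is a hedged Python sketch of how a comma-separated filter such as `critical,high` might be interpreted. `parse_severity_filter` and `filter_findings` are hypothetical names, and the `strict` switch only illustrates the enum-versus-extensible-string trade-off; nothing here reflects an existing ocr flag.

```python
ALLOWED_SEVERITIES = ("critical", "high", "medium", "low", "info")


def parse_severity_filter(spec, strict=True):
    """Parse "critical,high" into a set of names.

    With strict=True, names outside ALLOWED_SEVERITIES raise ValueError (enum-style).
    With strict=False, any non-empty name is accepted (extensible-string style).
    """
    wanted = set()
    for token in spec.split(","):
        name = token.strip().lower()
        if not name:
            continue  # tolerate stray commas and surrounding whitespace
        if strict and name not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {name!r}")
        wanted.add(name)
    return wanted


def filter_findings(findings, spec, strict=True):
    """Keep only the findings (dicts) whose severity is named in spec."""
    wanted = parse_severity_filter(spec, strict=strict)
    return [f for f in findings if f.get("severity") in wanted]
```

Strict parsing catches typos early, while the lenient mode would let new severity values pass through without a consumer update.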

Acceptance criteria:

  • JSON and agent output include category and severity per finding when available
  • Documentation explains allowed values and their intended semantics
  • Existing consumers that do not use these fields remain unaffected
  • CI integrations can render labels or badges, filter findings, and optionally gate on severity
