Skip to content

Conversation

@juiwenchen
Copy link
Contributor

@juiwenchen juiwenchen commented Nov 5, 2025

Addressing #43 by adding simplified RST directive parser and feature to remove leading sequences of marked RST text

Documentation

  • Added a dedicated section (docs/source/components/rst_parser.rst) describing the new simplified RST parser, its purpose, and limitations.

RST block extraction and parsing:

  • Added a new parser module using Lark to parse extracted RST blocks, extracting directive type, argument, options, and content.
  • Updated MarkedRstConfig and related configuration files to support new fields: strip_leading_sequences, indented_spaces, and link_options for more stable RST extraction and parsing.

Integration and warning handling:

  • Integrated the RST parser into the extraction workflow in analyse, including logic to handle parsing errors and report warnings.
  • Extended the MarkedRst model to store parsed need data from RST blocks.

@MrChocolateMoose
Copy link

We are currently using this feature (by using this branch instead of 1.1.0). I can confirm that this is working for us.

Was finishing the implementation and getting this PR merged planned for 1.2?

@juiwenchen
Copy link
Contributor Author

juiwenchen commented Nov 27, 2025

We are currently using this feature (by using this branch instead of 1.1.0). I can confirm that this is working for us.

Was finishing the implementation and getting this PR merged planned for 1.2?

Didn't expect you tested it already! Very happy to receive such info, and thanks for it. I am currently working on its warning handling to be more user-friendly. And yes, this feature is planned for 1.2.

@juiwenchen juiwenchen marked this pull request as ready for review November 28, 2025 10:46
Copilot AI review requested due to automatic review settings November 28, 2025 10:46
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 82.92683% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.10%. Comparing base (bb88d52) to head (01029db).

Files with missing lines Patch % Lines
tests/conftest.py 54.54% 20 Missing and 5 partials ⚠️
src/sphinx_codelinks/analyse/sn_rst_parser.py 92.30% 3 Missing and 3 partials ⚠️
src/sphinx_codelinks/analyse/analyse.py 88.88% 2 Missing and 3 partials ⚠️
src/sphinx_codelinks/analyse/projects.py 25.00% 2 Missing and 1 partial ⚠️
...codelinks/sphinx_extension/directives/src_trace.py 72.72% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #44      +/-   ##
==========================================
- Coverage   89.38%   89.10%   -0.28%     
==========================================
  Files          29       31       +2     
  Lines        2392     2625     +233     
  Branches      289      327      +38     
==========================================
+ Hits         2138     2339     +201     
- Misses        158      179      +21     
- Partials       96      107      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot finished reviewing on behalf of juiwenchen November 28, 2025 10:50
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for generating Sphinx-Needs items from marked RST directives embedded in source code comments. It introduces a simplified RST directive parser using the Lark parsing library to extract and parse RST blocks, making it possible to define needs using full RST directive syntax within code comments.

Key Changes

  • RST Parser Implementation: Added a new Lark-based parser (sn_rst_parser.py) to parse RST directive syntax from extracted comment blocks, extracting directive types, titles, options, and content
  • Configuration Extensions: Extended MarkedRstConfig with three new fields (strip_leading_sequences, indented_spaces, and link_options) to handle various comment styles and RST parsing requirements
  • Integration and Warning System: Integrated the parser into the analysis workflow with comprehensive error handling and warnings for invalid RST directives

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 12 comments.

Show a summary per file
File Description
src/sphinx_codelinks/analyse/sn_rst_parser.py New simplified RST parser module using Lark with grammar definitions, preprocessing, and transformation logic
src/sphinx_codelinks/analyse/analyse.py Integration of RST parser into extraction workflow with warning handling and link option processing
src/sphinx_codelinks/analyse/projects.py Extended warning collection to include RST parsing warnings from all projects
src/sphinx_codelinks/analyse/models.py Extended MarkedRst model to include parsed need data
src/sphinx_codelinks/analyse/utils.py Updated RST extraction logic with improved row offset calculation
src/sphinx_codelinks/config.py Added new configuration fields for RST parsing behavior
src/sphinx_codelinks/sphinx_extension/directives/src_trace.py Updated directive to handle both one-line needs and parsed RST blocks
src/sphinx_codelinks/cmd.py Added call to update_warnings in analyse command
tests/test_rst_parser.py Comprehensive test suite for RST parser with positive and negative test cases
tests/test_analyse.py Added integration tests for RST extraction and parsing
tests/fixture_files/analyse_rst.yml Test fixtures for various RST marker scenarios
tests/conftest.py Added test infrastructure for fixture file parametrization
tests/data/marked_rst/dummy_1.cpp Fixed spacing in inline RST marker
tests/data/dcdc/charge/demo_2.cpp Added RST block example with proper markers
docs/source/components/rst_parser.rst New documentation describing the simplified RST parser and its limitations
docs/source/components/configuration.rst Documented new configuration options for RST parsing
docs/source/components/analyse.rst Updated documentation with RST block examples
docs/source/components/directive.rst Updated to reflect support for both one-line and RST block needs
docs/source/development/change_log.rst Added release notes for new RST parsing feature
pyproject.toml Added Lark dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

=========

.. attention:: ``src-trace`` directive currently only supports :ref:`one-line need definition <oneline>`.
.. attention:: ``src-trace`` directive do NOT supports :ref:`Sphinx-Needs ID Refs <analyse_need_id_refs>`.
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar error: "do NOT supports" should be "does NOT support" (singular verb form).

Suggested change
.. attention:: ``src-trace`` directive do NOT supports :ref:`Sphinx-Needs ID Refs <analyse_need_id_refs>`.
.. attention:: ``src-trace`` directive does NOT support :ref:`Sphinx-Needs ID Refs <analyse_need_id_refs>`.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +2
"""Test script for RST directive Lark parser."""

Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module docstring describes this as a "Test script" but this is actually a production module that implements RST parsing functionality. The docstring should describe the module's purpose in the codebase.

Suggested change
"""Test script for RST directive Lark parser."""
"""Production module for parsing reStructuredText (RST) directives using the Lark parser.
This module provides functionality to parse RST directive blocks and extract their components,
such as type, title, options, and content, for use in Sphinx code linking and analysis tools.
"""

Copilot uses AI. Check for mistakes.
"column": end_column,
},
}
need_directive: None | NeedDirectiveType | UnexpectedInput = None
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable need_directive is initialized to None on line 340 and then immediately reassigned on line 341. The initial assignment to None is redundant and should be removed.

Suggested change
need_directive: None | NeedDirectiveType | UnexpectedInput = None
need_directive: None | NeedDirectiveType | UnexpectedInput

Copilot uses AI. Check for mistakes.
and isinstance(val, str)
):
# convert link options values to list
resolved[key] = val.split(",")
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When splitting link options by comma on line 361, the resulting list items are not stripped of whitespace. This can be seen in the test snapshot where " IMPL_002" includes a leading space. The split values should be stripped to avoid inconsistencies. Consider using [item.strip() for item in val.split(",")] instead of just val.split(",").

Suggested change
resolved[key] = val.split(",")
resolved[key] = [item.strip() for item in val.split(",")]

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +71
link_options_rst_marker:
dummy_1.c: |
/*
* @rst
* .. impl:: implement main function
* :id: REQ_001
* :links: IMPL_001, IMPL_002
*
* This is content for the main function implementation.
* @endrst
*/
int main() {
return 0;
}
link_options_rst_marker:
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two test cases with the same key link_options_rst_marker (lines 56 and 71). In YAML, duplicate keys result in the second one overwriting the first, so only one test case will actually be used. One of these keys should be renamed (e.g., link_options_rst_marker_2) to ensure both test cases are executed.

Copilot uses AI. Check for mistakes.
Comment on lines +160 to +161
- Stripe leading spaces before the directive marker to get relative indentations.
- Stripe trailing spaces at the end
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docstring, "Stripe" should be "Strip" in both occurrences (lines 160-161). "Stripe" means to draw lines, while "Strip" means to remove.

Suggested change
- Stripe leading spaces before the directive marker to get relative indentations.
- Stripe trailing spaces at the end
- Strip leading spaces before the directive marker to get relative indentations.
- Strip trailing spaces at the end

Copilot uses AI. Check for mistakes.
def directive(self, name, *optionals):
# NAME,, optional title/options/content
need = {"type": name}
# flaten optionals
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "flaten optionals" has a typo. It should be "flatten optionals".

Suggested change
# flaten optionals
# flatten optionals

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +23
- ``leading_sequences``: List of leading character sequences to strip from each line.

This option allows users to specify a list of leading character sequences (e.g., ``*``, ``-``) that should be stripped
from each line of the marked RST block before parsing.
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration field name in the code is strip_leading_sequences (line 76) but the documentation refers to it as leading_sequences (line 20). These should be consistent. The code uses the correct descriptive name strip_leading_sequences, so the documentation should be updated to match.

Suggested change
- ``leading_sequences``: List of leading character sequences to strip from each line.
This option allows users to specify a list of leading character sequences (e.g., ``*``, ``-``) that should be stripped
from each line of the marked RST block before parsing.
- ``strip_leading_sequences``: List of leading character sequences to strip from each line.
This option allows users to specify a list of leading character sequences (e.g., ``*``, ``-``) that should be stripped
from each line of the marked RST block before parsing, by setting the ``strip_leading_sequences`` option.

Copilot uses AI. Check for mistakes.
return line[1].rstrip()

def content_block(self, *lines):
# items is list of lines
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "items is list of lines" should be "items are a list of lines" for proper grammar.

Suggested change
# items is list of lines
# items are a list of lines

Copilot uses AI. Check for mistakes.
It is designed to only parse the RST text extracted by :ref:`RST markers <analyse_rst>`, focusing on specific directive types and their associated options and content.
By doing so, the parser avoids the complexity of a full reST parser while still capturing the essential structure needed for Sphinx-Needs integration from the source code.

The parser does't have the Sphinx-Needs directive validation logic. It only checks the syntax of the RST directives and extracts the directive type, argument, options, and content.
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar error: "The parser does't have" should be "The parser doesn't have".

Suggested change
The parser does't have the Sphinx-Needs directive validation logic. It only checks the syntax of the RST directives and extracts the directive type, argument, options, and content.
The parser doesn't have the Sphinx-Needs directive validation logic. It only checks the syntax of the RST directives and extracts the directive type, argument, options, and content.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants