Taiwrash commented Feb 3, 2026

micro-fix: Optimized CSV reading by counting rows in a single pass and removing redundant file read.

Description

The current implementation of the csv_read tool in aden_tools is inefficient for large datasets because it reads the file twice, and it contains a subtle row-counting bug for complex CSV files. This PR updates it to count rows during the single read pass.
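The description does not say what triggers the counting bug. One plausible cause, assumed here purely for illustration, is counting physical lines in the file instead of parsed CSV records; the two disagree as soon as a quoted field contains a newline. The sample data and both helper names below are hypothetical and are not taken from aden_tools.

```python
import csv
import io

# Hypothetical sample: the first data record has a quoted field spanning two lines.
SAMPLE = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'


def count_physical_lines(text: str) -> int:
    """Count lines in the raw text, minus one for the header (assumed old approach)."""
    return sum(1 for _ in io.StringIO(text)) - 1


def count_csv_records(text: str) -> int:
    """Count records as parsed by csv.DictReader, which respects quoting."""
    return sum(1 for _ in csv.DictReader(io.StringIO(text)))


print(count_physical_lines(SAMPLE))  # 3, one too many
print(count_csv_records(SAMPLE))     # 2, the real number of data rows
```

Counting inside the csv.DictReader loop, as this PR does, gives the record count directly.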

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)

Related Issues

Fixes #1737

Changes Made

  • Initialize total_rows = 0 and increment it while iterating through the csv.DictReader.
  • Handle offset and limit logic within the same loop.
  • Remove the second with open block entirely.
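The three changes above can be sketched as follows. This is a minimal illustration, not the actual csv_read code: the function name, the handle parameter, and the shape of the returned dictionary are assumptions made for the example.

```python
import csv
import io
from typing import Any, Optional, TextIO


def read_csv_single_pass(
    handle: TextIO, offset: int = 0, limit: Optional[int] = None
) -> dict[str, Any]:
    """Collect the requested page of rows and the total row count in one pass."""
    reader = csv.DictReader(handle)
    rows: list[dict[str, Any]] = []
    total_rows = 0

    for row in reader:
        index = total_rows
        total_rows += 1  # count every record, whether or not it is returned

        if index < offset:
            continue  # still before the requested window
        if limit is not None and len(rows) >= limit:
            continue  # window is full; keep iterating only to finish the count

        rows.append(row)

    return {
        "columns": reader.fieldnames or [],
        "rows": rows,
        "total_rows": total_rows,
    }


# Hypothetical input with a multi-line quoted field in the first record.
sample = io.StringIO('name,note\nann,"line one\nline two"\nbob,ok\ncy,fine\n')
result = read_csv_single_pass(sample, offset=1, limit=1)
print(result["total_rows"])  # 3
print(result["rows"])        # [{'name': 'bob', 'note': 'ok'}]
```

The loop cannot stop early once the limit is reached, because total_rows has to cover the whole file. The saving comes from not opening and parsing the file a second time. When reading from a real file, the csv module documentation recommends opening it with newline="" so that embedded newlines in quoted fields are handled correctly.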

Testing

Describe the tests you ran to verify your changes:

  • Unit tests pass (cd core && pytest tests/)
  • Lint passes (cd core && ruff check .)
  • Manual testing performed

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Screenshots (if applicable)

Screenshot 2026-01-28 at 12 18 21 Screenshot 2026-01-28 at 12 19 46

Optimized CSV reading by counting rows in a single pass and removing the need for a second read for total row count.

Som-0619 commented Feb 4, 2026

Can I work on this to complete the remaining tasks? I need to be assigned.

Taiwrash commented Feb 4, 2026

Can I work on this to complete the remaining tasks? I need to be assigned.

This PR already fixes it. If you find a more optimized approach, feel free to open an issue for it or drop your recommendations here.

Development

Successfully merging this pull request may close these issues.

[PERF]: Optimize csv_read to avoid redundant file reads
