[BUG] Chunk Overlap Calculation Uses Wrong Formula (Divides by 40) #22

@EnthusiasticTech

Project

vgrep

Description

The chunk overlap calculation in chunk_content divides chunk_overlap by 40, which doesn't match the documented behavior. With the default chunk_overlap = 64 characters, integer division gives 64 / 40 = 1, so consecutive chunks overlap by only 1 line instead of roughly 64 characters. The same expression is duplicated in three separate code locations.

Affected Code Locations (3 duplicates):
src/core/indexer.rs:354-356 (Indexer)
src/core/indexer.rs:730-732 (ServerIndexer)
src/watcher.rs:418-420 (FileWatcher)

Error Message

Debug Logs

============================================================
CHUNK OVERLAP CALCULATION BUG
============================================================

Config: chunk_overlap = 64 characters
Code calculates: 64 / 40 = 1 lines overlap

Result: Only 1 line(s) of overlap!

Expected: ~64 characters of overlap (maybe 2-3 lines)
Actual: 1 line of overlap (probably ~40 chars)
============================================================

System Information

Bounty Version: 0.1.0
OS: Ubuntu 24.04 LTS
CPU: AMD EPYC-Genoa Processor (8 cores)
RAM: 15 GB

Screenshots

No response

Steps to Reproduce

  1. Examine the code at src/core/indexer.rs lines 354-356:
    let overlap_start = if line_idx > 0 { line_idx.saturating_sub(self.chunk_overlap / 40) } else { 0 };
  2. Calculate: 64 / 40 = 1 (integer division)
  3. Index a file with multiple chunks and examine the actual overlap
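The arithmetic in step 2 can be checked in isolation. The sketch below copies the expression from step 1 into a free function; the function name and the choice to pass chunk_overlap as an argument instead of reading it from self are assumptions made for illustration, not code from the repository.

```rust
/// Mirrors the expression quoted in step 1.
/// `overlap_start_current` is a hypothetical name; in the project the
/// expression lives inline and reads `self.chunk_overlap`.
fn overlap_start_current(line_idx: usize, chunk_overlap: usize) -> usize {
    if line_idx > 0 {
        // Integer division: 64 / 40 == 1, 39 / 40 == 0, 80 / 40 == 2.
        line_idx.saturating_sub(chunk_overlap / 40)
    } else {
        0
    }
}

/// Number of lines of overlap the formula yields for a chunk
/// boundary at `line_idx` (hypothetical helper for inspection).
fn overlap_lines_current(line_idx: usize, chunk_overlap: usize) -> usize {
    line_idx - overlap_start_current(line_idx, chunk_overlap)
}
```

Under this formula any chunk_overlap below 40 produces no overlap at all, and the default of 64 produces exactly one line regardless of how long that line is.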

Expected Behavior

The overlap should be approximately 64 characters (default value) between chunks, which would be roughly 2-3 lines of code depending on line length.
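One possible way to get this behavior is to walk backwards from the chunk boundary and accumulate line lengths until the character budget is met. This is only a sketch under assumptions: the function name, the `lines: &[&str]` parameter, and the choice to count one extra character per line for the newline are all hypothetical and not taken from the vgrep source.

```rust
/// Hypothetical helper: returns the index of the first line to include as
/// overlap before `line_idx`, walking backwards until at least
/// `chunk_overlap` characters have been collected.
fn overlap_start_by_chars(lines: &[&str], line_idx: usize, chunk_overlap: usize) -> usize {
    // Clamp so an out-of-range index cannot cause a panic below.
    let mut start = line_idx.min(lines.len());
    let mut collected = 0usize;
    while start > 0 && collected < chunk_overlap {
        start -= 1;
        // Assumption: count one extra character for the line's newline.
        collected += lines[start].chars().count() + 1;
    }
    start
}
```

With 30-character lines and chunk_overlap = 64, this walks back three lines; with 40-character lines it walks back two, which matches the 2-3 line estimate above. A budget of 0 keeps the start at line_idx, so no overlap is added.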

Actual Behavior

The overlap is only 1 line due to the arbitrary division by 40:
Config: chunk_overlap = 64 characters
Code calculates: 64 / 40 = 1 line overlap

Additional Context

No response

Labels: bug, valid, vgrep
