Project
vgrep
Description
The should_index() function includes an empty string "" in its extension match pattern, causing all files without extensions to be indexed. This includes compiled binaries, core dumps, lock files, and other non-text files that should be excluded.
Error Message
# May produce errors like:
Failed to read file: stream did not contain valid UTF-8
# Or silently corrupt the index with binary content
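The UTF-8 failure above is the standard error that Rust's `read_to_string` returns for non-text bytes. As a secondary safeguard, an indexer could treat that error as "skip this file" rather than a hard failure. The following is only a sketch: `read_text_or_skip` is a hypothetical helper, not a function in vgrep, and it does not replace fixing the extension pattern.

```rust
use std::io::{ErrorKind, Read};

/// Hypothetical helper (not part of vgrep): read an entire stream as text.
/// Returns Ok(None) when the bytes are not valid UTF-8, so the caller can
/// skip the file instead of failing or storing binary content.
pub fn read_text_or_skip<R: Read>(mut reader: R) -> std::io::Result<Option<String>> {
    let mut text = String::new();
    match reader.read_to_string(&mut text) {
        Ok(_) => Ok(Some(text)),
        // read_to_string reports invalid UTF-8 as ErrorKind::InvalidData.
        Err(e) if e.kind() == ErrorKind::InvalidData => Ok(None),
        // Any other I/O error is still propagated to the caller.
        Err(e) => Err(e),
    }
}
```

A caller would pass an opened `std::fs::File` and skip indexing when the result is `Ok(None)`.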
Debug Logs
$ RUST_LOG=debug vgrep index
# Shows attempts to read binary files without extensions
System Information
OS: Ubuntu 22.04
vgrep version: 0.1.0
Screenshots
No response
Steps to Reproduce
- Create a directory with mixed files:
echo "valid source" > test.rs
cp /bin/ls ./my_binary # Or any binary without extension
- Run vgrep index
- Observe that vgrep attempts to index my_binary
- Check logs/output for UTF-8 errors or observe the binary in the database
Expected Behavior
Only known text/source file types should be indexed. Files without extensions should only be indexed if they match specific known names (Dockerfile, Makefile, etc.).
Actual Behavior
Every file without an extension matches the | "" arm of the extension pattern, so vgrep attempts to index it.
Additional Context
Files affected:
src/core/indexer.rs → should_index() (line 310)
src/watcher.rs → should_index() (line 244)
Problematic code:
matches!(
ext.as_str(),
"rs" | "py" | ... | "" // Matches ALL extensionless files
)
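One possible fix, sketched under assumptions: drop the `""` arm and check extensionless files against an explicit list of known names, as described under Expected Behavior. The signature below, the `KNOWN_TEXT_FILENAMES` constant, and the two-entry extension list are placeholders for illustration; the real `should_index()` signature and full extension list are not shown in this report.

```rust
use std::path::Path;

/// Illustrative allowlist of extensionless file names that are known to be text.
/// The actual set of names is an assumption and would need to be decided by the project.
const KNOWN_TEXT_FILENAMES: &[&str] = &["Dockerfile", "Makefile"];

/// Hypothetical replacement for should_index(); the extension list here is a
/// placeholder, not vgrep's real list.
pub fn should_index(path: &Path) -> bool {
    match path.extension().and_then(|e| e.to_str()) {
        // The file has a UTF-8 extension: accept only allowlisted ones.
        // Note there is no "" arm any more.
        Some(ext) => matches!(ext.to_ascii_lowercase().as_str(), "rs" | "py"),
        // No extension (or a non-UTF-8 one): accept only known text file names.
        None => path
            .file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |name| KNOWN_TEXT_FILENAMES.iter().any(|k| *k == name)),
    }
}
```

With this shape, `my_binary` from the reproduction steps is rejected while `Dockerfile` and `Makefile` are still indexed. Because the same pattern appears in both src/core/indexer.rs and src/watcher.rs, both call sites would need the change, ideally through one shared function.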