This project is my implementation of the Codecrafters.io grep challenge, where the goal is to build a simplified version of the classic Unix tool grep — completely from scratch — using Rust.
The project is not just a line matcher: I went deeper into regex parsing and built a custom pattern engine that supports literals, character classes, anchors, quantifiers, groups, and alternation.
At the core, I implemented a mini regex engine.
- Literals → match exact characters (e.g., `abc`)
- Wildcards → `.` matches any single character
- Character classes:
  - `\d` → digits
  - `\w` → identifiers (`a-z`, `A-Z`, `_`)
  - `[abc]` → match one of the given characters
  - `[^abc]` → match any character except those
- Anchors:
  - `^` → start of line
  - `$` → end of line
  - `^...$` → exact line match
- Quantifiers:
  - `+` → one or more
  - `?` → zero or one
- Alternation:
  - `(cat|dog)` → expands into multiple sub-patterns (`cat`, `dog`)

I wrote a full parser that converts regex strings into a list of tokens, and a recursive matcher that evaluates input against this structure.
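
To make the parsing step concrete, here is a minimal sketch of how a pattern string could be turned into a token list. It is not the code from this repository: the `Token` variants and the `parse` function are hypothetical names chosen for illustration, and the sketch treats `^` and `$` as anchors wherever they appear and ignores groups.

```rust
/// Hypothetical token type; variant names are illustrative, not the repo's.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Literal(char),
    AnyChar,                                 // .
    Digit,                                   // \d
    Word,                                    // \w
    Set { chars: Vec<char>, negated: bool }, // [abc] or [^abc]
    StartAnchor,                             // ^
    EndAnchor,                               // $
    OneOrMore(Box<Token>),                   // x+
    ZeroOrOne(Box<Token>),                   // x?
}

/// Converts a pattern into tokens. Quantifiers wrap the token before them.
fn parse(pattern: &str) -> Result<Vec<Token>, String> {
    let mut tokens: Vec<Token> = Vec::new();
    let mut chars = pattern.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '.' => tokens.push(Token::AnyChar),
            '^' => tokens.push(Token::StartAnchor),
            '$' => tokens.push(Token::EndAnchor),
            '\\' => match chars.next() {
                Some('d') => tokens.push(Token::Digit),
                Some('w') => tokens.push(Token::Word),
                // Any other escaped character is taken literally.
                Some(other) => tokens.push(Token::Literal(other)),
                None => return Err("dangling backslash".to_string()),
            },
            '[' => {
                let negated = chars.peek() == Some(&'^');
                if negated {
                    chars.next();
                }
                let mut set = Vec::new();
                loop {
                    match chars.next() {
                        Some(']') => break,
                        Some(ch) => set.push(ch),
                        None => return Err("unclosed '['".to_string()),
                    }
                }
                tokens.push(Token::Set { chars: set, negated });
            }
            '+' | '?' => {
                let prev = tokens
                    .pop()
                    .ok_or_else(|| format!("nothing to repeat before '{}'", c))?;
                tokens.push(if c == '+' {
                    Token::OneOrMore(Box::new(prev))
                } else {
                    Token::ZeroOrOne(Box::new(prev))
                });
            }
            other => tokens.push(Token::Literal(other)),
        }
    }
    Ok(tokens)
}
```

With this sketch, `parse("abc+")` yields `Literal('a')`, `Literal('b')`, and `OneOrMore(Literal('c'))`, so the quantifier binds only to the token directly before it.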
- Search through one or multiple files.
- Recursive folder search with `-r`.
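
A recursive search needs a list of files to scan. The sketch below shows one way to gather them with the standard library only. The function name `collect_files` is hypothetical and this is not necessarily how rusty_grep walks directories. Note that `Path::is_dir` follows symlinks, so a symlink cycle would need extra handling.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Walks `dir` depth-first and appends every non-directory entry to `out`.
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories.
            collect_files(&path, out)?;
        } else {
            out.push(path);
        }
    }
    Ok(())
}
```

The order of entries returned by `read_dir` is platform dependent, so sort the result if stable output matters.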
# From file(s)
./rusty_grep -E "pattern" file1.txt file2.txt
# Recursive search
./rusty_grep -r -E "pattern" <directory>

Exit codes:
- 0 → at least one match found
- 1 → no matches found
- errors (invalid input, file not found, etc.)
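
As a rough illustration of the exit-code convention, a search result can be mapped to a process status as below. The `Outcome` type and both function names are hypothetical. The value `2` for errors is an assumption borrowed from GNU grep, since the list above does not give a number for the error case.

```rust
use std::process::ExitCode;

/// Hypothetical summary of a whole run.
enum Outcome {
    Matched,
    NoMatch,
    Error(String),
}

/// Maps an outcome to a numeric status.
fn exit_status(outcome: &Outcome) -> u8 {
    match outcome {
        Outcome::Matched => 0,
        Outcome::NoMatch => 1,
        // Assumption: 2 mirrors GNU grep; this README does not state the value.
        Outcome::Error(_) => 2,
    }
}

/// Wraps the status so it can be returned directly from `main`.
fn to_exit_code(outcome: &Outcome) -> ExitCode {
    ExitCode::from(exit_status(outcome))
}
```

Returning `ExitCode` from `main` lets destructors run before the process exits, which `std::process::exit` does not.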
- Input regex is expanded (handles alternation like `(cat|dog)`).
  For example, if the pattern is `I love (cat|dog)`, it is treated as two separate patterns: `I love cat` and `I love dog`.
- Each pattern string is parsed into a list of tokens (e.g., `Token::Literal`, `Token::CharClass`, `Token::GroupClass`, etc.).
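
The expansion step can be sketched as a small recursive function. This is an illustration under simplifying assumptions, not the repository's code: `expand_alternation` and `split_top_level` are hypothetical names, escaped parentheses and parentheses inside character classes are not handled, and the parentheses are dropped, so a quantifier placed after a group would bind only to the last character.

```rust
/// Splits `s` on `|` characters that are not nested inside parentheses.
fn split_top_level(s: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let (mut depth, mut start) = (0usize, 0usize);
    for (i, c) in s.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            '|' if depth == 0 => {
                parts.push(&s[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&s[start..]);
    parts
}

/// Expands the first group into one pattern per alternative, then recurses
/// until no groups remain.
fn expand_alternation(pattern: &str) -> Vec<String> {
    let open = match pattern.find('(') {
        Some(i) => i,
        None => return vec![pattern.to_string()],
    };
    // Find the parenthesis that closes the group starting at `open`.
    let mut depth = 0usize;
    let mut close = None;
    for (i, c) in pattern[open..].char_indices() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    close = Some(open + i);
                    break;
                }
            }
            _ => {}
        }
    }
    let close = match close {
        Some(i) => i,
        // Unbalanced group: leave the pattern untouched.
        None => return vec![pattern.to_string()],
    };
    let prefix = &pattern[..open];
    let inner = &pattern[open + 1..close];
    let suffix = &pattern[close + 1..];
    split_top_level(inner)
        .into_iter()
        .flat_map(|alt| expand_alternation(&format!("{}{}{}", prefix, alt, suffix)))
        .collect()
}
```

For `I love (cat|dog)` this returns `I love cat` and `I love dog`. A pattern with two groups, such as `(a|b)(c|d)`, expands to four patterns.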
For each input line:
- The regex engine attempts to match the tokens recursively.
- Quantifiers can produce several candidate "remaining string" states, and the matcher tries each of them.
- If any subpattern matches, the line is considered a match.
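
The idea of multiple "remaining string" states can be sketched as follows. Each token returns every suffix it could leave behind, and the matcher succeeds if any of those suffixes satisfies the rest of the tokens. This is a simplified illustration with hypothetical names (`match_token`, `match_here`, `is_match`) and a reduced token type without anchors or character sets. It is not the engine from this repository.

```rust
/// Reduced, hypothetical token type so the sketch compiles on its own.
#[derive(Debug, Clone)]
enum Token {
    Literal(char),
    AnyChar,
    Digit,
    OneOrMore(Box<Token>),
    ZeroOrOne(Box<Token>),
}

/// Tests a single-character token against one character.
fn matches_char(token: &Token, c: char) -> bool {
    match token {
        Token::Literal(l) => *l == c,
        Token::AnyChar => true,
        Token::Digit => c.is_ascii_digit(),
        Token::OneOrMore(_) | Token::ZeroOrOne(_) => false,
    }
}

/// Returns every possible remainder after `token` consumes a prefix of `input`.
fn match_token<'a>(token: &Token, input: &'a str) -> Vec<&'a str> {
    match token {
        Token::OneOrMore(inner) => {
            let mut results = Vec::new();
            let mut frontier = match_token(inner, input);
            while !frontier.is_empty() {
                results.extend(frontier.iter().copied());
                let mut next = Vec::new();
                for &rest in &frontier {
                    // Keep only remainders that made progress, to avoid looping.
                    next.extend(
                        match_token(inner, rest)
                            .into_iter()
                            .filter(|r| r.len() < rest.len()),
                    );
                }
                frontier = next;
            }
            results
        }
        Token::ZeroOrOne(inner) => {
            // Either skip the token or consume it once.
            let mut results = vec![input];
            results.extend(match_token(inner, input));
            results
        }
        single => {
            let mut chars = input.chars();
            match chars.next() {
                Some(c) if matches_char(single, c) => vec![chars.as_str()],
                _ => Vec::new(),
            }
        }
    }
}

/// Matches all tokens starting exactly at the beginning of `input`.
fn match_here(tokens: &[Token], input: &str) -> bool {
    match tokens.split_first() {
        None => true,
        Some((first, rest)) => match_token(first, input)
            .into_iter()
            .any(|remaining| match_here(rest, remaining)),
    }
}

/// Tries every start position in the line (unanchored search).
fn is_match(tokens: &[Token], line: &str) -> bool {
    line.char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(line.len()))
        .any(|start| match_here(tokens, &line[start..]))
}
```

Returning all remainders is what lets a pattern like `a+a` match `aaa`: the `+` offers the states `aa`, `a`, and the empty string, and the final literal succeeds on the first of them.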
- Opens each file with a buffered reader.
- Passes each line through the matcher.
- Collects matching lines and prints them.
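
The file-processing steps above can be sketched with `std::io::BufReader`. The helper names `matching_lines` and `grep_file` are hypothetical, and the matcher is passed in as a closure so the example stays independent of any particular regex engine.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Reads `reader` line by line and keeps the lines accepted by `is_match`.
fn matching_lines<R: BufRead>(
    reader: R,
    is_match: impl Fn(&str) -> bool,
) -> io::Result<Vec<String>> {
    let mut hits = Vec::new();
    for line in reader.lines() {
        // `lines()` strips the trailing newline and reports I/O errors per line.
        let line = line?;
        if is_match(&line) {
            hits.push(line);
        }
    }
    Ok(hits)
}

/// Opens a file with a buffered reader and collects its matching lines.
fn grep_file(path: &Path, is_match: impl Fn(&str) -> bool) -> io::Result<Vec<String>> {
    let file = File::open(path)?;
    matching_lines(BufReader::new(file), is_match)
}
```

Because `matching_lines` is generic over `BufRead`, the same function can also read from `std::io::stdin().lock()` for piped input.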
I wrote unit tests for the parser and matcher to ensure correctness.
- `\d` → correctly parses into a Digit class.
- `abc+` → parses into `a`, `b`, and `c+`.
- `(cat|dog)` → expands into two subpatterns.
- Anchors (`^`, `$`) → tested on multiple inputs.

If you don’t already have Rust installed, get it from rustup.rs:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

git clone https://github.com/mostafa630/rusty_grep.git
cd rusty_grep

cargo build --release

You can create the test folder anywhere on your system. For example:
mkdir ~/my_test_files
echo "the cat sleeps" > ~/my_test_files/animals1.txt
echo "a dog runs" > ~/my_test_files/animals2.txt
echo "a bat flies" > ~/my_test_files/mixed.txt
echo "dogs and cats are friends" > ~/my_test_files/story.txt

Navigate to the build output:
cd target/release

./rusty_grep -E "(cat|dog)" ~/my_test_files/animals1.txt ~/my_test_files/animals2.txt

### Output
the cat sleeps
a dog runs

./rusty_grep -r -E "(cat|dog)" ~/my_test_files

### Output
animals1.txt:the cat sleeps
animals2.txt:a dog runs
story.txt:dogs and cats are friends

You can also pipe text directly into rusty_grep without files:
echo "hello cat" | ./rusty_grep -E "cat"

### Output
input matched the pattern