Commit 64b8c4f

Author: Don Johnson
added doc
1 parent d1f3393 commit 64b8c4f

1 file changed
+247 -0 lines changed

grep-doc/doc.md

@@ -0,0 +1,247 @@
## 1. Basic Syntax

```bash
grep [OPTIONS] PATTERN [FILE...]
```

- **PATTERN**: A basic regular expression, unless extended mode (`-E`) or fixed-string mode (`-F`) is selected.
- **FILE...**: One or more files to search. If no files are specified, **grep** reads from standard input.

### Common Variants

- **grep**: Uses Basic Regular Expressions (BRE) by default.
- **grep -E** (historically `egrep`): Uses Extended Regular Expressions (ERE).
- **grep -F** (historically `fgrep`): Interprets the pattern as a fixed string, not a regex.

> **POSIX Note**: POSIX specifies **grep** together with the `-E` and `-F` options, so both are portable. The standalone `egrep` and `fgrep` commands are not part of the current standard, and recent GNU grep releases warn that they are obsolescent; prefer `grep -E` and `grep -F` in scripts.

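
To make the difference between the three modes concrete, the sketch below runs the same two patterns through each of them. The sample file and its contents are invented for illustration, and the results assume a grep that treats an unescaped `+` as an ordinary character in BRE.

```bash
# Hypothetical sample data created in a scratch directory.
tmp=$(mktemp -d)
printf 'a.c\nabc\na+c\naac\n' > "$tmp/sample.txt"

# BRE (default): "." matches any single character, so all four lines match.
bre_dot=$(grep 'a.c' "$tmp/sample.txt")

# Fixed string: "." is literal, so only the line "a.c" matches.
fixed_dot=$(grep -F 'a.c' "$tmp/sample.txt")

# BRE: an unescaped "+" is an ordinary character, so only "a+c" matches.
bre_plus=$(grep 'a+c' "$tmp/sample.txt")

# ERE: "+" means "one or more of the preceding item", so only "aac" matches.
ere_plus=$(grep -E 'a+c' "$tmp/sample.txt")

printf 'BRE a.c:\n%s\n-F a.c: %s\nBRE a+c: %s\nERE a+c: %s\n' \
  "$bre_dot" "$fixed_dot" "$bre_plus" "$ere_plus"
rm -rf "$tmp"
```
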
---

## 2. Pattern Basics

### 2.1. Anchors
- `^` : Match start of line
- `$` : Match end of line

Example:
```bash
grep "^abc" file.txt
```
Finds lines starting with **abc**.

### 2.2. Special Characters (BRE)
- `.` : Match any single character (except newline in most implementations)
- `[ ]` : Match any single character within the brackets
- `[^ ]` : Match any single character **not** in the brackets
- `*` : Match **zero or more** of the preceding element

> **POSIX Tip**: In basic regex, an unescaped `+`, `?`, or `|` is an ordinary character. GNU grep accepts the escaped forms below as extensions, but they are not guaranteed on every implementation. The portable BRE spellings are `\{1,\}` for one or more and `\{0,1\}` for zero or one; alternation is most portably written in ERE mode (`-E`).
> - `\+` : Match one or more (GNU BRE extension)
> - `\?` : Match zero or one (GNU BRE extension)
> - `\|` : Alternation (OR) (GNU BRE extension)

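
The four BRE building blocks listed above can be compared side by side on a small word list. This is only a sketch; the file name and the words in it are made up for the example.

```bash
# Hypothetical word list, one word per line.
tmp=$(mktemp -d)
printf 'cat\ncot\ncut\nct\ncoat\n' > "$tmp/words.txt"

# "." needs exactly one character between c and t.
dot=$(grep 'c.t' "$tmp/words.txt")

# "[ao]" accepts only a or o in that position.
class=$(grep 'c[ao]t' "$tmp/words.txt")

# "[^ao]" accepts any one character except a or o.
negated=$(grep 'c[^ao]t' "$tmp/words.txt")

# "o*" accepts zero or more o characters, so "ct" matches as well.
star=$(grep 'co*t' "$tmp/words.txt")

printf 'c.t:\n%s\nc[ao]t:\n%s\nc[^ao]t:\n%s\nco*t:\n%s\n' \
  "$dot" "$class" "$negated" "$star"
rm -rf "$tmp"
```
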
### 2.3. Extended Regex (ERE)
When using `grep -E`, you get:
- `+` : One or more
- `?` : Zero or one
- `|` : Alternation (OR)
- `()` : Grouping without backslashes

Example ERE:
```bash
grep -E "(abc|xyz)+" file.txt
```
Matches lines containing at least one occurrence of `abc` or `xyz`. The `+` only lets a single match extend over consecutive repetitions (such as `abcxyz`); on its own it does not change which lines are selected.

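
As a rough illustration of when the `+` in the example above matters, the following sketch compares the pattern with and without it, and then anchors it to the whole line. The sample lines are invented.

```bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf 'abc\nxyzabc\nabcq\nxy\n' > "$tmp/sample.txt"

# With or without "+", any line containing abc or xyz is selected.
with_plus=$(grep -E '(abc|xyz)+' "$tmp/sample.txt")
without_plus=$(grep -E 'abc|xyz' "$tmp/sample.txt")

# Anchored, the "+" matters: the whole line must be repetitions of abc/xyz,
# so "abcq" is no longer selected.
whole_line=$(grep -E '^(abc|xyz)+$' "$tmp/sample.txt")

printf 'unanchored:\n%s\nanchored:\n%s\n' "$with_plus" "$whole_line"
rm -rf "$tmp"
```
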
---

## 3. Common Options

### 3.1. Matching Options
- **-i** : Ignore case
- **-v** : Invert match (show lines **not** matching)
- **-x** : Match whole lines only (line must match the pattern entirely)
- **-F** : Interpret pattern as a fixed string (no regex)
- **-E** : Use Extended Regular Expressions

#### Example
```bash
grep -i "hello" file.txt # case-insensitive match
grep -v "ERROR" logfile.txt # lines without "ERROR"
grep -x "foobar" file.txt # lines that are exactly "foobar"
grep -F "literal string" file.txt
```

### 3.2. Output Control
- **-n** : Prefix each matching line with its line number
- **-l** : Print only the names of files with matches, not the lines
- **-L** : Print only the names of files **without** matches
- **-c** : Print only a count of matching lines
- **-H** : Print the file name with each match (on by default when searching multiple files)
- **-h** : Do not print file names
- **-s** : Suppress error messages about nonexistent or unreadable files

#### Example
```bash
grep -n "pattern" file.txt # line numbers
grep -l "TODO" *.c # just file names that have "TODO"
grep -L "TODO" *.c # file names that don't have "TODO"
grep -c "foobar" file1 file2 # count matching lines in each file
```

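
To show what these output options actually print, here is a small sketch run against two throwaway files. The file names and contents are assumptions for the example, and `-L` is an extension that may not exist in every grep.

```bash
# Hypothetical files: a.txt has two TODO lines, b.txt has none.
tmp=$(mktemp -d)
printf 'TODO one\nnothing\nTODO two\n' > "$tmp/a.txt"
printf 'clean\n' > "$tmp/b.txt"

# -n prefixes each matching line with its line number.
numbered=$(cd "$tmp" && grep -n 'TODO' a.txt)

# -c prints one "file:count" line per file, counting matching lines.
counts=$(cd "$tmp" && grep -c 'TODO' a.txt b.txt)

# -l lists files with a match; -L lists files without one.
with_match=$(cd "$tmp" && grep -l 'TODO' a.txt b.txt)
without_match=$(cd "$tmp" && grep -L 'TODO' a.txt b.txt || true)

printf '%s\n%s\nwith: %s\nwithout: %s\n' \
  "$numbered" "$counts" "$with_match" "$without_match"
rm -rf "$tmp"
```
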
### 3.3. Context Control
*(Not strictly defined in older POSIX specs, but typically available in modern grep)*
- **-A N** : Print **N** lines **After** each match
- **-B N** : Print **N** lines **Before** each match
- **-C N** : Print **N** lines of context (before & after) for each match

#### Example
```bash
grep -C 2 "ERROR" logfile.txt # 2 lines before & after
grep -B 3 "Segmentation" syslog # 3 lines before
grep -A 1 "TODO" mycode.c # 1 line after
```

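
The effect of the three context options is easiest to see on a short log. The sketch below uses an invented five-line file and assumes a grep that supports `-A`, `-B`, and `-C`.

```bash
# Hypothetical five-line log with a single ERROR in the middle.
tmp=$(mktemp -d)
printf 'line1\nline2\nERROR here\nline4\nline5\n' > "$tmp/app.log"

# -C 1: one line on each side of the match.
around=$(grep -C 1 'ERROR' "$tmp/app.log")

# -B 2: the two lines leading up to the match.
before=$(grep -B 2 'ERROR' "$tmp/app.log")

# -A 1: the single line following the match.
after=$(grep -A 1 'ERROR' "$tmp/app.log")

printf -- '-C 1:\n%s\n-B 2:\n%s\n-A 1:\n%s\n' "$around" "$before" "$after"
rm -rf "$tmp"
```
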
### 3.4. File Recursion
*(This is not strictly in POSIX, but supported by many `grep` implementations.)*
- **-r or -R** : Recursively search directories for matches (in GNU grep, `-R` also follows every symbolic link, while `-r` follows only those named on the command line)
- **--exclude** : Skip files whose names match a glob pattern (GNU grep has a separate `--exclude-dir` for directories)
- **--include** : Search only files whose names match a glob pattern

#### Example
```bash
# Recursively search for "main" in all .c and .h files.
# A quoted '*.{c,h}' is not brace-expanded by the shell and grep's globs
# do not understand braces, so pass one --include per extension.
grep -r --include='*.c' --include='*.h' "main" src/

# Recursively search all files but exclude *.md
grep -r --exclude='*.md' "TODO" .
```

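
The following sketch builds a tiny directory tree to show how `--include` and `--exclude` filter a recursive search, and why a quoted brace pattern selects nothing. The tree and file contents are hypothetical, and the behaviour shown assumes GNU grep or a compatible implementation.

```bash
# Hypothetical source tree with a .c file, a .h file, and a Markdown note.
tmp=$(mktemp -d)
mkdir -p "$tmp/src"
printf 'int main(void) { return 0; }\n' > "$tmp/src/main.c"
printf 'int main_helper(void);\n' > "$tmp/src/util.h"
printf 'main ideas\n' > "$tmp/src/notes.md"

# One --include per extension: only the .c and .h files are searched.
sources=$(cd "$tmp" && grep -rl --include='*.c' --include='*.h' 'main' src | sort)

# A quoted brace pattern is taken literally, so no file name matches it.
braces=$(cd "$tmp" && grep -rl --include='*.{c,h}' 'main' src || true)

# --exclude drops the Markdown file and keeps everything else.
no_markdown=$(cd "$tmp" && grep -rl --exclude='*.md' 'main' src | sort)

printf 'include:\n%s\nbraces: [%s]\nexclude:\n%s\n' \
  "$sources" "$braces" "$no_markdown"
rm -rf "$tmp"
```
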
---

## 4. Basic vs Extended Regular Expressions

| BRE syntax               | Meaning             | ERE syntax                  |
|--------------------------|---------------------|-----------------------------|
| `.`                      | Any single char     | Same                        |
| `[...]`                  | Character class     | Same                        |
| `[^...]`                 | Negated char class  | Same                        |
| `*`                      | Zero or more        | Same                        |
| `\+` (GNU) or `\{1,\}`   | One or more         | `+` (no backslash)          |
| `\?` (GNU) or `\{0,1\}`  | Zero or one         | `?` (no backslash)          |
| `\\|` (GNU)              | Alternation (OR)    | `\|` (no backslash)         |
| `\(`, `\)`               | Grouping            | `(`, `)` (no backslash)     |

**Examples:**

- **BRE**: `grep "fo\+bar" file.txt` matches lines containing `fobar`, `foobar`, `foooobar`, etc. (GNU syntax; the portable spelling is `fo\{1,\}bar`).
- **ERE**: `grep -E "fo+bar" file.txt` does the same.

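
As a quick check that the two syntaxes in the table describe the same matches, the sketch below runs equivalent BRE and ERE patterns over one invented file. It uses the interval form `\{1,\}` on the BRE side so that it does not depend on GNU-only escapes.

```bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf 'fbar\nfobar\nfoooobar\nab\nabab\n' > "$tmp/sample.txt"

# "One or more o": BRE interval versus ERE plus.
bre_repeat=$(grep 'fo\{1,\}bar' "$tmp/sample.txt")
ere_repeat=$(grep -E 'fo+bar' "$tmp/sample.txt")

# "ab twice in a row": escaped group in BRE versus bare group in ERE.
bre_group=$(grep '\(ab\)\{2\}' "$tmp/sample.txt")
ere_group=$(grep -E '(ab){2}' "$tmp/sample.txt")

printf 'repeat:\n%s\ngroup: %s\n' "$bre_repeat" "$bre_group"
rm -rf "$tmp"
```
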
---

## 5. Useful Patterns & Tricks

1. **Search for lines beginning with a digit**:
   ```bash
   grep '^[0-9]' file.txt
   ```
2. **Search for lines ending with a period**:
   ```bash
   grep '\.$' file.txt
   ```
3. **Match lines containing either 'cat' or 'dog' (BRE, GNU syntax)**:
   ```bash
   grep "cat\|dog" file.txt
   ```
   *(ERE: `grep -E "cat|dog" file.txt`)*
4. **Search for an empty line**:
   ```bash
   grep '^$' file.txt
   ```
5. **Count matching lines in each of several files**:
   ```bash
   grep -c "TODO" *.txt
   ```
6. **Show line numbers with matches**:
   ```bash
   grep -n "ERROR" logfile.txt
   ```
7. **Search ignoring case**:
   ```bash
   grep -i "hello" file.txt
   ```
8. **Invert search (find lines that do *not* match)**:
   ```bash
   grep -v "WARN" logfile.txt
   ```

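
Several of the tricks above combine naturally. The sketch below, run on an invented six-line file, counts the non-empty lines by inverting the empty-line pattern and reuses a few of the other patterns for comparison.

```bash
# Hypothetical file containing two empty lines.
tmp=$(mktemp -d)
printf '1 apple\n\nBanana.\n2 cherries\n\nhello World\n' > "$tmp/notes.txt"

# Tricks 4, 5 and 8 together: count the lines that are NOT empty.
nonblank=$(grep -c -v '^$' "$tmp/notes.txt")

# Tricks 1 and 6 together: numbered lines that begin with a digit.
digits=$(grep -n '^[0-9]' "$tmp/notes.txt")

# Trick 2: lines ending with a literal period.
period=$(grep '\.$' "$tmp/notes.txt")

# Trick 7: case-insensitive search.
insensitive=$(grep -i 'WORLD' "$tmp/notes.txt")

printf 'non-empty lines: %s\n%s\n%s\n%s\n' \
  "$nonblank" "$digits" "$period" "$insensitive"
rm -rf "$tmp"
```
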
---

## 6. Performance Tips

1. **Fixed-String Search** (`-F`):
   If you don’t need regex power, `grep -F` is often faster (implementation-dependent).

2. **Pattern Files** (`-f file`):
   If you have many patterns to match, store them in a file, one per line, and pass it with `-f`.

3. **Binary Data** (`-a` or `-I` in some implementations):
   `-a` forces binary files to be treated as text, so grep prints the matching lines instead of only reporting that a binary file matches. `-I` does the opposite and skips binary files entirely.
   *Not strictly POSIX but widely available.*

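
The first two tips can be combined: a pattern file read with `-f` can also be treated as a list of fixed strings with `-F`. The sketch below uses made-up file names and log lines to show that this changes the results as well as the speed, because a `.` in a pattern stops being a wildcard.

```bash
# Hypothetical pattern file (one pattern per line) and log file.
tmp=$(mktemp -d)
printf 'ERROR\n1.2.3\n' > "$tmp/patterns.txt"
printf 'INFO start\nERROR disk\nversion 1.2.3\nversion 1x2y3\n' > "$tmp/app.log"

# Fixed strings: "1.2.3" matches only the literal text.
as_fixed=$(grep -F -f "$tmp/patterns.txt" "$tmp/app.log")

# Regex: each "." matches any character, so "1x2y3" is selected too.
as_regex=$(grep -f "$tmp/patterns.txt" "$tmp/app.log")

printf 'fixed:\n%s\nregex:\n%s\n' "$as_fixed" "$as_regex"
rm -rf "$tmp"
```
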
---

## 7. Quick Reference Table

| **Option**    | **Meaning**                                         |
|---------------|-----------------------------------------------------|
| `-i`          | Case-insensitive search                             |
| `-v`          | Invert the match                                    |
| `-c`          | Print only count of matching lines                  |
| `-n`          | Show line numbers for matches                       |
| `-l`          | Show filenames with matches                         |
| `-L`          | Show filenames without matches                      |
| `-h`          | Hide filename from output                           |
| `-H`          | Always show filename in output                      |
| `-s`          | Silent mode (suppress error messages)               |
| `-x`          | Force pattern to match entire line                  |
| `-F`          | Pattern is a fixed string, not a regex              |
| `-E`          | Use extended regex (like `egrep`)                   |
| `-f FILE`     | Take patterns from a file                           |
| `-A NUM`      | Print NUM lines after a match                       |
| `-B NUM`      | Print NUM lines before a match                      |
| `-C NUM`      | Print NUM lines before & after a match              |
| `-r / -R`     | Recursive search through directories (not POSIX)    |
| `--exclude`   | Skip files matching a glob pattern (not POSIX)      |
| `--include`   | Search only files matching a glob pattern (not POSIX) |

---

## 8. Common Workflows

1. **Searching inside compressed logs** (using a pipe):
   ```bash
   zcat /var/log/messages*.gz | grep "somepattern"
   ```
2. **Combining with `find` for custom recursion** (POSIX-friendly way):
   ```bash
   # /dev/null is an extra operand so grep prints file names even when find passes one file
   find . -type f -exec grep "pattern" /dev/null {} +
   ```
3. **Filtering system logs**:
   ```bash
   tail -f /var/log/syslog | grep --line-buffered "ERROR"
   ```
4. **Multiple patterns** (store them in a file):
   ```bash
   grep -f patterns.txt biglogfile
   ```

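
The `find` workflow has one subtlety worth seeing in action: when `find` hands grep a single file, grep omits the file name from its output. The sketch below, using a hypothetical one-file directory, shows the difference that adding `/dev/null` as an extra operand makes.

```bash
# Hypothetical project directory containing a single file.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj"
printf 'needle\n' > "$tmp/proj/a.txt"

# Only one file operand reaches grep, so the file name is not printed.
bare=$(cd "$tmp" && find proj -type f -exec grep 'needle' {} +)

# With /dev/null as a second operand, grep always prefixes the file name.
named=$(cd "$tmp" && find proj -type f -exec grep 'needle' /dev/null {} +)

printf 'without /dev/null: %s\nwith /dev/null:    %s\n' "$bare" "$named"
rm -rf "$tmp"
```
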
---

# Final Words

- **Know Your Implementation**: While **grep** is standardized by POSIX, many options (like `-r`, `-A`, `-B`, `-C`) are widespread extensions.
- **When in Doubt**: Stick to the core options (`-i`, `-v`, `-c`, `-n`, etc.) and basic/extended regex constructs that are guaranteed across Unix-like systems.
- **Extended vs Basic**: Use `grep -E` if you need plus signs (`+`), question marks (`?`), or alternation without escaping.

This covers the main use cases and commands you’ll rely on for effective text searching with **grep** in a POSIX-compliant manner (plus a few commonly available extensions). Happy searching!
