Commit 7645ed5

Using the tesseract CLI tool
1 parent 2f1f03b commit 7645ed5

1 file changed

+17
-0
lines changed

tesseract/tesseract-cli.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
# Using the tesseract CLI tool

Tesseract OCR has a command-line utility which is woefully under-documented. Thanks to [Alexandru Nedelcu](https://alexn.org/blog/2020/11/11/organize-index-screenshots-ocr-macos.html) I figured out how to use it today.

To install on macOS:

    brew install tesseract

To convert an image into an annotated PDF (which you can then copy and paste text out of, and which will be correctly indexed by Spotlight):

    tesseract image.png output-file -l eng pdf

The second argument, `output-file`, is the path and base filename of the output. I left off the `.pdf` extension because Tesseract adds it automatically, so the result is written to a file called `output-file.pdf`.
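
Because Tesseract appends the extension itself, a batch run only needs the image path with its own extension stripped. Here is a minimal sketch of that idea. The function names `output_base` and `ocr_all_pngs` are hypothetical helpers of mine, not part of Tesseract, and it assumes the images are PNG files in the current directory:

```shell
# Hypothetical helper: drop the final extension from a path, e.g.
# "shots/page.png" -> "shots/page". Tesseract adds ".pdf" on its own.
output_base() {
  printf '%s\n' "${1%.*}"
}

# Hypothetical helper: OCR every PNG in the current directory into a
# searchable PDF with the same base name, using the command shown above.
ocr_all_pngs() {
  for img in *.png; do
    [ -e "$img" ] || continue   # the glob matched nothing, so skip
    tesseract "$img" "$(output_base "$img")" -l eng pdf
  done
}
```

Running `ocr_all_pngs` in a folder containing `a.png` and `b.png` should leave `a.pdf` and `b.pdf` next to them.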

To get just the plain text (written to `output-file.txt`):

    tesseract image.png output-file -l eng txt
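
If you would rather not create a file at all, Tesseract accepts the special output name `stdout` in place of the output base and prints the recognised text to the terminal. A small sketch, where `ocr_text` is a hypothetical wrapper name of mine rather than anything Tesseract provides:

```shell
# Hypothetical helper: print the recognised text of one image.
# Passing "stdout" as the output base makes Tesseract write the text
# to standard output rather than to a .txt file.
ocr_text() {
  tesseract "$1" stdout -l eng
}
```

On macOS this pairs well with the clipboard: `ocr_text image.png | pbcopy`.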
