OCR2MD

OCR2MD is an OCR tool that converts PDFs, Office documents, images, and web or text files to Markdown using AI models. It combines AI-based text recognition with Markdown formatting so that converted documents keep their original structure.

Features

Multiple Format Support
- PDF files
- Office documents (Word, Excel, PowerPoint)
- Images (JPG, PNG, GIF, BMP, TIFF, WebP)
- Web documents (HTML, XML)
- Text files (TXT, RTF)

Intelligent Recognition
- Advanced AI-powered OCR
- Multiple AI model support
- Maintains original document formatting
- Accurate text and layout recognition

Easy to Use
- Modern graphical interface
- Simple drag-and-drop operation
- Real-time conversion progress
- Batch processing support

Installation

Install Python (3.8 or higher)

```
# Download from https://www.python.org/downloads/
# Make sure to check "Add Python to PATH" during installation
```

Install system dependencies

```
# macOS (using Homebrew)
brew install libreoffice
brew install graphicsmagick
brew install poppler

# Linux (Ubuntu/Debian)
sudo apt-get install libreoffice
sudo apt-get install graphicsmagick
sudo apt-get install poppler-utils

# Windows
# Download and install:
# - LibreOffice: https://www.libreoffice.org/download/
# - GraphicsMagick: http://www.graphicsmagick.org/download.html
# - Poppler: https://github.com/oschwartz10612/poppler-windows/releases/
#   After downloading Poppler, add its 'bin' directory to your system PATH
```

Create virtual environment

```
# Create venv
python -m venv venv

# Activate venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
```

Install Python dependencies
```
pip install -r requirements.txt
```
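
Before launching the application, it can help to confirm that the prerequisites listed in the installation steps are reachable from the current environment. The following is a minimal sketch, not part of the project: the executable names (`soffice`, `gm`, `pdftoppm`) are assumptions about what each package commonly installs, and the helper names `python_ok` and `missing_tools` are hypothetical.

```python
import shutil
import sys

# Executable names are assumptions: LibreOffice commonly installs `soffice`,
# GraphicsMagick installs `gm`, and Poppler ships `pdftoppm`.
REQUIRED_TOOLS = {
    "LibreOffice": "soffice",
    "GraphicsMagick": "gm",
    "Poppler": "pdftoppm",
}


def python_ok(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


print("Python 3.8+:", python_ok())
print("Missing tools:", missing_tools() or "none")
```

If a tool is reported missing even though it is installed, its directory is probably not on PATH (on Windows, this is the Poppler `bin` directory mentioned in the installation steps).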

Usage

Start the application
```
python main.py
```

Select a file to convert
- Click "Browse" to select a file
- Supported formats: PDF, DOC, DOCX, XLS, XLSX, PPT, PPTX, images, etc.

Configure the conversion
- Select an AI model from the dropdown list
- Optionally specify the pages to convert (e.g., "1,2,3" or "1-5")

Start the conversion
- Click "Start Convert" to begin
- Monitor progress in the status window
- Converted files are saved to the Downloads folder
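
To make the page-selection format concrete, here is a minimal sketch of how a spec such as "1,2,3" or "1-5" could be expanded into page numbers. It is not the application's actual parser: the function name `parse_pages` is hypothetical, and accepting mixed specs like "3, 1-2" or rejecting page 0 are assumptions made for illustration.

```python
def parse_pages(spec):
    """Parse a page spec such as "1,2,3" or "1-5" into a sorted list of ints."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas and an empty spec
        if "-" in part:
            # A range like "1-5" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    if any(page < 1 for page in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)


print(parse_pages("1,2,3"))
print(parse_pages("1-5"))
```

Collecting pages in a set removes duplicates when a single page and a range overlap, and sorting keeps the output in document order.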

Configuration

Create config.yaml in the project root:

```
vendors:
  - name: "Vendor Name"
    models:
      - name: "Model Name"
        model_id: "model-id"
        env_vars:
          - key: "API_KEY"
            value: "your-api-key"
```
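
The structure of this file can be illustrated with a short sketch that walks the vendors, models, and `env_vars` entries. This is not the project's own loader: the function names are hypothetical, PyYAML is an assumed dependency, and treating each `env_vars` entry as an environment variable to export is an assumption based on the key name.

```python
import os


def load_config(path="config.yaml"):
    """Read the YAML file; requires PyYAML (an assumed dependency)."""
    import yaml  # third-party: pip install pyyaml

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def list_models(config):
    """Return (vendor name, model name, model_id) for every configured model."""
    return [
        (vendor["name"], model["name"], model["model_id"])
        for vendor in config.get("vendors", [])
        for model in vendor.get("models", [])
    ]


def export_env_vars(config, model_id, environ=None):
    """Copy the env_vars of the model with the given model_id into environ."""
    environ = os.environ if environ is None else environ
    for vendor in config.get("vendors", []):
        for model in vendor.get("models", []):
            if model["model_id"] == model_id:
                for entry in model.get("env_vars", []):
                    environ[entry["key"]] = str(entry["value"])
                return True
    return False  # no model with that model_id


# The same structure as the YAML example, written as a Python dict.
example = {
    "vendors": [
        {
            "name": "Vendor Name",
            "models": [
                {
                    "name": "Model Name",
                    "model_id": "model-id",
                    "env_vars": [{"key": "API_KEY", "value": "your-api-key"}],
                }
            ],
        }
    ]
}

print(list_models(example))
```

With a real file, `export_env_vars(load_config(), "model-id")` would place `API_KEY` into the process environment under these assumptions.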

Requirements

- Python 3.8 or higher
- LibreOffice
- GraphicsMagick
- Poppler (PDF processing and text extraction)
- Required Python packages listed in requirements.txt

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request

Acknowledgments

- Thanks to all the open source projects that made this possible
- Special thanks to the AI model providers

jiawuwei/ocr2md
