
Document Loader MCP

Read PDF, Word, Excel, PowerPoint, and image files directly in Claude Code.

Source: AWS Labs - Document Loader MCP

What It Does

| Tool | File Types | Description |
|------|------------|-------------|
| `read_document` | PDF, Word, Excel, PowerPoint | Extract text content from documents |
| `read_image` | PNG, JPG, GIF, BMP, TIFF, WEBP | Load images for visual analysis |

Installation

Prerequisites

Node.js 18+ or Bun
Claude Code CLI

Step 1: Install the MCP Package

# Using npm
npm install -g @anthropic/mcp-document-loader

# Or using bun
bun add -g @anthropic/mcp-document-loader

Step 2: Add to Claude Code Settings

Edit your Claude Code settings file:

# macOS/Linux
~/.claude/settings.json

# Or open via Claude Code
claude config

Add the server under the top-level mcpServers key (create the key if it does not exist yet):

{
  "mcpServers": {
    "document-loader-mcp": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-document-loader"],
      "env": {}
    }
  }
}

Alternative using Bun:

{
  "mcpServers": {
    "document-loader-mcp": {
      "command": "bunx",
      "args": ["@anthropic/mcp-document-loader"],
      "env": {}
    }
  }
}
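
If settings.json already holds other settings or servers, pasting the block above wholesale would overwrite them. The following is a minimal Python sketch of a merge, assuming the file is plain JSON at the path shown earlier. The helper names `add_document_loader` and `update_settings_file` are hypothetical and are not part of Claude Code or the MCP package.

```python
import json
from pathlib import Path

# Server entry copied from the npx configuration above.
SERVER_NAME = "document-loader-mcp"
SERVER_ENTRY = {
    "command": "npx",
    "args": ["-y", "@anthropic/mcp-document-loader"],
    "env": {},
}


def add_document_loader(settings: dict) -> dict:
    """Return a copy of `settings` with the server added under mcpServers.

    Other top-level keys and other servers are kept as they are.
    (Hypothetical helper, for illustration only.)
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file (if present), merge the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_document_loader(current), indent=2) + "\n")
```

Calling `update_settings_file(Path.home() / ".claude" / "settings.json")` would apply the merge in place; keeping a backup copy of the file first is a sensible precaution.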

Step 3: Verify Installation

# List available MCPs
mcp-cli servers

# Check the tools
mcp-cli tools document-loader-mcp

# Should show:
# document-loader-mcp/read_document
# document-loader-mcp/read_image

Usage

Reading Documents

# Check the schema first (REQUIRED)
mcp-cli info document-loader-mcp/read_document

# Read a PDF
mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.pdf", "file_type": "pdf"}'

# Read a Word document
mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.docx", "file_type": "docx"}'

# Read an Excel file
mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.xlsx", "file_type": "xlsx"}'

# Read a PowerPoint
mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.pptx", "file_type": "pptx"}'

Reading Images

# Check the schema first
mcp-cli info document-loader-mcp/read_image

# Read an image
mcp-cli call document-loader-mcp/read_image '{"file_path": "/path/to/image.png"}'
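
When these calls are scripted rather than typed, building the JSON argument by hand gets error-prone for paths that contain spaces or quotes. Here is a small Python sketch that assembles the same `mcp-cli call` command shown above. The function names are hypothetical, and because the output format of `mcp-cli` is not described here, the sketch simply returns raw stdout.

```python
import json
import subprocess

SERVER = "document-loader-mcp"


def build_call(tool: str, arguments: dict) -> list[str]:
    """Build the argv for: mcp-cli call <server>/<tool> '<json arguments>'."""
    return ["mcp-cli", "call", f"{SERVER}/{tool}", json.dumps(arguments)]


def run_call(tool: str, arguments: dict) -> str:
    """Run the command and return its stdout.

    Raises subprocess.CalledProcessError if mcp-cli exits with a non-zero status.
    """
    result = subprocess.run(
        build_call(tool, arguments), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the command as an argument list (no shell) means the JSON string reaches `mcp-cli` unchanged, so no extra quoting is needed.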

Parameters

read_document

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file_path` | string | Yes | Absolute path to the document |
| `file_type` | string | Yes | One of: `pdf`, `docx`, `doc`, `xlsx`, `xls`, `pptx`, `ppt` |
| `timeout_seconds` | integer | No | Timeout (default: 30, min: 5, max: 300) |

read_image

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `file_path` | string | Yes | Absolute path to the image |
| `timeout_seconds` | integer | No | Timeout (default: 30, min: 5, max: 300) |
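
The constraints in the `read_document` table can be checked before a call is made, which turns a failed round trip into an immediate error. This is a minimal sketch, assuming POSIX-style paths as in the macOS/Linux examples; `read_document_args` is a hypothetical helper, and the server may apply further checks of its own.

```python
from pathlib import PurePosixPath

# Allowed values and limits taken from the read_document parameter table.
DOCUMENT_TYPES = {"pdf", "docx", "doc", "xlsx", "xls", "pptx", "ppt"}
TIMEOUT_MIN, TIMEOUT_MAX = 5, 300


def read_document_args(file_path: str, file_type: str, timeout_seconds: int = 30) -> dict:
    """Validate the parameters and return them as an arguments dict.

    Raises ValueError on a relative path, an unknown file_type,
    or a timeout outside the documented range.
    """
    if not PurePosixPath(file_path).is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    if file_type not in DOCUMENT_TYPES:
        raise ValueError(f"unsupported file_type: {file_type!r}")
    if not TIMEOUT_MIN <= timeout_seconds <= TIMEOUT_MAX:
        raise ValueError(
            f"timeout_seconds must be between {TIMEOUT_MIN} and {TIMEOUT_MAX}"
        )
    return {
        "file_path": file_path,
        "file_type": file_type,
        "timeout_seconds": timeout_seconds,
    }
```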

Auto-Detection in CLAUDE.md

Add this to your ~/.claude/CLAUDE.md to make Claude automatically use this MCP:

### Document Reading - Use `document-loader-mcp` (NOT built-in Read tool)

| File Extension | MCP Command |
|----------------|-------------|
| `.pdf` | `mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.pdf", "file_type": "pdf"}'` |
| `.doc`, `.docx` | `mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.docx", "file_type": "docx"}'` |
| `.ppt`, `.pptx` | `mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.pptx", "file_type": "pptx"}'` |
| `.xls`, `.xlsx` | `mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/file.xlsx", "file_type": "xlsx"}'` |

**Auto-detection rules:**
1. File path ends in .pdf, .doc, .docx, .ppt, .pptx, .xls, .xlsx → USE document-loader-mcp IMMEDIATELY
2. File path ends in .png, .jpg, .jpeg, .gif, .bmp, .tiff, .webp → USE document-loader-mcp/read_image
3. **NEVER ask "should I read this?"** - just read it automatically
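
For scripts that need the same routing outside of Claude, the auto-detection rules above amount to a lookup by file extension. The Python sketch below rests on two assumptions that the rules do not state: extensions are matched case-insensitively, and `file_type` is the extension without its dot (so `.doc` maps to `doc`, a value listed in the parameter table, whereas the command table above shows `docx` for both). The function name `route` is hypothetical.

```python
from pathlib import PurePosixPath

# Extension lists copied from auto-detection rules 1 and 2.
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".webp"}


def route(file_path: str):
    """Return (tool, arguments) for a path, or None if neither rule matches."""
    ext = PurePosixPath(file_path).suffix.lower()
    if ext in DOCUMENT_EXTENSIONS:
        # Assumption: file_type is the extension without the leading dot.
        return "read_document", {"file_path": file_path, "file_type": ext[1:]}
    if ext in IMAGE_EXTENSIONS:
        return "read_image", {"file_path": file_path}
    return None
```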

Troubleshooting

"MCP not found"

# Reinstall
npm install -g @anthropic/mcp-document-loader

# Or try with npx directly
npx -y @anthropic/mcp-document-loader

"Timeout" errors

Increase the timeout for large files (the default is 30 seconds and the maximum is 300):

mcp-cli call document-loader-mcp/read_document '{"file_path": "/path/to/large.pdf", "file_type": "pdf", "timeout_seconds": 120}'

"File not found"

Always use absolute paths, not relative:

# Wrong
mcp-cli call document-loader-mcp/read_document '{"file_path": "file.pdf", "file_type": "pdf"}'

# Correct
mcp-cli call document-loader-mcp/read_document '{"file_path": "/Users/you/Documents/file.pdf", "file_type": "pdf"}'
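
When the path comes from user input or another tool, it can be converted to an absolute path before it goes into the JSON arguments. Here is a minimal Python sketch, assuming the file sits on the local filesystem; `absolute_existing` is a hypothetical helper name.

```python
from pathlib import Path


def absolute_existing(file_path: str) -> str:
    """Expand ~, resolve relative to the current directory, and confirm the file exists.

    Raises FileNotFoundError if the resolved path is not a regular file.
    """
    resolved = Path(file_path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No such file: {resolved}")
    return str(resolved)
```

Checking for the file locally gives a clearer error than waiting for the server to report "File not found".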

Why Use This Instead of Built-in Read?

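| Feature | Built-in Read | document-loader-mcp |
|---------|---------------|---------------------|
| PDF extraction | Basic | Full text + structure |
| Word documents | No | Yes |
| Excel files | No | Yes |
| PowerPoint | No | Yes |
| Image analysis | Yes | Yes |
| Large files | Limited | Better handling |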

License

Apache 2.0 (AWS Labs)
