Feature: Added a reader using the multimodal capability of Gemini #18020

sumitaryal · 2025-03-05T19:23:09Z

Description

Add GeminiReader: AI-powered PDF extraction using Google's Gemini models

This PR introduces a new LlamaIndex reader integration that leverages Google's Gemini AI models for high-quality PDF extraction and intelligent chunking.

Motivation

Traditional PDF parsers often struggle with complex layouts, scanned documents, tables, and forms. By utilizing Gemini's vision capabilities, this reader provides significantly better extraction quality and semantic chunking for improved RAG performance.

Key Features

AI-powered OCR that handles complex layouts and scanned documents
Intelligent semantic chunking that preserves document structure
Special handling for tables, forms, and mathematical formulas
Parallel processing with configurable parallelism
Continuous mode for content spanning multiple pages
Built-in caching system to avoid redundant processing

Implementation

The reader is built as a BasePydanticReader extension with comprehensive configuration options. It creates semantically meaningful document chunks while preserving metadata about their source and position in the original document.

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

I added new unit tests to cover this change
I believe this change is already covered by existing unit tests

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

logan-markewich · 2025-03-06T19:33:11Z

llama-index-integrations/readers/llama-index-readers-gemini/llama_index/readers/gemini/base.py

+
+        all_documents = []
+
+        with ThreadPoolExecutor(max_workers=self.max_workers) as executor:


imo, it is 100000x safer to use async concurrency than threading. I would prefer something like asyncio.gather() behind a semaphore. We actually have a utility for this as well:

from llama_index.core.async_utils import run_jobs

jobs = [some_async_fn() for x in thing]
results = await run_jobs(jobs, workers=self.max_workers)
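
For illustration, the "asyncio.gather() behind a semaphore" pattern mentioned above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the implementation of run_jobs and not the code in this PR: gather_with_limit and process_page are hypothetical names, and process_page only simulates a per-page model call with a short sleep.

```python
import asyncio
from typing import Any, Coroutine, List


async def gather_with_limit(
    jobs: List[Coroutine[Any, Any, Any]], workers: int
) -> List[Any]:
    """Await all jobs concurrently, with at most `workers` running at once."""
    semaphore = asyncio.Semaphore(workers)

    async def guarded(job: Coroutine[Any, Any, Any]) -> Any:
        # Each job waits here until one of the `workers` slots is free.
        async with semaphore:
            return await job

    # gather() returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def process_page(page_number: int) -> str:
    """Hypothetical stand-in for a per-page extraction call."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"page-{page_number}"


async def main() -> List[str]:
    jobs = [process_page(n) for n in range(1, 6)]
    return await gather_with_limit(jobs, workers=2)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because everything runs on one event loop, shared state such as a results list is only touched between await points, which is the main safety argument over a thread pool here.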


sumitaryal added 2 commits March 6, 2025 00:57

Feature: Added a reader using the multimodal capability of Gemini

9855a36

Update: Updated README.md

9e1e8ff

sumitaryal marked this pull request as ready for review March 6, 2025 06:17

dosubot bot added the size:XXL label (this PR changes 1000+ lines, ignoring generated files) Mar 6, 2025

Merge branch 'main' into feature/gemini-pdf-reader

93cccea

logan-markewich reviewed Mar 6, 2025

llama-index-integrations/readers/llama-index-readers-gemini/llama_index/readers/gemini/base.py (resolved)

logan-markewich reviewed Mar 6, 2025

llama-index-integrations/readers/llama-index-readers-gemini/llama_index/readers/gemini/base.py (outdated, resolved)

logan-markewich reviewed Mar 6, 2025

llama-index-integrations/readers/llama-index-readers-gemini/pyproject.toml (outdated, resolved)

logan-markewich reviewed Mar 6, 2025

llama-index-integrations/readers/llama-index-readers-gemini/tests/test_readers_gemini.py (outdated, resolved)

sumitaryal and others added 7 commits March 8, 2025 00:53

Updated README.md

e6e899e

Updated pyproject.toml file

c0b684e

Added more tests

86ab188

Update: Modularized Codebase and implemented aload_data method

8841261

Merge branch 'main' into feature/gemini-pdf-reader

af135b0

Merge branch 'feature/gemini-pdf-reader' of https://github.com/sumita…

7225295

…ryal/llama_index into feature/gemini-pdf-reader

ran make format; make lint

7e4b99f

sumitaryal requested a review from logan-markewich March 8, 2025 08:41

sumitaryal added 2 commits March 8, 2025 14:54

Merge branch 'main' into feature/gemini-pdf-reader

dfc85bd

Merge branch 'main' into feature/gemini-pdf-reader

4c1d582
