(retriever) update graphic-elements stage#1488

Merged
jdye64 merged 23 commits into NVIDIA:main from edknv:edwardk/retriever-graphic-elements
Mar 8, 2026
@edknv (Collaborator) commented Mar 4, 2026

Description

This PR integrates the graphic-elements model with OCR for semantically-structured chart text extraction.

  • The new use_graphic_elements flag runs the graphic-elements model on chart crops detected by page-elements, then matches those detections to the OCR output by bounding box. The resulting text is ordered by semantic role (title, axes, labels, legend) rather than concatenated in plain reading order.
  • Both local and remote graphic-elements models are supported. Setting graphic_elements_invoke_url sends chart crops to a NIM endpoint and automatically enables use_graphic_elements.
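
The bounding-box matching step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the Detection and OcrLine types, the role names in ROLE_ORDER, the min_overlap threshold, and the group_text_by_role helper are all hypothetical. The " - " separator within a role only mirrors the sample output shown below.

```python
from dataclasses import dataclass

# (x_min, y_min, x_max, y_max) in pixel coordinates of the chart crop
Box = tuple[float, float, float, float]


@dataclass
class Detection:
    """One graphic-elements detection (hypothetical shape)."""
    label: str
    box: Box


@dataclass
class OcrLine:
    """One OCR text line with its bounding box (hypothetical shape)."""
    text: str
    box: Box


# Assumed role names and output order; the real model's labels may differ.
ROLE_ORDER = ["chart_title", "x_title", "y_title", "xlabel", "ylabel", "legend_label"]


def overlap_fraction(inner: Box, outer: Box) -> float:
    """Return the fraction of `inner`'s area that lies inside `outer`."""
    width = max(0.0, min(inner[2], outer[2]) - max(inner[0], outer[0]))
    height = max(0.0, min(inner[3], outer[3]) - max(inner[1], outer[1]))
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return (width * height) / area if area > 0 else 0.0


def group_text_by_role(detections, ocr_lines, min_overlap=0.5):
    """Assign each OCR line to its best-overlapping detection, then emit by role."""
    grouped = {role: [] for role in ROLE_ORDER}
    unmatched = []
    for line in ocr_lines:
        best = max(
            detections,
            key=lambda det: overlap_fraction(line.box, det.box),
            default=None,
        )
        if (
            best is not None
            and best.label in grouped
            and overlap_fraction(line.box, best.box) >= min_overlap
        ):
            grouped[best.label].append(line.text)
        else:
            # Text outside every known region is kept, appended at the end.
            unmatched.append(line.text)
    parts = [" - ".join(grouped[role]) for role in ROLE_ORDER if grouped[role]]
    if unmatched:
        parts.append(" ".join(unmatched))
    return " ".join(parts)
```

In this sketch each OCR line goes to the single detection that covers most of its area, and lines that match nothing are appended at the end rather than dropped, so no OCR text is lost.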

Before (OCR only)

Chart 1
This chart shows some gadgets, and some very fictitious costs.
Gadgets and their cost
$160.00
$140.00
$120.00
$100.00
Dollars
$80.00
$60.00
$40.00
$20.00
$-
Powerdrill
Bluetooth speaker
Minifridge
Premium desk fan
Hammer
Cost

After (graphic-elements + OCR)

This chart shows some gadgets, and some very fictitious costs.   Minifridge - Premium desk fan - Hammer - Powerdrill - Bluetooth speaker Dollars Chart 1 $160.00 - $100.00 - $120.00 - $80.00 - $40.00 - $60.00 - $140.00 - $20.00 - $- Cost    Gadgets and their cost

Usage Examples

Inprocess (local models)

  python -m nemo_retriever.examples.inprocess_pipeline \
    /work/data/jp20 \
    --query-csv /work/data/jp20_query_gt.csv \
    --use-graphic-elements \
    --num-gpus 2

Inprocess (remote graphic-elements endpoint, local everything else)

  python -m nemo_retriever.examples.inprocess_pipeline \
    /work/data/jp20 \
    --graphic-elements-invoke-url http://localhost:8000/v1/infer \
    --num-gpus 2
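
The command above omits --use-graphic-elements because setting the invoke URL enables the stage automatically. A minimal sketch of that behavior, assuming an argparse-style parser, is shown here. The parse_args helper is hypothetical and the actual pipeline may wire its options differently; only the two flag names come from this PR.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the two graphic-elements options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="graphic-elements options sketch")
    parser.add_argument("--use-graphic-elements", action="store_true")
    parser.add_argument("--graphic-elements-invoke-url", default=None)
    args = parser.parse_args(argv)
    # A remote endpoint implies the stage should run, so turn the flag on.
    if args.graphic_elements_invoke_url:
        args.use_graphic_elements = True
    return args
```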

Batch (all local)

  python -m nemo_retriever.examples.batch_pipeline \
    /work/data/jp20 \
    --use-graphic-elements \
    --gpu-page-elements 0.5 \
    --gpu-ocr 1.0 \
    --gpu-embed 0.5 \
    --ocr-workers 3 \
    --page-elements-workers 3

Batch (remote graphic-elements endpoint, local page-elements + OCR)

  python -m nemo_retriever.examples.batch_pipeline \
    /work/data/jp20 \
    --graphic-elements-invoke-url http://localhost:8000/v1/infer \
    --gpu-page-elements 0.5 \
    --gpu-ocr 1.0 \
    --gpu-embed 0.5

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables, have you ensured those are mirrored in the Helm values.yaml file?

@edknv force-pushed the edwardk/retriever-graphic-elements branch from b2bc067 to 0bd7329 March 4, 2026 19:36
@edknv force-pushed the edwardk/retriever-graphic-elements branch from 37beee8 to e46379f March 4, 2026 19:38
@edknv requested a review from jperez999 March 5, 2026 20:46
@edknv marked this pull request as ready for review March 5, 2026 20:46
@edknv requested a review from a team as a code owner March 5, 2026 20:47
@edknv requested a review from ChrisJar March 7, 2026 00:07
@edknv requested a review from jdye64 March 7, 2026 00:38
@jdye64 merged commit 926651f into NVIDIA:main Mar 8, 2026
9 checks passed
2 participants