Description
Hi,
First of all, thanks for open sourcing this nice package. :-)
I am trying to use langextract to post-process interviews by tagging (sometimes long) quotes in a text.
The problem I run into is that the text extraction takes exceedingly long to complete.
For example, it takes 17 minutes (~1000 seconds) to analyze a text of 355 characters (59 words) using an examples list that contains a single ExampleData with:
- A `text` of length ~2,100 words (~13,000 characters), with 10 `Extraction`s whose `extraction_text` sizes are:
  - 6 characters
  - 7 characters
  - 7 characters
  - 115 characters
  - 496 characters
  - 207 characters
  - 84 characters
  - 139 characters
  - 36 characters
  - 334 characters
(Unfortunately, I can not share the actual contents of the text for privacy reasons.)
Most of the time is spent before any output is generated. When I interrupt the program, it is always stuck in `_fuzzy_align_extraction`. Once it starts emitting output like this:
```
WARNING:absl:Prompt alignment: non-exact match:
```
the program quickly finishes. This is the corresponding output generated after the long silent period:
```
LangExtract: model=gemini-2.5-flash, current=358 chars, processed=358 chars: [00:09]
✓ Extraction processing complete
INFO:absl:Finalizing annotation for document ID .
INFO:absl:Document annotation completed.
✓ Extracted 3 entities (1 unique types)
• Time: 9.76s
• Speed: 37 chars/sec
• Chunks: 1
```
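For context, here is a minimal sketch (not langextract's actual implementation) of why fuzzy alignment can dominate runtime: sliding a window of the extraction's length across the source and scoring each position with a sequence matcher makes the cost grow with both the source length and the extraction length, so long example texts combined with long `extraction_text`s multiply the work.

```python
# Hypothetical illustration of sliding-window fuzzy alignment.
# This is NOT langextract's code; it only shows the cost pattern:
# one SequenceMatcher.ratio() call per candidate start position.
import difflib


def fuzzy_align(source: str, extraction: str) -> tuple[int, float]:
    """Return (best start index, best similarity ratio) by sliding a
    window of len(extraction) over the source text."""
    window = len(extraction)
    best_pos, best_ratio = 0, 0.0
    # One similarity computation per candidate position in the source.
    for start in range(max(1, len(source) - window + 1)):
        ratio = difflib.SequenceMatcher(
            None, extraction, source[start:start + window]
        ).ratio()
        if ratio > best_ratio:
            best_pos, best_ratio = start, ratio
    return best_pos, best_ratio


source = "the quick brown fox jumps over the lazy dog " * 20
pos, ratio = fuzzy_align(source, "brown fox jumps")
```

If the real bottleneck behaves like this, shortening the example `text` and the longer `extraction_text`s (for example, splitting the 496- and 334-character quotes) should reduce alignment time noticeably.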
Any suggestions on how to speed up the extract function?
Thanks in advance,
Hylke