@afoucret
Contributor

Summary

This PR implements constant folding support for ESQL COMPLETION inference plans, allowing completion operations with constant prompts to be evaluated at optimization time rather than at query execution time.

Closes #136863

Technical Details

  • A CompletionFunction is added as an internal primitive only and is not exposed to users.
  • During the analysis phase, Analyzer::resolveInferencePlan converts foldable Completion plans into Eval nodes with CompletionFunction expressions. For example:
FROM books
| COMPLETION "Translate this text" WITH { "inference_id": "my-model" }

is internally rewritten into

FROM books
| EVAL completion=COMPLETION("Translate this text", "my-model")
  • The pre-optimizer uses the InferenceFunctionEvaluator, which handles the actual execution of the inference using CompletionOperator.
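
For readers unfamiliar with constant folding, here is a minimal sketch of the idea in plain Java. It is a toy model, not the code in this PR: `Expr`, `Literal`, `Column`, `Completion`, `tryFold` and the fake inference client are all hypothetical names invented for illustration, and none of them correspond to the real Elasticsearch classes. It only shows the general shape: when the prompt does not depend on row data, the inference call can run once at planning time and its result can be kept as a constant.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

// Toy model only: hypothetical stand-ins, not the Elasticsearch ES|QL types.
final class CompletionFoldingSketch {

    /** A prompt expression that may or may not be evaluable without reading any row. */
    interface Expr {
        boolean foldable();
        String fold(); // only valid when foldable() returns true
    }

    /** A constant string, always foldable. */
    static final class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
        public boolean foldable() { return true; }
        public String fold() { return value; }
    }

    /** A reference to a per-row column, never foldable. */
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        public boolean foldable() { return false; }
        public String fold() { throw new IllegalStateException(name + " depends on row data"); }
    }

    /** Stand-in for the COMPLETION command: a prompt plus an inference endpoint id. */
    static final class Completion {
        final Expr prompt;
        final String inferenceId;
        Completion(Expr prompt, String inferenceId) {
            this.prompt = prompt;
            this.inferenceId = inferenceId;
        }
    }

    /**
     * If the prompt is foldable, calls the inference client once at planning time and
     * returns the result as a constant. Otherwise returns empty, meaning the caller
     * keeps the per-row execution path.
     */
    static Optional<Literal> tryFold(Completion plan, BiFunction<String, String, String> client) {
        if (!plan.prompt.foldable()) {
            return Optional.empty();
        }
        String result = client.apply(plan.inferenceId, plan.prompt.fold());
        return Optional.of(new Literal(result));
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Fake inference client (an assumption for this sketch): counts calls and echoes its input.
        BiFunction<String, String, String> client = (id, prompt) -> {
            calls.incrementAndGet();
            return "[" + id + "] " + prompt;
        };
        Optional<Literal> folded =
            tryFold(new Completion(new Literal("Translate this text"), "my-model"), client);
        Optional<Literal> notFolded =
            tryFold(new Completion(new Column("title"), "my-model"), client);
        System.out.println(folded.map(l -> l.value).orElse("<not folded>"));
        System.out.println("column prompt folded: " + notFolded.isPresent());
        System.out.println("inference calls: " + calls.get());
    }
}
```

In this sketch the column-based prompt never reaches the client, so only the constant prompt triggers an inference call at planning time.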

@elasticsearchmachine added the Team:Search Relevance label Nov 14, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @afoucret, I've created a changelog YAML for you.

Member

@carlosdelest left a comment

This LGTM 👍

I was wondering if some rules could be taken out of the Analyzer - like transforming the command into an Eval - but I don't see a special benefit out of it, and I prefer to have all the related transformations in one place 👍

Folding is straightforward and incorporated into the InferencePlan.

Contributor

@ioanatia left a comment

I added some non-blocking comments that you can address before merging.

// Test that a foldable Completion plan (with literal prompt) is transformed to Eval with CompletionFunction
LogicalPlan plan = analyze("""
FROM books METADATA _score
| COMPLETION "Translate this text in French" WITH { "inference_id" : "completion-inference-id" }
Contributor

Can we add another test, with something like:

FROM books METADATA _score
| EVAL prompt = "Translate this text in", language = "French"
| COMPLETION concat(prompt, language) WITH ....

That way we test not only literals, which are always foldable, but also foldable expressions that are not literals.
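
To illustrate the distinction between a literal and a foldable non-literal expression, here is a small toy sketch in plain Java. All names (`Expr`, `Literal`, `Column`, `Concat`) are hypothetical and are not the ES|QL expression classes. The sketch inlines the constant values directly and does not model how ES|QL resolves references to EVAL aliases such as `prompt` and `language`. It only shows that a function call over constant arguments can itself be folded, while the same call over a column cannot.

```java
// Toy model only: hypothetical classes, not the ES|QL expression tree.
final class FoldableExpressionSketch {

    interface Expr {
        boolean foldable();
        String fold(); // only valid when foldable() returns true
    }

    /** A constant string, always foldable. */
    static final class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
        public boolean foldable() { return true; }
        public String fold() { return value; }
    }

    /** A reference to a per-row column, never foldable. */
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        public boolean foldable() { return false; }
        public String fold() { throw new IllegalStateException(name + " depends on row data"); }
    }

    /** Concatenation: not a literal, but foldable exactly when both arguments are. */
    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }
        public boolean foldable() { return left.foldable() && right.foldable(); }
        public String fold() { return left.fold() + right.fold(); }
    }

    public static void main(String[] args) {
        // Constant arguments: the whole expression can be evaluated without any row.
        Expr constantPrompt = new Concat(new Literal("Translate this text in "), new Literal("French"));
        // One argument reads a column: the expression must be evaluated per row.
        Expr rowPrompt = new Concat(new Literal("Translate: "), new Column("title"));

        System.out.println("constant prompt foldable: " + constantPrompt.foldable());
        System.out.println("folded value: " + constantPrompt.fold());
        System.out.println("row prompt foldable: " + rowPrompt.foldable());
    }
}
```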


/**
* COMPLETION function generates text completions from a prompt using an inference endpoint.
* This function is not registered in the function registry.
Contributor

Can we add more details here? The comment should say that this is only used for optimizing the COMPLETION command when we deal with a foldable prompt (with an example), and that it should never be added to the function registry, since we don't have support for async functions, etc.

@afoucret merged commit efbd414 into elastic:main Nov 17, 2025
34 checks passed

Labels

>enhancement, :Search Relevance/ES|QL, Team:Search Relevance, v9.3.0

Development

Successfully merging this pull request may close these issues.

[ES|QL][Completion] Inference Constant folding optimization for COMPLETION

4 participants