@bpintea
Contributor

@bpintea bpintea commented Oct 23, 2025

This updates the way the @timestamp field is injected into the functions that require it implicitly: these functions no longer need to declare an attribute themselves; the function registry does it for them.

As a follow-up, the field can be traced from the source and eventually wired into the functions (so that renames are no longer problematic).

Closes #136772

@elasticsearchmachine
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

if (plan instanceof Rerank r) {
return resolveRerank(r, childrenOutput);
}
return switch (plan) {
Member

👍

Member

@costin costin left a comment

Thanks for looking into this. Left a first round of comments.

Member

The alternative to this interface is the TimestampAware interface with a timestamp() getter-like method. This would act as a marker interface which can be used by the FunctionRegistry to know if the timestamp needs to be injected or not.
To avoid compiler ambiguities, potentially add the tcab wrapper.
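
A minimal sketch of the marker-interface idea described above, assuming a simplified model: `Attribute`, `Rate`, `Abs`, and `needsTimestamp` are hypothetical stand-ins for illustration and are not the actual ES|QL classes or the real `FunctionRegistry` API. It only shows how a registry-like component could use the interface to decide whether a timestamp has to be injected.

```java
public class TimestampAwareSketch {
    /** Hypothetical stand-in for an ES|QL attribute. */
    static final class Attribute {
        final String name;

        Attribute(String name) {
            this.name = name;
        }
    }

    /** Marker interface with a getter-like method, as suggested in the review. */
    interface TimestampAware {
        Attribute timestamp();
    }

    /** Hypothetical function that implicitly depends on @timestamp. */
    static final class Rate implements TimestampAware {
        private final Attribute timestamp;

        Rate(Attribute timestamp) {
            this.timestamp = timestamp;
        }

        @Override
        public Attribute timestamp() {
            return timestamp;
        }
    }

    /** Hypothetical function with no implicit timestamp dependency. */
    static final class Abs {}

    /** Registry-like check: does this function class need the timestamp injected? */
    static boolean needsTimestamp(Class<?> functionClass) {
        return TimestampAware.class.isAssignableFrom(functionClass);
    }

    public static void main(String[] args) {
        System.out.println("Rate needs timestamp: " + needsTimestamp(Rate.class));
        System.out.println("Abs needs timestamp: " + needsTimestamp(Abs.class));
    }
}
```

The check works on the class rather than on an instance, so in this sketch the decision can be made before the function is constructed.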

Comment on lines 74 to 76
public TBucket(Source source, Expression buckets, TimestampAttributeSupplier timestampSupplier) {
this(source, buckets, timestampSupplier.timestampAttribute());
}
Member

This constructor could be removed; the function would then rely only on the TBucket constructor with 2 parameters, where the user supplies the bucket and the ESQL infra supplies the timestamp.

Member

@costin costin left a comment

👍

Member

Adding a timestamp() method would be useful.

Contributor Author

Added. It's not strictly needed now, but it's good to have it available for later. It fits well with the timestamp() method that some of the implementing classes already have.

if (arg.foldable() && ((arg instanceof EsqlScalarFunction) == false)) {
if (i < targetDataTypes.size()) {
targetDataType = targetDataTypes.get(i);
int targetDataTypesIdx = f instanceof TimestampAware ? i - 1 : i;
Member

How about using TimestampAware.timestamp() ?

Comment on lines 10 to 15
/**
* Marker interface to identify classes of functions that operate on the {@code @timestamp} field of an index.
* Implementations of this interface need to expect the associated {@code Attribute} to be passed as the following argument after the
* {@code Source} one, which is always the first one.
*/
public interface TimestampAware {}
Member

What's the reason for having the Timestamp expression right after Source and not the last Expression, that is:

Class(Source, Expression field, Expression timestamp)
Class(Source, Expression field, Expression timestamp, Configuration config)
Class(Source, Expression field, Expression timestamp, List)

In case of ambiguity, this can be sorted out inside the function registry.

Contributor Author

IMO it makes the most sense to have it following source:

  • it's always there, just like source and always on same position for all TimestampAware functions;
  • it's consistent to report (verification issues) about its position, if needed;
  • it's clearer to apply positional transformations, like implicit casting (example).

But I've reverted this change, leaving the code in line with the other TS functions.
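
To illustrate the positional-transformation point in the list above, here is a small sketch of the index shift that a timestamp-first ordering implies for implicit casting. It assumes the injected timestamp is the first child expression, so user-supplied arguments are shifted by one relative to their declared types. The classes and the `"date_period"` type name are hypothetical placeholders; only the ternary mirrors the one-line diff quoted earlier in this conversation.

```java
import java.util.List;

public class ImplicitCastIndexSketch {
    /** Stand-in for the marker interface discussed in this PR. */
    interface TimestampAware {}

    /** Hypothetical function without an injected timestamp. */
    static final class PlainFunction {}

    /** Hypothetical function whose first child is the injected timestamp. */
    static final class TsFunction implements TimestampAware {}

    /**
     * Maps the position of a child expression to its index in the list of
     * declared parameter types. Assumes the injected timestamp is child 0,
     * so user-supplied arguments start at child 1.
     */
    static int targetDataTypesIdx(Object function, int childIdx) {
        return function instanceof TimestampAware ? childIdx - 1 : childIdx;
    }

    public static void main(String[] args) {
        // Hypothetical declared type of the single user-supplied argument.
        List<String> declaredTypes = List.of("date_period");

        // Children of the TS function: [timestamp, buckets]; buckets is child 1.
        System.out.println(declaredTypes.get(targetDataTypesIdx(new TsFunction(), 1)));

        // Children of the plain function: [field]; field is child 0.
        System.out.println(declaredTypes.get(targetDataTypesIdx(new PlainFunction(), 0)));
    }
}
```

With a fixed position, the offset is a constant, which is the consistency argument made above. With the timestamp placed last, no shift would be needed for the user-supplied arguments.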

@elastic elastic deleted a comment from github-actions bot Oct 27, 2025
@elastic elastic deleted a comment from github-actions bot Oct 27, 2025
@elasticsearchmachine
Collaborator

Hi @bpintea, I've updated the changelog YAML for you.

Member

@costin costin left a comment

LGTM

import static org.hamcrest.Matchers.startsWith;

//@TestLogging(value = "org.elasticsearch.xpack.esql:TRACE,org.elasticsearch.compute:TRACE", reason = "debug")
@TestLogging(value = "org.elasticsearch.xpack.esql:TRACE,org.elasticsearch.compute:TRACE", reason = "debug")
Member

Leftover

@bpintea bpintea marked this pull request as ready for review October 27, 2025 20:52
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@bpintea bpintea added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Oct 27, 2025
@bpintea bpintea removed the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Oct 27, 2025
defTS(FirstOverTime.class, FirstOverTime::new, "first_over_time"),
def(PercentileOverTime.class, bi(PercentileOverTime::new), "percentile_over_time"),
// dense vector function
def(TextEmbedding.class, bi(TextEmbedding::new), "text_embedding") } };
Contributor

I wonder if First/Last functions should be updated as well?
Conceptually they need @timestamp although (if I remember correctly) it is possible to supply it or another date/time field explicitly.

Contributor Author

The sort field can be anything (as long as its type satisfies dt -> dt == DataType.LONG || dt == DataType.DATETIME || dt == DataType.DATE_NANOS), @timestamp included, but it's not restricted to it.
This change only applies to functions that only work with the @timestamp field and which assume it's there, i.e. it's an implicit dependency (which this change provides/injects).
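
The type restriction quoted in this reply can be read as a simple predicate. The sketch below assumes a stand-in `DataType` enum with only a few members (the real ES|QL `DataType` is far larger) and a hypothetical `VALID_SORT_TYPE` constant; it shows only which types the quoted lambda accepts.

```java
import java.util.function.Predicate;

public class SortTypeSketch {
    /** Stand-in enum; only the members needed for the illustration. */
    enum DataType { LONG, DATETIME, DATE_NANOS, KEYWORD }

    /** The predicate quoted in the comment above, bound to a hypothetical name. */
    static final Predicate<DataType> VALID_SORT_TYPE =
        dt -> dt == DataType.LONG || dt == DataType.DATETIME || dt == DataType.DATE_NANOS;

    public static void main(String[] args) {
        for (DataType dt : DataType.values()) {
            System.out.println(dt + " accepted as sort type: " + VALID_SORT_TYPE.test(dt));
        }
    }
}
```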

@bpintea bpintea merged commit 7e75746 into elastic:main Oct 28, 2025
33 of 34 checks passed
@bpintea bpintea deleted the enh/timestamp_aware branch October 28, 2025 10:47
Labels

:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.3.0

Development

Successfully merging this pull request may close these issues.

ESQL: Introduce TimestampAware interface/contract

4 participants