@fivetran-kwoodbeck
Collaborator

Annotate type for snowflake REGR_SLOPE function.

Documentation:
https://docs.snowflake.com/en/sql-reference/functions/regr_slope

Collaborator

@geooo109 geooo109 left a comment

Did you also check whether this function exists in other SQL dialects (based on their online docs)?

@fivetran-kwoodbeck
Collaborator Author

> Did you also check if this function exists in other sql dialects ? (from the docs online)

I assume that by "exists in other sql dialects" you are asking whether it's present elsewhere in SQLGlot. No, it's not. The function itself is supported by some other databases.

@geooo109
Collaborator

geooo109 commented Nov 27, 2025

> > Did you also check if this function exists in other sql dialects ? (from the docs online)

> I assume that by "exists in other sql dialects", you are asking whether it's present elsewhere in SQLGlot - no, it's not. The function itself is supported by some other databases.

Yep, I was referring to other databases, not the SQLGlot codebase. When we introduce a new function expression, the goal is to have a common representation (one AST node) for all dialects, plus parsing tests (validate_identity or validate_all) for every dialect that supports the function. For example, a function with the same name can exist in two dialects but accept a different number of arguments; in that case our AST node must handle both.

I did a quick lookup and found that spark, databricks, presto, trino, oracle, postgres, and duckdb support this function (please re-check yourself, as I may have missed something), and your representation is sufficient for all of them. So we can include a validate_identity test like self.validate_identity("REGR_SLOPE(y, x)") in each of their test files (test_{dialect_name}.py), or use the test_dialect.py file and gather all the tests in one place, as in this PR: https://github.com/tobymao/sqlglot/pull/5576/files
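As a rough illustration of what such a per-dialect identity check amounts to, here is a minimal sketch outside the test suite. It assumes that validate_identity boils down to "parse in a dialect, regenerate in the same dialect, compare with the input"; the helper names `round_trips` and `failing_dialects` are hypothetical, and the dialect list is the seven named above plus snowflake.

```python
# Dialects named in the comment above, plus snowflake (the PR's target).
DIALECTS = [
    "snowflake", "spark", "databricks", "presto",
    "trino", "oracle", "postgres", "duckdb",
]


def round_trips(sql: str, dialect: str) -> bool:
    """Return True if `sql` parses and regenerates unchanged in `dialect`."""
    # Third-party dependency; imported here so only this helper needs it.
    import sqlglot

    return sqlglot.parse_one(sql, read=dialect).sql(dialect=dialect) == sql


def failing_dialects(sql: str, dialects=DIALECTS) -> list:
    """Hypothetical helper: list the dialects where the round trip fails."""
    return [d for d in dialects if not round_trips(sql, d)]
```

If the representation is sufficient for every listed dialect, `failing_dialects("REGR_SLOPE(y, x)")` should come back empty; the real tests would still go through validate_identity in test_dialect.py.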

exp.MonthsBetween,
exp.RegrAvgx,
exp.RegrAvgy,
exp.RegrSlope,
Collaborator

@geooo109 geooo109 Nov 27, 2025

The return type of this function depends on its inputs: if any of the input expressions is of type DECFLOAT, the returned type is DECFLOAT; otherwise, the returned type is FLOAT.

Currently, SQLGlot doesn't support DECFLOAT. Should we add a new type here or use a mapping? @VaggelisD

Collaborator

We should add support for DECFLOAT. It's different from FLOAT and has variable scale, so we probably can't make it a mapping.

Collaborator

@geooo109 geooo109 Nov 28, 2025

Yep, agreed, thanks.

@fivetran-kwoodbeck so let's add a new data type called DECFLOAT. Then we need a custom annotation function that first annotates the args of RegrSlope and, based on their types, outputs the correct type.
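The type rule itself can be sketched in isolation. This is only an illustration of the decision described above, assuming the arguments have already been annotated; `SqlType` and `regr_slope_return_type` are hypothetical stand-ins, not SQLGlot's actual data type enum or annotator API.

```python
from enum import Enum


class SqlType(Enum):
    # Hypothetical stand-in for a data type enum, not SQLGlot's own.
    FLOAT = "FLOAT"
    DECFLOAT = "DECFLOAT"
    INT = "INT"


def regr_slope_return_type(y_type: SqlType, x_type: SqlType) -> SqlType:
    """Return DECFLOAT if either annotated argument is DECFLOAT, else FLOAT."""
    if SqlType.DECFLOAT in (y_type, x_type):
        return SqlType.DECFLOAT
    return SqlType.FLOAT
```

For example, `regr_slope_return_type(SqlType.INT, SqlType.DECFLOAT)` yields `SqlType.DECFLOAT`, while two non-DECFLOAT inputs yield `SqlType.FLOAT`. In the real annotator, the argument types would come from annotating the RegrSlope args first.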

@fivetran-kwoodbeck
Collaborator Author

> > > Did you also check if this function exists in other sql dialects ? (from the docs online)

> > I assume that by "exists in other sql dialects", you are asking whether it's present elsewhere in SQLGlot - no, it's not. The function itself is supported by some other databases.

> Yep, I was referring to other databases, not the SQLGlot codebase. When we introduce a new function expression, the goal is to have a common representation (one AST node) for all dialects, plus parsing tests (validate_identity or validate_all) for every dialect that supports the function. For example, a function with the same name can exist in two dialects but accept a different number of arguments; in that case our AST node must handle both.

> I did a quick lookup and found that spark, databricks, presto, trino, oracle, postgres, and duckdb support this function (please re-check yourself, as I may have missed something), and your representation is sufficient for all of them. So we can include a validate_identity test like self.validate_identity("REGR_SLOPE(y, x)") in each of their test files (test_{dialect_name}.py), or use the test_dialect.py file and gather all the tests in one place, as in this PR: https://github.com/tobymao/sqlglot/pull/5576/files

Sounds good. I double-checked and found a few more dialects that support it. All of them were added to test_dialect.py, following the example you shared.

@fivetran-kwoodbeck fivetran-kwoodbeck changed the title from "feat(optimizer)!: Annotate type for snowflake REGR_SLOPE function" to "feat(snowflake)!: Annotate type for snowflake REGR_SLOPE function" on Nov 28, 2025