@fivetran-felixhuang
Collaborator

exp.Substring,
exp.TimeSlice,
exp.TimestampTrunc,
exp.Median,
Collaborator

I'm not sure about this; did you test it?

The docs state: "Returns a FLOAT or DECIMAL (fixed-point) number, depending upon the input."

SELECT SYSTEM$TYPEOF(MEDIAN(CAST(1 AS BIGINT)));
>NUMBER(38,3)[SB2]

SELECT SYSTEM$TYPEOF(CAST(1 AS BIGINT));
>NUMBER(38,0)[SB2]

Collaborator Author

I am not sure what the right type is. The docs say it returns FLOAT or DECIMAL, but when I try it on Snowflake, I only see NUMBER types.

Collaborator

@VaggelisD VaggelisD Nov 28, 2025

This is because NUMBER is synonymous with DECIMAL, INT, BIGINT, etc., but not with FLOAT:

snowflake> SELECT 
    SYSTEM$TYPEOF(MEDIAN(CAST(1 AS FLOAT))) as t1,
    SYSTEM$TYPEOF(MEDIAN(CAST(1 AS NUMBER))) as t2,
    SYSTEM$TYPEOF(MEDIAN(CAST(1 AS BIGINT))) as t3, 
    SYSTEM$TYPEOF(MEDIAN(CAST(1 AS NUMBER(10, 5)))) as t4;

T1 | T2 | T3 | T4
-- | -- | -- | --
FLOAT[DOUBLE] | NUMBER(38,3)[SB2] | NUMBER(38,3)[SB2] | NUMBER(13,8)[SB4]

Notice how MEDIAN(FLOAT) -> FLOAT while others return DECIMAL / NUMBER.

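Also, this example shows that fixed-point inputs do not keep their precision and scale: NUMBER(10, 5) comes back as NUMBER(13, 8), and BIGINT, which is NUMBER(38, 0), comes back as NUMBER(38, 3).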

We should annotate it as FLOAT (aka DOUBLE) only if the input expr is also FLOAT; otherwise we should annotate it as DECIMAL.
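
To make the pattern in the query output concrete, here is a minimal Python sketch. It is fitted only to the four results above and is not taken from Snowflake's documentation: the `+3` on precision and scale is an assumption inferred from `NUMBER(38,0) -> NUMBER(38,3)` and `NUMBER(10,5) -> NUMBER(13,8)`, and `median_result_type` is a hypothetical helper, not part of sqlglot.

```python
MAX_PRECISION = 38  # Snowflake's maximum NUMBER precision
EXTRA_DIGITS = 3    # assumption: inferred from the sample output only


def median_result_type(kind: str, precision: int = 38, scale: int = 0) -> str:
    """Guess the type MEDIAN reports for a given input type (hypothetical helper)."""
    if kind.upper() in ("FLOAT", "DOUBLE"):
        # MEDIAN(FLOAT) stayed FLOAT in the sample query
        return "FLOAT"
    # Fixed-point inputs: precision and scale both grew by 3 in the samples,
    # with precision capped at the maximum of 38
    new_precision = min(precision + EXTRA_DIGITS, MAX_PRECISION)
    new_scale = scale + EXTRA_DIGITS
    return f"NUMBER({new_precision},{new_scale})"


print(median_result_type("FLOAT"))
print(median_result_type("BIGINT", 38, 0))
print(median_result_type("NUMBER", 10, 5))
```

For the annotator itself only the FLOAT vs. DECIMAL split matters; the precision arithmetic is shown just to illustrate why the input type cannot be passed through unchanged.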

Collaborator

Yep, @fivetran-felixhuang, let's fix this.

Collaborator Author

I have this update:

def _annotate_median(self: TypeAnnotator, expression: exp.Median) -> exp.Median:
    """Annotate MEDIAN function with correct return type.

    Based on Snowflake documentation: "Returns a FLOAT or DECIMAL (fixed-point) number, depending upon the input."

    MEDIAN returns:
    - FLOAT/DOUBLE inputs -> DOUBLE (preserve FLOAT type)
    - Other numeric types (INT, BIGINT) -> DECIMAL
    """
    # First annotate the argument to get its type
    expression = self._annotate_by_args(expression, "this")

    # Get the input type
    input_type = expression.this.type

    # If input is FLOAT/DOUBLE, return DOUBLE
    if input_type and input_type.is_type(exp.DataType.Type.DOUBLE, exp.DataType.Type.FLOAT):
        self._set_type(expression, exp.DataType.Type.DOUBLE)
    else:
        # For all other types (INT, BIGINT, NULL, etc.), return DECIMAL
        self._set_type(expression, exp.DataType.Type.DECIMAL)

    return expression

and I updated the tests:

# dialect: snowflake
MEDIAN(tbl.bigint_col) OVER (PARTITION BY 1);
DECIMAL;

# dialect: snowflake
MEDIAN(CAST(100 AS DECIMAL(10,2)));
DECIMAL;

but I got these errors:

======================================================================
FAIL: test_annotate_funcs (tests.test_optimizer.TestOptimizer.test_annotate_funcs) [670, MEDIAN(tbl.bigint_col) OVER (PARTITION BY 1)]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/felix.huang/Fivetran/sqlglot/tests/test_optimizer.py", line 984, in test_annotate_funcs
    self.assertEqual(
AssertionError: 'DECIMAL' != 'DECIMAL(38, 0)'
- DECIMAL
+ DECIMAL(38, 0)


======================================================================
FAIL: test_annotate_funcs (tests.test_optimizer.TestOptimizer.test_annotate_funcs) [671, MEDIAN(CAST(100 AS DECIMAL(10,2)))]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/felix.huang/Fivetran/sqlglot/tests/test_optimizer.py", line 984, in test_annotate_funcs
    self.assertEqual(
AssertionError: 'DECIMAL' != 'DECIMAL(38, 0)'
- DECIMAL
+ DECIMAL(38, 0)

@VaggelisD @geooo109 Do you know why this happens?
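
One way to read these failures, offered as a hedged sketch rather than a diagnosis: the two strings differ only by the `(38, 0)` suffix, which matches Snowflake's default precision and scale for `NUMBER`/`DECIMAL`. If one side of the assertion passes through a step that fills in those defaults and the other does not, a bare `DECIMAL` will never compare equal. The helper below, `with_snowflake_defaults`, is hypothetical and is not sqlglot API; it only reproduces the mismatch with plain strings.

```python
DEFAULT_PARAMS = "(38, 0)"  # Snowflake's default precision and scale for NUMBER/DECIMAL


def with_snowflake_defaults(type_sql: str) -> str:
    """Append default parameters to a bare DECIMAL (hypothetical normalization step)."""
    if type_sql.strip().upper() == "DECIMAL":
        return f"DECIMAL{DEFAULT_PARAMS}"
    return type_sql


bare = "DECIMAL"                               # one side of the failing assertion
expanded = with_snowflake_defaults("DECIMAL")  # assumption: the other side got expanded

print(bare == expanded)                           # the mismatch seen in the traceback
print(with_snowflake_defaults(bare) == expanded)  # equal once both sides are normalized
```

If that reading holds, either building the annotated type with explicit precision and scale or adjusting the fixture's expected value would bring the two sides in line; which one is appropriate is for the maintainers to decide.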
