Skip to content

Conversation

@mariadb-LeonidFedorov
Copy link
Collaborator

@mariadb-LeonidFedorov mariadb-LeonidFedorov commented Nov 12, 2025

Fix for GROUP BY Ambiguous Column Reference Bug

Problem Description

When a GROUP BY clause contains a column name that is ambiguous (i.e., it matches both a table column and a SELECT list alias pointing to a different column), ColumnStore was throwing an error:

ERROR 1178 (42000): The storage engine for the table doesn't support MCS-2016: Non supported item 'cat_id' on the GROUP BY list.

The expected behavior is to handle the ambiguity gracefully, similar to how standard MariaDB would handle it (with a warning if appropriate).

Example Query

CREATE TABLE test2 (
  id INT UNSIGNED NOT NULL,
  GBP DOUBLE UNSIGNED NOT NULL,
  cat_id INT UNSIGNED NOT NULL,
  cat_id_level_0 INT UNSIGNED NOT NULL
) ENGINE=Columnstore;

-- This query has an ambiguous GROUP BY reference
-- 'cat_id' could refer to:
--   1. The table column 'cat_id'
--   2. The SELECT alias 'cat_id' (which is cat_id_level_0 AS cat_id)

SELECT cat_id_level_0 AS entity_id, SUM(GBP) AS spend, cat_id_level_0 AS cat_id
FROM test2
GROUP BY cat_id;

Root Cause

In the processGroupBy() function in ha_mcs_execplan.cpp, when processing a FIELD_ITEM in the GROUP BY clause:

  1. buildSimpleColumn() was called to create a column reference
  2. If this failed (returned NULL) due to ambiguity or other issues, the code immediately set nonSupportItem, triggering the error
  3. The code didn't check if there was a matching alias in the SELECT list that could resolve the reference

Solution

Modified processGroupBy() function in ha_mcs_execplan.cpp to handle ambiguous column references more gracefully:

  1. When buildSimpleColumn() succeeds, use the table column and check SELECT list by sameColumn() match
  2. If table column found in SELECT list, also check for alias match as a fallback
  3. When buildSimpleColumn() returns NULL or non-SimpleColumn, match by alias name in SELECT list

Copy link
Contributor

@mariadb-SergeyZefirov mariadb-SergeyZefirov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with changes as the solve exactly what need to be solved and thus can be merged.

I have a question about nested aggregate unwrapping, but I do not see it as super important. If we differ in behavior from the server for single aggregate, we also may differ for nested aggregates.

ReturnedColumn* inner_col = agg_check->aggParms()[0].get();
AggregateColumn* inner_agg = dynamic_cast<AggregateColumn*>(inner_col);

if (inner_agg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be recursive? E.g., SUM(COUNT(group_by_col)).

@mariadb-AlekseiBukhalov mariadb-AlekseiBukhalov force-pushed the MCOL-6088-groupby branch 6 times, most recently from 65a6525 to eb82161 Compare November 19, 2025 17:07
…UP BY alias` works correctly even when `col` is wrapped in an implicit aggregate wrapper.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants