Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement] support struct column late materialized #28222

Merged
merged 5 commits into from
Sep 5, 2023

Conversation

Seaven
Copy link
Contributor

@Seaven Seaven commented Jul 28, 2023

Fixes #issue

support late materialized on struct column

  • Remove append default value on ARRAY/MAP/STRUCT column
  • Add fill_subfield_column to support late materialized subfield on struct column
  • Only support prune nested struct type, like: /structA/structB/structC, can't prune the subfield of structB(/structA/array/structB), because the interface of array/map doesn't support now, array/map need flatten rows when fetch values by rowid, it's too complex

Doesn't support on map/array, because only few scenarios when use MAP_SIZE/ARRAY_LENGTH = xx as scan predicate

SSB scalar column struct column before PR struct column after PR
Q01 61 1098 526
Q02 17 224 57
Q03 50 359 269
Q04 1330 8471 6529
Q05 416 8154 6837
Q06 175 5841 4883
Q07 1327 8431 6527
Q08 1052 6965 6199
Q09 293 3738 3108
Q10 23 78 66
Q11 1707 10353 8623
Q12 165 2287 1889
Q13 100 1726 1475

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr will affect users' behaviors
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.1
    • 3.0
    • 2.5
    • 2.4

void resize(size_t n) override { _size = n; }
void resize(size_t n) override {
if (_size == 0) {
_data->resize(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need call resize when _size == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need resize column to keep same rows when fill default value

be/src/common/config.h Outdated Show resolved Hide resolved
Comment on lines +143 to +148
void append_default(size_t count) override {
if (_size == 0) {
_data->append_default(1);
}
_size += count;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reject if data is not emtpy?

Copy link
Contributor Author

@Seaven Seaven Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, don't need reject, only need modified size

Comment on lines +90 to +91
array_column->elements_column()->append_default(1);
array_column->elements_column() = ConstColumn::create(array_column->elements_column(), num_to_read);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array_column must be emtpy before append_default()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, must be empty

Comment on lines +222 to +231

size_t size_to_read = 0;
for (size_t i = 0; i < size; ++i) {
RETURN_IF_ERROR(_array_size_iterator->seek_to_ordinal_and_calc_element_ordinal(rowids[i]));
size_t element_ordinal = _array_size_iterator->element_ordinal();
RETURN_IF_ERROR(_element_iterator->seek_to_ordinal(element_ordinal));
size_to_read += array_size.get_data()[i];
}

array_column->elements_column()->append_default(size_to_read);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in else {} ?

Copy link
Contributor Author

@Seaven Seaven Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, for mock read data


StructColumn* struct_column = nullptr;
NullColumn* null_column = nullptr;
if (dst->is_nullable()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would dst be set const ? may be a subfield in struct{struct{},struct{stuct{}}}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the method only call once on empty chunk, so I think never be const

@Seaven Seaven force-pushed the dev2 branch 2 times, most recently from 2626df8 to a5740fe Compare August 15, 2023 07:35
fzhedu
fzhedu previously approved these changes Aug 15, 2023
Copy link
Contributor

@fzhedu fzhedu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wanpengfei-git
Copy link
Collaborator

[FE PR Coverage Check]

😍 pass : 4 / 5 (80.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/plan/PlanFragmentBuilder.java 2 3 66.67% [661]
🔵 com/starrocks/qe/SessionVariable.java 2 2 100.00% []

@sonarcloud
Copy link

sonarcloud bot commented Aug 24, 2023

SonarCloud Quality Gate failed.    Quality Gate failed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell E 19 Code Smells

0.0% 0.0% Coverage
0.0% 0.0% Duplication

warning The version of Java (11.0.20) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17.
Read more here

idea Catch issues before they fail your Quality Gate with our IDE extension sonarlint SonarLint

@wanpengfei-git
Copy link
Collaborator

[FE Incremental Coverage Report]

😍 pass : 4 / 5 (80.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/plan/PlanFragmentBuilder.java 2 3 66.67% [663]
🔵 com/starrocks/qe/SessionVariable.java 2 2 100.00% []

@wanpengfei-git
Copy link
Collaborator

[BE Incremental Coverage Report]

😞 fail : 45 / 246 (18.29%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 src/storage/rowset/fill_subfield_iterator.h 0 5 00.00% [32, 33, 35, 51, 53]
🔵 src/column/const_column.h 0 12 00.00% [81, 82, 83, 85, 136, 137, 138, 140, 143, 144, 145, 147]
🔵 src/storage/rowset/fill_subfield_iterator.cpp 0 12 00.00% [19, 20, 23, 24, 27, 28, 31, 32, 35, 36, 39, 40]
🔵 src/storage/rowset/column_iterator.h 0 4 00.00% [177, 179, 180, 183]
🔵 src/storage/rowset/array_column_iterator.cpp 0 24 00.00% [89, 90, 91, 93, 167, 168, 169, 171, 209, 210, 211, 212, 213, 214, 218, 219, 220, 223, 224, 225, 226, 227, 228, 231]
🔵 src/storage/rowset/struct_column_iterator.cpp 7 109 06.42% [160, 170, 171, 172, 174, 195, 196, 200, 206, 207, 208, 210, 243, 244, 245, 249, 250, 251, 252, 253, 254, 255, 258, 259, 260, 261, 262, 263, 265, 269, 270, 271, 275, 276, 277, 278, 279, 280, 281, 285, 286, 287, 288, 289, 291, 294, 297, 298, 299, 302, 303, 304, 305, 306, 307, 308, 311, 312, 313, 314, 316, 317, 319, 322, 323, 324, 327, 328, 329, 330, 331, 333, 334, 337, 338, 339, 340, 341, 343, 346, 349, 350, 352, 353, 354, 356, 360, 361, 362, 363, 364, 366, 367, 372, 374, 378, 379, 380, 381, 382, 384, 387]
🔵 src/storage/rowset/map_column_iterator.cpp 6 24 25.00% [114, 125, 199, 200, 201, 203, 210, 211, 212, 214, 270, 271, 272, 274, 279, 280, 281, 283]
🔵 src/column/nullable_column.cpp 4 9 44.44% [74, 95, 115, 116, 117]
🔵 src/storage/rowset/segment_iterator.cpp 28 47 59.57% [617, 618, 619, 621, 1343, 1344, 1345, 1346, 1347, 1348, 1349, 1350, 1351, 1407, 1408, 1409, 1580, 1581, 1583]

@@ -288,6 +288,7 @@ public class SessionVariable implements Serializable, Writable, Cloneable {
"enable_rewrite_groupingsets_to_union_all";

public static final String CBO_USE_DB_LOCK = "cbo_use_lock_db";
public static final String CBO_PREDICATE_SUBFIELD_PATH = "cbo_enable_predicate_subfield_path";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need cbo_ prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for easy search

@Seaven Seaven merged commit 658be9b into StarRocks:main Sep 5, 2023
26 of 29 checks passed
Jay-ju pushed a commit to Jay-ju/starrocks that referenced this pull request Sep 7, 2023
* [Enhancement] support struct column late materialized

---------

Signed-off-by: Seaven <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants