Skip to content

Feat/stmt2 new bind#35264

Open
sheyanjie-qq wants to merge 3 commits intomainfrom
feat/stmt2-new-bind
Open

Feat/stmt2 new bind#35264
sheyanjie-qq wants to merge 3 commits intomainfrom
feat/stmt2-new-bind

Conversation

@sheyanjie-qq
Copy link
Copy Markdown
Contributor

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

sheyanjie-qq and others added 3 commits March 9, 2026 17:49
Add columnar binding support for stmt2 to improve bulk insert performance
with column-organized data.

Key changes:
- Add TAOS_STMT2_COLUMN_BIND and TAOS_STMT2_COLUMN_BINDV structures
- Implement taos_stmt2_bind_param_column to convert columnar data to row format
- Implement taos_stmt2_bind_param_column_a for async binding
- Add comprehensive test coverage with 6 test cases

Columnar binding advantages:
- Better cache locality for column-organized data sources
- SIMD optimization opportunities
- Suitable for big data batch inserts

Technical details:
- First column must be tbname for table identification
- Tag columns: num=0 (only first row valid)
- Data columns: num=numRows (all rows valid)
- Field info caching with (stmt, SQL string pointer) key to avoid
  multiple get_fields calls which have side effects

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 29, 2026 13:14
@sheyanjie-qq sheyanjie-qq added the keep branch Please add this label to PR to keep the branch after PR is merged or closed. label Apr 29, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces columnar binding support for the stmt2 interface, including new data structures, API functions, and internal implementation logic to convert columnar data to row format. It also adds a macOS compilation guide, updates the build configuration to include a new test executable, and fixes a bug in the decimal string parsing logic. I have no feedback to provide as there are no review comments to evaluate.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new “columnar bind” API for stmt2 to allow binding data in column-oriented buffers (instead of per-row binds), with supporting prepare-time field caching and async binding support. It also includes a small fix to decimal exponent parsing and adds a macOS build guide.

Changes:

  • Add public client API/types for stmt2 columnar binding (TAOS_STMT2_COLUMN_BIND(V) + taos_stmt2_bind_param_column(_a)), plus implementation that converts columnar input into existing row-bind flow.
  • Cache prepared field metadata in stmtPrepare2() to avoid reconstructing placeholder/field info during binds; update async bind thread to support columnar mode.
  • Add a new client unit-test executable target for stmt2TestColumn, and add a new BUILD_MACOS.md document.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
source/libs/decimal/src/decimal.c Updates exponent parsing for scientific-notation decimals.
include/client/taos.h Adds new public columnar bind structs and APIs for stmt2.
source/client/inc/clientStmt2.h Extends stmt2 internal structs with cached field metadata + async thread args union for row/columnar binds.
source/client/src/clientStmt2.c Frees cached field metadata on cleanup; populates field cache at prepare; updates async bind thread to call columnar bind.
source/client/src/clientMain.c Implements columnar bind API and refactors async bind entrypoint to share code between row/columnar.
source/client/test/CMakeLists.txt Builds stmt2TestColumn as a unit test executable.
BUILD_MACOS.md Adds macOS build instructions.
.gitignore Adds BUILD_MACOS.md to ignore list.
Comments suppressed due to low confidence (1)

source/client/src/clientMain.c:2935

  • stmtBindParamAImpl() leaks the allocated ThreadArgs *args on early-return paths (e.g., when asyncBindNum > 0) and also when taosStmt2AsyncBind() fails. Ensure args is freed on all failure/return paths after allocation.
  (void)taosThreadMutexLock(&(pStmt->asyncBindParam.mutex));
  if (atomic_load_8((int8_t *)&pStmt->asyncBindParam.asyncBindNum) > 0) {
    (void)taosThreadMutexUnlock(&(pStmt->asyncBindParam.mutex));
    tscError("async bind param is still working, please try again later");
    terrno = TSDB_CODE_TSC_STMT_API_ERROR;
    return terrno;
  }
  (void)atomic_add_fetch_8(&pStmt->asyncBindParam.asyncBindNum, 1);
  (void)taosThreadMutexUnlock(&(pStmt->asyncBindParam.mutex));

  int code_s = taosStmt2AsyncBind(stmtAsyncBindThreadFunc, (void *)args);
  if (code_s != TSDB_CODE_SUCCESS) {
    terrno = code_s;
    (void)taosThreadMutexLock(&(pStmt->asyncBindParam.mutex));
    (void)taosThreadCondSignal(&(pStmt->asyncBindParam.waitCond));
    (void)atomic_sub_fetch_8(&pStmt->asyncBindParam.asyncBindNum, 1);
    (void)taosThreadMutexUnlock(&(pStmt->asyncBindParam.mutex));
    tscError("async bind failed, code:%d , %s", code_s, tstrerror(code_s));
  }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 2905 to +2910
ThreadArgs *args = (ThreadArgs *)taosMemoryMalloc(sizeof(ThreadArgs));
args->stmt = stmt;
args->bindv = bindv;
args->col_idx = col_idx;
args->fp = fp;
args->param = param;
args->is_columnar = is_columnar;
if (is_columnar) {
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stmtBindParamAImpl() allocates args but never checks for allocation failure before dereferencing, which can crash on OOM. Please add a NULL check and return TSDB_CODE_OUT_OF_MEMORY (or terrno) early.

Copilot uses AI. Check for mistakes.
Comment on lines +3627 to +3636
// Check if variable length type
bool isVarLength = IS_VAR_DATA_TYPE(colBind->buffer_type) ||
colBind->buffer_type == TSDB_DATA_TYPE_DECIMAL ||
colBind->buffer_type == TSDB_DATA_TYPE_DECIMAL64;

// Set buffer, length, and is_null with offset
if (isVarLength) {
bind->buffer = (char*)colBind->buffer + colOffsets[fieldIdx];
bind->length = colBind->length + startRow;
bind->is_null = colBind->is_null ? (char*)colBind->is_null + startRow : NULL;
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In processColumnsForTable(), for variable-length types you do pointer arithmetic on colBind->length and colBind->buffer without validating they are non-NULL. If a caller passes a var-length column with length==NULL (or buffer==NULL with nonzero lengths), this will crash before deeper validation runs. Add explicit checks for var-length columns (and tbname) and return an invalid-parameter error instead of dereferencing NULL.

Copilot uses AI. Check for mistakes.
Comment on lines +3776 to +3782
char *tbname = (char*)bindv->columns[tbnameColIdx].buffer + tbnameOffset;
int32_t tbname_len = bindv->columns[tbnameColIdx].length[startRow];

if (tbname == NULL || tbname_len == 0) {
STMT2_ELOG("Table name at startRow %d is NULL or empty", startRow);
terrno = TSDB_CODE_INVALID_PARA;
goto _return;
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When grouping super-table rows by tbname, the code ignores the is_null bitmap for the tbname column. If is_null[startRow] is set, the current logic still reads buffer + offset and may mis-group or crash. Consider validating is_null for tbname rows (treating NULL as invalid) before using length/memcmp.

Copilot uses AI. Check for mistakes.
TARGET_LINK_LIBRARIES(
stmt2TestColumn PRIVATE
os util common transport parser catalog scheduler ${TAOS_NATIVE_LIB_STATIC} qcom executor function
)
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stmt2TestColumn is added as a test executable, but it is not registered with add_test() in the Linux test block below, so it won’t run under ctest/CI even though it exercises the new columnar binding API. Please add an add_test(NAME stmt2TestColumn COMMAND stmt2TestColumn) entry (guarded consistently with the other tests).

Suggested change
)
)
if (NOT TD_WINDOWS)
add_test(NAME stmt2TestColumn COMMAND stmt2TestColumn)
endif()

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal keep branch Please add this label to PR to keep the branch after PR is merged or closed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants