-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Support new syntax for Backup/Restore #52729
base: main
Are you sure you want to change the base?
Conversation
@@ -8246,6 +8356,7 @@ public List<String> getColumnNames(StarRocksParser.ColumnAliasesContext context) | |||
} | |||
|
|||
protected NodePosition createPos(ParserRuleContext context) { | |||
Preconditions.checkState(context != null); | |||
return createPos(context.start, context.stop); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who sings the song "Runnin' Down a Dream"?
@@ -136,7 +151,7 @@ public Void visitBackupStatement(BackupStmt backupStmt, ConnectContext context) | |||
|
|||
// analyze and get Function for stmt | |||
List<FunctionRef> fnRefs = backupStmt.getFnRefs(); | |||
if (!withOnClause) { | |||
if (!withOnClause || allFunction) /* without `On` or contains `ALL` */ { | |||
fnRefs.add(new FunctionRef(database.getFunctions())); | |||
} else { | |||
backupStmt.getFnRefs().stream().forEach(x -> x.analyzeForBackup(database)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
Potential unintended backup inclusion due to condition logic.
You can modify the code like this:
// Adjust the conditional logic for backing up all tables, MVs, or views correctly
if (!withOnClause && (allTable || allMV || allView)) {
for (Table tbl : GlobalStateMgr.getCurrentState().getLocalMetastore().getTables(database.getId())) {
if (!Config.enable_backup_materialized_view && tbl.isMaterializedView()) {
LOG.info("Skip backup materialized view: {} because " +
"backup of materialized views is disabled", tbl.getName());
continue;
}
if (tbl.isTemporaryTable()) {
continue;
}
if ((tbl.isOlapTable() && !allTable) ||
(tbl.isOlapMaterializedView() && !allMV) ||
(tbl.isOlapView() && !allView)) {
continue;
}
TableName tableName = new TableName(dbName, tbl.getName());
TableRef tableRef = new TableRef(tableName, null, null);
tableRefs.add(tableRef);
}
}
This modification clarifies that all specific objects should only be added when the corresponding flags are set, correcting logical errors from the original inclusive checks.
.filter(x -> x.isOlapView()).map(x -> x.getName()).collect(Collectors.toSet())); | ||
} | ||
} | ||
|
||
// only retain restore tables | ||
jobInfo.retainTables(allTbls); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most risky bug in this code is:
A null pointer dereference when dbName
is null during the creation or access of the database.
You can modify the code like this:
// check if db exist
String dbName = stmt.getDbName();
Database db = globalStateMgr.getLocalMetastore().getDb(dbName);
if (db == null) {
if (stmt instanceof RestoreStmt) {
// Ensure jobInfo is initialized before using it
if (jobInfo == null && dbName != null) {
ErrorReport.reportDdlException(ErrorCode.ERR_COMMON_ERROR, "JobInfo is not available for database: " + dbName);
}
if (dbName == null) {
// Check if jobInfo has the necessary data to avoid null pointer exception
if (jobInfo == null || jobInfo.dbName == null) {
ErrorReport.reportDdlException(ErrorCode.ERR_COMMON_ERROR,
"Cannot determine database name from job info during restore process");
}
// use dbName in snapshot if target dbName is null
dbName = jobInfo.dbName;
}
try {
globalStateMgr.getLocalMetastore().createDb(dbName, null);
db = globalStateMgr.getLocalMetastore().getDb(dbName);
} catch (Exception e) {
ErrorReport.reportDdlException(ErrorCode.ERR_COMMON_ERROR,
"Cannot create database: " + dbName + " in restore process");
}
} else {
ErrorReport.reportDdlException(ErrorCode.ERR_BAD_DB_ERROR, dbName);
}
}
Signed-off-by: srlch <[email protected]>
d0dc1d4
to
98144e8
Compare
Signed-off-by: srlch <[email protected]>
[FE Incremental Coverage Report]✅ pass : 60 / 69 (86.96%) file detail
|
: tableDesc | FUNCTION qualifiedName | ||
backupRestoreObjectDesc | ||
: backupRestoreTableDesc | ||
| (ALL (FUNCTION | FUNCTIONS) | (FUNCTION | FUNCTIONS) qualifiedName (AS identifier)?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is ( )
needed here? This syntax can't restrict that there is only one ALL FUNCTIONS or multiple specified FUNCTIONS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, maybe I want to restore all Functions, and restore some of them with alias. Can it be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
()
is no needed here, ()
is just a marker for the syntax definition in .g4
file. ALL FUNCTIONS
and FUNCTIONS
can be specified multiple time, check definition in backupStatement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If use ALL FUNCTIONS
, alias can not be set in this case
@BaseMessage("Specify alias for backup object is forbidden in BACKUP stmt") | ||
String unsupportedSepcifyAliasInBackupStmt(); | ||
|
||
@BaseMessage("`ON` clause is forbidden if no Database explicitly completely in Restore stmt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the meaning of explicity completely
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -222,4 +222,13 @@ public interface ParserErrorMsg { | |||
String nullIdentifierCancelBackupRestore(); | |||
@BaseMessage("Value count in PIVOT {0} must match number of FOR columns {1}") | |||
String pivotValueArityMismatch(int a0, int a1); | |||
|
|||
@BaseMessage("Specify dbName after snapshot name is forbidden if the DbName is specified explicitly in BACKUP/RESTORE") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after -> before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
this.withOnClause = !(this.tblRefs.isEmpty() && this.fnRefs.isEmpty()); | ||
this.originDbName = ""; | ||
|
||
this.withOnClause = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not directly set the default in variable declaration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
public ParseNode visitBackupStatement(StarRocksParser.BackupStatementContext context) { | ||
QualifiedName qualifiedName = getQualifiedName(context.qualifiedName()); | ||
LabelName labelName = qualifiedNameToLabelName(qualifiedName); | ||
private ParseNode getObjectRef(StarRocksParser.QualifiedNameContext qualifiedNameContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be clearer to split it into 2 functions: getFunctionRef, getTableRef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
FROM identifier | ||
(ON '(' restoreObjectDesc (',' restoreObjectDesc) * ')')? | ||
FROM repoName=identifier | ||
(DATABASE dbName=identifier (AS dbAlias=identifier)?)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be simpler to extract a sub-statement like:
backupRestorePackageStmt
DATABASE dbName=identifier (AS dbAlias=identifier)? |
.....
Other pageage types will be supported/used later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will consider this change in next pr for external catalog. It seems that backupRestorePackageStmt
designed for DATABASE
and CATALOG
is better.
return new BackupStmt(labelName, repoName, tblRefs, fnRefs, properties, createPos(context)); | ||
AbstractBackupStmt stmt = null; | ||
if (backupContext != null) { | ||
stmt = (AbstractBackupStmt) (new BackupStmt(labelName, repoName, mixTblRefs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(AbstractBackupStmt)
is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
String originDb = null; | ||
|
||
boolean withOnClause = false; | ||
boolean allTable = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, it will be simpler to use vector to store allXxx and Refs, like:
- ArrayList allObjects = false * 4. with enum TABLE/MV/VIEW/FUNCTION.
- ArrayList<ArrayList<>> allRefs = {mixTblRefs, mixTblRefs, mixTblRefs, new List()};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed NO.1, for NO.2 it seems that keep the original implementation is better, because we will treat TABLE
/MV
/VIEW
as TableRef
.
Signed-off-by: srlch <[email protected]>
b465aa8
to
90b01e5
Compare
Signed-off-by: srlch <[email protected]>
|
||
public class AbstractBackupStmt extends DdlStmt { | ||
public enum BackupRestoreAllMarker { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BackupObjectType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -223,12 +223,12 @@ public interface ParserErrorMsg { | |||
@BaseMessage("Value count in PIVOT {0} must match number of FOR columns {1}") | |||
String pivotValueArityMismatch(int a0, int a1); | |||
|
|||
@BaseMessage("Specify dbName after snapshot name is forbidden if the DbName is specified explicitly in BACKUP/RESTORE") | |||
@BaseMessage("Specify dbName before snapshot name is forbidden if the DbName is specified explicitly in BACKUP/RESTORE") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifying
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
} | ||
TableRef tableRef = new TableRef(tableName, alias, partitionNames, position); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not directly return new TableRef(...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -3424,58 +3423,58 @@ private ParseNode parseBackupRestoreStatement(ParserRuleContext context) { | |||
} | |||
|
|||
if (specifiedFunction) { | |||
if (allFunction) { | |||
if (allMarker.contains(AbstractBackupStmt.BackupRestoreAllMarker.FUNCTION)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
directly import xxx.AbstractBackupStmt.BackupObjectType;
Then usage is simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
Signed-off-by: srlch <[email protected]>
Signed-off-by: srlch <[email protected]>
Signed-off-by: srlch <[email protected]>
d3c18eb
to
1c8fcb3
Compare
Quality Gate failedFailed conditions See analysis details on SonarQube Cloud Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE |
Signed-off-by: srlch <[email protected]>
Why we need new syntax for Backup/Restore
New syntax design for Backup/Restore
The behavioral changes in Backup/Restore syntax
Expansion of
ON
clause:We introduce the key word
(TABLE(S)/VIEW(S)/MATERIALIZED VIEW(S)/FUNCTION(S))
to identify different type of Backup/Restore object and useALL
to represent all objects of a certain type which is much more clear the before.Allow to specify database explicitly and separated from snapshot name:
Backup:
User can specify database after DATABASE key word or before snapshot name as before.Restore:
DATABASE
keyword) withoutON
clause, it will create database with the same name in snapshot file and restore all data into it.DATABASE
keyword, no dbName can be specified before snapshot name.DATABASE
keyword must exactly match in snapshot file.DATABASE <db_name> [AS <db_alias>]
means that restore the database named by db_name in snapshot to the database named by db_alias in current cluster. IfAS
is missing, restore to the database with the same name.Fixes #52746
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: