Add identifier unicode support in Mysql, Postgres and Redshift #1933

Merged
merged 3 commits into apache:main from unicode_support
Jul 14, 2025

etgarperets
Contributor

…o added a test for that

@iffyio iffyio left a comment

Thanks @etgarperets, could you also take a look at the CI failure and merge in the latest from main? There are currently some conflicts on the branch.

Comment on lines 15901 to 16140
let unicode_sql = r#"SELECT phoneǤЖשचᎯ⻩☯♜🦄⚛🀄ᚠ⌛🌀 tbl FROM customers"#;
let dialects_supporting_unicode = TestedDialects::new(vec![Box::new(MySqlDialect {}), Box::new(RedshiftSqlDialect {}), Box::new(PostgreSqlDialect {})]);
let _ = dialects_supporting_unicode.parse_sql_statements(unicode_sql).unwrap();

Suggested change
- let unicode_sql = r#"SELECT phoneǤЖשचᎯ⻩☯♜🦄⚛🀄ᚠ⌛🌀 tbl FROM customers"#;
- let dialects_supporting_unicode = TestedDialects::new(vec![Box::new(MySqlDialect {}), Box::new(RedshiftSqlDialect {}), Box::new(PostgreSqlDialect {})]);
- let _ = dialects_supporting_unicode.parse_sql_statements(unicode_sql).unwrap();
+ let sql = r#"SELECT phoneǤЖשचᎯ⻩☯♜🦄⚛🀄ᚠ⌛🌀 tbl FROM customers"#;
+ let dialects = TestedDialects::new(vec![Box::new(MySqlDialect {}), Box::new(RedshiftSqlDialect {}), Box::new(PostgreSqlDialect {})]);
+ let _ = dialects.verified_stmt(sql);

@@ -15895,3 +15895,11 @@ fn parse_create_procedure_with_parameter_modes() {
_ => unreachable!(),
}
}

#[test]
fn test_unicode_support() {

Suggested change
- fn test_unicode_support() {
+ fn test_identifier_unicode_support() {

@@ -51,7 +51,7 @@ impl Dialect for MySqlDialect {
}

fn is_identifier_part(&self, ch: char) -> bool {
- self.is_identifier_start(ch) || ch.is_ascii_digit()
+ self.is_identifier_start(ch) || ch.is_ascii_digit() || !ch.is_ascii()

Can we add a comment to these highlighting that they implement unicode character support?
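
For readers following the thread, here is a minimal standalone sketch of what the changed predicate does. It does not use the crate's `Dialect` trait: `is_identifier_start` below is a simplified stand-in (an assumption, not the dialect's actual start rule), and `scan_identifier` is a hypothetical helper added only for illustration. Only the body of `is_identifier_part` mirrors the line changed in the diff.

```rust
// Simplified stand-in for a dialect's start rule.
// Assumption for illustration only; the real dialects accept more characters.
fn is_identifier_start(ch: char) -> bool {
    ch.is_alphabetic() || ch == '_'
}

// Mirrors the changed line in the diff: besides start characters and ASCII
// digits, any non-ASCII character may continue an unquoted identifier.
fn is_identifier_part(ch: char) -> bool {
    is_identifier_start(ch) || ch.is_ascii_digit() || !ch.is_ascii()
}

// Hypothetical helper: returns the leading unquoted identifier of `input`,
// or None if the first character cannot start an identifier.
fn scan_identifier(input: &str) -> Option<&str> {
    let mut chars = input.char_indices();
    let (_, first) = chars.next()?;
    if !is_identifier_start(first) {
        return None;
    }
    // Find the byte offset of the first character that cannot be part of
    // an identifier; if there is none, the identifier runs to the end.
    let end = chars
        .find(|&(_, ch)| !is_identifier_part(ch))
        .map(|(idx, _)| idx)
        .unwrap_or(input.len());
    Some(&input[..end])
}
```

Under these assumptions, `scan_identifier("phoneǤЖ🦄 tbl")` stops at the space, so symbols and emoji after the first character stay inside the identifier instead of ending it. The `!ch.is_ascii()` check is deliberately broad: it accepts every non-ASCII character rather than only those a Unicode identifier property would classify as letters.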

@iffyio iffyio changed the title Added unquoted identifiers unicode support for mySql, postgreSqp, als… Add identifier unicode support in Mysql, Postgres and Redshift Jul 11, 2025

@iffyio iffyio left a comment

LGTM! Thanks @etgarperets!
cc @alamb

@iffyio iffyio merged commit c5e6ba5 into apache:main Jul 14, 2025
10 checks passed
@etgarperets etgarperets deleted the unicode_support branch July 14, 2025 12:27