Description
If a SQL text file is encoded as ANSI (as opposed to UTF-8 or similar), the newer Go version of sqlcmd will not correctly parse non-ASCII characters.
For example, a file may contain non-breaking spaces (character 160), which T-SQL generally treats identically to a normal space. In ANSI Windows-1252, this character is encoded as the single byte hex A0.
The Go version of sqlcmd appears to assume all files are UTF encoded: it treats such a character as unknown and replaces it with Unicode character 65533. That is consistent with assuming UTF-8 encoding, because the single byte A0 is not valid UTF-8.
The attached file is a simple example txt file encoded using the Windows notepad as ANSI, containing "SELECT{Non-breaking-space}CURRENT_TIMESTAMP"
It can be run in sqlcmd with a command like:
sqlcmd -i testfile.txt
The original ODBC version of sqlcmd has no problem running the above file, returning the expected timestamp.
The Go version, however, fails:
"Could not find stored procedure 'SELECT�CURRENT_TIMESTAMP'."
The behavior of the Go sqlcmd should either match the ODBC behavior, or the lack of support for ANSI-encoded text files should be documented as one of the "Breaking changes from sqlcmd (ODBC)".
Activity
shueybubbles commented on Dec 21, 2023
thx for opening the issue. This is related to #111
ODBC SqlCmd treats non-Unicode/non-UTF8 files as "system code page encoded" and converts them to UTF16 on read using the Win32 API MultiByteToWideChar, at least on Windows. I am not sure what their Linux version does.
There's not much support in the Go dev community for code pages, and we encourage folks who develop cloud-first applications that run on Linux etc. to use UTF8 or UTF16 encoded files instead of relying on ambient properties like the system code page.
I do want to support the code page conversions but we just haven't had the time to do the work yet. I will update the README appropriately.
shueybubbles commented on Dec 21, 2023
this content is relevant for ODBC SqlCmd on Linux and may guide our implementation.
I don't know offhand what the Go method to detect "current locale" is.
https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/programming-guidelines?view=sql-server-ver16#character-set-support