Bizarre behavior of string.StartsWith #72770
Comments
Tagging subscribers to this area: @dotnet/area-system-globalization
A simple repro in dotnetfiddle: https://dotnetfiddle.net/Y3jLQJ
@zvrba thanks for your report. This is the Unicode collation behavior for the Norwegian culture. To make the prefix check succeed, pass CompareOptions.IgnoreNonSpace:
CultureInfo ci = CultureInfo.GetCultureInfo("nb-NO");
Console.WriteLine(ci.CompareInfo.IsPrefix("aa", "å", CompareOptions.IgnoreNonSpace));
This should make everything work fine. Let me know if you have any more questions, I can help you with them.
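For anyone who wants to see how the options change the result, here is a minimal sketch that runs the same two prefix checks under nb-NO with several CompareOptions values. The names PrefixProbe and Check are made up for illustration. No expected output is given for the culture-sensitive options, since the result depends on the collation data the runtime uses (ICU or NLS).

```csharp
using System;
using System.Globalization;

public static class PrefixProbe
{
    // Hypothetical helper: runs a prefix check with the named culture and the given options.
    // An empty culture name selects the invariant culture.
    public static bool Check(string cultureName, string source, string prefix, CompareOptions options)
    {
        CompareInfo compareInfo = CultureInfo.GetCultureInfo(cultureName).CompareInfo;
        return compareInfo.IsPrefix(source, prefix, options);
    }

    public static void Main()
    {
        CompareOptions[] optionsToTry =
        {
            CompareOptions.None,           // culture-sensitive, default strength
            CompareOptions.IgnoreNonSpace, // culture-sensitive, ignores diacritic-level differences
            CompareOptions.Ordinal,        // plain char-by-char comparison
        };

        foreach (CompareOptions options in optionsToTry)
        {
            bool startsWithA = Check("nb-NO", "aa", "a", options);
            bool startsWithARing = Check("nb-NO", "aa", "å", options);
            Console.WriteLine($"{options,-15} StartsWith a: {startsWithA}, StartsWith å: {startsWithARing}");
        }
    }
}
```

Only the CompareOptions.Ordinal row is independent of culture data: there "aa" starts with "a" and does not start with "å".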
Hi. Thanks for the reply. I do have a question: I want
Also, I'm questioning the decision that
Obviously, I'm not a Unicode expert and really do not want to become one. The program I'm writing has to process Unicode strings, but the processing should be neutral with respect to the user's OS locale. As another example, a person running the program under a German locale should be able to "sanely" search (wrt
[1] Now, how does a German enter French characters under a German locale/culture into the search box? Copy-paste!
EDIT: Another inconsistency. Look
No matter how hard I try, I cannot make sense of this. (Yes, I know, there is an explanation. But the rather involved explanation does not match the programmer's expectations about how these methods should behave with respect to each other. When
To treat the string like an array of chars, use the overloads that take StringComparison.Ordinal:
"aa".StartsWith("a", StringComparison.Ordinal)
To treat strings in a way that is reasonably logical for English, use StringComparison.InvariantCulture:
"aa".StartsWith("a", StringComparison.InvariantCulture)
Sadly, that's not really the case. Here's a bit explained by Jon Skeet about how
Really, the best advice when it comes to .NET string manipulation is and has always been: never rely on a method's default behavior. Always supply a comparison type at your call site (even if the supplied comparison matches that method's default), just so that you're clear and consistent and not getting surprising behavior like this. You can enable the code analysis rules CA1305 and CA1304 to help you catch those call sites and improve your code quality.
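As a minimal sketch of that advice, the call sites below spell out the comparison explicitly instead of relying on the default. The wrapper names (ExplicitComparison, StartsWithOrdinal, StartsWithOrdinalIgnoreCase) are hypothetical and exist only to keep the example self-contained. Ordinal results do not depend on the current culture.

```csharp
using System;

public static class ExplicitComparison
{
    // Ordinal: compares the UTF-16 code units one by one, with no linguistic rules.
    public static bool StartsWithOrdinal(string text, string prefix)
        => text.StartsWith(prefix, StringComparison.Ordinal);

    // OrdinalIgnoreCase: same, but case differences are ignored.
    public static bool StartsWithOrdinalIgnoreCase(string text, string prefix)
        => text.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);

    public static void Main()
    {
        Console.WriteLine(StartsWithOrdinal("aa", "a"));           // True: the first char is 'a'
        Console.WriteLine(StartsWithOrdinal("aa", "å"));           // False: 'a' and 'å' are different chars
        Console.WriteLine(StartsWithOrdinalIgnoreCase("Aa", "a")); // True: only the case differs
    }
}
```

When linguistic matching is actually wanted (for example in a user-facing search box), the same pattern applies with an explicitly chosen culture-sensitive comparison in place of the ordinal one.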
I would suggest that
@skyoxZ please have a look at dotnet/designs#207 for more info.
Description
Now, I'm aware of https://docs.microsoft.com/en-us/dotnet/standard/base-types/string-comparison-net-5-plus
Please see the screenshot ("Immediate window" in VS debugger) and comments below.
Reproduction Steps
Set locale to Norwegian Bokmål (NOB).
"aa".StartsWith("a") returns false, which might be explainable with the breaking behavior I linked to above. However, "aa".StartsWith("å") returns false as well.
Expected behavior
At least "aa".StartsWith("å") should then return true, as "å" is "linguistically the same" as "aa". Otherwise, you tell me. The observed behavior totally breaks the expectation of a "string being a sequence of characters". It almost makes me want to replace all string types with List<char>.
Actual behavior
Please see the screenshots. Totally crazy, I spent two hours diagnosing the issue.
Regression?
No response
Known Workarounds
Explicitly use StringComparison.Ordinal. Alternatively, set the program's culture to invariant, like this:
System.Globalization.CultureInfo.CurrentCulture = System.Globalization.CultureInfo.InvariantCulture;
Configuration
Windows 11, .NET 6.0.5, x64. Mixed locale: English as display language, several keyboard layouts installed (ENG and NOB).
Other information
No response