I originally posted:
The man page says strcasecmp_l() takes an explicit locale.
The implication is that strcasecmp() uses the current locale
(presumably as set by setlocale()).
to which Christian Weisgerber kindly replied:
Yes.
src/lib/libc/string/strcasecmp.c:
    int
    strcasecmp(const char *s1, const char *s2)
    {
            return strcasecmp_l(s1, s2, __get_locale());
    }
:-)
After calling setlocale(LC_ALL, "uk_UA.UTF-8"), I'm seeing that
strcasecmp() is not, in fact, case-independently matching non-ASCII
UTF-8 strings: it's case sensitive (the ASCII equivalent in this
case being that "Abc" isn't matching "abc").
UTF-8 characters are multibyte. You need to convert the strings
to wide characters and use wcscasecmp().
As one would expect, and perfectly reasonable, but something (I forget
what now) led me to think that if strcasecmp() accepted UTF-8 locales,
maybe it *would* be willing to fold case for them, just operating one
byte at a time instead of two.
Thanks for confirming that, Christian. Onward to upgrading this
code that should have been doing that already ...
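
For anyone finding this thread later, a minimal sketch of the
conversion Christian describes might look like the following.
mb_casecmp() is a made-up name for illustration, not a libc function,
and the error handling (return -1 on a bad multibyte sequence or a
failed malloc) is just one assumption about what a caller would want.

```c
#define _POSIX_C_SOURCE 200809L	/* for wcscasecmp() on some systems */
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of libc: compare two multibyte strings
 * case-insensitively by converting both to wide characters first.
 * On success, stores the wcscasecmp() result in *result and returns 0.
 * Returns -1 if either string is not valid in the current locale's
 * encoding or if memory cannot be allocated.
 */
int
mb_casecmp(const char *s1, const char *s2, int *result)
{
	size_t n1, n2;
	wchar_t *w1, *w2;
	int rc = -1;

	/* With a NULL destination, mbstowcs() only counts wide characters. */
	n1 = mbstowcs(NULL, s1, 0);
	n2 = mbstowcs(NULL, s2, 0);
	if (n1 == (size_t)-1 || n2 == (size_t)-1)
		return -1;

	/* One extra element in each buffer for the terminating L'\0'. */
	w1 = malloc((n1 + 1) * sizeof(*w1));
	w2 = malloc((n2 + 1) * sizeof(*w2));
	if (w1 != NULL && w2 != NULL) {
		mbstowcs(w1, s1, n1 + 1);
		mbstowcs(w2, s2, n2 + 1);
		*result = wcscasecmp(w1, w2);
		rc = 0;
	}
	free(w1);
	free(w2);
	return rc;
}
```

The caller still has to call setlocale() first (LC_CTYPE is the
category that matters here), since both mbstowcs() and wcscasecmp()
work from the current locale. Whether a given non-ASCII letter actually
gets case-folded depends on the locale data the libc ships, so it is
worth testing with the real strings rather than assuming.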
-WBE