By and large I have no problems working with Tcl and unicode strings of Chinese characters. However, certain variant forms become garbled when saving. How should I configure my output stream to overcome this? The particular character is U+21E40.
https://ctext.org/dictionary.pl?if=en&char=%F0%A1%B9%80
Many thanks in advance for any comments.
W.
At least in my tcl 8.6.13
set x \u21E40
sets x to \u21E4 followed by a "0".
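A small sketch of why that happens, assuming Tcl 8.6 or later: the \u escape reads at most four hex digits, while \U reads up to eight. The variable names are only for illustration, and what the \U form yields for a non-BMP character depends on the Tcl build.

```tcl
# \u reads at most four hex digits: this is U+21E4 followed by the character "0".
set x "\u21E40"
puts [string length $x]      ;# 2
puts [string index $x 1]     ;# 0

# \U reads up to eight hex digits, so the whole code point fits in one escape.
# The reported length depends on the Tcl version and build, so no value is claimed here.
set y "\U00021E40"
puts [string length $y]
```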
Your code saves the U+21E40 char on disk as the byte sequence
\360\241\271\200
(I have no clue whether this is correct utf-8 for \u21E40).
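The byte sequence can be checked by hand, independent of the Tcl version, by applying the four-byte UTF-8 layout to the code point. A minimal sketch; the proc name utf8Bytes4 is made up for this example and it only handles code points in the range U+10000 to U+10FFFF.

```tcl
# Four-byte UTF-8 layout: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
# (hypothetical helper, valid only for U+10000..U+10FFFF)
proc utf8Bytes4 {cp} {
    list [expr {0xF0 | ($cp >> 18)}] \
         [expr {0x80 | (($cp >> 12) & 0x3F)}] \
         [expr {0x80 | (($cp >> 6) & 0x3F)}] \
         [expr {0x80 | ($cp & 0x3F)}]
}

set bytes [utf8Bytes4 0x21E40]
puts [lmap b $bytes {format %02x $b}]   ;# f0 a1 b9 80
puts [lmap b $bytes {format %o $b}]     ;# 360 241 271 200
```

So \360\241\271\200 is indeed the UTF-8 form of U+21E40.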
In reply to WJG's post of 05.04.2023, 09:20:
Dear WJG,
thanks for your message!
https://onlineunicodetools.com/convert-unicode-to-utf8
translates 𡹀 to f0 a1 b9 80.
Ralf gave the same bytes (in octal ;-) ): \360\241\271\200
which is correct:
% set s \360\241\271\200
𡹀
(bin) 2 % scan $s %c%c%c%c
240 161 185 128
(bin) 3 % format "%X %x %x %x" {*}[scan $s %c%c%c%c]
F0 a1 b9 80
So, that should work for any other program.
Note that the internal representation in Tcl 8.6.11 through 8.7 is a
pair of surrogates for any non-BMP character:
% set c 𡹀
𡹀
(bin) 5 % string length $c
2
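For the curious, the two surrogates follow from the standard UTF-16 arithmetic. A minimal sketch; the proc name surrogatePair is invented for this example.

```tcl
# Split a non-BMP code point into its UTF-16 high and low surrogates.
# (hypothetical helper, valid only for U+10000..U+10FFFF)
proc surrogatePair {cp} {
    set v [expr {$cp - 0x10000}]
    list [format %04X [expr {0xD800 + ($v >> 10)}]] \
         [format %04X [expr {0xDC00 + ($v & 0x3FF)}]]
}

puts [surrogatePair 0x21E40]   ;# D847 DE40
```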
This is an ultra-hack and may be fixed by:
1) using Tcl 9.x
2) compiling Tcl with TCL_UTF_MAX larger than 3, which widens Tcl_UniChar (non-standard).
You may try AndroWish and friends to get real support for this now, as
they use option 2 above.
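To see what a given installation actually does, one can write the character through a utf-8 channel and look at the raw bytes. This is only a sketch: the file name variant.txt is an arbitrary choice, and the outcome depends on the Tcl version and build, so no result is promised. The hoped-for output is f0a1b980.

```tcl
# Hypothetical output path, chosen only for this example.
set path variant.txt

# Write the character through a channel configured for UTF-8.
set f [open $path w]
fconfigure $f -encoding utf-8
puts -nonewline $f "\U00021E40"
close $f

# Read the file back as raw bytes and print them in hex next to the Tcl version.
set f [open $path rb]
set raw [read $f]
close $f
puts "[info patchlevel]: [binary encode hex $raw]"
```

Anything other than f0a1b980 suggests the build does not handle non-BMP characters on output.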
And you may write a big posting everywhere:
- that you are confused
- and that you want 9.0 now
(sorry, that is half a joke, but partly very true)
Take care,
Harald
In reply to Harald Oehlmann's post of Wednesday, 5 April 2023 at 19:19:32 UTC+1:
Thanks for taking the time to answer my post. So, no active support for this range. I'm coding on Linux, so I'll explore using the glib route. TBH, it's not that much of a nightmare: the character I presented is basically a 'typo' that made its way into the main dictionaries.
When, if ever, is Tcl 9 to be released to the distros?
WJG