• Converting from utf-16le to ascii or utf-8

    From Simon Geard@21:1/5 to All on Wed Aug 30 19:37:59 2023
    So I have a data file containing ASCII characters which is UTF-16LE
    encoded (output from PowerShell). I'd like to do two things with this
    file from Tcl on both Windows and Linux:

    1) detect that it is utf-16le
    2) convert it to ascii (or utf-8)

    Looking at the output from [encoding names] on Linux, there is no
    utf-16le. Does it have a different name? It looks to me as if I could
    just read the file two bytes at a time and drop the second byte, but I
    was hoping I could use fconfigure and encoding.

    Thanks for any ideas.

    Simon

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rich@21:1/5 to Simon Geard on Wed Aug 30 19:33:19 2023
    Simon Geard wrote:
    > So I have a data file containing ASCII characters which is UTF-16LE
    > encoded (output from PowerShell). I'd like to do two things with this
    > file from Tcl on both Windows and Linux:
    >
    > 1) detect that it is utf-16le

    For detection, a UTF-16 encoded file is /supposed/ to begin with a Byte
    Order Mark (https://en.wikipedia.org/wiki/Byte_order_mark). Assuming it
    has one, the first two bytes tell you the byte order: FF FE means
    UTF-16LE, FE FF means UTF-16BE.
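
The byte order mark check can be sketched in a few lines. This is an illustration of the logic only, written in Python rather than Tcl; the function name `detect_utf16_bom` is hypothetical, and it assumes the file really does start with a BOM, which PowerShell output normally does but which is not guaranteed in general.

```python
import codecs


def detect_utf16_bom(path):
    """Return 'utf-16-le', 'utf-16-be', or None, judged by the first two bytes.

    Caveat: a UTF-32LE BOM (FF FE 00 00) starts with the same two bytes as
    the UTF-16LE BOM, so this check alone cannot tell the two apart.
    """
    with open(path, "rb") as f:      # binary mode: no decoding, raw bytes
        head = f.read(2)
    if head == codecs.BOM_UTF16_LE:  # b'\xff\xfe'
        return "utf-16-le"
    if head == codecs.BOM_UTF16_BE:  # b'\xfe\xff'
        return "utf-16-be"
    return None                      # no UTF-16 BOM found
```

A file without a BOM returns None here, so the caller still has to decide what to assume in that case.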

    > 2) convert it to ascii (or utf-8)

    It looks like there is no utf-16 support in [encoding] yet. There is a
    TIP for it (https://core.tcl-lang.org/tips/doc/main/tip/547.md), but it
    is targeted at Tcl 8.7.

    > Looking at the output from [encoding names] on Linux, there is no
    > utf-16le. Does it have a different name? It looks to me as if I could
    > just read the file two bytes at a time and drop the second byte, but I
    > was hoping I could use fconfigure and encoding.

    Since you have Linux: the iconv command handles converting from both
    UTF-16LE and UTF-16BE, so you might be able to convert the file on
    Linux and then use the converted file afterward.
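
For comparison, the conversion step itself is small once the input is known to be UTF-16LE. The following is a minimal sketch in Python, not Tcl and not iconv; the function name `utf16le_file_to_utf8` and its parameters are hypothetical. It reads the whole file into memory, which is assumed to be acceptable for the file sizes in question.

```python
import codecs


def utf16le_file_to_utf8(src, dst):
    """Decode src as UTF-16LE and write the same text to dst as UTF-8.

    Returns the decoded text. Raises UnicodeDecodeError if src is not
    valid UTF-16LE (for example, an odd number of bytes).
    """
    with open(src, "rb") as f:
        raw = f.read()
    # Drop a leading byte order mark so it does not end up in the output.
    if raw.startswith(codecs.BOM_UTF16_LE):
        raw = raw[len(codecs.BOM_UTF16_LE):]
    text = raw.decode("utf-16-le")
    # newline="" keeps line endings (e.g. CRLF from Windows) unchanged.
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
    return text
```

Unlike dropping every second byte, decoding properly also handles any non-ASCII characters that turn up in the file. If the data is pure ASCII, the UTF-8 output is byte-for-byte identical to an ASCII encoding of the same text.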

  • From Peter Dean@21:1/5 to Simon Geard on Thu Aug 31 15:03:44 2023
    See https://wiki.tcl-lang.org/page/Unicode+file+reader

    but I just use Notepad++ and save as UTF-8.

    Peter
