• Re: Standards (was Re: Simple string conversion from UCS2 to ISO8859-1)

    From David Brown@21:1/5 to Janis Papanagnou on Thu Feb 27 17:09:11 2025
    On 27/02/2025 08:57, Janis Papanagnou wrote:
    On 26.02.2025 20:50, Lawrence D'Oliveiro wrote:
    On Wed, 26 Feb 2025 07:38:06 +0100, Janis Papanagnou wrote:

    ... e.g. the *.doc format was often named "de facto standard", but
    there was a long period of time neither a public document of that
    "standard" nor was it a standard in the first place ...

    That is still the case.

    What do you mean? - That *.doc is still a de facto standard, or that
    it is still called so?


    .doc has not been the "de facto" standard for a very long time - .docx
    is, and has been for nearly 20 years.

    I've heard of the newer XML-based *.docx format that it is publicly
    documented and even an official formal standard. (If I'm misinformed
    about that feel free to correct that.)

    Again - you are two decades out of touch here! Yes, the OOXML formats
    are documented and are ISO standards. No one (that's not an
    exaggeration) has read them - they are absolute monsters, full of errors
    and inconsistencies, and exist solely because MS was at risk of losing
    their contracts with US Government and Federal offices that required the
    use of open and documented file formats. The level of bribery,
    corruption and abuse involved in getting these "standards" at ISO is a
    long, sad story that is way off-topic here. And even with that, MS'
    software does not generate standard OOXML formats normally. Much of the
    support in other software (such as LibreOffice) is based on reverse
    engineering - it is much less work than trying to read the "standard"
    documents.

    (To be clear - MS is much more of a "team player" than it was twenty
    years ago.)


    WRT the new XML-based formats all I can say is that I had a glimpse
    into docx samples and turned away in disgust.


    The OOXML formats are horrendous. But don't judge them from documents
    produced by MS software - MS has never been able to make XML, HTML or
    other -ML documents of any sane quality. For fun, take a .docx file
    that has seen a lot of action from various MS Office versions, then open
    it with LibreOffice and re-save it in .docx format. The files produced
    by LibreOffice are worlds apart in their efficiency and simplicity.
    (It's still XML, and still inefficient.) My record was taking a .xlsx
    spreadsheet file that had bloated to over 600 MB from Excel over many
    years, and reducing it to 20 KB by opening and saving it with
    LibreOffice. (I am not claiming that is typical!)
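
    For anyone who wants to see where the bytes go before and after such a
    re-save, here is a minimal sketch in Python. It relies only on the fact
    that .docx and .xlsx files are ZIP packages of XML parts. The function
    name summarize_ooxml and the file names in the usage note are
    hypothetical, not taken from any tool mentioned in this thread.

```python
import zipfile


def summarize_ooxml(path, top=5):
    """Summarize the parts of an OOXML package (.docx, .xlsx, ...).

    Returns (part_count, total_uncompressed_bytes, largest_parts), where
    largest_parts is a list of (part_name, uncompressed_bytes) tuples,
    biggest first. `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as package:
        infos = package.infolist()
    total = sum(info.file_size for info in infos)
    largest = sorted(infos, key=lambda info: info.file_size, reverse=True)[:top]
    return len(infos), total, [(info.filename, info.file_size) for info in largest]
```

    Calling summarize_ooxml("before.xlsx") and summarize_ooxml("after.xlsx")
    on the original file and the re-saved copy would show which parts
    account for the difference. Whether the result looks anything like the
    600 MB case above depends entirely on the file.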


    If you are trying to suggest that ISO 29500 (Microsoft’s “OOXML”) is in
    any way a proper workable standard, then you haven’t read it.

    What are you making up here? - I've not spoken of either "ISO 29500"
    or “OOXML”. - I therefore also haven't said anything about anything
    "workable".

    OOXML is the format used for .docx, .xlsx, etc., and ISO 29500 is the
    ISO number of the standard.


    My post had been about what some folks call "[de facto] standard".


    That is .docx - approximately OOXML.

    Prior to that, MS Office had a brief muckaround with another XML format,
    and before that .doc was a binary format with no documentation and a
    format that changed with every version of the software. Other software
    supported it to some extent, by reverse engineering. Yes, at the time
    (prior to Office 2003), it was often referred to as the "de facto"
    standard, but in practice couldn't even work well between two different
    copies of MS Office if the versions didn't match or the computers had
    different fonts or printer settings. (Yes, your computer's printer
    setup affected document compatibility with MS Office at that time.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Richard Heathfield@21:1/5 to David Brown on Thu Feb 27 16:20:46 2025
    On 27/02/2025 16:09, David Brown wrote:
    .doc has not been the "de facto" standard for a very long time

    It's been my de facto standard text file extension for over 40
    years. I was most amused when I learned that fly-by-night
    companies like Microsoft had started claiming that the extension
    designated a proprietary format. Not for me it didn't. At most,
    it has always designated printable EBCDIC (on a mainframe) or
    printable ASCII (on a smaller machine).

    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From bart@21:1/5 to David Brown on Thu Feb 27 21:51:47 2025
    On 27/02/2025 16:09, David Brown wrote:
    On 27/02/2025 08:57, Janis Papanagnou wrote:
    On 26.02.2025 20:50, Lawrence D'Oliveiro wrote:
    On Wed, 26 Feb 2025 07:38:06 +0100, Janis Papanagnou wrote:

    ... e.g. the *.doc format was often named "de facto standard", but
    there was a long period of time neither a public document of that
    "standard" nor was it a standard in the first place ...

    That is still the case.

    What do you mean? - That *.doc is still a de facto standard, or that
    it is still called so?


    .doc has not been the "de facto" standard for a very long time - .docx
    is, and has been for nearly 20 years.

    I've heard of the newer XML-based *.docx format that it is publicly
    documented and even an official formal standard. (If I'm misinformed
    about that feel free to correct that.)

    Again - you are two decades out of touch here!  Yes, the OOXML formats
    are documented and are ISO standards.  No one (that's not an
    exaggeration) has read them - they are absolute monsters, full of errors
    and inconsistencies, and exist solely because MS was at risk of losing
    their contracts with US Government and Federal offices that required the
    use of open and documented file formats.  The level of bribery,
    corruption and abuse involved in getting these "standards" at ISO is a
    long, sad story that is way off-topic here.  And even with that, MS'
    software does not generate standard OOXML formats normally.  Much of the
    support in other software (such as LibreOffice) is based on reverse
    engineering - it is much less work than trying to read the "standard"
    documents.

    (To be clear - MS is much more of a "team player" than it was twenty
    years ago.)


    WRT the new XML-based formats all I can say is that I had a glimpse
    into docx samples and turned away in disgust.


    The OOXML formats are horrendous.  But don't judge them from documents
    produced by MS software - MS has never been able to make XML, HTML or
    other -ML documents of any sane quality.  For fun, take a .docx file
    that has seen a lot of action from various MS Office versions, then open
    it with LibreOffice and re-save it in .docx format.  The files produced
    by LibreOffice are worlds apart in their efficiency and simplicity.
    (It's still XML, and still inefficient.)  My record was taking a .xlsx
    spreadsheet file that had bloated to over 600 MB from Excel over many
    years, and reducing it to 20 KB by opening and saving it with
    LibreOffice.  (I am not claiming that is typical!)


    If you are trying to suggest that ISO 29500 (Microsoft’s “OOXML”) is in
    any way a proper workable standard, then you haven’t read it.

    What are you making up here? - I've not spoken of either "ISO 29500"
    or “OOXML”. - I therefore also haven't said anything about anything
    "workable".

    OOXML is the format used for .docx, .xlsx, etc., and ISO 29500 is the
    ISO number of the standard.


    My post had been about what some folks call "[de facto] standard".


    That is .docx - approximately OOXML.

    Prior to that, MS Office had a brief muckaround with another XML format,
    and before that .doc was a binary format with no documentation and a
    format that changed with every version of the software.  Other software
    supported it to some extent, by reverse engineering.  Yes, at the time
    (prior to Office 2003), it was often referred to as the "de facto"
    standard, but in practice couldn't even work well between two different
    copies of MS Office if the versions didn't match or the computers had
    different fonts or printer settings.  (Yes, your computer's printer
    setup affected document compatibility with MS Office at that time.)



    A shame you can't say the same for LibreOffice itself. I've had it on my
    machine for a while but it was normally used to print stuff originating
    elsewhere.

    When I tried to actually type stuff in directly, it had such a poor
    response time as to be unusable. That is, you typed a bunch of text, but
    nothing appeared on the screen for a second or so, then it comes all at
    once. All options that might slow it down had been disabled; still slow.

    I haven't used MS Office on the same machine; could it actually be
    faster? That sounds difficult to believe of an MS product, but it's hard
    to see how it could be any slower!

  • From Lawrence D'Oliveiro@21:1/5 to bart on Thu Feb 27 23:27:43 2025
    On Thu, 27 Feb 2025 21:51:47 +0000, bart wrote:

    When I tried to actually type stuff in directly, [LibreOffice] had such
    a poor response time as to be unusable. That is, you typed a bunch of
    text, but nothing appeared on the screen for a second or so, then it
    comes all at once.

    Windows problem?

  • From James Kuyper@21:1/5 to bart on Thu Feb 27 18:39:36 2025
    On Thu, 27 Feb 2025 21:51:47 +0000, bart wrote:

    When I tried to actually type stuff in directly, [LibreOffice] had such
    a poor response time as to be unusable. That is, you typed a bunch of
    text, but nothing appeared on the screen for a second or so, then it
    comes all at once.

    I use LibreOffice constantly, and cannot remember it ever having such
    behavior, except when my entire computer was showing similar problems
    regardless of which program I was using. On those occasions, something
    was using up too much memory.

  • From David Brown@21:1/5 to bart on Fri Feb 28 10:16:56 2025
    On 27/02/2025 22:51, bart wrote:


    A shame you can't say the same for LibreOffice itself. I've had it on my
    machine for a while but it was normally used to print stuff originating
    elsewhere.

    When I tried to actually type stuff in directly, it had such a poor
    response time as to be unusable. That is, you typed a bunch of text, but
    nothing appeared on the screen for a second or so, then it comes all at
    once. All options that might slow it down had been disabled; still slow.

    I haven't used MS Office on the same machine; could it actually be
    faster? That sounds difficult to believe of an MS product, but it's hard
    to see how it could be any slower!

    We are getting /really/ off-topic now (that's as much my fault as anyone
    else's), so I'll be brief.

    No, LibreOffice is not particularly slow on most systems - but it /is/ a
    big program. If your system is old and weak, and in particular if it
    has very little ram, then it will of course be slow. With a very old
    computer, an older version of LibreOffice could be more efficient - like
    most well-managed open source projects, old versions are easily available.

    MS Office of similar generation is of similar size, weight and
    inefficiency as LibreOffice. On many systems it appears to start
    faster, but that is just because it is often started automatically
    before any files are opened.

    While I prefer LaTeX for serious documentation (and markdown / pandoc
    for small stuff), I sometimes need a more standard word processor or
    spreadsheet. I've used LibreOffice from its ancestor Star Office - I
    haven't had MS Office on a computer since Word For Windows 2.0 on
    Windows 3.1.

  • From Janis Papanagnou@21:1/5 to David Brown on Fri Feb 28 14:52:05 2025
    On 27.02.2025 17:09, David Brown wrote:
    On 27/02/2025 08:57, Janis Papanagnou wrote:
    On 26.02.2025 20:50, Lawrence D'Oliveiro wrote:
    On Wed, 26 Feb 2025 07:38:06 +0100, Janis Papanagnou wrote:

    ... e.g. the *.doc format was often named "de facto standard", but
    there was a long period of time neither a public document of that
    "standard" nor was it a standard in the first place ...

    That is still the case.

    What do you mean? - That *.doc is still a de facto standard, or that
    it is still called so?

    .doc has not been the "de facto" standard for a very long time - .docx
    is, and has been for nearly 20 years.

    Again,
    My post had been about what some folks call "[de facto] standard".

    It was not about specific formats whether they are valid now or have
    been valid decades ago; it's actually 4 decades that we were confronted
    with the MS phenomenon.


    I've heard of the newer XML-based *.docx format that it is publicly
    documented and even an official formal standard. (If I'm misinformed
    about that feel free to correct that.)

    Again - you are two decades out of touch here! [...]

    You missed the point. (See above.)


    (To be clear - MS is much more of a "team player" than it was twenty
    years ago.)

    Please discuss MS's role in IT with others, not with me. I'm fed up,
    not only with their inferior software and designs but also with MS
    evangelists mindlessly repeating their nonsensical ads.



    WRT the new XML-based formats all I can say is that I had a glimpse
    into docx samples and turned away in disgust.


    [...] For fun, take a .docx file [...]

    Thanks, but that's nothing I consider to be fun (for me).

    [...]

    Prior to that, MS Office had a brief muckaround with another XML format,
    and before that .doc was a binary format with no documentation and a
    format that changed with every version of the software. Other software
    supported it to some extent, by reverse engineering. Yes, at the time
    (prior to Office 2003), it was often referred to as the "de facto"
    standard, but in practice couldn't even work well between two different
    copies of MS Office if the versions didn't match or the computers had
    different fonts or printer settings. (Yes, your computer's printer
    setup affected document compatibility with MS Office at that time.)

    Amen.

    Janis

  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Fri Feb 28 14:55:03 2025
    On 28.02.2025 00:27, Lawrence D'Oliveiro wrote:
    On Thu, 27 Feb 2025 21:51:47 +0000, bart wrote:

    When I tried to actually type stuff in directly, [LibreOffice] had such
    a poor response time as to be unusable. That is, you typed a bunch of
    text, but nothing appeared on the screen for a second or so, then it
    comes all at once.

    Windows problem?

    That was my thought as well, when I read about his experienced system
    behavior.

    Janis

  • From David Brown@21:1/5 to Janis Papanagnou on Fri Feb 28 17:01:42 2025
    On 28/02/2025 14:52, Janis Papanagnou wrote:
    On 27.02.2025 17:09, David Brown wrote:
    On 27/02/2025 08:57, Janis Papanagnou wrote:
    On 26.02.2025 20:50, Lawrence D'Oliveiro wrote:
    On Wed, 26 Feb 2025 07:38:06 +0100, Janis Papanagnou wrote:

    ... e.g. the *.doc format was often named "de facto standard", but
    there was a long period of time neither a public document of that
    "standard" nor was it a standard in the first place ...

    That is still the case.

    What do you mean? - That *.doc is still a de facto standard, or that
    it is still called so?

    .doc has not been the "de facto" standard for a very long time - .docx
    is, and has been for nearly 20 years.

    Again,
    My post had been about what some folks call "[de facto] standard".

    It was not about specific formats whether they are valid now or have
    been valid decades ago; it's actually 4 decades that we were confronted
    with the MS phenomenon.

    You said ".doc", and I took you at your word - I did not realise that
    you were being a lot more general than that, and talking about all MS
    Office / MS Word formats. The issue is complicated by the fact that
    current ".docx" formats are real standards (albeit very poor ones, and
    the default settings of Word don't follow them strictly), rather than
    the "de facto" standards of earlier formats.


    (To be clear - MS is much more of a "team player" than it was twenty
    years ago.)

    Please discuss MS's role in IT with others, not with me. I'm fed up,
    not only with their inferior software and designs but also with MS
    evangelists mindlessly repeating their nonsensical ads.


    I don't see any "MS evangelists" here. But it is a fact that MS these
    days is much more cooperative than they were decades ago, when they
    openly used criminal and mafia-like tactics. I am not a fan of the
    company or their products in general, but I believe that it is fair when
    discussing sins (intentional or accidental) of the past (of any company
    or person) to point out if they have improved since then. Being less
    evil than they used to be does not make them a good company - but it
    /is/ an improvement, and it is not reasonable to consider modern MS to
    be as bad a company as old MS.
