• Nobody does that anymore... (Was: Simplify an AWK pipeline?)

    From Kenny McCormack@21:1/5 to [email protected] on Mon Aug 21 14:54:53 2023
    In article <ubm5g7$3u7rt$[email protected]>,
    Janis Papanagnou <[email protected]> wrote:
    ...
    2) You probably don't need to mess with SUBSEP. Your data seems
    to be OK with assuming no embedded spaces (i.e., so using space
    as the delimiter is OK) Note that SUBSEP is intended to be
    used as the delimiter for the implementation of old-fashioned
    pseudo-multi-dimensional arrays in AWK, but nobody uses that
    functionality anymore. Therefore, some AWK programmers have co-opted
    SUBSEP as a symbol provided by the language to represent a character
    that is more-or-less guaranteed to never occur in user data.
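
    A minimal sketch of the pseudo-multi-dimensional mechanism described
    above, assuming any POSIX awk. The function name pseudo_mda_demo and
    the keys "x" and "y" are made up for illustration; only SUBSEP, the
    comma subscript, and the (i, j) in a test come from the language.

```sh
# Sketch: POSIX awk pseudo-multi-dimensional arrays.
# a[i, j] is stored under the single string key  i SUBSEP j.
pseudo_mda_demo() {
  awk 'BEGIN {
    a["x", "y"] = 1                      # comma subscript: one composite key
    key = "x" SUBSEP "y"                 # the same key, built by hand
    if (key in a)        print "same key"
    if (("x", "y") in a) print "tuple test works"
    for (k in a) {                       # recover the parts by splitting on SUBSEP
      n = split(k, part, SUBSEP)
      print n, part[1], part[2]
    }
  }'
}
pseudo_mda_demo
```

    Because the array only ever sees one string key, SUBSEP behaves like
    an ordinary join character, which is why it can be borrowed as a
    delimiter that is unlikely to appear in user data.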

    Yes, SUBSEP is the default separation character for arrays. Of
    course you can use other characters (that require less text). Why
    you think that "nobody uses that functionality anymore" is beyond
    me; I doubt you have any evidence for that, so I interpret it just
    as "I [Kenny] don't use it anymore.", which is fine by me.

    It may be a language barrier - I understand that English is not your
    first language - but in colloquial English, the phrase "nobody does X
    anymore" often means something close to "nobody should do X anymore"
    or "only uncool people still do X". Obviously, *some* people still do.
    BTW, see also the famous Yogi Berra quip (of a certain restaurant):
    "Nobody goes there anymore; it's too crowded."

    Anyway, this is definitely true of old-fashioned AWK
    pseudo-multi-dimensional arrays. They never really worked well, and
    now that we have true MDAs, nobody should be using the old stuff.

    --
    The randomly chosen signature file that would have appeared here is
    more than 4 lines long. As such, it violates one or more Usenet RFCs.
    In order to remain in compliance with said RFCs, the actual sig can be
    found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/GodDelusion

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Mon Aug 21 18:10:10 2023
    On 21.08.2023 16:54, Kenny McCormack wrote:
    In article <ubm5g7$3u7rt$[email protected]>,
    Janis Papanagnou <[email protected]> wrote:
    ...
    2) You probably don't need to mess with SUBSEP. Your data seems
    to be OK with assuming no embedded spaces (i.e., so using space
    as the delimiter is OK) Note that SUBSEP is intended to be
    used as the delimiter for the implementation of old-fashioned
    pseudo-multi-dimensional arrays in AWK, but nobody uses that
    functionality anymore. Therefore, some AWK programmers have co-opted
    SUBSEP as a symbol provided by the language to represent a character
    that is more-or-less guaranteed to never occur in user data.

    Yes, SUBSEP is the default separation character for arrays. Of
    course you can use other characters (that require less text). Why
    you think that "nobody uses that functionality anymore" is beyond
    me; I doubt you have any evidence for that, so I interpret it just
    as "I [Kenny] don't use it anymore.", which is fine by me.

    It may be a language barrier - I understand that English is not your
    first language - but in colloquial English, the phrase "nobody does X
    anymore" often means something close to "nobody should do X anymore"
    or "only uncool people still do X". Obviously, *some* people still do.
    BTW, see also the famous Yogi Berra quip (of a certain restaurant):
    "Nobody goes there anymore; it's too crowded."

    Anyway, this is definitely true of old-fashioned AWK
    pseudo-multi-dimensional arrays. They never really worked well, and
    now that we have true MDAs, nobody should be using the old stuff.

    Okay, thanks for explaining. So I had interpreted it correctly
    (despite any language barrier that may exist). And I disagree
    with you, both in the context of this thread and in general.

    "True" multi-dimensional arrays are unnecessary here, and
    using separate keys where a single composite key suffices
    is not only unnecessary, it also seems to complicate matters.
    (But you may provide code to prove me wrong if you like;
    how would multidimensional arrays help here?)

    In the past I used GNU Awk's multi-dimensional arrays in
    contexts where they were necessary, and there they simplified
    *these* things. But when using awk I have usually observed
    that "simple [associative] arrays" are what I need in 98% of
    my awk applications[*] - of course the situations where _you_
    (personally) use Awk arrays may be different (that would
    actually mean "I [Kenny] don't use it anymore.", which is how
    I interpreted it upthread).[**]

    Since a[k] is the common use, the question is in which
    contexts a[k1][k2] is necessary and in which a[k1,k2] is
    sufficient. My observation is that a[k1][k2] is advantageous
    only where you need true multi-dimensional access, and that
    does not appear to be the common case. (BTW, [***].)
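
    To make the a[k1,k2] versus a[k1][k2] distinction concrete, here is a
    hedged sketch in POSIX awk only. The function name composite_key_demo
    and the region/item sample data are invented for illustration. It
    shows that a composite key is enough for per-pair counting, while
    walking a single dimension means scanning and splitting every key.

```sh
# Sketch: composite keys in POSIX awk, with made-up sample data.
composite_key_demo() {
  printf '%s\n' 'eu apple' 'eu pear' 'us apple' 'eu apple' |
  awk '
    { count[$1, $2]++ }                  # one composite key per (region, item)
    END {
      # Direct lookup by both parts: the composite key is sufficient.
      print "eu/apple:", count["eu", "apple"]
      # Access along one dimension only: scan all keys and split each one.
      n = 0
      for (k in count) {
        split(k, part, SUBSEP)
        if (part[1] == "eu") n++
      }
      print "distinct eu items:", n
    }'
}
composite_key_demo
```

    With gawk's non-standard arrays of arrays, the second step could
    instead loop over the subarray for one region directly; that is the
    kind of per-dimension access where a[k1][k2] pays off.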

    I think it boils down to observing that the concrete solution
    given uses just one composed index, and that there's no
    need for non-standard "true multi-dimensional arrays"
    because there are no multi-dimensional arrays here.[****]

    Thanks for reading.

    Janis

    [*] Reminds me of the reason why Pascal supported only
    loops based on integral indices (and not FP): there was
    evidence that this was what was used most of the time.
    (It doesn't mean that there aren't sensible applications
    beyond that.)

    [**] Of course you may also provide evidence and reasons
    for the given hypotheses "nobody should do X anymore" -
    Why? - and "Only uncool people still do X" - "uncool"? -
    (for X = "don't use simple awk arrays"). I think such
    statements just make no sense, all the more so if they are
    fuzzy (undetermined) or personal and without evidence.

    [***] I deliberately ignored that the GNU Awk extension
    is also non-standard, since it's not necessary for our
    dispute.

    [****] You can see that where 'k' is composed and only a[k]
    and c[k] are used, simply and without disadvantage.

  • From Janis Papanagnou@21:1/5 to Kenny McCormack on Mon Aug 21 18:56:33 2023
    On 21.08.2023 16:54, Kenny McCormack wrote:

    Anyway, this is definitely true of old-fashioned AWK
    pseudo-multi-dimensional arrays. They never really worked well, and
    now that we have true MDAs, nobody should be using the old stuff.

    I think the misunderstandings in this subthread were...
    - we have no disagreement where "MDAs" are _necessary_ and used,
    - in this thread's solutions we had no application of "MDAs"
    (just a composed key), and "MDAs" also weren't necessary,
    - (thesis) basic associative arrays are predominantly used
    (mileage may vary depending on where awk is used),
    - "MDAs" support associative functionality, so that is hardly
    avoidable (is a[k] "old stuff" or is it an MDA with one dimension?)

    Janis

  • From Kenny McCormack@21:1/5 to [email protected] on Mon Aug 21 19:04:17 2023
    In article <uc0501$1vsmp$[email protected]>,
    Janis Papanagnou <[email protected]> wrote:
    On 21.08.2023 16:54, Kenny McCormack wrote:

    Anyway, this is definitely true of old-fashioned AWK pseudo-multi-dimensional
    arrays. They never really worked well, and now that we have true MDAs,
    nobody should be using the old stuff.

    I think the misunderstandings in this subthread were...
    - we have no disagreement where "MDAs" are _necessary_ and used,
    - in this thread's solutions we had no application of "MDAs"
    (just a composed key), and "MDAs" also weren't necessary,
    - (thesis) basic associative arrays are predominantly used
    (mileage may vary depending on where awk is used),
    - "MDAs" support associative functionality, so that is hardly
    avoidable (is a[k] "old stuff" or is it an MDA with one dimension?)

    I never said anything about any of that; that is, anything about
    whether or not MDAs were needed in the context of this thread
    (clearly, they are not).

    My content was, as it usually is, entirely "meta". Thus, the
    following two comments:

    1) It sounded like you had misunderstood my comment about "nobody does
    that anymore", so I clarified what the colloquial meaning of that
    expression is. Note that I have hit a similar thing a while back in the
    shell group - where I stated that nobody uses backticks anymore,
    because we now have $(), which, as we all know, is better in just about
    every way (the only exception that I can think of is that if you are
    programming in csh or tcsh, then you have to use backticks - although
    this may sound facetious, I still do some tcsh stuff, so I have to keep
    this in mind).

    I got a lot of blowback from indignant people who wanted me to know
    that they still use backticks and they were personally insulted that I
    claimed that no one did that anymore. Clearly, those people did not
    understand the idiomatic meaning of the expression either.

    2) You had used SUBSEP in your script (reply to OP), but were
    (obviously) not using (any form of) MDAs, so I made some comments (not
    for your benefit, but for OP's) about your usage of SUBSEP (i.e., how
    it is usually only used when using pseudo-MDAs, but that some people
    have co-opted it for other uses).
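
    On the backticks remark in point 1, a small sketch of one commonly
    cited difference, nesting, assuming a POSIX sh. The path
    /usr/local/bin/tool is a made-up example and does not need to exist,
    since dirname and basename only manipulate the string.

```sh
# Sketch: nested command substitution in both spellings.
# $( ) nests without any extra escaping:
new_style=$(basename "$(dirname /usr/local/bin/tool)")
# Backticks need the inner pair escaped with backslashes:
old_style=`basename \`dirname /usr/local/bin/tool\``
echo "$new_style $old_style"
```

    Both forms yield the same result here; the $( ) version simply stays
    readable as the nesting gets deeper.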

    --
    The plural of "anecdote" is _not_ "data".
