On 25.05.23 16:30, Ed Morton wrote:
I'm certain I remember years ago reading a document that said
(paraphrasing) "an unparenthesized expression on the right side of input
or output redirection is undefined behavior" and I thought it was an
older version of the POSIX spec. I now can't find that (or similar)
statement in any of these:
SUSV2 - https://pubs.opengroup.org/onlinepubs/7990989775/xcu/awk.html
SUSV3 - https://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html
Current POSIX spec - https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html
or by googling.
What I do see in the current POSIX spec is a related statement just
about input redirection:
Historical practice has been that:
getline < "a" "b"
is parsed as:
( getline < "a" ) "b"
although many would argue that the intent was that the file ab should
be read. However:
getline < "x" + 1
parses as:
getline < ( "x" + 1 )
...
Since in most cases such constructs are not (or at least should not)
be used (because they have a natural ambiguity for which there is no conventional parsing), the meaning of these constructs has been made explicitly unspecified.
and:
The getline operator can form ambiguous constructs when there are
unparenthesized binary operators (including concatenate) to the right of
the '<' (up to the end of the expression containing the getline). The
result of evaluating such a construct is unspecified
but nothing about output redirection. I know gawk doesn't require parens
around the expression for output redirection but other awks do (e.g. see
https://stackoverflow.com/q/21093626/1745001), and it's not obvious to me
why `getline < "a" "b"` should be undefined behavior while
`print > "a" "b"` wouldn't be. Intuitively, if one of them is undefined
then so should the other be.
Does anyone else recall seeing a statement about output redirection to
an expression requiring parens and, if so, do you recall where it existed?
What I recall is that a few times there were discussions about that,
but there was (AFAIR) never a formal explanation.
My thoughts about your question above are as follows...
The parsing of getline expressions might simply follow precedence
rules: C-like languages (as opposed to e.g. Algol68) associate the
precedence with the concrete symbol ('<', '>') rather than with the
semantic context, so 'less than' would bind stronger than 'concat'.
In cases where (as quoted above) "conventional parsing" deviates
from that (whatever "conventional" or "non-conventional" will be)
it might be different.
Note also that I wrote "getline *expressions*" as opposed to, say,
"print *statement*"; getline is part of the expression (it has a
value) where print has an expression argument. There is (I think)
no expression that starts with '>' in awk, so 'print >' should be
a redirection indication, generally.
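
For the output side, a minimal sketch of the parenthesized form, which should parse the same way in any POSIX awk. The `dir` variable and the `out_*.txt` file names are made up for illustration only:

```sh
# Scratch directory so the sketch does not touch real files.
tmp=$(mktemp -d)

# Parenthesizing the target makes the whole concatenation the file name,
# instead of leaving it to the awk in use to decide where the name ends.
printf 'x 1\ny 2\n' |
awk -v dir="$tmp" '{ print $2 > (dir "/out_" $1 ".txt") }'

cat "$tmp/out_x.txt"   # prints: 1
cat "$tmp/out_y.txt"   # prints: 2
```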
Depending on the semantic context, an expression like
if (getline < "a" + i) ...
can make sense either way: try reading from "a" and add a
constant to the return value, or read from "a1", "a42", etc.
So I can see why one is undefined but not the other. And my coding
approach would be to make the intention visible with parentheses.
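
As a rough sketch of what I mean, here are both readings of the getline case spelled out with parentheses. The files "a" and "a1", their contents, and the `dir` variable are assumptions made up for the example:

```sh
# Scratch files: "a" and "a1" stand in for the two possible targets.
tmp=$(mktemp -d)
printf 'first\n'   > "$tmp/a"
printf 'from a1\n' > "$tmp/a1"

out=$(awk -v dir="$tmp" 'BEGIN {
    i = 1
    # Reading 1: read from file "a", then add i to the return value of getline.
    r = (getline line < (dir "/a")) + i
    print r, line
    # Reading 2: read from the file named "a" followed by i, here "a1".
    if ((getline line < (dir "/a" i)) > 0)
        print line
}')
printf '%s\n' "$out"
# prints:
# 2 first
# from a1
```

With the parentheses in place there is nothing left for the parser to guess, so the result should not depend on which awk is used.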
Janis
Ed.