while read -r line ; do problem
$ bash --version
GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
I have a bash script which reads a script file and updates variables. The contents of some lines are modified without my script's intervention.
Code snippet
1 while read -r line; do
2 _t=$line
3 set -- $(IFS='=' ; echo $_t)
4 _wd=$1
5 case "$_wd" in
6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
7 <big if/case snip none of which modify _medicare line>
8 echo $line >> $_tmp_fn
9 done < $_taxes_paid_fn
If you look at the following results from set -vx,
you'll notice that the * in the _medicare line was converted to file names used in the script.
read -r line
_t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
: IFS==
echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
'[' 13 -gt 1 ']'
_wd=_medicare
Here is the
echo $line >> $_tmp_fn
which has the * junk/substitution:
echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
How can I prevent the * substitution and still be able to use the line modification
like line 6 in the example snippet?
Thanks in advance for any advice.
On 04.03.2022 23:44, Bit Twister wrote:
while read -r line ; do problem
$ bash --version
GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
I have a bash script which reads a script file and updates variables.
contents of some lines are modified without my script intervention.
Code snippet
1 while read -r line; do
2 _t=$line
3 set -- $(IFS='=' ; echo $_t)
4 _wd=$1
5 case "$_wd" in
6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
7 <big if/case snip none of which modify _medicare line>
8 echo $line >> $_tmp_fn
9 done < $_taxes_paid_fn
If you look at the following results from set -vx
You'll notice the _medicare line * was converted to file names used in the script
read -r line
_t='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'
: IFS==
echo _medicare '"$(echo "scale' '2; 144.60 * 12" | bc)"'
set -- _medicare '"$(echo' '"scale' '2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
'[' 13 -gt 1 ']'
_wd=_medicare
Here is the
echo $line >> $_tmp_fn
which has the * junk/substitution:
echo '_medicare="$(echo' '"scale=2;' 144.60 202112.txt 2021_es_taxes_paid.txt aa cons_202112.txt uniform_rmd_wksht.pdf '12"' '|' 'bc)"'
How can I prevent the * substitution and still be able to use the line modification
like line 6 in the example snippet?
If you quote your variables on expansion ("$var") the * as part of your variable value will not expand to file names.
Janis
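A minimal sketch of the difference quoting makes, not taken from the original script: the scratch directory and the two file names created in it are only there so that the * has something to match, and the variable names unquoted and quoted are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the glob has known files to match.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch aa 202112.txt

line='_medicare="$(echo "scale=2; 144.60 * 12" | bc)"'

# Unquoted: the value is word-split and the lone * is replaced by file names.
unquoted=$(echo $line)
# Quoted: the value is passed through as one untouched string.
quoted=$(printf '%s\n' "$line")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The unquoted line should come back with the file names in place of the *, the quoted one exactly as it was assigned.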
1) Always quote your shell variables unless you have an explicit reason
not to, see https://mywiki.wooledge.org/Quotes.
2) If you're going to use a shell read loop then always use both `IFS=`
and `-r`:
while IFS= read -r line
unless you have an explicit reason not to, see https://mywiki.wooledge.org/BashFAQ/001.
3) Don't use a shell loop just to manipulate text as you seem to be
doing, see https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
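Points 1 and 2 applied to the posted loop might look roughly like the sketch below. The temp files, the value 12345 and the two sample input lines are stand-ins, not the poster's data, and taking the name with ${line%%=*} instead of set -- $(IFS='=' ; echo $_t) is a swapped-in technique, on the assumption that only the text before the first = is needed. Unlike the original, it does not strip leading blanks from the name.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the files and values in the original script.
_taxes_paid_fn=$(mktemp)
_tmp_fn=$(mktemp)
_ira_worth=12345
_cons_fn=cons_202112.txt

cat > "$_taxes_paid_fn" <<'EOF'
_ira_worth=0
_medicare="$(echo "scale=2; 144.60 * 12" | bc)"
EOF

while IFS= read -r line; do
    # Text before the first '=': no subshell, no word splitting, no globbing.
    _wd=${line%%=*}
    case "$_wd" in
        _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
    esac
    # Quoted expansion writes the line exactly as it is held in the variable.
    printf '%s\n' "$line" >> "$_tmp_fn"
done < "$_taxes_paid_fn"

result=$(cat "$_tmp_fn")
printf '%s\n' "$result"
rm -f "$_taxes_paid_fn" "$_tmp_fn"
```

With the expansions quoted, the _medicare line should reach the output file with its * intact while the _ira_worth line is still rewritten.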
On Fri, Mar 04 2022,Ed Morton wrote:
3) Don't use a shell loop just to manipulate text as you seem to be
doing, see
https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
What, then, would be a better way to use the shell for line-by-line processing? The stackexchange answer clearly says that people are
mimicking C lang style, among other issues, which I agree with. What
should a novice do then? Pretty sure they wouldn't know about paste/join/cut/comm etc., which sort of makes them do all this.
sivaram
If the data is actually just within a few hundreds (or even a few
thousands) of lines I also wouldn't care much using a shell. That
depends on the data, its transformation, and application, though.
On Fri, Mar 04 2022,Ed Morton wrote:
[snipped 37 lines]
1) Always quote your shell variables unless you have an explicit reason
not to, see https://mywiki.wooledge.org/Quotes.
2) If you're going to use a shell read loop then always use both `IFS=`
and `-r`:
while IFS= read -r line
unless you have an explicit reason not to, see
https://mywiki.wooledge.org/BashFAQ/001.
3) Don't use a shell loop just to manipulate text as you seem to be
doing, see
https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
[snipped 6 lines]
what then, would be a better way to use the shell for line by line processing? The stackexchange answer clearly says, people are
mimicking C lang style and other issues, which I agree with. What
should a novice do then? Pretty sure, they wouldn't know about paste/join/cut/comm etc which sort of makes them do all this.
People shouldn't be writing shell scripts unless they do know about the
most common mandatory POSIX tools though. Doing so would be like trying
to build a house when all you know how to use is a toolbelt and you've
never heard of a hammer/screwdriver/saw/drill etc that the toolbelt is designed to hold.
In general if you want to do small, simple operations then use tools
like sed, grep, cut, etc. but if you find yourself creating lengthy
and/or complicated pipelines of those or being tempted to write a shell
loop to process multi-line text then you should be using awk instead.
Again - the above is about manipulating text. If you find yourself
needing to manipulate (create/destroy) files or processes THEN a shell
loop may be appropriate (if xargs isn't a better solution).
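As a rough illustration of the awk route for the kind of edit in the original post, here is a sketch under assumptions: the temp files, the value 12345 and the sample lines are invented, and the only rule shown is the _ira_worth one from line 6 of the posted snippet.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the files and values in the original script.
_taxes_paid_fn=$(mktemp)
_tmp_fn=$(mktemp)
_ira_worth=12345
_cons_fn=cons_202112.txt

cat > "$_taxes_paid_fn" <<'EOF'
_ira_worth=0
_medicare="$(echo "scale=2; 144.60 * 12" | bc)"
EOF

# Rewrite the _ira_worth line; print every other line exactly as read.
awk -v val="$_ira_worth" -v src="$_cons_fn" '
    /^[ \t]*_ira_worth=/ { print " _ira_worth=" val " # from " src; next }
    { print }
' "$_taxes_paid_fn" > "$_tmp_fn"

awk_result=$(cat "$_tmp_fn")
printf '%s\n' "$awk_result"
rm -f "$_taxes_paid_fn" "$_tmp_fn"
```

awk never hands the input lines to the shell for expansion, so the * in the _medicare line is not at risk here. One caveat: awk -v interprets backslash escapes in the assigned values, which matters if the real values can contain backslashes.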
On Sun, Mar 06 2022,Ed Morton wrote:
[snipped 26 lines]
People shouldn't be writing shell scripts unless they do know about the
most common mandatory POSIX tools though. Doing so would be like trying
to build a house when all you know how to use is a toolbelt and you've
never heard of a hammer/screwdriver/saw/drill etc that the toolbelt is
designed to hold.
On that standard, no one would ever get started on shell then, would
they?
people getting politely chewed out for not being posixy/portable in c.u.shell. And I'm not the only clown in this circus. And no, I
haven't seen/read one posix doc, though I have seen it being quoted
here.
In general if you want to do small, simple operations then use tools
like sed, grep, cut, etc. but if you find yourself creating lengthy
and/or complicated pipelines of those or being tempted to write a shell
loop to process multi-line text then you should be using awk instead.
As a low level sysadmin thrown in the deep end of a bog standard prod
support project decades ago, I have seen 5/10/15 yr scripts with the
above abused paradigm. I didn't touch or change it nor did the
retiring AT&T/Sprint/H3G/others chap who handed it over to me.
Unfortunately I used to use the template because it's been working for
so long. Talk about picking the one bad idea out of the 1000s of shell
scripts. :-)
Again - the above is about manipulating text. If you find yourself
needing to manipulate (create/destroy) files or processes THEN a shell
loop may be appropriate (if xargs isn't a better solution).
I suspect that with no one telling what's the best or optimal way to
save tears down the road, it's just like the mess you described. It's
a good thing that my mistakes are generally not earth altering....so
far.
sivaram
In article <[email protected]>,
Sivaram Neelakantan <[email protected]> wrote:
...
what then, would be a better way to use the shell for line by line
processing? The stackexchange answer clearly says, people are mimicking
C lang style and other issues, which I agree with. What should a novice
do then? Pretty sure, they wouldn't know about paste/join/cut/comm etc
which sort of makes them do all this.
I usually use mapfile (in bash) for this. mapfile reads an entire file or the output of a process into an array (MAPFILE by default). Then you can iterate over the array. So, you end up
with:
mapfile -t < file
for i in "${MAPFILE[@]}"
do
...
done
Or, to do it with a process (the more common case):
mapfile -t < <(process)
for i in "${MAPFILE[@]}"
do
...
done
Ed Morton <[email protected]> wrote:
On 3/4/2022 4:44 PM, Bit Twister wrote:<snip>
while read -r line ; do problem
$ bash --version
GNU bash, version 5.1.4(1)-release (x86_64-mageia-linux-gnu)
I have a bash script which reads a script file and updates variables.
contents of some lines are modified without my script intervention.
Code snippet
1 while read -r line; do
2 _t=$line
3 set -- $(IFS='=' ; echo $_t)
4 _wd=$1
5 case "$_wd" in
6 _ira_worth) line=" _ira_worth=$_ira_worth # from $_cons_fn" ;;
7 <big if/case snip none of which modify _medicare line>
8 echo $line >> $_tmp_fn
9 done < $_taxes_paid_fn
3) Don't use a shell loop just to manipulate text as you seem to be
doing, see
https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice.
IMO, this is not great advice.
1) If raw throughput matters, you shouldn't be using shell text processing
in the first place; use sed or awk, at the very least, instead.
2) If the input is coming from a pipe, what the read loop buys you is concurrency and parallelism. If the process generating the input has high latency, the concurrency can help tremendously. If either side uses a lot of CPU, the parallelism might help performance, overcoming the byte-by-byte issue.
Case example: last year I downloaded a company engineer-managed script that updated routing tables, created as a workaround for a poorly managed IPSec VPN configuration deployed on company laptops. When I ran the script it seemed to hang, so I'd kill it and run it again. After a few minutes I decided to dive into the script to figure out what was happening. The fundamental problem was that they had one routine generating a list of addresses and another routine consuming the list in a loop. Crucially, the second routine was using the Bash'ism to slurp the input into an array for processing using a for-loop rather than a while-read-loop. The address-generating routine was doing network I/O to download and preprocess the lists, which was taking considerable time. Meanwhile, the second loop was completely idle waiting for the first to finish. The second loop also incurred some surprisingly high latency per address (IIRC, might have been every invocation of route(1) doing reverse DNS or some such). Long story short, the entire script took much longer to complete than if they had used a simple while-read-loop, permitting both loops to run concurrently. Plus, the script would have provided immediate feedback that things were actually progressing.
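A toy reproduction of that effect, not the script in question: the produce function, its 0.2 second delay and the event log file are all invented for the demonstration. Both consumers read the same slow producer; the log records the order in which items were produced and consumed.

```shell
#!/usr/bin/env bash
# Shared event log so producer and consumer activity can be ordered.
events=$(mktemp)

# Hypothetical slow producer: logs each item, emits it, then stalls.
produce() {
    for n in 1 2 3; do
        echo "produced $n" >> "$events"
        echo "$n"
        sleep 0.2
    done
}

# Streaming consumer: each item is handled as soon as it arrives.
: > "$events"
while IFS= read -r n; do
    echo "consumed $n" >> "$events"
done < <(produce)
stream_order=$(paste -sd, "$events")

# Slurping consumer: nothing is handled until the producer has finished.
: > "$events"
mapfile -t items < <(produce)
for n in "${items[@]}"; do
    echo "consumed $n" >> "$events"
done
slurp_order=$(paste -sd, "$events")

printf 'while-read: %s\n' "$stream_order"
printf 'mapfile:    %s\n' "$slurp_order"
rm -f "$events"
```

On an unloaded machine the while-read log should interleave produced and consumed entries, while the mapfile log lists every produced entry before the first consumed one.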
You see something similar with the widespread adoption of map-filter-reduce functional patterns in languages like JavaScript. The current popularity seems to have been kicked off by admiration for Haskell-style algorithms, which once upon a time were popular blog fodder. But Haskell uses lazy list evaluation, unlike languages like JavaScript. The result is that the new preferred pattern results in a tremendous amount of memory usage and churn, as every transformation step requires constructing and populating a whole new array. It makes for some horribly inefficient programs; inefficient in a quite opaque way, whereas with traditional patterns the unnecessary array duplications would be immediately obvious, particularly if reading the code with an eye toward improving performance. (You also aren't creating a bunch of closures, which can create barriers to JIT optimization.)
Some of the old patterns--e.g. shell pipes--are far more sophisticated than people give them credit for today. See, e.g., this 2014 paper by Doug McIlroy, inventor of the Unix pipe, describing the equivalency between coroutines, pipes, and lazy lists:
https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf
On 3/8/2022 8:46 PM, William Ahern wrote:
I'm not sure what you're saying below. It sounds like you're discussing
some bad software you came across that you improved by replacing a
couple of for loops with a while loop but obviously that doesn't mean it couldn't have been further improved by using, say, awk instead. Can you provide a concise sample shell script that clearly and simply just demonstrates what you're describing below and some way of generating
sample input to help us understand what you're describing and so we can
test it?
Ed.
2) If the input is coming from a pipe, what the read loop buys you is concurrency and parallelism. If the process generating the input has high latency, the concurrency can help tremendously. If either side uses a lot of CPU, the parallelism might help performance, overcoming the byte-by-byte issue.
Case example: last year I downloaded a company engineer-managed script that updated routing tables, created as a workaround for a poorly managed IPSec VPN configuration deployed on company laptops. When I ran the script it seemed to hang, so I'd kill it and run it again. After a few minutes I decided to dive into the script to figure out what was happening. The fundamental problem was that they had one routine generating a list of addresses and another routine consuming the list in a loop. Crucially, the second routine was using the Bash'ism to slurp the input into an array for processing using a for-loop rather than a while-read-loop. The address-generating routine was doing network I/O to download and preprocess the lists, which was taking considerable time. Meanwhile, the second loop was completely idle waiting for the first to finish. The second loop also incurred some surprisingly high latency per address (IIRC, might have been every invocation of route(1) doing reverse DNS or some such). Long story short, the entire script took much longer to complete than if they had used a simple while-read-loop, permitting both loops to run concurrently. Plus, the script would have provided immediate feedback that things were actually progressing.
You see something similar with the widespread adoption of map-filter-reduce functional patterns in languages like JavaScript. The current popularity seems to have been kicked off by admiration for Haskell-style algorithms, which once upon a time were popular blog fodder. But Haskell uses lazy list evaluation, unlike languages like JavaScript. The result is that the new preferred pattern results in a tremendous amount of memory usage and churn, as every transformation step requires constructing and populating a whole new array. It makes for some horribly inefficient programs; inefficient in a quite opaque way, whereas with traditional patterns the unnecessary array duplications would be immediately obvious, particularly if reading the code with an eye toward improving performance. (You also aren't creating a bunch of closures, which can create barriers to JIT optimization.)
Some of the old patterns--e.g. shell pipes--are far more sophisticated than people give them credit for today. See, e.g., this 2014 paper by Doug McIlroy, inventor of the Unix pipe, describing the equivalency between coroutines, pipes, and lazy lists:
https://www.cs.dartmouth.edu/~doug/sieve/sieve.pdf