I want to replace all the non-printable/control characters with plain
space except keeping the `\n' as they are in the following string:
```
str:="""#g2 % point group to the space group of group
3 % generator
0 -1 0
0 0 -1
-1 0 0
3 /8 % generator
-30 58 -30
-33 55 -25
-25 55 -33
% order of the group unknown""";
```
Is there a convenient way to do this?
Regards,
Zhao
On 20.03.2023 16:00, [email protected] wrote:
> I want to replace all the non-printable/control characters with plain
> space except keeping the `\n' as they are in the following string:

I don't see any control characters in your data below.

> ```
> str:="""#g2 % point group to the space group of group
> 3 % generator
> 0 -1 0
> 0 0 -1
> -1 0 0
> 3 /8 % generator
> -30 58 -30
> -33 55 -25
> -25 55 -33
> % order of the group unknown""";
> ```
> Is there a convenient way to do this?

Of course. Use variable substitution with patterns containing
the respective character classes; for example

    str="..." # any string
    printf "%s" "${str//[^[:print:]$'\n']/ }"

to replace all occurrences of non-printable and also not '\n'.

Janis

> Regards,
> Zhao
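
A minimal sketch of the substitution suggested above, assuming bash (the `${var//pattern/replacement}` expansion and `$'\n'` quoting are not available in a plain POSIX `sh`). The sample string and the variable name `cleaned` are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample: two lines containing tabs (control characters).
str=$'3\t% generator\n3\t/8\t\t% generator'

# Replace every character that is neither printable nor a newline
# with one space; each tab becomes exactly one space.
cleaned="${str//[^[:print:]$'\n']/ }"

printf '%s\n' "$cleaned"
```

Because `[:print:]` includes the ordinary space but not the tab, the tabs become spaces while the newline between the two lines is left alone.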
On Tuesday, March 21, 2023 at 12:33:06 AM UTC+8, Janis Papanagnou wrote:
> On 20.03.2023 16:00, [email protected] wrote:
>> I want to replace all the non-printable/control characters with
>> plain space except keeping the `\n' as they are in the following
>> string:
>
> I don't see any control characters in your data below.

If I first store the string here into a file named `strfile' and
then check it as follows, you will see them:

    werner@X10DAi:~$ cat -A strfile
    #g2 % point group to the space group of group$
    3^I% generator$
    0 -1 0$
    0 0 -1$
    -1 0 0$
    3^I/8^I^I% generator$
    -30 58 -30$
    -33 55 -25$
    -25 55 -33$
    % order of the group unknown$

>> ```
>> str:="""#g2 % point group to the space group of group
>> 3 % generator
>> 0 -1 0
>> 0 0 -1
>> -1 0 0
>> 3 /8 % generator
>> -30 58 -30
>> -33 55 -25
>> -25 55 -33
>> % order of the group unknown""";
>> ```
>> Is there a convenient way to do this?
>
> Of course. Use variable substitution with patterns containing the
> respective character classes; for example
>
>     str="..." # any string
>     printf "%s" "${str//[^[:print:]$'\n']/ }"
>
> to replace all occurrences of non-printable and also not '\n'.

But your description above is inconsistent with the answer given by
ChatGPT:

"Overall, this code is a useful way to clean up strings and remove any non-printable characters or newline characters that may cause issues in further processing or display."

Finally, I only want to replace all occurrences of non-printable with
one space and also keep '\n' as they are.
The following is the
desired result when applied on the file whose content is the string
discussed here:

    werner@X10DAi:~$ sed -e 's/[^[:print:]]/ /g' strfile | cat -A
    #g2 % point group to the space group of group$
    3 % generator$
    0 -1 0$
    0 0 -1$
    -1 0 0$
    3 /8 % generator$
    -30 58 -30$
    -33 55 -25$
    -25 55 -33$
    % order of the group unknown$

But my concern here is that if the whole file is represented in a
string, I should do the above string operations on this string
instead of a file.

> Janis

Regards, Zhao
On 21.03.2023 00:59, [email protected] wrote:
> On Tuesday, March 21, 2023 at 12:33:06 AM UTC+8, Janis Papanagnou
> wrote:
>> On 20.03.2023 16:00, [email protected] wrote:
>>> I want to replace all the non-printable/control characters with
>>> plain space except keeping the `\n' as they are in the following
>>> string:
>>
>> I don't see any control characters in your data below.
>
> If I first store the string here into a file named as `strfile' and
> then check it as follows, you will see them:

No. If _you_ do that *you* will see them. _I_ just see spaces, tabs,
and newlines as the only control characters.

> werner@X10DAi:~$ cat -A strfile
> #g2 % point group to the space group of group$
> 3^I% generator$
> 0 -1 0$
> 0 0 -1$
> -1 0 0$
> 3^I/8^I^I% generator$
> -30 58 -30$
> -33 55 -25$
> -25 55 -33$
> % order of the group unknown$
>
>>> ```
>>> str:="""#g2 % point group to the space group of group
>>> 3 % generator
>>> 0 -1 0
>>> 0 0 -1
>>> -1 0 0
>>> 3 /8 % generator
>>> -30 58 -30
>>> -33 55 -25
>>> -25 55 -33
>>> % order of the group unknown""";
>>> ```
>>> Is there a convenient way to do this?
>>
>> Of course. Use variable substitution with patterns containing the
>> respective character classes; for example
>>
>>     str="..." # any string
>>     printf "%s" "${str//[^[:print:]$'\n']/ }"
>>
>> to replace all occurrences of non-printable and also not '\n'.
>
> But your description above is inconsistent with the answer given by ChatGPT:

I suggest to discuss that with ChatGPT then, if you think there's
more expertise, and if you prefer chatting with that tool instead
of just trying the suggestion on your data.

[ big snip of chat protocol spam ]

> Overall, this code is a useful way to clean up strings and remove any non-printable characters or newline characters that may cause issues
> in further processing or display.

Are you saying that or your chat tool?

The character set [^[:print:]$'\n'] specifies a pattern defined
by the negated ('^') sets comprising printables and newlines.
That is what you said you need. No?

> Finally, I only want to replace all occurrences of non-printable with
> one space and also keep '\n' as they are.

Newlines are not touched with the code I presented.

You didn't say in your original post that you want multiple occurrences "compressed" to a single character replacement.

To transform _multiple_ consecutive control characters by a _single_ character adjust your regexp. Depending on what tool (what shell type,
sed, whatever) you want to use it's either

    [^[:print:]$'\n']+
    [^[:print:]$'\n'][^[:print:]$'\n']*
    +([^[:print:]$'\n'])

> The following is the
> desired result when applied on the file whose content is the string discussed here:
>
> werner@X10DAi:~$ sed -e 's/[^[:print:]]/ /g' strfile | cat -A
> #g2 % point group to the space group of group$
> 3 % generator$
> 0 -1 0$
> 0 0 -1$
> -1 0 0$
> 3 /8 % generator$
> -30 58 -30$
> -33 55 -25$
> -25 55 -33$
> % order of the group unknown$
>
> But my concern here is that if the whole file is represented in a
> string, I should do the above string operations on this string
> instead of a file.

That's what the shell's string substitution ${str//.../...} is for.

In other words, just apply the solution, or go chatting with ChatGPT.

Janis

> Regards, Zhao
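
Of the three spellings listed in the reply above, the last one, `+(...)`, is the extended-glob form that works in shell pattern substitution; the other two are regular-expression forms for tools such as sed or grep. A hedged sketch of the shell variant, assuming bash with `extglob` enabled; the sample string and the variable name `squeezed` are made up for illustration:

```shell
#!/usr/bin/env bash
# Enable the +(...) "one or more" extended glob operator.
shopt -s extglob

# Hypothetical sample: one tab, then a run of two tabs, then a newline.
str=$'3\t/8\t\t% generator\n-30 58 -30'

# Each run of characters that are neither printable nor a newline
# collapses into a single space.
squeezed="${str//+([^[:print:]$'\n'])/ }"

printf '%s\n' "$squeezed"
```

Without `extglob` the plain bracket expression still works, but then every control character is replaced individually, so two consecutive tabs yield two spaces.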
On Tuesday, March 21, 2023 at 9:28:39 AM UTC+8, Janis Papanagnou wrote:
> On 21.03.2023 00:59, [email protected] wrote:
>> On Tuesday, March 21, 2023 at 12:33:06 AM UTC+8, Janis Papanagnou wrote:
>>> On 20.03.2023 16:00, [email protected] wrote:
>>>> I want to replace all the non-printable/control characters with
>>>> plain space except keeping the `\n' as they are in the following
>>>> string:
>>>
>>> I don't see any control characters in your data below.
>>
>> If I first store the string here into a file named as `strfile' and
>> then check it as follows, you will see them:
>
> No. If _you_ do that *you* will see them. _I_ just see spaces, tabs,
> and newlines as the only control characters.
>
>> werner@X10DAi:~$ cat -A strfile
>> #g2 % point group to the space group of group$
>> 3^I% generator$
>> 0 -1 0$
>> 0 0 -1$
>> -1 0 0$
>> 3^I/8^I^I% generator$
>> -30 58 -30$
>> -33 55 -25$
>> -25 55 -33$
>> % order of the group unknown$
>>
>>>> ```
>>>> str:="""#g2 % point group to the space group of group
>>>> 3 % generator
>>>> 0 -1 0
>>>> 0 0 -1
>>>> -1 0 0
>>>> 3 /8 % generator
>>>> -30 58 -30
>>>> -33 55 -25
>>>> -25 55 -33
>>>> % order of the group unknown""";
>>>> ```
>>>> Is there a convenient way to do this?
>>>
>>> Of course. Use variable substitution with patterns containing the
>>> respective character classes; for example
>>>
>>>     str="..." # any string
>>>     printf "%s" "${str//[^[:print:]$'\n']/ }"
>>>
>>> to replace all occurrences of non-printable and also not '\n'.
>>
>> But your description above is inconsistent with the answer given by ChatGPT:
>
> I suggest to discuss that with ChatGPT then, if you think there's
> more expertise, and if you prefer chatting with that tool instead
> of just trying the suggestion on your data.

In fact, in the analysis of your regex, ChatGPT is indeed correct. However, its final summary is wrong:

"The replacement in this pattern substitution is a space character, denoted by the single space between the forward slashes. This means that any non-printable characters or newline characters in the `str` variable will be replaced with a space character."

"any non-printable characters or newline characters" should be "any non-printable characters other than newline characters".

> [ big snip of chat protocol spam ]
>
>> Overall, this code is a useful way to clean up strings and remove any non-printable characters or newline characters that may cause issues
>> in further processing or display.
>
> Are you saying that or your chat tool?
>
> The character set [^[:print:]$'\n'] specifies a pattern defined
> by the negated ('^') sets comprising printables and newlines.
> That is what you said you need. No?

Yes.

>> Finally, I only want to replace all occurrences of non-printable with one space and also keep '\n' as they are.
>
> Newlines are not touched with the code I presented.
>
> You didn't say in your original post that you want multiple occurrences "compressed" to a single character replacement.
>
> To transform _multiple_ consecutive control characters by a _single_ character adjust your regexp. Depending on what tool (what shell type, sed, whatever) you want to use it's either
>
>     [^[:print:]$'\n']+
>     [^[:print:]$'\n'][^[:print:]$'\n']*
>     +([^[:print:]$'\n'])

They all work as follows, with `grep -E':

    werner@X10DAi:~$ grep -E '[^[:print:]$'\n']+' strfile | cat -A
    3^I% generator$
    3^I/8^I^I% generator$
    werner@X10DAi:~$ grep -E '[^[:print:]$'\n'][^[:print:]$'\n']*' strfile | cat -A
    3^I% generator$
    3^I/8^I^I% generator$
    werner@X10DAi:~$ grep -E '[^[:print:]$'\n']+' strfile | cat -A
    3^I% generator$
    3^I/8^I^I% generator$

>> The following is the
>> desired result when applied on the file whose content is the string discussed here:
>>
>> werner@X10DAi:~$ sed -e 's/[^[:print:]]/ /g' strfile | cat -A
>> #g2 % point group to the space group of group$
>> 3 % generator$
>> 0 -1 0$
>> 0 0 -1$
>> -1 0 0$
>> 3 /8 % generator$
>> -30 58 -30$
>> -33 55 -25$
>> -25 55 -33$
>> % order of the group unknown$

In my example, the more portable usage should be as follows:

    werner@X10DAi:~$ sed -Ee 's/[^[:print:]]/ /g' strfile | cat -A
    #g2 % point group to the space group of group$
    3 % generator$
    0 -1 0$
    0 0 -1$
    -1 0 0$
    3 /8 % generator$
    -30 58 -30$
    -33 55 -25$
    -25 55 -33$
    % order of the group unknown$

>> But my concern here is that if the whole file is represented in a string, I should do the above string operations on this string
>> instead of a file.
>
> That's what the shell's string substitution ${str//.../...} is for.
>
> In other words, just apply the solution, or go chatting with ChatGPT.

Agreed. But isn't it better to combine the advantages of both to a certain extent?

Zhao

> Janis
On Tuesday, March 21, 2023 at 9:28:39 AM UTC+8, Janis Papanagnou wrote:
> On 21.03.2023 00:59, [email protected] wrote:
> In other words, just apply the solution, or go chatting with ChatGPT.

Agreed. But isn't it better to combine the advantages of both to a certain extent?

Another question:
[[:^print:]] and [^[:print:]], can they both be used here?
[ please snip the 140 lines of previous context if all you have is a
simple question ]

On 21.03.2023 14:46, [email protected] wrote:
> Another question:
> [[:^print:]] and [^[:print:]], can they both be used here?

If you mean whether they are interchangeable, then No.
(Where did you get _that idea_ from, from ChatGPT?)

But this you could also have easily tested yourself.

This is very basic and you should inspect some contemporary source describing the Unix'y form of regular expressions and their syntax.

Janis
[...]
Just a few suggestions and things to ponder about. Feel free to ignore
them.
Janis