In a couple recent versions of Python (including 3.8 and 3.10), the following code:
import re
print(re.sub(".*", "replacement", "pattern"))
yields the output "replacementreplacement".
This behavior does not occur in 3.6.
Which behavior is the desired one? Perhaps relatedly, I noticed that even
in 3.6, the code
print(re.findall(".*","pattern"))
yields ['pattern',''] which is not what I was expecting.
For what it's worth, there's some discussion about this in this Github
Alexander Richert - NOAA Affiliate via Python-list wrote on 28/12/2022 at 19:42:
[...]
The documentation for re.sub() and re.findall() has these notes:
"Changed in version 3.7: Empty matches for the pattern are replaced
when adjacent to a previous non-empty match." and "Changed in version
3.7: Non-empty matches can now start just after a previous empty match."
That probably describes the behavior you're seeing. ".*" first
matches "pattern", which is a non-empty match; then it matches the
empty string at the end, which is an empty match but is replaced
because it is adjacent to a non-empty match.
Seems somewhat counter-intuitive to me, but AFAICS it's the intended behavior.
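To make the two matches visible, here is a small sketch (my own illustration, not from the docs) that lists every match of ".*" in "pattern" along with its span, then runs the substitution. It assumes Python 3.7 or later for the re.sub() result.

```python
import re

# Collect every match of ".*" in "pattern" together with its (start, end) span.
matches = [(m.group(), m.span()) for m in re.finditer(".*", "pattern")]
print(matches)  # [('pattern', (0, 7)), ('', (7, 7))]

# On Python 3.7+, the empty match at position 7 is replaced as well,
# because it is adjacent to the previous non-empty match.
result = re.sub(".*", "replacement", "pattern")
print(result)  # replacementreplacement
```

The first match consumes the whole string; the second is the empty match at the very end, which is the one that was not replaced before 3.7.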
On 2022-12-28 18:42, Alexander Richert - NOAA Affiliate via Python-list wrote:
[...]
print(re.sub(".*", "replacement", "pattern"))
yields the output "replacementreplacement".
It's not a bug, it's a change in behaviour to bring it more into line with other regex implementations in other languages.
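If a single replacement is what's wanted, a few variations should avoid the trailing empty match. This is only a sketch of options I'm aware of, not something suggested in the thread; the variable names are just for illustration.

```python
import re

text = "pattern"

# Require at least one character, so the empty match at the end cannot occur.
one_or_more = re.sub(".+", "replacement", text)

# Anchor at the start: without re.MULTILINE, "^" only matches at position 0,
# so the empty match at the end of the string is rejected.
anchored = re.sub("^.*", "replacement", text)

# Keep the original pattern but stop after the first replacement.
first_only = re.sub(".*", "replacement", text, count=1)

print(one_or_more, anchored, first_only)
```

Note that these are not equivalent for every input: for example, ".+" leaves an empty string unchanged, whereas the other two would replace it.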