Note that I'm not talking about [incr x] in the above example, but if, e.g.,
a simple [string is integer $x] turns $x from 1_234_567 into 1234567,
this would be Not Good, since obviously [split $x "_"] would no longer
work then.
On 7/21/2023 12:00 PM, Ralf Fassel wrote:
> Note that I'm not talking about [incr x] in the above example, but if eg
> a simple [string is integer $x] will make $x = 1234567 from 1_234_567,
> this would be Not Good, since obviously [split $x "_"] would no longer
> work then.
I agree. IMO, if true, a completely unnecessary feature with limited
benefit and huge risk potential to break things as you pointed out.
Perhaps the "-strict" version will work as before and reject it?
On Friday, July 21, 2023 at 6:17:45 PM UTC+2, saitology9 wrote:
> I agree. IMO, if true, a completely unnecessary feature with limited
> benefit and huge risk potential to break things as you pointed out.
> Perhaps the "-strict" version will work as before and reject it?
This was discussed on site: no, right now -strict mode does not catch this:
% package req Tcl
9.0a4
% string is integer -strict 1_234_567
1
This should be fixed. (Also, -strict mode for "string is" should become default in Tcl 9, anyways.)
Stefan
With respect to your first question, no the string representation will
not shimmer away. This is no different than "set x 0x10" and worrying
about the string shimmering to "16" (assuming no arithmetic operations
are applied of course).
With respect to the second, the "string is integer" identifies what
strings are accepted by *Tcl* as integer, not humans. Since 1_234 is
always acceptable the command will return 1 irrespective of -strict.
As to -strict being the default in Tcl 9, I think that ship has sailed.
Iirc, it would have too much impact on Tk.
/Ashok
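A minimal sketch of the point above, using the hex literal from the post so that it also runs on Tcl versions without underscore support. Per the thread, a literal such as 1_234_567 should behave the same way on 8.7/9, but that is an assumption not exercised here.

```tcl
# Checking or using the numeric value does not rewrite the variable's string.
set x 0x10
set isInt [string is integer -strict $x]   ;# Tcl accepts 0x10 as an integer
set sum   [expr {$x + 1}]                  ;# arithmetic uses the numeric value 16
puts "x=$x isInt=$isInt sum=$sum"          ;# x still holds the string 0x10
```

The variable only gets a different string if the script stores a new value in it.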
In article <u9efpk$3brt2$[email protected]>,
Ashok <[email protected]> wrote:
> With respect to your first question, no the string representation will
> not shimmer away. This is no different than "set x 0x10" and worrying
> about the string shimmering to "16" (assuming no arithmetic operations
> are applied of course).
> With respect to the second, the "string is integer" identifies what
> strings are accepted by *Tcl* as integer, not humans. Since 1_234 is
> always acceptable the command will return 1 irrespective of -strict.
> As to -strict being the default in Tcl 9, I think that ship has sailed.
> Iirc, it would have too much impact on Tk.
> /Ashok
I'm curious how this feature was motivated. Does any other language use
this notation?
Perhaps I'm missing something, but to me it looks like a solution in
search of a problem, and something likely to have unintended
consequences.
Or maybe I'm missing something.
Ted Nolan <tednolan> <[email protected]> wrote:
> I'm curious how this feature was motivated. Does any other language use
> this notation?
At least one other language is Python:
https://peps.python.org/pep-0515/
Whether Python's usage motivated this change for Tcl I cannot say.
> Perhaps I'm missing something, but to me it looks like a solution in
> search of a problem, and something likely to have unintended
> consequences.
> Or maybe I'm missing something.
The intent appears to be to allow for a locale-independent "thousands separator" for long integer constants.
Instead of 17283747283748234
One can write 17_283_747_283_748_234
Also, according to the Python PEP, the following other languages have similar allowances for "thousands separators":
Ada: single, only between digits [8]
C# (open proposal for 7.0): multiple, only between digits [6]
C++14: single, between digits (different separator chosen) [1]
D: multiple, anywhere, including trailing [2]
Java: multiple, only between digits [7]
Julia: single, only between digits (but not in float exponent parts) [9]
Perl 5: multiple, basically anywhere, although docs say it’s
restricted to one underscore between digits [3]
Ruby: single, only between digits (although docs say “anywhere”) [10]
Rust: multiple, anywhere, except for between exponent “e” and digits [4]
Swift: multiple, between digits and trailing (although textual
description says only “between digits”) [5]
On Friday, July 21, 2023 at 8:50:16 PM UTC+2, Rich wrote:
> The intent appears to be to allow for a locale independent "thousands
> separator" for long integer constants.
> Instead of 17283747283748234
> One can write 17_283_747_283_748_234
> Also, according to the Python PEP, the following other languages have
> similar allowances for "thousands separators":
> Ada: single, only between digits [8]
> C# (open proposal for 7.0): multiple, only between digits [6]
> C++14: single, between digits (different separator chosen) [1]
> D: multiple, anywhere, including trailing [2]
> Java: multiple, only between digits [7]
> Julia: single, only between digits (but not in float exponent parts) [9]
> Perl 5: multiple, basically anywhere, although docs say it’s
> restricted to one underscore between digits [3]
> Ruby: single, only between digits (although docs say “anywhere”) [10]
> Rust: multiple, anywhere, except for between exponent “e” and digits [4]
> Swift: multiple, between digits and trailing (although textual
> description says only “between digits”) [5]
But in those languages, this syntactic sugar is thrown away early during processing; in Tcl, it keeps sitting in the stringrep. This makes a difference.
Stefan
On 7/21/2023 11:55 AM, stefan wrote:
> But in those languages, this syntactic sugar is thrown away early during processing; in Tcl, it keeps sitting in the stringrep. This makes a difference.
The _ can be used freely between digits in any way desired to make large literal numbers more readable. In particular, it's quite useful with hex and binary. With 64 bit ints, it is even more helpful.
100_000_000
0xffff_ffff
0b1111_1111_1111_1110
0b1111_1111___1111_1111___1111_1111___1111_1111
Once it is shimmered into an integer, the string representation changes:
% info patch
8.7a6
% set mask 0b1111_0000_1010
0b1111_0000_1010
% expr $mask
3850
% puts $mask
0b1111_0000_1010
% tcl::unsupported::representation $mask
value is a exprcode with a refcount of 4, object pointer at 0xffffffffac327ae0, internal representation 0xffffffffae0a38b0:0x0, string representation "0b1111_0000_1010"
% incr mask
3851
% tcl::unsupported::representation $mask
value is a wideInt with a refcount of 2, object pointer at 0xffffffffac3272d0, internal representation 0xf0b:0x3b59bd8, string representation "3851"
Do you think that the above expr $mask should do a shimmer? I don't know, perhaps. But clearly it should with the incr.
Please write a ticket if you think this is wrong.
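The transcript above can be condensed into a small sketch. It uses a plain hex literal instead of an underscore literal so that it also runs on Tcl versions without TIP 551; the assumption is that underscore literals behave the same way on 8.7/9, as the session above suggests. The variable name `copy` is only illustrative.

```tcl
set mask 0xff
set copy $mask               ;# keep the original spelling if it matters later
set low  [expr {$mask & 0x0f}]  ;# reading the value leaves the variable's string alone
puts $mask                   ;# still the string 0xff
incr mask                    ;# stores a new integer value in the variable
puts $mask                   ;# now the canonical decimal form, 256
puts $copy                   ;# the saved copy still reads 0xff
```

If the original spelling is needed later (for example for [split]), keeping a copy before any in-place arithmetic avoids the surprise.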
> The _ can be used freely between digits in any way desired to make large literal numbers more readable. In particular, it's quite useful with hex
> and binary. With 64 bit ints, it is even more helpful.
> 100_000_000
> 0xffff_ffff
> 0b1111_1111_1111_1110
> 0b1111_1111___1111_1111___1111_1111___1111_1111
I don't have access to Tcl 9 to test. So it looks like "_" can be placed anywhere as a non-enforcing separator, and the earlier statements about this being a thousands separator do not apply, I guess.
On 7/21/2023 2:23 PM, saitology9 wrote:
> I don't have access to Tcl 9 to test. So, it looks like "_" can be placed anywhere as a non-enforcing separator. So earlier statements about this being a thousand separator do not apply, I guess.
That is correct. The _ can appear anywhere and any number of times between numeric or hex digits, but not between the radix prefix (e.g. 0x) and the first digit, nor after the last digit.
TIP 551 was approved for 8.7 and that's the version I tested, as I too don't have a Tcl 9 to test. Somewhere (it escapes my memory where) there are single-file 8.7 binaries for Windows and Linux. It might have been on GitHub that I found one.
The motivation for the TIP began with some discussions here on clt 5-6 years ago. The list of languages that support this was lifted from a page in the Perl universe. I first learned of its use in Ada.
Brian Griffin did the actual implementation.
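The placement rule as described in this post can be written down as a regular expression. This is only a sketch of the rule as stated here, not Tcl's actual number parser: the proc name `underscore_placement_ok` is hypothetical, and it covers only decimal, 0x and 0b integers (no octal, no floats).

```tcl
# Hypothetical helper: one or more "_" are allowed only between two digits,
# never right after the radix prefix and never after the last digit.
proc underscore_placement_ok {s} {
    set dec {[0-9]+(?:_+[0-9]+)*}
    set hex {0[xX][0-9a-fA-F]+(?:_+[0-9a-fA-F]+)*}
    set bin {0[bB][01]+(?:_+[01]+)*}
    return [regexp -- "^\[-+\]?(?:$hex|$bin|$dec)\$" $s]
}

foreach s {100_000_000 0xffff_ffff 0b1111_1111___1111_1111 0x_ff 123_ _123} {
    puts [format "%-26s %d" $s [underscore_placement_ok $s]]
}
```

The first three strings follow the rule; the last three put an underscore after the prefix, at the end, or at the start, and are rejected by this sketch.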
* et99 <[email protected]>
| Forgot to address the OP's example:
| % set a 123_456_789
| 123_456_789
| % string is integer $a
| 1
| % tcl::unsupported::representation $a
| value is a wideInt with a refcount of 4, object pointer at 0xfffffffff62a0190, internal representation 0x75bcd15:0x0, string representation "123_456_789"
| % split $a _
| 123 456 789
Thanks, that last part is what I was hoping for.
R'
| ["_" as thousands separator in numbers]
| I'm curious how this feature was motivated. Does any other language use
| this notation?
C++ has something similar since c++14 with the
int i = 18'446'744;
notation, but here this is not a problem, since the variable is never interpreted as a string.
R'
In article <[email protected]>, Ralf Fassel <[email protected]> wrote:
> C++ has something similar since c++14 with the
> int i = 18'446'744;
> notation, but here this is not a problem, since the variable is never
> interpreted as a string.
Yikes! I really don't like that. Makes me glad I don't c++
On 22.07.2023 at 18:24, Ted Nolan <tednolan> wrote:
> Yikes! I really don't like that. Makes me glad I don't c++
I want to express that I am also very unhappy with the change.
Rolf gave details in his porting speech in Vienna.
I would love to have something like "string is integernounderscoreandnoprefix" to check whether a string is OK for external processing, so that these are not accepted: "0xab", "0d12", "12_34".
I have a déjà vu about all the "scan $d %d" calls that were necessary due to the octal interpretation in Tcl 8.x. Those will not be necessary for 9, but other
new issues are coming...
Anyway, Tcl 9 is great!
Harald
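Until something like that exists, a regexp check can stand in for it. The following is a minimal sketch; `is_plain_decimal` is a hypothetical name, not a Tcl built-in, and it deliberately accepts leading zeros (such as 0123), which a stricter variant might want to reject.

```tcl
# Hypothetical helper: optional sign followed by ASCII decimal digits only.
# No radix prefix (0x, 0d, 0b, 0o) and no underscores are accepted.
proc is_plain_decimal {s} {
    return [regexp -- {^[-+]?[0-9]+$} $s]
}

foreach s {1234 -12 0xab 0d12 12_34} {
    puts [format "%-6s %d" $s [is_plain_decimal $s]]
}
```

This checks the spelling of the string rather than asking Tcl whether it can parse it as a number, which is the distinction the thread keeps coming back to.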
| What about going the other way? Suppose I have a large integer number in
| TCL, and I want to print it out with commas (or any other character) so
| that it is more legible? I.e., :
| I have: 123456789
| I want: 123,456,789
https://wiki.tcl-lang.org/page/Delimiting+Numbers
has some examples for that, though maybe TCL 9 has something built-in?
In article <[email protected]>, Ralf Fassel <[email protected]> wrote:
> * [email protected] (Kenny McCormack)
> | What about going the other way? Suppose I have a large integer number in
> | TCL, and I want to print it out with commas (or any other character) so
> | that it is more legible? I.e., :
> | I have: 123456789
> | I want: 123,456,789
> https://wiki.tcl-lang.org/page/Delimiting+Numbers
Yuck. So, there's no builtin way to do it (yet). You have to do it in
code. The examples at the URL seem awfully complex.
For what it is worth, here's how I do it in AWK; it seems something similar ought to be do-able in Tcl:
The other direction is easier, as it is under your control.
This is what I use, and it should replace your awk script. You can use
a period instead of a comma by supplying it as the second argument:
proc format_nums {num {sep ,}} {
    # Each pass inserts $sep before the last three digits of the leading digit run.
    while {[regsub -- {^([-+]?\d+)(\d\d\d)} $num "\\1$sep\\2" num]} {}
    return $num
}
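As a usage sketch, the proc above (repeated here so the snippet runs on its own) can also produce the underscore spelling discussed in this thread, and `string map` takes the separators out again. The variable names `grouped` and `plain` are only illustrative.

```tcl
proc format_nums {num {sep ,}} {
    while {[regsub -- {^([-+]?\d+)(\d\d\d)} $num "\\1$sep\\2" num]} {}
    return $num
}

puts [format_nums 123456789]       ;# default separator is a comma
puts [format_nums -1234567 .]      ;# period as separator, sign is kept

# Round trip with "_" as the separator.
set grouped [format_nums 123456789 _]
set plain   [string map {_ {}} $grouped]
puts "$grouped -> $plain"
```

Stripping with `string map` is a pure string operation, so it does not depend on whether the running Tcl accepts underscores in numbers.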
stefan <[email protected]> wrote:
> % package req Tcl
> 9.0a4
> % string is integer -strict 1_234_567
> 1
> This should be fixed.
Yes! It should be fixed! In the particular script. ;-)
Namely, by not using "string is integer" when a regexp check is desired.