• The difference between strtol() and strtoul() ?

    From Kenny McCormack@21:1/5 to All on Thu Jun 20 14:06:45 2024
    Interestingly, I note that strtoul() accepts strings that begin with a sign
    (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    Further, although the (Linux) man page is more than a bit murky on the
    subject, it seems that the result of parsing, say, "-1", with strtoul() is
    the largest unsigned value (usually, 2**N-1 or a lot of F's (in hex)).
    Whereas, I would expect it to be 1 (i.e., just take the absolute value).

    Comments? I find this all very counterintuitive.

    (*) Or should I say, "one would" ?

    P.S. Why isn't there a strtoi() or strtou() ? I know, of course, that
    there is atoi(), but that doesn't have the error checking capability that
    the strto* functions have.

    --
    If you think you have any objections to anything I've said above, please navigate to this URL:
    http://www.xmission.com/~gazelle/Truth
    This should clear up any misconceptions you may have.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lew Pitcher@21:1/5 to Kenny McCormack on Thu Jun 20 14:48:53 2024
    On Thu, 20 Jun 2024 14:06:45 +0000, Kenny McCormack wrote:

    Interestingly, I note that strtoul() accepts strings that begin with a sign (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    IIUC, the ISO C standard does not make a distinction between strings that
    make sense for an unsigned long vs strings that make sense for a signed long. The standard says (with regards to the strtol, strtoll, strtoul, and strtoull functions):
    "... the expected form of the subject sequence is a sequence of letters
    and digits representing an integer with the radix specified by base,
    optionally preceded by a plus or minus sign ... . If the value of base
    is 16, the characters 0x or 0X may optionally precede the sequence of
    letters and digits, following the sign if present."
    so, it appears that the ISO C standard permits the input string to specify
    a sign, even if the resulting conversion does not.


    Further, although the (Linux) man page is more than a bit murky on the subject, it seems that the result of parsing, say, "-1", with strtoul() is the largest unsigned value (usually, 2**N-1 or a lot of F's (in hex)). Whereas, I would expect it to be 1 (i.e., just take the absolute value).

    Why would you expect that? Again, the ISO standard says:
    "If the subject sequence has the expected form ... it is used as the base
    for conversion, ascribing to each letter its value ... . If the subject
    sequence begins with a minus sign, the value resulting from the conversion
    is negated (in the return type)."
    and
    "If the correct value is outside the range of representable values, LONG_MIN,
    LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned
    (according to the return type and sign of the value, if any) ... ."
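
    The two rules quoted above can be observed with a small helper. This is
    only a sketch: parse_ul() is a hypothetical name, not a library function,
    and base 10 is assumed.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse s with strtoul() in base 10 and report
 * whether the call set errno to ERANGE. */
unsigned long parse_ul(const char *s, int *range_error)
{
    errno = 0;                       /* strtoul() never clears errno itself */
    unsigned long v = strtoul(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}
```

    With this helper, "-1" should come back as ULONG_MAX with no range error:
    the digits "1" are in range, and the negation is then done in unsigned long.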


    Comments? I find this all very counterintuitive.

    I can't comment on /your/ internalization of the standards and expected behaviour. But, the standard makes sense (in an eccentric sort of way)
    to me, in that the defining distinction of the various strto*l() functions
    is not the format of the input, but the format of the output of the function.

    (*) Or should I say, "one would" ?

    P.S. Why isn't there a strtoi() or strtou() ? I know, of course, that
    there is atoi(), but that doesn't have the error checking capability that
    the strto* functions have.




    --
    Lew Pitcher
    "In Skills We Trust"

  • From Scott Lurndal@21:1/5 to Kenny McCormack on Thu Jun 20 14:46:52 2024
    [email protected] (Kenny McCormack) writes:
    Interestingly, I note that strtoul() accepts strings that begin with a sign (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    The strtoul/strtoull function semantics match the C language semantics.

    $ cat /tmp/a.c
    #include <stdio.h>

    int main(void)
    {
        /* Unary minus on an unsigned operand wraps modulo ULONG_MAX + 1. */
        unsigned long v = -1ul;

        printf("0x%lx\n", v);
        return 0;
    }
    $ cc -Wall -Werror -o /tmp/a /tmp/a.c
    $ /tmp/a
    0xffffffffffffffff
    $

  • From Lew Pitcher@21:1/5 to Kenny McCormack on Thu Jun 20 15:26:51 2024
    On Thu, 20 Jun 2024 14:06:45 +0000, Kenny McCormack wrote:

    [snip]

    P.S. Why isn't there a strtoi() or strtou() ? I know, of course, that
    there is atoi(), but that doesn't have the error checking capability that
    the strto* functions have.

    I don't know, but I'd /guess/ that, because the strto*l() functions return
    a value that can easily be range-checked and (possibly) truncated to fit in
    an int, the ISO committee didn't see any reason to add another set of specialized functions.
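
    As a rough illustration of that range check, a strtoi()-like wrapper can
    be built on strtol(). The name parse_int() and its 0 / -1 return
    convention are assumptions made for this sketch only.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical strtoi()-style helper built on strtol().
 * Returns 0 on success and stores the value in *out; returns -1 if no
 * digits were found, trailing characters remain, or the value does not
 * fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 10);

    if (end == s || *end != '\0')
        return -1;                   /* nothing parsed, or junk after digits */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return -1;                   /* out of range for long or for int */
    *out = (int)v;
    return 0;
}
```

    Rejecting out-of-range input is one policy; a caller could equally clamp
    to INT_MIN / INT_MAX at the same point.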

    --
    Lew Pitcher
    "In Skills We Trust"

  • From Kaz Kylheku@21:1/5 to Kenny McCormack on Thu Jun 20 22:55:01 2024
    On 2024-06-20, Kenny McCormack <[email protected]> wrote:
    Interestingly, I note that strtoul() accepts strings that begin with a sign (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    unsigned int x = -42; // well-defined result: UINT_MAX - 41 (reduced modulo UINT_MAX + 1)

    These functions seem to be geared toward the C language (perhaps writing compilers or tooling for C). Note that these functions recognize
    a leading zero for octal when base is specified as zero, and also
    recognize the 0x prefix when base is 0 or 16.

    So it is unsurprising that the unsigned functions would accept
    negative values and do the modulo reduction.
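
    A minimal sketch of the base-0 behaviour described above; parse_auto() is
    a hypothetical wrapper name used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: with base 0 the prefix selects the radix, as in
 * C source: "0x"/"0X" for hex, a leading "0" for octal, else decimal. */
unsigned long parse_auto(const char *s)
{
    return strtoul(s, NULL, 0);
}
```

    Under this sketch "010" parses as 8, "0x10" as 16 and "10" as 10, and a
    sign may still precede the prefix, as in "-0x1".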

    Further, although the (Linux) man page is more than a bit murky on the subject, it seems that the result of parsing, say, "-1", with strtoul() is the largest unsigned value (usually, 2**N-1 or a lot of F's (in hex)). Whereas, I would expect it to be 1 (i.e., just take the absolute value).

    Comments? I find this all very counterintuitive.

    (*) Or should I say, "one would" ?

    P.S. Why isn't there a strtoi() or strtou() ? I know, of course, that
    there is atoi(), but that doesn't have the error checking capability that
    the strto* functions have.

    I suspect, because, at the time strtol was introduced, long was the
    widest integer type.

    When designing an integer parsing function, why would you not just
    have one function, working with the widest type?

    Unfortunately, though, strtoll later had to be added.

    If strtol didn't exist today, making it necessary to invent it or
    something like it, that function should use the intmax_t type.
    Then there wouldn't be any need to add new variants going forward.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Kenny McCormack@21:1/5 to [email protected] on Thu Jun 20 23:35:37 2024
    In article <[email protected]>,
    Kaz Kylheku <[email protected]> wrote:
    ...
    If strtol didn't exist today, making it necessary to invent it or
    something like it, that function should use the intmax_t type.
    Then there wouldn't be any need to add new variants going forward.

    There actually is.

    STRTOIMAX(3)          Linux Programmer's Manual          STRTOIMAX(3)

    NAME
        strtoimax, strtoumax - convert string to integer

    SYNOPSIS
        #include <inttypes.h>

        intmax_t strtoimax(const char *nptr, char **endptr, int base);
        uintmax_t strtoumax(const char *nptr, char **endptr, int base);

    DESCRIPTION
        These functions are just like strtol(3) and strtoul(3), except that
        they return a value of type intmax_t and uintmax_t, respectively.

    --
    Conservatives want smaller government for the same reason criminals want fewer cops.

  • From Kenny McCormack@21:1/5 to Kenny McCormack on Fri Jun 21 13:58:01 2024
    In article <v51d1l$2fklr$[email protected]>,
    Kenny McCormack <[email protected]> wrote:
    Interestingly, I note that strtoul() accepts strings that begin with a sign (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    There have been some useful responses on this thread, which is Good. Of course, there have also been the usual crappola-type responses, but one must learn to take the good with the bad.

    Anyway, I think the takeaway is that while it is what it is, an argument
    can certainly be made that it would have been better for the unsigned
    versions of these functions to not accept signed input. If I were designing
    it, I would have had strtoul("-1") be a syntax error (not a C language
    syntax error - but a meta-language syntax error) - or, if not that, then
    have it return 1, not 2**N-1. But that's just me.

    I appreciate the responses indicating that it was probably done the way it
    was for actually both of these reasons:
    1) Because it makes it more useful for C compiler writers - who were
    seen as the primary audience.
    2) Because it means that the two functions are literally the same code.
    Both calculate the same bit pattern - the difference is only in the
    caller's interpretation of the result.

    P.S. Why isn't there a strtoi() or strtou() ? I know, of course, that
    there is atoi(), but that doesn't have the error checking capability that
    the strto* functions have.

    Yeah, now I get it. You really only need strtoimax() and strtoumax().

    A result of any smaller type can be obtained by calling one of these
    functions and storing the result in an object of the smaller type.
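
    A sketch of that idea for one smaller type, assuming the implementation
    provides int16_t. The helper name parse_i16() and its return convention
    are made up for the example; a plain assignment without the range check
    would not preserve out-of-range values.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>

/* Hypothetical helper: parse a decimal int16_t via strtoimax().
 * Returns 0 on success, -1 on malformed or out-of-range input. */
int parse_i16(const char *s, int16_t *out)
{
    char *end;

    errno = 0;
    intmax_t v = strtoimax(s, &end, 10);

    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;
    if (v < INT16_MIN || v > INT16_MAX)
        return -1;                   /* would not survive the narrowing */
    *out = (int16_t)v;
    return 0;
}
```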

    --
    The randomly chosen signature file that would have appeared here is more than 4 lines long. As such, it violates one or more Usenet RFCs. In order to remain in compliance with said RFCs, the actual sig can be found at the following URL:
    http://user.xmission.com/~gazelle/Sigs/GodDelusion

  • From Michael S@21:1/5 to Michael S on Fri Jun 21 18:53:14 2024
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    A result of any smaller type can be obtained by calling one of these functions and storing the result in an object of the smaller type.


    Or check for range and handle out of range values as appropriate by situation.



    BTW, I don't know what The Standard says about out-of-range inputs, but
    at least https://en.cppreference.com/w/c/string/byte/strtol does not
    say anything certain, especially about what is stored in *str_end.

  • From Michael S@21:1/5 to Kenny McCormack on Fri Jun 21 19:00:08 2024
    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    In article <v51d1l$2fklr$[email protected]>,
    2) Because it means that the two functions are literally the same
    code. Both calculate the same bit pattern - the difference is only in
    the caller's interpretation of the result.


    In the implementation that I just tested, strtoll and strtoull are not the
    same. They deliver different answers when the input is out of range.
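
    The difference can be reproduced with two thin wrappers. The names
    as_signed() and as_unsigned() are hypothetical, and base 10 is assumed.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helpers: convert the same text with the signed and the
 * unsigned function, recording whether each call set errno to ERANGE. */
long long as_signed(const char *s, int *range_error)
{
    errno = 0;
    long long v = strtoll(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}

unsigned long long as_unsigned(const char *s, int *range_error)
{
    errno = 0;
    unsigned long long v = strtoull(s, NULL, 10);
    *range_error = (errno == ERANGE);
    return v;
}
```

    For a digit string too large for either type, as_signed() should return
    LLONG_MAX and as_unsigned() should return ULLONG_MAX, both with the range
    flag set, so the two results are different bit patterns.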

  • From Michael S@21:1/5 to Kenny McCormack on Fri Jun 21 18:28:39 2024
    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and strtoumax().


    Which are? unfortunately, not part of C standard.

    A result of any smaller type can be obtained by calling one of these functions and storing the result in an object of the smaller type.


    Or check for range and handle out of range values as appropriate by
    situation.

  • From Scott Lurndal@21:1/5 to Michael S on Fri Jun 21 16:14:58 2024
    Michael S <[email protected]> writes:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    A result of any smaller type can be obtained by calling one of these
    functions and storing the result in an object of the smaller type.


    Or check for range and handle out of range values as appropriate by
    situation.



    BTW, I don't know what The Standard says about out-of-range inputs, but
    at least https://en.cppreference.com/w/c/string/byte/strtol does not
    say anything certain, especially about what is stored in *str_end.

    SuS defines ERANGE as the errno returned if the converted value is out of range.

    https://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoull.html

  • From Scott Lurndal@21:1/5 to Scott Lurndal on Fri Jun 21 16:54:33 2024
    [email protected] (Scott Lurndal) writes:
    Michael S <[email protected]> writes:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    A result of any smaller type can be obtained by calling one of these
    functions and storing the result in an object of the smaller type.


    Or check for range and handle out of range values as appropriate by
    situation.



    BTW, I don't know what The Standard says about out-of-range inputs, but
    at least https://en.cppreference.com/w/c/string/byte/strtol does not
    say anything certain, especially about what is stored in *str_end.

    SuS defines ERANGE as the errno returned if the converted value is out of range.

    https://pubs.opengroup.org/onlinepubs/9699919799/functions/strtoull.html

    It should be quite clear what is stored at endptr in all cases from the
    POSIX description.

  • From Ben Bacarisse@21:1/5 to Michael S on Fri Jun 21 18:15:07 2024
    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    A result of any smaller type can be obtained by calling one of these
    functions and storing the result in an object of the smaller type.


    Or check for range and handle out of range values as appropriate by
    situation.

    BTW, I don't know what The Standard says about out-of-range inputs, but
    at least https://en.cppreference.com/w/c/string/byte/strtol does not
    say anything certain, especially about what is stored in *str_end.

    It says what value should be returned. That's something certain!

    As for what gets put into *str_end that page could be clearer. The
    standard says that a pointer just past the last of the digits is stored, provided the input has the right form (spaces, sign, prefix, digits).
    The cppreference page says a pointer just past "the last numeric
    character interpreted" which begs the question of what "interpreted"
    means when the result is possibly out of range. Maybe saying "scanned"
    rather than interpreted would be better. The end pointer always points
    just past any syntactically valid characters, even when the result is
    out of range.
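
    That behaviour of the end pointer can be checked with a short helper.
    consumed() is a hypothetical name; it simply reports how far strtol()
    advanced in base 10.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: return how many characters strtol() consumed
 * from s in base 10, and report whether the value was out of range. */
size_t consumed(const char *s, int *range_error)
{
    char *end;

    errno = 0;
    (void)strtol(s, &end, 10);       /* only the end pointer matters here */
    *range_error = (errno == ERANGE);
    return (size_t)(end - s);
}
```

    For forty 9s followed by "xyz" this should report 40 consumed characters
    together with a range error: the digits are all scanned even though the
    value cannot be represented.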

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Michael S on Fri Jun 21 18:02:28 2024
    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and strtoumax().

    Which are? unfortunately, not part of C standard.

    Not sure if that '?' is just a typo. Anyway, yes they are both part of
    the C standard.

    --
    Ben.

  • From Kenny McCormack@21:1/5 to [email protected] on Fri Jun 21 18:43:29 2024
    In article <v54hc0$39bpi$[email protected]>,
    James Kuyper <[email protected]> wrote:
    On 6/21/24 11:53, Michael S wrote:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    They have been part of the C standard since C99.

    To some people, "Standard C" means C89.

    Everything after that is, like POSIX, just fluffy nonsense.

    --
    12% of Americans think that Joan of Arc was Noah's wife.

  • From James Kuyper@21:1/5 to Michael S on Fri Jun 21 14:38:56 2024
    On 6/21/24 11:53, Michael S wrote:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    They have been part of the C standard since C99.

    BTW, I don't know what The Standard says about out-of-range inputs, but
    at least https://en.cppreference.com/w/c/string/byte/strtol does not
    say anything certain, especially about what is stored in *str_end.

    "The strtoimax and strtoumax functions are equivalent to the strtol,
    strtoll, strtoul, and strtoull functions, except that the initial
    portion of the string is converted to intmax_t and uintmax_t
    representation, respectively." (7.8.2.3p2)

    You need to go to the descriptions of those other functions to get the
    detailed specifications.

    "If the correct value is outside the range of representable values,
    LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is
    returned (according to the return type and sign of the value, if any),
    and the value of the macro ERANGE is stored in errno."

    As I understand it, that means that if the input string represents a
    value outside of the range of representable values, then strtoimax()
    should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and strtoumax() should return UINTMAX_MAX. Both of them should store the
    value of ERANGE in errno, to distinguish these results from what you
    would get if the string happened to represent those values.
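
    In code, that distinction relies on the caller clearing errno before the
    call. A minimal sketch, with umax_overflowed() as a hypothetical helper
    name:

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true only if strtoumax() reported overflow.
 * The return value alone cannot tell, because UINTMAX_MAX is also the
 * result for a string that spells out UINTMAX_MAX exactly. */
bool umax_overflowed(const char *s, uintmax_t *out)
{
    errno = 0;                       /* the library never resets errno to 0 */
    *out = strtoumax(s, NULL, 10);
    return errno == ERANGE;
}
```

    Both an overflowing string and the exact decimal spelling of UINTMAX_MAX
    store UINTMAX_MAX in *out; only the first should make the helper return
    true.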


    The C standard uses endptr rather than str_end in its description of
    these functions.

    "... First, they decompose the input string into three parts: an
    initial, possibly empty, sequence of white-space characters, a subject
    sequence resembling an integer represented in some radix determined by
    the value of base, and a final string of one or more unrecognized
    characters, including the terminating null character of the input
    string. ..." (7.21.4.7p2).

    That defines what the "final string" is.

    "If the subject sequence has the expected form, ... A pointer to the
    final string is stored in the object pointed to by endptr, provided that
    endptr is not a null pointer." (7.24.1.7p5).

    "If the subject sequence is empty or does not have the expected form ...
    the value of nptr is stored in the object pointed to by endptr, provided
    that endptr is not a null pointer." (7.21.4.7p7)

    That seems very precise and unambiguous to me, aside from what "the
    expected form" is, which is described elsewhere.

  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Sat Jun 22 06:43:45 2024
    On Fri, 21 Jun 2024 10:38:51 -0700, Keith Thompson wrote:

    strto[u]l[l] are declared in <stdlib.h>; strtoimax and strtoumax are
    declared in <inttypes.h>, which can make them easy to miss.

    The first thing I do is check the man pages <https://manpages.debian.org/3/strtoimax.3.en.html>:

    STANDARDS

    POSIX.1-2001, POSIX.1-2008, C99.

  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Sat Jun 22 06:44:38 2024
    On Fri, 21 Jun 2024 16:54:33 GMT, Scott Lurndal wrote:

    It should be quite clear what is stored at endptr in all cases from the
    POSIX description.

    You really need to be checking the C spec, just in case.

  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Sat Jun 22 15:16:24 2024
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Fri, 21 Jun 2024 16:54:33 GMT, Scott Lurndal wrote:

    It should be quite clear what is stored at endptr in all cases from the
    POSIX description.

    You really need to be checking the C spec, just in case.

    No, I don't. The posix document clearly states that the text
    is from ISO C (and clearly marks any extensions).

    You really need to control the need to reply to every post.

  • From Michael S@21:1/5 to Keith Thompson on Sat Jun 22 21:04:38 2024
    On Fri, 21 Jun 2024 10:38:51 -0700
    Keith Thompson <[email protected]> wrote:

    Ben Bacarisse <[email protected]> writes:
    Michael S <[email protected]> writes:
    [...]
    Which are? unfortunately, not part of C standard.

    Not sure if that '?' is just a typo. Anyway, yes they are both
    part of the C standard.

    strto[u]l[l] are declared in <stdlib.h>; strtoimax and strtoumax are
    declared in <inttypes.h>, which can make them easy to miss.


    Maybe that is the reason. But frankly, I expected that
    cppreference.com will do better. At a minimum, strtoimax should have
    been listed in the "See also" section on this page: https://en.cppreference.com/w/c/string/byte/strtol

  • From Michael S@21:1/5 to James Kuyper on Sat Jun 22 21:18:35 2024
    On Fri, 21 Jun 2024 14:38:56 -0400
    James Kuyper <[email protected]> wrote:

    On 6/21/24 11:53, Michael S wrote:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    They have been part of the C standard since C99.

    BTW, I don't know what The Standard says about out-of-range inputs,
    but at least https://en.cppreference.com/w/c/string/byte/strtol
    does not say anything certain, especially about what is stored in
    *str_end.

    "The strtoimax and strtoumax functions are equivalent to the strtol,
    strtoll, strtoul, and strtoull functions, except that the initial
    portion of the string is converted to intmax_t and uintmax_t
    representation, respectively." (7.8.2.3p2)

    You need to go to the descriptions of those other functions to get the detailed specifications.

    "If the correct value is outside the range of representable values,
    LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any),
    and the value of the macro ERANGE is stored in errno."

    As I understand it, that means that if the input string represents a
    value outside of the range of representable values, then strtoimax()
    should return INTMAX_MIN or INTMAX_MAX, depending upon the sign, and strtoumax() should return UINTMAX_MAX. Both of them should store the
    value of ERANGE in errno, to distinguish these results from what you
    would get if the string happened to represent those values.


    That is what my implementation does, but I cannot see how it
    follows from the text, especially for out-of-range negative input
    to the strtou*() functions.
    It creates a rather non-intuitive discontinuity:
    strtoull("-18446744073709551615") => 1
    strtoull("-18446744073709551616") => 18446744073709551615
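
    One reading that reproduces both results is that the range check applies
    to the magnitude of the digits, and the negation (modulo ULLONG_MAX + 1)
    happens only afterwards. The sketch below models that reading for base 10
    without leading white space. It is not the code of any real library, and
    model_strtoull10() and agrees_with_library() are hypothetical names.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Model: parse the magnitude, clamp on overflow, negate last. */
unsigned long long model_strtoull10(const char *s, int *range_error)
{
    int neg = 0;
    unsigned long long v = 0;

    *range_error = 0;
    if (*s == '+' || *s == '-')
        neg = (*s++ == '-');
    for (; *s >= '0' && *s <= '9'; s++) {
        unsigned d = (unsigned)(*s - '0');
        if (v > (ULLONG_MAX - d) / 10)
            *range_error = 1;        /* magnitude no longer fits */
        else
            v = v * 10 + d;
    }
    if (*range_error)
        return ULLONG_MAX;           /* clamped regardless of the sign */
    return neg ? 0 - v : v;          /* unsigned negation wraps around */
}

/* Compare the model with the C library for one input string. */
int agrees_with_library(const char *s)
{
    int model_err;
    unsigned long long m = model_strtoull10(s, &model_err);

    errno = 0;
    unsigned long long lib = strtoull(s, NULL, 10);
    return m == lib && model_err == (errno == ERANGE);
}
```

    Assuming a 64-bit unsigned long long, the magnitude 18446744073709551615
    still fits and negates to 1, while 18446744073709551616 does not fit and
    is clamped to ULLONG_MAX, which is where the jump comes from.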


    The C standard uses endptr rather than str_end in its description of
    these functions.

    "... First, they decompose the input string into three parts: an
    initial, possibly empty, sequence of white-space characters, a subject sequence resembling an integer represented in some radix determined by
    the value of base, and a final string of one or more unrecognized
    characters, including the terminating null character of the input
    string. ..." (7.21.4.7p2).

    That defines what the "final string" is.

    "If the subject sequence has the expected form, ... A pointer to the
    final string is stored in the object pointed to by endptr, provided
    that endptr is not a null pointer." (7.24.1.7p5).

    "If the subject sequence is empty or does not have the expected form
    ... the value of nptr is stored in the object pointed to by endptr,
    provided that endptr is not a null pointer." (7.21.4.7p7)

    That seems very precise and unambiguous to me, aside from what "the
    expected form" is, which is described elsewhere.

    Yes, this part of the description is good and unambiguous.
    I wonder why cppreference.com chose the less clear wording "The functions set the pointer pointed to by str_end to point to the
    character past the last numeric character interpreted."

  • From Kaz Kylheku@21:1/5 to Kenny McCormack on Sat Jun 22 22:07:58 2024
    On 2024-06-21, Kenny McCormack <[email protected]> wrote:
    In article <v51d1l$2fklr$[email protected]>,
    Kenny McCormack <[email protected]> wrote:
    Interestingly, I note that strtoul() accepts strings that begin with a sign (+ or -). This is odd, since you'd (*) think that a sign (particularly, a minus) would be a syntax error in parsing for an unsigned value.

    There have been some useful responses on this thread, which is Good. Of course, there have also been the usual crappola-type responses, but one must learn to take the good with the bad.

    Anyway, I think the takeaway is that while it is what it is, an argument
    can certainly be made that it would have been better for the unsigned versions of these function to not accept signed input. If I were designing it, I would have had strtoul("-1") be a syntax error (not a C language
    syntax error - but a meta-language syntax error) - or, if not that, then
    have it return 1, not 2**N-1. But that's just me.

    An alternative would be for the current minus handling behavior to apply
    when the base is specified as zero, which is where the other hacks are
    like leading 0 for octal and 0x for hexadecimal (that one also
    recognized in base 16).

    I appreciate the responses indicating that it was probably done the way it was for actually both of these reasons:
    1) Because it makes it more useful for C compiler writers - who were
    seen as the primary audience.
    2) Because it means that the two functions are literally the same code.
    Both calculate the same bit pattern - the difference is only in the
    caller's interpretation of the result.

    3) The behavior is also useful for IT people who understand two's
    complement computer arithmetic:

    voipserver --debug-mask=-1 # more convenient than --debug-mask=0xFFFFFFFF

    It's why the 0x prefix is supported when base is 0, and also octal.

    It supports not only compiler writing but system utilities.
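
    A sketch of how such an option value might be parsed. The 32-bit mask
    width, the parse_mask() name and the 0 / -1 return convention are
    assumptions for this example, not details of any real utility.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser for an option like --debug-mask=VALUE.
 * Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal; a leading
 * minus wraps modulo ULONG_MAX + 1, so "-1" sets every bit.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
int parse_mask(const char *s, uint32_t *mask)
{
    char *end;

    errno = 0;
    unsigned long v = strtoul(s, &end, 0);

    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;
    *mask = (uint32_t)v;             /* keep the low 32 bits */
    return 0;
}
```

    With this sketch "-1" and "0xFFFFFFFF" should produce the same mask.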

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Sat Jun 22 23:21:43 2024
    On Sat, 22 Jun 2024 15:16:24 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <[email protected]d> writes:
    On Fri, 21 Jun 2024 16:54:33 GMT, Scott Lurndal wrote:

    It should be quite clear what is stored at endptr in all cases from the
    POSIX description.

    You really need to be checking the C spec, just in case.

    No, I don't.

    It is the authoritative reference.

  • From James Kuyper@21:1/5 to Scott Lurndal on Sat Jun 22 20:10:32 2024
    On 6/22/24 11:16, Scott Lurndal wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    ...
    You really need to be checking the C spec, just in case.

    No, I don't. The posix document clearly states that the text
    is from ISO C (and clearly marks any extensions).

    It also clearly states:
    "The functionality described on this reference page is aligned with the
    ISO C standard. Any conflict between the requirements described here and
    the ISO C standard is unintentional. This volume of POSIX.1-2017 defers
    to the ISO C standard."

    This tells you two important things: they believe that there's a small
    but significant chance of their description being unintentionally in
    conflict with the C standard. And, if that is the case, POSIX defers to C. You're better off reading the original than the thing that is supposed
    to be a faithful copy, but might not be.

  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Sat Jun 22 23:22:28 2024
    On Sat, 22 Jun 2024 21:04:38 +0300, Michael S wrote:

    But frankly, I expected that cppreference.com will do better.

    This is why we have authoritative references.

  • From Michael S@21:1/5 to Kenny McCormack on Sun Jun 23 11:47:56 2024
    On Fri, 21 Jun 2024 18:43:29 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    In article <v54hc0$39bpi$[email protected]>,
    James Kuyper <[email protected]> wrote:
    On 6/21/24 11:53, Michael S wrote:
    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are? unfortunately, not part of C standard.

    They have been part of the C standard since C99.

    To some people, "Standard C" means C89.


    That is not my case.
    I was sincerely mistaken.


    Everything after that is, like POSIX, just fluffy nonsense.


    I don't think that POSIX is fluffy nonsense. I do know, however, that
    POSIX is irrelevant for the overwhelming majority of the C programming
    that I do at work.
    Newer C standards are significantly more relevant, especially their
    language features.
    For library features, the C Standard is relevant in the sense that if a
    particular standard function exists in the library that I use, then it
    very likely matches the semantics of the standard.
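
    For readers wondering what a "strtoi()" built on strtoimax() might look
    like, here is a minimal sketch. The name parse_int_or() and its fallback
    parameter are made up for illustration; only strtoimax(), errno and the
    <limits.h> macros are standard (C99 and later).

```c
#include <errno.h>
#include <inttypes.h>
#include <limits.h>

/* Hypothetical helper, not a standard function: parse the whole string
   as a decimal int, or return `fallback` if that is not possible. */
int parse_int_or(const char *s, int fallback)
{
    char *end;
    intmax_t v;

    errno = 0;
    v = strtoimax(s, &end, 10);

    if (end == s || *end != '\0')      /* no digits, or trailing junk */
        return fallback;
    if (errno == ERANGE)               /* did not even fit in intmax_t */
        return fallback;
    if (v < INT_MIN || v > INT_MAX)    /* fits intmax_t but not int */
        return fallback;
    return (int)v;
}
```

    The explicit range check is the step that plain assignment to a smaller
    type would skip. Like strtoimax() itself, this sketch still accepts
    leading white space and an optional sign.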

  • From Michael S@21:1/5 to Ben Bacarisse on Sun Jun 23 12:19:52 2024
    On Fri, 21 Jun 2024 18:15:07 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are, unfortunately, not part of the C standard.

    A result of any smaller type can be obtained by calling one of
    these functions and storing the result in an object of the
    smaller type.

    Or check for range and handle out of range values as appropriate by
    situation.

    BTW, I don't know what The Standard says about out-of-range inputs,
    but at least https://en.cppreference.com/w/c/string/byte/strtol
    does not say anything certain. especially about what stored in
    *str_end.

    It says what value should be returned. That's something certain!


    In the case of strtol, yes.
    In the case of strtoul it also says what value should be returned, but
    a plain reading of the cppreference.com text (at least *my* plain
    reading) does not match observed behaviour. The text on cppreference.com
    resembles the Standard text, but does not match it.
    Also, at least to me, the Standard text itself appears very far from
    clear and way too open to interpretation.
    My own interpretation would be that for any negative input strtoul()
    should return ULONG_MAX and set errno to ERANGE. None of the actual
    implementations that I tested behaves in this manner.
    It seems the problem is that what counts as the "range of representable
    values" for an unsigned type is itself open to interpretation.

    IMHO, even if some part of the standard contains text that clearly
    states that "range of representable values for unsigned long =
    [-ULONG_MAX:ULONG_MAX]", it is worth repeating that in the section that
    defines strtol, because it is not at all intuitive.
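
    The interpretation described above (negative input gives ULONG_MAX plus
    ERANGE) is not what strtoul() does, but it can be layered on top of it.
    A rough sketch, with strtoul_no_negative() being an invented name and
    "-0" treated as zero rather than as an error by assumption:

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical wrapper: same interface as strtoul(), but any input that
   denotes a nonzero negative number is reported as out of range. */
unsigned long strtoul_no_negative(const char *s, char **endp, int base)
{
    const char *p = s;
    char *end;
    unsigned long v;

    while (isspace((unsigned char)*p))   /* strtoul() skips this too */
        p++;

    v = strtoul(s, &end, base);
    if (endp != NULL)
        *endp = end;

    /* A conversion happened, the text began with '-', and the wrapped
       result is nonzero: treat it as a range error. */
    if (end != s && *p == '-' && v != 0) {
        errno = ERANGE;
        return ULONG_MAX;
    }
    return v;
}
```

    As with strtoul(), a caller would set errno to 0 before the call and
    test it afterwards, since ULONG_MAX is also a legitimate result.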

  • From Ben Bacarisse@21:1/5 to Michael S on Sun Jun 23 12:38:51 2024
    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:15:07 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are, unfortunately, not part of the C standard.

    A result of any smaller type can be obtained by calling one of
    these functions and storing the result in an object of the
    smaller type.

    Or check for range and handle out of range values as appropriate by
    situation.

    BTW, I don't know what The Standard says about out-of-range inputs,
    but at least https://en.cppreference.com/w/c/string/byte/strtol
    does not say anything certain. especially about what stored in
    *str_end.

    It says what value should be returned. That's something certain!


    In case of strtol, yes.
    In case of strtoul it also says what value should be returned, but
    plain reading of cppreference.com text (at least *my* plain reading)
    does not match observed behaviour. The text on cppreference.com
    resembles Standard text, but does not match it.

    Ah. What's the discrepancy you see?

    Also, at least to me, Standard text itself appear very far from clear
    and way too open to interpretations.
    My own interpretation would be that for any negative input strtoul()
    should return ULONG_MAX and set errno to ERANGE. None of the actual implementation that I tested behaves in this manner.

    I don't get that from the text. There is, after all, no "negative
    input". There is a "subject sequence" which, if it starts with a minus
    sign, causes the "value resulting from the conversion is negated (in the
    return type)" which seems clear enough.

    It seems, the problem is of what is considered "range of representable
    values" for unsigned type is by itself open to interpretations.

    IMHO, even if in some part of the standard there exists text that
    clearly states that "range of representable values for unsigned long = [-ULONG_MAX:ULONG_MAX]" it is worth repeating that in the section that defines strtol, because it is at all non-intuitive.

    I don't get what you are saying here. The range of values is [0:ULONG_MAX].

    --
    Ben.

  • From Michael S@21:1/5 to Ben Bacarisse on Sun Jun 23 15:32:19 2024
    On Sun, 23 Jun 2024 12:38:51 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:15:07 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are, unfortunately, not part of the C standard.

    A result of any smaller type can be obtained by calling one of
    these functions and storing the result in an object of the
    smaller type.

    Or check for range and handle out of range values as
    appropriate by situation.

    BTW, I don't know what The Standard says about out-of-range
    inputs, but at least
    https://en.cppreference.com/w/c/string/byte/strtol does not say
    anything certain. especially about what stored in *str_end.

    It says what value should be returned. That's something certain!


    In case of strtol, yes.
    In case of strtoul it also says what value should be returned, but
    plain reading of cppreference.com text (at least *my* plain reading)
    does not match observed behaviour. The text on cppreference.com
    resembles Standard text, but does not match it.

    Ah. What's the discrepancy you see?


    IMHO, the Standard text allows for more interpretations (and
    misinterpretations) than the cppreference.com text.


    Also, at least to me, Standard text itself appear very far from
    clear and way too open to interpretations.
    My own interpretation would be that for any negative input strtoul()
    should return ULONG_MAX and set errno to ERANGE. None of the actual implementation that I tested behaves in this manner.

    I don't get that from the text. There is, after all, no "negative
    input". There is a "subject sequence" which, if it starts with a
    minus sign, causes the "value resulting from the conversion is
    negated (in the return type)" which seems clear enough.


    I find it less than clear.
    The least clear part is that for strtouxx(), as long as the "subject
    sequence" is in range, it is first converted and then negated. However,
    when the "subject sequence" is out of range, it is converted, then
    clipped, and then *not* negated.
    I am not confused in the same way by the non-u variants of strtoxx().

    It seems, the problem is of what is considered "range of
    representable values" for unsigned type is by itself open to interpretations.

    IMHO, even if in some part of the standard there exists text that
    clearly states that "range of representable values for unsigned
    long = [-ULONG_MAX:ULONG_MAX]" it is worth repeating that in the
    section that defines strtol, because it is at all non-intuitive.

    I don't get what you are saying here. The range of values is
    [0:ULONG_MAX].


    That holds as long as you see the sign as something detached from the
    rest of the number. I tend to see them as parts of a whole. Maybe that's
    my mistake.

  • From Ben Bacarisse@21:1/5 to Michael S on Sun Jun 23 16:30:13 2024
    Michael S <[email protected]> writes:

    On Sun, 23 Jun 2024 12:38:51 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:15:07 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:

    On Fri, 21 Jun 2024 18:28:39 +0300
    Michael S <[email protected]> wrote:

    On Fri, 21 Jun 2024 13:58:01 -0000 (UTC)
    [email protected] (Kenny McCormack) wrote:

    Yeah, now I get it. You really only need strtoimax() and
    strtoumax().

    Which are, unfortunately, not part of the C standard.

    A result of any smaller type can be obtained by calling one of
    these functions and storing the result in an object of the
    smaller type.

    Or check for range and handle out of range values as
    appropriate by situation.

    BTW, I don't know what The Standard says about out-of-range
    inputs, but at least
    https://en.cppreference.com/w/c/string/byte/strtol does not say
    anything certain. especially about what stored in *str_end.

    It says what value should be returned. That's something certain!


    In case of strtol, yes.
    In case of strtoul it also says what value should be returned, but
    plain reading of cppreference.com text (at least *my* plain reading)
    does not match observed behaviour. The text on cppreference.com
    resembles Standard text, but does not match it.

    Ah. What's the discrepancy you see?

    IMHO, the Standard texts allows for more interpretations (and misinterpretations) than cppreference.com text

    I was hoping for an example. As I've used these functions for decades,
    I find it hard to see where the alternative interpretations might lie.

    Also, at least to me, Standard text itself appear very far from
    clear and way too open to interpretations.
    My own interpretation would be that for any negative input strtoul()
    should return ULONG_MAX and set errno to ERANGE. None of the actual
    implementation that I tested behaves in this manner.

    I don't get that from the text. There is, after all, no "negative
    input". There is a "subject sequence" which, if it starts with a
    minus sign, causes the "value resulting from the conversion is
    negated (in the return type)" which seems clear enough.

    I find it less than clear.
    The most non-clear part is that for strtouxx() as long as "subject
    sequence" is in range,

    I think it helps to be precise here: the subject sequence has to be of
    the right form, not in the right range.

    it is first converted and then negated. However
    when "subject sequence" is out of range it is converted, then clipped
    and then *not* negated.

    If the conversion (before negation) is out of range the result will be
    ULONG_MAX and errno will be set to ERANGE. Calling this "clipping" is
    happens. I am not saying it is crystal clear.

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.
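
    For anyone who wants to see the two cases side by side, a small probe
    along these lines may help. The struct and the function name
    try_strtoul() are invented for this sketch; it only records what
    strtoul() returns and what errno holds afterwards.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical probe type: the converted value plus errno after the call. */
struct ul_result {
    unsigned long value;
    int err;              /* 0 if strtoul() did not report an error */
};

struct ul_result try_strtoul(const char *s)
{
    struct ul_result r;

    errno = 0;                       /* so a stale ERANGE is not misread */
    r.value = strtoul(s, NULL, 10);
    r.err = errno;
    return r;
}
```

    With this, "-1" comes back as ULONG_MAX with err equal to 0 (negation in
    the return type), whereas a digit string too large for unsigned long
    comes back as ULONG_MAX with err equal to ERANGE, with or without a
    leading minus. The value alone cannot distinguish the two; errno can.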

    --
    Ben.

  • From Michael S@21:1/5 to Ben Bacarisse on Sun Jun 23 18:47:10 2024
    On Sun, 23 Jun 2024 16:30:13 +0100
    Ben Bacarisse <[email protected]> wrote:

    Michael S <[email protected]> writes:


    As I've used these functions for
    decades, I find it hard to see where the alternative interpretations
    might lie.


    I have also used them for decades, but until last Thursday I never paid
    attention to what happens when they are fed out-of-range (OOR) inputs.

  • From Richard Kettlewell@21:1/5 to Kenny McCormack on Sun Jun 23 17:39:37 2024
    [email protected] (Kenny McCormack) writes
    Interestingly, I note that strtoul() accepts strings that begin with a
    sign (+ or -). This is odd, since you'd (*) think that a sign
    (particularly, a minus) would be a syntax error in parsing for an
    unsigned value.

    Further, although the (Linux) man page is more than a bit murky on the subject, it seems that the result of parsing, say, "-1", with
    strtoul() is the largest unsigned value (usually, 2**N-1 or a lot of
    F's (in hex)). Whereas, I would expect it to be 1 (i.e., just take
    the absolute value).

    Comments? I find this all very counterintuitive.

    I can think of contexts where the string -1 would be read as meaning 1
    (e.g. GF(2^n)) but I don’t think most people would think they were a
    sensible analogy for strtoul behavior. Its behavior seems consistent with
    the normal meaning of unary minus (i.e. additive inverse) and of course
    with C’s treatment of unsigned integer types.
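
    The consistency with unary minus can be checked directly. In this small
    sketch (same_as_unary_minus() is an invented name), the result of
    converting the signed text is compared with C's own unsigned negation of
    the unsigned conversion:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical check: does strtoul() on the text with a minus sign agree
   with unary minus applied to strtoul() on the text without it? */
bool same_as_unary_minus(const char *with_minus, const char *without_minus)
{
    unsigned long a = strtoul(with_minus, NULL, 10);
    unsigned long b = strtoul(without_minus, NULL, 10);

    /* Unsigned negation wraps modulo ULONG_MAX + 1, so -b is well defined. */
    return a == -b;
}
```

    This is only meaningful when the digits themselves fit in unsigned
    long; once they overflow, both calls return ULONG_MAX and the
    comparison no longer says anything about negation.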

    --
    https://www.greenend.org.uk/rjk/

  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Jun 23 10:58:30 2024
    Ben Bacarisse <[email protected]> writes:

    [range questions for strtol(), etc]

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    I understand what these functions do, but their specification in the
    C standard is a little off. To my way of thinking the impact is
    minimal, but the specified behavior is either unequivocally wrong or
    there are some cases that give rise to undefined behavior.

  • From Scott Lurndal@21:1/5 to Tim Rentsch on Sun Jun 23 21:19:51 2024
    Tim Rentsch <[email protected]> writes:
    Ben Bacarisse <[email protected]> writes:

    [range questions for strtol(), etc]

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    I understand what these functions do, but their specification in the
    C standard is a little off. To my way of thinking the impact is
    minimal, but the specified behavior is either unequivocally wrong or
    there are some cases that give rise to undefined behavior.

    I think you're both overthinking it.

  • From Ben Bacarisse@21:1/5 to Keith Thompson on Mon Jun 24 00:49:13 2024
    Keith Thompson <[email protected]> writes:

    Tim Rentsch <[email protected]> writes:
    Ben Bacarisse <[email protected]> writes:
    [range questions for strtol(), etc]

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    I understand what these functions do, but their specification in the
    C standard is a little off. To my way of thinking the impact is
    minimal, but the specified behavior is either unequivocally wrong or
    there are some cases that give rise to undefined behavior.

    Can you give an example where the specified behavior causes undefined behavior?

    I don't want to pre-empt Tim's answer, but the wording that bothers me
    is

    "If the subject sequence begins with a minus sign, the value resulting
    from the conversion is negated (in the return type)."

    For strtoll("-9223372036854775808", 0, 0) the value resulting from the
    conversion is 9223372036854775808 which can not even be represented in
    the return type, so how can it be negated "in the return type"?
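
    Whatever the wording says, what implementations do with this input is
    easy to probe. The sketch below assumes a 64-bit long long (so that
    LLONG_MIN is -9223372036854775808); try_strtoll() and its struct are
    invented names.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical probe type: the converted value plus errno after the call. */
struct ll_result {
    long long value;
    int err;              /* 0 if strtoll() did not report an error */
};

struct ll_result try_strtoll(const char *s)
{
    struct ll_result r;

    errno = 0;
    r.value = strtoll(s, NULL, 10);
    r.err = errno;
    return r;
}
```

    Under that assumption, "-9223372036854775808" is accepted as LLONG_MIN
    with no error, while "-9223372036854775809" also yields LLONG_MIN but
    with err equal to ERANGE. So in practice the range test is applied to
    the signed result, not to the unsigned magnitude on its own.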

    --
    Ben.

  • From Lawrence D'Oliveiro@21:1/5 to Ben Bacarisse on Mon Jun 24 00:48:12 2024
    On Sun, 23 Jun 2024 16:30:13 +0100, Ben Bacarisse wrote:

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    In the C23 spec, section 7.24.1.7, “The strtol, strtoll, strtoul, and
    strtoull functions”, paragraph 5 begins:

    If the subject sequence has the expected form and the value of
    base is zero, the sequence of characters starting with the first
    digit is interpreted as an integer constant according to the rules
    of 6.4.4.2.

    Note this is excluding any sign. So if the non-negated value cannot be
    represented in the desired type, then there is no valid value to apply
    negation to, so according to paragraph 8, zero is returned.

  • From Kaz Kylheku@21:1/5 to Ben Bacarisse on Mon Jun 24 02:29:19 2024
    On 2024-06-23, Ben Bacarisse <[email protected]> wrote:
    I don't want to pre-empt Tim's answer, but the wording that bothers me
    is

    "If the subject sequence begins with a minus sign, the value
    resulting from the conversion is negated (in the return type)."

    For strtoll("-9223372036854775808", 0, 0) the value resulting from the conversion is 9223372036854775808 which can not even be represented in
    the return type, so how can it be negated "in the return type"?

    We have to trust that the specification wants the functions to perform
    error checking, rather than precipitate into undefined behavior or
    implementation-defined results.

    If the negation, which is a positive value, cannot be represented in the
    type, that implies it is out of range. The required behavior for a
    positive out-of-range value is to return LLONG_MAX and set errno to
    ERANGE.

    The "in the return type" wording sounds like it may be written that way
    to cover the unsigned case, strtoull.

    I see in the N3220 draft that the signed and unsigned functions are
    lumped together and the wording is now:

    "If the subject sequence begins with a minus sign, the resulting value
    is the negative of the converted value; for functions whose return type
    is an unsigned integer type this action is performed in the return
    type."


    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Kaz Kylheku@21:1/5 to Kaz Kylheku on Mon Jun 24 02:31:11 2024
    On 2024-06-24, Kaz Kylheku <[email protected]> wrote:
    If the negation, which is a positive value, cannot be represented in the
    type, that implies it is out of range. The required behavior for a
    positive out-of-range value is to return LLONG_MAX and set errno to
    ERANGE.

    Errr, what am I saying! The negation, which is a negative value,
    cannot be represented in the type, so the required behavior is to
    return LLONG_MIN and set errno to negative.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Tim Rentsch@21:1/5 to Scott Lurndal on Sun Jun 23 22:28:37 2024
    [email protected] (Scott Lurndal) writes:

    Tim Rentsch <[email protected]> writes:

    Ben Bacarisse <[email protected]> writes:

    [range questions for strtol(), etc]

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    I understand what these functions do, but their specification in the
    C standard is a little off. To my way of thinking the impact is
    minimal, but the specified behavior is either unequivocally wrong or
    there are some cases that give rise to undefined behavior.

    I think you're both overthinking it.

    You aren't saying anything. Do you have something to
    say that actually has positive information content?

  • From Tim Rentsch@21:1/5 to Keith Thompson on Sun Jun 23 22:30:35 2024
    Keith Thompson <[email protected]> writes:

    Tim Rentsch <[email protected]> writes:

    Ben Bacarisse <[email protected]> writes:
    [range questions for strtol(), etc]

    I think there /is/ something problematic with the wording about the
    negation. It happens "in the return type" but how can
    9223372036854775808 be negated in the type long long int? OK, the
    negated value can be /represented/ in the type long long int but that's
    not quite the same thing. On the other hand, for the unsigned return
    types, the negation "in the return type" is what produces ULONG_MAX for
    "-1" when the negated value, -1, can't be /represented/ in the return
    type. It's a case where, over the years, I've just got used to what's
    happening.

    I understand what these functions do, but their specification in the
    C standard is a little off. To my way of thinking the impact is
    minimal, but the specified behavior is either unequivocally wrong or
    there are some cases that give rise to undefined behavior.

    Can you give an example where the specified behavior causes undefined behavior?

    Ben gave a good answer. (My thanks to Ben for both the
    content and the style of his answer.)

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Mon Jun 24 06:05:33 2024
    On 2024-06-24, Keith Thompson <[email protected]> wrote:
    Kaz Kylheku <[email protected]> writes:
    On 2024-06-24, Kaz Kylheku <[email protected]> wrote:
    If the negation, which is a positive value, cannot be represented in the
    type, that implies it is out of range. The required behavior for a
    positive out-of-range value is to return LLONG_MAX and set errno to
    ERANGE.

    Errr, what am I saying! The negation, which is a negative value,
    cannot be represented in the type, so the required behavior is to
    return LLONG_MIN and set errno to negative.

    You mean "and set errno to ERANGE".

    Once you screw up and start correcting yourself, there is no end
    to the long tail of erors.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Michael S@21:1/5 to Keith Thompson on Mon Jun 24 13:19:41 2024
    On Sun, 23 Jun 2024 20:11:09 -0700
    Keith Thompson <[email protected]> wrote:


    There's still some ambiguity for strtoull("-9999999999999999999",
    NULL, 10) (that's well outside the range of a 64-bit integer). For
    that to work as expected, we have to assume that the determination
    that "the correct value is outside the range of representable values"
    happens *before* the negation "is performed in the return type".
    It's not clear that this problem is worth fixing (doing so would
    likely make that section longer and perhaps more confusing).


    There is nothing wrong with longer sections.
    Personally I would prefer for each strtoxxx() function to have
    its own description fully independent of all others. It would make
    each of them easier to follow.
    DRY is a good principle for programming, not necessarily for writing
    Standards.
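
    To make the ordering assumption quoted above concrete, here is a toy
    model of the unsigned case. It is decimal only, skips no white space and
    has no end pointer, and model_strtoull() is an invented name; it is a
    sketch of the "range check first, negate second" reading, not a claim
    about how any C library implements strtoull().

```c
#include <errno.h>
#include <limits.h>

/* Toy model (hypothetical): range-check the magnitude, then negate in
   the unsigned return type. */
unsigned long long model_strtoull(const char *s)
{
    int negative = 0;
    int overflow = 0;
    unsigned long long acc = 0;

    if (*s == '+' || *s == '-')
        negative = (*s++ == '-');

    for (; *s >= '0' && *s <= '9'; s++) {
        unsigned digit = (unsigned)(*s - '0');

        /* Would acc * 10 + digit exceed ULLONG_MAX? */
        if (acc > (ULLONG_MAX - digit) / 10)
            overflow = 1;
        else
            acc = acc * 10 + digit;
    }

    /* Step 1: the range decision is made on the magnitude alone. */
    if (overflow) {
        errno = ERANGE;
        return ULLONG_MAX;
    }

    /* Step 2: negation in the return type, which wraps for unsigned. */
    return negative ? -acc : acc;
}
```

    One detail worth knowing when trying this: 9999999999999999999 (nineteen
    nines) is still below 2**64, so with a 64-bit unsigned long long it takes
    one more digit before step 1 reports ERANGE.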
