• Whaddaya think?

    From DFS@21:1/5 to All on Sat Jun 15 15:36:22 2024
    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
    294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
    108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
    193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
    15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array


    Any issues with this method?

    Any 'better' way?

    Thanks


    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

        int N=0, i=0, j=0;
        int *nums;

        FILE* datafile = fopen(argv[1], "r");
        while(fscanf(datafile, "%d", &j) != EOF){
            N++;
        }

        nums = calloc(N, sizeof(int));
        rewind(datafile);
        while(fscanf(datafile, "%d", &j) != EOF){
            nums[i++] = j;
        }
        fclose (datafile);
        printf("\n");

        for(i=0;i<N;i++) {
            printf("%d. %d\n", i+1, nums[i]);
        }
        printf("\n");
        free(nums);
        return(0);

    }
    ----------------------------------------------------------

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Ben Bacarisse@21:1/5 to DFS on Sat Jun 15 23:03:10 2024
    DFS <[email protected]> writes:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
    173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array

    Any issues with this method?

    There are two issues: (1) you end up with a program that can't be
    "piped" to (because the input can't be rewound), and (2) the file might
    change between counting and reading. How much either matters will
    depend on the context. I like piping to programs so (1) would bother
    me.

    Any 'better' way?

    I'd allocate the array on the fly. It's one of those things that, once
    you've done it, becomes a stock bit of coding. In fact, you can write a
    simple dynamic array module, and use it again and again.
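A minimal sketch of what allocating the array on the fly could look like, assuming a single pass over any stream (a file or a pipe). The name `read_ints`, the doubling growth policy, and the starting capacity of 16 are illustrative choices, not something taken from the post.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read whitespace-separated ints from `in` into a
   growing array. Returns the array (caller frees) and stores the count
   in *count. Returns NULL on allocation failure or if no number was read. */
int *read_ints(FILE *in, size_t *count)
{
    int *nums = NULL;
    size_t used = 0, cap = 0;
    int value;

    *count = 0;
    /* Loop while fscanf succeeds, i.e. converts exactly one item. */
    while (fscanf(in, "%d", &value) == 1) {
        if (used == cap) {
            size_t new_cap = cap ? cap * 2 : 16;
            int *tmp = realloc(nums, new_cap * sizeof *tmp);
            if (tmp == NULL) {      /* the old block is still valid here */
                free(nums);
                return NULL;
            }
            nums = tmp;
            cap = new_cap;
        }
        nums[used++] = value;
    }
    *count = used;
    return nums;
}
```

Because the input is read only once, the same function works on `stdin`, so the program can be piped to.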
    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0, i=0, j=0;
    int *nums;

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &j) != EOF){

    It's always better to loop while fscanf succeeds rather than trying to
    handle all the errors. You might not care about the case where this loop
    fails, but it's just better to get into the right habit:

    while (fscanf(datafile, "%d", &j) == 1) ...

    nums = calloc(N, sizeof(int));

    The cost is low, but there's no need to use calloc here as you are going
    to assign exactly N locations.

    rewind(datafile);
    while(fscanf(datafile, "%d", &j) != EOF){
    nums[i++] = j;
    }

    As above, though I'd read into &nums[i] directly.

    fclose (datafile);
    printf("\n");

    for(i=0;i<N;i++) {
    printf("%d. %d\n", i+1, nums[i]);
    }
    printf("\n");
    free(nums);
    return(0);

    Because I have acquired the habit, I'd also check for errors,
    particularly on argc, fopen and malloc.


    }
    ----------------------------------------------------------
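The argc and fopen checks mentioned above might be sketched as follows. The helper name `open_input` and the fallback to `stdin` when no file is named are assumptions made for illustration; the fallback is one way to keep the program usable in a pipeline.

```c
#include <stdio.h>

/* Hypothetical helper: choose the input stream for the program.
   With no file argument it falls back to stdin (an assumption of this
   sketch), so the program can be piped to. Returns NULL after printing
   a message if the arguments are wrong or the file cannot be opened. */
FILE *open_input(int argc, char *argv[])
{
    if (argc < 2)
        return stdin;
    if (argc > 2) {
        fprintf(stderr, "usage: %s [file]\n", argv[0]);
        return NULL;
    }
    FILE *f = fopen(argv[1], "r");
    if (f == NULL)
        perror(argv[1]);          /* report why the open failed */
    return f;
}
```

A caller would test the result against NULL before reading, and check every malloc() or realloc() result in the same way.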

    --
    Ben.

  • From Michael S@21:1/5 to DFS on Sun Jun 16 01:56:49 2024
    On Sat, 15 Jun 2024 15:36:22 -0400
    DFS <[email protected]> wrote:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118
    245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144
    245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195
    32 4 54 79 193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78
    55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164
    195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array


    Any issues with this method?

    Any 'better' way?

    Thanks


    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0, i=0, j=0;
    int *nums;

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &j) != EOF){
    N++;
    }

    nums = calloc(N, sizeof(int));
    rewind(datafile);
    while(fscanf(datafile, "%d", &j) != EOF){
    nums[i++] = j;
    }
    fclose (datafile);
    printf("\n");

    for(i=0;i<N;i++) {
    printf("%d. %d\n", i+1, nums[i]);
    }
    printf("\n");
    free(nums);
    return(0);

    }
    ----------------------------------------------------------




    If you want to preserve your sanity, never use fscanf().

  • From bart@21:1/5 to Ben Bacarisse on Sun Jun 16 00:22:34 2024
    On 15/06/2024 23:03, Ben Bacarisse wrote:
    DFS <[email protected]> writes:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294
    188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152
    197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
    173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232
    259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array

    Any issues with this method?

    There are two issues: (1) you end up with a program that can't be
    "piped" to (because the input can't be rewound), and (2) the file might change between counting and reading.

    It might change even while you're reading it once.

  • From Janis Papanagnou@21:1/5 to Lawrence D'Oliveiro on Sun Jun 16 05:41:12 2024
    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions such
    as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer. So you'd anyway need a stepwise
    input. On the plus side there's maybe a better performance
    to read large buffer chunks and compose them on demand? But
    a problem is the potential cut of the string of a number; it
    requires additional clumsy handling. So it might anyway be
    better (i.e. much more convenient) to use fscanf() ?

    Janis

  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Sun Jun 16 03:26:30 2024
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions such
    as strtol(3).
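A rough sketch of the parsing half of the man page's advice, assuming a line has already been read with fgets(). The helper name `parse_line` and its return convention are hypothetical; strtol() itself is standard C.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse the integers in one text line with strtol().
   Stores up to `max` values in out[] and returns how many were stored,
   or -1 if the line holds a malformed token, a value that does not fit
   in an int, or more than `max` values. */
int parse_line(const char *line, int *out, int max)
{
    int n = 0;
    const char *p = line;

    while (n < max) {
        char *end;
        errno = 0;
        long v = strtol(p, &end, 10);
        if (end == p)                 /* nothing converted: stop scanning */
            break;
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
            return -1;
        out[n++] = (int)v;
        p = end;                      /* continue after the number */
    }
    while (isspace((unsigned char)*p))
        p++;
    return (*p == '\0') ? n : -1;     /* leftovers mean bad input */
}
```

Unlike a bare fscanf("%d") loop, this reports overflow and stray text instead of silently stopping.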

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Sun Jun 16 06:44:27 2024
    On 16.06.2024 06:17, Keith Thompson wrote:

    For the original problem, where the input consists of digits and
    whitespace, you could read a character at a time and accumulate the
    value of each number. (You probably want to handle leading signs as
    well, which isn't difficult.)

    Yes. Been there, done that. Sometimes it's good enough to go back
    to the roots if higher-level functions are imperfect or quirky.

    That is admittedly reinventing the
    wheel, but the existing wheels aren't entirely round. You still
    have to dynamically allocate the array of ints, assuming you need
    to store all of them rather than processing each value as it's read.

    A subclass of tasks can certainly process data on the fly but for
    the general solution there should be a convenient way to handle it.
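The character-at-a-time idea quoted above could be sketched like this. The function name `next_int` is hypothetical, and overflow checking is deliberately left out to keep the sketch short.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read the next integer from `in` one character at
   a time. Returns 1 and stores the value in *out on success, 0 at end of
   input or when the next token does not start a number. Overflow is not
   checked in this sketch. */
int next_int(FILE *in, int *out)
{
    int c, sign = 1, value = 0;

    do {                               /* skip leading whitespace */
        c = getc(in);
    } while (c != EOF && isspace(c));

    if (c == '-' || c == '+') {        /* optional leading sign */
        if (c == '-')
            sign = -1;
        c = getc(in);
    }
    if (c == EOF || !isdigit(c)) {
        if (c != EOF)
            ungetc(c, in);
        return 0;
    }
    while (c != EOF && isdigit(c)) {   /* accumulate decimal digits */
        value = value * 10 + (c - '0');
        c = getc(in);
    }
    if (c != EOF)
        ungetc(c, in);                 /* put back the delimiter */
    *out = sign * value;
    return 1;
}
```

Storing all values still needs a growing array; this only replaces the conversion step.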

    I still prefer higher-level languages that take the burden from me.

    Janis

  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Sun Jun 16 06:51:33 2024
    On 16.06.2024 05:41, Janis Papanagnou wrote:
    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions such
    as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer. So you'd anyway need a stepwise
    input. [...]

    Would it be sensible to have a malloc()'ed buffer used for the first
    fgets() and then subsequent fgets() work on the realloc()'ed part? I
    suppose the previously set data in the malloc area would be retained
    so that there's no re-composition of cut numbers necessary?

    Janis

  • From Lawrence D'Oliveiro@21:1/5 to Keith Thompson on Sun Jun 16 04:41:34 2024
    On Sat, 15 Jun 2024 21:17:57 -0700, Keith Thompson wrote:

    .. but it's defined by POSIX, not by ISO C.

    Dang. There’s another reset of the
    days-since-last-mention-of-POSIX-on-this-list counter.

    Has it ever actually reached 1?

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Sun Jun 16 07:41:38 2024
    On 16.06.2024 07:21, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    On 16.06.2024 05:41, Janis Papanagnou wrote:
    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions such
    as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer. So you'd anyway need a stepwise
    input. [...]

    Would it be sensible to have a malloc()'ed buffer used for the first
    fgets() and then subsequent fgets() work on the realloc()'ed part? I
    suppose the previously set data in the malloc area would be retained
    so that there's no re-composition of cut numbers necessary?

    Sure. "The contents of the new object shall be the same as that of the
    old object prior to deallocation, up to the lesser of the new and old
    sizes."

    Keep in mind that you can't call realloc() on a non-null pointer that
    wasn't allocated by an allocation function.

    Thanks. - I've just tried it with this ad hoc test code

    #include <stdlib.h>
    #include <stdio.h>

    void main (int argc, char * argv[])
    {
    int chunk = 10;
    int bufsize = chunk+1;
    char * buf = malloc(bufsize);
    char * anchor = buf;
    while (fgets(buf, chunk+1, stdin) != NULL)
    if (realloc(anchor, bufsize += chunk) != NULL)
    buf += chunk;
    puts(anchor);
    }

    I wonder whether it can be simplified by making malloc() obsolete
    and using realloc() in a redesigned loop.

    Janis

  • From Michael S@21:1/5 to Janis Papanagnou on Sun Jun 16 10:44:25 2024
    On Sun, 16 Jun 2024 05:41:12 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions
    such as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer.

    Define formats with sensible maximal line length (512 sounds about
    right) and refuse any input that has longer lines.
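A possible sketch of the "fixed maximum, refuse longer lines" policy, assuming the 512-character limit suggested above. The helper name `read_bounded_line` and its return codes are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 512   /* limit suggested in the post above */

/* Hypothetical helper: read one line into buf, which must hold at least
   MAX_LINE + 2 bytes (line, newline, terminator). Returns 1 on success
   with the newline stripped, 0 at end of input, -1 if the line is longer
   than MAX_LINE characters. */
int read_bounded_line(FILE *in, char *buf)
{
    if (fgets(buf, MAX_LINE + 2, in) == NULL)
        return 0;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* complete line that fits */
        return 1;
    }
    /* No newline: fine only for a final line that ended at EOF. */
    return (len <= MAX_LINE) ? 1 : -1;
}
```

A full buffer without a newline is how fgets() signals a truncated line, so no number can be split silently across two reads.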

    So you'd anyway need a stepwise
    input. On the plus side there's maybe a better performance
    to read large buffer chunks and compose them on demand? But
    a problem is the potential cut of the string of a number; it
    requires additional clumsy handling. So it might anyway be
    better (i.e. much more convenient) to use fscanf() ?


    No, the behaviour of fsacnf() is too non-intuitive.

    Janis


  • From Michael S@21:1/5 to Janis Papanagnou on Sun Jun 16 11:11:34 2024
    On Sun, 16 Jun 2024 07:41:38 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 07:21, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    On 16.06.2024 05:41, Janis Papanagnou wrote:
    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page
    <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and
    it is preferable to read entire lines with fgets(3) or
    getline(3) and parse them later with sscanf(3) or more
    specialized functions such as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer. So you'd anyway need a stepwise
    input. [...]

    Would it be sensible to have a malloc()'ed buffer used for the
    first fgets() and then subsequent fgets() work on the realloc()'ed
    part? I suppose the previously set data in the malloc area would
    be retained so that there's no re-composition of cut numbers
    necessary?

    Sure. "The contents of the new object shall be the same as that of
    the old object prior to deallocation, up to the lesser of the new
    and old sizes."

    Keep in mind that you can't call realloc() on a non-null pointer
    that wasn't allocated by an allocation function.

    Thanks. - I've just tried it with this ad hoc test code

    #include <stdlib.h>
    #include <stdio.h>

    void main (int argc, char * argv[])
    {
    int chunk = 10;
    int bufsize = chunk+1;
    char * buf = malloc(bufsize);
    char * anchor = buf;
    while (fgets(buf, chunk+1, stdin) != NULL)
    if (realloc(anchor, bufsize += chunk) != NULL)
    buf += chunk;
    puts(anchor);
    }


    Not sure what this code is supposed to do.
    However it looks unlikely that it does what you meant for it to do.
    I recommend reading the [f*****g] manual.
    https://cplusplus.com/reference/cstdio/fgets/
    https://cplusplus.com/reference/cstdlib/realloc/

    I wonder whether it can be simplified by making malloc() obsolete
    and using realloc() in a redesigned loop.

    Janis


  • From Janis Papanagnou@21:1/5 to Michael S on Sun Jun 16 11:07:22 2024
    On 16.06.2024 10:11, Michael S wrote:

    Not sure what this code is supposed to do.

    Not sure what your comment is supposed to tell me.

    However it looks unlikely that it does what you meant for it to do.
    I recommend reading the [f*****g] manual.
    https://cplusplus.com/reference/cstdio/fgets/
    https://cplusplus.com/reference/cstdlib/realloc/

    I don't need the Web to access man pages. Thanks.

    Janis

  • From Janis Papanagnou@21:1/5 to Michael S on Sun Jun 16 11:13:35 2024
    On 16.06.2024 09:44, Michael S wrote:
    On Sun, 16 Jun 2024 05:41:12 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
    On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:

    If you want to preserve your sanity, never use fscanf().

    Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:

    It is very difficult to use these functions correctly, and it is
    preferable to read entire lines with fgets(3) or getline(3) and
    parse them later with sscanf(3) or more specialized functions
    such as strtol(3).

    This would be also my first impulse, but you'd have to know
    _in advance_ how long the data stream would be; the function
    requires an existing buffer.

    Define formats with sensible maximal line length (512 sounds about
    right) and refuse any input that has longer lines.

    You're not serious, are you? - Or wasn't it clear that it was
    about reading lines (of arbitrary lengths) in one go?


    So you'd anyway need a stepwise
    input. On the plus side there's maybe a better performance
    to read large buffer chunks and compose them on demand? But
    a problem is the potential cut of the string of a number; it
    requires additional clumsy handling. So it might anyway be
    better (i.e. much more convenient) to use fscanf() ?


    No, the behaviour of fsacnf() is too non-intuitive.

    Maybe fsacnf() is non-intuitive.

    Myself, I've never had problems with fscanf(), though. - To each
    his own. :-)

    Janis

  • From Ben Bacarisse@21:1/5 to bart on Sun Jun 16 10:30:43 2024
    bart <[email protected]> writes:

    On 15/06/2024 23:03, Ben Bacarisse wrote:
    DFS <[email protected]> writes:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294
    188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152
    197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
    173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232
    259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array

    Any issues with this method?
    There are two issues: (1) you end up with a program that can't be
    "piped" to (because the input can't be rewound), and (2) the file might
    change between counting and reading.

    It might change even while you're reading it once.

    Your program will see the data it sees -- in that sense the file does
    not change. When there are two (or more) phases to the input, your
    program has to handle some new error conditions that are logically
    avoided by just reading what's available (even if, to some outside
    observer, it's "changing").

    --
    Ben.

  • From Michael S@21:1/5 to Janis Papanagnou on Sun Jun 16 12:38:50 2024
    On Sun, 16 Jun 2024 11:07:22 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 10:11, Michael S wrote:

    Not sure what this code is supposed to do.

    Not sure what your comment is supposed to tell me.


    I hoped that after reading the manual you would know.
    But it obviously didn't work out.
    So, I'd tell a little more:
    1. It does not read one line of arbitrary length
    2. There is more than one mistake
    3. All mistakes seem to be caused by deep misconceptions about fgets()
    and realloc().

    However it looks unlikely that it does what you meant for it to do.
    I recommend reading the [f*****g] manual.
    https://cplusplus.com/reference/cstdio/fgets/
    https://cplusplus.com/reference/cstdlib/realloc/

    I don't need the Web to access man pages. Thanks.

    Janis


    Maybe not the Web in general, but this particular site is more pleasant
    to read than a typical *nix man page.
    Besides, I don't know how it looks on your man page, but if it is
    similar to the one in the link below then, apart from the relevant info,
    it contains a few blah-blah paragraphs.
    https://www.man7.org/linux/man-pages/man3/fgets.3p.html

  • From Janis Papanagnou@21:1/5 to Michael S on Sun Jun 16 12:03:21 2024
    On 16.06.2024 11:38, Michael S wrote:
    On Sun, 16 Jun 2024 11:07:22 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 10:11, Michael S wrote:

    Not sure what this code is supposed to do.

    Not sure what your comment is supposed to tell me.


    I hoped that after you would read the manual you will know.

    The point is that I inspected these manuals before I wrote the
    test code, and after some initial mistakes it ran as expected.

    But it obviously didn't work out.
    So, I'd tell a little more:
    1. It does not read one line of arbitrary length

    No, it does not do that. I wrote the test program as one way
    to circumvent the problem reading data of arbitrary length.
    The relevant text of my previous post was:

    | Would it be sensible to have a malloc()'ed buffer used for the first
    | fgets() and then subsequent fgets() work on the realloc()'ed part? I
    | suppose the previously set data in the malloc area would be retained
    | so that there's no re-composition of cut numbers necessary?

    Its intention was to read chunks of data to construct the
    buffer subsequently, starting from a small buffer instance and
    let it grow as more data (from the single external data line)
    are read. The intention was to avoid having to wastefully
    specify a too large buffer from the beginning, yet not being
    sure it suffices.

    (And just to make sure; "arbitrary" length is of course meant
    to be in the range of available memory, neither exa-byte, nor
    "unlimited" was meant.)

    2. There is more than one mistake

    I'm sure that's possible with an "ad hoc test code". Only your
    comment is meaningless without pointing me to any issue you see.

    3. All mistakes seem to be caused by deep misconceptions about fgets()
    and realloc().

    Again; which ones?

    Janis

    [...]

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Sun Jun 16 11:29:05 2024
    On 16.06.2024 07:49, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:

    void main (int argc, char * argv[])

    *Ahem* -- int main.

    Never sure about whether it was/is correct to 'void'-declare
    the return value and/or the [unused] main() arguments. (I'm
    still from the early C time when types were even omitted as
    function return specification (presuming an implicit int or
    no return), as in the K&R book.) During the past decades I
    tended to declare my intention by writing f(void) instead
    of f() and void f() where no results are delivered. K&R at
    least seems to say that 'void' can only be declared for the
    return type of functions that do not return anything.

    As long as my C compiler doesn't mind 'int main (void)' or
    'void main (int, char **)' I don't care much for test code.
    I'm sure this stance of mine might be considered offensive
    in a 'C' NG. - Apologies! :-)

    Janis

  • From Michael S@21:1/5 to Janis Papanagnou on Sun Jun 16 14:31:43 2024
    On Sun, 16 Jun 2024 12:03:21 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 11:38, Michael S wrote:


    Again; which ones?

    Janis

    [...]



    The main misconceptions are about what is returned by fgets() and by
    realloc().
    fgets() returns its first argument in all cases except EOF or FS read
    error. That includes the case when the buffer is too short to
    accommodate full input line.

    With realloc() there are two issues:
    1. It can sometimes return its first argument, but does not have to. In scenario like yours it will return a different pointer quite soon.
    2. When realloc() returns NULL, it does not de-allocate its first
    argument.

    The second case, of course, is not important in practice, because in
    practice you're very unlikely to see realloc() returning NULL, and if
    nevertheless it did happen, your program is unlikely to survive and give
    a meaningful result anyway. Still, here on c.l.c we like to pretend that
    we can meaningfully handle allocation failures.
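To make the two realloc() points concrete, here is one way the chunked fgets() idea could be written, offered as a sketch rather than as a correction endorsed by anyone in the thread. The name `read_long_line` is hypothetical. It starts from a null pointer, which realloc() treats like malloc(), always continues with the pointer realloc() returns, and frees the old block itself when realloc() fails.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from `in`.
   Returns a heap-allocated string without the trailing newline (caller
   frees), or NULL at end of input or on allocation failure. */
char *read_long_line(FILE *in)
{
    enum { CHUNK = 10 };           /* small on purpose, as in the test code */
    char *buf = NULL;              /* realloc(NULL, n) acts like malloc(n) */
    size_t size = 0, len = 0;

    for (;;) {
        char *tmp = realloc(buf, size + CHUNK);
        if (tmp == NULL) {         /* a failed realloc does not free buf */
            free(buf);
            return NULL;
        }
        buf = tmp;                 /* the block may have moved */
        size += CHUNK;

        if (fgets(buf + len, (int)(size - len), in) == NULL) {
            if (len == 0) {        /* nothing read at all */
                free(buf);
                return NULL;
            }
            buf[len] = '\0';       /* keep what was read before EOF */
            break;
        }
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[--len] = '\0';     /* complete line: strip the newline */
            break;
        }
    }
    return buf;
}
```

Each fgets() call writes at offset `len` inside the current block, so a number cut at a chunk boundary is rejoined without extra handling.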

  • From DFS@21:1/5 to Janis Papanagnou on Sun Jun 16 11:09:01 2024
    On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
    On 16.06.2024 06:17, Keith Thompson wrote:

    For the original problem, where the input consists of digits and
    whitespace, you could read a character at a time and accumulate the
    value of each number. (You probably want to handle leading signs as
    well, which isn't difficult.)

    Yes. Been there, done that. Sometimes it's good enough to go back
    to the roots if higher-level functions are imperfect or quirky.

    That is admittedly reinventing the
    wheel, but the existing wheels aren't entirely round. You still
    have to dynamically allocate the array of ints, assuming you need
    to store all of them rather than processing each value as it's read.

    A subclass of tasks can certainly process data on the fly but for
    the general solution there should be a convenient way to handle it.

    I still prefer higher-level languages that take the burden from me.

    nums = []
    with open('data.txt','r') as f:
        for nbr in f.read().split():
            nums.append(int(nbr))
    print(*sorted(nums))

  • From Janis Papanagnou@21:1/5 to Malcolm McLean on Sun Jun 16 17:13:29 2024
    On 16.06.2024 17:04, Malcolm McLean wrote:

    And is a mapping of every input to the empty set a "function" or not? I
    think it is but mathematicians might weigh in on that.

    I'm well aware that there's a semantical distinction of "procedures"
    and "functions". And there's differences how languages consider these.
    In "C" I've learned that basically "everything is a function", even
    if they don't return anything, or if they do but the result is ignored.
    In other languages these two types are called 'procedure' and '<type> procedure'. It might indeed be a source for religious disputes as so
    many things in IT/CS and elsewhere. I think it's not worth the time.

    Janis

  • From DFS@21:1/5 to Michael S on Sun Jun 16 11:03:15 2024
    On 6/15/2024 6:56 PM, Michael S wrote:
    On Sat, 15 Jun 2024 15:36:22 -0400

    If you want to preserve your sanity, never use fscanf().


    ha!

    You misspelled C.

  • From Janis Papanagnou@21:1/5 to Michael S on Sun Jun 16 17:37:34 2024
    On 16.06.2024 13:31, Michael S wrote:
    On Sun, 16 Jun 2024 12:03:21 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 11:38, Michael S wrote:


    Again; which ones?

    Janis

    [...]



    The main misconceptions are about what is returned by fgets() and by realloc().
    fgets() returns its first argument in all cases except EOF or FS read
    error. That includes the case when the buffer is too short to
    accommodate full input line.

    fgets() returns s on success, and NULL on error or when end
    of file occurs while no characters have been read.

    I am interested in the success case, thus "fgets() != NULL".
    The buffer size is controlled by the second function parameter.


    With realloc() there are two issues:
    1. It can sometimes return its first argument, but does not have to. In scenario like yours it will return a different pointer quite soon.
    2. When realloc() returns NULL, it does not de-allocate its first
    argument.

    The second case, of course, is not important in practice, because in
    practice you're very unlikely to see realloc() returning NULL, and if
    nevertheless it did happen, your program is unlikely to survive and give
    a meaningful result anyway. Still, here on c.l.c we like to pretend that
    we can meaningfully handle allocation failures.

    You may provide "correct" code (if you think mine is wrong). Or just
    inspect how it behaves; here's the output after two printf's added:

    $ printf "123 456 789 101112 77 88 99 101 999" | realloc
    123 456 78<+++
    123 456 78<===
    9 101112 7<+++
    123 456 789 101112 7<===
    7 88 99 10<+++
    123 456 789 101112 77 88 99 10<===
    1 999<+++
    123 456 789 101112 77 88 99 101 999<===
    123 456 789 101112 77 88 99 101 999

    The +++data+++ is the chunk read, and the ===data=== is the overall
    buffer content, and the final line again the result (as before, with
    the added newline as documented for puts()).

    Even though my code is just an "ad hoc test code" to demonstrate the
    procedure I outlined - and as such test code certainly lacking quite
    some error handling and much more - it does exactly what I intended it
    to do, and what I've implemented according to what I read in the man
    pages. I cannot see any "misconception", it does what was _intended_.

    There's indeed one point that I _deliberately_ ignored for the test
    code; actually the point you mentioned as "not important in practice".

    Again: You may provide "correct" code (if you think mine is "wrong"),
    or "better" code, usable for production instead of a test code.

    But your tone and statements were (as observed so often) inadequate;
    I quote from your post that started the subthread:
    "However it looks unlikely that it does what you meant for it to do."

    It does exactly what I meant to do (as you can see in the logs above).

    Janis

  • From Lew Pitcher@21:1/5 to DFS on Sun Jun 16 15:52:16 2024
    On Sat, 15 Jun 2024 15:36:22 -0400, DFS wrote:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
    294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
    108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
    193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
    15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array


    Any issues with this method?

    Others have (and will continue to) address this question.

    Any 'better' way?

    Not so much "better", as "other".

    ISTM that you waste an opportunity (and expose a common 'blind-spot')
    in your first loop:

    Here,
    while(fscanf(datafile, "%d", &j) != EOF){
    N++;
    }
    you discard a lot of work (done for you by fscanf() to determine the
    value of each input number) just to be able to count the number of
    numbers in your input. What if there were a way to put this (to you)
    byproduct of fscanf() to use, and avoid using fscanf() entirely in
    the second pass?

    You /could/ create a temporary, binary, file, and write the fscanf()'ed
    values to it as part of the first loop. Once the first loop completes,
    you rewind this temporary file, and load your integer array by reading
    the (now converted to native integer format) values from that file.

    Still two passes, but using fscanf() in only one of those passes.

    (BTW, the 'blind-spot' I mentioned is that we often forget that
    we /can/ use temporary files to store intermediary results. Sometimes
    we can manipulate a temporary file easier than we can manipulate
    malloc()ed (or other) storage. )
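
    A minimal sketch of the temporary-file approach described above,
    assuming the standard tmpfile() is usable on the target system. The
    function name load_ints_via_tmpfile is made up for illustration. The
    text input is parsed once; the second pass is a single fread() of
    native ints.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads whitespace-separated ints from `in`.
   Returns a malloc'ed array and stores the element count in *count.
   Returns NULL on failure or when no numbers were read. */
int *load_ints_via_tmpfile(FILE *in, size_t *count)
{
    FILE *tmp = tmpfile();          /* binary scratch file, removed on close */
    int value, *nums = NULL;
    size_t n = 0;

    *count = 0;
    if (tmp == NULL)
        return NULL;

    /* Pass 1: the only text parsing; store each value in native form. */
    while (fscanf(in, "%d", &value) == 1) {
        if (fwrite(&value, sizeof value, 1, tmp) != 1) {
            fclose(tmp);
            return NULL;
        }
        n++;
    }

    /* Pass 2: one allocation, one bulk read from the scratch file. */
    rewind(tmp);
    if (n > 0 && (nums = malloc(n * sizeof *nums)) != NULL) {
        if (fread(nums, sizeof *nums, n, tmp) != n) {
            free(nums);
            nums = NULL;
        } else {
            *count = n;
        }
    }
    fclose(tmp);
    return nums;
}
```

    Because only the scratch file is rewound, this variant should also
    work when the input itself is a pipe.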
    --
    Lew Pitcher
    "In Skills We Trust"

  • From David Brown@21:1/5 to DFS on Sun Jun 16 17:56:01 2024
    On 16/06/2024 17:09, DFS wrote:
    On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
    On 16.06.2024 06:17, Keith Thompson wrote:

    For the original problem, where the input consists of digits and
    whitespace, you could read a character at a time and accumulate the
    value of each number.  (You probably want to handle leading signs as
    well, which isn't difficult.)

    Yes. Been there, done that. Sometimes it's good enough to go back
    to the roots if higher-level functions are imperfect or quirky.

    That is admittedly reinventing the
    wheel, but the existing wheels aren't entirely round.  You still
    have to dynamically allocate the array of ints, assuming you need
    to store all of them rather than processing each value as it's read.
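
    For illustration, the character-at-a-time approach might look roughly
    like the sketch below. The name next_int and its return convention are
    assumptions, and the range check assumes long long is wider than int.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: reads the next decimal integer from `in` one
   character at a time. Returns 1 and stores the value in *out on success,
   0 on malformed or out-of-range input, and EOF at end of input. */
int next_int(FILE *in, int *out)
{
    int c, negative = 0;
    long long acc = 0;   /* wider accumulator keeps the range check simple */

    do {
        c = getc(in);
    } while (c != EOF && isspace(c));
    if (c == EOF)
        return EOF;

    if (c == '+' || c == '-') {
        negative = (c == '-');
        c = getc(in);
    }
    if (c == EOF || !isdigit(c))
        return 0;        /* sign with no digits, or a non-numeric character */

    while (c != EOF && isdigit(c)) {
        acc = acc * 10 + (c - '0');
        if (acc > (long long)INT_MAX + 1)
            return 0;    /* too large for int with either sign */
        c = getc(in);
    }
    if (c != EOF)
        ungetc(c, in);   /* leave the delimiter for the next call */

    if (negative)
        acc = -acc;
    if (acc > INT_MAX || acc < INT_MIN)
        return 0;
    *out = (int)acc;
    return 1;
}
```

    On a 0 return the offending characters have been partly consumed, so a
    caller would have to decide whether to skip ahead or give up.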

    A subclass of tasks can certainly process data on the fly but for
    the general solution there should be a convenient way to handle it.

    I still prefer higher-level languages that take the burden from me.

    nums = []
    with open('data.txt','r') as f:
        for nbr in f.read().split():
            nums.append(int(nbr))
        print(*sorted(nums))


    nums = sorted(map(int, open('data.txt', 'r').read().split()))

    But you'll learn more doing it with C :-) And it's nice to see someone
    starting on-topic threads here.

  • From DFS@21:1/5 to Ben Bacarisse on Sun Jun 16 11:52:30 2024
    On 6/15/2024 6:03 PM, Ben Bacarisse wrote:
    DFS <[email protected]> writes:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
    294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
    108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
    193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
    15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array

    Any issues with this method?

    There are two issues: (1) you end up with a program that can't be
    "piped" to (because the input can't be rewound), and (2) the file might
    change between counting and reading. How much either matters will
    depend on the context. I like piping to programs so (1) would bother
    me.

    Any 'better' way?

    I'd allocate the array on the fly. It's one of those things that, once
    you've done it, becomes a stock bit of coding. In fact, you can write a
    simple dynamic array module, and use it again and again.

    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0, i=0, j=0;
    int *nums;

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &j) != EOF){

    It's always better to loop while fscanf succeeds rather than trying to
    handle all the errors. You might not care about the case where this
    loop fails, but it's just better to get into the right habit:

    while (fscanf(datafile, "%d", &j) == 1) ...

    nums = calloc(N, sizeof(int));

    The cost is low, but there's no need to use calloc here as you are going
    to assign exactly N locations.

    rewind(datafile);
    while(fscanf(datafile, "%d", &j) != EOF){
    nums[i++] = j;
    }

    As above, though I'd read into &nums[i] directly.

    fclose (datafile);
    printf("\n");

    for(i=0;i<N;i++) {
    printf("%d. %d\n", i+1, nums[i]);
    }
    printf("\n");
    free(nums);
    return(0);

    Because I have acquired the habit, I'd also check for errors,
    particularly on argc, fopen and malloc.


    }
    ----------------------------------------------------------

    Thanks for the tips.

    I'm not into error checking on my personal code. But I am into brief
    and efficient.

    New effort
    * dropped 2 variables
    * allocate 'on the fly'
    * one fscanf thru the file
    * 4 less lines of code

    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0;
    int *nums = malloc(2 * sizeof(int));

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &nums[N++]) == 1){
    nums = realloc(nums, (N+1) * sizeof(int));
    }
    fclose (datafile);

    N--;
    for(int i=0;i<N;i++) {
    printf("%d.%d ", i+1, nums[i]);
    }
    free(nums);

    printf("\n");
    return 0;

    }
    ----------------------------------------------------------




    original 19 lines not incl close brackets
    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

        int N=0, i=0, j=0;
        int *nums;

        FILE* datafile = fopen(argv[1], "r");
        while(fscanf(datafile, "%d", &j) != EOF){
            N++;
        }

        nums = calloc(N, sizeof(int));
        rewind(datafile);
        while(fscanf(datafile, "%d", &j) != EOF){
            nums[i++] = j;
        }
        fclose (datafile);
        printf("\n");

        for(i=0;i<N;i++) {
            printf("%d. %d\n", i+1, nums[i]);
        }
        printf("\n");
        free(nums);
        return(0);

    }
    ----------------------------------------------------------

  • From DFS@21:1/5 to Keith Thompson on Sun Jun 16 12:20:07 2024
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118
    245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144
    245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195
    32 4 54 79 193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55
    259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195
    7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array


    Any issues with this method?

    Any 'better' way?

    Thanks

    In a quick test, your code compiles without errors and runs correctly
    with your input. I do get a warning about argc being unused, which you
    should address.

    -Wall doesn't warn about that, but -Wall -Wextra does.

    In the bigger program of which this is a part, argc IS used.


    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0, i=0, j=0;

    The usual convention is to use all-caps for macro names. Calling your
    variable N is not a real problem, but could be slightly confusing.

    N is the number of integers in the input. i is an index. j is a value
    read from the file. That's not at all clear from the names.

    I suggest using longer and more descriptive names in lower case.
    "N" could be "count". "i" is fine for an index, but "j" could be
    "value".


    N is used in statistics, and this is a stats program.



    Consider using size_t rather than int for the count and index. That's
    mostly a style point; it's not going to make any practical difference
    unless you have at least INT_MAX elements.

    int *nums;

    FILE* datafile = fopen(argv[1], "r");

    Undefined behavior if no argument was provided, i.e., argc < 2.

    while(fscanf(datafile, "%d", &j) != EOF){

    Numeric input with the *scanf functions has undefined behavior if the
    scanned value is outside the range of the target type. For example, if
    the input contains "99999999999999999999999999999999999999999999999999",
    arbitrary bad things could happen. (Most likely it will just store some
    incorrect value in j, with no indication that there was an error.)

    strtol is trickier to use, but you can detect errors.
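
    A minimal sketch of what error detection with strtol can look like.
    The helper name parse_int and its return convention are assumptions
    made for illustration; the errno/ERANGE and end-pointer checks are the
    standard strtol idiom.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts the text at `s` to an int. On success
   returns 1, stores the value in *out and sets *end just past the number.
   Returns 0 if no digits were found or the value does not fit in an int. */
int parse_int(const char *s, int *out, char **end)
{
    long v;

    errno = 0;
    v = strtol(s, end, 10);
    if (*end == s)
        return 0;                       /* no conversion was performed */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* out of range for long or for int */
    *out = (int)v;
    return 1;
}
```

    A caller would typically read a line with fgets() and call parse_int
    repeatedly, passing the previous *end as the next starting point.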

    fscanf returns EOF on reaching the end of the file or on a read error,
    and that's the only condition you check. Otherwise it returns the
    number of items scanned. If the input doesn't contain a string that can
    be interpreted as an integer, fscanf will return 0, and you'll be stuck
    in an infinite loop. `while (fscanf(...) == 1)` is more robust, but it
    doesn't distinguish between a read error and bad data. It's up to you
    how and whether to distinguish among different kinds of errors.
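
    One possible way to tell the cases apart after the loop ends, sketched
    with ferror() and feof(). The names count_ints and enum scan_end are
    hypothetical. Edge cases such as a lone "-" right before end of file
    may be reported as end of file rather than bad data.

```c
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum scan_end { SCAN_EOF, SCAN_READ_ERROR, SCAN_BAD_DATA };

/* Counts the ints in `in` and reports why the loop stopped. */
enum scan_end count_ints(FILE *in, size_t *count)
{
    int value;

    *count = 0;
    while (fscanf(in, "%d", &value) == 1)
        (*count)++;

    if (ferror(in))
        return SCAN_READ_ERROR;   /* the stream reported an I/O error */
    if (feof(in))
        return SCAN_EOF;          /* ran out of input normally */
    return SCAN_BAD_DATA;         /* fscanf returned 0: not a number */
}
```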

    Your sample input consists of decimal integers with no sign. Decide
    whether you want to handle "-123" or "+123". (fscanf will do so; so
    will strtol.)

    A change I might make down the road is to process positive floats. For
    now it's just positive ints.


    N++;
    }

    nums = calloc(N, sizeof(int));

    Consider using `sizeof *nums` rather than `sizeof(int)`. That way you
    don't have to change the type in two places if the element type changes.

    You'll be updating all the elements of the nums array, so there's not
    much point in zeroing it. If you use malloc:

    nums = malloc(N * sizeof *nums);

    Whether you use calloc() or malloc(), you should check the return
    value. If it returns a null pointer, it means the allocation failed.
    Aborting the program is probably a good way to handle it.

    I usually don't do error checking on my personal code.


    (There are complications on Linux-based systems which I won't get into
    here. Google "OOM killer" and "overcommit" for details.)

    rewind(datafile);

    This can fail if the input file is not seekable. For example, on a
    Linux-based system you could do something like:
    ./your_program /dev/stdin < file
    Perhaps that's an acceptable restriction, but be aware of it.

    while(fscanf(datafile, "%d", &j) != EOF){

    Again, UB for out of range values.

    It's not guaranteed that you'll get the same data the second time you
    read the file; some other process could modify it. This might not be
    worth worrying about.

    I updated the code to do one fscanf() thru the file.

    I looked for an easy way to lock it while reading, but as I understand
    flock() it only places an 'advisory lock' on the file, and other
    processes are still free to modify it.


    nums[i++] = j;
    }
    fclose (datafile);
    printf("\n");

    You haven't produced any output yet; why print a blank line? (Of course
    you can if you want to.)

    for(i=0;i<N;i++) {
    printf("%d. %d\n", i+1, nums[i]);
    }
    printf("\n");
    free(nums);
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses make it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.


    }

    A method that doesn't require rescanning the input file is to initially
    allocate some reasonable amount of memory, then use realloc() to
    expand the array as needed. Doubling the array size is probably
    reasonable. It will consume more memory than a single allocation.
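
    A small sketch of the doubling strategy, written as a reusable
    growable array. The struct int_vec, the function int_vec_push and the
    starting capacity of 16 are all assumptions chosen for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable int array; zero-initialise before first use. */
struct int_vec {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Appends `value`, doubling the capacity when full. Returns 1 on success,
   0 if memory could not be obtained (the existing contents stay valid). */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 16;
        int *tmp;

        if (new_cap > SIZE_MAX / sizeof *v->data)
            return 0;                    /* byte count would overflow */
        tmp = realloc(v->data, new_cap * sizeof *v->data);
        if (tmp == NULL)
            return 0;                    /* v->data is still allocated */
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 1;
}
```

    A reading loop could then be written along the lines of
    `while (fscanf(f, "%d", &x) == 1 && int_vec_push(&v, x))`, with a
    final free(v.data) once the values are no longer needed.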

    Done in a way, as you'll see below.


    Thanks for the thorough analysis and good tips.


    Updated
    * dropped 2 variable declarations
    * allocate 'on the fly'
    * one fscanf thru the file
    * 4 less lines of code (not incl brackets)

    ----------------------------------------------------------
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

        int N=0;
        int *nums = malloc(2 * sizeof(int));

        FILE* datafile = fopen(argv[1], "r");
        while(fscanf(datafile, "%d", &nums[N++]) == 1){
            nums = realloc(nums, (N+1) * sizeof(int));
        }
        fclose (datafile);

        N--;
        for(int i=0;i<N;i++) {
            printf("%d.%d ", i+1, nums[i]);
        }
        free(nums);

        printf("\n");
        return 0;

    }
    ----------------------------------------------------------

  • From bart@21:1/5 to David Brown on Sun Jun 16 18:14:55 2024
    On 16/06/2024 16:56, David Brown wrote:
    On 16/06/2024 17:09, DFS wrote:
    On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
    On 16.06.2024 06:17, Keith Thompson wrote:

    For the original problem, where the input consists of digits and
    whitespace, you could read a character at a time and accumulate the
    value of each number.  (You probably want to handle leading signs as
    well, which isn't difficult.)

    Yes. Been there, done that. Sometimes it's good enough to go back
    to the roots if higher-level functions are imperfect or quirky.

    That is admittedly reinventing the
    wheel, but the existing wheels aren't entirely round.  You still
    have to dynamically allocate the array of ints, assuming you need
    to store all of them rather than processing each value as it's read.

    A subclass of tasks can certainly process data on the fly but for
    the general solution there should be a convenient way to handle it.

    I still prefer higher-level languages that take the burden from me.

    nums = []
    with open('data.txt','r') as f:
         for nbr in f.read().split():
             nums.append(int(nbr))
         print(*sorted(nums))


    nums = sorted(map(int, open('data.txt', 'r').read().split()))

    OK, a bit of a challenge for my scripting language. I managed this first:

    nums := sort(mapv(toval, splitstring(readstrfile("data.txt"))))

    It needed a change to 'splitstring' to allow a default separator
    consisting of white space of any length. And a one-line helper function
    'toval' since the usual candidates, special built-ins, were not valid
    for 'mapv'.

    It also works like this:

    nums := readstrfile("data.txt") -> splitstring -> mapv(toval) -> sort

    But only by chance since the 'piped' argument is the last one of
    multi-parameter functions, rather than the first.

  • From Michael S@21:1/5 to Janis Papanagnou on Mon Jun 17 00:45:33 2024
    On Sun, 16 Jun 2024 17:37:34 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 13:31, Michael S wrote:
    On Sun, 16 Jun 2024 12:03:21 +0200
    Janis Papanagnou <[email protected]> wrote:

    On 16.06.2024 11:38, Michael S wrote:


    Again; which ones?

    Janis

    [...]



    The main misconceptions are about what is returned by fgets() and by
    realloc().
    fgets() returns its first argument in all cases except EOF or FS
    read error. That includes the case when the buffer is too short to
    accommodate the full input line.

    fgets() return s on success, and NULL on error or when end
    of file occurs while no characters have been read.

    I am interested in the success case, thus "fgets() != NULL".
    The buffer size is controlled by the second function parameter.


    With realloc() there are two issues:
    1. It can sometimes return its first argument, but does not have
    to. In a scenario like yours it will return a different pointer
    quite soon.
    2. When realloc() returns NULL, it does not de-allocate its
    first argument.

    The second case, of course, is not important in practice, because in
    practice you're very unlikely to see realloc() returning NULL, and
    if it nevertheless did happen, your program is unlikely to survive
    and give a meaningful result anyway. Still, here on c.l.c we like to
    pretend that we can meaningfully handle allocation failures.

    You may provide "correct" code (if you think mine is wrong). Or just
    inspect how it behaves; here's the output after two printf's added:

    $ printf "123 456 789 101112 77 88 99 101 999" | realloc
    123 456 78<+++
    123 456 78<===
    9 101112 7<+++
    123 456 789 101112 7<===
    7 88 99 10<+++
    123 456 789 101112 77 88 99 10<===
    1 999<+++
    123 456 789 101112 77 88 99 101 999<===
    123 456 789 101112 77 88 99 101 999

    The +++data+++ is the chunk read, and the ===data=== is the overall
    buffer content, and the final line again the result (as before, with
    the added newline as documented for puts()).

    Even though my code is just an "ad hoc test code" to demonstrate the
    procedure I outlined - and as such test code certainly lacking quite
    some error handling and much more - it does exactly what I intended it
    to do, and what I've implemented according to what I read in the man
    pages. I cannot see any "misconception", it does what was _intended_.

    There's indeed one point that I _deliberately_ ignored for the test
    code; actually the point you mentioned as "not important in practice".

    Again: You may provide "correct" code (if you think mine is "wrong"),
    or "better" code, usable for production instead of a test code.

    But your tone and statements were (as observed so often) inadequate;
    I quote from your post that started the subthread:
    "However it looks unlikely that it does what you meant for it to do."


    Did you consider that, may be, your understanding of C library is
    inadequate?


    It does exactly what I meant to do (as you can see in the logs above).

    Janis


    The only thing I see in your logs is that your testing skills are on par
    with your coding skills.

    I am not quite sure what your code is supposed to do.
    However, my impression was that we wanted to read text file line by
    line and to process lines separately. Your code certainly does not do
    it. As I said above it contains several mistakes, including one that is
    particularly serious and hard to diagnose - a use of de-allocated
    buffer.

    Below is an example of how to do it correctly.
    It's not the only possible method, but any correct method would have
    similar complexity.


    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
      const int sz_incr = 10;
      size_t bufsz = 32;
      char* buffer = malloc(bufsz);
      if (!buffer) {
        perror("malloc()");
        return 1;
      }

      size_t rdi = 0;
      int err = 0;
      for (size_t line_i = 1; ;) {
        buffer[bufsz-1] = 1; // set guard
        if (!fgets(&buffer[rdi], bufsz-rdi, stdin))
          break; // eof or error

        if (buffer[bufsz-1] != 0 || buffer[bufsz-2] == '\n') {
          // Full line - we can process it here
          // As an example, let's print our line as hex
          printf("%10zu:", line_i);
          size_t len = (char*)memchr(&buffer[rdi], '\n',
                                     bufsz-rdi-1)+1-buffer;
          for (size_t i = 0; i < len; ++i)
            printf(" %02x", (unsigned char)buffer[i]);
          printf("\n");
          rdi = 0;
          ++line_i;
        } else {
          // line is longer than bufsz-1
          rdi = bufsz-1;
          bufsz += sz_incr;
          char* tmp = realloc(buffer, bufsz);
          if (!tmp) {
            perror("realloc()");
            err = 1;
            break;
          }
          buffer = tmp;
        }
      }

      free(buffer);
      if (ferror(stdin)) {
        perror("fgets(stdin)");
        err = 2;
      }
      return err;
    }

  • From Ben Bacarisse@21:1/5 to DFS on Mon Jun 17 00:17:18 2024
    DFS <[email protected]> writes:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0;
    int *nums = malloc(2 * sizeof(int));

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &nums[N++]) == 1){
    nums = realloc(nums, (N+1) * sizeof(int));
    }
    fclose (datafile);

    N--;

    This N-- is a bit "tricksy". Better to increment in the realloc (or the
    while body) so it only happens when an int has been read.
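
    A sketch of that idea, keeping the grow-by-one allocation of the
    posted code but reading into a local variable so that the count only
    advances after a successful read. The function name read_ints is an
    assumption for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads ints from `in` into a malloc'ed array,
   growing it by one element per value. Returns the array (NULL if the
   input is empty or allocation fails) and stores the count in *count. */
int *read_ints(FILE *in, size_t *count)
{
    int *nums = NULL;
    int value;
    size_t n = 0;

    while (fscanf(in, "%d", &value) == 1) {
        int *tmp = realloc(nums, (n + 1) * sizeof *nums);
        if (tmp == NULL) {
            free(nums);
            *count = 0;
            return NULL;
        }
        nums = tmp;
        nums[n++] = value;   /* n grows only after a successful read */
    }
    *count = n;
    return nums;
}
```

    With the increment inside the loop body there is no need for a
    compensating decrement after the loop.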

    --
    Ben.

  • From James Kuyper@21:1/5 to DFS on Sun Jun 16 22:41:26 2024
    On 6/16/24 12:20, DFS wrote:
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    ...
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses make it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.

    The parentheses you're putting in are completely unrelated to the use
    of parentheses in _Generic(), function calls, compound literals,
    sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(),
    typeof_unqual(), alignas(), function declarators, static_assert(),
    if(), switch(), for(), while(), do ... while(), function-like macro
    definitions and invocations, or cast expressions. In all of those
    cases, the parentheses are part of the grammar.

    The parentheses that you put in return(0) serve only for grouping
    purpose. They are semantically equivalent to the parentheses in "i =
    (0);"; they are just as legal, and just as pointless.

    If your brain doesn't immediately understand why what I said above is
    true, I recommend retraining it.

  • From James Kuyper@21:1/5 to DFS on Mon Jun 17 00:30:01 2024
    On 6/16/24 12:20, DFS wrote:
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    ...
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses make it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.

    What behavior does your brain expect of the following code?:

    return(a+b)*2;

    If you have any trouble with interpreting that code, you need to retrain
    your brain.
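
    For anyone who wants to check, here is that statement wrapped in a
    function. The name scaled_sum is made up for illustration. Since the
    parentheses only group a+b, the whole expression (a+b)*2 is returned.

```c
/* Hypothetical function for illustration. */
int scaled_sum(int a, int b)
{
    return(a+b)*2;    /* parsed as: return ((a + b) * 2); */
}
```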

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Mon Jun 17 07:41:01 2024
    On 16.06.2024 22:32, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...] K&R at
    least seems to say that 'void' can only be declared for the
    return type of functions that do not return anything.
    [...]

    No version of C has ever permitted "void main" except when an
    implementation documents and permits it. [...]

    I cannot comment on main() being handled differently than
    other C functions. I was just quoting my old copy of K&R.

    I don't understand what you mean with "no version of C has
    ever permitted", given that my C compiler doesn't complain.

    WRT return value to the environment I'd expect any random
    or arbitrary value being returned in case that none had been
    explicitly specified to be returned.

    If I want a defined exit status (which is what I usually
    want) I specify 'int main (...)' and provide an explicit
    return statement (or exit() call).

    Janis

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Mon Jun 17 07:48:39 2024
    On 17.06.2024 02:06, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...]
    #include <stdlib.h>
    #include <stdio.h>

    void main (int argc, char * argv[])
    {
    int chunk = 10;
    int bufsize = chunk+1;
    char * buf = malloc(bufsize);
    char * anchor = buf;
    while (fgets(buf, chunk+1, stdin) != NULL)
    if (realloc(anchor, bufsize += chunk) != NULL)
    buf += chunk;
    puts(anchor);
    }

    realloc() can return the pointer you pass to it if there's enough room
    in the existing location. (Or it can relocate the buffer even if there
    is enough room.)

    But if realloc() moves the buffer (copying the existing data to it), it
    returns a pointer to the new location and invalidates the old one. You
    discard the new pointer, only comparing it to NULL.

    Perhaps you assumed that realloc() always expands the buffer in place.
    It doesn't.

    No, I didn't assume that. I just missed that 'anchor' will get lost.
    Thanks!


    If the above program worked for you, I suspect that either realloc()
    never relocated the buffer, or you continued using the original buffer
    (and beyond) after realloc() invalidated it. [...]

    Yes, that was certainly the case. (I did no thorough test with large
    data sets, just a simple ad hoc test.)


    Elsethread I suggested to merge the malloc() with the realloc() call.
    The resulting code would be simpler (and might address that problem).

    int chunk = 10;
    int bufsize = 1;
    char * anchor = NULL;
    while ((anchor = realloc (anchor, bufsize += chunk)) != NULL &&
    fgets (anchor+bufsize-chunk-1, chunk+1, stdin) != NULL)
    ;
    puts (anchor);


    Do you see the exposed problem (or any other issues) here, too?

    Janis

  • From Tim Rentsch@21:1/5 to James Kuyper on Sun Jun 16 22:45:28 2024
    James Kuyper <[email protected]> writes:

    On 6/16/24 12:20, DFS wrote:

    On 6/15/2024 6:22 PM, Keith Thompson wrote:

    DFS <[email protected]> writes:

    ...

    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses make it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.

    The parentheses you're putting in are completely unrelated to the use
    of parentheses in _Generic(), function calls, compound literals,
    sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(),
    typeof_unqual(), alignas(), function declarators, static_assert(),
    if(), switch(), for(), while(), do ... while(), function-like macro
    definitions and invocations, or cast expressions. In all of those
    cases, the parentheses are part of the grammar. [...]

    I'm pretty sure the "it" in "Can't omit it" was meant to refer
    to having the return statement, not to the parentheses.

  • From Janis Papanagnou@21:1/5 to Tim Rentsch on Mon Jun 17 07:52:25 2024
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    I think I wouldn't code a missile control system in "C". ;-)

    Janis

  • From Tim Rentsch@21:1/5 to Keith Thompson on Sun Jun 16 22:40:48 2024
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

  • From James Kuyper@21:1/5 to Janis Papanagnou on Mon Jun 17 02:21:22 2024
    On 6/17/24 01:41, Janis Papanagnou wrote:
    On 16.06.2024 22:32, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...] K&R at
    least seems to say that 'void' can only be declared for the
    return type of functions that do not return anything.
    [...]

    No version of C has ever permitted "void main" except when an
    implementation documents and permits it. [...]

    I cannot comment on main() being handled differently than
    other C functions. I was just quoting my old copy of K&R.

    It is handled differently. Your own functions can be declared in a wide
    variety of ways, so long as the declaration that is relevant to the
    function designator in a function call is compatible with the
    definition of the function that it designates.
    C standard library functions can only be declared in ways compatible
    with the specifications in the C standard.
    main(), on the other hand, is unique, in that you have two incompatible
    choices of how to define it, and an implementation can designate
    additional choices. You can define main() in any way compatible with one
    of the options supported by your implementation; but portable code
    should define it only in one of the two ways specified by the C standard.
    K&R is long obsolete; up-to-date drafts of the standard that are almost
    identical to the latest version of the standard are free and easily
    available.

    I don't understand what you mean with "no version of C has
    ever permitted", given that my C compiler doesn't complain.

    He wrote "No version of C has ever permitted "void main" except when an
    implementation documents and permits it." Note that he is talking about
    versions of the standard, not versions of any particular implementation
    of C. If your C compiler "documents and permits" "void main", then it
    certainly shouldn't complain about it. However, since the C standard
    does not mandate support for void main, you've no guarantee of
    portability of code that uses void main to other implementations of C.

  • From James Kuyper@21:1/5 to Tim Rentsch on Mon Jun 17 02:38:16 2024
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    The standard doesn't say anything to prohibit such a consequence, but in
    real life such an outcome is possible only if your program is executing
    in an environment that allows it to send out launch messages to real
    missiles. In such a context, a program that was intended to launch a
    missile strike, and seemed to do so, but actually failed to do so, would arguably be worse. If the enemy knows that you are running such
    defective software, that enemy might not be deterred from attacking.

  • From Janis Papanagnou@21:1/5 to James Kuyper on Mon Jun 17 09:22:58 2024
    On 17.06.2024 08:21, James Kuyper wrote:
    On 6/17/24 01:41, Janis Papanagnou wrote:
    On 16.06.2024 22:32, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...] K&R at
    least seems to say that 'void' can only be declared for the
    return type of functions that do not return anything.
    [...]

    No version of C has ever permitted "void main" except when an
    implementation documents and permits it. [...]

    I cannot comment on main() being handled differently than
    other C functions. I was just quoting my old copy of K&R.

    It is handled differently. Your own functions can be declared in a wide variety of ways, so long as the declaration that is relevant to function designator in a function call is compatible with the definition of the function that it designates.
    C standard library functions can only be declared in ways compatible
    with the specifications in the C standard.
    main(), on the other hand, is unique, in that you have two incompatible choices of how to define it, and an implementation can designate
    additional choices. You can define main() in any way compatible with one
    of the options supported by your implementation; but portable code
    should define it only in one of the two ways specified by the C standard.
    K&R is long obsolete; up-to-date drafts of the standard that are almost identical to the latest version of the standard are free and easily available.

    I don't understand what you mean with "no version of C has
    ever permitted", given that my C compiler doesn't complain.

    He wrote "No version of C has ever permitted "void main" except when an implementation documents and permits it." Note that he is talking about versions of the standard, not versions of any particular implementation
    of C. If your C compiler "documents and permits" "void main", then it certainly shouldn't complain about it. However, since the C standard
    does not mandate support for void main, you've no guarantee of
    portability of code that uses void main to other implementations of C.


    Re portability: Of course there's other requirements for portable
    and generally for professional code. I wrote a lot more professional
    code in C++ than in C but the same requirements hold. Defining the
    version of the standard, the supported platforms, activation of high
    warning levels - we wanted our code free of warnings! -, and whatnot.

    Janis

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Mon Jun 17 09:35:07 2024
    On 17.06.2024 08:29, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...]

    Elsethread I suggested to merge the malloc() with the realloc() call.
    The resulting code would be simpler (and might address that problem).

    int chunk = 10;
    int bufsize = 1;
    char * anchor = NULL;
    while ((anchor = realloc (anchor, bufsize += chunk)) != NULL &&
    fgets (anchor+bufsize-chunk-1, chunk+1, stdin) != NULL)
    ;
    puts (anchor);


    Do you see the exposed problem (or any other issues) here, too?

    If stdin is empty, you never store anything in the buffer and
    puts(anchor) has undefined behavior because there might not be a terminating '\0'. If the first realloc() fails, anchor is a null pointer and again puts(anchor) has undefined behavior.

    If nothing goes wrong, puts() adds an extra newline to the output.

    Yes, the purpose of puts() was to show whether the function that I
    wanted to check works properly on a long line of data.


    That's all that jumped out at me looking at the code, but did you test
    it with multi-line input? When I tried it it printed only the first
    line of input (followed by that extra newline).

    I'm still not entirely sure what the code is supposed to do.

    I just wanted the realloc-append logic codified and verified. (As
    a possible building block to create a function for the purpose of
    the original task outlined by the OP.)

    Janis
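
    The realloc-append logic discussed in this post can be sketched as a small function. This is only an illustration under stated assumptions, not Janis's code: the name read_line and the CHUNK constant are hypothetical, and the sketch tracks the stored length explicitly so that each fgets() call appends directly after the previous data. It stops at the first newline and returns NULL instead of an unterminated buffer when nothing was read or realloc() fails.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CHUNK = 10 };   /* small on purpose, so the buffer has to grow */

/* Hypothetical helper: reads one line of any length from fp.
   Returns a malloc'd, '\0'-terminated string (caller frees) that keeps the
   trailing newline if there was one. Returns NULL if nothing could be read
   or if realloc fails. */
char *read_line(FILE *fp)
{
    char *buf = NULL;
    size_t len = 0;                 /* characters stored so far, without '\0' */

    for (;;) {
        /* Always keep room for CHUNK more characters plus the terminator. */
        char *tmp = realloc(buf, len + CHUNK + 1);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;

        if (fgets(buf + len, CHUNK + 1, fp) == NULL) {
            if (len == 0) {         /* empty input: there is no string to return */
                free(buf);
                return NULL;
            }
            buf[len] = '\0';        /* keep the partial line terminated */
            return buf;
        }

        len += strlen(buf + len);   /* append position for the next chunk */
        if (len > 0 && buf[len - 1] == '\n')
            return buf;             /* a complete line has been read */
    }
}
```

    A caller would loop on read_line() until it returns NULL and free() each returned line. Using fputs() rather than puts() to print a line avoids the extra newline mentioned above.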

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Mon Jun 17 09:16:00 2024
    On 17.06.2024 08:20, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    On 16.06.2024 22:32, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    [...] K&R at
    least seems to say that 'void' can only be declared for the
    return type of functions that do not return anything.
    [...]

    No version of C has ever permitted "void main" except when an
    implementation documents and permits it. [...]

    I cannot comment on main() being handled differently than
    other C functions. I was just quoting my old copy of K&R.

    First or second edition?

    It's a translation of a "(c) 1977 Prentice Hall" original, with
    no further edition mentioned, so it's probably the 1st edition?


    But main() *is* handled differently than other functions,

    Be assured, I don't object! It was just not mentioned that it's
    a special case.

    and
    that's important to understand. It's effectively called by the
    environment, which means that your definition has to cooperate
    with what the environment expects.

    I'm not sure whether my K&R copy addresses that at all. A quick
    view and I see only one instance where "main()" is mentioned at
    the beginning: main() { printf("hello, world\n"); }
    No types here, and no environment aspects mentioned.

    Also mind that (other) languages don't need to interact with the
    environment. Just recently I noticed that the Algol 68 Genie
    does not pass the value of the outmost block to the environment.
    Environment questions, interaction with the OS, may not be part
    of the language. (I'm not saying anything about the C standard
    [that I don't know]. Just a comment in principle.)

    What's slightly weird about
    it is that it can be defined in (at least) two different ways,
    with or without argc and argv.

    Frankly, I have no idea about the details of evolution of the
    C language. The old K&R source I have I had considered a pain;
    it provoked more questions than giving answers. And since then
    C changed a lot. That's why I stay mostly conservative with C
    and if in doubt check things undogmatic just with my compiler.

    [...]

    If I want a defined exit status (which is what I usually
    want) I specify 'int main (...)' and provide an explicit
    return statement (or exit() call).

    Why would you ever not want a defined exit status, given that it's
    easier to have one than not to have one?

    Aren't we agreeing here? (The only difference is that you are
    formulating in a negated form where I positively said the same.)

    (Since C99 an explicit
    return or exit() is optional.) I can't think of any reason *at all*
    to use "void main" in C with a hosted implementation. Can you?

    Well, to indicate that there's no status information or that
    it's irrelevant. E.g. as was the case in the test fragment I
    posted.

    In programs I typically write there's a lot things that can
    possibly go wrong - and that the program cannot fix -, mostly
    (but not exclusively) externalities. So it's typical that I
    interrogate return status of functions, map them to a defined
    set of return codes, create own codes for data inconsistencies
    etc. And this status is relevant and of course returned to the
    environment (often accompanied by some textual information on
    stderr).

    (If you don't care about the exit status, you can just write
    "int main" and not bother with a return statement or exit() call.
    The exit status will be 0, but that's not a problem if you don't
    care about it.)

    Whatever current C standards - and I'm not sure what ancient
    'cc' is on my system and to what standard it complies - say,
    if I specify an 'int' return type I also want a 'return' (or
    exit()) - consider it as "code hygienics" - even if it's not
    necessary according to more recent standards.

    Janis

  • From Kaz Kylheku@21:1/5 to James Kuyper on Mon Jun 17 07:39:36 2024
    On 2024-06-17, James Kuyper <[email protected]> wrote:
    On 6/16/24 12:20, DFS wrote:
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    ...
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses makes it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.

    I think DFS might mean that they find themselves unable to omit the
    unnecessary return 0 statement entirely.

    I also hate it; I feel that the implicit return 0 in main is a
    misfeature that was added due to caving in to bad programmers.

    Making int main(void) { } correct is like legalizing weed.
    Potheads are still potheads. Since I'm not one, I write a
    return statement in main.

    The parentheses you're putting in are completely unrelated to the use of parentheses in _Generic(), function calls, compound literals,
    sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(), typeof_unqual(), alignas(), function declarators, static_assert(), if(), switch(), for(), while(), do ... while(), function-like macro definitions
    and invocations or cast expressions. In all of those cases, the
    parentheses are part of the grammar.

    Speaking of while, the do/while construct does not require parentheses
    in order to disambiguate anything, since it has a mandatory semicolon.
    Yet, it still has them.

    There would be no issue with this grammar:

    iteration_statement := 'do' statement 'while' expression ';'

    the fragment "'while' expression ';'" is exactly like
    "'return' expression ';'".

    Obviously, the parentheses are there for consistency with the
    top-testing while loop.

    It seems that in some people's eyes, the same consistency should extend
    to the return statement.

    More widespread than that is a practice of always using parentheses
    around the argument of sizeof, even if it's an expression and not
    a type.
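
    The difference between grammatical and merely grouping parentheses can be illustrated with a short sketch. The function name element_count is hypothetical and exists only for this example; the point is that sizeof needs parentheses for a type name but not for an expression, and that a parenthesized return expression means the same as the bare one.

```c
#include <stddef.h>

/* Hypothetical demo function: number of elements in a local array. */
size_t element_count(void)
{
    int nums[20];

    /* sizeof applied to an expression: parentheses are optional. */
    size_t total = sizeof nums;
    size_t each = sizeof nums[0];

    /* sizeof applied to a type name: parentheses are part of the grammar. */
    size_t also_each = sizeof(int);

    if (each != also_each)
        return 0;

    return (total / each);   /* same meaning as: return total / each; */
}
```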

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From David Brown@21:1/5 to bart on Mon Jun 17 14:44:47 2024
    On 16/06/2024 19:14, bart wrote:
    On 16/06/2024 16:56, David Brown wrote:
    On 16/06/2024 17:09, DFS wrote:
    On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
    On 16.06.2024 06:17, Keith Thompson wrote:

    For the original problem, where the input consists of digits and
    whitespace, you could read a character at a time and accumulate the
    value of each number.  (You probably want to handle leading signs as well, which isn't difficult.)

    Yes. Been there, done that. Sometimes it's good enough to go back
    to the roots if higher-level functions are imperfect or quirky.

    That is admittedly reinventing the
    wheel, but the existing wheels aren't entirely round.  You still
    have to dynamically allocate the array of ints, assuming you need
    to store all of them rather than processing each value as it's read.

    A subclass of tasks can certainly process data on the fly but for
    the general solution there should be a convenient way to handle it.

    I still prefer higher-level languages that take the burden from me.

    nums = []
    with open('data.txt','r') as f:
         for nbr in f.read().split():
             nums.append(int(nbr))
         print(*sorted(nums))


    nums = sorted(map(int, open('data.txt', 'r').read().split()))

    OK, a bit of a challenge for my scripting language. I managed this first:

      nums := sort(mapv(toval, splitstring(readstrfile("data.txt"))))

    It needed a change to 'splitstring' to allow a default separator
    consisting of white space of any length. And a one-line helper function 'toval' since the usual candidates, special built-ins, were not valid
    for 'mapv'.


    That's nice, but irrelevant - the OP can use the Python version if he
    decides writing the C version is not fun any more, but your language is
    useless to everyone but you.

    It also works like this:

      nums := readstrfile("data.txt") -> splitstring -> mapv(toval) -> sort

    But only by chance since the 'piped' argument is the last one of multi-parameter functions, rather than the first.


    A piping syntax is, IMHO, also a nice feature (though again the OP will
    have no direct use of your language).

    Some people might like to do this all with shell pipes:

    cat data.txt | xargs -n 1 | sort -n | xargs

    That kind of thing can quickly get more awkward as the details change,
    such as if the data is separated by commas - by the time you have
    figured out the "awk" or "sed" commands needed, you'd be much faster
    with Python.
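
    For comparison with the one-liners, the character-at-a-time approach quoted at the top of this post might look roughly like this in C. It is a minimal sketch, not code from the thread: the name next_int and its return convention are assumptions, and for simplicity it rejects values above INT_MAX in magnitude (so INT_MIN itself is not accepted).

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: reads the next integer from fp one character at a
   time. Returns 1 and stores the value in *out on success, 0 at end of
   input or when the next token is not a number, and -1 if the digits do
   not fit in an int (the remaining digits are left unread). */
int next_int(FILE *fp, int *out)
{
    int c;
    int sign = 1;
    int value = 0;

    /* Skip leading whitespace. */
    do {
        c = getc(fp);
    } while (c != EOF && isspace(c));

    /* Optional leading sign. */
    if (c == '+' || c == '-') {
        if (c == '-')
            sign = -1;
        c = getc(fp);
    }

    if (c == EOF || !isdigit(c))
        return 0;

    /* Accumulate the digits, checking for overflow before each step. */
    while (c != EOF && isdigit(c)) {
        int digit = c - '0';
        if (value > (INT_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
        c = getc(fp);
    }

    if (c != EOF)
        ungetc(c, fp);       /* leave the delimiter for the next call */

    *out = sign * value;
    return 1;
}
```

    Calling next_int() in a loop until it stops returning 1 yields the values in file order; storing them still needs a dynamically grown array, as noted in the quoted text.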

  • From DFS@21:1/5 to James Kuyper on Mon Jun 17 08:50:06 2024
    On 6/16/2024 10:41 PM, James Kuyper wrote:
    On 6/16/24 12:20, DFS wrote:
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    ...
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses makes it look too much like a function call. I'd
    write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.

    The parentheses you're putting in are completely unrelated to the use of parentheses in _Generic(), function calls, compound literals,
    sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(), typeof_unqual(), alignas(), function declarators, static_assert(), if(), switch(), for(), while(), do ... while(), function-like macro definitions
    and invocations or cast expressions. In all of those cases, the
    parentheses are part of the grammar.

    The parentheses that you put in return(0) serve only for grouping
    purpose. They are semantically equivalent to the parentheses in "i =
    (0);"; they are just as legal, and just as pointless.

    If your brain doesn't immediately understand why what I said above is
    true, I recommend retraining it.


    I meant omit a return altogether.

    But looking around, I rarely see return(0). Don't know why it became a
    thing for me.

    Moving forward, return 0 it is.

  • From DFS@21:1/5 to Ben Bacarisse on Mon Jun 17 08:49:51 2024
    On 6/16/2024 7:17 PM, Ben Bacarisse wrote:
    DFS <[email protected]> writes:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[]) {

    int N=0;
    int *nums = malloc(2 * sizeof(int));

    FILE* datafile = fopen(argv[1], "r");
    while(fscanf(datafile, "%d", &nums[N++]) == 1){
    nums = realloc(nums, (N+1) * sizeof(int));
    }
    fclose (datafile);

    N--;

    This N-- is a bit "tricksy". Better to increment in the realloc (or the while body) so it only happens when an int has been read.

    I don't like it either, but I've already spent about 3 full days on the
    whole stats program, and life's too short. N has to be the number of
    data points, because it's used throughout the rest of the program.
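
    One way to avoid the trailing N-- is to read into a temporary and bump the count only after fscanf() reports a successful conversion, as suggested above. The following is a sketch under assumptions, not the code from the stats program: read_ints is a hypothetical helper, and the doubling growth policy is just one reasonable choice.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads whitespace-separated ints from fp.
   Returns a malloc'd array (caller frees) and stores the number of values
   in *count. Returns NULL with *count == 0 if no value was read or if
   realloc fails. */
int *read_ints(FILE *fp, size_t *count)
{
    int *nums = NULL;
    size_t n = 0;            /* values stored so far */
    size_t cap = 0;          /* allocated capacity, in elements */
    int value;

    *count = 0;
    while (fscanf(fp, "%d", &value) == 1) {
        if (n == cap) {
            size_t new_cap = cap ? cap * 2 : 16;
            int *tmp = realloc(nums, new_cap * sizeof *nums);
            if (tmp == NULL) {
                free(nums);
                return NULL;
            }
            nums = tmp;
            cap = new_cap;
        }
        nums[n++] = value;   /* n grows only after a successful read */
    }

    *count = n;
    return nums;
}
```

    With this shape the count returned through *count is already the number of data points, so nothing has to be decremented afterwards.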

  • From James Kuyper@21:1/5 to Janis Papanagnou on Mon Jun 17 09:38:54 2024
    On 6/17/24 03:16, Janis Papanagnou wrote:
    On 17.06.2024 08:20, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    ...
    and
    that's important to understand. It's effectively called by the
    environment, which means that your definition has to cooperate
    with what the environment expects.

    I'm not sure whether my K&R copy addresses that at all. A quick
    view and I see only one instance where "main()" is mentioned at
    the beginning: main() { printf("hello, world\n"); }
    No types here, and no environment aspects mentioned.

    K&R C did not have function prototypes. main() declared with no
    arguments indicates that main takes an unknown number of arguments, of unspecified type - as such it's compatible with taking either two
    arguments, or none. For backwards compatibility, you're still allowed to declare functions K&R style, but it's been more than 3 decades since it
    was a good idea to do so.

    ...
    If I want a defined exit status (which is what I usually
    want) I specify 'int main (...)' and provide an explicit
    return statement (or exit() call).

    Why would you ever not want a defined exit status, given that it's
    easier to have one than not to have one?

    Aren't we agreeing here? (The only difference is that you are
    formulating in a negated form where I positively said the same.)

    You implied, by saying "If I want a defined exit status", that there are occasions where you don't want a defined exit status - and he's
    questioning that. Things that are undefined are seldom useful. If the
    exit status is undefined, it might be a failure status. In many
    contexts, that would cause no problems, but there's also places where it
    would.

    ...
    Well, to indicate that there's no status information or that
    it's irrelevant. E.g. as was the case in the test fragment I
    posted.

    That's the problem - your "indication that there's no status
    information" doesn't achieve the desired effect. Instead, it results in
    an unspecified status being returned to the system. It might be a
    successful status, or an unsuccessful status. On the systems I use,
    scripts that execute programs will often abort if the program returns an unsuccessful status code. If there's nothing that needs to be brought to
    the system's attention, use "return 0;", not "void main()".
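
    To make the point about exit statuses concrete, here is a small sketch that maps an outcome onto the two portable status values from <stdlib.h>. The helper name run is hypothetical, and the file processing is deliberately left out; only the status handling is shown.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: does the program's work and reports the outcome
   as a status value that main can return unchanged. */
int run(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        fprintf(stderr, "cannot open %s\n", path);
        return EXIT_FAILURE;     /* calling scripts can detect the failure */
    }

    /* ... process the file here ... */

    fclose(fp);
    return EXIT_SUCCESS;         /* same effect as returning 0 from main */
}

/* A hosted program would then define main along these lines:
 *
 *     int main(int argc, char *argv[])
 *     {
 *         if (argc < 2)
 *             return EXIT_FAILURE;
 *         return run(argv[1]);
 *     }
 */
```

    A shell script that checks the status of each command can then tell a missing input file apart from a normal run, which is not possible when the status is unspecified.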

  • From DFS@21:1/5 to Janis Papanagnou on Mon Jun 17 09:45:36 2024
    On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    I think I wouldn't code a missile control system in "C". ;-)

    Janis

    Per "Google AI Overview": "In 1987, the Department of Defense mandated
    that Ada be the standard programming language for Defense computer
    resources used in military command and control systems."

  • From DFS@21:1/5 to David Brown on Mon Jun 17 09:52:06 2024
    On 6/16/2024 11:56 AM, David Brown wrote:
    On 16/06/2024 17:09, DFS wrote:


    nums = []
    with open('data.txt','r') as f:
         for nbr in f.read().split():
             nums.append(int(nbr))
         print(*sorted(nums))


    nums = sorted(map(int, open('data.txt', 'r').read().split()))


    showoff!

  • From DFS@21:1/5 to Kaz Kylheku on Mon Jun 17 09:50:14 2024
    On 6/17/2024 3:39 AM, Kaz Kylheku wrote:

    I think DFS might mean that they find themselves

    he finds himself


    unable to omit the unnecessary return 0 statement entirely.

    yes

  • From Ben Bacarisse@21:1/5 to DFS on Mon Jun 17 15:41:17 2024
    DFS <[email protected]> writes:

    On 6/16/2024 10:41 PM, James Kuyper wrote:
    On 6/16/24 12:20, DFS wrote:
    On 6/15/2024 6:22 PM, Keith Thompson wrote:
    DFS <[email protected]> writes:
    ...
    return(0);

    A minor style point: a return statement doesn't require parentheses.
    IMHO using parentheses makes it look too much like a function call. I'd write `return 0;`, or more likely I'd just omit it, since falling off
    the end of main does an implicit `return 0;` (starting in C99).

    Can't omit it. It's required by my brain.
    The parentheses you're putting in are completely unrelated to the use of
    parentheses in _Generic(), function calls, compound literals,
    sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(),
    typeof_unqual(), alignas(), function declarators, static_assert(), if(),
    switch(), for(), while(), do ... while(), function-like macro definitions
    and invocations or cast expressions. In all of those cases, the
    parentheses are part of the grammar.
    The parentheses that you put in return(0) serve only for grouping
    purpose. They are semantically equivalent to the parentheses in "i =
    (0);"; they are just as legal, and just as pointless.
    If your brain doesn't immediately understand why what I said above is
    true, I recommend retraining it.

    I meant omit a return altogether.

    But looking around, I rarely see return(0). Don't know why it became a
    thing for me.

    Moving forward, return 0 it is.

    By the way, you might have retained return (exp); from old C. C
    originally required the parentheses, but they got dropped quite early
    on. The syntax in K&R (1st edition) does not require them, but almost
    all the code examples in the book still have them!

    I took a while to drop them as I came to C from B where they were always required so I'd got the habit.

    --
    Ben.

  • From Richard Harnden@21:1/5 to DFS on Mon Jun 17 16:23:54 2024
    On 17/06/2024 14:50, DFS wrote:
    On 6/17/2024 3:39 AM, Kaz Kylheku wrote:

    I think DFS might mean that they find themselves

    he finds himself


    unable to omit the unnecessary return 0 statement entirely.

    yes



    If a function is defined to return an int, then you should return one.

    Anything else is just lazy/sloppy. Just because main allows it as a
    special case doesn't mean it's a good idea.

    I mean: it's really not much extra to type.

  • From David Brown@21:1/5 to Richard Harnden on Mon Jun 17 18:46:59 2024
    On 17/06/2024 17:23, Richard Harnden wrote:
    On 17/06/2024 14:50, DFS wrote:
    On 6/17/2024 3:39 AM, Kaz Kylheku wrote:

    I think DFS might mean that they find themselves

    he finds himself


    unable to omit the unnecessary return 0 statement entirely.

    yes



    If a function is defined to return an int, then you should return one.

    Anything else is just lazy/sloppy.  Just because main allows it as a
    special case doesn't mean it's a good idea.

    I mean: it's really not much extra to type.

    There's nothing wrong with ending your "main" with "return 0;". What
    Keith said was that it is unnecessary, that using parentheses in the
    form "return(0);" looks like a function call and is considered poor
    style by many people, and that it is useful to know that when "main"
    exits without an explicit returned value, it does so as though it had
    exited with "return 0;". (And in another branch, he said the return
    type for "main" on hosted C systems should be "int".)

    These are all true statements.

    If you prefer to end "main" with "return 0;", that's absolutely fine -
    but it is /not/ lazy or sloppy to omit it.

  • From DFS@21:1/5 to Chris M. Thomasson on Mon Jun 17 17:07:41 2024
    On 6/17/2024 4:16 PM, Chris M. Thomasson wrote:
    On 6/17/2024 6:45 AM, DFS wrote:
    On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    I think I wouldn't code a missile control system in "C". ;-)

    Janis

    Per "Google AI Overview": "In 1987, the Department of Defense mandated
    that Ada be the standard programming language for Defense computer
    resources used in military command and control systems."



    Check this out:

    JOINT STRIKE FIGHTER
    AIR VEHICLE
    C++ CODING STANDARDS

    https://www.stroustrup.com/JSF-AV-rules.pdf

    ;^)


    Scary.

    I want to add a new AV Rule:

    * The Joint Strike Fighter Air Vehicle C++ Coding Standards document
    will not leave out Rules 161 and 172.

  • From Scott Lurndal@21:1/5 to DFS on Mon Jun 17 22:48:21 2024
    DFS <[email protected]> writes:
    On 6/17/2024 4:16 PM, Chris M. Thomasson wrote:
    On 6/17/2024 6:45 AM, DFS wrote:
    On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    I think I wouldn't code a missile control system in "C". ;-)

    Janis

    Per "Google AI Overview": "In 1987, the Department of Defense mandated
    that Ada be the standard programming language for Defense computer
    resources used in military command and control systems."



    Check this out:

    JOINT STRIKE FIGHTER
    AIR VEHICLE
    C++ CODING STANDARDS

    https://www.stroustrup.com/JSF-AV-rules.pdf

    ;^)


    Scary.

    It's useful to note that these rules were published two decades ago.

  • From Janis Papanagnou@21:1/5 to DFS on Tue Jun 18 06:57:06 2024
    On 17.06.2024 15:45, DFS wrote:
    On 6/17/2024 1:52 AM, Janis Papanagnou wrote:

    I think I wouldn't code a missile control system in "C". ;-)

    Per "Google AI Overview": "In 1987, the Department of Defense mandated
    that Ada be the standard programming language for Defense computer
    resources used in military command and control systems."

    This is actually what I'd have expected, that they might prefer Ada
    (as in the aviation or space flight areas).

    There's jokes existing[*] that illustrate the dilemma with safety in engineering. Using specific (unsafe) languages as well as reducing
    funds for QA measures or externalizing processes and components (or
    many other possible sources for increased unreliability); there's
    quite some tragic examples, sadly.

    Janis

    [*] e.g. https://www.reddit.com/r/Jokes/comments/6a1czd/a_group_of_engineering_professors_were_invited_to/

  • From Janis Papanagnou@21:1/5 to James Kuyper on Tue Jun 18 07:09:34 2024
    On 17.06.2024 15:38, James Kuyper wrote:
    On 6/17/24 03:16, Janis Papanagnou wrote:
    On 17.06.2024 08:20, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:
    If I want a defined exit status (which is what I usually
    want) I specify 'int main (...)' and provide an explicit
    return statement (or exit() call).

    Why would you ever not want a defined exit status, given that it's
    easier to have one than not to have one?

    Aren't we agreeing here? (The only difference is that you are
    formulating in a negated form where I positively said the same.)

    You implied, by saying "If I want a defined exit status", that there are occasions where you don't want a defined exit status

    ...where I don't _need_ one. Yes.
    E.g. in code like main() { printf("hello, world\n"); }

    - and he's
    questioning that. Things that are undefined are seldom useful.

    I disagree. If things are undefined it _may_ just not matter.
    If things matter they should not (ideally never) be undefined.

    If the
    exit status is undefined, it might be a failure status. In many
    contexts, that would cause no problems, but there's also places where it would.

    Exactly.


    ...
    Well, to indicate that there's no status information or that
    it's irrelevant. E.g. as was the case in the test fragment I
    posted.

    That's the problem - your "indication that there's no status
    information" doesn't achieve the desired effect. Instead, it results in
    an unspecified status being returned to the system. It might be a
    successful status, or an unsuccessful status. On the systems I use,
    scripts that execute programs will often abort if the program returns an unsuccessful status code. If there's nothing that needs to be brought to
    the system's attention, use "return 0;", not "void main()".

    In cases where the return status is a substantial part of the
    external specification, yes. Return status is not self purpose!
    YMMV.

    Janis

  • From Janis Papanagnou@21:1/5 to Ben Bacarisse on Tue Jun 18 08:12:40 2024
    On 17.06.2024 16:41, Ben Bacarisse wrote:
    DFS <[email protected]> writes:

    Moving forward, return 0 it is.

    By the way, you might have retained return (exp); from old C. C
    originally required the parentheses, but they got dropped quite early
    on. The syntax in K&R (1st edition) does not require them, but almost
    all the code examples in the book still have them!

    This is an interesting observation! (That I can confirm.)

    That's probably why originally I also used parentheses.
    I saw the examples but didn't inspect the syntax appendix.

    But how did the early compiler behave?
    Did they follow the code samples' syntax or the formal syntax?

    I took a while to drop them as I came to C from B where they were always required so I'd got the habit.

    I dropped them as soon as I noticed that it's possible.

    Janis

  • From Janis Papanagnou@21:1/5 to Keith Thompson on Tue Jun 18 08:02:13 2024
    On 18.06.2024 01:15, Keith Thompson wrote:
    Janis Papanagnou <[email protected]> writes:


    You've mentioned several things you have no idea about.

    (I have mainly no idea about the detailed differences of
    the "C" standard evolution.) I explicitly stated that to
    make it clear, since what's valid in "std X" (or
    in legacy systems) may be void in "std Y".

    Are you interested in learning?

    I think you are unnecessarily provocative, but I'm honestly
    answering your question...

    It depends. - Generally, I'm constantly learning of course
    (there's no "off"-switch).

    Concerning current "C"? - Clearly not in detailed changes.
    (I'm aware that the density of folks here that has every C
    standard version present and probably even considers the
    latest version as a bible (sort of) is high. That's fine.
    But I'm not a [religious (sort of)] follower.)

    "C" was never (thus still isn't) my "language of choice".
    I had already programmed in a couple of other (better) languages
    when I stumbled across C (in the 1980s). It has never been
    a paragon of a "good" programming language to me. (YMMV)

    In practice I switched _very early_ (as soon as it was
    available) to C++, mainly because of the OO concepts that
    I already knew from and used with Simula. The unreliable
    "C" base of that language was still a nuisance. (It's not
    surprising that "C" has later evolved in important parts.
    But that ship has sailed [for me; maybe also more widely].)

    I'm still (academically) interested in several questions
    concerning the C programming language. That's one reason
    why I'm raising or discussing topics here. It's private
    interest. (It should be a clear indication of learning.)

    [...]

    (I skip the part of your post that I just answered in a
    reply to James.)

    [...]

    Whatever current C standards - and I'm not sure what ancient
    'cc' is on my system and to what standard it complies - say,

    Perhaps you should find out what your ancient "cc" does. What OS
    are you on? Does "cc --version", "cc -V", or "man cc" give you
    any meaningful information?

    I'm on an (old) Unix system. The version of my GNU 'cc' is
    usually not important for the things I'm doing. I've just
    once (in the past year) used a '-std=...' switch to have
    some specific behavior guaranteed or a feature available.
    (I cannot use features of newer standards, year 2000+, but
    that's unimportant for the things I use that language for.)

    Professionally I used C only in the late 1980's for a short
    period of time.

    Frankly, professionally I do other things than programming
    [in C or else] these days.

    Given my age, and in the light of what I outlined above,
    don't expect to convince me with "expert details" of C,
    specifically newer C standards. (I'm sure they are very
    important for younger folks that [want/have to] use C in
    their professional context.)

    Quite some folks here seem to be of a similar age to me;
    I'm astonished there's so much eagerness concerning the
    "C" language. :-)

    [...]

    Janis

  • From Tim Rentsch@21:1/5 to Janis Papanagnou on Tue Jun 18 00:25:19 2024
    Janis Papanagnou <[email protected]> writes:

    On 17.06.2024 07:40, Tim Rentsch wrote:

    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    I think I wouldn't code a missile control system in "C". ;-)

    It's extremely unlikely that I will ever be working on a missile
    control system, either in C or in any other language. But that
    doesn't change either the truth or the point of my comment.

  • From Tim Rentsch@21:1/5 to Kaz Kylheku on Tue Jun 18 00:19:28 2024
    Kaz Kylheku <[email protected]> writes:

    Speaking of while, the do/while construct does not require parentheses
    in order to disambiguate anything, since it has a mandatory semicolon.
    Yet, it still has them.

    It has them to allow an extension for a "loop-and-a-half" control
    structure:

    do statement while ( expression ) statement

    and so for example

    do c = getchar(); while( c != EOF ) n++;

    to count characters on standard input.

  • From James Kuyper@21:1/5 to Janis Papanagnou on Tue Jun 18 03:25:59 2024
    On 6/18/24 01:09, Janis Papanagnou wrote:
    On 17.06.2024 15:38, James Kuyper wrote:
    ...
    You implied, by saying "If I want a defined exit status", that there are
    occasions where you don't want a defined exit status

    ...where I don't _need_ one. Yes.
    E.g. in code like main() { printf("hello, world\n"); }

    - and he's
    questioning that. Things that are undefined are seldom useful.

    I disagree. If things are undefined it _may_ just not matter.
    If things matter they should not (ideally never) be undefined.

    If the
    exit status is undefined, it might be a failure status. In many
    contexts, that would cause no problems, but there's also places where it
    would.

    Exactly.

    I was merely directly addressing your comment by pointing out that the
    status returned was unspecified. While technically correct that's like
    saying that a nuclear bomb could be used to light a match. It's
    unspecified because, as Keith pointed out, the behavior of the entire
    program is undefined.
    Is there anything that a program could be written to do on your computer
    that you would not like it to do? The C standard imposes no restrictions
    on the behavior of a program created by translating code that has
    undefined behavior, so a fully conforming implementation may legally
    translate such code into a program with that behavior. If the
    implementation you're using documents what it does with "void main()",
    you can rely upon that documentation - but do you know for a fact that
    it does document it?
    When you deliberately write a program that you know to have undefined
    behavior, what you are, in effect, telling the implementation to do is
    create an executable that has any behavior it wants to give you. If
    that's OK with you, there's no need to write a new program. Any existing
    program already has behavior that such code could legally be
    translated to, so you might as well just execute any arbitrary program
    that's already been translated, and save yourself some time.

  • From Lawrence D'Oliveiro@21:1/5 to Lew Pitcher on Tue Jun 18 08:06:40 2024
    On Sun, 16 Jun 2024 15:52:16 -0000 (UTC), Lew Pitcher wrote:

    (BTW, the 'blind-spot' I mentioned is that we often forget that we /can/
    use temporary files to store intermediary results. Sometimes we can
    manipulate a temporary file easier than we can manipulate malloc()ed (or
    other) storage. )

    Or we could use an “I/O stream”, basically a temporary file in RAM.

  • From David Brown@21:1/5 to Keith Thompson on Tue Jun 18 15:00:16 2024
    On 18/06/2024 00:44, Keith Thompson wrote:
    DFS <[email protected]> writes:
    On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
    On 17.06.2024 07:40, Tim Rentsch wrote:
    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.
    I think I wouldn't code a missile control system in "C". ;-)
    Janis

    Per "Google AI Overview": "In 1987, the Department of Defense mandated
    that Ada be the standard programming language for Defense computer
    resources used in military command and control systems."

    Please don't post AI-based misinformation.

    The DOD Ada mandate was introduced in 1991, and effectively dropped in 1997.


    And of course the USA DoD (I assume, when DFS failed to mention the
    country, he meant the USA) is only for one country. There are a great
    many other countries making missile control software around the world,
    and I know without doubt that Ada is not mandated in all of them.

    The open-source RTEMS "Real-Time Executive for Missile Systems" RTOS is
    written in a mix of C and Ada, and supports at least C, Ada and C++ for
    user code.

  • From Tim Rentsch@21:1/5 to James Kuyper on Tue Jun 18 17:01:07 2024
    James Kuyper <[email protected]> writes:

    On 17.06.2024 07:40, Tim Rentsch wrote:

    Keith Thompson <[email protected]> writes:

    The worst consequence of undefined behavior is having your code
    appear to "work".

    Personally I think causing a missile launch that started a
    worldwide thermonuclear war would be a worse consequence.
    YMMV.

    The standard doesn't say anything to prohibit such a consequence, but in
    real life such an outcome is possible only if your program is executing
    in an environment that allows it to send out launch messages to real
    missiles. In such a context, a program that was intended to launch a
    missile strike, and seemed to do so, but actually failed to do so, would
    arguably be worse. If the enemy knows that you are running such
    defective software, that enemy might not be deterred from attacking.

    None of that has any relevance to my comment.

  • From Tim Rentsch@21:1/5 to Keith Thompson on Tue Jun 18 17:24:44 2024
    Keith Thompson <[email protected]> writes:

    Tim Rentsch <[email protected]> writes:

    Kaz Kylheku <[email protected]> writes:

    Speaking of while, the do/while construct does not require parentheses
    in order to disambiguate anything, since it has a mandatory semicolon.
    Yet, it still has them.

    It has them to allow an extension for a "loop-and-a-half" control
    structure:

    do statement while ( expression ) statement

    and so for example

    do c = getchar(); while( c != EOF ) n++;

    to count characters on standard input.

    Oh? Do you have any evidence that that was the intent? [...]

    I think you're reading something into my remark that it
    didn't say.

  • From Kaz Kylheku@21:1/5 to Keith Thompson on Wed Jun 19 01:58:57 2024
    On 2024-06-19, Keith Thompson <[email protected]> wrote:
    Tim Rentsch <[email protected]> writes:
    Keith Thompson <[email protected]> writes:

    Tim Rentsch <[email protected]> writes:

    Kaz Kylheku <[email protected]> writes:

    Speaking of while, the do/while construct does not require parentheses
    in order to disambiguate anything, since it has a mandatory semicolon.
    Yet, it still has them.

    It has them to allow an extension for a "loop-and-a-half" control
    structure:

    do statement while ( expression ) statement

    and so for example

    do c = getchar(); while( c != EOF ) n++;

    to count characters on standard input.

    Oh? Do you have any evidence that that was the intent? [...]

    I think you're reading something into my remark that it
    didn't say.

    Or at least that you didn't mean.

    FWIW, it would seem that the phrase pattern:

    do statement while expression ;

    may be compatible with the proposed extension in a way
    manageable via LALR(1) parsing.

    I don't see difficulties in recursive descent, either.

    The near minimal Yacc grammar pasted below produces no conflicts,
    and is only slightly contorted. We treat the ')' token as the
    lowest precedence operator, and ';' as highest, which eliminates
    conflicts in the way that we want.

    I can explain why; another way is to remove the %nonassoc declarations,
    use "yacc -v", and study the conflict details in the y.output file.

    It's not clear whether the grammar can be nicely factored into the form
    used in the standard, which makes no use of precedence or associativity.
    (But would that be a requirement for leaving room for an extension?)

    %{

    %}

    %nonassoc ')'
    %token DO WHILE NUM
    %left '+'
    %nonassoc ';'

    %%

    while_statement : DO statement WHILE expr ';'
                    | DO statement WHILE '(' expr ')' statement
                    ;

    statement : ';'
              | expr ';'
              | '{' expr '}'
              | '{' '}'
              ;

    expr : '(' expr ')'
         | expr '+' expr
         | '+' expr
         | NUM
         ;

    %%

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From David Brown@21:1/5 to Keith Thompson on Wed Jun 19 09:50:45 2024
    On 19/06/2024 04:07, Keith Thompson wrote:
    Keith Thompson <[email protected]> writes:
    [...]
    That's fine. A return statement or exit() call is unnecessary
    in main() due to a special-case rule that was added in 1999 for
    compatibility with C++. I don't particularly like that rule myself.
    I choose to omit the return statement in small programs, but if
    you want to add the "return 0;", I have absolutely no objection.
    (I used to do that myself.) It even makes your code more portable
    to old compilers that support C90. (tcc claims to support C99,
    but it has a bug in this area.)

    A minor point: The latest unreleased version of tcc appears to fix this
    bug. In tcc 0.9.27, falling off the end of main (defined as
    "int main(void)") returns some random status. In the latest version, it
    returns 0, based on a quick experiment and a cursory examination of the
    generated object code. (tcc doesn't have an option to generate an
    assembly listing; I used "tcc -c" followed by "objdump -d".)


    Godbolt has support for tcc, which might be convenient if you want to
    look at its output.

    <https://godbolt.org/z/5hK7PbGbj>

  • From Kaz Kylheku@21:1/5 to Richard Harnden on Sat Jun 22 22:14:04 2024
    On 2024-06-17, Richard Harnden <[email protected]d> wrote:
    On 17/06/2024 14:50, DFS wrote:
    On 6/17/2024 3:39 AM, Kaz Kylheku wrote:

    I think DFS might mean that they find themselves

    he finds himself


    unable to omit the unnecessary return 0 statement entirely.

    yes



    If a function is defined to return an int, then you should return one.

    Anything else is just lazy/sloppy. Just because main allows it as a
    special case doesn't mean it's a good idea.

    I mean: it's really not much extra to type.

    The misfeature of missing return being success was, I believe, not
    intended to make programs shorter. It was intended to correct the
    random termination statuses of countless numbers of programs in a
    single stroke.

    Deliberately relying on this in a new program is like relying on a
    diaper. If you're of intermediate age, you don't do this.

    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @[email protected]

  • From Phil Carmody@21:1/5 to Kaz Kylheku on Sun Jun 23 11:20:16 2024
    Kaz Kylheku <[email protected]> writes:
    On 2024-06-17, Richard Harnden <[email protected]d> wrote:
    If a function is defined to return an int, then you should return one.

    Anything else is just lazy/sloppy. Just because main allows it as a
    special case doesn't mean it's a good idea.

    I mean: it's really not much extra to type.

    The misfeature of missing return being success was, I believe, not
    intended to make programs shorter. It was intended to correct the
    random termination statuses of countless numbers of programs in a
    single stroke.

    Deliberately relying on this in a new program is like relying on a
    diaper. If you're of intermediate age, you don't do this.

    Astronauts do this quite frequently. Some pilots too. And divers. And
    crane operators. It's a well-established solution to a known problem.

    However, I'd still put the explicit return in for a reason of
    literal portability: were I to want to lift that code out into
    a separate function called by main(), I'd want it to behave the
    same.

    Phil
    --
    We are no longer hunters and nomads. No longer awed and frightened, as we have gained some understanding of the world in which we live. As such, we can cast aside childish remnants from the dawn of our civilization.
    -- NotSanguine on SoylentNews, after Eugen Weber in /The Western Tradition/

  • From Lew Pitcher@21:1/5 to Lew Pitcher on Tue Jun 25 17:37:24 2024
    On Sun, 16 Jun 2024 15:52:16 +0000, Lew Pitcher wrote:

    On Sat, 15 Jun 2024 15:36:22 -0400, DFS wrote:

    I want to read numbers in from a file, say:

    47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
    294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
    108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
    193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
    15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291


    This code:
    1 opens the file
    2 fscanf thru the file to count the number of data points
    3 allocate memory
    4 rewind and fscanf again to add the data to the int array


    [snip]
    You /could/ create a temporary, binary, file, and write the fscanf()'ed
    values to it as part of the first loop. Once the first loop completes,
    you rewind this temporary file, and load your integer array by reading
    the (now converted to native integer format) values from that file.

    Still two passes, but using fscanf() in only one of those passes.
    [snip]

    For what it's worth, here's an example of what I suggest:

    /*
    The following code provides two examples of the approach I suggested.

    Example 1: while counting input numbers, write temp file with int values
    malloc() a buffer big enough for that count of int values
    fread() the temp file into the malloc()'ed buffer
    Note: conformant to ISO Standard C.

    Example 2: while counting input numbers, write temp file with int values
    mmap() the temp file, starting at the beginning, and sized to
    include all the int values in the file.
    Note: conformant to POSIX C extensions to ISO Standard C.

    Note: compile with -DUSE_MMAP to obtain mmap() variant, otherwise
    this will compile the malloc()/fread() variant
    */

    #include <stdio.h>
    #include <stdlib.h>

    #ifdef USE_MMAP
    #include <sys/mman.h>
    #define BANNER "Example of array loading using mmap()"
    #define FREEALLOC(x)
    #else
    #define BANNER "Example of array loading using malloc() and fread()"
    #define FREEALLOC(x) free((x))
    #endif

    static int *LoadIntArray(FILE *fp, size_t *Count);

    int main(void)
    {
        int status = EXIT_FAILURE, *array;
        size_t count;

        puts(BANNER);

        if ((array = LoadIntArray(stdin,&count)))
        {
            printf("%zu elements loaded\n",count);
            for (size_t index = 0; index < count; ++index)
                printf("array[%3zu] == %d\n",index,array[index]);

            FREEALLOC(array); /* if necessary, free() the malloc()'ed array */
            status = EXIT_SUCCESS;
        }
        return status;
    }

    static int *LoadIntArray(FILE *fp,size_t *Count)
    {
        FILE *tmp;
        int *array = NULL;
        size_t count = 0;

        if ((tmp = tmpfile()))
        {
            int buffer;

            for (count = 0; fscanf(fp,"%d",&buffer) == 1; ++count)
                fwrite(&buffer,sizeof buffer, 1,tmp);

            if (count)
            {
    #ifdef USE_MMAP
                /*
                ** USE mmap() to map temp_file data into process memory
                */
                fflush(tmp); /* push buffered values to the file before mapping it */
                array = mmap(NULL,
                             count * sizeof *array,
                             PROT_READ,MAP_PRIVATE,
                             fileno(tmp),
                             0);
                if (array == MAP_FAILED)
                {
                    array = NULL;
                    fprintf(stderr,"FAIL: Cannot mmap %zu element array\n",count);
                }
    #else
                /*
                ** USE malloc() to reserve a big enough heap-space buffer,
                ** then fread() the temp_file data into that buffer
                */
                if ((array = malloc(count * sizeof *array)))
                {
                    rewind(tmp);
                    if (fread(array,sizeof *array,count,tmp) != count)
                    {
                        free(array);
                        array = NULL;
                        fprintf(stderr,"FAIL: Cannot load %zu element array\n",count);
                    }
                }
                else fprintf(stderr,"FAIL: Cannot malloc() %zu element array\n",count);
    #endif
            }
            fclose(tmp);
        }
        else fprintf(stderr,"FAIL: Cannot allocate temporary work file\n");

        *Count = count; /* byproduct value that caller might find useful */
        return array;   /* either NULL (on failure) or pointer to array */
    }



    --
    Lew Pitcher
    "In Skills We Trust"

  • From DFS@21:1/5 to Lew Pitcher on Tue Jun 25 14:09:23 2024
    On 6/25/2024 1:37 PM, Lew Pitcher wrote:
    [snip]


    $ gcc -Wall LewPitcher_readnums.c -o lprn
    $ (no compile errors)
    $ ./lprn nums.txt
    Example of array loading using malloc() and fread()

    (then it just hung)


    $ gcc -Wall LewPitcher_readnums.c -o lprn -DUSE_MMAP
    $ (no compile errors)
    $ ./lprn nums.txt
    Example of array loading using mmap()

    (then it just hung)


    Am I supposed to hardcode the filename in there somewhere?

  • From Lew Pitcher@21:1/5 to DFS on Tue Jun 25 18:11:10 2024
    On Tue, 25 Jun 2024 14:09:23 -0400, DFS wrote:

    On 6/25/2024 1:37 PM, Lew Pitcher wrote:
    [snip]

    $ gcc -Wall LewPitcher_readnums.c -o lprn
    $ (no compile errors)
    $ ./lprn nums.txt
    Example of array loading using malloc() and fread()

    The program (both versions) takes input from stdin.
    Try
    ./lprn <nums.txt


    --
    Lew Pitcher
    "In Skills We Trust"
