I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118
245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144
245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195
32 4 54 79 193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78
55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164
195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
Any 'better' way?
Thanks
----------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int N=0, i=0, j=0;
int *nums;
FILE* datafile = fopen(argv[1], "r");
while(fscanf(datafile, "%d", &j) != EOF){
N++;
}
nums = calloc(N, sizeof(int));
rewind(datafile);
while(fscanf(datafile, "%d", &j) != EOF){
nums[i++] = j;
}
fclose (datafile);
printf("\n");
for(i=0;i<N;i++) {
printf("%d. %d\n", i+1, nums[i]);
}
printf("\n");
free(nums);
return(0);
}
----------------------------------------------------------
DFS <[email protected]> writes:
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
There are two issues: (1) you end up with a program that can't be
"piped" to (because the input can't be rewound), and (2) the file might change between counting and reading.
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and it is
preferable to read entire lines with fgets(3) or getline(3) and
parse them later with sscanf(3) or more specialized functions such
as strtol(3).
If you want to preserve your sanity, never use fscanf().
For the original problem, where the input consists of digits and
whitespace, you could read a character at a time and accumulate the
value of each number. (You probably want to handle leading signs as
well, which isn't difficult.)
That is admittedly reinventing the
wheel, but the existing wheels aren't entirely round. You still
have to dynamically allocate the array of ints, assuming you need
to store all of them rather than processing each value as it's read.
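A minimal sketch of the character-at-a-time approach described above, assuming the input holds only decimal integers (with an optional leading sign) separated by whitespace. The name `next_int` and its return codes are hypothetical, chosen only for illustration. After a -1 result the rest of a malformed token may still be unread, so a caller would normally stop there.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: read the next decimal integer from fp, one
 * character at a time. Returns 1 and stores the value on success,
 * 0 at end of input, -1 on a malformed or out-of-range token. */
int next_int(FILE *fp, int *out)
{
    int c;
    int negative = 0;
    int digits = 0;
    long long value = 0;

    /* Skip leading whitespace. */
    while ((c = getc(fp)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return 0;

    /* Optional leading sign. */
    if (c == '+' || c == '-') {
        negative = (c == '-');
        c = getc(fp);
    }

    /* Accumulate digits, refusing values that cannot fit in an int. */
    while (c != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        if (value > (long long)INT_MAX + 1)
            return -1;
        digits++;
        c = getc(fp);
    }
    if (digits == 0)
        return -1;
    /* The token must end at whitespace or at end of input. */
    if (c != EOF && !isspace(c))
        return -1;

    if (negative)
        value = -value;
    if (value > INT_MAX)    /* rejects a positive INT_MAX + 1 */
        return -1;
    *out = (int)value;
    return 1;
}
```

Unlike `fscanf("%d")`, this reports an out-of-range value instead of invoking undefined behavior. Storing the results still needs a dynamically sized array.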
On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and it is
preferable to read entire lines with fgets(3) or getline(3) and
parse them later with sscanf(3) or more specialized functions such
as strtol(3).
This would be also my first impulse, but you'd have to know
_in advance_ how long the data stream would be; the function
requires an existing buffer. So you'd anyway need a stepwise
input. [...]
.. but it's defined by POSIX, not by ISO C.
Janis Papanagnou <[email protected]> writes:
On 16.06.2024 05:41, Janis Papanagnou wrote:
On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and it is
preferable to read entire lines with fgets(3) or getline(3) and
parse them later with sscanf(3) or more specialized functions such
as strtol(3).
This would be also my first impulse, but you'd have to know
_in advance_ how long the data stream would be; the function
requires an existing buffer. So you'd anyway need a stepwise
input. [...]
Would it be sensible to have a malloc()'ed buffer used for the first
fgets() and then subsequent fgets() work on the realloc()'ed part? I
suppose the previously set data in the malloc area would be retained
so that there's no re-composition of cut numbers necessary?
Sure. "The contents of the new object shall be the same as that of the
old object prior to deallocation, up to the lesser of the new and old
sizes."
Keep in mind that you can't call realloc() on a non-null pointer that
wasn't allocated by an allocation function.
On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and it is
preferable to read entire lines with fgets(3) or getline(3) and
parse them later with sscanf(3) or more specialized functions
such as strtol(3).
This would be also my first impulse, but you'd have to know
_in advance_ how long the data stream would be; the function
requires an existing buffer.
So you'd anyway need a stepwise
input. On the plus side there's maybe a better performance
to read large buffer chunks and compose them on demand? But
a problem is the potential cut of the string of a number; it
requires additional clumsy handling. So it might anyway be
better (i.e. much more convenient) to use fscanf() ?
Janis
On 16.06.2024 07:21, Keith Thompson wrote:
Janis Papanagnou <[email protected]> writes:
On 16.06.2024 05:41, Janis Papanagnou wrote:
On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page
<https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and
it is preferable to read entire lines with fgets(3) or
getline(3) and parse them later with sscanf(3) or more
specialized functions such as strtol(3).
This would be also my first impulse, but you'd have to know
_in advance_ how long the data stream would be; the function
requires an existing buffer. So you'd anyway need a stepwise
input. [...]
Would it be sensible to have a malloc()'ed buffer used for the
first fgets() and then subsequent fgets() work on the realloc()'ed
part? I suppose the previously set data in the malloc area would
be retained so that there's no re-composition of cut numbers
necessary?
Sure. "The contents of the new object shall be the same as that of
the old object prior to deallocation, up to the lesser of the new
and old sizes."
Keep in mind that you can't call realloc() on a non-null pointer
that wasn't allocated by an allocation function.
Thanks. - I've just tried it with this ad hoc test code
#include <stdlib.h>
#include <stdio.h>
void main (int argc, char * argv[])
{
int chunk = 10;
int bufsize = chunk+1;
char * buf = malloc(bufsize);
char * anchor = buf;
while (fgets(buf, chunk+1, stdin) != NULL)
if (realloc(anchor, bufsize += chunk) != NULL)
buf += chunk;
puts(anchor);
}
I wonder whether it can be simplified by making malloc() obsolete
and using realloc() in a redesigned loop.
Janis
Not sure what this code is supposed to do.
However it looks unlikely that it does what you meant for it to do.
I recommend to read the [f*****g] manual. https://cplusplus.com/reference/cstdio/fgets/ https://cplusplus.com/reference/cstdlib/realloc/
On Sun, 16 Jun 2024 05:41:12 +0200
Janis Papanagnou <[email protected]> wrote:
On 16.06.2024 05:26, Lawrence D'Oliveiro wrote:
On Sun, 16 Jun 2024 01:56:49 +0300, Michael S wrote:
If you want to preserve your sanity, never use fscanf().
Quoth the man page <https://manpages.debian.org/3/scanf.3.en.html>:
It is very difficult to use these functions correctly, and it is
preferable to read entire lines with fgets(3) or getline(3) and
parse them later with sscanf(3) or more specialized functions
such as strtol(3).
This would be also my first impulse, but you'd have to know
_in advance_ how long the data stream would be; the function
requires an existing buffer.
Define formats with sensible maximal line length (512 sounds about
right) and refuse any input that has longer lines.
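A minimal sketch of that policy, combined with the fgets()/strtol() advice from the man page: each line is read into a fixed buffer, over-long lines are refused, and strtol() walks the line. The helper name `read_ints_by_line`, the caller-supplied output array, and the -1 error convention are assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 512    /* policy limit: longer lines are refused */

/* Hypothetical helper: store the integers found in fp into out[0..cap-1].
 * Returns the number stored, or -1 on an over-long line, a malformed
 * token, an out-of-range value, or more than cap values. */
long read_ints_by_line(FILE *fp, int *out, size_t cap)
{
    char line[MAX_LINE + 2];    /* the line, its '\n', and the '\0' */
    size_t n = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        /* A full buffer with no trailing newline means the line was cut. */
        if (len == sizeof line - 1 && line[len - 1] != '\n')
            return -1;

        char *p = line;
        for (;;) {
            char *end;
            errno = 0;
            long v = strtol(p, &end, 10);
            if (end == p)
                break;          /* no further number on this line */
            if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
                return -1;
            if (n == cap)
                return -1;
            out[n++] = (int)v;
            p = end;
        }
        /* Anything left over other than whitespace is bad data. */
        if (p[strspn(p, " \t\r\n")] != '\0')
            return -1;
    }
    return (long)n;
}
```

Because numbers are never split across reads, there is no need to stitch a cut token back together; the cost is that inputs with very long lines are rejected outright.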
So you'd anyway need a stepwise
input. On the plus side there's maybe a better performance
to read large buffer chunks and compose them on demand? But
a problem is the potential cut of the string of a number; it
requires additional clumsy handling. So it might anyway be
better (i.e. much more convenient) to use fscanf() ?
No, the behaviour of fscanf() is too non-intuitive.
On 15/06/2024 23:03, Ben Bacarisse wrote:
DFS <[email protected]> writes:
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
There are two issues: (1) you end up with a program that can't be
"piped" to (because the input can't be rewound), and (2) the file might
change between counting and reading.
It might change even while you're reading it once.
On 16.06.2024 10:11, Michael S wrote:
Not sure what this code is supposed to do.
Not sure what your comment is supposed to tell me.
However it looks unlikely that it does what you meant for it to do.
I recommend to read the [f*****g] manual. https://cplusplus.com/reference/cstdio/fgets/ https://cplusplus.com/reference/cstdlib/realloc/
I don't need the Web to access man pages. Thanks.
Janis
On Sun, 16 Jun 2024 11:07:22 +0200
Janis Papanagnou <[email protected]> wrote:
On 16.06.2024 10:11, Michael S wrote:
Not sure what this code is supposed to do.
Not sure what your comment is supposed to tell me.
I hoped that after reading the manual you would know.
But it obviously didn't work out.
So, I'd tell a little more:
1. It does not read one line of arbitrary length
2. There is more than one mistake
3. All mistakes seem to be caused by deep misconceptions about fgets()
and realloc().
[...]
Janis Papanagnou <[email protected]> writes:
void main (int argc, char * argv[])
*Ahem* -- int main.
On 16.06.2024 11:38, Michael S wrote:
Again; which ones?
Janis
[...]
On 16.06.2024 06:17, Keith Thompson wrote:
For the original problem, where the input consists of digits and
whitespace, you could read a character at a time and accumulate the
value of each number. (You probably want to handle leading signs as
well, which isn't difficult.)
Yes. Been there, done that. Sometimes it's good enough to go back
to the roots if higher-level functions are imperfect or quirky.
That is admittedly reinventing the
wheel, but the existing wheels aren't entirely round. You still
have to dynamically allocate the array of ints, assuming you need
to store all of them rather than processing each value as it's read.
A subclass of tasks can certainly process data on the fly but for
the general solution there should be a convenient way to handle it.
I still prefer higher-level languages that take the burden from me.
And is a mapping of every input to the empty set a "function" or not? I
think it is but mathematicians might weigh in on that.
On Sat, 15 Jun 2024 15:36:22 -0400
If you want to preserve your sanity, never use fscanf().
On Sun, 16 Jun 2024 12:03:21 +0200
Janis Papanagnou <[email protected]> wrote:
On 16.06.2024 11:38, Michael S wrote:
Again; which ones?
Janis
[...]
The main misconceptions are about what is returned by fgets() and by realloc().
fgets() returns its first argument in all cases except EOF or FS read
error. That includes the case when the buffer is too short to
accommodate full input line.
With realloc() there are two issues:
1. It can sometimes return its first argument, but does not have to. In a scenario like yours it will return a different pointer quite soon.
2. When realloc() returns NULL, it does not de-allocate its first
argument.
The second case, of course, is not important in practice, because in
practice you're very unlikely to see realloc() returning NULL, and if nevertheless it did happen, your program is unlikely to survive and give a meaningful result anyway. Still, here on c.l.c we like to pretend that
we can meaningfully handle allocation failures.
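Both realloc() points can be covered by one small idiom: assign the result to a temporary, and only overwrite the caller's pointer once the call has succeeded. The sketch below assumes a non-zero `new_size`; the name `resize_buffer` is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: resize *bufp to new_size bytes (new_size > 0).
 * Returns 0 on success. On failure it returns -1 and leaves *bufp
 * pointing at the original, still valid, allocation. */
int resize_buffer(char **bufp, size_t new_size)
{
    char *tmp = realloc(*bufp, new_size);
    if (tmp == NULL)
        return -1;      /* old block is untouched; caller may free it */
    *bufp = tmp;        /* may or may not equal the old pointer */
    return 0;
}
```

Any other pointer into the old block, such as a saved "anchor" or a cursor, has to be recomputed from the new base after a successful call.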
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
Any 'better' way?
while(fscanf(datafile, "%d", &j) != EOF){
N++;
}
you discard a lot of work (done for you by fscanf() to determine the
On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
On 16.06.2024 06:17, Keith Thompson wrote:
For the original problem, where the input consists of digits and
whitespace, you could read a character at a time and accumulate the
value of each number. (You probably want to handle leading signs as
well, which isn't difficult.)
Yes. Been there, done that. Sometimes it's good enough to go back
to the roots if higher-level functions are imperfect or quirky.
That is admittedly reinventing the
wheel, but the existing wheels aren't entirely round. You still
have to dynamically allocate the array of ints, assuming you need
to store all of them rather than processing each value as it's read.
A subclass of tasks can certainly process data on the fly but for
the general solution there should be a convenient way to handle it.
I still prefer higher-level languages that take the burden from me.
nums = []
with open('data.txt','r') as f:
for nbr in f.read().split():
nums.append(int(nbr))
print(*sorted(nums))
DFS <[email protected]> writes:
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79 193 282
173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
There are two issues: (1) you end up with a program that can't be
"piped" to (because the input can't be rewound), and (2) the file might change between counting and reading. How much either matters will
depend on the context. I like piping to programs so (1) would bother
me.
Any 'better' way?
I'd allocate the array on the fly. It's one of those things that, once you've done it, becomes a stock bit of coding. In fact, you can write a simple dynamic array module, and use it again and again.
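One possible shape for such a module, as a minimal sketch. The type and function names (`struct int_array`, `int_array_push`, `int_array_free`) and the starting capacity of 16 are assumptions for illustration, not anything from the post.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of int; a zero-initialized struct is empty. */
struct int_array {
    int   *data;
    size_t len;     /* elements in use */
    size_t cap;     /* elements allocated */
};

/* Append value, doubling the capacity when full. Returns 0 on success,
 * -1 if memory could not be obtained (the array is left unchanged). */
int int_array_push(struct int_array *a, int value)
{
    if (a->len == a->cap) {
        /* Refuse to grow if the doubled size would overflow size_t. */
        if (a->cap > SIZE_MAX / sizeof *a->data / 2)
            return -1;
        size_t new_cap = a->cap ? a->cap * 2 : 16;
        int *tmp = realloc(a->data, new_cap * sizeof *a->data);
        if (tmp == NULL)
            return -1;
        a->data = tmp;
        a->cap = new_cap;
    }
    a->data[a->len++] = value;
    return 0;
}

/* Release the storage and return the array to its empty state. */
void int_array_free(struct int_array *a)
{
    free(a->data);
    a->data = NULL;
    a->len = a->cap = 0;
}
```

A reading loop then becomes `struct int_array a = {0}; int v; while (fscanf(fp, "%d", &v) == 1 && int_array_push(&a, v) == 0) ;` and works on pipes as well as files, since nothing is rewound.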
----------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int N=0, i=0, j=0;
int *nums;
FILE* datafile = fopen(argv[1], "r");
while(fscanf(datafile, "%d", &j) != EOF){
It's always better to loop while fscanf succeeds rather than trying to
handle all the errors. You might not care about the case where this loop
fails, but it's just better to get into the right habit:
while (fscanf(datafile, "%d", &j) == 1) ...
nums = calloc(N, sizeof(int));
The cost is low, but there's no need to use calloc here as you are going
to assign exactly N locations.
rewind(datafile);
while(fscanf(datafile, "%d", &j) != EOF){
nums[i++] = j;
}
As above, though I'd read into &nums[i] directly.
fclose (datafile);
printf("\n");
for(i=0;i<N;i++) {
printf("%d. %d\n", i+1, nums[i]);
}
printf("\n");
free(nums);
return(0);
Because I have acquired the habit, I'd also check for errors,
particularly on argc, fopen and malloc.
}
----------------------------------------------------------
DFS <[email protected]> writes:
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118
245 294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144
245 178 108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195
32 4 54 79 193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55
259 137 297 15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195
7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
Any issues with this method?
Any 'better' way?
Thanks
In a quick test, your code compiles without errors and runs correctly
with your input. I do get a warning about argc being unused, which you should address.
----------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int N=0, i=0, j=0;
The usual convention is to use all-caps for macro names. Calling your variable N is not a real problem, but could be slightly confusing.
N is the number of integers in the input. i is an index. j is a value
read from the file. That's not at all clear from the names.
I suggest using longer and more descriptive names in lower case.
"N" could be "count". "i" is fine for an index, but "j" could be
"value".
Consider using size_t rather than int for the count and index. That's
mostly a style point; it's not going to make any practical difference
unless you have at least INT_MAX elements.
int *nums;
FILE* datafile = fopen(argv[1], "r");
Undefined behavior if no argument was provided, i.e., argc < 2.
while(fscanf(datafile, "%d", &j) != EOF){
Numeric input with the *scanf functions has undefined behavior if the
scanned value is outside the range of the target type. For example, if
the input contains "99999999999999999999999999999999999999999999999999", arbitrary bad things could happen. (Most likely it will just store some incorrect value in j, with no indication that there was an error.)
strtol is trickier to use, but you can detect errors.
fscanf returns EOF on reaching the end of the file or on a read error,
and that's the only condition you check. It returns the number of items scanned. If the input doesn't contain a string that can be interpreted
as an integer, fscanf will return 0, and you'll be stuck in an infinite
loop. `while (fscanf(...) == 1)` is more robust, but it doesn't
distinguish between a read error and bad data. It's up to you how and whether to distinguish among different kinds of errors.
Your sample input consists of decimal integers with no sign. Decide
whether you want to handle "-123" or "+123". (fscanf will do so; so will strtol.)
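A minimal sketch of the `== 1` loop together with one way to tell the outcomes apart afterwards. The enum and the name `count_ints` are hypothetical. The out-of-range problem with `%d` is not addressed here; that needs strtol or similar.

```c
#include <stdio.h>

enum scan_status { SCAN_OK, SCAN_READ_ERROR, SCAN_BAD_DATA };

/* Hypothetical helper: count the integers in fp, looping only while
 * fscanf succeeds, then report why the loop stopped. */
enum scan_status count_ints(FILE *fp, size_t *count)
{
    int value;
    int rc;

    *count = 0;
    while ((rc = fscanf(fp, "%d", &value)) == 1)
        (*count)++;

    if (rc == EOF)
        return ferror(fp) ? SCAN_READ_ERROR : SCAN_OK;
    return SCAN_BAD_DATA;   /* rc == 0: input that is not an integer */
}
```

With the original `!= EOF` test, input such as `1 2 abc 3` would loop forever at `abc`; here the loop stops after two values and reports bad data.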
N++;
}
nums = calloc(N, sizeof(int));
Consider using `sizeof *nums` rather than `sizeof(int)`. That way you
don't have to change the type in two places if the element type changes.
You'll be updating all the elements of the nums array, so there's not
much point in zeroing it. If you use malloc:
nums = malloc(N * sizeof *nums);
Whether you use calloc() or malloc(), you should check the return
value. If it returns a null pointer, it means the allocation failed. Aborting the program is probably a good way to handle it.
(There are complications on Linux-based systems which I won't get into
here. Google "OOM killer" and "overcommit" for details.)
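A minimal sketch of "check and abort", assuming that exiting with a message is acceptable for the program at hand. The wrapper name `xmalloc_array` is hypothetical; it also guards the size multiplication against overflow.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate room for count elements of elem_size
 * bytes each, or print a message and exit if that is not possible. */
void *xmalloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        fprintf(stderr, "allocation size overflow\n");
        exit(EXIT_FAILURE);
    }
    size_t bytes = count * elem_size;
    /* Ask for at least one byte so that NULL always means failure. */
    void *p = malloc(bytes ? bytes : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}
```

The call in the reviewed program would then read `nums = xmalloc_array(N, sizeof *nums);`.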
rewind(datafile);
This can fail if the input file is not seekable. For example, on a Linux-based system you could do something like:
./your_program /dev/stdin < file
Perhaps that's an acceptable restriction, but be aware of it.
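Since rewind() returns nothing, a program that wants to notice this case can use fseek() instead, which does report failure. A minimal sketch, with the hypothetical name `rewind_checked`:

```c
#include <stdio.h>

/* Hypothetical helper: return to the start of fp, reporting failure
 * instead of silently continuing. Returns 0 on success, -1 otherwise. */
int rewind_checked(FILE *fp)
{
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;      /* e.g. fp refers to a pipe or a terminal */
    clearerr(fp);       /* rewind() also clears the error indicator */
    return 0;
}
```

On failure the two-pass approach cannot continue, so the caller would report the problem and stop rather than read a second time.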
while(fscanf(datafile, "%d", &j) != EOF){
Again, UB for out of range values.
It's not guaranteed that you'll get the same data the second time you
read the file; some other process could modify it. This might not be
worth worrying about.
nums[i++] = j;
}
fclose (datafile);
printf("\n");
You haven't produced any output yet; why print a blank line? (Of course
you can if you want to.)
for(i=0;i<N;i++) {
printf("%d. %d\n", i+1, nums[i]);
}
printf("\n");
free(nums);
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses makes it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
}
A method that doesn't require rescanning the input file is to initially allocate some reasonable amount of memory, then use realloc() to
expand the array as needed. Doubling the array size is probably
reasonable. It will consume more memory than a single allocation.
On 16/06/2024 17:09, DFS wrote:
On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
On 16.06.2024 06:17, Keith Thompson wrote:
For the original problem, where the input consists of digits and
whitespace, you could read a character at a time and accumulate the
value of each number. (You probably want to handle leading signs as
well, which isn't difficult.)
Yes. Been there, done that. Sometimes it's good enough to go back
to the roots if higher-level functions are imperfect or quirky.
That is admittedly reinventing the
wheel, but the existing wheels aren't entirely round. You still
have to dynamically allocate the array of ints, assuming you need
to store all of them rather than processing each value as it's read.
A subclass of tasks can certainly process data on the fly but for
the general solution there should be a convenient way to handle it.
I still prefer higher-level languages that take the burden from me.
nums = []
with open('data.txt','r') as f:
for nbr in f.read().split():
nums.append(int(nbr))
print(*sorted(nums))
nums = sorted(map(int, open('data.txt', 'r').read().split()))
On 16.06.2024 13:31, Michael S wrote:
On Sun, 16 Jun 2024 12:03:21 +0200
Janis Papanagnou <[email protected]> wrote:
On 16.06.2024 11:38, Michael S wrote:
Again; which ones?
Janis
[...]
The main misconceptions are about what is returned by fgets() and by realloc().
fgets() returns its first argument in all cases except EOF or FS
read error. That includes the case when the buffer is too short to accommodate full input line.
fgets() returns s on success, and NULL on error or when end
of file occurs while no characters have been read.
I am interested in the success case, thus "fgets() != NULL".
The buffer size is controlled by the second function parameter.
With realloc() there are two issues:
1. It can sometimes return its first argument, but does not have
to. In a scenario like yours it will return a different pointer quite
soon.
2. When realloc() returns NULL, it does not de-allocate its
first argument.
The second case, of course, is not important in practice, because in practice you're very unlikely to see realloc() returning NULL, and
if nevertheless it did happen, your program is unlikely to survive
and give a meaningful result anyway. Still, here on c.l.c we like to
pretend that we can meaningfully handle allocation failures.
You may provide "correct" code (if you think mine is wrong). Or just
inspect how it behaves; here's the output after two printf's added:
$ printf "123 456 789 101112 77 88 99 101 999" | realloc
123 456 78<+++123 456 789 101112 77 88 99 101 999
123 456 78<===
9 101112 7<+++
123 456 789 101112 7<===
7 88 99 10<+++
123 456 789 101112 77 88 99 10<===
1 999<+++
123 456 789 101112 77 88 99 101 999<===
The +++data+++ is the chunk read, and the ===data=== is the overall
buffer content, and the final line again the result (as before, with
the added newline as documented for puts()).
Even though my code is just an "ad hoc test code" to demonstrate the procedure I outlined - and as such test code certainly lacking quite
some error handling and much more - it does exactly what I intended it
to do, and what I've implemented according to what I read in the man
pages. I cannot see any "misconception", it does what was _intended_.
There's indeed one point that I _deliberately_ ignored for the test
code; actually the point you mentioned as "not important in practice".
Again: You may provide "correct" code (if you think mine is "wrong"),
or "better" code, usable for production instead of a test code.
But your tone and statements were (as observed so often) inadequate;
I quote from your post that started the subthread:
"However it looks unlikely that it does what you meant for it to do."
It does exactly what I meant to do (as you can see in the logs above).
Janis
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int N=0;
int *nums = malloc(2 * sizeof(int));
FILE* datafile = fopen(argv[1], "r");
while(fscanf(datafile, "%d", &nums[N++]) == 1){
nums = realloc(nums, (N+1) * sizeof(int));
}
fclose (datafile);
N--;
On 6/15/2024 6:22 PM, Keith Thompson wrote:...
DFS <[email protected]> writes:
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses makes it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
Can't omit it. It's required by my brain.
Janis Papanagnou <[email protected]> writes:
[...] K&R at
least seems to say that 'void' can only be declared for the
return type of functions that do not return anything.
[...]
No version of C has ever permitted "void main" except when an
implementation documents and permits it. [...]
Janis Papanagnou <[email protected]> writes:
[...]
#include <stdlib.h>
#include <stdio.h>
void main (int argc, char * argv[])
{
int chunk = 10;
int bufsize = chunk+1;
char * buf = malloc(bufsize);
char * anchor = buf;
while (fgets(buf, chunk+1, stdin) != NULL)
if (realloc(anchor, bufsize += chunk) != NULL)
buf += chunk;
puts(anchor);
}
realloc() can return the pointer you pass to it if there's enough room
in the existing location. (Or it can relocate the buffer even if there
is enough room.)
But if realloc() moves the buffer (copying the existing data to it), it returns a pointer to the new location and invalidates the old one. You discard the new pointer, only comparing it to NULL.
Perhaps you assumed that realloc() always expands the buffer in place.
It doesn't.
If the above program worked for you, I suspect that either realloc()
never relocated the buffer, or you continued using the original buffer
(and beyond) after realloc() invalidated it. [...]
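For comparison, a minimal sketch of a chunked line reader that keeps the pointer realloc() returns and recomputes the write position from it on every pass. The name `read_line` and the deliberately small chunk size are assumptions for illustration; it reads a single line, which may or may not be what the test program was meant to do.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp into a
 * malloc'ed buffer, growing it chunk by chunk. Returns NULL at end of
 * input or on failure; the caller frees the result. */
char *read_line(FILE *fp)
{
    enum { CHUNK = 10 };        /* small on purpose, to force growth */
    size_t size = 0, len = 0;
    char *buf = NULL;

    for (;;) {
        /* Keep the pointer realloc() returns: the block may move. */
        char *tmp = realloc(buf, size + CHUNK);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        size += CHUNK;

        /* Append at the current end of the string in the new block. */
        if (fgets(buf + len, (int)(size - len), fp) == NULL) {
            if (len == 0 || ferror(fp)) {
                free(buf);
                return NULL;
            }
            return buf;         /* last line had no trailing newline */
        }
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;         /* complete line */
    }
}
```

The write position is stored as an offset (`len`) rather than as a pointer, so it stays valid no matter where realloc() places the buffer.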
On 6/16/24 12:20, DFS wrote:
On 6/15/2024 6:22 PM, Keith Thompson wrote:
DFS <[email protected]> writes:
...
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses makes it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
Can't omit it. It's required by my brain.
The parentheses you're putting in are completely unrelated to the use of parentheses in _Generic(), function calls, compound literals,
sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(), typeof_unqual(), alignas(), function declarators, static_assert(), if(), switch(), for(), while(), do ... while(), function-like macro definitions
and invocations or cast expressions. In all of those cases, the
parentheses are part of the grammar. [...]
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
The worst consequence of undefined behavior is having your code
appear to "work".
On 16.06.2024 22:32, Keith Thompson wrote:
Janis Papanagnou <[email protected]> writes:
[...] K&R at
least seems to say that 'void' can only be declared for the
return type of functions that do not return anything.
[...]
No version of C has ever permitted "void main" except when an
implementation documents and permits it. [...]
I cannot comment on main() being handled differently than
other C functions. I was just quoting my old copy of K&R.
I don't understand what you mean with "no version of C has
ever permitted", given that my C compiler doesn't complain.
On 6/17/24 01:41, Janis Papanagnou wrote:
On 16.06.2024 22:32, Keith Thompson wrote:
Janis Papanagnou <[email protected]> writes:
[...] K&R at
least seems to say that 'void' can only be declared for the
return type of functions that do not return anything.
[...]
No version of C has ever permitted "void main" except when an
implementation documents and permits it. [...]
I cannot comment on main() being handled differently than
other C functions. I was just quoting my old copy of K&R.
It is handled differently. Your own functions can be declared in a wide variety of ways, so long as the declaration that is relevant to function designator in a function call is compatible with the definition of the function that it designates.
C standard library functions can only be declared in ways compatible
with the specifications in the C standard.
main(), on the other hand, is unique, in that you have two incompatible choices of how to define it, and an implementation can designate
additional choices. You can define main() in any way compatible with one
of the options supported by your implementation; but portable code
should define it only in one of the two ways specified by the C standard.
K&R is long obsolete; up-to-date drafts of the standard that are almost identical to the latest version of the standard are free and easily available.
I don't understand what you mean with "no version of C has
ever permitted", given that my C compiler doesn't complain.
He wrote "No version of C has ever permitted "void main" except when an implementation documents and permits it." Note that he is talking about versions of the standard, not versions of any particular implementation
of C. If your C compiler "documents and permits" "void main", then it certainly shouldn't complain about it. However, since the C standard
does not mandate support for void main, you've no guarantee of
portability of code that uses void main to other implementations of C.
Janis Papanagnou <[email protected]> writes:
[...]
Elsethread I suggested to merge the malloc() with the realloc() call.
The resulting code would be simpler (and might address that problem).
int chunk = 10;
int bufsize = 1;
char * anchor = NULL;
while ((anchor = realloc (anchor, bufsize += chunk)) != NULL &&
fgets (anchor+bufsize-chunk-1, chunk+1, stdin) != NULL)
;
puts (anchor);
Do you see the exposed problem (or any other issues) here, too?
If stdin is empty, you never store anything in the buffer and
puts(anchor) has undefined behavior because there might not be a
terminating '\0'. If the first realloc() fails, anchor is a null pointer
and again puts(anchor) has undefined behavior.
If nothing goes wrong, puts() adds an extra newline to the output.
That's all that jumped out at me looking at the code, but did you test
it with multi-line input? When I tried it it printed only the first
line of input (followed by that extra newline).
I'm still not entirely sure what the code is supposed to do.
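For illustration only, here is a minimal sketch of one way to collect all of a text stream into a single growing buffer while avoiding the problems listed above: the buffer is a valid empty string before anything is read, and the result of realloc() is checked before the old pointer is overwritten. The name read_all and the 64-byte growth step are assumptions made for this sketch, not something from the code under discussion, and it assumes text input without embedded '\0' bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read all of fp into one malloc()'ed string.
   Returns NULL on allocation failure and "" for empty input.
   The caller frees the result. */
static char *read_all(FILE *fp)
{
    assert(fp != NULL);
    size_t chunk = 64;          /* growth step, chosen arbitrarily */
    size_t size = chunk;        /* current capacity of buf */
    size_t used = 0;            /* bytes stored, not counting the '\0' */
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;
    buf[0] = '\0';              /* valid string even if nothing is read */

    /* size - used is always at least 2 here, so fgets() can make progress */
    while (fgets(buf + used, (int)(size - used), fp) != NULL) {
        used += strlen(buf + used);
        if (size - used < 2) {  /* no room for one more char plus '\0' */
            char *tmp = realloc(buf, size + chunk);
            if (tmp == NULL) {  /* old block is still valid: release it */
                free(buf);
                return NULL;
            }
            buf = tmp;
            size += chunk;
        }
    }
    buf[used] = '\0';           /* re-terminate in case of a read error */
    return buf;
}
```

A caller that wants to print the result without an extra newline can use fputs(s, stdout) rather than puts(s).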
Janis Papanagnou <[email protected]> writes:
On 16.06.2024 22:32, Keith Thompson wrote:
Janis Papanagnou <[email protected]> writes:
[...] K&R at
least seems to say that 'void' can only be declared for the
return type of functions that do not return anything.
[...]
No version of C has ever permitted "void main" except when an
implementation documents and permits it. [...]
I cannot comment on main() being handled differently than
other C functions. I was just quoting my old copy of K&R.
First or second edition?
But main() *is* handled differently than other functions, and
that's important to understand. It's effectively called by the
environment, which means that your definition has to cooperate
with what the environment expects. What's slightly weird about
it is that it can be defined in (at least) two different ways,
with or without argc and argv.
[...]
If I want a defined exit status (which is what I usually
want) I specify 'int main (...)' and provide an explicit
return statement (or exit() call).
Why would you ever not want a defined exit status, given that it's
easier to have one than not to have one? (Since C99 an explicit
return or exit() is optional.) I can't think of any reason *at all*
to use "void main" in C with a hosted implementation. Can you?
(If you don't care about the exit status, you can just write
"int main" and not bother with a return statement or exit() call.
The exit status will be 0, but that's not a problem if you don't
care about it.)
On 16/06/2024 16:56, David Brown wrote:
On 16/06/2024 17:09, DFS wrote:
On 6/16/2024 12:44 AM, Janis Papanagnou wrote:
On 16.06.2024 06:17, Keith Thompson wrote:
For the original problem, where the input consists of digits and
whitespace, you could read a character at a time and accumulate the
value of each number. (You probably want to handle leading signs as
well, which isn't difficult.)
Yes. Been there, done that. Sometimes it's good enough to go back
to the roots if higher-level functions are imperfect or quirky.
That is admittedly reinventing the
wheel, but the existing wheels aren't entirely round. You still
have to dynamically allocate the array of ints, assuming you need
to store all of them rather than processing each value as it's read.
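As a rough sketch of the character-at-a-time idea described above (not code from any poster), the function below skips whitespace, accepts an optional sign, and accumulates digits, rejecting values that do not fit in an int. The name next_int and its return convention are assumptions made for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: read the next decimal integer from fp one
   character at a time. Returns 1 and stores the value on success,
   0 at end of input, -1 on a malformed token or overflow. */
static int next_int(FILE *fp, int *out)
{
    assert(fp != NULL && out != NULL);
    int c;
    while ((c = getc(fp)) != EOF && isspace(c))
        ;                               /* skip leading whitespace */
    if (c == EOF)
        return 0;

    int negative = 0;
    if (c == '+' || c == '-') {         /* optional leading sign */
        negative = (c == '-');
        c = getc(fp);
    }
    if (c == EOF || !isdigit(c))
        return -1;                      /* a sign with no digits, or junk */

    int value = 0;                      /* accumulated as a non-positive number */
    do {
        int digit = c - '0';
        if (value < (INT_MIN + digit) / 10)
            return -1;                  /* value * 10 - digit would overflow */
        value = value * 10 - digit;
        c = getc(fp);
    } while (c != EOF && isdigit(c));

    if (c != EOF)
        ungetc(c, fp);                  /* leave the delimiter unread */
    if (!negative) {
        if (value < -INT_MAX)
            return -1;                  /* magnitude fits only as a negative */
        value = -value;
    }
    *out = value;
    return 1;
}
```

Accumulating the value as a negative number is a design choice that lets INT_MIN be parsed without a special case; the caller still has to store the results in a dynamically grown array if all values are needed at once.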
A subclass of tasks can certainly process data on the fly but for
the general solution there should be a convenient way to handle it.
I still prefer higher-level languages that take the burden from me.
nums = []
with open('data.txt','r') as f:
for nbr in f.read().split():
nums.append(int(nbr))
print(*sorted(nums))
nums = sorted(map(int, open('data.txt', 'r').read().split()))
OK, a bit of a challenge for my scripting language. I managed this first:
nums := sort(mapv(toval, splitstring(readstrfile("data.txt"))))
It needed a change to 'splitstring' to allow a default separator
consisting of white space of any length. And a one-line helper function 'toval' since the usual candidates, special built-ins, were not valid
for 'mapv'.
It also works like this:
nums := readstrfile("data.txt") -> splitstring -> mapv(toval) -> sort
But only by chance since the 'piped' argument is the last one of multi-parameter functions, rather than the first.
On 6/16/24 12:20, DFS wrote:
On 6/15/2024 6:22 PM, Keith Thompson wrote:...
DFS <[email protected]> writes:
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses make it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
Can't omit it. It's required by my brain.
The parentheses you're putting in are completely unrelated to the use of
parentheses in _Generic(), function calls, compound literals,
sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(),
typeof_unqual(), alignas(), function declarators, static_assert(), if(),
switch(), for(), while(), do ... while(), function-like macro definitions
and invocations or cast expressions. In all of those cases, the
parentheses are part of the grammar.
The parentheses that you put in return(0) serve only for grouping
purpose. They are semantically equivalent to the parentheses in "i =
(0);"; they are just as legal, and just as pointless.
If your brain doesn't immediately understand why what I said above is
true, I recommend retraining it.
DFS <[email protected]> writes:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int N=0;
int *nums = malloc(2 * sizeof(int));
FILE* datafile = fopen(argv[1], "r");
while(fscanf(datafile, "%d", &nums[N++]) == 1){
nums = realloc(nums, (N+1) * sizeof(int));
}
fclose (datafile);
N--;
This N-- is a bit "tricksy". Better to increment in the realloc (or the while body) so it only happens when an int has been read.
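One possible shape for that suggestion, offered as a sketch rather than as the poster's intended fix: read into a temporary int, and bump the count only after fscanf() has reported success, so no trailing N-- is needed. The name read_ints, the doubling growth policy, and the starting capacity of 16 are all assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read whitespace-separated ints from fp into a
   growing array. Returns the array (caller frees) and stores the
   element count in *count. Returns NULL with *count == 0 if no
   values were read or if an allocation failed. */
static int *read_ints(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);
    int *nums = NULL;
    size_t n = 0, cap = 0;
    int value;

    *count = 0;
    while (fscanf(fp, "%d", &value) == 1) {
        if (n == cap) {                         /* array is full: grow it */
            size_t new_cap = cap ? cap * 2 : 16;
            int *tmp = realloc(nums, new_cap * sizeof *tmp);
            if (tmp == NULL) {                  /* nums is still valid here */
                free(nums);
                return NULL;
            }
            nums = tmp;
            cap = new_cap;
        }
        nums[n++] = value;      /* n grows only after a successful read */
    }
    *count = n;
    return nums;
}
```

Comparing the fscanf() result with 1 rather than with EOF also stops the loop on a non-numeric token instead of spinning on it.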
On 17.06.2024 08:20, Keith Thompson wrote:...
Janis Papanagnou <[email protected]> writes:
and
that's important to understand. It's effectively called by the
environment, which means that your definition has to cooperate
with what the environment expects.
I'm not sure whether my K&R copy addresses that at all. A quick
view and I see only one instance where "main()" is mentioned at
the beginning: main() { printf("hello, world\n"); }
No types here, and no environment aspects mentioned.
If I want a defined exit status (which is what I usually
want) I specify 'int main (...)' and provide an explicit
return statement (or exit() call).
Why would you ever not want a defined exit status, given that it's
easier to have one than not to have one?
Aren't we agreeing here? (The only difference is that you are
formulating in a negated form where I positively said the same.)
Well, to indicate that there's no status information or that
it's irrelevant. E.g. as was the case in the test fragment I
posted.
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
I think I wouldn't code a missile control system in "C". ;-)
Janis
On 16/06/2024 17:09, DFS wrote:
nums = []
with open('data.txt','r') as f:
for nbr in f.read().split():
nums.append(int(nbr))
print(*sorted(nums))
nums = sorted(map(int, open('data.txt', 'r').read().split()))
I think DFS might mean that they find themselves
unable to omit the unnecessary return 0 statement entirely.
On 6/16/2024 10:41 PM, James Kuyper wrote:
On 6/16/24 12:20, DFS wrote:
On 6/15/2024 6:22 PM, Keith Thompson wrote:...
DFS <[email protected]> writes:
return(0);
A minor style point: a return statement doesn't require parentheses.
IMHO using parentheses make it look too much like a function call. I'd
write `return 0;`, or more likely I'd just omit it, since falling off
the end of main does an implicit `return 0;` (starting in C99).
Can't omit it. It's required by my brain.
The parentheses you're putting in are completely unrelated to the use of
parentheses in _Generic(), function calls, compound literals,
sizeof(type name), alignof(), _BitInt(), _Atomic(), typeof(),
typeof_unqual(), alignas(), function declarators, static_assert(), if(),
switch(), for(), while(), do ... while(), function-like macro definitions
and invocations or cast expressions. In all of those cases, the
parentheses are part of the grammar.
The parentheses that you put in return(0) serve only for grouping
purpose. They are semantically equivalent to the parentheses in "i =
(0);"; they are just as legal, and just as pointless.
If your brain doesn't immediately understand why what I said above is
true, I recommend retraining it.
I meant omit a return altogether.
But looking around, I rarely see return(0). Don't know why it became a
thing for me.
Moving forward, return 0 it is.
On 6/17/2024 3:39 AM, Kaz Kylheku wrote:
I think DFS might mean that they find themselves
he finds himself
unable to omit the unnecessary return 0 statement entirely.
yes
On 17/06/2024 14:50, DFS wrote:
On 6/17/2024 3:39 AM, Kaz Kylheku wrote:
I think DFS might mean that they find themselves
he finds himself
unable to omit the unnecessary return 0 statement entirely.
yes
If a function is defined to return an int, then you should return one.
Anything else is just lazy/sloppy. Just because main allows it as a
special case doesn't mean it's a good idea.
I mean: it's really not much extra to type.
On 6/17/2024 6:45 AM, DFS wrote:
On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
I think I wouldn't code a missile control system in "C". ;-)
Janis
Per "Google AI Overview": "In 1987, the Department of Defense mandated
that Ada be the standard programming language for Defense computer
resources used in military command and control systems."
Check this out:
JOINT STRIKE FIGHTER
AIR VEHICLE
C++ CODING STANDARDS
https://www.stroustrup.com/JSF-AV-rules.pdf
;^)
On 6/17/2024 4:16 PM, Chris M. Thomasson wrote:
On 6/17/2024 6:45 AM, DFS wrote:
On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
I think I wouldn't code a missile control system in "C". ;-)
Janis
Per "Google AI Overview": "In 1987, the Department of Defense mandated
that Ada be the standard programming language for Defense computer
resources used in military command and control systems."
Check this out:
JOINT STRIKE FIGHTER
AIR VEHICLE
C++ CODING STANDARDS
https://www.stroustrup.com/JSF-AV-rules.pdf
;^)
Scary.
On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
I think I wouldn't code a missile control system in "C". ;-)
Per "Google AI Overview": "In 1987, the Department of Defense mandated
that Ada be the standard programming language for Defense computer
resources used in military command and control systems."
On 6/17/24 03:16, Janis Papanagnou wrote:
On 17.06.2024 08:20, Keith Thompson wrote:
Janis Papanagnou <[email protected]> writes:
If I want a defined exit status (which is what I usually
want) I specify 'int main (...)' and provide an explicit
return statement (or exit() call).
Why would you ever not want a defined exit status, given that it's
easier to have one than not to have one?
Aren't we agreeing here? (The only difference is that you are
formulating in a negated form where I positively said the same.)
You implied, by saying "If I want a defined exit status", that there are
occasions where you don't want a defined exit status - and he's
questioning that. Things that are undefined are seldom useful. If the
exit status is undefined, it might be a failure status. In many
contexts, that would cause no problems, but there's also places where it
would.
...
Well, to indicate that there's no status information or that
it's irrelevant. E.g. as was the case in the test fragment I
posted.
That's the problem - your "indication that there's no status
information" doesn't achieve the desired effect. Instead, it results in
an unspecified status being returned to the system. It might be a
successful status, or an unsuccessful status. On the systems I use,
scripts that execute programs will often abort if the program returns an
unsuccessful status code. If there's nothing that needs to be brought to
the system's attention, use "return 0;", not "void main()".
DFS <[email protected]> writes:
Moving forward, return 0 it is.
By the way, you might have retained return (exp); from old C. C
originally required the parentheses, but they got dropped quite early
on. The syntax in K&R (1st edition) does not require them, but almost
all the code examples in the book still have them!
I took a while to drop them as I came to C from B where they were always
required so I'd got the habit.
Janis Papanagnou <[email protected]> writes:
You've mentioned several things you have no idea about.
Are you interested in learning?
[...]
[...]
Whatever current C standards - and I'm not sure what ancient
'cc' is on my system and to what standard it complies - say,
Perhaps you should find out what your ancient "cc" does. What OS
are you on? Does "cc --version", "cc -V", or "man cc" give you
any meaningful information?
[...]
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
I think I wouldn't code a missile control system in "C". ;-)
Speaking of while, the do/while construct does not require parentheses
in order to disambiguate anything, since it has a mandatory semicolon.
Yet, it still has them.
On 17.06.2024 15:38, James Kuyper wrote:...
You implied, by saying "If I want a defined exit status", that there are
occasions where you don't want a defined exit status
...where I don't _need_ one. Yes.
E.g. in code like main() { printf("hello, world\n"); }
- and he's
questioning that. Things that are undefined are seldom useful.
I disagree. If things are undefined it _may_ just not matter.
If things matter they should not (ideally never) be undefined.
If the
exit status is undefined, it might be a failure status. In many
contexts, that would cause no problems, but there's also places where it
would.
Exactly.
(BTW, the 'blind-spot' I mentioned is that we often forget that we /can/
use temporary files to store intermediary results. Sometimes we can manipulate a temporary file easier than we can manipulate malloc()ed (or other) storage. )
DFS <[email protected]> writes:
On 6/17/2024 1:52 AM, Janis Papanagnou wrote:
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
I think I wouldn't code a missile control system in "C". ;-)
Janis
Per "Google AI Overview": "In 1987, the Department of Defense mandated
that Ada be the standard programming language for Defense computer
resources used in military command and control systems."
Please don't post AI-based misinformation.
The DOD Ada mandate was introduced in 1991, and effectively dropped in 1997.
On 17.06.2024 07:40, Tim Rentsch wrote:
Keith Thompson <[email protected]> writes:
The worst consequence of undefined behavior is having your code
appear to "work".
Personally I think causing a missile launch that started a
worldwide thermonuclear war would be a worse consequence.
YMMV.
The standard doesn't say anything to prohibit such a consequence, but in
real life such an outcome is possible only if your program is executing
in an environment that allows it to send out launch messages to real missiles. In such a context, a program that was intended to launch a
missile strike, and seemed to do so, but actually failed to do so, would arguably be worse. If the enemy knows that you are running such
defective software, that enemy might not be deterred from attacking.
Tim Rentsch <[email protected]> writes:
Kaz Kylheku <[email protected]> writes:
Speaking of while, the do/while construct does not require parentheses
in order to disambiguate anything, since it has a mandatory semicolon.
Yet, it still has them.
It has them to allow an extension for a "loop-and-a-half" control
structure:
do statement while ( expression ) statement
and so for example
do c = getchar(); while( c != EOF ) n++;
to count characters on standard input.
Oh? Do you have any evidence that that was the intent? [...]
Tim Rentsch <[email protected]> writes:
Keith Thompson <[email protected]> writes:
Tim Rentsch <[email protected]> writes:
Kaz Kylheku <[email protected]> writes:
Speaking of while, the do/while construct does not require parentheses
in order to disambiguate anything, since it has a mandatory semicolon.
Yet, it still has them.
It has them to allow an extension for a "loop-and-a-half" control
structure:
do statement while ( expression ) statement
and so for example
do c = getchar(); while( c != EOF ) n++;
to count characters on standard input.
Oh? Do you have any evidence that that was the intent? [...]
I think you're reading something into my remark that it
didn't say.
Or at least that you didn't mean.
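For reference, the "do statement while ( expression ) statement" form quoted above is a hypothetical extension, not standard C. A minimal sketch of the same loop-and-a-half shape in standard C, using for (;;) with a break in the middle, might look like this; the function name count_chars is made up for the example, and it takes a FILE * instead of reading standard input directly so that it is easy to exercise.

```c
#include <assert.h>
#include <stdio.h>

/* Count the characters remaining on fp. The read happens at the top
   of each iteration, the exit test sits in the middle, and the
   increment follows it: the "loop-and-a-half" shape. */
static long count_chars(FILE *fp)
{
    assert(fp != NULL);
    long n = 0;
    for (;;) {
        int c = getc(fp);       /* first half: fetch the next character */
        if (c == EOF)
            break;              /* exit test in the middle of the loop */
        n++;                    /* second half: act on the character */
    }
    return n;
}
```

The more common idiom, while ((c = getc(fp)) != EOF) n++;, folds the first half into the loop condition and behaves the same way.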
Keith Thompson <[email protected]> writes:
[...]
That's fine. A return statement or exit() call is unnecessary
in main() due to a special-case rule that was added in 1999 for
compatibility with C++. I don't particularly like that rule myself.
I choose to omit the return statement in small programs, but if
you want to add the "return 0;", I have absolutely no objection.
(I used to do that myself.) It even makes your code more portable
to old compilers that support C90. (tcc claims to support C99,
but it has a bug in this area.)
A minor point: The latest unreleased version of tcc appears to fix this
bug. In tcc 0.9.27, falling off the end of main (defined as "int main(void)") returns some random status. In the latest version, it
returns 0, based on a quick experiment and a cursory examination of the generated object code. (tcc doesn't have an option to generate an
assembly listing; I used "tcc -c" followed by "objdump -d".)
On 2024-06-17, Richard Harnden <[email protected]d> wrote:
If a function is defined to return an int, then you should return one.
Anything else is just lazy/sloppy. Just because main allows it as a
special case doesn't mean it's a good idea.
I mean: it's really not much extra to type.
The misfeature of missing return being success was, I believe, not
intended to make programs shorter. It was intended to correct the
random termination statuses of countless numbers of programs in a single
stroke.
Deliberately relying on this in a new program is like relying on a
diaper. If you're of intermediate age, you don't do this.
On Sat, 15 Jun 2024 15:36:22 -0400, DFS wrote:[snip]
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
You /could/ create a temporary, binary, file, and write the fscanf()'ed
values to it as part of the first loop. Once the first loop completes,
you rewind this temporary file, and load your integer array by reading
the (now converted to native integer format) values from that file.
Still two passes, but using fscanf() in only one of those passes.
On Sun, 16 Jun 2024 15:52:16 +0000, Lew Pitcher wrote:
On Sat, 15 Jun 2024 15:36:22 -0400, DFS wrote:[snip]
I want to read numbers in from a file, say:
47 185 99 74 202 118 78 203 264 207 19 17 34 167 148 54 297 271 118 245
294 188 140 134 251 188 236 160 48 189 228 94 74 27 168 275 144 245 178
108 152 197 125 185 63 272 239 60 242 56 4 235 244 144 69 195 32 4 54 79
193 282 173 267 8 40 241 152 285 119 259 136 15 83 21 78 55 259 137 297
15 141 232 259 285 300 153 16 4 207 95 197 188 267 164 195 7 104 47 291
This code:
1 opens the file
2 fscanf thru the file to count the number of data points
3 allocate memory
4 rewind and fscanf again to add the data to the int array
You /could/ create a temporary, binary, file, and write the fscanf()'ed
values to it as part of the first loop. Once the first loop completes,
you rewind this temporary file, and load your integer array by reading
the (now converted to native integer format) values from that file.
Still two passes, but using fscanf() in only one of those passes.
For what it's worth, here's an example of what I suggest:
/*
The following code provides two examples of the approach I suggested.
Example 1: while counting input numbers, write temp file with int values
malloc() a buffer big enough for that count of int values
fread() the temp file into the malloc()'ed buffer
Note: conformant to ISO Standard C.
Example 2: while counting input numbers, write temp file with int values
mmap() the temp file, starting at the beginning, and sized to
include all the int values in the file.
Note: conformant to POSIX C extensions to ISO Standard C.
Note: compile with -DUSE_MMAP to obtain mmap() variant, otherwise
this will compile the malloc()/fread() variant
*/
#include <stdio.h>
#include <stdlib.h>
#ifdef USE_MMAP
#include <sys/mman.h>
#define BANNER "Example of array loading using mmap()"
#define FREEALLOC(x)
#else
#define BANNER "Example of array loading using malloc() and fread()"
#define FREEALLOC(x) free((x))
#endif
static int *LoadIntArray(FILE *fp, size_t *Count);
int main(void)
{
  int status = EXIT_FAILURE, *array;
  size_t count;
  puts(BANNER);
  if ((array = LoadIntArray(stdin,&count)))
  {
    printf("%zu elements loaded\n",count);
    for (size_t index = 0; index < count; ++index)
      printf("array[%3zu] == %d\n",index,array[index]);
    FREEALLOC(array); /* if necessary, free() the malloc()'ed array */
    status = EXIT_SUCCESS;
  }
  return status;
}
static int *LoadIntArray(FILE *fp,size_t *Count)
{
  FILE *tmp;
  int *array = NULL;
  size_t count = 0;
  if ((tmp = tmpfile()))
  {
    int buffer;
    for (count = 0; fscanf(fp,"%d",&buffer) == 1; ++count)
      fwrite(&buffer,sizeof buffer, 1,tmp);
    if (count)
    {
#ifdef USE_MMAP
      /*
      ** USE mmap() to map temp_file data into process memory
      */
      array = mmap(NULL,
                   count * sizeof *array,
                   PROT_READ,MAP_PRIVATE,
                   fileno(tmp),
                   0);
      if (array == MAP_FAILED)
      {
        array = NULL;
        fprintf(stderr,"FAIL: Cannot mmap %zu element array\n",count);
      }
#else
      /*
      ** USE malloc() to reserve a big enough heap-space buffer,
      ** then fread() the temp_file data into that buffer
      */
      if ((array = malloc(count * sizeof *array)))
      {
        rewind(tmp);
        if (fread(array,sizeof *array,count,tmp) != count)
        {
          free(array);
          array = NULL;
          fprintf(stderr,"FAIL: Cannot load %zu element array\n",count);
        }
      }
      else fprintf(stderr,"FAIL: Cant malloc() %zu element array\n",count);
#endif
    }
    fclose(tmp);
  }
  else fprintf(stderr,"FAIL: Cannot allocate temporary work file\n");
  *Count = count; /* byproduct value that caller might find useful */
  return array; /* either NULL (on failure) or pointer to array */
}
On 6/25/2024 1:37 PM, Lew Pitcher wrote:[snip]
$ gcc -Wall LewPitcher_readnums.c -o lprn
$ (no compile errors)
$ ./lprn nums.txt
Example of array loading using malloc() and fread()