How shall we write a better build system?
Current systems use scripts with ad hoc, weird syntax. They don't work consistently across platforms, they break and are hard to fix, and are complicated to set up. You even get one scripting system as a front
end to another, a sort of admission of failure.
But can we do better?
A scripting language provides loops, conditional branches, and
subroutine calls. And we've already got a perfectly good language for describing that. It is C itself.
So here's the proposal. We implement a C interpreter.
bart <[email protected]> writes:
On 04/02/2024 15:39, Malcolm McLean wrote:
How shall we write a better build system?
Current systems use scripts with ad hoc, weird syntax. They don't work
consistently across platforms, they break and are hard to fix, and are
complicated to set up. You even get one scripting system as a front
end to another, a sort of admission of failure.
But can we do better?
A scripting language provides loops, conditional branches, and
subroutine calls. And we've already got a perfectly good language for
describing that. It is C itself.
So here's the proposal. We implement a C interpreter.
This was tried. It's called the c-shell.
bart <[email protected]> writes:
On 04/02/2024 15:39, Malcolm McLean wrote:
How shall we write a better build system?
Current systems use scripts with ad hoc, weird syntax. They don't work
consistently across platforms, they break and are hard to fix, and are
complicated to set up. You even get one scripting system as a front
end to another, a sort of admission of failure.
But can we do better?
A scripting language provides loops, conditional branches, and
subroutine calls. And we've already got a perfectly good language for
describing that. It is C itself.
So here's the proposal. We implement a C interpreter.
This was tried. It's called the c-shell.
[...]
csh isn't a C interpreter by any stretch of the imagination!
On 04/02/2024 23:32, Ben Bacarisse wrote:
[email protected] (Scott Lurndal) writes:
bart <[email protected]> writes:
On 04/02/2024 15:39, Malcolm McLean wrote:
How shall we write a better build system?
Current systems use scripts with ad hoc, weird syntax. They don't work
consistently across platforms, they break and are hard to fix, and are
complicated to set up. You even get one scripting system as a front
end to another, a sort of admission of failure.
But can we do better?
A scripting language provides loops, conditional branches, and
subroutine calls. And we've already got a perfectly good language for
describing that. It is C itself.
So here's the proposal. We implement a C interpreter.
This was tried. It's called the c-shell.
csh isn't a C interpreter by any stretch of the imagination!
No. But the concept of a C interpreter isn't new.
There's one called pico C which is open source and available to be
incorporated. And a scripting language is an obvious application for an
interpreter.
On 05/02/2024 14:42, Ben Bacarisse wrote:
Malcolm McLean <[email protected]> writes:
On 04/02/2024 23:32, Ben Bacarisse wrote:
[email protected] (Scott Lurndal) writes:
bart <[email protected]> writes:
On 04/02/2024 15:39, Malcolm McLean wrote:
How shall we write a better build system?
Current systems use scripts with ad hoc, weird syntax. They don't work
consistently across platforms, they break and are hard to fix, and are
complicated to set up. You even get one scripting system as a front
end to another, a sort of admission of failure.
But can we do better?
A scripting language provides loops, conditional branches, and
subroutine calls. And we've already got a perfectly good language for
describing that. It is C itself.
So here's the proposal. We implement a C interpreter.
This was tried. It's called the c-shell.
csh isn't a C interpreter by any stretch of the imagination!
No. But the concept of a C interpreter isn't new.
Indeed. But C is not a good fit for interpretation, so there are often
compromises.
It's less of a problem now. The interpreter can just spin up a virtual
machine and then create a memory space. That was less of a practical
approach when machines were more limited.
There's one called pico C
Yes, and there's tcc's -run option.
which is open source and available to be incorporated.
Likewise tcc. So what's wrong (in your opinion) with these two that
caused you to suggest implementing another C interpreter?
Bart writes compilers. So I assumed, wrongly as it turned out, that it
would be very simple and attractive to him to modify one to be an
interpreter.
But I'm still unclear as to why you need a C interpreter, rather than
just using whatever C compiler is provided.
On 05/02/2024 16:00, bart wrote:
And if you provide source code for the interpreter, then building the
interpreter might be as much of a job as building the main app!
The way it would work is that there would be a program called
"fixedmake", or whatever, which was distributed as an executable exactly
like make. Then someone writes program source, packages it up in a zip
file, and instead of a makefile, he puts a "fixedmake" file in the
source directory. However the fixedmake file is formally a C program.
Maybe we could have the convention that it is called buildme.c.
The user types fixedmake buildme.c. fixedmake is effectively a C
interpreter. buildme.c executes, calls C compilers and other things as externals, and the program is built.
Now because buildme.c is a C program, we could use other approaches. You could go gcc buildme.c -lfixedmake.lib, produce an a.out, and run a.out.
But the snag is that it's meant to be a build system. And launching gcc
is in and of itself a build. So it's not really automated any more.
Or we could say that since fixedmake is really just a C interpreter, we
don't have to be the ones to write that. Just use tcc. But the problem is
that whilst buildme.c will usually be fairly short and simple, to keep
it short and simple there will need to be a rich library with facilities for getting lists of files, launching compilers, and so on. So if we use
tcc, we've got to package this library as well as buildme.c and the
actual sources in the distribution.
However tcc is open source, so we might be able to modify tcc.
Before the system has any sort of traction, you can't assume that the fixedmake executable will be available, however. So people are going to
have to provide it with the distribution. And if you provide as binary,
why not just provide the program as binary? And if you provide as source
and build with make, why not build the program with make? But I don't
see any way round that for any new build system.
On 05/02/2024 14:42, Ben Bacarisse wrote:
Malcolm McLean <[email protected]> writes:
The term "scripting language" is so vague as to be almost useless, but
in this context the suggestion seems to be to use C as a language to
invoke the commands required to compile and link a C program. But C is
very poorly suited to this task. One would end up writing a library of
functions to do string manipulation and (unless 'system' was deemed
sufficient) program execution. C itself would bring very little to the
party and the resulting scripts would be hard to read.
I'd dispute "very poorly". Of course you can devise a language which is better suited to building programs specifically. CMake takes exactly
that approach. So what happens? It's effectively another programming
language to learn. Someone else wrote elaborate CMake scripts which I
use at work. Sometimes things go wrong. And then I'm messing about with a language I hardly ever use and which does things in ways I am unfamiliar with, trying to troubleshoot. It would be easier if the CMake scripts were in
C. It might be a bit less convenient. But it's just looping and branching
and calling subroutines at the end of the day. As you know full well,
that's all computers can do.
The library would have facilities for working with lists of strings, of course. But C can do that perfectly well, and I don't see it as a big
problem.
On 05/02/2024 16:17, Malcolm McLean wrote:
The library would have facilities for working with lists of strings,
of course. But C can do that perfectly well, and I don't see it as big
problem.
Have you ever tried /any/ other programming language? Apart from
assembly and Forth, I have not seen a language that is more cumbersome
for string handling than C. Yes, you /can/ do string handling in C, but
no one would /choose/ C for that.
I have a directory "src". I want to find all the .c files in "src". I want to make a list of all these files, and a list of matching object
files to make in the "build/obj/src" directory. For each file, I want
to call "gcc -c src/file.c -o build/obj/src/file.o". I want to do so in parallel, up to 8 commands at a time. (Ignore any possible runtime
errors.)
With Python, you could have (untested) :
import glob
import os
import multiprocessing
import subprocess
srcdir = "src"
objdir = "build/obj"
srcfiles = glob.glob(srcdir + "/*.c")
objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
objfiles = [objdir + "/" + fn for fn in srcfiles]
os.mkdirs(objdir + "/" + srcdir, exist_ok = True)
with multiprocessing.Pool(5) as pool :
    pool.map(lambda src, obj : subprocess.call("gcc -c " +
            src + " -o " + obj, shell = True),
        zip(srcfiles, objfiles))
Obviously much of this could be put in a reusable library, so that end
users don't have to know about process pools. (And there are many other ways to structure such code.)
Have you ever tried /any/ other programming language? Apart from
assembly and Forth, I have not seen a language that is more
cumbersome for string handling than C.
(I tried your Python, but os.mkdir(dir) didn't work, even after fixing
the name typo and removing the named argument.
Bypassing that, there
were all sorts of errors to do with pickle.py and 'unbounded' methods.
strcat(cmdstr, "-o ");
strcat(cmdstr, exefilename);
strcat(cmdstr, " ");
for (i=0; i<nfiles; ++i) {
strcat(cmdstr, files[i]);
strcat(cmdstr, " ");
}
...
if (system(cmdstr)!=0)
On 05/02/2024 23:34, Lawrence D'Oliveiro wrote:
On Mon, 5 Feb 2024 15:17:15 +0000, Malcolm McLean wrote:
It would be easier if the CMake scripts were in C. It might be bit
less convenient.
Aren’t “easier” and “less convenient” kind of ... opposites?
Less convenient for the whizzy, super-skilled build system programmer to write. Easier for the humble C programmer who has it fall over on him to
fix.
But whilst the CMake scripts are quite short, they aren't trivial ...
But whilst I do write my own CMake scripts, I don't write them often
enough to really know the language ...
On 2/5/2024 6:27 PM, Lawrence D'Oliveiro wrote:
But the C code will be anything up to an order of magnitude larger than
the build rules written in the domain-specific language. So more work
required overall.
Putting the build rules in source code, is a bad idea, TM?
On 05/02/2024 19:47, David Brown wrote:
On 05/02/2024 16:17, Malcolm McLean wrote:
On 05/02/2024 14:42, Ben Bacarisse wrote:
Malcolm McLean <[email protected]> writes:
The term "scripting language" is so vague as to be almost useless, but
in this context the suggestion seems to be to use C as a language to
invoke the commands required to compile and link a C program. But C is
very poorly suited to this task. One would end up writing a library of
functions to do string manipulation and (unless 'system' was deemed
sufficient) program execution. C itself would bring very little to the
party and the resulting scripts would be hard to read.
I'd dispute "very poorly". Of course you can devise a language which
is better suited to building programs specifically. CMake takes
exactly that approach. So what happens? It's effectively another
programming language to learn. Someone else wrote elaborate CMake
scripts which I use at work. Sometimes things go wrong. And then I'm
messing about with a language I hardly ever use and which does things in
ways I am unfamiliar with, trying to troubleshoot. It would be easier
if the CMake scripts were in C. It might be a bit less convenient. But
it's just looping and branching and calling subroutines at the end of
the day. As you know full well, that's all computers can do.
The library would have facilities for working with lists of strings,
of course. But C can do that perfectly well, and I don't see it as a
big problem.
Have you ever tried /any/ other programming language? Apart from
assembly and Forth, I have not seen a language that is more cumbersome
for string handling than C. Yes, you /can/ do string handling in C,
but no one would /choose/ C for that.
It's not unreasonable to suggest that you'd rather base your build
system on an existing mainstream language than a domain-specific
language (though DSL's have the big advantage of having a syntax and
semantics tuned to the task in question). But why C?
The newsgroup is comp.lang.c. So if we propose using a generally accepted
and widely understood programming language as our build scripting
language, the choice of language has to be C.
Here's a quick challenge for you.
I have a directory "src". I want to find all the .c files in "src".
I want to make a list of all these files, and a list of matching
object files to make in the "build/obj/src" directory. For each file,
I want to call "gcc -c src/file.c -o build/obj/src/file.o". I want to
do so in parallel, up to 8 commands at a time. (Ignore any possible
runtime errors.)
With make, that would be (this is untested) :
default : all
.PHONY : all
srcdir = src
objdir = build/obj
srcfiles = $(wildcard $(srcdir)/*.c)
objfiles_src = $(srcfiles:.c=.o)
objfiles = $(addprefix $(objdir)/, \
    $(patsubst ../%,%,$(objfiles_src)))
all : $(objfiles)
$(objdir)/src :
	mkdir -p $@
$(objdir)/%.o : %.c | $(objdir)/src
	gcc -c $< -o $@
Run with "make -j 8".
With Python, you could have (untested) :
import glob
import os
import multiprocessing
import subprocess
srcdir = "src"
objdir = "build/obj"
srcfiles = glob.glob(srcdir + "/*.c")
objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
objfiles = [objdir + "/" + fn for fn in srcfiles]
os.mkdirs(objdir + "/" + srcdir, exist_ok = True)
with multiprocessing.Pool(5) as pool :
    pool.map(lambda src, obj : subprocess.call("gcc -c " +
            src + " -o " + obj, shell = True),
        zip(srcfiles, objfiles))
Obviously much of this could be put in a reusable library, so that end
users don't have to know about process pools. (And there are many
other ways to structure such code.)
Your task is to duplicate this in C.
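#include <stdio.h>
#include "buildutils.h"
int main(void)
{
    /* The directory names used by the make and Python versions. */
    const char *srcdir = "src";
    const char *objdir = "build/obj";
    STRINGLIST *sources;
    STRINGLIST *objfiles;
    char outputdir[1024];
    int i;
    sources = glob("src/*.c");
    objfiles = stringlist();
    /* Build the matching list of object file names. */
    for (i = 0; i < stringlist_Nstrings(sources); i++)
    {
        char objname[1024];
        snprintf(objname, 1024, "%s/%s", objdir,
            replaceextension(stringlist_get(sources, i), ".o"));
        stringlist_add(objfiles, objname);
    }
    snprintf(outputdir, 1024, "%s/%s", objdir, srcdir);
    mkdir(outputdir);
    /* One gcc command per source/object pair, run in parallel. */
    callparallel("gcc -c %s -o %s", sources, objfiles);
    return 0;
}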
OK now I've cheated a little bit by inventing buildutils library
functions ad hoc. But I haven't done so outrageously. And we're leaking
memory. But likely we have many gigabytes, so who cares about maybe 2K
for a few strings. And yes, maybe we could replace snprintf() with
something that creates an arbitrary length output.
The Python is a bit shorter and more concise. But there's not that much
in it.
On Mon, 5 Feb 2024 21:28:20 +0000, bart wrote:
(I tried your Python, but os.mkdir(dir) didn't work, even after fixing
the name typo and removing the named argument.
“mkdirs” should have been “makedirs” <https://docs.python.org/3/library/os.html#os.makedirs>.
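For readers who want to run the Python version: besides the makedirs name, the error reports in this thread point at two further problems. multiprocessing.Pool has to pickle the function it hands to its worker processes, and a lambda cannot be pickled; and objfiles is built from srcfiles, so the output names would still end in .c. The following is one possible repair, not David Brown's code. It swaps in a thread pool (concurrent.futures.ThreadPoolExecutor), which needs no pickling, and the names object_path, compile_one and build are made up for this sketch. It assumes gcc is on the PATH.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def object_path(src, objdir="build/obj"):
    """Map e.g. src/file.c to build/obj/src/file.o (hypothetical helper)."""
    return str(Path(objdir) / Path(src).with_suffix(".o"))


def compile_one(src, obj, compiler="gcc"):
    """Compile one file and return the compiler's exit status."""
    os.makedirs(os.path.dirname(obj), exist_ok=True)
    # An argument list avoids going through the shell.
    return subprocess.call([compiler, "-c", src, "-o", obj])


def build(srcdir="src", objdir="build/obj", jobs=8, compile_fn=compile_one):
    """Compile every .c file in srcdir, at most `jobs` at a time."""
    srcfiles = sorted(str(p) for p in Path(srcdir).glob("*.c"))
    objfiles = [object_path(s, objdir) for s in srcfiles]
    # Threads are enough here: each worker just waits for a child process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        statuses = list(pool.map(compile_fn, srcfiles, objfiles))
    return list(zip(srcfiles, objfiles, statuses))
```

Calling build() from the project root would compile src/*.c into build/obj/src/ with up to 8 compilers running at once, and return one (source, object, exit status) tuple per file.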
On 05/02/2024 19:47, David Brown wrote:
On 05/02/2024 16:17, Malcolm McLean wrote:
The library would have facilities for working with lists of strings,
of course. But C can do that perfectly well, and I don't see it as
big problem.
Have you ever tried /any/ other programming language? Apart from
assembly and Forth, I have not seen a language that is more cumbersome
for string handling than C. Yes, you /can/ do string handling in C,
but no one would /choose/ C for that.
See the build.c demo elsewhere in the thread.
I have a directory "src". I want to find all the .c files in "src".
I want to make a list of all these files, and a list of matching
object files to make in the "build/obj/src" directory. For each file,
I want to call "gcc -c src/file.c -o build/obj/src/file.o". I want to
do so in parallel, up to 8 commands at a time. (Ignore any possible
runtime errors.)
With Python, you could have (untested) :
import glob
import os
import multiprocessing
import subprocess
srcdir = "src"
objdir = "build/obj"
srcfiles = glob.glob(srcdir + "/*.c")
objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
objfiles = [objdir + "/" + fn for fn in srcfiles]
os.mkdirs(objdir + "/" + srcdir, exist_ok = True)
with multiprocessing.Pool(5) as pool :
    pool.map(lambda src, obj : subprocess.call("gcc -c " +
            src + " -o " + obj, shell = True),
        zip(srcfiles, objfiles))
Obviously much of this could be put in a reusable library, so that end
users don't have to know about process pools. (And there are many
other ways to structure such code.)
I had a go at expressing the Python version in my scripting language:
srcdir := "src/"
objdir := "build/obj/"+srcdir
srcfiles := dirlist(srcdir + "*.c")
objfiles_src := mapvs(changeext, srcfiles, "o")
objfiles := mapsv(+, objdir, objfiles_src)
createdir(objdir)
for i, sfile in srcfiles do
    execcmd(sfprint("gcc -c # -o #", srcdir+sfile, objfiles[i]))
od
A couple of issues: 'createdir' can only create one directory at a time;
for a chain of them like a/b/c, I'd need to split it up and do it one by
one. (To test this, I created it manually.)
Also, I don't have any features for parallel executions. However
'execcmd' starts a process but then doesn't wait for it to complete. If
I instead use 'system' (or my 'execwait'), then compiling all Lua .c
files in ./src takes 6.5 seconds instead of 2.5 seconds.
But this isn't the point of posting this. My interpreter can be
expressed as a single C file. That allows you to use this kind of
scripting, without adding any extra dependencies. You still only need a
C compiler.
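The timing difference mentioned above, between starting each compile without waiting and waiting for each one in turn, can be illustrated in Python, the language of the earlier script. This is only a sketch of the general idea and says nothing about how 'execcmd' or 'execwait' are implemented; run_all and its wait_each parameter are names invented here.

```python
import subprocess


def run_all(commands, wait_each=False):
    """Run a list of argv lists and return their exit codes in order.

    wait_each=True runs the commands one at a time.
    wait_each=False starts them all first, then waits for each.
    """
    if wait_each:
        return [subprocess.call(cmd) for cmd in commands]
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

Starting everything at once puts no limit on the number of simultaneous processes, which is where the "up to 8 commands at a time" part of the challenge comes in.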
(I tried your Python, but os.mkdir(dir) didn't work, even after fixing
the name typo and removing the named argument. Bypassing that, there
were all sorts of errors to do with pickle.py and 'unbounded' methods.
I believe your glob.glob routine returns filenames with path prepended
(my dirlist doesn't). I suspect that's why you chose a dest path as
build/obj/src rather than build/obj. Anyway, I wasn't able to compare
performance.)
BTW this scripting is still hard work. For building you want to express
the requirements as data, not code. If you can do that (see my build.c
example), then C may be adequate.
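As a rough illustration of "data, not code", here is a sketch in Python, again to match the earlier script. It is not the build.c example referred to above: the PROJECT layout and the compile_commands function are assumptions made up for this sketch. The project is described by a plain table, and one small generic routine turns the table into compiler command lines.

```python
# The build described as data: no loops or branches in the description.
PROJECT = {
    "compiler": "gcc",
    "objdir": "build/obj",
    "sources": ["src/main.c", "src/util.c"],
}


def compile_commands(project):
    """Turn the declarative description into one argv list per source."""
    commands = []
    for src in project["sources"]:
        # src/main.c becomes build/obj/src/main.o
        obj = project["objdir"] + "/" + src.rsplit(".", 1)[0] + ".o"
        commands.append([project["compiler"], "-c", src, "-o", obj])
    return commands
```

The same table could just as well be an initialised array of structs in C, which is presumably why the data-driven form makes C look more adequate for the job.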