• Re: Build system proposal

    From bart@21:1/5 to Malcolm McLean on Sun Feb 4 15:58:52 2024
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    Do we need an interpreter? We need to assume there is a C compiler
    otherwise we won't get very far with the build!

    If the concern is the time it takes to compile the script, it should be
    a small proportion of that needed to build the app. I assume the script
    is only built once.

    C for scripting builds has been done before. I think Thiago Adams did so.

    The Seed7 build process had its own version of an auto-config script
    which was written in C, not Bash (but building the app still uses
    makefiles, nearly 20 of them for assorted compilers).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to bart on Sun Feb 4 16:58:22 2024
    bart <[email protected]> writes:
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    This was tried. It's called the c-shell.

  • From bart@21:1/5 to Scott Lurndal on Sun Feb 4 17:15:16 2024
    On 04/02/2024 16:58, Scott Lurndal wrote:
    bart <[email protected]> writes:
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    This was tried. It's called the c-shell.

    This is not a shell program. And it's not proposing using a language
    that vaguely looks like C, but /is/ C. I don't think the proposal is to
    ever type it live via a REPL, but always from a file.

    Although as I understand it, what is proposed could be done right now:
    anyone can provide a small C program to orchestrate the building of the
    main program.

    So perhaps it will be more of a library.

    But it will take some care that it doesn't end up doing exactly what
    makefiles do (with lots of assumptions about C compilers and the same
    failure points), but written as a sequence of C function calls.

    I think the main objection will be that it is a two step process. This
    may have been the motivation to use an interpreter, but that is too huge
    of a task, and too big a dependency, to solve a problem that could be
    achieved with shebang lines.

  • From Scott Lurndal@21:1/5 to Malcolm McLean on Sun Feb 4 16:32:52 2024
    Malcolm McLean <[email protected]> writes:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax.

    Fact not in evidence. Your opinion thereof does not make it a fact.

  • From Ben Bacarisse@21:1/5 to Scott Lurndal on Sun Feb 4 23:32:01 2024
    [email protected] (Scott Lurndal) writes:

    bart <[email protected]> writes:
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    This was tried. It's called the c-shell.

    csh isn't a C interpreter by any stretch of the imagination!

    --
    Ben.

  • From Tim Rentsch@21:1/5 to Ben Bacarisse on Sun Feb 4 23:24:56 2024
    Ben Bacarisse <[email protected]> writes:

    [...]

    csh isn't a C interpreter by any stretch of the imagination!

    IT ISN'T??? No wonder I've been having trouble writing csh
    scripts...

  • From Ben Bacarisse@21:1/5 to Malcolm McLean on Mon Feb 5 14:42:47 2024
    Malcolm McLean <[email protected]> writes:

    On 04/02/2024 23:32, Ben Bacarisse wrote:
    [email protected] (Scott Lurndal) writes:

    bart <[email protected]> writes:
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    This was tried. It's called the c-shell.

    csh isn't a C interpreter by any stretch of the imagination!

    No. But the concept of a C interpreter isn't new.

    Indeed. But C is not a good fit for interpretation, so there are often compromises.

    There's one called pico C

    Yes, and there's tcc's -run option.

    which is open source and available to be incorporated.

    Likewise tcc. So what's wrong (in your opinion) with these two that
    caused you to suggest implementing another C interpreter?

    And a scripting
    language is an obvious application for an interpreter.

    The term "scripting language" is so vague as to be almost useless, but
    in this context the suggestion seems to be to use C as a language to
    invoke the commands required to compile and link a C program. But C is
    very poorly suited to this task. One would end up writing a library of functions to do string manipulation and (unless 'system' was deemed
    sufficient) program execution. C itself would bring very little to the
    party and the resulting scripts would be hard to read.

    --
    Ben.

  • From bart@21:1/5 to Malcolm McLean on Mon Feb 5 16:00:43 2024
    On 05/02/2024 15:17, Malcolm McLean wrote:
    On 05/02/2024 14:42, Ben Bacarisse wrote:
    Malcolm McLean <[email protected]> writes:

    On 04/02/2024 23:32, Ben Bacarisse wrote:
    [email protected] (Scott Lurndal) writes:

    bart <[email protected]> writes:
    On 04/02/2024 15:39, Malcolm McLean wrote:
    How shall we write a better build system?

    Current systems use scripts with ad hoc, weird syntax. They don't work
    consistently across platforms, they break and are hard to fix, and are
    complicated to set up. You even get one scripting system as a front
    end to another, a sort of admission of failure.

    But can we do better?

    A scripting language provides loops, conditional branches, and
    subroutine calls. And we've already got a perfectly good language for
    describing that. It is C itself.

    So here's the proposal. We implement a C interpreter.

    This was tried.  It's called the c-shell.

    csh isn't a C interpreter by any stretch of the imagination!

    No. But the concept of a C interpreter isn't new.

    Indeed.  But C is not a good fit for interpretation, so there are often
    compromises.

    It's less of a problem now. The interpreter can just spin up a virtual
    machine and then create a memory space. That was less of a practical
    approach when machines were more limited.

    There's one called pico C

    Yes, and there's tcc's -run option.

    which is open source and available to be incorporated.

    Likewise tcc.  So what's wrong (in your opinion) with these two that
    caused you to suggest implementing another C interpreter?

    Bart writes compilers. So I assumed, wrongly as it turned out, that it
    would be very simple and attractive to him to modify one to be an interpreter.

    It's actually quite hard to write an interpreter for a statically typed language, which has to call FFI routines written in native code, which
    include call-back functions that will call back into your program; they
    will expect a native code function!

    It's extra hard when the language is C, because then it is impossible to
    draw a line between routines in language X which is being interpreted,
    and language Y which is native code on the other side of the FFI.

    Since X and Y are both C.
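
    The callback problem can be seen with nothing more exotic than qsort().
    The following is a minimal sketch, not taken from any of the interpreters
    mentioned; the names compare_names and sort_names are made up for
    illustration. The library routine receives a plain function pointer and
    calls through it as native code, so an interpreter running this script
    would somehow have to hand qsort() a real machine-code entry point for
    compare_names.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback: qsort() calls this through a native function
   pointer, once per comparison. */
static int compare_names(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sort a list of file names in place. */
void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], compare_names);
}
```

    When this is compiled there is nothing to do: both sides of the call are
    native code. The difficulty only appears once compare_names exists solely
    as interpreted code.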

    However, I do have an interpreter for a statically typed IL. To
    interpret C code, I need to change the C compiler backend to generate
    that IL. (And also tweak that IL because it uses a 64-bit execution
    core, but C requires a mixed 32/64-bit core.)

    So it's still quite a bit of work.

    But I'm still unclear as to why you need a C interpreter, rather than
    just using whatever C compiler is provided.

    If you provide an interpreter, what form would it take; a binary? If it
    can be binary, then why not a binary of the app?

    And if you provide source code for the interpreter, then building the
    interpreter might be as much of a job as building the main app!

  • From Michael S@21:1/5 to bart on Mon Feb 5 19:14:52 2024
    On Mon, 5 Feb 2024 16:00:43 +0000
    bart <[email protected]> wrote:

    But I'm still unclear as to why you need a C interpreter, rather than
    just using whatever C compiler is provided.


    One reason could be that when the build system's "meta" code (as opposed
    to the bulk of the project's code) has a bug, especially an out-of-bounds
    array access, I'd rather it failed gracefully, with a meaningful error
    indication, something that a typical compiled C environment is unable to
    provide.
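
    For comparison, here is roughly what a compiled build script has to do by
    hand to get that behaviour. This is only a sketch under the assumption
    that every array access in the script goes through a helper; the name
    checked_get is hypothetical. An interpreter could apply the same check to
    every access automatically.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: return items[index], or NULL with a diagnostic on
   stderr if index is out of range, instead of reading past the array. */
const char *checked_get(const char *const *items, size_t count, size_t index)
{
    if (index >= count) {
        fprintf(stderr,
                "build script error: index %zu out of range (count %zu)\n",
                index, count);
        return NULL;
    }
    return items[index];
}
```

    The caller still has to test for NULL, which is exactly the kind of
    discipline that plain compiled C does not enforce.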

  • From bart@21:1/5 to Malcolm McLean on Mon Feb 5 17:39:25 2024
    On 05/02/2024 16:51, Malcolm McLean wrote:
    On 05/02/2024 16:00, bart wrote:

    And if you provide source code for the interpreter, then building the
    interpreter might be as much of a job as building the main app!


    The way it would work is that there would be a program called
    "fixedmake", or whatever, which was distributed as an executable exactly
    like make. Then someone writes program source, packages it up in a zip
    file, and instead of a makefile, he puts a "fixedmake" file in the
    source directory. However the fixedmake file is formally a C program.
    Maybe we could have the convention that it is called buildme.c.

    The user types fixedmake buildme.c. fixedmake is effectively a C
    interpreter. buildme.c executes, calls C compilers and other things as
    externals, and the program is built.

    Now because buildme.c is a C program, we could use other approaches. You
    could go gcc buildme.c -lfixedmake.lib, produce an a.out, and run a.out.
    But the snag is that it's meant to be a build system. And launching gcc
    is in and of itself a build. So it's not really automated any more.

    Or we could say that since fixedmake is really just a C interpreter, we
    don't have to be the ones to write that. Just use tcc. But the problem is
    that whilst buildme.c will usually be fairly short and simple, to keep
    it short and simple there will need to be a rich library with facilities
    for getting lists of files, launching compilers, and so on. So if we use
    tcc, we've got to package this library as well as buildme.c and the
    actual sources in the distribution.

    However tcc is open source, so we might be able to modify tcc.

    Before the system has any sort of traction, you can't assume that the fixedmake executable will be available, however. So people are going to
    have to provide it with the distribution. And if you provide as binary,
    why not just provide the program as binary? And if you provide as source
    and build with make, why not build the program with make? But I don't
    see any way round that for any new build system.


    One problem as I see is that the build system doesn't know which C
    compiler is available, or which one the client wishes to use.

    The client will know that, so can simply supply that information.
    However it is needed in two places:

    (1) The compiler used to build build.c (as it's called in the demo
    below).

    (2) The compiler used by build.exe to build the app.

    (1) is easy, the client is just told to use their preferred compiler here:

    tcc build.c

    But the next step is harder: build.exe won't know what was typed in step
    (1). Even if they do 'tcc -run build.c', 'tcc' does not appear as
    'args[0]', it will be 'build.c' (not even build.exe).
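
    Since the name cannot be recovered from args[0], it has to be passed in
    some other way. One possible arrangement is sketched here; it is not what
    the mock-up below does beyond the first step. The helper name
    choose_compiler is hypothetical, and consulting a CC environment variable
    is an assumed convention rather than anything proposed in this thread.

```c
#include <stdlib.h>

/* Hypothetical helper: pick the compiler name for the second build step.
   Order: explicit command-line argument, then the CC environment variable
   (an assumed convention), then the caller's built-in default. */
const char *choose_compiler(int argc, char **argv, const char *fallback)
{
    const char *from_env;

    if (argc >= 2 && argv[1][0] != '\0')
        return argv[1];

    from_env = getenv("CC");
    if (from_env != NULL && from_env[0] != '\0')
        return from_env;

    return fallback;
}
```

    With this, 'build gcc' selects gcc, and a bare 'build' falls back to CC
    if it is set and otherwise to whatever default the script passes in.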


    Here is a mock-up anyway of a build system:

    1) The demo app used has three C files: cipher.c, hmac.c, sha2.c

    2) There is a program called build.c

    3) The data describing the project is in a file called here
    filelist.txt, although it contains C source code.

    Files (2) and (3) are shown below. It runs like this, using tcc's run
    option:

    c:\c>tcc -run build.c
    Compiler = tcc
    Invoking compiler:tcc -o cipher.exe cipher.c hmac.c sha2.c
    Finished building: cipher.exe

    build.c defaults to using tcc. For use with another compiler, the
    process looks like this:

    c:\c>gcc build.c -o build

    c:\c>build gcc
    Compiler = gcc
    Invoking compiler:gcc -o cipher.exe cipher.c hmac.c sha2.c
    Finished building: cipher.exe

    It's a little untidier. However you can always provide a makefile to shut
    people up. The difference here is:

    * The makefile only contains the handful of lines needed to implement
    the above

    * The process can easily be done manually if needed

    * The critical build info, the list of files, is inside the file
    filelist.txt. While it is C, it can still be presented in a readable
    manner. It can even be incorporated into the user's own programs.

    The info in filelist.txt is simplistic. It also needs the locations of
    any special headers, the libraries to use and so on. This is all
    info added to the final invocation.

    Ideally build.c, knowing the name of the compiler, can select the right
    forms of any options needed.
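
    As a rough illustration of what "selecting the right forms" might look
    like, here is a sketch covering only the option that names the output
    executable. The function name exe_output_option is hypothetical, and the
    table is deliberately tiny: it assumes MSVC is invoked as "cl" and treats
    every other compiler as accepting the gcc-style "-o".

```c
#include <string.h>

/* Hypothetical helper: the option text that precedes the executable name.
   Only one special case is covered; anything unknown gets "-o ". */
const char *exe_output_option(const char *compiler)
{
    if (strcmp(compiler, "cl") == 0)
        return "/Fe";   /* MSVC: /Fecipher.exe, no space before the name */
    return "-o ";       /* gcc, clang, tcc: -o cipher.exe */
}
```

    build.c could then append exe_output_option(compiler) instead of the
    literal "-o " when assembling the command string.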



    --------------------------------------------------------------
    filelist.txt
    --------------------------------------------------------------
    //#define exeext = ".exe"

    char* files[] = {
        "cipher.c",
        "hmac.c",
        "sha2.c"
    };

    int nfiles = sizeof(files)/sizeof(files[0]);

    char* projectname = "cipher";
    char* exefilename = "cipher.exe";
    --------------------------------------------------------------

    --------------------------------------------------------------
    build.c
    --------------------------------------------------------------
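    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #include "filelist.txt"

    char* compiler;

    /* Report a build failure and stop. */
    void error(char* mess, char* param){
        printf("Build error: %s %s\n",mess, param);
        exit(1);
    }

    /* Build one command line from the compiler name, the output name and
       the source files, then run it via system(). */
    void compileproject(void) {
        int length, i;
        char* cmdstr;

        /* Room for the compiler, " ", "-o ", the exe name, " ", each file
           plus a trailing space, and the terminating NUL. */
        length=strlen(compiler)+strlen(exefilename)+6;
        for (i=0; i<nfiles; ++i) length+=strlen(files[i])+1;
        cmdstr=malloc(length);
        if (cmdstr==NULL) error("Out of memory building:", exefilename);

        strcpy(cmdstr, compiler);
        strcat(cmdstr, " ");

        strcat(cmdstr, "-o ");
        strcat(cmdstr, exefilename);
        strcat(cmdstr, " ");

        for (i=0; i<nfiles; ++i) {
            strcat(cmdstr, files[i]);
            strcat(cmdstr, " ");
        }

        printf("Invoking compiler:");
        puts(cmdstr);
        if (system(cmdstr)!=0)
            error("Error building:", exefilename);
        else
            printf("Finished building: %s\n", exefilename);
    }

    int main(int n, char** args) {

        /* The compiler name is the optional first argument. */
        if (n>=2)
            compiler=args[1];
        else
            compiler="tcc";

        printf("Compiler = %s\n", compiler);

        compileproject();
        return 0;
    }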
    --------------------------------------------------------------

  • From David Brown@21:1/5 to Malcolm McLean on Mon Feb 5 20:47:53 2024
    On 05/02/2024 16:17, Malcolm McLean wrote:
    On 05/02/2024 14:42, Ben Bacarisse wrote:
    Malcolm McLean <[email protected]> writes:


    The term "scripting language" is so vague as to be almost useless, but
    in this context the suggestion seems to be to use C as a language to
    invoke the commands required to compile and link a C program.  But C is
    very poorly suited to this task.  One would end up writing a library of
    functions to do string manipulation and (unless 'system' was deemed
    sufficient) program execution.  C itself would bring very little to the
    party and the resulting scripts would be hard to read.

    I'd dispute "very poorly". Of course you can devise a language which is
    better suited to building programs specifically. CMake takes exactly
    that approach. So what happens? It's effectively another programming
    language to learn. Someone else wrote elaborate CMake scripts which I
    use at work. Sometimes things go wrong. And then I'm messing about with a
    language I hardly ever use and does things in ways I am unfamiliar with,
    trying to troubleshoot. It would be easier if the CMake scripts were in
    C. It might be bit less convenient. But it's just looping and branching
    and calling subroutines at the end of the day. As you know full well,
    that's all computers can do.

    The library would have facilities for working with lists of strings, of course. But C can do that perfectly well, and I don't see it as big
    problem.


    Have you ever tried /any/ other programming language? Apart from
    assembly and Forth, I have not seen a language that is more cumbersome
    for string handling than C. Yes, you /can/ do string handling in C, but
    no one would /choose/ C for that.

    It's not unreasonable to suggest that you'd rather base your build
    system on an existing mainstream language than a domain-specific
    language (though DSL's have the big advantage of having a syntax and
    semantics tuned to the task in question). But why C?

    Here's a quick challenge for you.

    I have a directory "src". I want to find all the .c files in "src". I
    want to make a list of all these files, and a list of matching object
    files to make in the "build/obj/src" directory. For each file, I want
    to call "gcc -c src/file.c -o build/obj/src/file.o". I want to do so in parallel, up to 8 commands at a time. (Ignore any possible runtime errors.)


    With make, that would be (this is untested) :

    default : all
    .PHONY : default all
    srcdir = src
    objdir = build/obj

    srcfiles = $(wildcard $(srcdir)/*.c)
    objfiles_src = $(srcfiles:.c=.o)
    objfiles = $(addprefix $(objdir)/, \
        $(patsubst ../%,%,$(objfiles_src)))

    all : $(objfiles)

    $(objdir)/src :
            mkdir -p $@

    $(objdir)/%.o : %.c | $(objdir)/src
            gcc -c $< -o $@

    Run with "make -j 8".


    With Python, you could have (untested) :

    import glob
    import os
    import multiprocessing
    import subprocess

    srcdir = "src"
    objdir = "build/obj"

    srcfiles = glob.glob(srcdir + "/*.c")
    objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
    objfiles = [objdir + "/" + fn for fn in srcfiles]

    os.mkdirs(objdir + "/" + srcdir, exist_ok = True)

    with multiprocessing.Pool(5) as pool :
        pool.map(lambda src, obj : subprocess.call("gcc -c " +
                 src + " -o " + obj, shell = True),
                zip(srcfiles, objfiles))

    Obviously much of this could be put in a reusable library, so that end
    users don't have to know about process pools. (And there are many other
    ways to structure such code.)


    Your task is to duplicate this in C.

  • From bart@21:1/5 to David Brown on Mon Feb 5 21:28:20 2024
    On 05/02/2024 19:47, David Brown wrote:
    On 05/02/2024 16:17, Malcolm McLean wrote:

    The library would have facilities for working with lists of strings,
    of course. But C can do that perfectly well, and I don't see it as big
    problem.


    Have you ever tried /any/ other programming language?  Apart from
    assembly and Forth, I have not seen a language that is more cumbersome
    for string handling than C.  Yes, you /can/ do string handling in C, but
    no one would /choose/ C for that.

    See the build.c demo elsewhere in the thread.


    I have a directory "src".  I want to find all the .c files in "src".  I want to make a list of all these files, and a list of matching object
    files to make in the "build/obj/src" directory.  For each file, I want
    to call "gcc -c src/file.c -o build/obj/src/file.o".  I want to do so in parallel, up to 8 commands at a time.  (Ignore any possible runtime
    errors.)

    With Python, you could have (untested) :

        import glob
        import os
        import multiprocessing
        import subprocess

        srcdir = "src"
        objdir = "build/obj"

        srcfiles = glob.glob(srcdir + "/*.c")
        objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
        objfiles = [objdir + "/" + fn for fn in srcfiles]

        os.mkdirs(objdir + "/" + srcdir, exist_ok = True)

        with multiprocessing.Pool(5) as pool :
            pool.map(lambda src, obj : subprocess.call("gcc -c " +
                     src + " -o " + obj, shell = True),
                    zip(srcfiles, objfiles))

    Obviously much of this could be put in a reusable library, so that end
    users don't have to know about process pools.  (And there are many other ways to structure such code.)

    I had a go at expressing the Python version in my scripting language:

    srcdir := "src/"
    objdir := "build/obj/"+srcdir

    srcfiles := dirlist(srcdir + "*.c")

    objfiles_src := mapvs(changeext, srcfiles, "o")
    objfiles := mapsv(+, objdir, objfiles_src)

    createdir(objdir)

    for i, sfile in srcfiles do
        execcmd(sfprint("gcc -c # -o #", srcdir+sfile, objfiles[i]))
    od

    A couple of issues: 'createdir' can only create one directory at a time;
    for a chain of them like a/b/c, I'd need to split it up and do it one by
    one. (To test this, I created it manually.)

    Also, I don't have any features for parallel executions. However
    'execcmd' starts a process but then doesn't wait for it to complete. If
    I instead use 'system' (or my 'execwait'), then compiling all Lua .c
    files in ./src takes 6.5 seconds instead of 2.5 seconds.

    But this isn't the point of posting this. My interpreter can be
    expressed as a single C file. That allows you to use this kind of
    scripting, without adding any extra dependencies. You still only need a
    C compiler.

    (I tried your Python, but os.mkdir(dir) didn't work, even after fixing
    the name typo and removing the named argument. Bypassing that, there
    were all sorts of errors to do with pickle.py and 'unbounded' methods.

    I believe your glob.glob routine returns filenames with path prepended
    (my dirlist doesn't). I suspect that's why you chose a dest path as
    build/obj/src rather than build/obj. Anyway, I wasn't able to compare
    performance.)

    BTW this scripting is still hard work. For building you want to express
    the requirements as data, not code. If you can do that (see my build.c
    example), then C may be adequate.

  • From Michael S@21:1/5 to David Brown on Mon Feb 5 23:26:25 2024
    On Mon, 5 Feb 2024 20:47:53 +0100
    David Brown <[email protected]> wrote:

    Have you ever tried /any/ other programming language? Apart from
    assembly and Forth, I have not seen a language that is more
    cumbersome for string handling than C.


    In order of my exposure to them: Pascal, Fortran, Ada.

  • From Lawrence D'Oliveiro@21:1/5 to bart on Mon Feb 5 23:38:38 2024
    On Mon, 5 Feb 2024 21:28:20 +0000, bart wrote:

    (I tried your Python, but os.mkdir(dir) didn't work, even after fixing
    the name typo and removing the named argument.

    “mkdirs” should have been “makedirs” <https://docs.python.org/3/library/os.html#os.makedirs>.

    Bypassing that, there
    were all sorts of errors to do with pickle.py and 'unbounded' methods.

    Trouble with your Python install? In other words, Windows trouble?

  • From Lawrence D'Oliveiro@21:1/5 to bart on Mon Feb 5 23:41:25 2024
    On Mon, 5 Feb 2024 17:39:25 +0000, bart wrote:

    strcat(cmdstr, "-o ");
    strcat(cmdstr, exefilename);
    strcat(cmdstr, " ");

    for (i=0; i<nfiles; ++i) {
    strcat(cmdstr, files[i]);
    strcat(cmdstr, " ");
    }

    ...
    if (system(cmdstr)!=0)

    What is wrong with this code?

  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Mon Feb 5 23:34:50 2024
    On Mon, 5 Feb 2024 15:17:15 +0000, Malcolm McLean wrote:

    It would be easier if the CMake scripts were in
    C. It might be bit less convenient.

    Aren’t “easier” and “less convenient” kind of ... opposites?

  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Tue Feb 6 02:27:33 2024
    On Tue, 6 Feb 2024 01:41:56 +0000, Malcolm McLean wrote:

    On 05/02/2024 23:34, Lawrence D'Oliveiro wrote:

    On Mon, 5 Feb 2024 15:17:15 +0000, Malcolm McLean wrote:

    It would be easier if the CMake scripts were in C. It might be bit
    less convenient.

    Aren’t “easier” and “less convenient” kind of ... opposites?

    Less convenient for the whizzy super skilled build system programmer to
    write. Easier for the humble C programmer who has it fall over on him to
    fix.

    But the C code will be anything up to an order of magnitude larger than
    the build rules written in the domain-specific language. So more work
    required overall.

  • From Lawrence D'Oliveiro@21:1/5 to Malcolm McLean on Tue Feb 6 03:09:05 2024
    On Tue, 6 Feb 2024 02:59:11 +0000, Malcolm McLean wrote:

    But whilst the CMake scripts are quite short, they aren't trivial ...

    I know, I’ve fiddled with CMake scripts myself. The power is needed
    because builds can be complex.

    But whilst I do write my own CMake scripts, I don't write them often
    enough to really know the language ...

    Same here. That’s why, every time I need to touch a CMake script, I make
    sure to have the docs open in a browser window <https://cmake.org/documentation/>. I don’t fiddle at random: I try to understand what I am doing. And sometimes I find the person who fiddled
    the script before me didn’t understand that there was a simpler way to do something.

  • From Lawrence D'Oliveiro@21:1/5 to Chris M. Thomasson on Tue Feb 6 07:25:48 2024
    On Mon, 5 Feb 2024 22:32:40 -0800, Chris M. Thomasson wrote:

    On 2/5/2024 6:27 PM, Lawrence D'Oliveiro wrote:

    But the C code will be anything up to an order of magnitude larger than
    the build rules written in the domain-specific language. So more work
    required overall.

    Putting the build rules in source code, is a bad idea, TM?

    Build rules *are* source code.

  • From David Brown@21:1/5 to Malcolm McLean on Tue Feb 6 11:10:36 2024
    On 05/02/2024 22:19, Malcolm McLean wrote:
    On 05/02/2024 19:47, David Brown wrote:
    On 05/02/2024 16:17, Malcolm McLean wrote:
    On 05/02/2024 14:42, Ben Bacarisse wrote:
    Malcolm McLean <[email protected]> writes:


    The term "scripting language" is so vague as to be almost useless, but
    in this context the suggestion seems to be to use C as a language to
    invoke the commands required to compile and link a C program.  But C is
    very poorly suited to this task.  One would end up writing a library of
    functions to do string manipulation and (unless 'system' was deemed
    sufficient) program execution.  C itself would bring very little to the
    party and the resulting scripts would be hard to read.

    I'd dispute "very poorly". Of course you can devise a language which
    is better suited to building programs specifically. CMake takes
    exactly that approach. So what happens? It's effectively another
    programming language to learn. Someone else wrote elaborate CMake
    scripts which I use at work. Sometimes things go wrong. And then I'm
    messing about with a language I hardly ever use and does things in
    ways I am unfamiliar with, trying to troubleshoot. It would be easier
    if the CMake scripts were in C. It might be bit less convenient. But
    it's just looping and branching and calling subroutines at the end of
    the day. As you know full well, that's all computers can do.

    The library would have facilities for working with lists of strings,
    of course. But C can do that perfectly well, and I don't see it as
    big problem.


    Have you ever tried /any/ other programming language?  Apart from
    assembly and Forth, I have not seen a language that is more cumbersome
    for string handling than C.  Yes, you /can/ do string handling in C,
    but no one would /choose/ C for that.

    It's not unreasonable to suggest that you'd rather base your build
    system on an existing mainstream language than a domain-specific
    language (though DSL's have the big advantage of having a syntax and
    semantics tuned to the task in question).  But why C?

    The newsgroup is comp.lang.c. So if we propose using a generally accepted
    and widely understood programming language as our build scripting
    language, the choice of language has to be C.

    No, it does not. It is perfectly reasonable to say "not C".


    Here's a quick challenge for you.

    I have a directory "src".  I want to find all the .c files in "src".
    I want to make a list of all these files, and a list of matching
    object files to make in the "build/obj/src" directory.  For each file,
    I want to call "gcc -c src/file.c -o build/obj/src/file.o".  I want to
    do so in parallel, up to 8 commands at a time.  (Ignore any possible
    runtime errors.)


    With make, that would be (this is untested) :

         default : all
         .PHONY : all
         srcdir = src
         objdir = build/obj

         srcfiles = $(wildcard $(srcdir)/*.c)
         objfiles_src = $(srcfiles:.c=.o)
         objfiles = $(addprefix $(objdir)/, \
             $(patsubst ../%,%,$(objfiles_src)))

         all : $(objfiles)

         $(objdir)/src :
                 mkdir -p $@

         $(objdir)/%.o : %.c | $(objdir)/src
                 gcc -c $< -o $@

    Run with "make -j 8".


    With Python, you could have (untested) :

         import glob
         import os
         import multiprocessing
         import subprocess

         srcdir = "src"
         objdir = "build/obj"

         srcfiles = glob.glob(srcdir + "/*.c")
         objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
         objfiles = [objdir + "/" + fn for fn in srcfiles]

         os.mkdirs(objdir + "/" + srcdir, exist_ok = True)

         with multiprocessing.Pool(5) as pool :
             pool.map(lambda src, obj : subprocess.call("gcc -c " +
                      src + " -o " + obj, shell = True),
                     zip(srcfiles, objfiles))

    Obviously much of this could be put in a reusable library, so that end
    users don't have to know about process pools.  (And there are many
    other ways to structure such code.)
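
    For reference, one way the untested Python above could be made to run
    is sketched here. It rests on a few assumptions: it calls os.makedirs,
    it swaps multiprocessing.Pool for a thread pool from concurrent.futures
    (the workers only wait on child processes, and a lambda cannot be
    pickled for a process pool), and it passes each command as an argument
    list instead of a shell string. The names object_path, compile_one and
    compile_all are made up for illustration.

```python
import glob
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

SRCDIR = "src"
OBJDIR = "build/obj"


def object_path(src, objdir=OBJDIR):
    """Map src/file.c to build/obj/src/file.o (hypothetical helper)."""
    root, _ext = os.path.splitext(src)
    return objdir + "/" + root + ".o"


def compile_one(src, obj, compiler="gcc"):
    """Run one compile command and return its exit status."""
    return subprocess.call([compiler, "-c", src, "-o", obj])


def compile_all(srcfiles, jobs=8):
    """Compile every source file, at most `jobs` commands at a time."""
    objfiles = [object_path(src) for src in srcfiles]
    for obj in objfiles:
        # Make sure the output directory for each object file exists.
        os.makedirs(os.path.dirname(obj), exist_ok=True)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(compile_one, srcfiles, objfiles))


if __name__ == "__main__":
    compile_all(glob.glob(SRCDIR + "/*.c"))
```

    Like the original, this ignores runtime errors: the exit statuses are
    collected but nothing acts on them.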


    Your task is to duplicate this in C.



    #include <stdio.h>
    #include "buildutils.h"

    int main(void)
    {
       const char *srcdir = "src";
       const char *objdir = "build/obj";
       STRINGLIST *sources;
       STRINGLIST *objfiles;
       int i;

       sources = glob("src/*.c");
       objfiles = stringlist();
       for (i = 0; i < stringlist_Nstrings(sources); i++)
       {
          char objname[1024];
          snprintf(objname, 1024, "%s/%s", objdir, replaceextension(stringlist_get(sources, i), ".o"));
          stringlist_add(objfiles, objname);
       }
       char outputdir[1024];
       snprintf(outputdir, 1024, "%s/%s", objdir, srcdir);
       mkdir(outputdir);

       callparallel("gcc -c %s -o %s", sources, objfiles);
    }

    OK now I've cheated a little bit by inventing buildutils library
    functions ad hoc.

    Yes. Of course a practical solution would split the common code into a
    library, and not expect people to re-write that all the time. But a key
    point here is that in make and in Python - both well-established
    languages and tools - this is already in place. But you are skipping
    the hard bits. (Good luck implementing "callparallel" with a sane
    interface that includes the call you have made here!)

    We know it is /possible/ to do this kind of thing in C. We also know it
    is ugly and fiddly compared to other languages that handle strings and
    lists as first-class types (or at least have them as classes). Even
    when you assume the existence of a library with every type and function
    you want, you still need a lot of extra boilerplate code, you have
    artificial limits, memory leaks, etc.


    But I haven't done so outrageously. And we're leaking
    memory. But likely we have many gigabytes, so who cares about maybe 2K
    for a few strings. And yes, maybe we could replace snprintf() with
    something that creates an arbitrary length output.

    The Python is a bit shorter and more concise. But there's not that much
    in it.

  • From David Brown@21:1/5 to Lawrence D'Oliveiro on Tue Feb 6 12:56:41 2024
    On 06/02/2024 00:38, Lawrence D'Oliveiro wrote:
    On Mon, 5 Feb 2024 21:28:20 +0000, bart wrote:

    (I tried your Python, but os.mkdir(dir) didn't work, even after fixing
    the name typo and removing the named argument.

    “mkdirs” should have been “makedirs” <https://docs.python.org/3/library/os.html#os.makedirs>.


    Oh, so it was a typo - the "s" was right, but not the "mk" :-(

    The code was just for reference, not tested or complete.

  • From David Brown@21:1/5 to bart on Tue Feb 6 12:47:25 2024
    On 05/02/2024 22:28, bart wrote:
    On 05/02/2024 19:47, David Brown wrote:
    On 05/02/2024 16:17, Malcolm McLean wrote:

    The library would have facilities for working with lists of strings,
    of course. But C can do that perfectly well, and I don't see it as
    a big problem.


    Have you ever tried /any/ other programming language?  Apart from
    assembly and Forth, I have not seen a language that is more cumbersome
    for string handling than C.  Yes, you /can/ do string handling in C,
    but no one would /choose/ C for that.

    See the build.c demo elsewhere in the thread.


    I have a directory "src".  I want to find all the .c files in "src".
    I want to make a list of all these files, and a list of matching
    object files to make in the "build/obj/src" directory.  For each file,
    I want to call "gcc -c src/file.c -o build/obj/src/file.o".  I want to
    do so in parallel, up to 8 commands at a time.  (Ignore any possible
    runtime errors.)

    With Python, you could have (untested) :

         import glob
         import os
         import multiprocessing
         import subprocess

         srcdir = "src"
         objdir = "build/obj"

         srcfiles = glob.glob(srcdir + "/*.c")
         objfiles_src = [fn[:-2] + ".o" for fn in srcfiles]
         objfiles = [objdir + "/" + fn for fn in srcfiles]

         os.mkdirs(objdir + "/" + srcdir, exist_ok = True)

         with multiprocessing.Pool(5) as pool :
             pool.map(lambda src, obj : subprocess.call("gcc -c " +
                      src + " -o " + obj, shell = True),
                     zip(srcfiles, objfiles))

    Obviously much of this could be put in a reusable library, so that end
    users don't have to know about process pools.  (And there are many
    other ways to structure such code.)

    I had a go at expressing the Python version in my scripting language:

        srcdir := "src/"
        objdir := "build/obj/"+srcdir

        srcfiles := dirlist(srcdir + "*.c")

        objfiles_src := mapvs(changeext, srcfiles, "o")
        objfiles := mapsv(+, objdir, objfiles_src)

        createdir(objdir)

        for i, sfile in srcfiles do
            execcmd(sfprint("gcc -c # -o #", srcdir+sfile, objfiles[i]))
        od

    A couple of issues: 'createdir' can only create one directory at a time;
    for a chain of them like a/b/c, I'd need to split it up and do it one by
    one. (To test this, I created it manually.)
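
    Splitting such a chain by hand is only a few lines in most scripting
    languages. The Python sketch below walks up the path until it reaches a
    directory that already exists, then creates the missing levels one at a
    time with the single-level os.mkdir. The name create_chain is made up
    for illustration; in Python itself os.makedirs does the same job in one
    call.

```python
import os


def create_chain(path):
    """Create a/b/c one level at a time; return how many levels were made."""
    missing = []
    current = os.path.normpath(path)
    # Walk upwards, collecting every level that does not exist yet.
    while current and not os.path.isdir(current):
        missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    # Create the collected levels from the outermost inwards.
    for directory in reversed(missing):
        os.mkdir(directory)
    return len(missing)
```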

    Your scripting language is definitely an improvement over C for this
    kind of thing. That is as I would expect - scripting languages tend to
    be much better at working with strings and external commands.


    Also, I don't have any features for parallel executions. However
    'execcmd' starts a process but then doesn't wait for it to complete. If
    I instead use 'system' (or my 'execwait'), then compiling all Lua .c
    files in ./src takes 6.5 seconds instead of 2.5 seconds.
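
    The difference between waiting for each command and starting them all
    first can be sketched in Python with the subprocess module. This is
    only an illustration of the two strategies, not of the interpreter
    described here: run_sequential and run_overlapped are made-up names,
    and the overlapped version puts no limit on how many processes run at
    once.

```python
import subprocess
import sys


def run_sequential(commands):
    """Run each command and wait for it before starting the next."""
    return [subprocess.call(cmd) for cmd in commands]


def run_overlapped(commands):
    """Start every command first, then wait for all of them to finish."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Stand-in commands; a real build would list compiler invocations.
    demo = [[sys.executable, "-c", "pass"] for _ in range(3)]
    print(run_sequential(demo))
    print(run_overlapped(demo))
```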

    But this isn't the point of posting this. My interpreter can be
    expressed as a single C file. That allows you to use this kind of
    scripting, without adding any extra dependencies. You still only need a
    C compiler.

    (I tried your Python, but os.mkdir(dir) didn't work, even after fixing
    the name typo and removing the named argument. Bypassing that, there
    were all sorts of errors to do with pickle.py and 'unbounded' methods.


    "os.mkdirs()" was not a typo. "mkdirs()" is a different function from
    "mkdir()" - it can make multiple directories as needed. But I may have
    had other errors.

    I believe your glob.glob routine returns filenames with path prepended
    (my dirlist doesn't).

    Yes.

    I suspect that's why you chose a dest path as
    build/obj/src rather than build/obj.

    No. I used that because it is the structure I use for bigger projects.
    I generally have multiple source directories, and in my build
    directories I have "obj", "deps" (for dependency files), perhaps also
    directories for listings files and other bits and pieces, and inside
    these the directory structure mirrors the source tree. (I actually
    often have another layer between "build" and "obj" because there can be
    multiple different builds for the same project.)

    Anyway, I wasn't able to compare
    performance.)

    There was no intention of that. I merely wanted to show that something
    that is key to software building can be done relatively easily in a
    higher-level language suitable for scripting, in a way that is simpler
    and easier than in C.

    It was not a full build in any sense, and did not cover dependency graphs.


    BTW this scripting is still hard work. For building you want to express
    the requirements as data, not code. If you can do that (see my build.c
    example), then C may be adequate.
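
    To illustrate what "data, not code" can mean here, the Python sketch
    below keeps the build description in a plain table and uses one small
    function to turn it into an ordered list of commands. The RULES table,
    its file names and the build_plan function are all invented for the
    example; it does no timestamp checking and is not the build.c demo
    mentioned above.

```python
# Build description as data: each target lists its inputs and one command.
RULES = {
    "app": {"deps": ["main.o", "util.o"], "cmd": "gcc -o app main.o util.o"},
    "main.o": {"deps": ["main.c"], "cmd": "gcc -c main.c -o main.o"},
    "util.o": {"deps": ["util.c"], "cmd": "gcc -c util.c -o util.o"},
}


def build_plan(target, rules):
    """Return the commands needed for `target`, dependencies first."""
    plan = []
    seen = set()

    def visit(name):
        if name in seen or name not in rules:
            return  # already planned, or a plain source file with no rule
        seen.add(name)
        for dep in rules[name]["deps"]:
            visit(dep)
        plan.append(rules[name]["cmd"])

    visit(target)
    return plan


if __name__ == "__main__":
    for command in build_plan("app", RULES):
        print(command)
```

    The code that walks the table stays the same from project to project;
    only the table changes.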


    Eventually, when you add all the features and flexibility people want,
    you end up with a domain-specific language as your data, interpreted by
    code written in some other language (C or whatever - though again I
    expect other languages to be more suitable). You've just re-invented
    "make".

    Of course it is possible to re-invent make in a better way than the
    original (at least for some people's preferences and needs - "better" is
    always subjective). If I were doing so, I'd want an embedded high-level
    scripting language of some sort that was not a variant of Lisp. (Some
    people like Lisp, I don't.) Possible options from existing languages
    would be Lua, TCL and Python. And I'd want it neatly integrated, not
    just a way to make extensions, as that would simplify the main DSL.
    (There's plenty of other things I'd change, but I don't want to go
    through a list here.)
