• Memory organisation of the Z80 Turbo Pascal compiler

    From Lasse =?iso-8859-1?q?Hiller=F8e?= P@21:1/5 to All on Wed Sep 29 11:40:20 2021
    I have been thinking about the old TurboPascal compiler (actually about
    its predecessor COMPAS Pascal, which I used long ago, but I believe the original Z80 CP/M TurboPascal was exactly the same in this regard.)

    In the TP 2.0 manual it says:
    "The recursion stack is used only by recursive procedures and
    functions, i.e. procedures and functions compiled with the
    A compiler directive passive (~A-}). On entry to a recursive
    subprogram it copies its workspace onto the recursion stack,
    and on exit the entire workspace is restored to its original
    state. The default initial value of RecurPtr at the beginning
    of a program, is 1 K ($400) bytes below the CPU stack pointer."

    In other words, all procedures in a program have their own statically
    located "workspace", and when recursive calls are made, the previously activation's local variables are saved away. I suppose this was practical
    on the Z80, because indexed operations were rather expensive and not
    relative to SP, using the separate IX and IY registers, whereas an LDIR instruction was reasonably efficient.

    As far as I can tell, this works great even for a language with nested
    lexical scope like Pascal. When an inner procedure refers to a containing procedure's local variable, it will get the currently active instance.

    However, I think there would be an issue when passing procedure local
    variables as VAR parameters in combination with some forms of mutual and/
    or nested recursion, as the address passed would refer to the right
    workspace, but the workspace might contain the wrong activation instance.
    I'm not completely sure about this, and I don't have a setup at the
    moment where I could experiment and test it.

    In any case, I have never found any description of this style of scope management, since I actually used this compiler back in the 80es, and I
    don't know if it even has a name?

    Other than the copying (which may be considered inefficient, but only
    applies to recursive procedures) and the possible issue with passing
    references ending up pointing to the wrong variable, what would be the
    pros and cons of this method?

    /Lasse

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to All on Thu Sep 30 11:15:32 2021
    On 9/29/21 1:40 PM, Lasse Hillerøe Petersen wrote:
    I have been thinking about the old TurboPascal compiler (actually about
    its predecessor COMPAS Pascal, which I used long ago, but I believe the original Z80 CP/M TurboPascal was exactly the same in this regard.)

    In the TP 2.0 manual it says:
    "The recursion stack is used only by recursive procedures and
    functions, i.e. procedures and functions compiled with the
    A compiler directive passive (~A-}). On entry to a recursive
    subprogram it copies its workspace onto the recursion stack,
    and on exit the entire workspace is restored to its original
    state. The default initial value of RecurPtr at the beginning
    of a program, is 1 K ($400) bytes below the CPU stack pointer."

    The famous GfA-Basic for Atari ST and Amiga used a similar strange
    handling of local variables as global variables. On entry of a
    subroutine all "local" variables were saved on the stack and restored on
    exit. In between all references within the entire code used the local
    variables of that subroutine instance in the global workspace. This way
    every variable of a unique name and type had a fixed address at runtime.

    [...]
    Other than the copying (which may be considered inefficient, but only
    applies to recursive procedures) and the possible issue with passing references ending up pointing to the wrong variable, what would be the
    pros and cons of this method?

    I don't remember whether GfA Basic included pointers or distinct ref/val parameters. At least it was a damn fast framework on those 68k
    processors, in detail compared to 8086 processors of that time.

    A con of the GfA method was found in the later PC version that used the traditional and compatible way of storing local variables in the
    stackframe. This broke some legacy code, sooner or later, and I ended up
    with a set of subroutines that could not be converted into any other
    language. Unfortunately this was the tricky analysis of complex
    conditional expressions in IF (WHILE...) statements of my C decompiler :-(

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lasse =?iso-8859-1?q?Hiller=F8e?= P@21:1/5 to Hans-Peter Diettrich on Sat Oct 2 10:38:58 2021
    On Thu, 30 Sep 2021 11:15:32 +0200, Hans-Peter Diettrich wrote:

    I don't remember whether GfA Basic included pointers or distinct ref/val parameters. At least it was a damn fast framework on those 68k
    processors, in detail compared to 8086 processors of that time.

    Interesting. I'd have thought that on the 68K architecture you would
    always prefer using stack based frames, given its many addressing modes
    and address registers.

    A con of the GfA method was found in the later PC version that used the traditional and compatible way of storing local variables in the
    stackframe. This broke some legacy code, sooner or later, and I ended up
    with a set of subroutines that could not be converted into any other language. Unfortunately this was the tricky analysis of complex
    conditional expressions in IF (WHILE...) statements of my C decompiler

    So you depended on the "feature"? I don't quite understand how you
    managed to do that.

    I just noticed that in the TP manual, the second paragraph down on the
    page I quoted from actually says:
    "Because of this technique, variables local to a subprogram must not
    be used as var parameters in recursive calls."

    I quoted the TP manual, as at the time I couldn't find the original
    Danish COMPAS manual, and it was the "official" English translation.
    However, the corresponding page in the COMPAS 2.0 manual(COMPAS 3.0
    probably being equivalent to TP 1.0) does not have this caveat, so I
    guess Anders Hejlsberg had not noticed this problem with the save-restore method until then. Ah, the follies of youth. :-)

    Given these problems, I suppose the method was not deemed very "useful"
    by compiler designers. I still wonder if there are more examples of its
    use, and if there is a name for it, or maybe literature describing it.

    An "exercise" set that I guess I will try to figure out now, and which
    others might find entertaining:
    1. Is there a simple way to solve the problem, like figuring out where on
    the routine stack the variable passed by reference will be during the
    recusive call, and just pass that address instead?
    2. How does nested subroutines fit into the method? Should locals of a
    nested subroutine be considered part of the locals of the containing subroutine? What is the situation with the case of indirect recursion,
    where the inner calls its containing subroutine?
    3. Are there other fun ways this method can be used, exploited or abused?

    (Now where did I put my Z80 CP/M emulator? :-) )
    [This isn't the same thing, but in IBM 360 Fortran, the suroutine prolog copied scalar
    parameters into locals and epilog copied them out. It didn't have indirect addressing so that let it use the same base register as for the other locals. Fortran
    didn't allow recursive calls but if you used aliased arguments you could get confusing
    results. -John]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to All on Sat Oct 2 13:08:36 2021
    On Saturday, October 2, 2021 at 10:35:45 AM UTC-7, Lasse Hillerøe Petersen wrote:

    (snip)

    (Now where did I put my Z80 CP/M emulator? :-) )
    [This isn't the same thing, but in IBM 360 Fortran, the suroutine prolog copied scalar
    parameters into locals and epilog copied them out. It didn't have indirect addressing so that let it use the same base register as for the other locals. Fortran
    didn't allow recursive calls but if you used aliased arguments you could get confusing results. -John]

    This comes up in comp.lang.fortran every time someone mentions that Fortran uses pass-by-reference. Fortran still allows for this calling method.
    (If you put slashes around the dummy argument, OS/360 uses actual pass by reference.)

    Also, in some cases Fortran requires a contiguous array, such that a copy
    is made before a subroutine call, and copy back after return, in the case of
    a (potentially) discontiguous array.

    Since PL/I has internal procedures, a procedure pointer (ENTRY variable in
    PL/I terms) includes a reference to the appropriate set of variables
    available to callers of such. (And it even allows nested internal procedures.)

    Getting that right slowed down the addition of pointers to internal
    subroutines to the Fortran standard by many years.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Petersen on Sat Oct 2 20:45:43 2021
    Lasse =?iso-8859-1?q?Hiller=F8e?= Petersen <[email protected]> writes:
    I have been thinking about the old TurboPascal compiler (actually about
    its predecessor COMPAS Pascal, which I used long ago, but I believe the >original Z80 CP/M TurboPascal was exactly the same in this regard.)

    In the TP 2.0 manual it says:
    "The recursion stack is used only by recursive procedures and
    functions, i.e. procedures and functions compiled with the
    A compiler directive passive (~A-}). On entry to a recursive
    subprogram it copies its workspace onto the recursion stack,
    and on exit the entire workspace is restored to its original
    state. The default initial value of RecurPtr at the beginning
    of a program, is 1 K ($400) bytes below the CPU stack pointer."

    In other words, all procedures in a program have their own statically
    located "workspace", and when recursive calls are made, the previously >activation's local variables are saved away. I suppose this was practical
    on the Z80, because indexed operations were rather expensive and not
    relative to SP, using the separate IX and IY registers, whereas an LDIR >instruction was reasonably efficient.

    As far as I can tell, this works great even for a language with nested >lexical scope like Pascal.

    No, it does not give you lexical scoping. Try the man-or-boy test <http://rosettacode.org/wiki/Man_or_boy_test#Pascal>, or the following
    Lisp example:

    testr[x,p,f,u] <- if p[x] then f[x]
    else if atom[x] then u[]
    else testr[cdr[x],p,f,lambda:testr[car[x],p,f,u]].

    This is M-expression syntax for Lisp 2.0, which did not catch on;
    McCarthy gave this example (written by James R. Slagle) as the case
    that showed that Lisp's implementation of the time did not implement
    lexical scoping, which McCarthy had intended, and which this program
    relies on.

    When an inner procedure refers to a containing
    procedure's local variable, it will get the currently active instance.

    Which can be the wrong one.

    However, I think there would be an issue when passing procedure local >variables as VAR parameters in combination with some forms of mutual and/
    or nested recursion, as the address passed would refer to the right >workspace, but the workspace might contain the wrong activation instance.

    That is another problem, but note that the Lisp routine above uses call-by-value.

    In any case, I have never found any description of this style of scope >management, since I actually used this compiler back in the 80es, and I
    don't know if it even has a name?

    My guess is that in the early days many compilers used similar
    techniques, because many compilers failed the man-or-boy test. But
    given that it is incorrect as implementation technique for lexical
    scoping, I doubt that these techniques have been given names.

    Other than the copying (which may be considered inefficient, but only
    applies to recursive procedures) and the possible issue with passing >references ending up pointing to the wrong variable, what would be the
    pros and cons of this method?

    Apart from the lexical scoping problem, it is also not particularly
    efficient in data memory: It consumes space for all local variables of
    all functions at all times (plus the recursion stack), whereas
    implementations using activation frames only keep as many as are alive
    at the same time.

    - anton
    --
    M. Anton Ertl
    [email protected]
    http://www.complang.tuwien.ac.at/anton/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Hans-Peter Diettrich@21:1/5 to All on Sun Oct 3 02:06:36 2021
    On 10/2/21 12:38 PM, Lasse Hillerøe Petersen wrote:
    On Thu, 30 Sep 2021 11:15:32 +0200, Hans-Peter Diettrich wrote:

    I don't remember whether GfA Basic included pointers or distinct ref/val
    parameters. At least it was a damn fast framework on those 68k
    processors, in detail compared to 8086 processors of that time.

    Interesting. I'd have thought that on the 68K architecture you would
    always prefer using stack based frames, given its many addressing modes
    and address registers.

    Much 68k software, particularly compilers, was poorly ported 8 bit (CP/M?) code. E.g. one compiler only used the 68k address registers because "an
    int is a pointer, a pointer is an int". For expression evaluation the
    operands of each binary operation were copied from memory to the address registers (A0, A1), from there to the data registers (D0, D1) and
    afterwards back again, typically in subroutines at least for logical
    operators. This way a simple single statement C function generated 3
    pages of assembly listing, and that mess made me start the C decompiler development. At that time C was very new to me, and none of the
    compilers had ever heard about ANSI C. So I did all work in GfA Basic
    and was happy with it. After a few years I had a decompiler for
    executables and libraries of half a dozen C compilers, including Amiga
    and HP-UX.

    A con of the GfA method was found in the later PC version that used the
    traditional and compatible way of storing local variables in the
    stackframe. This broke some legacy code, sooner or later, and I ended up
    with a set of subroutines that could not be converted into any other
    language. Unfortunately this was the tricky analysis of complex
    conditional expressions in IF (WHILE...) statements of my C decompiler

    So you depended on the "feature"? I don't quite understand how you
    managed to do that.

    It was by accident, because at that time I couldn't know how it could be
    done otherwise. All my practical experience with homecomputers was Basic
    and machine language, having seen compilers (Algol, Cobol...) only as a
    user on the TR-440. At least the Gfa method was such a clean and
    efficient approach that Occams Razor...

    DoDi

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)