• ARM is channeling the IBM 360

    From John Savard@21:1/5 to All on Thu Jun 20 11:56:29 2024
    I saw this article about a memory-proitection mechanism on ARM that
    has been bypassed...

    https://www.techspot.com/news/103440-researchers-crack-arm-memory-safety-mechanism-achieve-95.html

    and I was struck by how similar it sounds to the memory keys used on
    the 360.

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Thu Jun 20 18:45:47 2024
    According to John Savard <[email protected]d>:
    I saw this article about a memory-proitection mechanism on ARM that
    has been bypassed...

    https://www.techspot.com/news/103440-researchers-crack-arm-memory-safety-mechanism-achieve-95.html

    and I was struck by how similar it sounds to the memory keys used on
    the 360.

    It's not that close. S/360 had a single key in the PSW that it matched against all of a program's storage refrences while this has the tag in a pointer, so it's more like a capability.

    The x86 protection keys are more like S/360. There's a key for each
    virtual page and a PKRU register that has to match.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From moi@21:1/5 to John Savard on Thu Jun 20 20:42:01 2024
    On 20/06/2024 18:56, John Savard wrote:
    I saw this article about a memory-proitection mechanism on ARM that
    has been bypassed...

    https://www.techspot.com/news/103440-researchers-crack-arm-memory-safety-mechanism-achieve-95.html

    and I was struck by how similar it sounds to the memory keys used on
    the 360.

    John Savard

    Or the 3 years earlier LEO 3.

    --
    Bill F.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to John Savard on Thu Jun 20 22:55:41 2024
    John Savard wrote:

    I saw this article about a memory-proitection mechanism on ARM that
    has been bypassed...

    https://www.techspot.com/news/103440-researchers-crack-arm-memory-safety-mechanism-achieve-95.html

    and I was struck by how similar it sounds to the memory keys used on
    the 360.

    Apparently good enough to 'help' debug, but insufficient to add any real security.
    probably "not enough bits"

    The 4 seconds part leads me to believe the attackers are using brute
    force, since
    in 4 seconds one can try something like 2^28 patterns.

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to John Levine on Thu Jun 20 14:28:04 2024
    John Levine <[email protected]> writes:
    It's not that close. S/360 had a single key in the PSW that it matched against
    all of a program's storage refrences while this has the tag in a pointer, so it's more like a capability.

    The x86 protection keys are more like S/360. There's a key for each
    virtual page and a PKRU register that has to match.

    360s, each 2kbytes had 4bit storage protect key .... match executing psw
    4kbit key to storage protect key. zero in psw 4kbit key reserved for
    system and allowing access all storage ... non-zero allowing for
    (isolating) up to 15 separated concurrently executing (mvt) regions.

    a little over decade ago was asked to track down decision to add virtual
    memory to all 370s. basically MVT storage management was so bad that
    required specifying region storage requirements four times larger than
    used ... limiting number of concurrently executing regions to less than
    number needed for keeping 1mbyte, 370/165 busy and justified. Going to
    single 16mbyte virtual memory (VS2/SVS) allowed increasing number of
    concurrent regions by factor of four (up to 15) with litte or no paging
    (sort of like running MVT in a CP67 16mbyte virtual machine). Biggest
    bit of code was creating a copy of passed channel (I/O) programs,
    substituting real addresses for virtual addresses (Ludlow borrows
    "CCWTRANS" from CP67, crafting into MVT EXCP/SVC0).

    trivia: 370/165 engineers started complaining they if they had to
    implement the full 370 virtual memory architecture, it would slip
    announce by six months ... so several features were dropped (including
    virtual memory segment table entry r/o flag, could have combination of different virtual address spaces sharing the same segment, some being
    r/w and some being r/o). Note: other models (& software) that
    implmeneted full architecture, had to drop back to 370/165 subset.

    370s were getting larger fast and increasingly needed more than 15
    concurrently executing regions (to keep systems busy and justified) and
    so transition to VS2/MVS, a different virtual address space for each
    region (isolating each region storage access in different virtual
    address space). However, it inherited os/360 pointer-passing APIs and so
    mapped an image of the "MVS" kernel image into eight mbytes of every
    virtual address space (leaving eight for application). Also "subsystems"
    were mapped into separate address spaces and (pointer passing API)
    needed to access application storage. Initially a common 1mbyte segment
    storage area was mapped into all address spaces (common segment
    area/"CSA"). However space requirements was somewhat proportional to
    number of subsystems and concurrently executing application and "CSA"
    quickly becomes "common system area").

    By 3033 time-frame CSA was frequently 5-6mbytes ... leaving 2-3mbytes
    for application regions (and threatening to becoming 8mbytes, leaving
    zero). This was part of mad rush to xa/370 ... special architecture
    features for MVS, including subsystems able to concurrently access
    multiple address spaces (a subset was eventually retrofitted to 3033 as "dual-address space mode").

    other trivia: in 70s, I was pontificating that there was increasing
    mismatch between disk throughput and system throughput. In early 80s I
    wrote a tome about relative system disk throughput had declined by an
    order of magnitude since os/360 announce (systems got 40-50 times
    faster, disks only got 3-5 times faster). Some disk executive took
    exception and assigned the division system performance group to refute
    the claim. After a couple weeks they came back and effectively said I
    had slightly understated the case. Their analysis was then turned into (mainframe user group) SHARE https://en.wikipedia.org/wiki/SHARE_Operating_System

    presentation about configurating disks for better system throughput
    (16Aug1984, SHARE 63, B874).

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lynn Wheeler on Fri Jun 21 13:27:29 2024
    Lynn Wheeler <[email protected]> writes:
    John Levine <[email protected]> writes:
    It's not that close. S/360 had a single key in the PSW that it matched against
    all of a program's storage refrences while this has the tag in a pointer, so >> it's more like a capability.

    The x86 protection keys are more like S/360. There's a key for each
    virtual page and a PKRU register that has to match.

    360s, each 2kbytes had 4bit storage protect key .... match executing psw >4kbit key to storage protect key. zero in psw 4kbit key reserved for
    system and allowing access all storage ... non-zero allowing for
    (isolating) up to 15 separated concurrently executing (mvt) regions.

    The storage protect key plays a key (pun intended) role in Thomas P. Ryan's _The Adolescence of P-1_. It also describes a timing related attack that yielded a key value of zero as part of the plot.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to [email protected] on Sat Jun 22 11:58:00 2024
    In article <[email protected]>, [email protected] (MitchAlsup1) wrote:

    The 4 seconds part leads me to believe the attackers are using brute
    force, since in 4 seconds one can try something like 2^28 patterns.

    Makes sense: there's probably a hash of the block address involved in generating those 4-bit tags, and they're brute-forcing the salt.

    Memory Tagging Extension doesn't seem to have been heavily used as yet,
    which may be why ARM aren't very worried about it.

    Pointer Authentication Code in ARMv8.3 is usable within functions, but
    has problems, in that compilers can't readily tell if stored pointers
    have PAC signatures.

    Branch Target Indicators in ARMv8.5 work quite well, but need linker and
    loader support.

    PAC and BTI use instructions that are no-ops in earlier versions of ARM64,
    but you really ought to test that they work before releasing binaries
    that include them. I've done that on Android, where I had the combination
    of hardware, OS and compiler to do it.

    Once I can get an ARM Linux server with the hardware support, Linux
    kernel 5.0 onwards and modern GCC will let me do it there. Windows and
    macOS on ARM both lack some pieces, and iOS requires you to change ABIs
    and drop support for older 64-bit devices.

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to John Dallman on Sat Jun 22 20:22:35 2024
    John Dallman wrote:

    In article <[email protected]>, [email protected] (MitchAlsup1) wrote:

    The 4 seconds part leads me to believe the attackers are using brute
    force, since in 4 seconds one can try something like 2^28 patterns.

    Makes sense: there's probably a hash of the block address involved in generating those 4-bit tags, and they're brute-forcing the salt.

    Memory Tagging Extension doesn't seem to have been heavily used as yet,
    which may be why ARM aren't very worried about it.

    Pointer Authentication Code in ARMv8.3 is usable within functions, but
    has problems, in that compilers can't readily tell if stored pointers
    have PAC signatures.

    Branch Target Indicators in ARMv8.5 work quite well, but need linker
    and
    loader support.

    Is one worried about branches within a single subroutine getting
    bonked (execute but no write permissions).

    If not, then disallowing the application code from writing GOT should
    solve the problem. The dynamic linker will not setup a GOT entry that
    is not an actual entry point allowed.

    But perhaps some kind of abuse of method calling would enable an entry
    point pointer to get bonked.....

    PAC and BTI use instructions that are no-ops in earlier versions of
    ARM64,
    but you really ought to test that they work before releasing binaries
    that include them. I've done that on Android, where I had the
    combination
    of hardware, OS and compiler to do it.

    Once I can get an ARM Linux server with the hardware support, Linux
    kernel 5.0 onwards and modern GCC will let me do it there. Windows and
    macOS on ARM both lack some pieces, and iOS requires you to change ABIs
    and drop support for older 64-bit devices.

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lynn Wheeler on Mon Jun 24 01:01:18 2024
    On Thu, 20 Jun 2024 14:28:04 -1000, Lynn Wheeler wrote:

    a little over decade ago was asked to track down decision to add virtual memory to all 370s. basically MVT storage management was so bad that
    required specifying region storage requirements four times larger than
    used ...

    Let me see if I got this straight.

    In the early days, the offshoots of OS/360 were “MFT” (“Multiple Fixed Tasks”) and “MVT” (“Multiple Variable Tasks”). The “Multiple” part had to
    do with this new-fangled thing called “multiprogramming”: basically, there was so much RAM available by that point that you could typically keep
    multiple programs memory-resident at once, even if only one was executing, instead of having to swap everybody but the current program out.

    The difference was, with MFT, a program had to declare its memory
    requirement before it could be started, and the only way to change that
    was to stop the program and start it again. Whereas MVT allowed a program
    to change its memory requirements while it was executing. (Whoah! Program relocation requirement styleee!)

    Then, when virtual memory came along, MFT became OS/VS1, while MVT became OS/VS2.

    Eventually, the MFT→OS/VS1 line died a long-overdue death, and OS/VS2
    became “MVS”. Not sure what other name changes happened along the way, but nowadays this is known as “z/OS”.

    Does that make sense?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to moi on Mon Jun 24 01:02:29 2024
    On Thu, 20 Jun 2024 20:42:01 +0100, moi wrote:

    Or the 3 years earlier LEO 3.

    Lyons Tea-Shops computers! (Yes, I have the Lavington book.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Mon Jun 24 02:52:05 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    The difference was, with MFT, a program had to declare its memory
    requirement before it could be started, and the only way to change that
    was to stop the program and start it again. Whereas MVT allowed a program
    to change its memory requirements while it was executing. (Whoah! Program >relocation requirement styleee!)

    Nope. MFT partitioned memory into fixed sized areas when the system
    started, MVT assigned each program as much memory as it said it
    needed, and the areas could be reallocated between job steps. In every
    case the JCL had to say how big a partition each job step needed.

    MFT II made it possible to change the partition sizes, but it was
    still manual, as opposed to MVT which allocated partitions as needed.

    Regardless of which flavor of OS you used, there was no way to
    relocate a program once it had been loaded into memory.
    Roll-out/roll-in was a primitive kind of swapping, but it swapped a
    program out and later back into the same place.

    Eventually, the MFT→OS/VS1 line died a long-overdue death, and OS/VS2 >became “MVS”. Not sure what other name changes happened along the way, but >nowadays this is known as “z/OS”.

    Does that make sense?

    Not really. VS1 was basically MFT running in a single virtual address
    space. The early versions of VS2 were SVS, MVT running in a single
    virtual address space, and then MVS, where each job got its own
    address space. As Lynn has often explained, OS chewed up so much of
    the address space that they needed MVS to make enough room for
    programs to keep doing useful work.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to John Levine on Sun Jun 23 17:46:11 2024
    John Levine <[email protected]> writes:
    Not really. VS1 was basically MFT running in a single virtual address
    space. The early versions of VS2 were SVS, MVT running in a single
    virtual address space, and then MVS, where each job got its own
    address space. As Lynn has often explained, OS chewed up so much of
    the address space that they needed MVS to make enough room for
    programs to keep doing useful work.

    ... also SVS single 16mbyte virtual address space (sort of like running
    MVT in CP67 16mbyte virtual machine) to "protect" regions from each
    other still used the 360 4bit storage protection key ... so caped at 15 concurrent regions ... but systems were getting faster, much faster than
    disks were getting faster ... so needed increasing numbers of
    concurrently executing regions ... so went to MVS ... gave each region
    its own virtual address space (to keep them isolated/protected from each other). But MVS was becoming increasingly bloated both in real storage
    and amount it took in each region's virtual address space .... so needed
    more than 16mbyte real storage as well as more than 16mbyte virtual
    storage.

    trivia: I was pontificating in the 70s about mismatch between increase
    in system throughput (memory & CPU) and increase in disk throughput. In
    early 80s wrote a tome that the relative system throughput of disk had
    declined by an order of magnitude since 360 was announced in the 60s
    (systems increase 40-50 times, disks increased 3-5 times). A disk
    division executive took exception and assigned the division performance
    group to refute my claims. After a couple of weeks, they basically came
    back and said that I had slightly understated the problem.

    They then respun the analysis for a (mainframe user group) SHARE
    presentation for how to configure disks for increased system throughput (16Aug1984, SHARE 63, B874).

    more recently there have been some references that cache-miss, memory
    access latency, when measured in count of processor cycles, is
    compareable to 60s disk access latency, when measure in count of 60s
    processor cycles (memory is new disk).

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lynn Wheeler on Mon Jun 24 04:27:08 2024
    On Sun, 23 Jun 2024 17:46:11 -1000, Lynn Wheeler wrote:

    trivia: I was pontificating in the 70s about mismatch between increase
    in system throughput (memory & CPU) and increase in disk throughput. In
    early 80s wrote a tome that the relative system throughput of disk had declined by an order of magnitude since 360 was announced in the 60s
    (systems increase 40-50 times, disks increased 3-5 times).

    How much of theoretical disk bandwidth was the filesystem capable of
    using? Because I know early Unix systems were pretty terrible in that
    regard, until Berkeley’s “Fast File System” came along.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Lynn Wheeler on Mon Jun 24 07:34:05 2024
    Lynn Wheeler wrote:

    John Levine <[email protected]> writes:
    Not really. VS1 was basically MFT running in a single virtual address
    space. The early versions of VS2 were SVS, MVT running in a single
    virtual address space, and then MVS, where each job got its own
    address space. As Lynn has often explained, OS chewed up so much of
    the address space that they needed MVS to make enough room for
    programs to keep doing useful work.

    ... also SVS single 16mbyte virtual address space (sort of like running
    MVT in CP67 16mbyte virtual machine) to "protect" regions from each
    other still used the 360 4bit storage protection key ... so caped at 15 concurrent regions ... but systems were getting faster, much faster than disks were getting faster ... so needed increasing numbers of
    concurrently executing regions ... so went to MVS ... gave each region
    its own virtual address space (to keep them isolated/protected from each other). But MVS was becoming increasingly bloated both in real storage
    and amount it took in each region's virtual address space .... so needed
    more than 16mbyte real storage as well as more than 16mbyte virtual
    storage.

    trivia: I was pontificating in the 70s about mismatch between increase
    in system throughput (memory & CPU) and increase in disk throughput. In
    early 80s wrote a tome that the relative system throughput of disk had declined by an order of magnitude since 360 was announced in the 60s
    (systems increase 40-50 times, disks increased 3-5 times). A disk
    division executive took exception and assigned the division performance
    group to refute my claims. After a couple of weeks, they basically came
    back and said that I had slightly understated the problem.

    They then respun the analysis for a (mainframe user group) SHARE
    presentation for how to configure disks for increased system throughput (16Aug1984, SHARE 63, B874).

    more recently there have been some references that cache-miss, memory
    access latency, when measured in count of processor cycles, is
    compareable to 60s disk access latency, when measure in count of 60s processor cycles (memory is new disk).

    Not only is RAM the new disk, but last level cache is the new RAM, and
    you could argue that $L1 plays the role of a vector computer register array.

    The result values forwarding network is the new register array.

    Yeah, the comparison does break down a bit at the very end, but going in
    the opposite direction, disk (of the spinning rust variety) is an almost perfect match for 60'ies tape: Getting to any particular spot takes a
    long time, so when you get there you had better do a lot of sequential
    access!

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Terje Mathisen on Mon Jun 24 05:45:17 2024
    On Mon, 24 Jun 2024 07:34:05 +0200, Terje Mathisen wrote:

    Not only is RAM the new disk ...

    That would be DRAM. Whatever happened to SRAM? Surely having a few
    mebibytes of that can’t be a big cost these days.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to John Levine on Mon Jun 24 08:13:17 2024
    John Levine <[email protected]> schrieb:
    According to Lawrence D'Oliveiro <[email protected]d>:
    The difference was, with MFT, a program had to declare its memory >>requirement before it could be started, and the only way to change that
    was to stop the program and start it again. Whereas MVT allowed a program >>to change its memory requirements while it was executing. (Whoah! Program >>relocation requirement styleee!)

    Nope. MFT partitioned memory into fixed sized areas when the system
    started, MVT assigned each program as much memory as it said it
    needed, and the areas could be reallocated between job steps. In every
    case the JCL had to say how big a partition each job step needed.

    When I first looked at the syntax for slurm, a workload manager
    for HPC clusters (did somebody say "Job Entry Subsystem"?),
    I found it striking how similar its syntax and semantics are to
    JCL's job cards. But then, they have a similar task.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Thomas Koenig on Mon Jun 24 08:34:15 2024
    On Mon, 24 Jun 2024 08:13:17 -0000 (UTC), Thomas Koenig wrote:

    When I first looked at the syntax for slurm, a workload manager for HPC clusters (did somebody say "Job Entry Subsystem"?),
    I found it striking how similar its syntax and semantics are to JCL's
    job cards. But then, they have a similar task.

    Comparison between slurm and other cluster managers here <https://slurm.schedmd.com/rosetta.html>.

    Wonder why JCL isn’t on that list ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to Terje Mathisen on Mon Jun 24 08:17:11 2024
    Terje Mathisen <[email protected]> schrieb:

    Not only is RAM the new disk, but last level cache is the new RAM, and
    you could argue that $L1 plays the role of a vector computer register array.

    The result values forwarding network is the new register array.

    These days, I tend to think as memory accesses as asynchronous I/O.

    Yeah, the comparison does break down a bit at the very end, but going in
    the opposite direction, disk (of the spinning rust variety) is an almost perfect match for 60'ies tape: Getting to any particular spot takes a
    long time, so when you get there you had better do a lot of sequential access!

    There is also room for an SSD in your comparison: It's the new disk.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Mon Jun 24 13:33:48 2024
    Lawrence D'Oliveiro <[email protected]d> writes:
    On Mon, 24 Jun 2024 07:34:05 +0200, Terje Mathisen wrote:

    Not only is RAM the new disk ...

    That would be DRAM. Whatever happened to SRAM? Surely having a few
    mebibytes of that can’t be a big cost these days.

    I'm not sure what a mebibyte is.

    SRAM is still relatively expensive, as it requires 6 times as many
    transisters per cell as DRAM. It's used on-chip for various
    purposes in certain SoCs, but not generally for bulk memory.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to Lurndal on Mon Jun 24 15:20:00 2024
    In article <0BeeO.13060$[email protected]>, [email protected] (Scott Lurndal) wrote:
    Lawrence D'Oliveiro <[email protected]d> writes:
    That would be DRAM. Whatever happened to SRAM? Surely having a few >mebibytes of that can't be a big cost these days.
    I'm not sure what a mebibyte is.

    A pedant's way of saying 1024*1024 bytes, as opposed to 1000*1000 bytes. <https://en.wikipedia.org/wiki/Byte#Units_based_on_powers_of_10>

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Mon Jun 24 17:13:32 2024
    Lawrence D'Oliveiro wrote:

    On Mon, 24 Jun 2024 07:34:05 +0200, Terje Mathisen wrote:

    Not only is RAM the new disk ...

    That would be DRAM. Whatever happened to SRAM? Surely having a few
    mebibytes of that can’t be a big cost these days.


    For most applications, that SRAM is best used as a cache. Multi
    megabyte caches made from SRAM are common.


    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Mon Jun 24 17:09:24 2024
    Lawrence D'Oliveiro wrote:

    On Thu, 20 Jun 2024 14:28:04 -1000, Lynn Wheeler wrote:

    a little over decade ago was asked to track down decision to add
    virtual memory to all 370s. basically MVT storage management was so
    bad that required specifying region storage requirements four times
    larger than used ...

    Let me see if I got this straight.


    You're close.



    In the early days, the offshoots of OS/360 were “MFT” (“Multiple
    Fixed Tasks”)


    "Multiprogramming with a Fixed number of Tasks"


    and “MVT” (“Multiple Variable Tasks”).


    "Multiprogramming with a Variable number of Tasks"



    The “Multiple”
    part had to do with this new-fangled thing called “multiprogramming”: basically, there was so much RAM available by that point that you
    could typically keep multiple programs memory-resident at once, even
    if only one was executing, instead of having to swap everybody but
    the current program out.


    Right. So you could make productive use of the CPU while one program
    was idle, typically waiting for an I/O.



    The difference was, with MFT, a program had to declare its memory
    requirement before it could be started, and the only way to change
    that was to stop the program and start it again. Whereas MVT allowed
    a program to change its memory requirements while it was executing.
    (Whoah! Program relocation requirement styleee!)


    Not quite. With MFT, the system operator defined the numbeer and size
    of each region of memory. For example, a site may have two partitions
    of 3MB and one of 6MB. These were rarely changed. It was the
    operator's responsibility to try to maximize system usage by starting
    jobs in the appropriate sized partitions. With MVT, there were no
    fixed sized partitions. The memory requirements for each job were
    specified by the job and again, it was the operator's responsibility to
    try to maximize usage.

    While a job could change its memory requirements with each task,IIRC it
    was better to specify the largest requirement in the beginning, and
    releasing memory as the largest remaining requirement got smaller.
    This was because if the memory requirement increased inn the middle of
    the job, you were not quaranteed that the additional memory was
    available, and if it wasn't, the job would have to be terminated or
    "rolled out" until the required memory was available. Note, as John
    explained, since programs were not relocatable, if a program was rolled
    out, it could only be rolled in to the same address it was rolled out
    from.




    Then, when virtual memory came along, MFT became OS/VS1, while MVT
    became OS/VS2.

    Eventually, the MFT→OS/VS1 line died a long-overdue death, and OS/VS2 became “MVS”. Not sure what other name changes happened along the
    way, but nowadays this is known as “z/OS”.


    Right.


    Does that make sense?

    Close.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to All on Mon Jun 24 17:59:31 2024
    Stephen Fuld wrote:


    snip


    Sorry for the self folllow up, but the next paragraph in my prewvious
    post is copnfusing at best.

    The no relocation issue only applied within a task, not between tasks.
    So, for example, if you had a job with three tasks, r.g. compile, link
    and execute", if the execute had larger memory requirements than say
    the link, the fjob might be delayed if an "execute sized" region wasn't available, but that region didn't have to be the same one as the link
    used.



    While a job could change its memory requirements with each task,IIRC
    it was better to specify the largest requirement in the beginning, and releasing memory as the largest remaining requirement got smaller.
    This was because if the memory requirement increased inn the middle of
    the job, you were not quaranteed that the additional memory was
    available, and if it wasn't, the job would have to be terminated or
    "rolled out" until the required memory was available. Note, as John explained, since programs were not relocatable, if a program was
    rolled out, it could only be rolled in to the same address it was
    rolled out from.


    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Tue Jun 25 04:04:31 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable of
    using? Because I know early Unix systems were pretty terrible in that
    regard, until Berkeley’s “Fast File System” came along.

    My recollection is that if you were using QSAM with multiple buffers
    and full track records it wasn't hard to keep the disk going at full
    speed. Later versions of OS do chained scheduling if you have enough
    buffers, doing several disk operations with one cnannel program.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Tue Jun 25 05:39:28 2024
    On Tue, 25 Jun 2024 04:04:31 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    How much of theoretical disk bandwidth was the filesystem capable of
    using? Because I know early Unix systems were pretty terrible in that
    regard, until Berkeley’s “Fast File System” came along.

    My recollection is that if you were using QSAM with multiple buffers and
    full track records it wasn't hard to keep the disk going at full speed.
    Later versions of OS do chained scheduling if you have enough buffers,
    doing several disk operations with one cnannel program.

    Presumably the downside of that was there was no such thing as “stream- oriented” I/O: it was all record-based, just like most of the other OSes.

    Unix was unique in hiding the need from applications/users to worry about sector sizes when writing to files and reading from files. But there was a significant overhead in that, at least in the early years.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Terje Mathisen on Tue Jun 25 07:00:07 2024
    On Tue, 25 Jun 2024 08:49:43 +0200, Terje Mathisen wrote:

    Even on (MS)DOS it was easy to saturate the hard drive from a single
    program, you just needed large enough (i.e. at least a full track each) buffers.

    That sounds more like a peak thing than a sustained thing. In between
    filling the buffers, the disk is left idle. So on average you are
    operating well below theoretical disk capacity.

    After all, MS-DOS never suppported async I/O.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to John Levine on Tue Jun 25 08:49:43 2024
    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable of
    using? Because I know early Unix systems were pretty terrible in that
    regard, until Berkeley’s “Fast File System” came along.

    My recollection is that if you were using QSAM with multiple buffers
    and full track records it wasn't hard to keep the disk going at full
    speed. Later versions of OS do chained scheduling if you have enough
    buffers, doing several disk operations with one cnannel program.

    Even on (MS)DOS it was easy to saturate the hard drive from a single
    program, you just needed large enough (i.e. at least a full track each) buffers.

    I did end up making special file/record layouts which were optimized for
    this, using exactly 4kB for each header+bitmap record.

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Terje Mathisen on Tue Jun 25 11:19:04 2024
    On Tue, 25 Jun 2024 08:49:43 +0200
    Terje Mathisen <[email protected]> wrote:

    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable
    of using? Because I know early Unix systems were pretty terrible
    in that regard, until Berkeley’s “Fast File System” came
    along.

    My recollection is that if you were using QSAM with multiple buffers
    and full track records it wasn't hard to keep the disk going at full
    speed. Later versions of OS do chained scheduling if you have enough buffers, doing several disk operations with one cnannel program.

    Even on (MS)DOS it was easy to saturate the hard drive from a single program, you just needed large enough (i.e. at least a full track
    each) buffers.


    I am not sure that "saturate the hard drive" is a correct wording.
    According to my understanding, [when within track] hard drives used in
    early PCs were more capable than hard disk controllers (Xebec 1210 in
    XT, I don't know what was used before XT). In turn, disk side interface
    of disk controller was likely more capable than its system bus side.
    Now, those are just feelings, I can't find hard data to back it up.

    I did end up making special file/record layouts which were optimized
    for this, using exactly 4kB for each header+bitmap record.

    Terje


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Lawrence D'Oliveiro on Tue Jun 25 12:35:59 2024
    Lawrence D'Oliveiro wrote:
    On Tue, 25 Jun 2024 08:49:43 +0200, Terje Mathisen wrote:

    Even on (MS)DOS it was easy to saturate the hard drive from a single
    program, you just needed large enough (i.e. at least a full track each)
    buffers.

    That sounds more like a peak thing than a sustained thing. In between
    filling the buffers, the disk is left idle. So on average you are
    operating well below theoretical disk capacity.

    What I measured was very close to the disk vendor's max spec for reading
    or writing the entire disk. Maybe they were a little bit more efficient
    than you imply, i.e. just a little bit of read-ahead caching would cover
    the latency between read requests. That said, I never measured the
    sustained write speed directly, only indirectly by the time it took PartitionMagic to restore an image or copy a disk.

    After all, MS-DOS never suppported async I/O.

    Not by the OS, no, but I certainly wrote asm code which used hybrid interrupt/polling to sustain max IO rate background transfers for both
    network and serial/parallel ports.

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Tue Jun 25 14:54:28 2024
    Lawrence D'Oliveiro wrote:

    On Tue, 25 Jun 2024 04:04:31 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    How much of theoretical disk bandwidth was the filesystem capable
    of >> using? Because I know early Unix systems were pretty terrible
    in that >> regard, until Berkeley’s “Fast File System” came along.

    My recollection is that if you were using QSAM with multiple
    buffers and full track records it wasn't hard to keep the disk
    going at full speed. Later versions of OS do chained scheduling if
    you have enough buffers, doing several disk operations with one
    cnannel program.

    Presumably the downside of that was there was no such thing as
    “stream- oriented” I/O: it was all record-based, just like most of
    the other OSes.

    For the business oriented and even scientific oriented applications
    typically run on S/360, "stream oriented" I/O would have been a
    distinct disadvantage. COBOL I/O (and even Fortran I/O) is "record
    oriented". "byte oriented" I/O would ha made things harder for the
    programer as well as much less efficient.


    Unix was unique in hiding the need from applications/users to worry
    about sector sizes when writing to files and reading from files. But
    there was a significant overhead in that, at least in the early years.


    There still is additional overhead. But the application amd
    programming language mix has changed, so stream oriented I/O is now the default.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Stephen Fuld on Tue Jun 25 14:59:11 2024
    "Stephen Fuld" <[email protected]d> writes:
    Lawrence D'Oliveiro wrote:

    Presumably the downside of that was there was no such thing as
    “stream- oriented” I/O: it was all record-based, just like most of
    the other OSes.

    For the business oriented and even scientific oriented applications
    typically run on S/360, "stream oriented" I/O would have been a
    distinct disadvantage. COBOL I/O (and even Fortran I/O) is "record >oriented". "byte oriented" I/O would ha made things harder for the
    programer as well as much less efficient.


    Unix was unique in hiding the need from applications/users to worry
    about sector sizes when writing to files and reading from files. But
    there was a significant overhead in that, at least in the early years.


    There still is additional overhead. But the application amd
    programming language mix has changed, so stream oriented I/O is now the >default.

    Is there? It's trivially easy to impose a record structure
    on a stream of bytes, and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    And even on the mainframes, the records were blocked and read
    in larger chunks and the application deblocked them.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From EricP@21:1/5 to Michael S on Tue Jun 25 11:24:03 2024
    Michael S wrote:
    On Tue, 25 Jun 2024 08:49:43 +0200
    Terje Mathisen <[email protected]> wrote:

    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable
    of using? Because I know early Unix systems were pretty terrible
    in that regard, until Berkeley’s “Fast File System” came
    along.
    My recollection is that if you were using QSAM with multiple buffers
    and full track records it wasn't hard to keep the disk going at full
    speed. Later versions of OS do chained scheduling if you have enough
    buffers, doing several disk operations with one cnannel program.
    Even on (MS)DOS it was easy to saturate the hard drive from a single
    program, you just needed large enough (i.e. at least a full track
    each) buffers.


    I am not sure that "saturate the hard drive" is a correct wording.
    According to my understanding, [when within track] hard drives used in
    early PCs were more capable than hard disk controllers (Xebec 1210 in
    XT, I don't know what was used before XT). In turn, disk side interface
    of disk controller was likely more capable than its system bus side.
    Now, those are just feelings, I can't find hard data to back it up.

    I did end up making special file/record layouts which were optimized
    for this, using exactly 4kB for each header+bitmap record.

    Terje

    According to this the interface to the ST506/ST412 drives from early 1980's could handle 5 Mb/s and the avg track seek time was 170 ms (later 85 ms).
    The 8-bit PC bus was 8 MB/s so there should be no technical reason for
    not keeping *multiple* such drives fully busy (seeking or transferring)
    while concurrently executing code.

    https://en.wikipedia.org/wiki/ST-506#History

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Stephen Fuld on Tue Jun 25 15:41:59 2024
    "Stephen Fuld" <[email protected]d> writes:
    Scott Lurndal wrote:


    And even on the mainframes, the records were blocked and read
    in larger chunks and the application deblocked them.

    Sure. But lets use a typical example of reading 80 column card images >blocked 10, so an 800 byte disk record (remember, this is IBM, CKD type >disks, so really 800 byte records on the disk). So for ten out of
    every eleven reads, the read verb just meant a single load of the base >address of the next record in the block.

    Same thing on Unix. Read 800 bytes at a time and treat it as 10
    80-byte records.

    Modern unix/linux systems will anticipate sequential accesses
    and do anticipatory loads such that data is already in memory
    when the next "record" is needed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Scott Lurndal on Tue Jun 25 15:30:28 2024
    Scott Lurndal wrote:

    "Stephen Fuld" <[email protected]d> writes:
    Lawrence D'Oliveiro wrote:

    Presumably the downside of that was there was no such thing as
    “stream- oriented” I/O: it was all record-based, just like
    most of >> the other OSes.

    For the business oriented and even scientific oriented applications typically run on S/360, "stream oriented" I/O would have been a
    distinct disadvantage. COBOL I/O (and even Fortran I/O) is "record oriented". "byte oriented" I/O would ha made things harder for the programer as well as much less efficient.


    Unix was unique in hiding the need from applications/users to worry
    about sector sizes when writing to files and reading from files.
    But >> there was a significant overhead in that, at least in the
    early years.


    There still is additional overhead. But the application amd
    programming language mix has changed, so stream oriented I/O is now
    the default.

    Is there? It's trivially easy to impose a record structure
    on a stream of bytes,


    Sure. But you have taken a (typically now 4K) record structure on the
    disk, treated it as a stream of bytes, then aggregating groupsof those
    bytes into records. This, while not a huge overhead, is *some*
    additional overhead.




    and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    Yes. Thus bypassing the overhead of the "stream of bytes" illusion.



    And even on the mainframes, the records were blocked and read
    in larger chunks and the application deblocked them.

    Sure. But lets use a typical example of reading 80 column card images
    blocked 10, so an 800 byte disk record (remember, this is IBM, CKD type
    disks, so really 800 byte records on the disk). So for ten out of
    every eleven reads, the read verb just meant a single load of the base
    address of the next record in the block.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to John Levine on Tue Jun 25 08:33:39 2024
    John Levine <[email protected]> writes:
    My recollection is that if you were using QSAM with multiple buffers
    and full track records it wasn't hard to keep the disk going at full
    speed. Later versions of OS do chained scheduling if you have enough
    buffers, doing several disk operations with one cnannel program.

    When 360/67 was delivered to univ. I was hired fulltime responsible for
    OS/360 (tss/360 never came to production). Initially student fortran
    jobs ran over a minute (had run under a second on 709 tape->tape). I
    installed HASP which cuts the time in half. I then started redoing
    OS/360 SYSGEN to carefully place SYSTEM datasets and PDS (program
    library) members to optimize arm seek and multi-track search (channel
    program used to searc PDS directory for member location) cutting another
    2/3rds to 12.9secs. Student Fortran never got better than 709 until I
    installed Univ. of Waterloo WATFOR.

    when CP67 was 1st delivered to univ (3rd installation after cambridge
    itself and MIT lincoln labs), all I/O was FIFO and page I/O was single
    4k page at time. CMS filesystem was 800 byte blocks and was usually
    single block transfer per channel program ... however if loading a
    program image and had been allocated contiguous sequential, it would
    transfer up to track worth in single channel program.

    I redid disk I/O to ordered seek and redid page I/O to maximize page
    transfers per revolution (at same arm position). For 2301 fixed-head
    (paging) drum I got it from max around 70 4k/sec to peak of 270 4k/sec
    (max transfers per 2301 revolution).

    There was problem with CMS filesystem that pretty much did scatter
    allocate (CMS sort of shared some CTSS heritage with UNIX going back
    through MULTICS) ... so it was rare file that it happened to have any sequentially allocated, contiguous records. Shortly after graduating and joining science center ... and seeing what multics was doing on flr
    above (science center was on 4th flr 545tech sq, multics was on the
    5flr), I modified CMS filesystem to be 4k records and use a paged mapped
    API (and the underneath page I/O support would order for maximum
    transfers per revolution) ... and I also added to CMS program image
    generation, an attempt to maximize contiguous allocation, which could
    result close to max transfers/revolution in single channel program (when loading program).

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to EricP on Wed Jun 26 01:00:42 2024
    On Tue, 25 Jun 2024 11:24:03 -0400
    EricP <[email protected]> wrote:

    Michael S wrote:
    On Tue, 25 Jun 2024 08:49:43 +0200
    Terje Mathisen <[email protected]> wrote:

    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable
    of using? Because I know early Unix systems were pretty terrible
    in that regard, until Berkeley’s “Fast File System” came
    along.
    My recollection is that if you were using QSAM with multiple
    buffers and full track records it wasn't hard to keep the disk
    going at full speed. Later versions of OS do chained scheduling
    if you have enough buffers, doing several disk operations with
    one cnannel program.
    Even on (MS)DOS it was easy to saturate the hard drive from a
    single program, you just needed large enough (i.e. at least a full
    track each) buffers.


    I am not sure that "saturate the hard drive" is a correct wording. According to my understanding, [when within track] hard drives used
    in early PCs were more capable than hard disk controllers (Xebec
    1210 in XT, I don't know what was used before XT). In turn, disk
    side interface of disk controller was likely more capable than its
    system bus side. Now, those are just feelings, I can't find hard
    data to back it up.
    I did end up making special file/record layouts which were
    optimized for this, using exactly 4kB for each header+bitmap
    record.

    Terje

    According to this the interface to the ST506/ST412 drives from early
    1980's could handle 5 Mb/s and the avg track seek time was 170 ms
    (later 85 ms).

    https://en.wikipedia.org/wiki/ST-506#History


    Thank you. I didn't realize that PC HDs of early 80s were *that*
    slow.

    The 8-bit PC bus was 8 MB/s so there should be no
    technical reason for not keeping *multiple* such drives fully busy
    (seeking or transferring) while concurrently executing code.

    8 MB/s sounds fast. I think, even for XT 2 MB/s is more realistic. Less
    than that for original PC.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Wed Jun 26 03:07:56 2024
    According to Scott Lurndal <[email protected]>:
    There still is additional overhead. But the application amd
    programming language mix has changed, so stream oriented I/O is now the >>default.

    Is there? It's trivially easy to impose a record structure
    on a stream of bytes, and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    At the time Unix was written, the big advanage of byte stream text was
    that there weren't any trailing blanks so you fit a lot more data onto
    a disk than you did with 80 or 132 character fixed records. I realize
    that mainframe disk files can have variable length records but it's
    not my impression that they were ever very popular.

    These days although Unix-ish systems still let you read or write
    arbitrary text chunks, in practice everyone uses I/O libraries that do
    reads and writes aligned to disk blocks, or map tthe file and let the
    pager handle it. The logical file format is an application thing on
    top of that.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Stephen Fuld on Wed Jun 26 05:55:30 2024
    On Tue, 25 Jun 2024 22:28:40 -0700, Stephen Fuld wrote:

    But if you are using the "standard" Unix file system, doesn't it read
    the block(S) into its cache, then when the user program does the read, transfer the data from its cache into the user's variables? That is the extra overhead to which I was referring.

    That tends to happen anyway, even on OSes which insist on record-oriented
    I/O. For example, on DEC’s VMS, the record blocking layer is called “RMS” (“Record Management Services”), and that usually copies records between
    the user’s buffers and its own internal buffers (“move mode”). It is possible to request “locate mode”, where it returns the address of a
    record directly within its internal buffers. But there are many
    restrictions on this, among other things:

    * It only works for reads, not for writes
    * It doesn’t work for records crossing block boundaries
    * It doesn’t work for compressed records

    So this record-copying overhead is not, in itself, a point against Unix-
    style streaming I/O.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Scott Lurndal on Tue Jun 25 22:28:40 2024
    On 6/25/2024 8:41 AM, Scott Lurndal wrote:
    "Stephen Fuld" <[email protected]d> writes:
    Scott Lurndal wrote:


    And even on the mainframes, the records were blocked and read
    in larger chunks and the application deblocked them.

    Sure. But lets use a typical example of reading 80 column card images
    blocked 10, so an 800 byte disk record (remember, this is IBM, CKD type
    disks, so really 800 byte records on the disk). So for ten out of
    every eleven reads, the read verb just meant a single load of the base
    address of the next record in the block.

    Same thing on Unix. Read 800 bytes at a time and treat it as 10
    80-byte records.

    But if you are using the "standard" Unix file system, doesn't it read
    the block(S) into its cache, then when the user program does the read,
    transfer the data from its cache into the user's variables? That is the
    extra overhead to which I was referring.


    Modern unix/linux systems will anticipate sequential accesses
    and do anticipatory loads such that data is already in memory
    when the next "record" is needed.

    Sure. But isn't it in the file system cache, so you still have to
    transfer the data to the user program's variables?


    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Michael S on Wed Jun 26 08:35:36 2024
    Michael S wrote:
    On Tue, 25 Jun 2024 11:24:03 -0400
    EricP <[email protected]> wrote:

    Michael S wrote:
    On Tue, 25 Jun 2024 08:49:43 +0200
    Terje Mathisen <[email protected]> wrote:

    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable
    of using? Because I know early Unix systems were pretty terrible
    in that regard, until Berkeley’s “Fast File System” came
    along.
    My recollection is that if you were using QSAM with multiple
    buffers and full track records it wasn't hard to keep the disk
    going at full speed. Later versions of OS do chained scheduling
    if you have enough buffers, doing several disk operations with
    one cnannel program.
    Even on (MS)DOS it was easy to saturate the hard drive from a
    single program, you just needed large enough (i.e. at least a full
    track each) buffers.


    I am not sure that "saturate the hard drive" is a correct wording.
    According to my understanding, [when within track] hard drives used
    in early PCs were more capable than hard disk controllers (Xebec
    1210 in XT, I don't know what was used before XT). In turn, disk
    side interface of disk controller was likely more capable than its
    system bus side. Now, those are just feelings, I can't find hard
    data to back it up.
    I did end up making special file/record layouts which were
    optimized for this, using exactly 4kB for each header+bitmap
    record.

    Terje

    According to this the interface to the ST506/ST412 drives from early
    1980's could handle 5 Mb/s and the avg track seek time was 170 ms
    (later 85 ms).

    https://en.wikipedia.org/wiki/ST-506#History


    Thank you. I didn't realize that PC HDs of early 80s were *that*
    slow.

    The 8-bit PC bus was 8 MB/s so there should be no
    technical reason for not keeping *multiple* such drives fully busy
    (seeking or transferring) while concurrently executing code.

    8 MB/s sounds fast. I think, even for XT 2 MB/s is more realistic. Less
    than that for original PC.

    The original 8-bit PC ISA bus was 1 byte per 4 CPU clocks, so a bit more
    than 1MB/s.

    I don't think I did any serious disk benchmarking before the AT days
    however (i.e. 1984++), and by that time we had 6-10 MHz 286 cpus on a
    16-bit bus.

    I installed my first Novell Netware 2.0a (or similar) in 1986 or '87, on
    a 286 machine with a 30 MB (afair) hard drive.

    Terje


    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From EricP@21:1/5 to Michael S on Wed Jun 26 08:29:06 2024
    Michael S wrote:
    On Tue, 25 Jun 2024 11:24:03 -0400
    EricP <[email protected]> wrote:

    Michael S wrote:
    On Tue, 25 Jun 2024 08:49:43 +0200
    Terje Mathisen <[email protected]> wrote:

    John Levine wrote:
    According to Lawrence D'Oliveiro <[email protected]d>:
    How much of theoretical disk bandwidth was the filesystem capable
    of using? Because I know early Unix systems were pretty terrible
    in that regard, until Berkeley’s “Fast File System” came
    along.
    My recollection is that if you were using QSAM with multiple
    buffers and full track records it wasn't hard to keep the disk
    going at full speed. Later versions of OS do chained scheduling
    if you have enough buffers, doing several disk operations with
    one cnannel program.
    Even on (MS)DOS it was easy to saturate the hard drive from a
    single program, you just needed large enough (i.e. at least a full
    track each) buffers.

    I am not sure that "saturate the hard drive" is a correct wording.
    According to my understanding, [when within track] hard drives used
    in early PCs were more capable than hard disk controllers (Xebec
    1210 in XT, I don't know what was used before XT). In turn, disk
    side interface of disk controller was likely more capable than its
    system bus side. Now, those are just feelings, I can't find hard
    data to back it up.
    I did end up making special file/record layouts which were
    optimized for this, using exactly 4kB for each header+bitmap
    record.

    Terje
    According to this the interface to the ST506/ST412 drives from early
    1980's could handle 5 Mb/s and the avg track seek time was 170 ms
    (later 85 ms).

    https://en.wikipedia.org/wiki/ST-506#History


    Thank you. I didn't realize that PC HDs of early 80s were *that*
    slow.

    The 8-bit PC bus was 8 MB/s so there should be no
    technical reason for not keeping *multiple* such drives fully busy
    (seeking or transferring) while concurrently executing code.

    8 MB/s sounds fast. I think, even for XT 2 MB/s is more realistic. Less
    than that for original PC.

    That would most likely be due to DRAM speed and memory controller.
    Looking at a 1980 Motorola Memory Book and it has
    4kb, 16kb and 64kb DRAMs. The speed range from 500 ns to 120 ns.

    The 64kb 150 ns drams would have been very expensive,
    and might not even be available at all if some big company
    buys up all the supply (that could happen back then).

    Wikipedia says original PC used Mostek 4116-compatible 16kb,
    but they likely would be slower to lower cost, so 300 ns.

    A TTL DRAM memory controller would add, oh say, 200 ns on top of that
    (just the XOR to match the memory address cost 20 ns).

    So a memory 1 byte wide, 500 ns access time = 2MB/s.

    (Its a bit of a wasteful design because a memory controller for
    1 byte wide or 4 bytes wide costs about the same, but 4 bytes wide
    can handle the full 8 MB/s sequential DMA bus speed. But that would
    also force you to use memory chips in multiples of 32 rather than 8,
    increasing base cost.)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to John Levine on Wed Jun 26 15:42:43 2024
    John Levine <[email protected]> writes:
    According to Scott Lurndal <[email protected]>:
    There still is additional overhead. But the application amd
    programming language mix has changed, so stream oriented I/O is now the >>>default.

    Is there? It's trivially easy to impose a record structure
    on a stream of bytes, and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    At the time Unix was written, the big advantage of byte stream text was
    that there weren't any trailing blanks so you fit a lot more data onto
    a disk than you did with 80 or 132 character fixed records.

    That was one advantage. Consider, however, that byte streams fit
    quite well into the unix pipeline paradigm and that may have been
    the driving force behind the non-record-based I/O facilities in
    Unix, and allowed record-based or block-based file handling if
    the application desired it (albeit without the complexity of
    e.g. DEC RMS or the obtuse record and disk management IBM crap.

    Burroughs systems, like IBM, were primary record based, but
    with real underlying filesystems more in the RMS family
    than in the PDS/track/CKD IBM (crap is putting it kindly).



    These days although Unix-ish systems still let you read or write
    arbitrary text chunks, in practice everyone uses I/O libraries that do

    I've been writing unix and unix-like operating systems for almost
    four decades now. I would disagree with the "everyone uses I/O libraries"; while stdio provides buffering, high-performance applications such
    as the Oracle RDMB don't use stdio - they directly access the device
    (unix raw devices, or linux O_DIRECT) and handle all their own buffering.

    reads and writes aligned to disk blocks, or map tthe file and let the
    pager handle it. The logical file format is an application thing on
    top of that.

    As I said, it's trivially easy to impose a structure on a stream of
    bytes in the application.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Stephen Fuld on Wed Jun 26 15:49:11 2024
    Stephen Fuld <[email protected]d> writes:
    On 6/25/2024 8:41 AM, Scott Lurndal wrote:
    "Stephen Fuld" <[email protected]d> writes:
    Scott Lurndal wrote:


    And even on the mainframes, the records were blocked and read
    in larger chunks and the application deblocked them.

    Sure. But lets use a typical example of reading 80 column card images
    blocked 10, so an 800 byte disk record (remember, this is IBM, CKD type
    disks, so really 800 byte records on the disk). So for ten out of
    every eleven reads, the read verb just meant a single load of the base
    address of the next record in the block.

    Same thing on Unix. Read 800 bytes at a time and treat it as 10
    80-byte records.

    But if you are using the "standard" Unix file system,

    There are dozens of Unix file systems (S5 (AT&T), UFS (BSD), vxfs (Tolerant/Veritas)
    and in modern linux ext2, ext3, ext4, btrfs, zfs, xfs, vxfs
    and a bunch more. The filesystem controls the placement of
    data on a device. The underlying filesystem is completely
    invisible to the application which uses a standard set of
    APIs to access the data in a file.

    The legacy API is called stdio and as you note, will buffer
    the blocks in the application and the access APIs will manage
    the buffers. This is basically the same layering that the
    Burroughs MCP provided as well as DEC RMS where they
    buffered the blocks in application buffers and extracted
    the records upon demand from the application.

    Unix API's such as mmap() will map the file data directly into
    the application address space with no additional buffering.

    Likewise for raw device accesses (on traditional linux) or
    O_DIRECT (linux) - there is no additional buffering or data
    movement required.

    So if an application needs direct access without any buffering,
    there are multiple means to accomplish that.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Wed Jun 26 16:04:17 2024
    According to Scott Lurndal <[email protected]>:
    Is there? It's trivially easy to impose a record structure
    on a stream of bytes, and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    At the time Unix was written, the big advantage of byte stream text was >>that there weren't any trailing blanks so you fit a lot more data onto
    a disk than you did with 80 or 132 character fixed records.

    That was one advantage. Consider, however, that byte streams fit
    quite well into the unix pipeline paradigm and that may have been
    the driving force behind the non-record-based I/O facilities in
    Unix,...

    Pipelines work equally well with fixed length records. They were
    familar with byte streams from PDP-10, Multics, and other systems
    they'd worked with.

    These days although Unix-ish systems still let you read or write
    arbitrary text chunks, in practice everyone uses I/O libraries that do

    I've been writing unix and unix-like operating systems for almost
    four decades now. I would disagree with the "everyone uses I/O libraries"; >while stdio provides buffering, high-performance applications such
    as the Oracle RDMB don't use stdio - they directly access the device
    (unix raw devices, or linux O_DIRECT) and handle all their own buffering.

    Yes, of course databases do their own I/O, designed to be aware of the
    disk structure. Some I've seen use files, some want to take over the
    entire disk partition and read and write the special file.
    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to statement to get a buffer to on Wed Jun 26 16:12:58 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    possible to request “locate mode”, where it returns the address of a >record directly within its internal buffers. But there are many
    restrictions on this, among other things:

    * It only works for reads, not for writes

    It works fine for writes on IBM systems. PL/I even has a LOCATE
    statement to get a buffer to write into.

    * It doesn’t work for records crossing block boundaries

    The PL/I programmer's guide has a note saying that if
    you use spanned records, it does an internal copy so
    don't bother. It "works" but it doesn't help.

    * It doesn’t work for compressed records

    Indeed.

    In any event this was all decades ago. These days most systems do the
    disk I/O through the pager so it's all locate mode all the time.
    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Wed Jun 26 16:25:35 2024
    Lawrence D'Oliveiro wrote:

    On Tue, 25 Jun 2024 22:28:40 -0700, Stephen Fuld wrote:

    But if you are using the "standard" Unix file system, doesn't it
    read the block(S) into its cache, then when the user program does
    the read, transfer the data from its cache into the user's
    variables? That is the extra overhead to which I was referring.

    That tends to happen anyway, even on OSes which insist on
    record-oriented I/O. For example, on DEC’s VMS, the record blocking
    layer is called “RMS” (“Record Management Services”), and that usually copies records between the user’s buffers and its own
    internal buffers (“move mode”). It is possible to request “locate mode”, where it returns the address of a record directly within its internal buffers. But there are many restrictions on this, among
    other things:

    * It only works for reads, not for writes
    * It doesn’t work for records crossing block boundaries
    * It doesn’t work for compressed records

    So this record-copying overhead is not, in itself, a point against
    Unix- style streaming I/O.


    The fact that other OSs "made the same mistake" :-) isn't a point for
    treating all I/O as a stream of bytes. I don't know VAX, but I don't understand why not for writes. The no crossing block boundries is a
    side effect of fixed block disks. This couldn't happen in OS/360 with
    CKD disks. I agree about compression, of course, as unless you do the compression in the I/O hardware stream, you need to "move" the data
    anyway.




    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Wed Jun 26 19:51:34 2024
    According to Stephen Fuld <[email protected]d>:
    understand why not for writes. The no crossing block boundries is a
    side effect of fixed block disks. This couldn't happen in OS/360 with
    CKD disks.

    Actually, it did. OS had a Variable Block Spanned record format that
    could split a logical record over several physical disk blocks. I don't
    think it was very widely used, but it's still there in z/OS:

    https://www.ibm.com/docs/en/zos/3.1.0?topic=SSLTBW_3.1.0/com.ibm.zos.v3r1.cbcpx01/spanned.html

    This doesn't mean it was a particularly good idea as implemented, of course.

    I see that VSAM now has what they call linear datasets, which we would call normal block files, which are used for Data in Virtual which we call memory mapped files:

    https://www.ibm.com/docs/en/zos/3.1.0?topic=guide-data-in-virtual
    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to John Levine on Wed Jun 26 21:50:13 2024
    John Levine wrote:

    According to Stephen Fuld <[email protected]d>:
    understand why not for writes. The no crossing block boundries is a
    side effect of fixed block disks. This couldn't happen in OS/360
    with CKD disks.

    Actually, it did. OS had a Variable Block Spanned record format that
    could split a logical record over several physical disk blocks. I
    don't think it was very widely used, but it's still there in z/OS:


    https://www.ibm.com/docs/en/zos/3.1.0?topic=SSLTBW_3.1.0/com.ibm.zos.v3r1.cbcpx01/spanned.html


    Thanks John. I had never heard about that.




    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Wed Jun 26 21:22:33 2024
    On Wed, 26 Jun 2024 16:12:58 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    possible to request “locate mode”, where it returns the address of a >>record directly within its internal buffers. But there are many >>restrictions on this, among other things:

    * It only works for reads, not for writes

    It works fine for writes on IBM systems. PL/I even has a LOCATE
    statement to get a buffer to write into.

    I found this manual <http://bitsavers.trailing-edge.com/pdf/ibm/360/tss/GC28-2045-1_Time_Sharing_System_PLI_Language_Reference_Manual.pdf>.
    On page 96, it says

    The LOCATE Statement
    --------------------

    The LOCATE statement can be used only with a BUFFERED OUTPUT
    SEQUENTIAL or TRANSIENT file. (Note: A program that uses a
    TRANSIENT file cannot be executed on TSS/360.) It allocates
    storage within an output buffer for a based variable, setting a
    pointer to the location in the buffer as it does so. This pointer
    can then be used to refer to the allocation so that data can be
    moved into the buffer. The record is written out automatically,
    during executiong of a subsequent WRITE or LOCATE statement for
    the file, or when the file is closed.

    So it has its own limitations on applicability. For example, it only
    seems to work for writes, not reads.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Thu Jun 27 01:40:29 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    possible to request “locate mode”, ...
    * It only works for reads, not for writes

    It works fine for writes on IBM systems. PL/I even has a LOCATE
    statement to get a buffer to write into.

    I found this manual ><http://bitsavers.trailing-edge.com/pdf/ibm/360/tss/GC28-2045-1_Time_Sharing_System_PLI_Language_Reference_Manual.pdf>.
    On page 96, it says

    The LOCATE Statement
    --------------------
    ...

    So it has its own limitations on applicability. For example, it only
    seems to work for writes, not reads.

    Well, yeah. Keep reading and on page 97 you'll find the SET option
    which is how a you get the pointer for a locate mode READ. It was more
    common when defining based data to say in the declaration what the
    default pointer to the data was, so if you said to READ or LOCATE the
    item, it'd automatically set the pointer.

    That TSS manual is rather estoteric since hardly anyone ever used TSS
    and even those of use who did, didn't write much PL/I.

    Here's the regular PL/I manual which explains locate mode for
    reading and writing on page 168.

    https://bitsavers.org/pdf/ibm/360/pli/C28-8201-1_PLIrefMan_Jan69.pdf


    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Stephen Fuld on Thu Jun 27 01:13:23 2024
    Stephen Fuld wrote:

    Lawrence D'Oliveiro wrote:

    On Tue, 25 Jun 2024 22:28:40 -0700, Stephen Fuld wrote:

    But if you are using the "standard" Unix file system, doesn't it
    read the block(S) into its cache, then when the user program does
    the read, transfer the data from its cache into the user's
    variables? That is the extra overhead to which I was referring.

    That tends to happen anyway, even on OSes which insist on
    record-oriented I/O. For example, on DEC’s VMS, the record blocking
    layer is called “RMS” (“Record Management Services”), and that
    usually copies records between the user’s buffers and its own
    internal buffers (“move mode”). It is possible to request “locate
    mode”, where it returns the address of a record directly within its
    internal buffers. But there are many restrictions on this, among
    other things:

    * It only works for reads, not for writes
    * It doesn’t work for records crossing block boundaries
    * It doesn’t work for compressed records

    So this record-copying overhead is not, in itself, a point against
    Unix- style streaming I/O.


    The fact that other OSs "made the same mistake" :-) isn't a point for treating all I/O as a stream of bytes. I don't know VAX, but I don't understand why not for writes. The no crossing block boundries is a
    side effect of fixed block disks. This couldn't happen in OS/360 with
    CKD disks. I agree about compression, of course, as unless you do the compression in the I/O hardware stream, you need to "move" the data
    anyway.

    I had a job at S.E.L where I was assigned the task of reading the
    intermediate
    representation of the Mary 2 compiler (Ivan's). Out FORTRAN (66+) had
    record
    oriented I/O that was notoriously slow if you simply wanted stream I/O.

    So, after pondering this for a day or two, I wrote up a "streamifier"
    a front end to the record oriented FORTRAN I/O. And provided back end
    callers essentially a getc() putc() interface to record oriented files.

    The system only allowed a use program to have 6 files open at any point
    in
    time, and since I did not know how man files I might need, I allocated a
    20 element file control block structure, which contained the file names
    that SW thinks are open, and their current status and point.

    Then I build a cache 16-pages of memory and tagged the lines with the
    file
    identifier,...

    When I took a cache miss in the table, I would look through the open
    files
    list to see if I needed to close one file down and open a different file
    up. Perform some I/O requests, and allow caller to munge through
    streaming
    data.

    This SW package would handle large number of virtually open files, and
    enable stream level access significantly faster than the record oriented assembly code file management in the OS. Something like 5× faster--but
    it
    has been 40 years.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to John Levine on Thu Jun 27 05:34:27 2024
    John Levine <[email protected]> schrieb:
    According to Scott Lurndal <[email protected]>:
    Is there? It's trivially easy to impose a record structure
    on a stream of bytes, and all the unix file access APIs allow
    direct access to the device without extra buffering if desired,
    subject to some alignment constraints.

    At the time Unix was written, the big advantage of byte stream text was >>>that there weren't any trailing blanks so you fit a lot more data onto
    a disk than you did with 80 or 132 character fixed records.

    That was one advantage. Consider, however, that byte streams fit
    quite well into the unix pipeline paradigm and that may have been
    the driving force behind the non-record-based I/O facilities in
    Unix,...

    Pipelines work equally well with fixed length records. They were
    familar with byte streams from PDP-10, Multics, and other systems
    they'd worked with.

    Did these systems actually feature byte streams? Both the PDP-10
    and the GE 635 were word-oriented machines with 36 bits, so I
    assume it must have been word streams at least.

    But a quick search didn't turn up details, so I may be wrong.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to Lurndal on Thu Jun 27 09:14:00 2024
    In article <TFWeO.51161$[email protected]>, [email protected] (Scott Lurndal) wrote:

    Consider, however, that byte streams fit quite well into the
    unix pipeline paradigm and that may have been the driving force
    behind the non-record-based I/O facilities in Unix, and allowed
    record-based or block-based file handling if the application
    desired it ...

    Don't forget that the original use case for Unix was document production,
    where record-based i/o is not very useful.

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Thu Jun 27 16:00:51 2024
    According to Thomas Koenig <[email protected]>:
    Pipelines work equally well with fixed length records. They were
    familar with byte streams from PDP-10, Multics, and other systems
    they'd worked with.

    Did these systems actually feature byte streams? Both the PDP-10
    and the GE 635 were word-oriented machines with 36 bits, so I
    assume it must have been word streams at least.

    TOPS-10 did disk I/O to and from block buffers in the user's address
    space, but the system calls set up a byte pointer and a count so you
    could fetch or store one byte at a time. A line was anything up to
    CR/LF (and maybe form feed and other control characters) with the
    convention that you ignored null bytes.

    The same system calls worked on ttys and paper tapes, where it read
    up to a CR/LF and put it in the buffer. So either way the application
    saw a stream of bytes.

    Never used GCOS and never did non-trivial programming on DTSS so I
    couldn't tell you what they did. DTSS had communication files which
    were a lot like pipes. See https://www.cs.dartmouth.edu/~doug/DTSS/commfiles.pdf

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to John Levine on Thu Jun 27 16:34:11 2024
    John Levine <[email protected]> schrieb:

    Never used GCOS and never did non-trivial programming on DTSS so I
    couldn't tell you what they did. DTSS had communication files which
    were a lot like pipes. See https://www.cs.dartmouth.edu/~doug/DTSS/commfiles.pdf

    That is an extremely interesting article, thank you!


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to John Levine on Thu Jun 27 17:39:46 2024
    John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:
    possible to request “locate mode”, ...
    * It only works for reads, not for writes

    It works fine for writes on IBM systems. PL/I even has a LOCATE
    statement to get a buffer to write into.

    I found this manual >><http://bitsavers.trailing-edge.com/pdf/ibm/360/tss/GC28-2045-1_Time_Sharing_System_PLI_Language_Reference_Manual.pdf>.
    On page 96, it says

    The LOCATE Statement
    --------------------
    ...

    So it has its own limitations on applicability. For example, it only
    seems to work for writes, not reads.

    Well, yeah. Keep reading and on page 97 you'll find the SET option
    which is how a you get the pointer for a locate mode READ. It was more
    common when defining based data to say in the declaration what the
    default pointer to the data was, so if you said to READ or LOCATE the
    item, it'd automatically set the pointer.

    That TSS manual is rather estoteric since hardly anyone ever used TSS
    and even those of use who did, didn't write much PL/I.

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Here's the regular PL/I manual which explains locate mode for
    reading and writing on page 168.

    https://bitsavers.org/pdf/ibm/360/pli/C28-8201-1_PLIrefMan_Jan69.pdf

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Fri Jun 28 00:38:53 2024
    On Thu, 27 Jun 2024 16:00:51 -0000 (UTC), John Levine wrote:

    DTSS had communication files which were a lot like pipes. See https://www.cs.dartmouth.edu/~doug/DTSS/commfiles.pdf

    In the beginning I thought “Unix sockets”, which handle out-of-band data, but they don’t let you control the entire filesystem API. Though DAG connections (and other topologies) are certainly doable.

    One current Linux mechanism that does give total API control, though,
    would be FUSE pluggable filesystems.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Fri Jun 28 00:31:37 2024
    On Thu, 27 Jun 2024 16:00:51 -0000 (UTC), John Levine wrote:

    A line was anything up to CR/LF
    (and maybe form feed and other control characters) with the convention
    that you ignored null bytes.

    This sounds like where RSTS/E got a similar convention. This way, the filesystem didn’t have to worry about allocations of partial sectors, and application programs didn’t need to worry about RMS-style record
    attributes to figure out where the end-of-file was.

    But of course it only worked for plain-text files.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Dallman on Fri Jun 28 00:41:59 2024
    On Thu, 27 Jun 2024 09:14 +0100 (BST), John Dallman wrote:

    Don't forget that the original use case for Unix was document
    production, where record-based i/o is not very useful.

    Thinking of the kinds of documents: consider that, well into the 1980s and 1990s, sending out letters to mailing lists was a common scenario, and
    that requires the ability to handle both text (the letter form) and
    database (the address list) functions, and merge the two.

    That’s “document production”, but it probably doesn’t count as “technical
    document production” (manuals, teaching material etc).

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Fri Jun 28 00:42:34 2024
    On Thu, 27 Jun 2024 17:39:46 +0000, MitchAlsup1 wrote:

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Did you make much use of LOCATE-style I/O?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Fri Jun 28 01:30:10 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    On Thu, 27 Jun 2024 09:14 +0100 (BST), John Dallman wrote:

    Don't forget that the original use case for Unix was document
    production, where record-based i/o is not very useful.

    Thinking of the kinds of documents: consider that, well into the 1980s and >1990s, sending out letters to mailing lists was a common scenario, and
    that requires the ability to handle both text (the letter form) and
    database (the address list) functions, and merge the two. ...

    The killer app for Unix and nroff was typing up patent applications,
    and the killer feature was putting line numbers every Nth line of the
    formatted output the way the patent office wanted. At the time, it was
    the only document system that could do that.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to John Levine on Fri Jun 28 05:32:32 2024
    John Levine <[email protected]> schrieb:
    According to Lawrence D'Oliveiro <[email protected]d>:
    On Thu, 27 Jun 2024 09:14 +0100 (BST), John Dallman wrote:

    Don't forget that the original use case for Unix was document
    production, where record-based i/o is not very useful.

    Thinking of the kinds of documents: consider that, well into the 1980s and >>1990s, sending out letters to mailing lists was a common scenario, and
    that requires the ability to handle both text (the letter form) and >>database (the address list) functions, and merge the two. ...

    The killer app for Unix and nroff was typing up patent applications,
    and the killer feature was putting line numbers every Nth line of the formatted output the way the patent office wanted. At the time, it was
    the only document system that could do that.

    There was another killer app, which is not in Kernighan's book
    on UNIX, but can be found on a Youtube video of a conference
    discussions on the origins of UNIX.

    The CEO of Bell was far-sighted, and for reasons of vanity did not
    want to wear glasses when he gave speeches. The UNIX system that
    they had set up included a phototypesetter and the capability to
    use larger letters, so he could read them.

    That gave them a friend in very high places, helicopters picking
    up speech manuscripts, highly confidential speeches on a machine
    that very many people had dialup access to and, when this was
    pointed out, a PDP-11 running UNIX with a phototypesetter in the
    CEO's secretary's office.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Lawrence D'Oliveiro on Fri Jun 28 15:08:20 2024
    Lawrence D'Oliveiro wrote:

    On Thu, 27 Jun 2024 17:39:46 +0000, MitchAlsup1 wrote:

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Did you make much use of LOCATE-style I/O?

    No, I fond VSAM completely adequate for my needs.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Fri Jun 28 21:55:04 2024
    On Fri, 28 Jun 2024 15:08:20 +0000, MitchAlsup1 wrote:

    Lawrence D'Oliveiro wrote:

    On Thu, 27 Jun 2024 17:39:46 +0000, MitchAlsup1 wrote:

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Did you make much use of LOCATE-style I/O?

    No, I fond VSAM completely adequate for my needs.

    This tends to confirm my suspicion that LOCATE-style I/O is too hedged
    round with restrictions to be useful outside some limited set of
    situations. Most of the time, it’s just not worth the trouble.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Fri Jun 28 22:38:02 2024
    On 6/28/2024 2:55 PM, Lawrence D'Oliveiro wrote:
    On Fri, 28 Jun 2024 15:08:20 +0000, MitchAlsup1 wrote:

    Lawrence D'Oliveiro wrote:

    On Thu, 27 Jun 2024 17:39:46 +0000, MitchAlsup1 wrote:

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Did you make much use of LOCATE-style I/O?

    No, I fond VSAM completely adequate for my needs.

    This tends to confirm my suspicion that LOCATE-style I/O is too hedged
    round with restrictions to be useful outside some limited set of
    situations. Most of the time, it’s just not worth the trouble.

    It really depends upon what you are doing. When I was doing OS/360
    stuff in the early 1970s, primarily COBOL, locate mode was what we used
    almost all the time. Of course, this was mostly sequential sequential
    data sets using QSAM, but back then, it was much more common than it is
    today.


    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Sat Jun 29 18:22:04 2024
    According to Stephen Fuld <[email protected]d>:
    On 6/28/2024 2:55 PM, Lawrence D'Oliveiro wrote:
    On Fri, 28 Jun 2024 15:08:20 +0000, MitchAlsup1 wrote:

    Lawrence D'Oliveiro wrote:

    On Thu, 27 Jun 2024 17:39:46 +0000, MitchAlsup1 wrote:

    I did; TSS 360/67 CMU, and I did; both PL/C and PL/1.

    Did you make much use of LOCATE-style I/O?

    No, I fond VSAM completely adequate for my needs.

    This tends to confirm my suspicion that LOCATE-style I/O is too hedged
    round with restrictions to be useful outside some limited set of
    situations. Most of the time, it’s just not worth the trouble.

    It really depends upon what you are doing. ...

    Yes, and more often than not locate I/O is faster and easier. VSAM was
    TSS' native file system. You used GET and PUT macros to read and write
    files, and they had move and locate mode just like QSAM did for
    OS-compatible disks and tapes. See this TSS programmer's manual,
    starting on page 18. On TSS locate mode just told you where in virtual
    storage the record was. When you touched the address, the page fault
    caused the I/O.

    https://bitsavers.org/pdf/ibm/360/tss/GC28-2056-2_Time_Sharing_System_Data_Management_Facilities_Dec77.pdf

    As I think I said before, these days you map the file into your
    address space and read or write it like a long string, so its all
    locate all the time.

    See for example my version of grepcidr which maps the file in and
    chugs through it with a state machine, much faster than the old read a
    line and use a RE:

    https://github.com/jrlevine/grepcidr3

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Sat Jun 29 23:53:59 2024
    On Sat, 29 Jun 2024 18:22:04 -0000 (UTC), John Levine wrote:

    ... more often than not locate I/O is faster and easier.

    Given all the caveats and restrictions, “easier” is not how I would describe it.

    But perhaps we’re talking at cross-purposes. If Mitch did his TSS and PL/I stuff in the 1970s, while you’re talking about the 1960s, then that could explain it. By the 1970s, CPU/RAM speeds had improved to the point where copying records a few hundred bytes at a time between buffers was not the performance bottleneck; disk I/O was.

    When you touched the address, the page fault caused the I/O.

    There seems to be this assumption that the paging mechanism is some kind
    of clever way of doing I/O. It’s not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Lawrence D'Oliveiro on Sun Jun 30 01:59:23 2024
    Lawrence D'Oliveiro wrote:

    On Sat, 29 Jun 2024 18:22:04 -0000 (UTC), John Levine wrote:

    ... more often than not locate I/O is faster and easier.

    Given all the caveats and restrictions, “easier” is not how I would describe it.

    But perhaps we’re talking at cross-purposes. If Mitch did his TSS and
    PL/I
    stuff in the 1970s, while you’re talking about the 1960s, then that
    could
    explain it. By the 1970s, CPU/RAM speeds had improved to the point where copying records a few hundred bytes at a time between buffers was not
    the
    performance bottleneck; disk I/O was.

    In particular, my PL/1 programs were not I/O bound.

    When you touched the address, the page fault caused the I/O.

    There seems to be this assumption that the paging mechanism is some kind
    of clever way of doing I/O. It’s not.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Sun Jun 30 04:33:11 2024
    Lawrence D'Oliveiro wrote:

    On Sat, 29 Jun 2024 18:22:04 -0000 (UTC), John Levine wrote:

    ... more often than not locate I/O is faster and easier.

    Given all the caveats and restrictions, “easier” is not how I would describe it.

    Again, it depends. For COBOL, you didn't have to specify anything.
    The compiler set up everything for you for you, and it "just worked".



    But perhaps we’re talking at cross-purposes. If Mitch did his TSS and
    PL/I stuff in the 1970s, while you’re talking about the 1960s, then
    that could explain it. By the 1970s, CPU/RAM speeds had improved to
    the point where copying records a few hundred bytes at a time between
    buffers was not the performance bottleneck; disk I/O was.


    Yes, but given multiprogramming, even in the 1970s, you would typically
    have several batch programs running at the same time, so during waits
    for I/O, another program could use the CPU. But using the CPU to move
    records meant it couldn't be doing anything else at the same time.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Stephen Fuld on Sun Jun 30 07:22:23 2024
    On Sun, 30 Jun 2024 04:33:11 -0000 (UTC), Stephen Fuld wrote:

    Lawrence D'Oliveiro wrote:

    By the 1970s, CPU/RAM speeds had improved to the
    point where copying records a few hundred bytes at a time between
    buffers was not the performance bottleneck; disk I/O was.

    Yes, but given multiprogramming, even in the 1970s, you would typically
    have several batch programs running at the same time, so during waits
    for I/O, another program could use the CPU. But using the CPU to move records meant it couldn't be doing anything else at the same time.

    Scraping the bottom of the barrel, much?

    Work out the numbers. The CPU time necessary to copy a single record is
    most likely a small fraction of the time it takes to service an I/O
    interrupt.

    And this is not taking into account the fact that I/O interrupts run at a higher priority than user-level tasks like copying buffers, anyway.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to Lawrence D'Oliveiro on Sat Jun 29 23:39:22 2024
    Lawrence D'Oliveiro <[email protected]d> writes:
    Work out the numbers. The CPU time necessary to copy a single record is
    most likely a small fraction of the time it takes to service an I/O interrupt.

    And this is not taking into account the fact that I/O interrupts run at a higher priority than user-level tasks like copying buffers, anyway.

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used ... as result a normal/typical 1mbyte 370/165
    only ran four regions concurrently, insufficient to keep system busy and justified. adding virtual memory, could run MVT in a 16mbyte virtual
    address space (aka VS2/SVS, sort of like running MVT in cp67 16mbyte
    virtual machine)... increasing number of concurrent running regions by
    factor of four times (up to 15) ... with little or no paging.

    however, created different overhead (in part because the FS failure gave page-mapped filesystems a bad reputation) ... application filesystem
    channel programs were created (usually) by library routines in
    application space ... and the channel programs passed to EXCP/SVC0 for execution, now would have virtual addresses (rather than real required
    by I/O system) ... this required EXCP/SVC0 make a copy of every channel program, substituting real addresses for virtual addresses (initially
    done by crafting CP67's "CCWTRANS" into EXCP/SVC0).

    370 systems getting larger then were then banging against the concurrent
    region 15 limit imposed by the 4bit storage protection scheme keeping
    regions separated and had to transition from VS2/SVS single address
    space to VS2/MVS where every region was isolated in its own separate
    address space.

    However, MVS was increasingly becoming quite bloated (also EXCP/SVC0
    still had to make channel program copies) and device redrive (device
    idle between interrupt to starting next queued request) was a few
    thousand instructions. I had transferred to SJR and got to wander around datacenters in silicon valley including bldg14&15 (disk development and
    product test) across the street. They were doing prescheduled, 7x24, stand-alone testing and had mentioned they had recently tried MVS, but
    in that environment, MVS had 15min mean-time-between failure (besides
    its significant device idle waiting for device redrive) requiring manual re-ipl/reboot (aka test devices frequently violated all sort of rules & protocol). I offered to rewrite I/O supervisor to make it bullet proof
    and never fail allowing any amount of ondemand, concurrent testing
    ... improving productivity (as well as cutting to a couple hundred
    instructions between taking interrupt and redriving device).

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to All on Sun Jun 30 10:50:00 2024
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn Wheeler)
    wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of systems without virtual memory doesn't include any that shared the machine among several applications, so I have trouble guessing.

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Thomas Koenig on Sun Jun 30 13:49:04 2024
    On Sun, 30 Jun 2024 10:44:34 -0000 (UTC)
    Thomas Koenig <[email protected]> wrote:

    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn
    Wheeler) wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of
    systems without virtual memory doesn't include any that shared the
    machine among several applications, so I have trouble guessing.

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?


    Why would I want to use different address?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to John Dallman on Sun Jun 30 10:44:34 2024
    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn Wheeler)
    wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of systems without virtual memory doesn't include any that shared the machine among several applications, so I have trouble guessing.

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to Michael S on Sun Jun 30 11:05:12 2024
    Michael S <[email protected]> schrieb:
    On Sun, 30 Jun 2024 10:44:34 -0000 (UTC)
    Thomas Koenig <[email protected]> wrote:

    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn
    Wheeler) wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of
    systems without virtual memory doesn't include any that shared the
    machine among several applications, so I have trouble guessing.

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?


    Why would I want to use different address?

    Memory overlap and fragmentation after having started and stopped
    (or swapped out) too many processes. Remember, these were
    physical-memory machines. You could load a process to a certain
    place, but you had more running, and one of them was swapped out
    or terminated, it left block of available memory where the next
    process didn't necessarily fit.

    They would have fared better by assigning a base register (or two,
    one for data and one for code) invisible from problem state
    and handled by the OS. Not sure why they didn't do so, but
    reading the literature seems to imply that they did not think it
    through. Now, of course, we have the benefit of hindsight.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to Koenig on Sun Jun 30 13:16:00 2024
    In article <v5rcui$fqgj$[email protected]>, [email protected] (Thomas
    Koenig) wrote:

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    And only a 12-bit offset, to boot. I've read of systems with base and
    limit registers, where all accesses were offsets from the base (or
    separate base registers for code and data). Real-mode 8086 code can work
    that way, although I don't think you can limit the size of a segment to
    less than 64KB. It looks as if /360 tried to construct that kind of
    operation style from more general base register address modes, but didn't
    do the job thoroughly.

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    Virtual memory was pretty new technology at the time, and required a disk
    or drum. The central idea of /360 was having the same ISA across a wide
    range of machines, and virtual memory wasn't affordable at the low end at
    the time, AFAICS.

    So the problem wasn't with MFT memory management APIs or their
    implementation, but that a /360 site had to find a set of partition sizes
    that allowed for all the combinations of programs that they needed to run simultaneously. This was inevitably wasteful of memory, because each
    partition had to allow for the largest program that could be required to
    run in it.

    Is that correct?

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to John Levine on Sun Jun 30 14:08:45 2024
    John Levine <[email protected]> writes:
    As I think I said before, these days you map the file into your
    address space and read or write it like a long string, so its all
    locate all the time.

    Reading with mmap() appears to be quite common, either directly or
    through libaries like glibc which use mmap() for buffered input where
    possible. For writing, using mmap() poses some difficulties:

    * Writing to an mmap()ed region has weaker atomicity semantics than
    write().

    * mmap() works on pages (not in the interface, fortunately, but
    internally), but for writing you want byte granularity. The
    difference is not a big problem for reading, but it is a bigger one
    for writing.

    * You have to extend the file length with a separate system call, and
    then mmap() the new area, so you might just as well use write().

    I have not looked at what glibc does for buffered writing, but I have
    been advisor on a master's thesis that tried to extend the reach of
    mmap()
    <https://www.complang.tuwien.ac.at/Diplomarbeiten/syrowatka00.ps.gz>;
    at the start I was very enthusiastic about the possibilities, at the
    end I concluded that mmap() is good for reading (when it can be
    applied), but usually not appropriate for writing.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <[email protected]>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Anton Ertl on Sun Jun 30 15:35:05 2024
    [email protected] (Anton Ertl) writes:
    John Levine <[email protected]> writes:
    As I think I said before, these days you map the file into your
    address space and read or write it like a long string, so its all
    locate all the time.

    Reading with mmap() appears to be quite common, either directly or
    through libaries like glibc which use mmap() for buffered input where >possible. For writing, using mmap() poses some difficulties:

    * Writing to an mmap()ed region has weaker atomicity semantics than
    write().

    That depends - if all writers are using mmap, atomicity is provided
    by other mechanisms (process-scope mutex, sysv or posix semaphores,
    etc).

    An application where one writes via mmap and others read
    using stdio or read/write/pread/pwrite simulataneously
    is poorly designed.


    * mmap() works on pages (not in the interface, fortunately, but
    internally), but for writing you want byte granularity. The
    difference is not a big problem for reading, but it is a bigger one
    for writing.

    In my experience, mmap is most often used when record-granularity
    is required, rather than treating the mapped region as a
    stream-of-bytes.


    * You have to extend the file length with a separate system call, and
    then mmap() the new area, so you might just as well use write().

    False economy. ftruncate is a single system call.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Thomas Koenig on Sun Jun 30 16:05:59 2024
    Thomas Koenig wrote:

    Michael S <[email protected]> schrieb:
    On Sun, 30 Jun 2024 10:44:34 -0000 (UTC)
    Thomas Koenig <[email protected]> wrote:

    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn
    Wheeler) wrote:

    back to IBM decision to add virtual memory to every 370 ... aka
    MVT >> >> storage management was so bad that regions had to be
    specified four >> >> times larger than used

    What was the problem with the memory management? My experience of
    systems without virtual memory doesn't include any that shared
    the >> > machine among several applications, so I have trouble
    guessing. >>
    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?


    Why would I want to use different address?

    Memory overlap and fragmentation after having started and stopped
    (or swapped out) too many processes. Remember, these were
    physical-memory machines. You could load a process to a certain
    place, but you had more running, and one of them was swapped out
    or terminated, it left block of available memory where the next
    process didn't necessarily fit.

    They would have fared better by assigning a base register (or two,
    one for data and one for code) invisible from problem state
    and handled by the OS.

    Precisely! This is what the, roughly contemporary, Univac 1108 did. It
    worked will for several decades. Eventually the 1108's successors went
    to a paging scheme (but with ~ 16 KB pages), to avoid the fragmentation
    issues from using variable sized chunks (called banks in 1100
    terminology) of memory.



    Not sure why they didn't do so, but
    reading the literature seems to imply that they did not think it
    through.


    Yes. John Levine said they thought that changing the, user visible,
    base registers would be doable :-(

    The invisible base registers may require an extra adder in the CPU when computing addresses, but this is much less overhead than paging
    requires.


    Now, of course, we have the benefit of hindsight.


    Yup!



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to Lawrence D'Oliveiro on Sun Jun 30 16:13:51 2024
    Lawrence D'Oliveiro wrote:

    On Sun, 30 Jun 2024 04:33:11 -0000 (UTC), Stephen Fuld wrote:

    Lawrence D'Oliveiro wrote:

    By the 1970s, CPU/RAM speeds had improved to the
    point where copying records a few hundred bytes at a time between
    buffers was not the performance bottleneck; disk I/O was.

    Yes, but given multiprogramming, even in the 1970s, you would
    typically have several batch programs running at the same time, so
    during waits for I/O, another program could use the CPU. But using
    the CPU to move records meant it couldn't be doing anything else at
    the same time.

    Scraping the bottom of the barrel, much?

    Work out the numbers. The CPU time necessary to copy a single record
    is most likely a small fraction of the time it takes to service an
    I/O interrupt.

    And this is not taking into account the fact that I/O interrupts run
    at a higher priority than user-level tasks like copying buffers,
    anyway.

    The thing you are missing is that (in the common scenario I was talking
    about) locate mode costs absolutely zero. No overhead in the I/O
    system, no changes to the source code, nothing. And not using it has
    no, i.e. zero, advantages. So while the savings might be small, there
    are no costs, so why no use it.

    And BTW, not that it effects the calculation, but you only have one I/O interrupt per physical block, not per logical record.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Sun Jun 30 16:17:11 2024
    Michael S wrote:

    On Sun, 30 Jun 2024 10:44:34 -0000 (UTC)
    Thomas Koenig <[email protected]> wrote:

    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn
    Wheeler) wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of
    systems without virtual memory doesn't include any that shared the
    machine among several applications, so I have trouble guessing.

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?


    Why would I want to use different address?

    Real Time pre virtual memory.

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Thomas Koenig on Sun Jun 30 16:18:55 2024
    Thomas Koenig wrote:

    Michael S <[email protected]> schrieb:
    On Sun, 30 Jun 2024 10:44:34 -0000 (UTC)
    Thomas Koenig <[email protected]> wrote:

    John Dallman <[email protected]> schrieb:
    In article <87ed8e7os5.fsf@localhost>, [email protected] (Lynn
    Wheeler) wrote:

    back to IBM decision to add virtual memory to every 370 ... aka MVT
    storage management was so bad that regions had to be specified four
    times larger than used

    What was the problem with the memory management? My experience of
    systems without virtual memory doesn't include any that shared the
    machine among several applications, so I have trouble guessing.

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?


    Why would I want to use different address?

    Memory overlap and fragmentation after having started and stopped
    (or swapped out) too many processes. Remember, these were
    physical-memory machines.

    There were also the base-bounds machines.

    You could load a process to a certain
    place, but you had more running, and one of them was swapped out
    or terminated, it left block of available memory where the next
    process didn't necessarily fit.

    They would have fared better by assigning a base register (or two,
    one for data and one for code) invisible from problem state
    and handled by the OS. Not sure why they didn't do so, but
    reading the literature seems to imply that they did not think it
    through. Now, of course, we have the benefit of hindsight.

    In that sense, virtual memory is simply an infinite amount of
    base registers.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to John Dallman on Sun Jun 30 16:21:31 2024
    John Dallman wrote:

    In article <v5rcui$fqgj$[email protected]>, [email protected] (Thomas Koenig) wrote:

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    And only a 12-bit offset, to boot. I've read of systems with base and
    limit registers, where all accesses were offsets from the base (or
    separate base registers for code and data). Real-mode 8086 code can work
    that way, although I don't think you can limit the size of a segment to
    less than 64KB. It looks as if /360 tried to construct that kind of
    operation style from more general base register address modes, but
    didn't do the job thoroughly.

    They ran out of encoding bits due to all the LD-OPs

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    Virtual memory was pretty new technology at the time, and required a
    disk
    or drum. The central idea of /360 was having the same ISA across a wide
    range of machines, and virtual memory wasn't affordable at the low end
    at the time, AFAICS.

    So the problem wasn't with MFT memory management APIs or their implementation, but that a /360 site had to find a set of partition
    sizes that allowed for all the combinations of programs that they needed
    to
    run simultaneously. This was inevitably wasteful of memory, because each partition had to allow for the largest program that could be required to
    run in it.

    Is that correct?

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to John Dallman on Sun Jun 30 16:30:27 2024
    John Dallman wrote:

    In article <v5rcui$fqgj$[email protected]>, [email protected]
    (Thomas Koenig) wrote:

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    And only a 12-bit offset, to boot. I've read of systems with base and
    limit registers, where all accesses were offsets from the base (or
    separate base registers for code and data).

    Yes, e.g. Univac 1108.



    Real-mode 8086 code can
    work that way, although I don't think you can limit the size of a
    segment to less than 64KB. It looks as if /360 tried to construct
    that kind of operation style from more general base register address
    modes, but didn't do the job thoroughly.

    Going to virtual memory from the start would have saved the
    base pointer issue, and would have allowed 16-bit displacements,
    also saving registers in the case where 12-bit displacements were
    not enough.

    Virtual memory was pretty new technology at the time, and required a
    disk or drum. The central idea of /360 was having the same ISA across
    a wide range of machines, and virtual memory wasn't affordable at the
    low end at the time, AFAICS.

    But IIRC even low end S/360s required a disk, at least to IPL(boot)
    from. But Virtual memory is more expensive than the hidden base
    registers, as you need page tables. probably a TLB, etc.


    So the problem wasn't with MFT memory management APIs or their implementation, but that a /360 site had to find a set of partition
    sizes that allowed for all the combinations of programs that they
    needed to run simultaneously. This was inevitably wasteful of memory,
    because each partition had to allow for the largest program that
    could be required to run in it.

    Is that correct?

    That is correct for OS/MFT, but not for OS/MVT.



    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Scott Lurndal on Sun Jun 30 16:38:10 2024
    [email protected] (Scott Lurndal) writes:
    [email protected] (Anton Ertl) writes:
    John Levine <[email protected]> writes:
    As I think I said before, these days you map the file into your
    address space and read or write it like a long string, so its all
    locate all the time.

    Reading with mmap() appears to be quite common, either directly or
    through libaries like glibc which use mmap() for buffered input where >>possible. For writing, using mmap() poses some difficulties:

    * Writing to an mmap()ed region has weaker atomicity semantics than
    write().

    That depends - if all writers are using mmap, atomicity is provided
    by other mechanisms (process-scope mutex, sysv or posix semaphores,
    etc).

    An application where one writes via mmap and others read
    using stdio or read/write/pread/pwrite simulataneously
    is poorly designed.

    "An application"? I live in a world where all kinds of programs can
    access the same files. How such a program works internally is often
    not known to the designers of the others.

    In practice your advice boils down to avoiding to write with mmap().



    * mmap() works on pages (not in the interface, fortunately, but
    internally), but for writing you want byte granularity. The
    difference is not a big problem for reading, but it is a bigger one
    for writing.

    In my experience, mmap is most often used when record-granularity
    is required, rather than treating the mapped region as a
    stream-of-bytes.

    I usually mmap() the whole file.

    * You have to extend the file length with a separate system call, and
    then mmap() the new area, so you might just as well use write().

    False economy. ftruncate is a single system call.

    Yes, that's the one that extends the file length. ftruncate()
    followed by mmap() are two system calls. And at some point you also
    want to msync() (although not for each ftruncate).

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <[email protected]>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Stephen Fuld on Sun Jun 30 17:27:36 2024
    "Stephen Fuld" <[email protected]d> writes:
    Thomas Koenig wrote:

    They would have fared better by assigning a base register (or two,
    one for data and one for code) invisible from problem state
    and handled by the OS.

    Precisely! This is what the, roughly contemporary, Univac 1108 did.

    As did the Burroughs B3500 and successors. In the early 80's,
    the architecture was updated to support an almost unlimited
    number of environments, each of which could map eight
    memory areas at any one time into the process addresss space,
    while maintaining binary compatability with the legacy applications
    which only had one base/limit register instead of eight.


    It
    worked will for several decades. Eventually the 1108's successors went
    to a paging scheme (but with ~ 16 KB pages), to avoid the fragmentation >issues from using variable sized chunks (called banks in 1100
    terminology) of memory.


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Stephen Fuld on Sun Jun 30 17:28:58 2024
    "Stephen Fuld" <[email protected]d> writes:
    Lawrence D'Oliveiro wrote:

    On Sun, 30 Jun 2024 04:33:11 -0000 (UTC), Stephen Fuld wrote:

    Lawrence D'Oliveiro wrote:

    By the 1970s, CPU/RAM speeds had improved to the
    point where copying records a few hundred bytes at a time between
    buffers was not the performance bottleneck; disk I/O was.

    Yes, but given multiprogramming, even in the 1970s, you would
    typically have several batch programs running at the same time, so
    during waits for I/O, another program could use the CPU. But using
    the CPU to move records meant it couldn't be doing anything else at
    the same time.

    Scraping the bottom of the barrel, much?

    Work out the numbers. The CPU time necessary to copy a single record
    is most likely a small fraction of the time it takes to service an
    I/O interrupt.

    And this is not taking into account the fact that I/O interrupts run
    at a higher priority than user-level tasks like copying buffers,
    anyway.

    The thing you are missing is that (in the common scenario I was talking >about) locate mode costs absolutely zero. No overhead in the I/O
    system, no changes to the source code, nothing. And not using it has
    no, i.e. zero, advantages. So while the savings might be small, there
    are no costs, so why no use it.

    That type of direct buffer access was also extensively used
    on the Burroughs systems.


    And BTW, not that it effects the calculation, but you only have one I/O >interrupt per physical block, not per logical record.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to Stephen Fuld on Sun Jun 30 18:38:00 2024
    In article <v5s173$jl70$[email protected]>, [email protected]d
    (Stephen Fuld) wrote:

    John Dallman wrote:

    Virtual memory was pretty new technology at the time, and
    required a disk or drum. The central idea of /360 was having
    the same ISA across a wide range of machines, and virtual
    memory wasn't affordable at the low end at the time, AFAICS.

    But IIRC even low end S/360s required a disk, at least to IPL(boot)
    from.

    Are you sure? Per Wikipedia, the lowest-end real S/360, the Model 30,
    could run with only card equipment, running BPS, or with only tape drives, under TOS.

    <https://en.wikipedia.org/wiki/IBM_System/360_Model_30#System_software>

    BOS was a really minimal OS for an 8KB RAM machine with one disc drive,
    and DOS was less minimal.

    The Model 30 was apparently one of the most popular machines in the early
    days of S/360. Being able to build such small machines was a strong
    commercial consideration for the company, and thus the architecture.

    But Virtual memory is more expensive than the hidden base
    registers, as you need page tables. probably a TLB, etc.

    Yup, and it pushes up the minimum amount of RAM you need.

    So the problem wasn't with MFT memory management APIs or their implementation, but that a /360 site had to find a set of
    partition sizes that allowed for all the combinations of
    programs that they needed to run simultaneously. This was
    inevitably wasteful of memory, because each partition had
    to allow for the largest program that could be required to
    run in it.

    Is that correct?

    That is correct for OS/MFT, but not for OS/MVT.

    Right. And MVT had dynamic memory allocation, but since it used direct
    physical addresses, it suffered progressive fragmentation, and would
    presumably have to be IPL'ed at intervals?

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Stephen Fuld@21:1/5 to John Dallman on Sun Jun 30 17:51:01 2024
    John Dallman wrote:

    In article <v5s173$jl70$[email protected]>, [email protected]d (Stephen Fuld) wrote:

    John Dallman wrote:

    Virtual memory was pretty new technology at the time, and
    required a disk or drum. The central idea of /360 was having
    the same ISA across a wide range of machines, and virtual
    memory wasn't affordable at the low end at the time, AFAICS.

    But IIRC even low end S/360s required a disk, at least to IPL(boot)
    from.

    Are you sure? Per Wikipedia, the lowest-end real S/360, the Model 30,
    could run with only card equipment, running BPS, or with only tape
    drives, under TOS.


    <https://en.wikipedia.org/wiki/IBM_System/360_Model_30#System_software>


    You, and John are right. I got that wrong and apologize for doing so.




    --
    - Stephen Fuld
    (e-mail address disguised to prevent spam)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Dallman@21:1/5 to Stephen Fuld on Sun Jun 30 18:52:00 2024
    In article <v5s5u5$ki8d$[email protected]>, [email protected]d
    (Stephen Fuld) wrote:
    John Dallman wrote:
    Are you sure? Per Wikipedia, the lowest-end real S/360, the Model
    30, could run with only card equipment, running BPS, or with only
    tape
    drives, under TOS.
    You, and John are right. I got that wrong and apologize for doing
    so.

    No problem.

    John

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Stephen Fuld on Sun Jun 30 18:44:31 2024
    Stephen Fuld wrote:

    John Dallman wrote:

    In article <v5rcui$fqgj$[email protected]>, [email protected]
    (Thomas Koenig) wrote:

    Imagine a process which resides at a certain address. It contains
    code, data, and pointers to data. Now you swap it out and want
    to reload it. You can use the same base address, then everything
    is fine. Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to. Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?) should have
    worked. But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    And only a 12-bit offset, to boot. I've read of systems with base and
    limit registers, where all accesses were offsets from the base (or
    separate base registers for code and data).

    Yes, e.g. Univac 1108.

    S.E.L 32/65 but not 32/67 or 32/87
    CDC 6600 7600
    CRAY-1 1/S

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to John Dallman on Sun Jun 30 09:09:23 2024
    [email protected] (John Dallman) writes:
    Are you sure? Per Wikipedia, the lowest-end real S/360, the Model 30,
    could run with only card equipment, running BPS, or with only tape drives, under TOS.

    <https://en.wikipedia.org/wiki/IBM_System/360_Model_30#System_software>

    BOS was a really minimal OS for an 8KB RAM machine with one disc drive,
    and DOS was less minimal.

    The Model 30 was apparently one of the most popular machines in the early days of S/360. Being able to build such small machines was a strong commercial consideration for the company, and thus the architecture.

    at end of semester after taking two credit hr into course, was hired to
    rewrite 1401 MPIO for (64kbyte) 360/30 ... which was running early
    os/360 PCP (single executable program at a time) ... had 2311 disks,
    tapes, and unit record. I first had a 2000 card program, assembled under
    os/360 but ran "stand-alone" ... being loaded with the "BPS" loader (had
    my own monitor, device drivers, interrupt handlers, error recovery,
    storage management, etc. Making changes during development & test
    required brining up os/360 and re-assembly and then stand-alone loading.

    I eventually got around to adding os/360 mode of operation using
    assembly option to generate either the stand-alone version or the os/360 version. It turns out the stand-alone version took 30mins to assemble,
    however the OS/360 version took an hour to assemble (OS/360 required DCB
    macro for each device and each DCB macro added six minutes elapsed time
    to assembly) ... aka stand-alone testing and then re-ipl for OS/360
    30min re-assemble still took less time than OS/360 testing and hour re-assemble.

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to John Dallman on Sun Jun 30 08:58:22 2024
    [email protected] (John Dallman) writes:
    What was the problem with the memory management? My experience of systems without virtual memory doesn't include any that shared the machine among several applications, so I have trouble guessing.

    os/360 had "relative" (fixed) adcons ... that were resolved to fixed
    (real) address at initial program load (and couldn't change for the
    duration of the program) ... that also presented downside when moving to virtual memory paged environment ... could directly execute paged image
    from disk ... first, executable image had to be preloaded and all
    "relative' adcons modified for the specific instance. tss/360 had
    addressed that by keeping relative adcons, "relative" to base kept in
    data structure specifically for that instant (same paged shared
    executable image could appear at different addresses for different
    programs executing in different address spaces).

    MVT memory management for dynamic allocation for data had horrendous
    problem with storage fragmentation and frequent requirement for large
    areass of contiguous storage. Storage fragmentation problem increased
    the longer the programs were running (and maintaining contiguous
    allocation as number of different, concurrently running regions
    increased). After joining IBM, I had done a page mapped filesystem for
    CMS and because CMS made extensive use of OS/360 compilers, I was
    constantly fighting the OS/360 adcon convention (wanting to constantly
    prefix ADCONs as part of executable loading).

    note before I had graduated, I had been hired fulltime into small group
    in the Boeing CFO office to help with the formation of Boeing Computer
    Services (consolidate all data processing into an independent busines
    unit). I thot Renton datacenter possibly largest in the world (couple
    hundred million in IBM 360s, sort of precursor to modern cloud megadatacenters), 360/65s arriving faster than could be installed, boxes constantly being staged in the hallways around the machine room. Lots of politics between Renton director and CFO who only had a 360/30 up at
    Boeing field for payroll (although they enlarged the room to install a
    360/67 for me to play with when I wasn't doing other stuff).

    While I was there they moved a two-processor, duplex 360/67
    (originally for tss/36) up to Seattle from Boeing Huntsville.
    Huntsville had got the two processor machine with lots of 2250 graphic
    screens
    https://en.wikipedia.org/wiki/IBM_2250

    for (long running) CAD 2250 applications ... since tss/360 didn't have
    any CAD support ... they configured it as two single processor systems
    each running MVT13 ... which was severely affected by the fragmentation
    problem that increased the longer each CAD 2250 program was running. A
    few years before the decision was made to add virtual memory to all 370s
    ... Boeing Huntsville had modified MVT13 to run in virtual memory mode
    ... it didn't support paging ... but used the virtual memory to create contiguous virtual memory areas out of non-contiguous areas of real
    storage (to address the MVT storage management problem).

    the initial solution adding virtual memory to all 370s (VS2/SVS) was to continue allow each executing region to continue specifying/reserving
    large, contiguous storage area ... but support paging and increase the
    number of concurrently executing regions.

    The original OS/360 design point of running in small real storage
    contributed to the excessive disk activity ... where lots of system was fragmented into small pieces that would be sequentially loaded from disk
    for execution ... and then increasing the number of concurrently
    executing regions used to compensate for the large I/O disk filesystem
    wait time (somewhat analogous to processor poor cache hit rate)

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From moi@21:1/5 to All on Mon Jul 1 02:13:30 2024
    On 30/06/2024 19:44, MitchAlsup1 wrote:
    Stephen Fuld wrote:

    John Dallman wrote:

    In article <v5rcui$fqgj$[email protected]>, [email protected]
    (Thomas Koenig) wrote:

    Imagine a process which resides at a certain address.  It contains
    code, data, and pointers to data.  Now you swap it out and want
    to reload it.  You can use the same base address, then everything
    is fine.  Or you can use a different one, where do the pointers
    point, especially registers which contain addresses?

    The /360 tried to solve this via base pointers, which all addresses
    were supposed calculated relative to to.  Hence the RX and RS
    instraction all had a base register + 12 bit offset for their
    addressing modes - swapping out the base registers (if you knew
    which ones they were, was this info in the executable?)  should have
    worked.  But the SS instructions for decimal arithmetic did not have
    base pointers, so that solution did not work in the general casse.

    And only a 12-bit offset, to boot. I've read of systems with base and
    limit registers, where all accesses were offsets from the base (or
    separate base registers for code and data).

    Yes, e.g. Univac 1108.

    S.E.L 32/65 but not 32/67 or 32/87
    CDC 6600 7600
    CRAY-1 1/S

    Ferranti Orion 1960
    EE KDF9 1960
    ICT 1900 Series 1964

    --
    Bill F.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Stephen Fuld on Mon Jul 1 21:41:02 2024
    On Sun, 30 Jun 2024 04:33:11 -0000 (UTC), Stephen Fuld wrote:

    Lawrence D'Oliveiro wrote:

    On Sat, 29 Jun 2024 18:22:04 -0000 (UTC), John Levine wrote:

    ... more often than not locate I/O is faster and easier.

    Given all the caveats and restrictions, “easier” is not how I would
    describe it.

    Again, it depends. For COBOL, you didn't have to specify anything. The compiler set up everything for you for you, and it "just worked".

    Maybe it didn’t. Given the way locate-mode I/O is set up, it should automatically fall back to copy-mode if the conditions are not right. So
    maybe you were in fact using copy-mode, not locate-mode, most of the time, without realizing it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Tue Jul 2 01:02:07 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    Again, it depends. For COBOL, you didn't have to specify anything. The
    compiler set up everything for you for you, and it "just worked".

    Maybe it didn’t. Given the way locate-mode I/O is set up, it should >automatically fall back to copy-mode if the conditions are not right. So >maybe you were in fact using copy-mode, not locate-mode, most of the time, >without realizing it.

    You know, you could admit that just once you're wrong.

    Or you could read the COBOL manuals, e.g. page 91 of the DOS COBOL
    Programmer's guide:

    Files can be processed using multiple buffers. Logical records are
    referenced in the proper biock by adjusting registers (using them as
    pointers).

    This technique eliminates the need for moving a record from the
    buffer area to a separate record work area, as well as the record work
    area itself. The record can be operated on directly in the buffer
    area.

    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Levine on Tue Jul 2 03:03:16 2024
    On Tue, 2 Jul 2024 01:02:07 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    Again, it depends. For COBOL, you didn't have to specify anything.
    The compiler set up everything for you for you, and it "just worked".

    Maybe it didn’t. Given the way locate-mode I/O is set up, it should
    automatically fall back to copy-mode if the conditions are not right.
    So maybe you were in fact using copy-mode, not locate-mode, most of the
    time, without realizing it.

    You know, you could admit that just once you're wrong.

    Admit that you never checked whether locate mode was actually engaged or
    not. You just assumed that it was.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Levine@21:1/5 to All on Tue Jul 2 13:09:09 2024
    According to Lawrence D'Oliveiro <[email protected]d>:
    On Tue, 2 Jul 2024 01:02:07 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    Again, it depends. For COBOL, you didn't have to specify anything.
    The compiler set up everything for you for you, and it "just worked".

    Maybe it didn’t. Given the way locate-mode I/O is set up, it should
    automatically fall back to copy-mode if the conditions are not right.
    So maybe you were in fact using copy-mode, not locate-mode, most of the
    time, without realizing it.

    You know, you could admit that just once you're wrong.

    Admit that you never checked whether locate mode was actually engaged or
    not. You just assumed that it was.

    Huh, you can't admit you're wrong. Well, whatever.
    --
    Regards,
    John Levine, [email protected], Primary Perpetrator of "The Internet for Dummies",
    Please consider the environment before reading this e-mail. https://jl.ly

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dan Cross@21:1/5 to [email protected] on Tue Jul 2 13:46:56 2024
    In article <v60u5l$99i$[email protected]>, John Levine <[email protected]> wrote: >According to Lawrence D'Oliveiro <[email protected]d>:
    On Tue, 2 Jul 2024 01:02:07 -0000 (UTC), John Levine wrote:

    According to Lawrence D'Oliveiro <[email protected]d>:

    Again, it depends. For COBOL, you didn't have to specify anything.
    The compiler set up everything for you for you, and it "just worked". >>>>
    Maybe it didn’t. Given the way locate-mode I/O is set up, it should
    automatically fall back to copy-mode if the conditions are not right.
    So maybe you were in fact using copy-mode, not locate-mode, most of the >>>> time, without realizing it.

    You know, you could admit that just once you're wrong.

    Admit that you never checked whether locate mode was actually engaged or >>not. You just assumed that it was.

    Huh, you can't admit you're wrong. Well, whatever.

    I still don't understand why people keep trying to have real
    discussions with this Lawrence fellow: he's clearly either a
    dedicated troll, or more likely the village idiot.

    - Dan C.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)