In article <[email protected]>, Anthony Ortiz <[email protected]> wrote:
The 65C832 as proposed is basically a 32 bit version of the 65C816. In
order to implement it you’ll need to make some decisions that WDC never got
around to:
* opcode and byte count for XFE to switch between bit modes;
* how to handle XBA in 32 bit mode (swap the top and bottom 16 bit groups,
or bytes 1 and 0 like in 16 bit mode);
* whether to clear or preserve the top 16 bits of the A, X, and Y
registers when switching between 32 bit and 16 bit modes;
* register transfer ops in 32 bit mode (TDC, TSC, TXA, TYA clear top 16
bits of C?); and
* probably other things I haven’t thought of.
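To make the XBA and register-width questions in the list above concrete, here is a minimal sketch of the candidate behaviours. None of this comes from WDC documentation; the function names and the `clear_top` switch are hypothetical and only illustrate the choices an emulator author would have to pick between.

```python
MASK32 = 0xFFFFFFFF

def xba_swap_halves(a):
    """Candidate 1: exchange the top and bottom 16-bit groups of a 32-bit A."""
    a &= MASK32
    return ((a << 16) | (a >> 16)) & MASK32

def xba_swap_low_bytes(a):
    """Candidate 2: exchange bytes 1 and 0 as in 16-bit mode; top 16 bits untouched."""
    a &= MASK32
    return (a & 0xFFFF0000) | ((a & 0x00FF) << 8) | ((a & 0xFF00) >> 8)

def leave_32bit_mode(reg, clear_top):
    """Hypothetical policy switch: clear or preserve the top 16 bits of A, X or Y."""
    reg &= MASK32
    return reg & 0xFFFF if clear_top else reg
```

With A = $11223344, the first candidate gives $33441122 and the second gives $11224433, so software written for one choice would break under the other.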
In choosing to emulate a 65C832 you limit yourself to
* 8 bit data bus (4 memory cycles to load a 32 bit register)
* 24 bit program address space (16MB limit)
* 24 bit data address space (unless you pretend it is an ASIC version with
32 bit data address space)
You’ll also need to develop a new software development tool chain for this
‘preliminary’ processor.
By comparison, if you chose an ARM coprocessor you’d have the 32 bit
address space and tool chain ready to go.
This is what I don't understand... I'm talking about a spiritual
successor to the 65C816 that looks like a duck, walks like a duck, and
quacks like a duck... unlike what the 65C832 would be to the 65C816 as
that is to the 65C02 as that is to the 6502, the ARM has no resemblance
whatsoever to the 6502 line despite it having been the inspiration for
the ARM; you might as well put an Intel inside and program a new GS/OS
in x86 and run it and claim it's an Apple IIgs, but it's not, you can't
leverage any existing software, not even a single instruction, so it
doesn't make any sense in an Apple II. With the 65C832 you'd be able to
leverage what's already out there, and any assemblers and compilers
would simply need to be extended, not replaced. What I'm saying is that
I think we're at the point where we can create a much faster Apple II
accelerator (via FPGA or emulation as I'm doing on my Pi) so we can
achieve that 1 GHz GS/OS, and while we're at it maybe we can add some
things that we've always wanted in the process, like 32-bitness or some
badly-needed instructions.
Also I'm not stuck on the 65C832, right now this is all just talk, just
trying to see what the veterans here think the successor should look
like if one had been made for the 32-bit world, just a bunch of
locker-room talk for now. I'll be happy just to get this 1 GHz 6502
going, lol!
As for what to shoot for: it will not be easy to make an FPGA processor
which is faster than software emulation. Software can emulate a 65816
at an effective speed of 1GHz already, which is actually much faster
than the speed of a real 65816 running at 1GHz. This works out to about
300 million instructions per second (since 65816 instructions average a
little over 3 clocks each). FPGAs at reasonable prices are basically
limited to around 300-350MHz clock speeds. A complex FPGA design which
executed one 65816 instruction every clock cycle would just about match
the speed of emulation on today's CPUs. But since memory can't be
accessed at a sustained 350MHz, its effective rate will be lower
(think caches and cache misses).
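The arithmetic behind that comparison can be sketched as follows. The 3.3 clocks per instruction is an assumed stand-in for "a little over 3", and the cache miss rate and miss penalty are made-up illustrative numbers, not measurements of any real FPGA design.

```python
def instructions_per_second(clock_hz, clocks_per_instruction):
    """Instruction throughput for a given clock and average clocks per instruction."""
    return clock_hz / clocks_per_instruction

def fpga_effective_rate(clock_hz, miss_rate, miss_penalty_cycles):
    """One instruction per cycle, plus stall cycles on a fraction of instructions."""
    average_cycles = 1.0 + miss_rate * miss_penalty_cycles
    return clock_hz / average_cycles

# Emulation at an effective 1 GHz, assuming 3.3 clocks per instruction.
emulated = instructions_per_second(1e9, 3.3)

# Ideal FPGA core: 350 MHz, one instruction every clock, no memory stalls.
fpga_ideal = fpga_effective_rate(350e6, 0.0, 10)

# Same core with a hypothetical 5% miss rate and 10-cycle miss penalty.
fpga_with_misses = fpga_effective_rate(350e6, 0.05, 10)

print(round(emulated / 1e6), round(fpga_ideal / 1e6), round(fpga_with_misses / 1e6))
```

Under these assumptions the ideal core lands slightly above the emulator's roughly 300 million instructions per second, and even a modest miss rate drops it below.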
So if not the fastest experience, what do you want?
Theorizing about CPU designs can be fun, but a 65832 has a lot of
headwinds against it. A lot of software is needed to get anywhere
(assemblers, compilers, disassemblers, etc. etc.). There are many ways
to add 32-bit support, so there are a lot of choices to be made, where
the easy choice often works directly against making it run fast. To have any
kind of speed, it will need to run one instruction per cycle (or more!),
which means a new mode (since 6502/65816 compatibility needs to keep the
byte fetches). One approach is to make WDM a prefix for existing
instructions that changes how they work--WDM STA could always write 32
bits, for example, and WDM BNE could use 32-bit (or 16-bit)
displacements. But what should WDM CLC do? This is where new
operations can be added. The 65816 makes some mistakes (like SEP #$20;
STA; REP #$30 to do a store of one byte), which would be nice to fix in
some way. Another approach is for WDM REP #$30 to enter 32-bit mode, and then
you just widen all the existing instructions to work on 32-bit data.
But this can be harder to make fast. So, if you create a new
instruction set, then you've got to write a lot of software to support this
(compilers, assemblers), plus then write software which takes advantage
of it. I think that's what the previous poster was saying: if WDM was a
switch to an ARM instruction set, then you get a whole lot of support
for the software needed.
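As a rough illustration of the WDM-as-prefix idea discussed above, here is a toy decoder for just two opcodes. WDM ($42), STA absolute ($8D) and BNE ($D0) are the real 65816 encodings; everything the prefix does here (32-bit stores, a 16-bit branch displacement, the prefix being followed directly by the opcode it modifies) is an assumption made for the sketch, and `decode` and `OPCODES` are hypothetical names.

```python
WDM = 0x42  # reserved opcode on the 65816, used here as a hypothetical prefix

# opcode -> (mnemonic, operand bytes without prefix, operand bytes with prefix)
OPCODES = {
    0x8D: ("STA abs", 2, 2),  # address size unchanged; prefix widens the data
    0xD0: ("BNE", 1, 2),      # prefix widens the displacement (assumed 16-bit)
}

def decode(code, pc, m_flag_16bit=True):
    """Decode one instruction; returns (mnemonic, prefixed, operand, data_bits, next_pc)."""
    prefixed = code[pc] == WDM
    op_at = pc + 1 if prefixed else pc
    mnemonic, plain_len, prefixed_len = OPCODES[code[op_at]]
    operand_len = prefixed_len if prefixed else plain_len
    # Raw little-endian operand; a branch displacement is not sign-extended here.
    operand = int.from_bytes(code[op_at + 1:op_at + 1 + operand_len], "little")
    if mnemonic.startswith("STA"):
        # Without the prefix the store width follows the M flag as on the 65816.
        data_bits = 32 if prefixed else (16 if m_flag_16bit else 8)
    else:
        data_bits = None
    return mnemonic, prefixed, operand, data_bits, op_at + 1 + operand_len
```

For example, `decode(bytes([0x42, 0x8D, 0x00, 0x20]), 0)` decodes as a 32-bit STA to $2000 occupying four bytes. Even this toy version shows the cost: every prefixed instruction is a byte longer, and every tool that assembles or disassembles code has to learn the new forms.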
Kent