On 2021-04-17 13:48, Robert Prins wrote:
Hi all,
I would like to disassemble the final version of a self-written Turbo Pascal
V3 program, i.e. a simple .COM file, and to that effect I've dug out my old (AD 2004) registered copy of IDA Pro (V4.7.0.831). Not having used it for more than 10 years, and no longer having access to their forum, I'm now stuck. The .COM file loads, IDA happily disassembles it, but it just creates one single segment,
and I no longer have a clue how to create the data segment. There's a bit of info in the TP3 Manual, and using David Lindauer's GRDB in DOSBox-X allows me
to single-step through the RTL initialisation code, which shows me that it sets up DS and SS, but that doesn't help me in setting up these segments in IDA.
I've tried the "Create Segment" option, but I'm lost entering the required
values for start address, end address and base; "class" is probably "DATA". The ones for the single "seg000" that IDA creates are CODE, start @ 0x0100, end @ 0xD623, which leads me to assume that a to-be-created "seg001" should start at 0x0000, end at 0xffff, and have a base of 0xd63 (paragraphs), but that results in a "Bad segment base: segment would have bytes with a negative offset" pop-up.
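To spell out the arithmetic behind those numbers, here is a small Python sketch. It assumes that IDA's start and end values are linear addresses, that the base is given in 16-byte paragraphs, and that an offset is simply the linear address minus base*16; the helper names are made up for illustration.

```python
PARAGRAPH = 16  # bytes per real-mode paragraph

def next_paragraph(linear_end):
    """Paragraph number of the first 16-byte boundary at or after linear_end."""
    return (linear_end + PARAGRAPH - 1) // PARAGRAPH

def first_offset(start_linear, base_paragraphs):
    """Offset of a segment's first byte (assumption: linear - base * 16)."""
    return start_linear - base_paragraphs * PARAGRAPH

base = next_paragraph(0xD623)                 # end of seg000 as reported by IDA
print(hex(base))                              # paragraph following the code
print(first_offset(0x0000, base))             # start 0x0000 with that base
print(hex(first_offset(0xD630, base)))        # start 0xd630 with that base
print(hex(first_offset(0xD630, 0x0000)))      # start 0xd630 with base 0
```

Under these assumptions, a start of 0x0000 with base 0xd63 puts the first byte at a negative offset, which would match the wording of the pop-up, while a start of 0xd630 with the same base puts it at offset 0.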
Trying start @ 0xd630, end @ 0x1d630, with a base 0x0000 creates a segment,
but it looks like
seg000:D622
seg001:C8C00 ; ---------------------------------------------------------------------------
seg001:C8C00
seg001:C8C00 ; Segment type: Regular
seg001:C8C00 seg001 segment byte public '' use16
seg001:C8C00 assume cs:seg001
seg001:C8C00 ;org 0C8C00h
seg001:C8C00 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
Which may be correct, but the "org 0c8c00" makes absolutely no sense to me.
I've had a bit, or rather a huge amount, of help from Hex-Rays' Ilfak Guilfanov, and using the names in "scg.zip" (found @ <https://www.pcengines.ch/tp3.htm>), I've got a complete disassembly of the compiler. I cut down the IDA generated .IDC file to include just the info about the RTL, manually changed some data (which at some stage should be done with built-in IDC functions), wrote a bit of REXX to add identifiers to every Pascal procedure (basically inline statements that jump over upper-cased procedure names in Pascal-string format) and got myself a nice assembly listing, with code
that's obviously working, but very "simple" (Let's just leave it at that...)
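As an illustration of those identifiers, here is a minimal Python sketch of the byte layout. It assumes each one is a short jump (opcode 0xEB) over a length-prefixed, upper-cased name; `make_marker` and `find_markers` are hypothetical helper names, not part of my REXX code.

```python
def make_marker(name):
    """Bytes for a 'jmp short' over a Pascal string holding the upper-cased name."""
    text = name.upper().encode("ascii")
    if len(text) > 126:
        raise ValueError("name too long to skip with a short jump")
    pascal = bytes([len(text)]) + text          # length byte, then the characters
    return bytes([0xEB, len(pascal)]) + pascal  # EB rel8 skips the whole string

def find_markers(image):
    """Yield (offset, name) for every EB nn / length / name pattern in image."""
    for pos in range(len(image) - 2):
        if image[pos] != 0xEB:
            continue
        skip, length = image[pos + 1], image[pos + 2]
        name = image[pos + 3:pos + 3 + length]
        printable = all(0x20 < b < 0x7F for b in name)
        if length > 0 and skip == length + 1 and len(name) == length and printable:
            yield pos, name.decode("ascii")
```

A scan like this can report false positives in data that happens to look like a marker, so the results would still need checking against the disassembly.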
I could let IDA generate an assembler listing, hack that to pieces, most likely in some automated way, as there are dozens of procedures that look like
cseg:4E77 proc day_ptr_is_td_top near
cseg:4E77
cseg:4E77 push bp
cseg:4E78 mov bp, sp
cseg:4E7A push bp
cseg:4E7B jmp $+3
cseg:4E7E ; ------------------------------------------------------------
cseg:4E7E
cseg:4E7E @01:
cseg:4E7E jmp short @02
cseg:4E7E ; ------------------------------------------------------------
cseg:4E80 db 17,'DAY_PTR_IS_TD_TOP'
cseg:4E92 ; ------------------------------------------------------------
cseg:4E92
cseg:4E92 @02:
cseg:4E92 mov eax, [td_top]
cseg:4E96 mov [day_ptr], eax
cseg:4E9A mov [winday_top], eax
cseg:4E9E call _day_list_is_day_ptr
cseg:4EA1 jmp $+3
cseg:4EA4 ; ------------------------------------------------------------
cseg:4EA4
cseg:4EA4 @03:
cseg:4EA4 mov sp, bp
cseg:4EA6 pop bp
cseg:4EA7 retn
cseg:4EA7 endp day_ptr_is_td_top
where a stack-frame isn't required, and likewise for the "jmp $+3"'s.
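One automated clean-up could look like the following Python sketch. It assumes a listing in the layout shown above and only drops the "jmp $+3" lines, which in that listing just continue with the next instruction. Removing the stack frame would need a check that BP isn't used in the body, which this doesn't attempt. `strip_null_jumps` is a made-up name.

```python
import re

# An address field, then exactly "jmp $+3" as the instruction.
NULL_JUMP = re.compile(r"^\S+\s+jmp\s+\$\+3\s*$")

def strip_null_jumps(lines):
    """Return the listing lines without the 'jmp $+3' instructions."""
    return [line for line in lines if not NULL_JUMP.match(line)]

sample = [
    "cseg:4E7A push bp",
    "cseg:4E7B jmp $+3",
    "cseg:4E7E jmp short @02",
    "cseg:4EA1 jmp $+3",
    "cseg:4EA4 mov sp, bp",
]
for line in strip_null_jumps(sample):
    print(line)
```

The addresses in the remaining lines are then stale, which doesn't matter if the result is only used as input for a reassembly.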
However, right now I've started to think about something else, making a few tweaks to the compiler itself. IDA Pro has a built-in assemble command, and can save a changed .COM file, but that would result in an output file with just a lot of NOP instructions, like 20+ in the random number generator
x(n+1) = (x(n) * 129 + 907633385) mod 2^32
32-bit multiplication and addition are easier on a 32-bit CPU than on a 16-bit one...
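To illustrate the difference, here is a Python sketch of the same step done once on a full 32-bit value and once on two 16-bit words, the way a 16-bit CPU has to do it. The function names are invented, and the 16-bit variant only mimics the arithmetic; it is not a transcription of the RTL code.

```python
MULTIPLIER = 129
INCREMENT = 907633385

def lcg_next_32(x):
    """One step on a 32-bit value: (x * 129 + 907633385) mod 2^32."""
    return (x * MULTIPLIER + INCREMENT) & 0xFFFFFFFF

def lcg_next_16(hi, lo):
    """The same step on a hi:lo pair of 16-bit words."""
    inc_hi, inc_lo = divmod(INCREMENT, 0x10000)
    product = lo * MULTIPLIER                 # 16x16 multiply, 32-bit result
    p_hi, p_lo = product >> 16, product & 0xFFFF
    new_hi = (p_hi + hi * MULTIPLIER) & 0xFFFF
    total = p_lo + inc_lo                     # add low words, keep the carry
    new_lo = total & 0xFFFF
    new_hi = (new_hi + inc_hi + (total >> 16)) & 0xFFFF
    return new_hi, new_lo

x = 12345
hi, lo = lcg_next_16(x >> 16, x & 0xFFFF)
print(lcg_next_32(x) == (hi << 16 | lo))
```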
But of course it would be more interesting to see if it's possible to retrofit Norbert Juffa's enhanced 6-byte-real IEEE-compliant (as far as that's possible in this format) arithmetic to the RTL. That, however, would not realistically be possible via the assemble command, but would require a real reassembly. IDA Pro provides two options for generating source, "generic" (aka MASM?) or TASM "Ideal" mode.
Now I can probably figure out what to change where to let Turbo set up its segmentation magic, but my disassembly contains a data segment with the uninitialised RTL variables, and I don't want/need that in a .COM file. The assembler listing generated by the program in the above-mentioned scg.zip, and to be assembled with "AS" from the same, just has a series of "var = value" lines to set up these variables. So is there a way to create in TASM/MASM some kind of "dummy" data segment just to set up variable names/offsets? Googling on dummy/virtual segment doesn't come up with anything helpful, but I'm sure that this is not an uncommon situation.
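For reference, the "var = value" style amounts to something like what this Python sketch prints. The variable names, sizes, starting offset and the exact hex notation are invented for illustration; they are not taken from the RTL or from the scg.zip listing.

```python
def equates(variables, start=0):
    """Turn (name, size) pairs into 'name = offset' lines with running offsets."""
    lines, offset = [], start
    for name, size in variables:
        lines.append(f"{name} = 0{offset:04X}h")
        offset += size
    return lines

# Hypothetical variables: a word, a dword and a 6-byte real.
for line in equates([("rtl_word", 2), ("rtl_long", 4), ("rtl_real", 6)], start=0x0100):
    print(line)
```

This gives names and offsets without reserving any bytes in the output file, which is the effect I'm after with a "dummy" segment.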
Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather -
https://prino.neocities.org/indez.html
Some REXX code for use on z/OS -
https://prino.neocities.org/zOS/zOS-Tools.html