program number;
var a : integer;
begin
a := (5 + 6)
end.
Now, the first thing to notice about this program
is that it has no output! It just adds 2 integers
and stores the result in memory location "a".
The point of the program is to exhibit its P88
assembly-language counterpart, not to be an
interesting program by itself.
b) If you create that program in Unix, and then
compile it with "pc", the result will be a Unix
machine-language version of the Pascal program,
which can then be executed.
Compworks compiles it, instead, into a program
that can be executed on the P88 simulated
computer, and the result is shown not in the
0s and 1s of machine language but in a more
readable assembly language:
COPY  AX, #C0
ADD   AX, #C1
COPY  _E0, AX
COPY  AX, _E0
COPY  a, AX
#C0   5
#C1   6
_E0   0
a     0
c) Let's hand trace this, showing the results of our
trace with a table whose columns represent
the registers we need to keep track of, and
whose rows represent the steps of the program.
On the CPU side of our toy P88 computer,
we need to keep track of the accumulator (AX);
on the memory side, we need to keep track of
the memory locations called #C0, #C1, _E0, & a.
Initially, AX is empty, and the others have the
values shown in the assembly language
program above:
AX    #C0   #C1   _E0   a
=========================
      5     6     0     0
After the first instruction is fetched and
executed, our table will look like this:
AX    #C0   #C1   _E0   a
=========================
      5     6     0     0
5
Here, we're using the following convention:
If no new data is stored in any column, then
assume that whatever data was there before
is still there. So, #C0 still has 5 in it,
#C1 still has 6, etc.
Here's the rest of the trace of the program:
AX    #C0   #C1   _E0   a
=========================
      5     6     0     0
5
11
                  11
11
                        11
So, at the end of the program, AX has 11,
#C0 still has 5, #C1 still has 6, _E0 has 11
& a has 11.
That's exactly what we wanted to happen:
i.e., we wanted 11 (the sum of 5 and 6) to
be in memory location "a".
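The hand trace can also be checked mechanically. Here is a minimal
sketch of a P88-style machine in Python; it is an illustration, not
the Compworks simulator. It assumes that "COPY dest, src" copies src
into dest, and that "ADD AX, src" adds src to AX, which is how the
trace above treats them. The names run, PROGRAM, and MEMORY are made
up for this example.

```python
# Minimal sketch of a P88-style machine: one accumulator (AX) plus
# named memory cells. Only COPY and ADD are modeled (an assumption
# based on the five instructions in the listing above).

def run(program, memory):
    """Execute each instruction; return a snapshot of AX and memory per step."""
    state = dict(memory)
    state["AX"] = None  # AX starts out empty
    trace = []
    for op, dest, src in program:
        if op == "COPY":
            state[dest] = state[src]                # dest gets src's value
        elif op == "ADD":
            state[dest] = state[dest] + state[src]  # dest gets dest + src
        else:
            raise ValueError(f"unsupported instruction: {op}")
        trace.append(dict(state))
    return trace


PROGRAM = [
    ("COPY", "AX", "#C0"),
    ("ADD", "AX", "#C1"),
    ("COPY", "_E0", "AX"),
    ("COPY", "AX", "_E0"),
    ("COPY", "a", "AX"),
]
MEMORY = {"#C0": 5, "#C1": 6, "_E0": 0, "a": 0}

trace = run(PROGRAM, MEMORY)
for step, snap in enumerate(trace, start=1):
    # One row per step: step number, then AX, #C0, #C1, _E0, a
    print(step, snap["AX"], snap["#C0"], snap["#C1"], snap["_E0"], snap["a"])
```

Unlike the hand-trace table, each printed row repeats every value
instead of leaving unchanged columns blank, so the last row shows the
final state directly: AX, _E0, and a all hold 11.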
d) But something odd happened: The sum, 11,
got computed in AX, then got copied into
_E0, then got copied back into AX, and then
at long last got copied into "a".
Why didn't the assembly language program
just directly copy the 11 from AX to "a"?
* It could have; it just didn't. This is a nice
example of something we talked about
earlier in the semester: 2 different
algorithms that have the same input-output
behavior (i.e., that do the same thing, but
that do it in different ways).

One of the 2 algorithms is longer and less
efficient than the other. So why didn't the
compiled version of our Pascal program
do things the easy, short way, instead of
the longer way?

Well, there are things called "optimizing
compilers" that can take a longer compiled
program, inspect it for silly "do-nothing
dances" like the moving of 11 back and
forth that accomplishes nothing, and delete
those extra steps.
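To make "delete those extra steps" concrete, here is a rough sketch
in Python of two such clean-up passes applied to our 5-instruction
program. It is not how any particular optimizing compiler works. The
function names are invented for this example, and the second pass
rests on an assumption: that names beginning with "_E" are compiler
temporaries whose final values nobody needs.

```python
PROGRAM = [
    ("COPY", "AX", "#C0"),
    ("ADD", "AX", "#C1"),
    ("COPY", "_E0", "AX"),
    ("COPY", "AX", "_E0"),
    ("COPY", "a", "AX"),
]


def remove_redundant_reloads(program):
    """Drop 'COPY AX, T' when the instruction just before it was 'COPY T, AX'."""
    kept = []
    for op, dest, src in program:
        if kept:
            prev_op, prev_dest, prev_src = kept[-1]
            is_reload = (
                op == "COPY" and prev_op == "COPY"
                and dest == "AX" and prev_src == "AX"
                and src == prev_dest
            )
            if is_reload:
                continue  # AX already holds this value
        kept.append((op, dest, src))
    return kept


def remove_dead_temp_stores(program):
    """Drop 'COPY T, ...' when T is a temporary that is never read.

    Assumption: cells named '_E...' are temporaries, so an unread
    store to one of them has no visible effect.
    """
    read_cells = {src for _, _, src in program}
    read_cells |= {dest for op, dest, _ in program if op == "ADD"}  # ADD reads dest too
    return [
        (op, dest, src)
        for op, dest, src in program
        if not (op == "COPY" and dest.startswith("_E") and dest not in read_cells)
    ]


optimized = remove_dead_temp_stores(remove_redundant_reloads(PROGRAM))
for op, dest, src in optimized:
    print(f"{op} {dest}, {src}")
```

The order of the passes matters: the store into _E0 only becomes
"never read" after the reload from _E0 has been deleted. What is left
is the short version: load 5, add 6, store into "a".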
* But that doesn't explain why the compiler
has those extra steps there in the first place.
The full explanation can be found in
Biermann, Ch. 9, which discusses compilers.
But here's a quick overview of how it came
to happen:

The Pascal program's main (in fact, only)
instruction is an assignment statement:

    a := (5 + 6)

In general, assignment statements have
the form:

    X := e

where "X" is the name of a memory location
and "e" is some expression that could be:
- a value to be stored in X
  (e.g., an integer, a string, etc.)
- a more complicated expression
  that first has to be computed
  to yield a value that can then
  be stored in X
  (e.g., it could be a substr function,
  or, as in our case, it could be an
  arithmetic problem)

The compiler has to be able to translate
any such assignment statement, whether
"e" is a simple value or a piece of code
that has to be computed first.

The translation of "X := e" from Pascal to
P88 assembly language is the following
sequence of (at least) 3 instructions:

    code(e)
    COPY AX, register where code(e)'s value is stored
    COPY X, AX

Let's consider these in reverse order:

"COPY X, AX" is the required P88 instruction
that stores something into memory location
X; since that's what "X := e" is supposed to
do (namely, store something in X), we need
this line of code.

"COPY AX, reg. where code(e)'s value is stored"
puts something into AX. Presumably, what it
puts there is what eventually has to be stored
in X. So we need this line of code to get the
data into AX, so that we can then get it into X;
remember: in assembly language, we cannot
put anything directly into a memory location;
it can only be put there from the accumulator.

So what is the "register where code(e)'s value
is stored"? Well, that takes us back to the
first line: what is "code(e)"? Simply put,
it's whatever assembly language instructions
are needed to compute "e".

So "X := e" is compiled into 3 steps of
assembly language:

    compute e;
    store the result in AX;
    move the result into X

But "compute e" is more than just doing
the computation; the result of the
computation must be located somewhere;
call that location "_E0".

For us, "compute e" is just "(5+6)", with
the result (11) stored somewhere.

But the computation is done in AX;
so then it has to be moved "somewhere",
namely, _E0.

But then the last 2 parts of the compiled
assembly language instructions get 11 from
_E0 into AX, and thence into "a".

That's why there are the seemingly extra
instructions that just move 11 around for
no apparent purpose!
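
The translation scheme for "X := e" can be sketched as a tiny Python
function. This is only an illustration of the 3-part pattern, not the
actual Compworks compiler (see Biermann, Ch. 9, for the real story).
The function name compile_assignment and the tuple shape ("+", 5, 6)
for the expression are assumptions made for this example; the cell
names #C0, #C1, and _E0 follow the listing shown earlier.

```python
def compile_assignment(target, expr):
    """Translate `target := expr` into (instructions, data).

    expr is either an int constant or a tuple ("+", left_int, right_int);
    both shapes are simplifying assumptions made for this sketch.
    """
    instructions = []
    data = {}
    constant_count = 0

    def new_constant(value):
        # Reserve a memory cell (#C0, #C1, ...) holding a constant.
        nonlocal constant_count
        name = f"#C{constant_count}"
        constant_count += 1
        data[name] = value
        return name

    # Part 1: code(e), which leaves e's value in the cell named `result`.
    if isinstance(expr, int):
        result = new_constant(expr)  # nothing to compute for a constant
    else:
        operator, left, right = expr
        if operator != "+":
            raise ValueError(f"unsupported operator: {operator}")
        left_cell = new_constant(left)
        right_cell = new_constant(right)
        result = "_E0"
        data[result] = 0
        instructions.append(("COPY", "AX", left_cell))
        instructions.append(("ADD", "AX", right_cell))
        instructions.append(("COPY", result, "AX"))

    # Part 2: COPY AX, <cell where code(e)'s value is stored>
    instructions.append(("COPY", "AX", result))
    # Part 3: COPY X, AX
    instructions.append(("COPY", target, "AX"))
    data[target] = 0
    return instructions, data


instructions, data = compile_assignment("a", ("+", 5, 6))
for op, dest, src in instructions:
    print(f"{op} {dest}, {src}")
```

Because parts 2 and 3 are emitted the same way no matter what "e" is,
the function never has to ask whether AX already holds the answer.
That uniformity is what produces the round trip through _E0 for
"a := (5 + 6)".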