CSE 111, Fall 2000

Great Ideas in Computer Science

Lecture Notes #32

MACHINE ARCHITECTURE
AND ASSEMBLY LANGUAGE

5.    Sample P88 program.

a)    Here's one of the programs available on the
        "compworks"  simulator, first in its Pascal
        version:

    program number;
    var a : integer;
    begin
        a := (5 + 6)
    end.

    Now, the first thing to notice about this program
    is that it has no output!  It just adds 2 integers
    and stores the result in memory location "a".

    The point of the program is to exhibit its P88
    assembly-language counterpart, not to be an
    interesting program by itself.

b)    If you create that program in Unix, and then
        compile it with "pc", the result will be a Unix
        machine-language version of the Pascal
        program, which can then be executed.

        Compworks compiles it, instead, into a program
        that can be executed on the P88 simulated
        computer, and the result is shown--not in
        the 0s and 1s of machine language--but in
        a more readable assembly language:

        COPY    AX, #C0
        ADD     AX, #C1
        COPY    _E0, AX
        COPY    AX, _E0
        COPY    a, AX
        #C0       5
        #C1       6
        _E0        0
        a            0

c)    Let's hand trace this, showing the results of our
        trace with a table whose columns represent
        the registers we need to keep track of, and
        whose rows represent the steps of the program.

        On the CPU side of our toy P88 computer,
        we need to keep track of the accumulator (AX);
        on the memory side, we need to keep track of
        the memory locations called #C0, #C1, _E0, & a.

        Initially, AX is empty, and the others have the
        values as shown in the assembly language
        program above:

        AX    #C0    #C1    _E0    a
        ==============================
              5      6      0      0

        After the first instruction is fetched and
        executed, our table will look like this:

        AX    #C0    #C1    _E0    a
        ==============================
              5      6      0      0
        5

        Here, we're using the following convention:
        If no new data is stored in any column, then
        assume that whatever data was there before
        is still there.  So, #C0 still has 5 in it, #C1 still
        has 6, etc.

        Here's the rest of the trace of the program:

        AX    #C0    #C1    _E0    a
        ==============================
              5      6      0      0
        5
        11
                            11
        11
                                   11

        So, at the end of the program, AX has 11,
        #C0 still has 5, #C1 still has 6, _E0 has 11
        & a has 11.

        That's exactly what we wanted to happen:
        i.e., we wanted 11 (the sum of 5 and 6) to
        be in memory location "a".
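
        The hand trace can also be checked mechanically.
        Here is a minimal sketch in Python (not part of
        compworks).  It assumes only what the listing
        shows: "COPY dest, src" copies src into dest, and
        "ADD AX, src" adds src to AX.  The tuple encoding
        and the names PROGRAM, MEMORY, run, and
        format_table are made up for this illustration.

```python
# Hypothetical encoding of the listing above: (opcode, destination, source).
PROGRAM = [
    ("COPY", "AX", "#C0"),
    ("ADD", "AX", "#C1"),
    ("COPY", "_E0", "AX"),
    ("COPY", "AX", "_E0"),
    ("COPY", "a", "AX"),
]

# Initial memory contents, taken from the data lines of the listing.
MEMORY = {"#C0": 5, "#C1": 6, "_E0": 0, "a": 0}

COLUMNS = ["AX", "#C0", "#C1", "_E0", "a"]


def run(program, memory):
    """Execute the program; return the final state and one trace row per step."""
    # AX starts out empty (None); memory cells start with their listed values.
    state = {"AX": None, **memory}
    rows = [dict(memory)]  # first row: the initial memory contents
    for opcode, dest, src in program:
        if opcode == "COPY":
            state[dest] = state[src]
        elif opcode == "ADD":
            state[dest] = state[dest] + state[src]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # Same convention as the hand trace: record only the cell that changed.
        rows.append({dest: state[dest]})
    return state, rows


def format_table(rows):
    """Render the trace rows with one column per register or memory location."""
    lines = ["".join(name.ljust(7) for name in COLUMNS).rstrip()]
    lines.append("=" * 30)
    for row in rows:
        cells = (str(row.get(name, "")).ljust(7) for name in COLUMNS)
        lines.append("".join(cells).rstrip())
    return "\n".join(lines)


final_state, trace = run(PROGRAM, MEMORY)
print(format_table(trace))
```

        Each row after the first shows only the one cell
        that the corresponding instruction changed, so the
        printed table can be compared row by row with the
        hand trace.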

d)    But something odd happened:  The sum, 11,
        got computed in AX, then got copied into
        _E0, then got copied back into AX, and then
        at long last got copied into "a".

        Why didn't the assembly language program
        just directly copy the 11 from AX to "a"?

        *    It could have; it just didn't.  This is a nice
            example of something we talked about
            earlier in the semester:  2 different
            algorithms that have the same input-output
            behavior (i.e., that do the same thing, but
            that do it in different ways).

            One of the 2 algorithms is longer and less
            efficient than the other.  So why didn't the
            compiled version of our Pascal program
            do things the easy, short way, instead of
            the longer way?

            Well, there are things called "optimizing
            compilers" that can take a longer compiled
            program, inspect it for silly "do-nothing
            dances" like the moving of 11 back and
            forth that accomplishes nothing, and delete
            those extra steps.

        *    But that doesn't explain why the compiler
            has those extra steps there in the first place.
            The full explanation can be found in
            Biermann, Ch. 9, which discusses compilers.
            But here's a quick overview of how it came
            to happen:

            The Pascal program's main (in fact, only)
            instruction is an assignment statement:

                a := (5 + 6)

            In general, assignment statements have
            the form:

                X := e

            where "X" is the name of a memory location
            and "e" is some expression that could be:

                -    a value to be stored in X
                        (e.g., an integer, a string, etc.)

                -    a more complicated expression
                        that first has to be computed
                        to yield a value that can then
                        be stored in X
                        (e.g., it could be a substr function,
                            or, as in our case, it could be an
                            arithmetic problem)

            The compiler has to be able to translate
            any such assignment statement, whether
            "e" is a simple value or a piece of code
            that has to be computed first.

            The translation of "X := e" from Pascal to
            P88 assembly language is the following
            sequence of (at least) 3 instructions:

                code(e)
                COPY AX, register where code(e)'s value
                                is stored
                COPY X, AX

            Let's consider these in reverse order:

            "COPY X, AX" is the required P88 instruction
            that stores something into memory register
            X; since that's what "X := e" is supposed to
            do (namely, store something in X), we need
            this line of code.

            "COPY AX, reg. where code(e)'s value is stored"
            puts something into AX.  Presumably, what it
            puts there is what eventually has to be stored
            in X.  So we need this line of code to get the
            data into AX, so that we can then get it into X;
            remember:  in P88 assembly language, we cannot
            put anything directly into a memory register;
            it can only be put there from the accumulator.

            So what is the "register where code(e)'s value
            is stored"?  Well, that takes us back to the
            first line:  what is "code(e)"?  Simply put,
            it's whatever assembly language instructions
            are needed to compute "e".

            So "X := e" is compiled into 3 parts
            (at least 3 assembly language instructions):

                compute e;
                copy the result into AX;
                move the result from AX into X

            But "compute e" is more than just doing
            the computation; the result of the
            computation must be located somewhere;
            call that location "_E0".

            For us, "e" is just "(5 + 6)", so "compute e"
            means adding 5 and 6, with the result (11)
            stored somewhere.

            But the computation is done in AX;
            so then it has to be moved "somewhere",
            namely, _E0.

            But then the last 2 parts of the compiled
            assembly language instructions get 11 from
            _E0 into AX, and thence into "a".

            That's why there are the seemingly extra
            instructions that just move 11 around for
            no apparent purpose!
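
        To make both halves of this story concrete, here
        is a rough sketch in Python.  It is not how
        compworks or any real compiler is written.  The
        function compile_assignment follows the 3-part
        scheme above for the single case "X := (m + n)",
        and peephole plays the role of a very small
        optimizing pass.  All of the function names, the
        tuple encoding, and the rule that names starting
        with "_E" are compiler temporaries are assumptions
        made for this illustration.

```python
def compile_assignment(target, left, right):
    """Naively compile 'target := (left + right)' using the 3-part scheme."""
    # Part 1, code(e): compute e in AX, then park the result in a temporary.
    code = [
        ("COPY", "AX", "#C0"),
        ("ADD", "AX", "#C1"),
        ("COPY", "_E0", "AX"),
    ]
    # Part 2: fetch the value of e from wherever code(e) left it.
    code.append(("COPY", "AX", "_E0"))
    # Part 3: store AX into the target location.
    code.append(("COPY", target, "AX"))
    data = {"#C0": left, "#C1": right, "_E0": 0, target: 0}
    return code, data


def peephole(code):
    """Delete a store to a temporary that is immediately reloaded into AX.

    Assumption for this sketch: names starting with '_E' are temporaries.
    The pair is removed only if nothing else ever reads the temporary.
    """
    optimized = []
    i = 0
    while i < len(code):
        opcode, dest, src = code[i]
        following = code[i + 1] if i + 1 < len(code) else None
        is_store_to_temp = opcode == "COPY" and dest.startswith("_E") and src == "AX"
        is_reload = following == ("COPY", "AX", dest)
        reads_of_temp = sum(1 for instruction in code if instruction[2] == dest)
        if is_store_to_temp and is_reload and reads_of_temp == 1:
            i += 2  # skip both halves of the round trip
        else:
            optimized.append(code[i])
            i += 1
    return optimized


def execute(code, data):
    """Run a list of COPY/ADD instructions and return the final state."""
    state = {"AX": None, **data}
    for opcode, dest, src in code:
        if opcode == "COPY":
            state[dest] = state[src]
        elif opcode == "ADD":
            state[dest] = state[dest] + state[src]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return state


naive, data = compile_assignment("a", 5, 6)
short = peephole(naive)
print(len(naive), len(short))
print(execute(naive, data)["a"], execute(short, data)["a"])
```

        The naive version has the same 5 instructions as
        the compworks listing; the shortened version has
        3.  Both leave 11 in "a".  The only difference in
        the final state is that the shortened version
        never touches _E0.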
 


Copyright © 2000 by William J. Rapaport (rapaport@cse.buffalo.edu)

file: 111F00/lecturenotes32.30nv00.html