CSE 111, Fall 2000

Great Ideas in Computer Science

Lecture Notes #23

PROGRAMMING IN PASCAL:
TEXT PROCESSING (continued)

11.  Pascal's Version of Karel's
       "define-new-instruction" (continued)

e) (continued)
    A Pascal function -- which is one of Pascal's
       versions of Karel's "define-new-instruction" --
        has:

        a name
        an input
        an output
        an algorithm (i.e., a procedure that converts
                             the input into the output)

e.g.)    function LastLetter(word : string) : string;
         begin
             LastLetter := substr(word, length(word), 1)
         end;

"function" is a reserved word.
The name of this Pascal function is "LastLetter".
The input of this Pascal function is called "word".
The type of the input is "string".
The type of the output is "string".
The algorithm is to compute the "substr" function
    as shown (i.e., to take the 1 character of "word"
    that begins at position length(word)), and assign
    the result to the name of the function.
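
To see the function at work, here is a minimal sketch of a complete
program that calls it.  It assumes a Turbo Pascal-style compiler
(such as Free Pascal), in which the built-in "copy" plays the role
that "substr" plays in these notes; the program name and the two
test words are my own choices for illustration.

```pascal
program LastLetterDemo;

{ Assumption: a Turbo/Free Pascal-style compiler, where the built-in
  "copy" plays the role that "substr" plays in these notes. }

function LastLetter(word : string) : string;
begin
    { copy(s, start, count) returns count characters of s,
      beginning at position start }
    LastLetter := copy(word, length(word), 1)
end;

begin
    writeln(LastLetter('memory'));
    writeln(LastLetter('dog'))
end.
```

Run under those assumptions, this should print "y" and then "g",
one per line.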

The plurals2.p program, from which this comes,
behaves as follows:

1.  The compiler sets aside 3 memory locations
    (noun, plural, and task),
    each able to contain a string.

2.  The compiler sees the definition of "function
    LastLetter", so it does the following:

    a)    It sets aside a memory location called
            "LastLetter", of type "string"
    b)    It sets aside a memory location called
            "word", also of type "string"

    *    That takes care of the input and output
        of this function.

3.  When the program begins execution, it prints
    a prompt to the user, asking for a noun.

4.  The user types in a noun, which is stored in
    memory location "noun".

5.  A message, "not a special case", is stored in
    memory location "task"

6.  Now what happens depends on what the user
    typed in.  I'll describe only what happens in
    the case where the user typed in a word ending
    in "y", so that we can see how "LastLetter" works.
    So, suppose that back in step 4, the user had
    typed in the noun "memory", which is now stored
    in memory location "noun".

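7.    Since noun <> 'mouse', the test of the first "if"
        statement is false, so its "then" part is not
        executed.

8.    Since noun <> 'buffalo', the test of the 2nd "if"
        statement is false, so its "then" part is not
        executed.

9.    Since task = 'not a special case', the test of the
        3rd "if" statement is true, so its "then" part is
        executed: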

10.  Now we come to the test:

        LastLetter(noun) = 'y'

    Here's what happens:

    a)    A copy of the contents of "noun" is
            stored in "word"
    b)    The computer checks the definition
            of "LastLetter", and computes

            substr(word, length(word), 1)

            This, as you know, returns the last
            letter of "word".  "word", in our case,
            contains 'memory', so the last letter is 'y'.

            'y' is then stored in memory location
            "LastLetter".

11.    Since LastLetter(noun) = 'y',
            the test is true, so the "then" portion
            of the "if-then-else" instruction is
            executed, and AllButLastLetter(noun)
            is computed.  By now, you should be
            able to figure out how that is done.
            The result is, of course, "memor".

12.    Next, "memor" is concatenated with 'ies',
        and the result ("memories") is stored in
        memory location "plural".

13.    Finally, a message is written that says
            that the plural of the string stored in
            "noun" is the string stored in "plural";
            i.e., the plural of "memory" is "memories".
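
The steps above can be put together in a sketch like the following.
This is a hypothetical reconstruction made only from the trace; it
is not the actual plurals2.p.  I am assuming that "copy" stands in
for "substr" and that "+" does the concatenation (as in Free Pascal).
The prompt wording, the plurals stored for the two special cases, the
'special case' message, the body of "AllButLastLetter", and the final
"else" branch are all guesses.

```pascal
program PluralsSketch;

{ A hypothetical reconstruction, NOT the actual plurals2.p.
  Assumptions: "copy" stands in for "substr", "+" concatenates,
  and the special-case plurals and the "else" branch are guesses. }

var
    noun, plural, task : string;

function LastLetter(word : string) : string;
begin
    LastLetter := copy(word, length(word), 1)
end;

function AllButLastLetter(word : string) : string;
begin
    { everything from position 1 up to, but not including,
      the last letter }
    AllButLastLetter := copy(word, 1, length(word) - 1)
end;

begin
    write('Please type a noun: ');           { step 3 }
    readln(noun);                            { step 4 }
    task := 'not a special case';            { step 5 }

    if noun = 'mouse' then                   { step 7 }
        begin
            plural := 'mice';
            task := 'special case'
        end;

    if noun = 'buffalo' then                 { step 8 }
        begin
            plural := 'buffalo';
            task := 'special case'
        end;

    if task = 'not a special case' then      { step 9 }
        if LastLetter(noun) = 'y'            { steps 10-12 }
            then plural := AllButLastLetter(noun) + 'ies'
            else plural := noun + 's';       { guessed default rule }

    writeln('The plural of ', noun, ' is ', plural)   { step 13 }
end.
```

If the user types "memory", this sketch should follow steps 7-13
exactly as traced above and report "memories" as the plural.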
 

12.  But does it really compute the plural of any
       noun?


Well, try it with these nouns to find out:

    dog
    memory
    donkey
    knife
    goose
    baggage
    water

Computing plurals correctly is a major issue in computational
linguistics, and it takes the skills of a linguist to help us
write the program!
 

13.  "index"

Here is another string manipulation function that is
built into Pascal.

"index" takes 2 strings as input, and returns an
integer:

index('abcdef', 'cdef') = 3
index('abcbc', 'bc') = 2
index('a', 'b') = 0

It works as if it were declared thus:

function index (word, subword : string) : integer;
begin
     if "subword" is a substring of "word"
        then index := the position of the 1st character
                             of the 1st occurrence of "subword"
                            in "word"
        else index := 0
end;

Note that the text's version of "index" is called "pos",
and that its inputs are in the opposite order from those
of "index":

    pos(subword, word) = index(word, subword)
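
The "as if it were declared thus" description above can be turned
into running code.  The following is a minimal sketch, assuming a
Turbo Pascal-style compiler (such as Free Pascal) that provides
"copy" and "pos".  The name "MyIndex" is hypothetical; I chose it
so that it does not clash with anything built in.  It is a
hand-written illustration, not the way the built-in function is
actually implemented.

```pascal
program IndexSketch;

{ "MyIndex" is a hypothetical, hand-written stand-in for "index".
  Assumption: "copy" and "pos" are available, as in Free Pascal. }

function MyIndex(word, subword : string) : integer;
var
    i, found : integer;
begin
    found := 0;
    i := 1;
    { try each possible starting position until one matches
      or there are none left }
    while (found = 0) and (i <= length(word) - length(subword) + 1) do
        begin
            if copy(word, i, length(subword)) = subword
                then found := i;
            i := i + 1
        end;
    MyIndex := found
end;

begin
    writeln(MyIndex('abcdef', 'cdef'));
    writeln(MyIndex('abcbc', 'bc'));
    writeln(MyIndex('a', 'b'));
    { the built-in "pos" takes its inputs in the opposite order }
    writeln(pos('bc', 'abcbc'))
end.
```

The first three lines of output should agree with the three
examples of "index" given above (3, 2, and 0), and the last line
should agree with the second one, since
pos('bc', 'abcbc') = index('abcbc', 'bc').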
 


Copyright © 2000 by William J. Rapaport (rapaport@cse.buffalo.edu)

file: 111F00/lecturenotes23.03nvc00.html