Assignment 3, Due at 11:59pm, Tuesday Sep 23

Objectives

The 250HTML markup language

Recall that in HTML we can change text attributes by surrounding the text with a pair of tags, e.g.
<b>This is bold</b>, <i>this is italic</i>, 
<b>bold again <i>and italic at the same time</i></b>, 
and <u>underlined</u>, and back to normal.
Of course, not all HTML files are correctly tagged. For example, the following is not tagged correctly, because the bold tag is closed while the italic tag inside it is still open:
<b>Bold text<i> and italic</b>
The World Wide Web Consortium (W3C) has a service that lets us validate any HTML file, confirming whether the file is well-formed. In this assignment, we will make up our own HTML-like language and implement a "validator" for it. We will also write a function that correctly renders (displays) a text string (or expression) marked up in our language.

Let's call our language "250HTML". The language accepts the following tags: The tags correspond to a subset of the text attributes and colors that we can easily set using terminal control codes I discussed already. Valid input expressions in 250HTML look as follows.
<red>Red <dim>dim and red</dim> back to red</red>
    
<yellow>Yellow <underline>underlined yellow <dim>dim</dim> underlined yellow</underline> and <cyan>cyan</cyan> and yellow again</yellow>
An invalid 250HTML expression contains one or more of the following types of errors:

What to do

You are to write a C++ program that does the following.
  1. It keeps reading the user's input, line by line. Each input line the user types is supposed to be in one of the following three forms:
    validate [250HTML expression]
    display [250HTML expression]
    exit
    
    For example, the following are good commands:
    > validate <red>This is red <blue>and this is blue</blue> and back to red</red>
    > display <red>print this out in dimmed red and red</red>
    In the above commands,
    <red>This is red <blue>and this is blue</blue> and back to red</red>
    and
    <red>print this out in dimmed red and red</red>
    are 250HTML expressions.
  2. If the user types exit, then your program just quits.
  3. The validate command reports whether the expression is well-formed according to our language 250HTML described above.
  4. The display command prints the expression where the text attributes (foreground color and 3 other attributes) are displayed correctly (if the expression is well-formed). All tags are stripped off when displaying the expression, of course. If the expression is not well-formed, then an error message is reported. If two foreground color tag pairs are nested inside one another, then the inner text has the color specified by the inner tag pair. For example,
    <red>This is red <blue>and this is blue</blue> and back to red</red>
    Here, "and this is blue" is displayed in blue and the rest of the text in red. In general, if there is a conflict, the innermost tag pair has the highest priority.

The program above might seem daunting, but I have already written a skeleton of the program for you. You can download the source files with

wget http://www.cse.buffalo.edu/~hungngo/classes/2014/Fall/250/assignments/A3.tar
tar -xvf A3.tar
Please read all the code in the code base, but note that you may modify only one file, cmd.cpp, to implement the two functions that were left empty there. You can compile the program by typing make. The Makefile is already written for you.

Explanation and my implementation

How to submit

Submit only the cmd.cpp file. We will put your submission into a directory that has all the other files in the codebase and compile using make. To submit, run
submit_cse250 cmd.cpp
Note again that the submission only works if you are logged in to your CSE account and the cmd.cpp file is there. Everything else can be done at home, as long as you remember to upload the final file to your CSE account and run the submit script from there.

Grading (100 points total)

As with the first two assignments, this assignment is to be done alone.

Supporting materials

The crucial piece of knowledge is the stack-based algorithm for recognizing Well-Formed Expressions. I will discuss that and C++'s stack in class. Read the chapter on stacks in the textbook if you need to. Read the Terminal control post on the blog. The following snippet of code is probably helpful too. You will need Lexer.h and Lexer.cpp in the same directory as the following file to test it.
// lexerDriver.cpp
#include <iostream>
#include <string>
#include "Lexer.h"

using namespace std; // BAD PRACTICE

int main() 
{
    string line;

    Token tok; 
    Lexer lexer;
    // Show the prompt before each read; Ctrl-Z/D to quit!
    while (cout << "Enter an expression to tokenize: \n> " && getline(cin, line))
    { 
        lexer.set_input(line);
        while (lexer.has_more_token()) {
            tok = lexer.next_token();
            switch (tok.type) {
                case TAG:
                    if (tok.value[0] != '/') 
                        cout << "OPEN TAG: " << tok.value << endl;
                    else
                        cout << "CLOSE TAG: " << tok.value.substr(1) << endl;
                    break;
                case IDENT:
                    cout << "IDENT: " << tok.value << endl;
                    break;
                case BLANK:
                    cout << "BLANK: " << tok.value << endl;
                    break;
                case ERRTOK:
                    cout << "Syntax error on this line\n";
                    break;
                default:
                    break;
            }
        }
    }
    return 0;
}