CSE 250 Spring 2004

Programming Project #4

Garbage collection simulation

This project is optional. I will drop your lowest project grade in final grade calculations. If you do not submit this project your project component grade will be based on the review project and projects two and three.

You may work alone or with one other person on this project. The other person must be a student in the class this semester. If you work with another person you and your partner will together make a joint submission (from either of your accounts - do not make two submissions). You must both document clearly in the code what code each of you wrote. You should each contribute equally to the project. If you do not, your grade may be adjusted to reflect how much work you actually contributed to the project.

Description

Implement the mark-sweep garbage collection algorithm described in section 12.4.4 of the textbook. To do this you will need to implement a simulation of memory in which there is a runtime stack and a memory heap. You can implement the stack as an array of int and the heap as another array of int. Details of this simulation, and some simplifying assumptions that we will make, are given below.

Details of memory cell structure

You will need to model a memory location in some detail. To make this part moderately realistic you will model you stack and your heap using arrays of unsigned ints. Your stack should have a maximal size of 32 memory locations and and your heap should each have a maximal size of 128 memory locations. This will make it reasonable to print out both the stack and the heap.

The ranges of array indices is 0-31 for the stack and 0-127 for the heap.

The only values stored in memory locations are pointers, which can have as value a valid memory address (a value in the range 1-127) or null (the value 0).

You must have a way of indicating whether a memory cell is being used or is free. I suggest that you use some of the free bits in the representation of an unsigned int to encode this information. We can't just use the sign of the int, because null is represented as 0, and -0 is just 0. In other words, if we use the sign of the number to indicate whether it is occupied or not all nulls will either always be reclaimed or never be reclaimed.

Consider a different approach. An unsigned int occupies 32 bits on our systems, using CC to compile. A valid heap index only requires 7 bits (0000001 through 1111111). This means we have quite a few bits in an unsigned int that are not being used to represent a valid heap index. Let's use those bits to indicate the status of the memory cell.

How can you store more than one thing in a 32 bit string? You can use bit masks and bitwise operators to read and write individual bits or collections of bits. Consider the following code:

#include <iostream>

using std::cout;
using std::endl;

int main() {
  unsigned int cell = 65534;
  unsigned int mask = 15; 
  cout << "mask is " << mask << endl;         // 0000 0000 0000 0000 0000 0000 0000 1111
  cout << "cell is " << cell << endl;         // 0000 0000 0000 0000 1111 1111 1111 1110
  cout << "cell & mask is " << (cell & mask); // 0000 0000 0000 0000 0000 0000 0000 1110
  cout << endl;
  return 0;
}
The output is:
mask is 15
cell is 65534
cell & mask is 14
When we "and" two bit strings, the bits in the result that are 1 are those that are 1 in both of the two original bit strings. The mask has 1's in those bit positions that we want to read from the "cell" bit string. In the above example the "mask" has 1's only in the bit positions 0 through 3. In (cell & mask) all the bits in positions from 4 through 31 must be zero (since they are zero in the mask), but the bits in positions 0 through 3 have the same value they have in "cell". This shows how we can read particular bits of a longer bit string.

How can we set bits in a representation? To turn them on we "or" with a bit-string with a 1 in the desired position:

#include 

using std::cout;
using std::endl;

int main() {

  unsigned int val = 7;

  unsigned int bit16 = 65536;
  unsigned int bit15 = 32768;
  unsigned int bit14 = 16384;
  unsigned int bit13 =  8192;
  unsigned int bit12 =  4096;
  unsigned int bit11 =  2048;
  unsigned int bit10 =  1024;
  unsigned int bit09 =   512;
  unsigned int bit08 =   256;
  unsigned int bit07 =   128;

  cout << "val is " << val << endl;
  cout << "bit10 is " << bit10 << endl;
  cout << "val | bit10 is " << (val | bit10);
  cout << endl;

  return 0;
}
The output is
val is 7
bit10 is 1024
val | bit10 is 1031
To turn bits off we "and" with a the negation of the bit-string with a 1 in the desired position:
#include 

using std::cout;
using std::endl;

int main() {

  unsigned int val = 1031;

  unsigned int bit16 = 65536;
  unsigned int bit15 = 32768;
  unsigned int bit14 = 16384;
  unsigned int bit13 =  8192;
  unsigned int bit12 =  4096;
  unsigned int bit11 =  2048;
  unsigned int bit10 =  1024;
  unsigned int bit09 =   512;
  unsigned int bit08 =   256;
  unsigned int bit07 =   128;

  cout << "val is " << val << endl;
  cout << "bit10 is " << bit10 << endl;
  cout << "val & ~bit10 is " << (val &  ~bit10);
  cout << endl;

  return 0;
}
The output is
val is 1031
bit10 is 1024
val & ~bit10 is 7

Details of stack implementation

Implement the runtime stack using an array. Implement push and pop methods as follows:
pop which first decrements the stack pointer and then returns the contents of the location currently pointed at by the stack pointer.

push(addr) which assigns addr to the location currently pointed at by the stack pointer, and then increments the stack pointer.

Details of heap implementation

Implement the heap using an array. Implement an allocation method, named allocate new, to get a block of memory from the heap. To keep things simpler, assume that what is allocated is a "cons cell". Here are some links from a Google search on "cons cell": Thus, the allocate new method returns the address (array index) of the first of two array positions that will be occupied by the cons cell. The allocate new method should have two forms:
allocate new() which assigns a null pointer to each component of the cons cell.

allocate new(first, second) which assigns the value of first to the first component of the cons cell and the value of second to the second component of the cons cell.

The null pointer must be represented by the value 0.

Details of cons cell implementation

A cons cell must have accessors and mutators defined: To make it easier for you to manipulate structures on the heap you can define the following methods for your cons cells. These methods can be defined on a separate cons cell class (whose instances store the starting address of the cons cell in the heap array), in which case they will be defined as follows:
getFirst() which returns the contents of the first component of the cons cell

setFirst(addr) which assigns addr to the first component of the cons cell

getSecond() which returns the contents of the second component of the cons cell

setSecond(addr) which assigns addr to the second component of the cons cell

These methods can also be defined directly on the heap, in which case they will be defined as follows:
getFirst(index) which returns the contents of the first component of the cons cell whose starting address is index

setFirst(index, addr) which assigns addr to the first component of the cons cell whose starting address is index

getSecond(index) which returns the contents of the second component of the cons cell whose starting address is index

setSecond(index, addr) which assigns addr to the second component of the cons cell whose starting address is index

Advice

Work incrementally. Make sure you always have an implementation which compiles and runs with correct (though perhaps incomplete) functionality. Do not work on your only copy of your code. Make backups which are submittable.

Write down questions that you have, and bring them to office hours (mine or the TAs'), or send them to cse-250.

Questions?

If you have questions, please ask! If you need clarification or would like more information, please ask! Send e-mail to cse-250@cse.buffalo.edu

Submission

The project is due on or before April 26 at 11:59PM. Place all and only the files you intend to submit in a directory named PP4. You must name your directory PP4. If you do not you risk having points taken off.

The PP4 directory must include only your source code (declaration and implementation) files as well as your makefile and any required documentation.

Your submission must not include object files, executable files or any files whose filenames begin with `#' or end in `~'. You also must not include the SunWS_cache directory and its contents in your submission.

Use the zip utility to create a file named `PP4.zip'. From the directory containing the `PP4' directory, issue the command

    zip -r PP4 PP4
Submit only the resultant `PP4.zip' file.

Use the submit_cse250 command to submit your work.


Carl G. Alphonce
Last modified: Tue Apr 20 20:57:31 EDT 2004