Using FreeBSD For Operating Systems Course Lab

Ken Smith

kensmith@cse.buffalo.edu
  

FreeBSD is a registered trademark of Wind River Systems, Inc. This is expected to change soon.

Intel, Celeron, EtherExpress, i386, i486, Itanium, Pentium, and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this document, and the FreeBSD Project was aware of the trademark claim, the designations have been followed by the ``™'' or the ``®'' symbol.

This article describes using FreeBSD 5.X for an Operating Systems course, though the information may be useful to anyone trying to set up a Crash Lab for Operating Systems research/development. The lab configuration described has been used for a second semester course about Operating Systems programming for six years now, though the base operating system has been FreeBSD for only the past three years. It has supported class sizes up to thirty students, though expanding it to handle more students would not be difficult. The article also describes how to configure the systems so that students can work independently without interfering with each other and without each needing a dedicated machine.


Table of Contents
1. Introduction
2. Lab Hardware Configuration
3. Specific Configuration Issues
4. References

1. Introduction

Operating Systems Programming Courses can benefit greatly from hands-on work with a real-life operating system. This gives students experience in the same environment they would most likely find in a job: working with an already existing system. The skills learned go beyond programming. Students typically need to read and understand a significant portion of the existing code before they can modify it to do the assignments. The procedures required to do things like add system calls demonstrate extra functionality in the normal UNIX® software build tools. Use of a Source Code Management system makes some aspects of the course easier and should be considered a required part of the course. As a side-effect of needing to reboot crash machines for testing purposes some aspects of system administration come into play. Many more aspects of system administration are learned if the students decide to set up their own crash machine, which is possible but not required (they can work on the lab machines).

FreeBSD makes an excellent system to base the course on. The lab can be set up so the students can share the computers and are still able to work independently on their projects. Source code abides by style guides that are documented which makes it easier to follow. Mechanisms are available to help navigate the kernel source code locating functions without needing to know in advance what file they are defined in. As of FreeBSD 5.0 the kernel itself has been set up to allow multiple threads of control so Multiprocessing issues are seen all through the existing code and students need to consider those issues as part of their projects.

The lab configuration works well for this course but has also been used by research groups interested in having groups of researchers working on a large project. It is also possible for students to work on home machines and "port" their work to the lab machines for grading purposes if the general guidelines for the lab configuration are followed (and the students abide by the use of the Source Code Management system as recommended).


2. Lab Hardware Configuration

The basic setup of the lab is to have one central server that all students have unprivileged accounts on. This is where the home directories are located, and provides a stable place that all students can do the majority of their editing. Students do not install their kernels here or reboot this machine as part of their work. They typically log in using ssh(1) and do their work using normal text editors. Because of the way the FreeBSD kernel is built this machine can be used to do the kernel builds as well.

This server will benefit from as much money as you can throw at it, though a surprisingly old and ill-equipped machine has performed this role for us. Our server for the past three years has been a single-CPU i386 machine with a 333MHz processor, 256MB of memory, two Adaptec SCSI controllers, and around 34GB of disk space. Faster processors, more memory, and faster disks will all make the students happier because kernel compiles will run faster. But except for the initial kernel compile relatively little needs to be re-compiled each time they make a change, so this older/slower hardware has been adequate. Be sure to base disk space requirements on how many students you expect to have in the course. They will each need their own copy of the kernel source code and the C-library, plus any extra programs needed (test programs, etc.). More disk space can also help with grading issues, as discussed later.

Depending on your needs for the lab a second server set up to be a "compile server" may be desirable. The tradeoffs involved in deciding whether or not to provide this second server are discussed below.

In addition to the main file server and optional compile server all that is required is crash machines for the students to test their work on. New fast machines are always good but old hand-me-down machines are fine as well. A lab with eight PCs has been used for a class with 30 students registered. Each of the PCs was a 300MHz machine with 128MB of memory and 4GB of disk space. Strictly speaking the machines just need enough disk space to hold the baseline operating system, because the students' work is in their home directory on the main file server.

Last but not least, since grading of the projects tends to be somewhat time consuming a separate crash machine or two may be reserved for the Teaching Assistant(s) who help with grading.

The lab has had its own small room for the crash machines, and students registered for the course have had access to the room. The servers have been located in a machine room the students can not access. Depending on your environment this may be the best way to set up the machines for this course - having computers that students will be debugging kernels on kept in open labs other students use for other courses is a recipe for disaster.

This setup has assumed normal i386-based PCs with a monitor and keyboard. Another possibility is to attach i386 machines (or Sun Sparc-64 machines if you have enough machines of that architecture to build this lab from) using their serial port as the console to a console server machine. The console server would need a serial port expansion card in it to attach the crash machines to, and would need to have the comms/conserver port installed for access to the crash machine consoles. With this setup the crash machines could also be kept in a machine room, and the students could access them completely remotely.


3. Specific Configuration Issues

Several years' worth of experience running this course has led to some suggestions for configuring various aspects of the lab. These suggestions help with things like the sharing of the crash machines among the students, practical aspects of helping the students with problems as they work on their projects, grading issues, etc. If all the suggestions and procedures from the Handbook are followed it becomes practical for the students to set up their own machine at home and use that for the majority of their work, relieving some of the stress on the lab machines. It is inevitable that the crash machines will be "saturated" shortly before assignments are due. Students' natural tendency to procrastinate until the last minute does get better as the course carries on, and in itself is a valuable thing for students at this level (typically 4th year students) to learn.


3.1. Operating System Installation

When installing the operating system it is advisable to stick with a RELEASE version. One way to get that is to do an install from scratch using the latest CDs or an FTP-based install. Once the lab is functional, installing from scratch is not necessary. Just follow the normal FreeBSD upgrade procedures, except use a _RELEASE tag in the supfile used with cvsup(1). For example:

   *default  tag=RELENG_5_1_0_RELEASE
   
This makes synchronizing the source code among all of the machines involved (including student home machines) easier.
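
For readers who have not written a supfile before, the following is a minimal sketch of generating a complete one that is pinned to a single release tag. The helper name write_supfile, the mirror host, and the base and prefix paths are assumptions chosen for illustration; only the idea of a fixed _RELEASE tag comes from the text above.

```shell
#!/bin/sh
# Sketch: write a minimal cvsup supfile pinned to one release tag.
# The default host and the base/prefix paths are illustrative assumptions.
write_supfile() {
    supfile=$1                     # where to write the supfile
    tag=$2                         # the _RELEASE tag chosen for the lab
    host=${3:-cvsup.FreeBSD.org}   # assumed mirror; pick one near you
    cat > "$supfile" <<EOF
*default host=$host
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=$tag
*default delete use-rel-suffix
src-all
EOF
}
```

Calling write_supfile lab-supfile with the tag chosen for the lab and then running "cvsup -g -L 2 lab-supfile" on each machine should leave every machine with the same source tree. Handing the same file to students working at home keeps their trees in step with the lab.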

To give extra flexibility for potential projects consider creating a separate disk partition during installation. We usually call it /scratch. Some of the more useful projects involve modifying filesystems, adding features. The /scratch partition can be set up to not automatically mount at boot time. Students can have their own version of newfs(8) that writes a (slightly or drastically) different filesystem data structure into the partition, and the kernel can have a new filesystem type that it supports as part of the project. This way the machine itself will rely on normal UFS for the system partitions it needs to boot and run but the students can have space to work on their own filesystem support.

The Lab Manual describes how students should set up their home directory so they can do their work. A modification to the file /etc/make.conf is needed for them to be able to build the C-library and normal user-level programs using their own set of #include files instead of the system files in /usr/include. That modification can be made to the file server but it breaks the ability to compile normal programs for people not taking the course. Because of that the separate "compile server" described above may be desirable. As with the main file server the compile server is meant to be a stable machine the students do not have privileged accounts on so they can not reboot it. On this machine as well as the crash machines /etc/make.conf contains:

   CFLAGS+=-I${HOME}/CSE422/include
   
As part of their initial setup students are told to copy /usr/include there. When recompiling the C-library and when compiling test programs this will set it up so it is the #include files in the students' directories that get used. Students are told they need to use the compile server when rebuilding their test programs or the C-library but they can rebuild the kernel on any of the machines. If you can not provide a separate compile server the modified /etc/make.conf can be installed on the main file server.
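
The copy of /usr/include can be made in several ways. The sketch below uses a tar(1) pipeline so that symbolic links and file modes are preserved. The function name copy_includes is hypothetical, and the exact command given in the Lab Manual may differ.

```shell
#!/bin/sh
# Sketch: give a student a private copy of the system header files.
# Defaults match the paths used in this article; both can be overridden.
copy_includes() {
    src=${1:-/usr/include}
    dest=${2:-$HOME/CSE422/include}
    mkdir -p "$dest" || return 1
    # A tar pipeline keeps symlinks and permissions intact.
    (cd "$src" && tar cf - .) | (cd "$dest" && tar xpf -)
}
```

Run with no arguments, it populates ${HOME}/CSE422/include, which is the directory the CFLAGS line in /etc/make.conf points at.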

The central file server should be set up to allow the rest of the lab machines to mount /home using NFS. It is not necessary for the client machines to have root access through NFS for the students to be able to install their kernels. This also provides a small barrier to cheating, though due to the nature of NFS and giving students super-user privileges on the crash machines be aware that students can gain access to the other students' files. Until remote filesystem access using something other than plain UID permissions checking becomes practical this hole will be open.

To help the students locate a crash machine it may be helpful if the servers and crash machines are running rwhod(8). Also be sure to provide a copy of the GENERIC kernel on the crash machines to help with recovering the machine when a student's kernel fails to boot properly. The Lab Manual describes the recovery procedures.


3.2. Computer Account Setup

When setting up the student accounts it is possible for them to work with the home directory permissions set to be closed off for other users. For grading purposes at least the Instructor will require super-user privileges on the main server. If enough disk space is available a copy of the students' entire home directory can be made and the copy can be chown(8)-ed to the grader. This has the advantage of giving the grader time to work through the grading procedures while the students are free to carry on with other aspects of the course. But this requires around twice the disk space. If this is not possible the grader will need super-user privileges to access the students' work and grade it.

Accounts on the crash machines and optional compile server can be set up using NIS or other distributed systems but simply copying the accounts to the machines manually can work for a lab that will typically be this small.

Students can be given super-user privileges on crash machines using the security/sudo port. By only adding them to the sudoers file on the crash machines and not the main servers they will only be able to reboot these crash machines. As with the main servers it is possible for the students to do the majority of their work remotely, using ssh(1) to log in. Physical access to the crash machines is desirable for some debugging scenarios, as well as being able to recover a machine when a kernel is unable to boot all the way to multi-user mode.


3.3. Operational Procedures and Issues

The accompanying Lab Manual provides students with the instructions for setting up their home directories, building various pieces of the system, using the RCS Source Code Management System, testing their work, etc.

Exactly how much of the system source code they will need to copy will depend on the specifics of the projects. In addition to the kernel they will need the C-library if they are writing new system calls and calling them from test programs in the same way any other system call gets used. From past experience it is advisable for the project description to define any new system calls to be added and exactly what the parameters should be (I have been providing the function prototype in the description). This takes away some of the students' freedom but it is a huge help when it comes to grading. Students are told a significant portion of their grade will be based on how well they test their work themselves but on average what they provide for testing will be less than optimal. The grader is best off having their own set of small test programs written and pre-defining the system call interface makes this much easier to deal with.

RCS is recommended as the Source Code Management system, and it is recommended students be required to use it in the fashion described in the Lab Manual. This has several benefits. CVS is a much more flexible system but has a lot of overhead associated with it that is of questionable value when used for projects that are supposed to be individual efforts. Students are typically new to these Source Code Management systems and a lot of other basic operational procedures are being thrown at them (e.g. often students struggle with all the steps necessary to add a new system call) so keeping it simple helps. And if the students follow the directions properly the Instructor can easily find things an individual has changed by going into the student's source directories and doing:

   # find . -name RCS -print

and then checking what files have been checked in to those directories. The rcsdiff(1) command can be used to see what got changed in the individual files. Students working at home can use a similar procedure to find what they have changed and make sure they move everything to the lab machines for grading purposes. Last but not least the FreeBSD Project itself uses CVS and using RCS will minimize the chances of nasty interactions.
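
The two steps (find the RCS directories, then see which files are checked in) can be combined. The following sketch prints the working-file path for every ",v" history file it finds. The function name list_rcs_files is hypothetical, and the sketch assumes the students keep their history files in RCS subdirectories as the find command above implies.

```shell
#!/bin/sh
# Sketch: list working files that have an RCS history below a source tree.
list_rcs_files() {
    top=${1:-.}
    find "$top" -type d -name RCS -print | while IFS= read -r rcsdir; do
        for v in "$rcsdir"/*,v; do
            [ -e "$v" ] || continue      # skip empty RCS directories
            base=${v##*/}                # e.g. kern_exit.c,v
            # Print the path of the corresponding working file.
            printf '%s/%s\n' "${rcsdir%/RCS}" "${base%,v}"
        done
    done
}
```

Each path in the output can then be passed to rcsdiff(1). If the students were told to check in the unmodified file first, "rcsdiff -r1.1 file" shows everything changed since that original revision.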

Students will need to look through the source code for various things. Typically they will need to view the source code of other functions in the kernel that already exist. The "tags" support for FreeBSD makes this fairly easy. On the main fileserver run the command "make tags" in /usr/src/sys/i386. After doing that if sitting in that directory

% vi -t open
   
will place you in the editor, editing the file the open() function is defined in and at the first line of that function. Once the students have their own kernel build directory "make tags" in that directory will have the same result. If you add the following to the Makefile in that directory and if you have the etags(1) command installed the students can do the same tags search mechanism in emacs as described in the Lab Manual:

   TAGS::
       -etags ${COMM} ${SI386}


A mechanism for users "reserving" a crash machine for their use is needed. More elegant ways are certainly possible but something as simple as having them edit /etc/motd has worked just fine. When they need a machine they use ruptime(1) on the main server, and all of the lab machines are set up to run rwhod(8). An uptime of more than half an hour suggests that the machine is not being used by another student. They can further check with rwho(1). Upon logging in to a crash machine the students check /etc/motd. If it says the machine is available they edit it to say "This machine is in use" and provide their username. When they are done with the machine they should edit the file to indicate it is available again.
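
The /etc/motd convention is simple enough to script. The sketch below is one possible wrapper, not something the lab has used: the function names and the exact message wording are assumptions, and the check-then-write sequence is not atomic, so two students acting at the same instant could still collide.

```shell
#!/bin/sh
# Sketch of the motd reservation convention described above.
# $1 is the motd file (normally /etc/motd), $2 is the username.
reserve_machine() {
    motd=$1
    user=$2
    if grep -q 'in use' "$motd" 2>/dev/null; then
        echo "Machine is already reserved:" >&2
        cat "$motd" >&2
        return 1
    fi
    echo "This machine is in use by $user since $(date)" > "$motd"
}

# Mark the machine as free again when the student is done.
release_machine() {
    echo "This machine is available" > "$1"
}
```

Because /etc/motd is normally writable only by root, on a crash machine these functions would have to be run with the student's sudo privileges.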

Once a student has a crash machine reserved they first copy their kernel to /tmp (probably needing to remove someone else's, which they can do with sudo(8)). Then they need to copy it from /tmp into /boot/kernel/kernel using sudo(8). The intermediate copy through /tmp may or may not be necessary depending on the permissions of the directories and whether or not the NFS server trusts the clients with root NFS accesses. As a barrier to cheating it is best to keep things as closed down as possible. It is best to have the students just copy the kernel and any modules they write themselves into place instead of using "make install" in the kernel build directory so that these permission settings can be used (the "make install" would need to be run as root).
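
The copy procedure can be sketched as a small shell function. The function name install_kernel and the SUDO variable are assumptions made for illustration; SUDO exists only so the privileged steps can be swapped out when trying the sketch somewhere other than a crash machine.

```shell
#!/bin/sh
# Sketch: install a student-built kernel by staging it through /tmp.
# $1 is the built kernel in the student's home directory.
install_kernel() {
    built=$1
    stage=${2:-/tmp/kernel}
    dest=${3:-/boot/kernel/kernel}
    # Remove a kernel left behind by a previous user (needs privilege).
    ${SUDO:-sudo} rm -f "$stage" || return 1
    # Copy as the student: root on an NFS client may not be able to
    # read a closed-off home directory.
    cp "$built" "$stage" || return 1
    # Copy from local disk into place as root.
    ${SUDO:-sudo} cp "$stage" "$dest"
}
```

A student would call it with the path to the kernel in their own build directory and then reboot the crash machine. Any modules written for the project would be copied into place the same way.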


3.4. Students Working at Home

If students wish to set up something so they can work at home they are strongly encouraged to use a dedicated computer they do not care about for the crash machine at home. For many of them this is their first exposure to most system administration tasks. Particularly when doing filesystem projects mistakes can be disastrous. It is best if mistakes made by the students while working on this course affect only things for this course and not the machine that has the rest of their work on it.

The Lab Manual has minimal guidelines for setting up a home machine, assuming it is one dedicated to just this coursework. That section of the Manual is still a work in progress. Because of FreeBSD's distribution methods it is easy to make a copy of the CDs you want the students to install (the version you chose for the lab computers) and provide them if you like. We have typically made two or three copies and loaned them out to students interested in working at home. The students can also use the FTP-based installation mechanisms. If cvsup(1) is needed the students can use the same version tag as used on the lab machines so they have an identical copy of the source code on their machine. They can follow the same procedures on a home crash machine as gets used in the lab except that one machine is both the edit/compile machine and crash/test machine. If they use the RCS procedures as described in the Lab Manual finding what they changed as part of the project is very easy. Students have been required to move their work, usually by simply copying any modified and new files, to the lab fileserver. They are strongly encouraged to do one final build/test run there. The grader will use what is on the lab fileserver for grading.


3.5. Grading Students' Work

Grading can be done in one of two ways, depending on the circumstances. If there is not enough disk space available to create a copy of the students' home directories no other course related work can be done between the project due date and when grading can be completed. But with sufficient disk space copies of the students' home directories can be made (we have used /home/GRADING in the past) using an adaptation of the tar(1) command the students use to set up their home directories. These copies can then be chown(8)-ed to whoever will be doing the grading which makes grading a bit less risky because the grader does not need to be working as root on the main fileserver. Grading does tend to be quite time consuming, usually involving the testing of each student's kernel on a crash machine. As mentioned above having a specification for any new system call interfaces be part of the project description helps the grader develop a small set of their own test programs since students will typically come up with "less than optimal" testing procedures on their own as part of the project.
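
A sketch of the copy step follows. It is not the exact adaptation of the students' tar(1) command mentioned above, which is not reproduced in this article; the function name snapshot_home and the example paths in the comments are assumptions.

```shell
#!/bin/sh
# Sketch: copy one student's home directory into a grading area.
# The home path must not end in a slash.
snapshot_home() {
    home=$1        # e.g. /home/student1 (hypothetical account)
    gradedir=$2    # e.g. /home/GRADING
    grader=$3      # optional: account that should own the copy
    name=${home##*/}
    mkdir -p "$gradedir/$name" || return 1
    (cd "$home" && tar cf - .) | (cd "$gradedir/$name" && tar xpf -) || return 1
    if [ -n "$grader" ]; then
        # Changing ownership requires super-user privileges.
        chown -R "$grader" "$gradedir/$name"
    fi
}
```

Run once per student as root at the due date, this leaves the grader with a private snapshot to work from while the students continue using their own directories.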

Exactly what to grade will depend on your tastes, time available for grading, etc. In addition to the obvious "Does it work?" we have based part of the grade on:

  • Their description of how they tested their work and the test programs themselves.

  • Whether they used RCS properly.

  • How well they followed style(9) guidelines.

  • Do they allow normal users to do things only root should be allowed to do?


3.6. Resources Available to the Students

As mentioned there is an accompanying Lab Manual that can be given to the students. Here a copy of the Syllabus, the Lab Manual, and a copy of all the slides used in the Lecture are available for purchase at a store on campus. The Lab Manual contains some material that 4th year students really should know very well already, they could easily find elsewhere, etc. But in practice it has been a valuable "refresher" and having this material readily available instead of needing to hunt it down has proven worthwhile.

For the textbook we have used "The Design and Implementation of the 4.4BSD Operating System" by McKusick et al. FreeBSD has evolved significantly beyond what is described in that book. The extra materials, diversions from what is described in the book, etc. have been handled by the lecture notes. However, the author is currently working on a new version of that book that focuses on FreeBSD, which will help immensely.

The FreeBSD Web Site also has materials that help, though some of them are out of date and some are incomplete. The Architecture Manual can be useful, as can various pieces of the Handbook.

Last but definitely not least, be sure to point out Section 9 of the online manuals. The number of kernel functions described and the accuracy of those descriptions has improved dramatically through time.


4. References

[1] FreeBSD Handbook

[2] FreeBSD Architecture Handbook