CSE422/512 Lab Manual

Spring 2003

Revision 1.1

Ken Smith

U. Buffalo CSE Department

201 Bell Hall

Box 602000

Buffalo, NY, 14260

Introduction

This manual covers background information needed for the lab work of CSE422. The homework assignments and projects will involve modifications to the UNIX Operating System. For the Spring 2003 semester the base system will be FreeBSD 5.0. A lot of background information will be needed to get the homework assignments and projects done. Some of this background information should have been picked up from previous courses but is provided here as a brief review. Even the review material may cover important topics not mentioned in your previous courses, so it should be read even if it seems to be something you know. The equipment available in the lab and how it is configured will be explained. You will need to follow certain procedures while working in the lab so the machines can be shared. Some computers have dedicated uses, and some will be available as “crash machines”. The manual will also explain procedures needed to accomplish tasks on FreeBSD such as building kernels, adding system calls, adding device drivers, etc. Hints on setting up your own machine so you can work outside the lab will also be given.

Setting up A Home Machine

If you can manage to have a separate machine dedicated to FreeBSD for use in this course, that will make working on the Homeworks and Projects much easier for you than competing with the rest of the class for the crash machines. The next best thing would be to have a machine that dual-boots Windows and FreeBSD. This section is not meant to be a comprehensive guide to getting everything you need working on a home machine; it is just a guide to some of the issues you will face. You will still have a lot of learning to do.

If you can dedicate a machine to FreeBSD you can skip over this section and go to the section that describes how to install FreeBSD itself. For machines that will be set up to run both Windows and FreeBSD a few extra steps are usually needed before you can install FreeBSD. The description here is based on a machine shared by Windows-98 and FreeBSD. Some of the issues mentioned apply to other versions of Windows, but those versions have various extra issues of their own that you may come across.

The first step is making space on the hard drive for FreeBSD. It needs its own section of the disk that it can control completely, totally separate from the space Windows controls. This is done by partitioning the disk. The problem is that Windows-98 will probably have been configured to occupy the entire disk as one large partition. FreeBSD has a utility program called FIPS that will allow you to try resizing the disk partition Windows is using. Note that this means FIPS is adjusting the Windows filesystem data structure. FIPS is a freely available program, and sometimes you get what you pay for when it comes to software. FIPS has not destroyed a Windows filesystem on me yet but there is no guarantee that it will work reliably for you. Be sure to back up your Windows files somehow, and be prepared to reload your machine from scratch if FIPS destroys your Windows system. If the possibility of destroying your Windows setup is too big a risk for you then do not proceed (meaning you can not load FreeBSD on that machine).

Newer versions of Windows (Windows-2000, Windows-XP, etc) typically use NTFS filesystems instead of FAT filesystems. FIPS does not support adjusting the size of NTFS filesystems. If you need to do this sort of thing on these newer systems you are best off buying a program called “PartitionMagic” which does support adjusting the size of NTFS filesystems.

For this version of the Manual there will not be much help as far as what is required to install FreeBSD. The people who distribute FreeBSD have documentation available to help you some – see http://www.freebsd.org to get started. We have a mirror of their current distribution on the main CSE FTP server as well as the DCSL FTP server. On the CSE systems it is available straight through the filesystem in /ftp/mirror/FreeBSD. I can also loan you a CD.

If you do set up a machine at home be sure to make a separate partition on your hard disk named /scratch. This should be approximately 100 Mbytes in size, and should be empty when you are finished with the installation. The projects in which we work on the filesystem code will use this partition as the place where you create your modified filesystem.

Lab Configuration

There are several machines set up in the lab for you to use for Homeworks and Projects. They all have dedicated purposes. You need super-user privileges on some of them, but to protect everyone using the lab you will not have super-user privileges on the central servers. You can do things as the UNIX super-user root on the “crash machines” by preceding the command you need to run as root with the command sudo(8). This sudo(8) program will ask you for your password and if you type it correctly it will run the command you gave it as root. For example you can copy a file named “kernel” to the place a kernel will be loaded from as the machine boots with:

% sudo cp kernel /boot/kernel/kernel

The sudo(8) command will work for you on the “crash machines” but not on the servers.

The main fileserver for the lab is named “mack.dcsl.buffalo.edu”. This is where you want to log in to read through source code for researching things, where you want to do most of your editing, and where you compile your kernels. It must remain stable, so it cannot be used to test kernels. In addition to mack there is a “compile server” named “sneetch.dcsl.buffalo.edu”. There are some things, described below, that you must compile on sneetch instead of mack. The rest of the lab machines are “crash machines”, the names of which will be posted to the newsgroup. These are the machines you have “sudo privileges” on and are where you run your test kernels.

Modular Programming

When working with very large programs or systems in C (and many other languages) it is very important to learn how to break up the program into small pieces, making it modular. There are a wide variety of reasons for doing this but one of the most important is cutting down on how long it takes to rebuild the program when small changes get made. To understand how the UNIX Kernel (a large program) gets built you need to understand the C compiler, the linker, and the utility program make(1). There are a few other utility programs that will make your life easier while working with the kernel as well.

C Compiler

FreeBSD uses GNU’s C compiler. If you run the command:

% cc prog.c

(assuming you made no mistakes in the C code) you are left with an executable file named a.out that you can run. The compiler did not generate that file, the linker did. The compiler generates an object file which is machine code but not quite ready to run. Because you gave the compiler no command line flags to change its default behavior it went ahead and called the linker ld(1) to generate the executable file. If you use the command line flag -c with the compiler, for example:

% cc -c prog.c

then the compiler will generate the object file and will not automatically call the linker. Now instead of an executable file named a.out the result of the above command will be an object file named prog.o. When breaking up a program into multiple source code files the individual source code files can be compiled with the -c command line flag to produce the object files. This way if you only change the code in one file you only need to recompile that one file. The code in the other files will not have changed so their object files do not need to be re-generated.

Preprocessing is also handled outside of the compiler itself. The compiler calls a separate program named cpp(1), which handles all the C preprocessor directives such as #include, #define, #ifdef, etc. The compiler first runs cpp(1) to take care of the preprocessor directives. It then generates assembly code from the resulting C code. The assembler is then run to convert the assembly code to object code. Compile time options can be used to see the results of the various stages; for example the command line flag -S can be used to see the assembly code. There are a lot of flags for the compiler. The ones you need to be aware of are:

-c : compile source file(s) to object files but do not call the linker to produce an executable file
-I : add a directory to be searched for include files that are referenced with #include <filename> (as opposed to #include "filename")
-S : compile to assembly code but do not call the assembler; leave behind assembly code in a file with the suffix “.s” in place of “.c”
-E : runs the preprocessor but does not compile, the output will go to the standard output (by default your screen)
-v : verbose mode, the compiler will show you what it is doing while it runs
-o : place the executable in the file named after the -o flag (instead of a.out)
-O : uses object code optimization to improve the executable
-L : add a directory to be searched for library files
-l : (lower case letter ell) add a library to be searched for unresolved symbols during linking; specifying -l<name> usually means look for the library file /usr/lib/lib<name>.a, but you can use the -L flag to add a directory in which to search for library files
-D : add macro definitions during preprocessing

By default the C compiler will automatically call the linker with -lc (the C library) as well as a few other library and object files. You should run the compiler on a simple program (e.g. “Hello World”) using the -v flag to see what else the C compiler is doing for you. The C library contains most of the programming utility functions in Section 3 of the UNIX Manuals. These are things like time(3), fopen(3), fclose(3), printf(3), etc. The C library also contains special entry points for the System Calls in Section 2 of the UNIX Manuals. Examples of system calls are gettimeofday(2), open(2), close(2), write(2), etc. Libraries are built by the library archiver ar(1) and ranlib(1). Basically a library file is created from a bunch of object files using ar(1):

% ar r libfoo.a file1.o file2.o file3.o …

Then run ranlib(1) on the resulting library file to create a table of contents that ld(1) can use to search for symbols. The -t command line flag to ar(1) will show what files make up the library file. The utility program nm(1) lists the symbols in an archive, which are typically function entry points, global variables, and references not resolved in that module.

Care should be taken when dividing up programs into modular chunks. Header files (“.h” files) should NOT generate any storage and, most definitely, should never contain code. These files should contain the definitions of symbolic constants (e.g. a symbolic constant used as the size of an array) and definitions of data types (e.g. a C struct definition for a node element used in a linked list). The header files should not contain actual variable definitions (these “generate storage”); an extern declaration, which only names a variable defined elsewhere, is fine. Header files were designed to be information included in multiple C source code files using the #include preprocessor directive. If you define variables in the header files then these variables will be defined in multiple object files, and the linker will complain about multiple definitions of the same name. Variables need to be defined in the C source code files.

Other than that you should simply divide up your program into pieces that seem to be logically grouped and have these groupings in separate files. There are no definitive guidelines on how large to make files or anything like that. When it gets to the point it seems like a file is taking “too long” to compile think about breaking it up.

make(1)

Modular programming allows you to set things up so that at any given time you should only need to recompile a few of your C source code files to rebuild and test your program. With only a few files, keeping track of which C source code files depend on which header files (“.h” files) and which of the files you have modified at any given time is not too hard. Doing that for a larger project with many C source code files, many header files, and many programmers becomes unmanageable fast. Solving this problem is what the command make(1) was designed to do.

The make(1) command uses a configuration file named either Makefile or makefile to define what files depend on what other files. For example if the file util.c includes the header file defs.h then if defs.h changes util.c should be recompiled. Because of that it is said that util.c depends on defs.h. In the same way util.o (the object file built by the compiler when it compiles util.c) depends on util.c: whenever util.c changes util.o should be rebuilt by compiling util.c. By defining the dependencies of all the files that are needed to build your project in Makefile you can reduce the work needed to rebuild the project at any time to simply typing the command make.

The Makefile gets built by specifying targets, what those targets depend on, and how to build that particular target. The general form of a line in a Makefile is:

target1 target2 … : depend1 depend2 depend3 …
	operation(s)

You can specify more than one target but that is rare, usually you only specify one target. The operation(s) are the shell commands necessary to build the listed targets. Be sure to indent the shell command lines with tabs – some versions of make(1) are very picky about that and do not work properly if the lines begin with spaces.

As an example say the executable file foo gets built from two C source code files named foo.c and bar.c. Also suppose that bar.c includes the header file config.h. One possible version of the Makefile describing this would be:

foo: foo.o bar.o
	cc -O foo.o bar.o -o foo

foo.o: foo.c
	cc -c foo.c

bar.o: bar.c config.h
	cc -c bar.c

Since make(1) was designed by programmers to make programmers’ lives easier it was set up to know about various “natural” dependencies on its own. For example make(1) knows that foo.o probably gets built from foo.c and it knows to use the C compiler with the “-c” command line flag to build the object file. Taking make(1)’s internal knowledge into account a shorter version of the Makefile is:

foo: foo.o bar.o
	cc -O foo.o bar.o -o foo

bar.o: config.h

The make(1) command when it runs will use the filesystem time stamps on the files to see which files have changed, and will rebuild a target only if it does not exist yet or is older than one or more of its dependencies. If all the dependencies are older than the target that target will not be rebuilt. The above example is quite simple and probably does not seem worthwhile. As the number of source code files, header files, and in some cases programmers increases, make(1) becomes more and more valuable. Later when you are ready to build kernels take a look at the Makefile in your kernel build directory.

There are various features of make(1) that help keep things organized in larger projects. Macros are available to help with updating the Makefile. You could use one to list all the object files for example:

OBJ=foo.o bar.o

foo: $(OBJ)
	cc -O $(OBJ) -o foo

Macros often get used for listing the programming libraries needed during linking, special C compiler flags needed, etc. The make(1) command itself has several macros that it defines and uses on its own, but you can override their default value in the Makefile. One of these macros is named CC and it is the command make(1) will use to compile with. You can use the GNU compiler gcc instead of the normal compiler cc by defining CC in your Makefile:

CC=gcc

You can also override macro values on the command line:

% make "CC=gcc"

Changing the macro CFLAGS alters what flags make(1) will use when it runs the compiler. See the manual page for make(1) for a list of other macros make(1) defines for itself and what their purpose is. Note that because of the way make(1) works you should only define a macro in a Makefile once. The make(1) command needs to read the entire Makefile before it does anything because it needs to build the entire “dependency tree”. If you define a macro several times in a Makefile make(1) will wind up using the very last value you specify for the macro everywhere you use the macro.

With no command line arguments make(1) will build the first target defined in the Makefile. You can give a target on the command line to build that target instead. Often for user-level programs the programmer will provide a target named clean that has no dependencies. The shell commands listed for it remove all object files and other temporary files created when the project gets built. Also there is a command line flag to make(1), “-n”, which tells make(1) to show what it would do but it does not actually run the shell commands.

diff(1) and patch(1)

The diff(1) command will show you the differences between two text files. There are five different output formats available; we’ll only talk about two here (see the manual page for diff(1) for the others). The default is for diff(1) to show “ed-like” (ed is a very old line-oriented editor) commands which, if you used those commands in ed, would convert the first file into the second file. Its output will only include the lines that need to be changed and what editing operation needs to be done (change, delete, or add). If diff(1) is used with the command line flag “-c” then “context diffs” are produced. These include not just the lines to be changed and the editing commands but also three lines of unchanged text around the one(s) that need to be changed. For the purposes of automated editing (having some other program do the editing instead of a human) this provides a little extra “security” because that program can complain if the extra unchanged lines mentioned in the context diff do not match the lines in the file being edited.

Often the programmers of a large package will maintain it by producing an initial release of the software and then distributing bug fixes and/or enhancements as “patches”. A patch is a single file with a whole bunch of context diffs in it. A program named patch(1) can take this single file, interpret the context diffs, and then do all the editing needed on the source code files to update the source code.

RCS

The last of the programming tools we’ll talk about is a “Source Code Management System” called the Revision Control System (RCS). It should be noted there are better, more advanced systems available. If you are starting a new project of your own, check out the “Concurrent Versions System” (CVS). It evolved from RCS and is in fairly widespread use (for example the FreeBSD project uses it). RCS is a simpler package which, because of its simplicity, makes assisting students with problems a little bit easier.

Generally speaking some of the desirable features of any Source Code Management System are:

Some means to keep two programmers from modifying the same file at the same time
Version numbering schemes of some sort
Log messages to track why changes are made and by whom
Allow for backing out of changes

Those are the minimal features; various packages have other features as well. Typically the more features the more “overhead” there will be, both in the form of resources (e.g. disk space) and usage (extra things the programmers need to do, how much they need to know to use the system, etc).

There are two primary commands used in RCS:

ci – short for “check in”, used to start off a new file in RCS or release a lock on a file you had locked to work on
co – short for “check out”, used to lock a file you want to work on

RCS keeps version information for files in a specially formatted file (one file per source code file) in a sub-directory named “RCS”. You need to make that directory in each source code directory in which you plan to use RCS (see notes below on specifics for CSE422).

To start off using RCS for a particular file use the command:

% ci -l foo.c

if the file were named “foo.c”. This will create the file named “RCS/foo.c,v” (if you did not create the directory “RCS” the file created would be “foo.c,v” but that quickly clutters up your source code directory). The “-l” option to ci(1) means to do the initial check-in but then immediately check-out a locked version of the file. You are made the “locker” (nobody else may check-out a locked copy) and the file will be set to have write permission turned on (RCS will set it up so that if you have not locked the file you will not have write permission turned on). You are now ready to edit the file. When you are done editing the file you should use the command:

% ci -u foo.c

to check-in the file, and then immediately check-out an unlocked copy of the file. These check-ins should be done when you think the current editing session is over, not every time you leave the editor. Finish debugging the change you are working on before you bother doing the check-in. If you need to edit a file that you have an unlocked copy of you can use the command:

% co -l foo.c

to check-out a locked copy.

Information about the other programs that come with RCS is available from their manual pages. To get started run the command:

% man rcsintro

and then check the manual pages listed in the “See Also” section. Commands you are likely to find very useful are rcsdiff(1) and rcs(1).

Conditional Compilation

You will see conditional compilation used all over the kernel. This involves using the #ifdef pre-processor directives to check to see if symbols are defined. These symbols may be used as part of the code or data structures, but more often than not they are simply used to decide whether various pieces of the source code get compiled into the program. A simple example is turning extra debugging messages in a program on and off, as in the following program.

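#include <stdio.h>
#include <string.h>
#include <strings.h>

#include "config.h"

char buf[LEN];

int
main(int argc, char *argv[])
{
    size_t len;

    printf("What is the magic word? ");
    if (fgets(buf, sizeof(buf), stdin) == NULL)
        return (1);
    /* Strip the trailing newline, if there is one. */
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
#ifdef DEBUG
    fprintf(stderr, "User input %s\n", buf);
#endif
    /* strcasecmp() returns 0 when the strings match. */
    if (strcasecmp(buf, "please"))
        printf("Nope.\n");
    else
        printf("Yup.\n");
    return (0);
}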

If you have the line:

#define DEBUG

in the file “config.h” then the extra fprintf() statement will be compiled into your program for you and you get extra debugging information. If the symbol DEBUG is not defined anywhere then that extra fprintf() statement will not be compiled into your program. A particularly easy way to take care of this for a single file being compiled is to compile it with:

% cc -DDEBUG prog.c -o prog

The “-D” command line flag to the compiler tells the pre-processor to define that symbol as it processes the file(s) it is compiling. If you are using make(1) you could do it easily there by adding this to the CFLAGS macro:

CFLAGS=-DDEBUG

Now any time make(1) uses the compiler it will have “-DDEBUG” as the flags it calls the compiler with.

Adding in debugging information that you would like to keep around (for future debugging sessions) but would not want compiled into your programs when you produce releases is only one thing conditional compilation is useful for. In addition to that this mechanism is used to determine whether or not entire sub-systems or special features get compiled into the kernel. The “Kernel Config File” (more later) can be used to specify what devices should be supported for that kernel. Conditional compilation is used to either include or exclude the device driver’s code in the kernel that gets built. Optional features like quotas on filesystems are treated the same way.

Working With The Source Code

Copying Source Code

You will need to get a copy of the existing FreeBSD kernel source code, the C library, and the “include files” to get started. For homeworks and projects you will need to copy the source code of other pieces of FreeBSD as well but initially just these things should be copied. DO NOT copy the entire source code tree – we do not have enough disk space for everyone to have their own copy of the entire thing. Use the following commands while logged in to the lab central fileserver (mack.dcsl.buffalo.edu). The directories are case sensitive – make sure you capitalize exactly what I have in the commands below or later you will have problems with some things not working right.

First create a directory named CSE422 (even Graduate Students) in your home directory:

% mkdir CSE422

Now get a copy of the C Library and kernel source code with the commands:

% cd /usr/src

% tar -cf - ./lib/libc ./sys | (cd ~/CSE422; tar -xpBf -)

This will take a while. When that finishes get a copy of the files in /usr/include with the command:

% cd /usr

% tar -cf - ./include | (cd ~/CSE422; tar -xpBf -)

In general when you need to copy other pieces of the source code tree you should place them in the same tree structure as what is in /usr/src. For example if you find you need a copy of the mount(8) command you can find where the command itself and its source code is with the whereis(1) command:

% whereis mount

mount: /sbin/mount /usr/share/man/man8/mount.8.gz /usr/src/sbin/mount

This shows the source code is in /usr/src/sbin/mount. You should copy it by doing the commands:

% cd /usr/src

% tar -cf - ./sbin/mount | (cd ~/CSE422; tar -xpBf -)

to preserve the directory structure.

The directories in /usr/src are:

bin – programs in /bin
contrib – things contributed from other projects
crypto – support for cryptography (might be export issues)
etc – things from /etc
games – guess…
gnu – things specifically from GNU project
include – header files for /usr/include
kerberos5, kerberosIV – support for a distributed authentication mechanism with a high level of security
lib – libraries in /lib
libexec – programs in /usr/libexec
release – tools to help with building the release
sbin – programs in /sbin (administrative)
secure – security enhancing stuff, mostly SSH and SSL
share – things in /usr/share (e.g. termcap)
sys – kernel source code
tools – tools to help with installation
usr.bin – programs in /usr/bin
usr.sbin – programs in /usr/sbin (administrative)

Finding Stuff

There is a “tags file” that can be built which will make finding things in the kernel source code easier. It gets built in “/usr/src/sys/i386” so you need to be in that directory when you want to go looking for things. You can find the source code for a function in the kernel source tree by using “vi -t” and the function name when in the above directory. If you wanted to see the code for the open(2) system call you could do:

% vi -t open

to find it. Shortly into the code for open(2) a function call to falloc() is made, so to see its source code you can do:

% vi -t falloc

and so on. The equivalent mechanism for emacs is also available. For emacs the file that got built is named TAGS and is in the same directory. You need to go to /usr/src/sys/i386 and start up emacs. Then, once per editing session, do:

M-x visit-tags-table

Accept its defaults. Now to search for open(2) do:

M-x tags-search

Enter as the regular expression “^open(”. Then to find falloc() run the tags-search again and use the regular expression “^falloc(”. The “^” character is a regular expression meta-character that anchors the search at the beginning of the line.

Building the kernel

To cut down on the length of the pathnames specified here assume you are starting off at the top-level of your kernel source code directory (~/CSE422/sys). You should use the central fileserver mack.dcsl.buffalo.edu to build your kernels on. This is also the machine to log in on if you want to spend time reading through the existing source code to learn how things work.

In your kernel source directory is a directory “i386/conf” in which you will find a kernel config file named “CRASH”. This is a config file I have tailored for the crash machines in the lab. It removes device drivers and options we will not be using at all. Most of this procedure you only need to do if you change the “config file”. In general if you have not changed that file you should be able to just type “make” in the build directory created for you to rebuild your kernel.

To build your kernel, follow these steps:

Copy the config file, CRASH, to a file whose name is your username in all upper-case. In my case that would be “KENSMITH”. As with the steps above please follow this step exactly, because it is easier for me to help you if I know exactly what directory winds up being your build directory. In the examples below replace “KENSMITH” with the name of your config file.
Run the command config(8) with the name of your config file, e.g. for me that would be “config KENSMITH”.
Change directories to the “build directory” that config(8) creates for you. In my case that is “../compile/KENSMITH”.
Run the command “make depend”. This will build up the list of dependencies for your Makefile.
Finally run “make”. When this finishes you will have a file named “kernel”.

Testing a kernel

You first need to “reserve” a crash machine. The current list of crash machines is on the DCSL Web Site. From mack.dcsl.buffalo.edu you can run the command “ruptime” and see if any of the crash machines have nobody logged in on them. You can then pick one and log in. The “ruptime” information is not reliable (it is based on “rwho” broadcasts which happen at several minute intervals) so someone may actually be logged in. There will also be machines in the “ruptime” listing that are not crash machines (some of which you can log in on, e.g. the “compile server” sneetch.dcsl.buffalo.edu, while some you can’t log in on). Watch the “message of the day” which comes up as you are logging in. If the machine is not in use that message should say the machine is available. If someone is using it that message should say what user is using the machine. If the machine is available you need to edit the message of the day file to say you are using the machine. To do that use the command:

% sudo vi /etc/motd

(you can use any editor that is available; you do not need to use vi(1)). The sudo(8) command will ask you for your password the first time you use it. If you use sudo(8) again within 5 minutes it will not ask you for the password again. However if you do not use sudo(8) for more than 5 minutes it will again ask you for your password. This is just checking to make sure that it’s really you running sudo(8) and not the janitor who is cleaning your office and found that you left yourself logged in or something along those lines…

Change the file to say that you are using the machine and exit the editor saving the file. Now that machine is yours to use. When you are done testing PLEASE remember to log back in to the crash machine one last time and edit /etc/motd to say the machine is available again.

To test your kernel you need to copy the file “kernel” you built above to /boot/kernel/kernel and reboot the machine. To do that use the commands:

% cd ~/CSE422/sys/i386/compile/KENSMITH

% sudo cp -p kernel /boot/kernel/kernel

% sudo reboot

(change KENSMITH to your username of course). This will reboot the machine. It will take the machine several minutes to reboot, at which point you can log in through the network again to run test programs. If you would like to see some of the messages that the kernel generates as it boots up you can use the command dmesg(1) to see these logged messages remotely. They are also printed to the console if you want to be in Bell 314 to watch the machine boot. Part of the output from dmesg(1) will be the name of the directory the kernel was built in if you want some sort of “confirmation” that it really is your kernel that is running.

Once you have gone through this procedure once you should just need to type “make” in your build directory to rebuild the kernel unless you have made some drastic changes. One such drastic change would be changing the config file to alter what options or devices are to be part of your kernel. Adding new source code files that need to be part of the kernel is another thing that means you should do the entire procedure from scratch. If you make significant changes to what files are #include-ed in source files you should do the “make depend” step again but you don’t need to do the “config” step for this case.

Adding Source Code Files to the kernel

To add file(s) that will be compiled into the kernel you need to add their names to a file that config(8) uses while it is building the kernel’s Makefile. If the file(s) you are adding are architecture independent you should add them to ~/CSE422/sys/conf/files. There are also versions of that file named “files.i386”, “files.alpha” and “files.pc98”, which is where you should add them if what you are adding is architecture dependent (e.g. a device driver). The format of the “files” file is fairly straightforward for what you will need to add.

Please keep the source code for any files you add to the kernel in a separate directory so it is easy for me to find when you are asking for help. The directory should be ~/CSE422/sys/local. If you are adding a new system call named foo(2) and the code to support it is in the file named “foo.c” please put that file in this “local” directory. This would become a required file – since it is supporting a new system call it must be compiled into every kernel no matter what options are set in the kernel config file. The line you need to add to the “files” file in this case is:

local/foo.c standard

If you are adding something that could be optional, it would be included in a kernel by having this line in the kernel’s config file:

options FOO

and the line in the “files” file would be:

local/foo.c optional foo

Now that file will be included in the kernel Makefile only if the option “FOO” is in the config file. Note that you just modified a file that config(8) uses to figure out what needs to go into the Makefile it places in your kernel build directory. You will need to go through the entire kernel build procedure from scratch, starting off with running config(8).

Adding a System Call

First you need to write the code that should implement the system call, or at least some of it to get you started. To make it available as a system call inside the kernel you then need to go to the directory ~/CSE422/sys/kern and edit the file “syscalls.master”. Read the comment at the top of the file for an indication of what each column means and how to control whether the system call is considered MP Safe or not. For the simple case of adding a “standard” system call that is not Multiprocessor Safe the line would look something like:

424 STD BSD { int foo (void); }

This makes it system call number 424 (the next available one as of FreeBSD 5.0-RELEASE), marks it as a “Standard” system call and as a BSD based system call (as opposed to one specified by the POSIX standard), and provides the function prototype.

After editing that file run the shell script “makesyscalls.sh” like this:

% sh makesyscalls.sh syscalls.master

This shell script will build the files “init_sysent.c” and “syscalls.c” in that kern directory as well as the files ../sys/{syscall.h,sysproto.h,syscall.mk}. You need to copy the include files that got built (syscall.h and sysproto.h) to ~/CSE422/include/sys so that when you rebuild the C Library (next section) the modified header files are what gets used during the compile. As mentioned above, all source code files you add to the kernel source tree should go in ~/CSE422/sys/local, so that is where the source code for your new system call should be placed. You will also need to follow the directions above for adding source code files to the kernel, as well as the instructions for rebuilding the kernel, to include the new system call.

Note that this adds the system call to the kernel, but at the moment you have no way to call the new system call from a test program. The C Library has no entry point for the new system call. You need to rebuild the C Library, and link your programs against that new C Library for them to be able to use the new system call.

Building the C Library

As mentioned above you need to make sure you have copied any modified header files into your ~/CSE422/include/sys directory. You will also need to do this build on either the “compile server” (sneetch.dcsl.buffalo.edu) or on one of the crash machines. This is because on those machines, but not mack.dcsl.buffalo.edu, I have created the file /etc/make.conf which contains the line:

CFLAGS+= -I${HOME}/CSE422/include

Because of that line the C pre-processor will look in your directory for the header files as well as the “usual place”. To help make sure the pre-processor is using yours instead of the system’s normal ones I have removed /usr/include on the crash machines and the compile server. Note that it should be OK to compile your kernels on the compile server as well if you like, though all the files involved need to be copied through the network which will typically slow it down a little bit. User-level code (the C-library, and any test programs you write) must be compiled on the compile server or the crash machines.

To add a system call to the C Library you need to run make(1) in ~/CSE422/lib/libc. The system call entry points in the C Library source code are generated from the contents of the directory ~/CSE422/lib/libc/sys. The file “Makefile.inc” there references the file named “syscall.mk” in your ~/CSE422/sys/sys directory, which is how your new system call winds up being accounted for without needing to edit anything in the source directories beneath libc. (This is another reason you need to copy these directory trees exactly; the reference will not work if the trees are not in the same position relative to each other.) I do NOT want you replacing the shared libraries or anything like that, so to save yourself some time you can just build the static version of the library:

% make libc.a

Now when compiling your test programs you need to specify the command line flags “-static -L ~/CSE422/lib/libc” to tell the compiler to generate a statically linked executable and to add ~/CSE422/lib/libc to the directories it searches for libraries. Note there MUST be a space before the “~” character; the shell only expands “~” to your home directory when it appears at the start of a word. If you had built a C Library previously and are adding a new system call again, the make procedure does not seem to take all dependencies into account, so it would be best to do “make clean” and then “make” (which will recompile the entire thing) to make sure your alterations are included. Also note that the syntax is different if you want to reference your lib directory in a Makefile. The lines of Makefiles are processed by sh(1) so there you need to reference your home directory as “${HOME}”.

You do not need to rebuild the C library every time you modify your new system calls. In general you only need to rebuild the C library when you first add a new system call or if you change something about the parameters for the system call (number of arguments, or the type of the arguments). The code that actually implements the system calls is inside the kernel itself so rebuilding the kernel and rebooting with the new kernel is all you need to do if you only change that code.

Coding Standards

There are just a few standards you will need to follow with your coding. First you MUST use RCS. Any time you need to modify a file that you have not modified before you should follow the steps:

cd to the directory the file you need to modify is in
check to see if there is a directory there named “RCS”, if not make one
use the ci(1) program to check in the file, and then check out a locked copy
now edit the file
when done modifying the file use ci(1) again to check-in your modified version

PLEASE do NOT check-in files that you do not modify. While trying to help you when you have problems the Instructor will find it very helpful to be able to quickly identify what pieces of the kernel sources you have modified and if you stick with the above procedure this is possible.

The other coding standard you must follow is with how you define functions. You should use ANSI coding (function argument types given between the parentheses after the function name). Furthermore the type of the return value for the function should appear on its own line, directly above the line with the function’s name. Both lines should begin at the very left edge of the file, with no space between the function’s name and the opening parenthesis. For example if you were defining a function named “foo()” with an integer return value and one integer argument the function’s definition would look like:

int
foo(int bar)
{
	<code>
}

It is this coding standard that helps with the building of the tags file. It also makes finding what file a function is defined in amongst a directory full of C code files easier. If you follow normal C coding indentation practices the only place the function’s name will appear at the beginning of the line with no indentation will be where it gets defined. The command:

% grep '^foo(' *.c

will find where the function is defined (as opposed to all the places it is used).

In addition to the above standards which must be followed strictly the FreeBSD group has coding standards of their own. It is these coding standards that help to make the FreeBSD source code relatively easy to read, understand, and navigate your way through. The style guide itself can be seen by reading the manual page entry for “style” (do “man style”). You should read through that to get a feel for the standards. Your projects will not be expected to abide by all of the rules set forth in the style manual page but around five points worth of every project will be reserved for completeness of commenting, neatness of your code, etc. You should try to abide by the style guide as much as possible.

NOTE: The kernel source code has been under development for a very long time. Some of it was written before the ANSI standards for C were developed. Just because you see examples of code that does not abide by the style guide in the kernel does not mean that is acceptable for new code being added to the kernel now. It simply means nobody has had the time to go through some of the older parts of the kernel to fix up that older source code. As an example MANY of the functions use the old-style C declarations for the function’s arguments but that will not be acceptable in your newly written code.

When Things Go Wrong…

You can wind up with a kernel that won’t boot properly. To help with this I have left a “Generic” kernel on all the crash machines, named “/boot/kernel/kernel_gen”. To boot that kernel you need to be at the machine’s console. As it starts a boot sequence it will do the normal BIOS checks, and then it will say:

F1 FreeBSD

Default: F1

You can just let that time out or you can press the return key to have it boot up FreeBSD. It will print a series of messages that ends with:

Hit [Enter] to boot immediately, or any other key for command prompt.

Booting [kernel] in XX seconds…

Press some key other than the Enter key (space is convenient). You now get an “ok” prompt. At this point the system has actually done a fair bit of work, and developed some “state” around the kernel. You can check out the commands available here for your own amusement using “help” but the sequence of commands you should need are:

ok unload boot/kernel/kernel

ok load boot/kernel/kernel_gen

ok boot -s

This will boot the machine to single-user mode. It will ask you for the full pathname to a shell (just hit return to accept the default) and then give you a “#” prompt. Since the machine most likely crashed to get here, it is best to run fsck(8) to check the filesystems. Errors during fsck(8) are to be expected. You basically always want to say “yes” to questions about fixing things. If you run fsck(8) with no arguments it will check all the filesystems listed in /etc/fstab. When it finishes you need to re-mount the root filesystem because right now it is mounted read-only. The most useful way to take care of that is with the command:

# mount -a -t nonfs

which means to mount all filesystems that are not NFS partitions. Now copy the Generic kernel to be what will be booted and reboot the machine:

# cd /boot/kernel

# cp kernel_gen kernel

# reboot

It will automatically boot the Generic kernel now so you should not need to do anything more at the console, just let it autoboot on its own.

Further References

The FreeBSD Handbook is available at http://www.freebsd.org. If you want to look into what functions are available inside the kernel, the FreeBSD developers have started writing manual pages for them. They are in Section 9 of the online manuals, and you can see what manual pages are available there by doing:

% ls /usr/share/man/man9

and then use the man(1) command to view ones that look interesting. Some of the names conflict with names in other sections of the manual. You can force the man(1) command to look in section 9 by giving the section number on the command line. For example:

% man 9 intro

will show you the introductory page. It is a good idea to read that too. You will need to use the kernel’s malloc() function at some point but the command “man malloc” will show you the C-library version malloc(3). Use “man 9 malloc” to see the right manual page.