Bina Ramamurthy: Research Interests


My Ph.D. dissertation dealt with the design and implementation of a comprehensive checkpointing and recovery scheme for distributed systems. It covered three major areas: distributed systems, software engineering, and computer architecture. The error detection aspects of the dissertation involved architectural design. The checkpointing and recovery component dealt with distributed systems. The object-oriented testbed, designed and implemented to evaluate the performance of my design, applied software engineering principles.


Distributed Computing Environment

I have diversified into applied research that involves design using the Open Software Foundation's Distributed Computing Environment (OSF's DCE). More specifically, I am interested in evaluating the viability and performance of three-tier client-server models using OSF's DCE. The issues that will be addressed in this research are: the three-tier model, the security provided by DCE, the use of multiple threads for service, cross-platform service (currently only Solaris and Windows NT), and plugging in a realistic database server (such as Oracle). I am guiding two undergraduates on this project. Two DCE vendors, Transarc and Gradient, have provided free evaluation licenses for their versions of DCE. Both students working with me are graduating this semester: one is going on to graduate school with plans to continue research in the same area (DCE), and the other is going to work at IBM in a related area (CORBA). These two undergraduates chose me over other faculty for their independent study research. I plan to continue my research in this area to explore building model systems using OSF's DCE. Many campus-wide computing organizations are interested in this project. We recently presented one of our pilot projects to a campus-wide DCE group. This research has excellent potential for attracting grants from industry.
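To make the three-tier structure concrete, here is a minimal sketch in plain Python. It does not use DCE RPC or DCE security, and every name in it (DataTier, ServiceTier, client, the sample accounts) is a hypothetical stand-in rather than part of the project. It only illustrates the separation of a client, a multi-threaded service tier, and a data tier that a real database server such as Oracle would replace.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DataTier:
    """Third tier: a stand-in for a real database server (hypothetical data)."""

    def __init__(self):
        self._rows = {"alice": 120, "bob": 75}
        self._lock = threading.Lock()  # serialize access to the shared table

    def balance(self, user):
        with self._lock:
            return self._rows.get(user)


class ServiceTier:
    """Middle tier: serves each client request on a worker thread."""

    def __init__(self, data, workers=4):
        self._data = data
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, user):
        # Returns a future; the request runs on one of the pool's threads.
        return self._pool.submit(self._handle, user)

    def _handle(self, user):
        value = self._data.balance(user)
        if value is None:
            return f"{user}: unknown"
        return f"{user}: {value}"

    def close(self):
        self._pool.shutdown(wait=True)


def client(service, users):
    """First tier: issues requests and collects replies in request order."""
    futures = [service.submit(u) for u in users]
    return [f.result() for f in futures]
```

In this sketch the client never touches the data tier directly, so the middle tier is the natural place to add authentication checks or to swap the in-memory table for a database connection.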


Checkpointing and Recovery

The checkpointing and recovery scheme designed for my dissertation provides the basis for interesting theoretical work. An important contribution of this work is the notion of "safety" of the error detection points of the various processes involved in a distributed computation. An error detection point of a process is safe if the process need not be rolled back beyond this point during recovery. A global state representing the safe detection points of the various processes is maintained using a matrix structure. Checkpointing during normal system operation, the safety criteria, and recovery are formally explained based on this matrix. I am currently working on a journal paper on this topic. This work provides a theoretical framework for analyzing similar schemes in distributed systems.
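The layout of the dissertation's matrix is not given in this summary, so the following is only a hypothetical illustration, written in the style of matrix clocks rather than as the actual scheme. It assumes that entry m[i][j] holds the latest detection point of process j that process i knows to have passed without error, and it treats a detection point of j as safe once every process knows about it. The class and method names are invented for the example.

```python
class SafetyMatrix:
    """Hypothetical matrix-clock-style record of error detection points.

    m[i][j] = latest detection point of process j that process i knows
    to have passed without error (0 means none yet).
    """

    def __init__(self, n):
        self.n = n
        self.m = [[0] * n for _ in range(n)]

    def pass_detection_point(self, j):
        """Process j passes its next detection point without error."""
        self.m[j][j] += 1
        return self.m[j][j]

    def merge_view(self, i, j):
        """Process i learns everything process j knows (e.g. via a message)."""
        for k in range(self.n):
            self.m[i][k] = max(self.m[i][k], self.m[j][k])

    def safe_point(self, j):
        """Latest detection point of j known to every process.

        Under this sketch's assumption, recovery never needs to roll
        process j back beyond this point.
        """
        return min(self.m[i][j] for i in range(self.n))
```

For example, with three processes, after process 0 passes two detection points its safe point stays at 0 until processes 1 and 2 have both merged its view, at which point safe_point(0) becomes 2.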


An Object-Oriented Testbed

The object-oriented testbed (OTEC) designed for my dissertation is a useful tool for practitioners and designers in the area of checkpointing and recovery. I am refining the design and adding more facilities and a graphical user interface so that it can be made publicly available. I hope to gain experience in designing robust software packages using software engineering principles. The package is written in C++, and we plan to use Xlib for the graphical user interface. A future graduate student of the Computer Science department is working on this project under my guidance. A technical paper based on this topic has been accepted for presentation at the International Symposium on Fault-Tolerant Computing, FTCS-27.


Processor Design for Multi-threading

My special research interest is in computer architecture. I have taught RISC architecture in all my hardware-based courses since its introduction. I am particularly interested in super-scalar architectures and architectures for multi-threading. With the proliferation of client-server computing and the Internet, applications are becoming increasingly multi-threaded. However, the underlying architectures are designed for a single thread of execution. I have ideas for exploring compiler technology and super-scalar architectures for supporting multiple threads. Most solutions for control and data dependencies in the context of super-scalar architectures assume a single thread of control. By presenting multiple independent threads to the instruction dispatcher, and with additional logic for keeping track of the various threads, it should be possible to increase the issue rate as well as processor utilization. I recently taught a graduate course in computer architecture that helped me gain useful insights into recent trends in this area. I will be seeking funding from industry as well as federal sources such as DARPA (see DARPA BAA97-03) and NSF for research projects in this area.
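The claim about issue rate can be illustrated with a toy model. The sketch below is not a design from this work; it assumes in-order issue within each thread, a one-cycle result latency, separate register files per thread, and round-robin selection among threads, and the function name and instruction encoding are made up for the example. It counts how many cycles a fixed-width dispatcher needs to issue all instructions.

```python
def cycles_to_issue(threads, width):
    """Toy model of a multi-threaded instruction dispatcher.

    threads: list of programs; each program is a list of
             (dest_register, (source_registers...)) tuples.
    width:   maximum number of instructions issued per cycle.
    Returns the number of cycles needed to issue every instruction.
    """
    pcs = [0] * len(threads)              # next instruction per thread
    ready_at = [dict() for _ in threads]  # per-thread: register -> ready cycle
    remaining = sum(len(t) for t in threads)
    cycle = 0
    while remaining > 0:
        slots = width
        progress = True
        while slots > 0 and progress:
            progress = False
            for t, program in enumerate(threads):  # round-robin over threads
                if slots == 0:
                    break
                if pcs[t] >= len(program):
                    continue
                dest, sources = program[pcs[t]]
                # Issue only if every source value was produced in an earlier cycle.
                if all(ready_at[t].get(s, 0) <= cycle for s in sources):
                    ready_at[t][dest] = cycle + 1
                    pcs[t] += 1
                    slots -= 1
                    remaining -= 1
                    progress = True
        cycle += 1
    return cycle


# Each instruction depends on the one before it, so a single thread cannot
# use more than one issue slot per cycle.
chain = [("r1", ("r1",))] * 4
single_thread = cycles_to_issue([chain + chain], width=2)
two_threads = cycles_to_issue([chain, chain], width=2)
```

Under these assumptions, the eight dependent instructions take eight cycles as one thread but four cycles when split across two independent threads, because the second issue slot is no longer idle.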

In summary, my research interests cover three very interesting areas that have many open problems and offer ample opportunity for applied research and for attracting research grants.