|
Jason J. Corso
Research Overview
|
Overview of Research
My research interests are in the fields of computer and
medical vision (segmentation and recognition), computational
biomedicine, machine intelligence, statistical learning, perceptual
interfaces and smart environments. My current research focus
is to develop techniques for automatically learning hierarchical
statistical models of complex phenomena and deriving robust efficient
inference algorithms on these models.
The main task of my postdoctoral fellowship was to develop methods that
automatically identify and characterize anatomic and pathologic
structures within medical image data. In this work, I proposed an
approach to incorporate learned model-specific affinity functions in a
Bayesian affinity calculation for graph-based segmentation methods.
I also developed a new energy minimization algorithm called
Graph-Shifts that manipulates a dynamic hierarchical decomposition of
the image to rapidly and robustly perform segmentation and recognition.
In my graduate work, I was a member of the Visual Interaction Cues
project. My dissertation proposed an extensible framework for
building perceptual interfaces that use video-based input devices. I
studied the development of general methods for vision-based
interaction that allow dynamic, unencumbered interaction in
environments augmented with new display technology and both active and
passive vision systems.
My graduate and post-graduate work was jointly funded by an NIH
National Center for Biomedical Computing Award (LONI/CCB), the National
Library of Medicine, the National Science Foundation, and a fellowship
from the Link Foundation.
[Figure: coronal and sagittal views of the subcortical segmentation at shift numbers 5, 50, 500, and 5000.]
|
Graph-Shifts Algorithm for Energy Minimization with Applications to 2D
and 3D Segmentation and Classification
Graph-Shifts is a novel energy minimization algorithm that
manipulates a dynamic hierarchical decomposition of the data
to rapidly and robustly minimize an energy function. The
dynamic hierarchical representation makes it possible to take large
jumps in the energy space, analogous to combined split-and-merge
operations. We use a deterministic approach to quickly choose the
optimal move at each iteration. It has been applied to 2D and 3D joint
image segmentation and classification, as depicted in the accompanying
figure for the segmentation of subcortical brain structures. Graph-Shifts
typically converges orders of magnitude faster than conventional
minimization algorithms, such as PDE-based methods, and has been shown to
be very robust to initialization.
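To make the idea of deterministic, large-jump moves concrete, here is a minimal sketch. It is not the Graph-Shifts algorithm itself: it assumes a fixed two-level hierarchy over a 1D signal (pixels grouped into regions) and a Potts-style energy, whereas Graph-Shifts works on a dynamic multi-level decomposition. The names `energy`, `greedy_region_shifts`, `means`, and `smooth` are hypothetical.

```python
def energy(labels, data, means, smooth=1.0):
    """Potts-style energy: squared data misfit plus a penalty per label change."""
    unary = sum((x - means[l]) ** 2 for x, l in zip(data, labels))
    pairwise = sum(smooth for a, b in zip(labels, labels[1:]) if a != b)
    return unary + pairwise


def greedy_region_shifts(data, regions, region_labels, means, smooth=1.0):
    """Repeatedly apply the single region relabeling that lowers the energy most.

    Assumption: regions are ordered along a 1D signal, so region r is
    adjacent to regions r - 1 and r + 1. Relabeling a region changes all of
    its pixels at once, which is the "large jump" being illustrated.
    """
    region_labels = list(region_labels)

    def expand(labels_per_region):
        # Push each region's label down to its pixels.
        labels = [0] * len(data)
        for pixel_ids, label in zip(regions, labels_per_region):
            for i in pixel_ids:
                labels[i] = label
        return labels

    current = energy(expand(region_labels), data, means, smooth)
    while True:
        best = None
        for r in range(len(regions)):
            for nb in (r - 1, r + 1):
                if 0 <= nb < len(regions) and region_labels[nb] != region_labels[r]:
                    trial = list(region_labels)
                    trial[r] = region_labels[nb]  # take the neighbor's label
                    e = energy(expand(trial), data, means, smooth)
                    if e < current and (best is None or e < best[0]):
                        best = (e, trial)
        if best is None:  # no move lowers the energy: local minimum reached
            return region_labels, current
        current, region_labels = best


data = [0.0, 0.1, 0.0, 0.1, 1.0, 0.9, 1.0, 0.9]
regions = [[0, 1], [2, 3], [4, 5], [6, 7]]
final_labels, final_energy = greedy_region_shifts(
    data, regions, [0, 1, 0, 1], means=[0.0, 1.0]
)
print(final_labels, round(final_energy, 2))
```

Each iteration evaluates every candidate move exhaustively, which is only reasonable at toy scale; the speed reported above presumably depends on bookkeeping that this sketch does not attempt.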
|
|
Efficient Multilevel Image Segmentation and Integrated Bayesian Model Classification

The main task of my postdoctoral position was to develop methods that
automatically identify and characterize anatomic and pathologic
findings within medical image data. Robust and accurate segmentation
of images is a crucial part of biomedical science. For example,
accurate, automatic segmentation of tumor in brain MR images would
provide the necessary quantifiable measurements for studying
population statistics and advancing medical diagnosis. However,
automatic segmentation is a difficult problem: it is
under-constrained, precise biophysical models are generally not yet
known, and the organic data presents high intra-class variance. In
this research, we study methods for automatic segmentation of image
data that strive to combine the efficiency of bottom-up algorithms
with the power of top-down models. The work takes one step toward
unifying two state-of-the-art image segmentation approaches: graph
affinity-based and generative model-based segmentation. Specifically,
the main contribution of the work is a mathematical formulation for
incorporating soft model assignments into the calculation of
affinities, which are traditionally model-free. This Bayesian
model-aware affinity measurement has been integrated into the
multilevel Segmentation by Weighted Aggregation algorithm. As a
byproduct of the integrated Bayesian model classification, each node
in the graph hierarchy is assigned a most likely model class according
to a set of learned model classes. The technique has been applied to
the task of detecting and segmenting brain tumor and edema,
subcortical brain structures and multiple sclerosis lesions in
multichannel magnetic resonance image volumes. Quantifiable
improvements will be shown for the difficult case of brain tumor.
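One way to picture a model-aware affinity is as a class-pair-specific affinity marginalized over each node's soft class assignment. The sketch below is a hedged illustration, not the formulation used in this work: the 1D Gaussian class models, equal class priors, exponential affinity form, and all names (`class_posteriors`, `model_aware_affinity`, `theta`) and numeric values are assumptions.

```python
import math


def gaussian(x, mean, std):
    """1D Gaussian density."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def class_posteriors(x, classes):
    """Soft model assignment P(m | x), assuming equal class priors."""
    likes = {m: gaussian(x, mean, std) for m, (mean, std) in classes.items()}
    total = sum(likes.values())
    return {m: like / total for m, like in likes.items()}


def model_aware_affinity(xu, xv, classes, theta):
    """Affinity between two nodes, marginalized over both soft assignments.

    theta[(mu, mv)] is a hypothetical learned weight for the class pair: a
    small weight tolerates intensity differences, a large weight penalizes
    them strongly.
    """
    pu = class_posteriors(xu, classes)
    pv = class_posteriors(xv, classes)
    return sum(
        pu[mu] * pv[mv] * math.exp(-theta[(mu, mv)] * abs(xu - xv))
        for mu in classes
        for mv in classes
    )


# Hypothetical intensity models (mean, std) and pairwise weights.
classes = {"tumor": (0.8, 0.1), "tissue": (0.3, 0.1)}
theta = {
    ("tumor", "tumor"): 1.0,
    ("tissue", "tissue"): 1.0,
    ("tumor", "tissue"): 20.0,
    ("tissue", "tumor"): 20.0,
}

within = model_aware_affinity(0.78, 0.82, classes, theta)
across = model_aware_affinity(0.35, 0.75, classes, theta)
print(within, across)
```

With these illustrative numbers, two nodes that both look like tumor keep a high affinity, while a likely tissue node and a likely tumor node are pushed apart more strongly than a single model-free weight would push them.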
|
Vision-Based Human-Machine Interaction
We have developed a methodology for vision-based interaction called
Visual Interaction Cues (VICs). The VICs paradigm is a methodology for
vision-based interaction operating on the fundamental premise that, in
general vision-based HCI settings, global user modeling and tracking are
not necessary. For example, when a person presses the number keys while
making a telephone call, the telephone maintains no notion of the user.
Instead, it only recognizes the action of pressing a key. In contrast,
typical methods for vision-based HCI attempt to perform global user
tracking to model the interaction. In the telephone example, such
methods would require a precise tracker for the articulated motion of
the hand. However, such techniques are computationally expensive, are
prone to error and to the re-initialization problem, prohibit the
inclusion of an arbitrary number of users, and often require a complex
gesture language that the user must learn. In the VICs paradigm, we make
the observation that
analyzing the local region around an interface component (the telephone
key, for example) will yield sufficient information to recognize user
actions.
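The local-analysis premise can be sketched as follows. This is a minimal illustration that uses plain frame differencing as a stand-in cue; the actual VICs system is not described here in enough detail to reproduce. The function names, the `(top, left, bottom, right)` box convention, and the thresholds are assumptions.

```python
def region_activity(prev, curr, box, diff_threshold=30):
    """Fraction of pixels inside box whose intensity changed noticeably.

    prev and curr are grayscale frames stored as lists of rows; box is
    (top, left, bottom, right) with exclusive bottom and right edges.
    """
    top, left, bottom, right = box
    changed = total = 0
    for r in range(top, bottom):
        for c in range(left, right):
            total += 1
            if abs(curr[r][c] - prev[r][c]) > diff_threshold:
                changed += 1
    return changed / total if total else 0.0


def pressed_components(prev, curr, components, min_fraction=0.5):
    """Names of interface components whose local region shows activity."""
    return [
        name
        for name, box in components.items()
        if region_activity(prev, curr, box) >= min_fraction
    ]


# Two hypothetical keys on a 6x6 image; only the first region changes.
prev_frame = [[0] * 6 for _ in range(6)]
curr_frame = [row[:] for row in prev_frame]
for r in range(3):
    for c in range(3):
        curr_frame[r][c] = 200

keys = {"key_1": (0, 0, 3, 3), "key_2": (3, 3, 6, 6)}
pressed = pressed_components(prev_frame, curr_frame, keys)
print(pressed)
```

No model of the user is built or tracked: each component inspects only its own region, so the cost grows with the number and size of components rather than with the number of users.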
- J. J. Corso, G. Ye, D. Burschka, and G. D. Hager. A Practical Paradigm and Platform for Video-Based Human-Centered Computing. IEEE Computer, Special Issue on Human-Centered Computing. (in submission).
- Jason J. Corso. Techniques for Vision-Based Human-Computer Interaction. PhD Thesis. The Johns Hopkins University. 2005.
- Jason J. Corso, Guangqi Ye, and Gregory D. Hager. Analysis of Multi-Modal Gestures with a Coherent Probabilistic Graphical Model. Virtual Reality, 2005.
- Jason J. Corso. Vision-Based Techniques for Dynamic, Collaborative Mixed-Realities. In Brian J. Thompson, editor, Research Papers of the Link Foundation Fellows, volume 4. University of Rochester Press. Invited Report for Link Foundation Fellowship; to be released Fall 2004.
- Guangqi Ye, Jason J. Corso, and Gregory D. Hager. Gesture Recognition Using 3D Appearance and Motion Features. In B. Kisacanin, V. Pavlovic, and T. Huang, editors, Real-Time Vision for Human-Computer Interaction. 2005. Extended version of the paper by the same title in Proceedings of Workshop on Real-Time Vision for Human-Computer Interaction (at CVPR 2004).
- Guangqi Ye, Jason J. Corso, Darius Burschka, and Gregory D. Hager. VICs: A Modular HCI Framework Using Spatio-Temporal Dynamics. Machine Vision and Applications, 2004.
- Guangqi Ye, Jason J. Corso, and Gregory D. Hager. Gesture Recognition Using 3D Appearance and Motion Features. In Proceedings of Workshop on Real-time Vision for Human-Computer Interaction (at CVPR 2004), 2004.
- Guangqi Ye, Jason J. Corso, Gregory D. Hager, and Allison M. Okamura. VisHap: Augmented Reality Combining Haptics and Vision. In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, 2003.
- Jason J. Corso, Darius Burschka, and Gregory D. Hager. The 4DT: Unencumbered HCI With VICs. In Proceedings of CVPRHCI, 2003.
- Guangqi Ye, Jason J. Corso, Darius Burschka, and Gregory D. Hager. VICs: A Modular Vision-Based HCI Framework. In Proceedings of 3rd International Conference on Computer Vision Systems, pages 257-267, 2003.
|
|
Coherent Image Regions - Coupled Segmentation and Correspondence
We study methods that attempt to integrate information from coherent
image regions to represent the image. Our novel sparse image
segmentation can be used to solve robust region correspondences and
therefore constrain the search for point correspondences. The
philosophy behind this work is that coherent image regions provide a
concise and stable basis for image representation: concise meaning that
the required space for representing the image is small, and stable
meaning that the representation is robust to changes in both viewpoint
and photometric imaging conditions.
In addition, we have proposed a subspace labeling technique for global
image segmentation. Image segmentation in a particular feature subspace
is a fairly well understood problem. However, it is well known that
operating in only a single feature subspace (e.g., color or texture)
seldom yields a good segmentation for real images, and combining
information from multiple subspaces in an optimal manner is a difficult
problem to solve algorithmically. We propose a solution that fuses
contributions from
multiple feature subspaces using an energy minimization approach. For
each subspace, we compute a per-pixel quality measure and perform a
partitioning through the standard normalized cut algorithm. To fuse the
subspaces into a final segmentation, we compute a subspace label for
every pixel. The labeling is computed through the graph-cut energy
minimization framework proposed by Boykov et al. Finally, we combine
the initial subspace segmentation with the subspace labels obtained from
the energy minimization to yield the final segmentation.
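The fusion step can be sketched on a tiny grid. Note the substitution: this sketch minimizes the labeling energy with iterated conditional modes (ICM), a simple local method, in place of the graph-cut framework of Boykov et al. named above, and it takes the per-subspace partitions as given instead of computing them with normalized cuts. The energy form (negative quality plus a Potts smoothness term) and all names are assumptions.

```python
def fuse_subspaces(quality, segments, smooth=0.5, sweeps=5):
    """Pick one subspace per pixel, then read off that subspace's segment.

    quality[k][r][c]  -- per-pixel quality of subspace k (higher is better)
    segments[k][r][c] -- segment id of the pixel in subspace k's partition
    Returns (label, final): label holds the chosen subspace per pixel and
    final holds (subspace, segment id) pairs, i.e. the fused segmentation.
    """
    n_sub, rows, cols = len(quality), len(quality[0]), len(quality[0][0])

    # Start from the best-quality subspace at every pixel.
    label = [
        [max(range(n_sub), key=lambda k: quality[k][r][c]) for c in range(cols)]
        for r in range(rows)
    ]

    def cost(r, c, k):
        # Data term rewards quality; Potts term penalizes disagreeing neighbors.
        neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        disagree = sum(
            1
            for rr, cc in neighbors
            if 0 <= rr < rows and 0 <= cc < cols and label[rr][cc] != k
        )
        return -quality[k][r][c] + smooth * disagree

    for _ in range(sweeps):  # ICM: greedy per-pixel updates until stable
        changed = False
        for r in range(rows):
            for c in range(cols):
                best = min(range(n_sub), key=lambda k: cost(r, c, k))
                if best != label[r][c]:
                    label[r][c] = best
                    changed = True
        if not changed:
            break

    final = [
        [(label[r][c], segments[label[r][c]][r][c]) for c in range(cols)]
        for r in range(rows)
    ]
    return label, final


# Subspace 0 is reliable everywhere except a noisy center pixel.
quality = [
    [[0.9, 0.9, 0.9], [0.9, 0.4, 0.9], [0.9, 0.9, 0.9]],
    [[0.1, 0.1, 0.1], [0.1, 0.5, 0.1], [0.1, 0.1, 0.1]],
]
segments = [
    [[0, 0, 1], [0, 0, 1], [0, 0, 1]],
    [[0, 0, 0], [1, 1, 1], [1, 1, 1]],
]
subspace_label, fused = fuse_subspaces(quality, segments)
print(subspace_label)
```

In this toy input the center pixel initially prefers subspace 1, but the smoothness term pulls it back to agree with its neighbors, so the fused result follows subspace 0's partition everywhere.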
|
|
Direct Methods for Surface Tracking
We have developed a set of algorithms to directly track planar surfaces
and parametric surfaces with a calibrated stereo rig.
A movie demonstrating the planar surface tracking is here. A binary
pixel mask is maintained that determines which pixels belong to the
plane (and have good texture); it is shown in red in the lower left of
the video. The green vector being rendered is the plane's normal vector.
At left is an image of the system that was built with our plane tracking
routines to localize mobile robots. In the image, we show the real
scene, the two walls that are being tracked (one in blue and one in
red), and an overhead (orthogonal) projection of the reconstructed walls.
- William W. Lau, Nicholas A. Ramey, Jason J. Corso, Nitish Thakor, and Gregory D. Hager. Stereo-Based Endoscopic Tracking of Cardiac Surface Deformation. In Proceedings of Seventh International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2004.
- Nicholas A. Ramey, Jason J. Corso, William W. Lau, Darius Burschka, and Gregory D. Hager. Real Time 3D Surface Tracking and Its Applications. In Proceedings of Workshop on Real-time 3D Sensors and Their Use (at CVPR 2004), 2004.
- Jason J. Corso, Nicholas Ramey, and Gregory D. Hager. Stereo-Based Direct Surface Tracking with Deformable Parametric Models. Technical report, The Johns Hopkins University, 2003. CIRL Lab Technical Report 2003-02.
- Jason J. Corso, Darius Burschka, and Gregory D. Hager. Direct Plane Tracking in Stereo Images for Mobile Navigation. In Proceedings of International Conference on Robotics and Automation, 2003.
- Jason J. Corso and Gregory D. Hager. Planar Surface Tracking Using Direct Stereo. Technical report, The Johns Hopkins University, 2002. CIRL Lab Technical Report.
|
|
Interactive Haptic Rendering of Deformable Surfaces
We have developed a new method for interactive deformation and haptic
rendering of viscoelastic surfaces. Haptic rendering and graphic
rendering place competing demands on the object representation: an
implicit representation is best for haptic interaction, while an
explicit representation is best for graphic rendering. In our approach,
we fuse implicit and explicit object representations, permitting fast
haptic interaction and fast graphic rendering. Objects are defined by a
discretized Medial Axis Transform (MAT), which consists of an ordered
set of circles (in 2D) or spheres (in 3D) whose centers are connected by
a skeleton. Our implementation, called DeforMAT, is appealing because
it takes advantage of single-point haptic interaction to render
efficiently while maintaining a very low memory footprint.
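A discretized 2D MAT of this kind might be represented as in the sketch below, which shows how one structure can serve both an implicit query (how deep is a haptic probe point inside the object?) and an explicit one (boundary samples for drawing). This is not the DeforMAT implementation: treating the object as the union of the circles, and the names `MedialCircle`, `penetration`, and `boundary_points`, are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class MedialCircle:
    """One MAT sample: a skeleton point and the radius of its circle."""
    x: float
    y: float
    radius: float


def penetration(mat, px, py):
    """Implicit query for a single haptic probe point.

    Returns (depth, index) for the circle the point is deepest inside.
    A depth of zero or less means the point is outside the object.
    """
    best_depth, best_index = -math.inf, -1
    for i, c in enumerate(mat):
        depth = c.radius - math.hypot(px - c.x, py - c.y)
        if depth > best_depth:
            best_depth, best_index = depth, i
    return best_depth, best_index


def boundary_points(mat, per_circle=16):
    """Explicit representation: circle samples not covered by another circle."""
    points = []
    for i, c in enumerate(mat):
        for k in range(per_circle):
            angle = 2 * math.pi * k / per_circle
            p = (c.x + c.radius * math.cos(angle), c.y + c.radius * math.sin(angle))
            covered = any(
                math.hypot(p[0] - o.x, p[1] - o.y) < o.radius - 1e-9
                for j, o in enumerate(mat)
                if j != i
            )
            if not covered:
                points.append(p)
    return points


# Ordered circles along a straight skeleton (hypothetical object).
mat = [MedialCircle(0.0, 0.0, 1.0), MedialCircle(1.5, 0.0, 1.0)]
depth_inside, hit = penetration(mat, 0.5, 0.0)
depth_outside, _ = penetration(mat, 5.0, 0.0)
outline = boundary_points(mat)
print(depth_inside, hit, depth_outside, len(outline))
```

Storage is three numbers per skeleton sample, and the probe query is a single pass over the circles, which is consistent with the low memory footprint and single-point interaction described above.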
|
|
Real-Time Volume Visualization
We developed a method for the voxelization of large scalar fields with
the goal of interactive volume rendering. An adaptive octree is used to
optimally sample the underlying unstructured grid. The unstructured
grid is embedded into a voxel-space and those regions not corresponding
to input data are flagged as being outside of the embedded model. The
octree nodes share borders, enabling smooth data continuity between them.
Gradients are computed and stored with the textures for lighting
computation. We integrated this system as a preprocess for an
interactive volume system that we developed. This approach leverages
the current 3D texture mapping PC hardware for the problem of
unstructured grid rendering. We specialize the 3D texture octree to the
task of rendering unstructured grids through a novel pad and stencil
algorithm, which distinguishes between data and non-data voxels. Both
the voxelization and rendering processes efficiently manage large,
out-of-core datasets. The system manages cache usage in main memory and
texture memory, as well as bandwidths among disk, main memory, and
texture memory. It also manages rendering load to achieve interactivity
at all times. It maximizes a quality metric for a desired level of
interactivity. It has been applied to a number of large datasets and
produces high-quality images at interactive, user-selectable frame rates
using standard PC hardware.
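The adaptive sampling idea can be sketched with a small octree builder. This is an illustration under stated assumptions, not the system described above: the field is queried through a hypothetical `sample(x, y, z)` callback that returns `None` outside the model, and a cell is refined when its eight corner values vary by more than a tolerance or when it straddles the model boundary. Border sharing, gradients, textures, and out-of-core management are not modeled.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class OctreeNode:
    origin: tuple          # minimum corner of the cubic cell
    size: float            # edge length of the cell
    outside: bool = False  # True when no corner lies inside the model
    children: list = field(default_factory=list)


def build_octree(sample, origin, size, tol, max_depth):
    """Refine cells where the sampled field varies or crosses the boundary."""
    corners = [
        sample(origin[0] + dx * size, origin[1] + dy * size, origin[2] + dz * size)
        for dx, dy, dz in product((0, 1), repeat=3)
    ]
    inside = [v for v in corners if v is not None]
    node = OctreeNode(origin, size, outside=not inside)

    straddles = 0 < len(inside) < 8  # part data, part non-data
    varies = bool(inside) and max(inside) - min(inside) > tol
    if max_depth > 0 and (straddles or varies):
        half = size / 2
        for dx, dy, dz in product((0, 1), repeat=3):
            child_origin = (
                origin[0] + dx * half,
                origin[1] + dy * half,
                origin[2] + dz * half,
            )
            node.children.append(
                build_octree(sample, child_origin, half, tol, max_depth - 1)
            )
    return node


def count_leaves(node):
    """Number of leaf cells, i.e. cells that would be voxelized directly."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def step_field(x, y, z):
    """A field with a sharp step at x = 0.5 (hypothetical test input)."""
    return 0.0 if x < 0.5 else 1.0


step_tree = build_octree(step_field, (0.0, 0.0, 0.0), 1.0, tol=0.1, max_depth=2)
flat_tree = build_octree(lambda x, y, z: 3.0, (0.0, 0.0, 0.0), 1.0, tol=0.1, max_depth=2)
empty_tree = build_octree(lambda x, y, z: None, (0.0, 0.0, 0.0), 1.0, tol=0.1, max_depth=2)
print(count_leaves(step_tree), count_leaves(flat_tree), empty_tree.outside)
```

Corner sampling is a crude refinement test that can miss detail in the interior of a cell; it is used here only to keep the sketch short.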
|
|
Links to much older research and projects:
- Concept - Interactive CSG modeling.
- CnD - Concurrent and Distributed Development Environment, a JavaSpaces-based environment for distributed software development teams.
|