Prof. Corso moved to the Electrical Engineering and Computer Science department at the University of Michigan in 8/2014. He continues his work and research group in high-level computer vision at the intersection of perception, semantics/language, and robotics. Unless you are looking for something specific and historical here, you would probably rather go to his new page.
Vision Seminar

CSE 705: Vision Seminar on Spatiotemporal Video Analysis
SUNY at Buffalo
Fall 2010

Instructors: Jason Corso (jcorso), Raymond Fu (yunfu)
Course Webpage:
Meeting Times: M 11-1
Location: Vision Lab, Lockwood B20A
Office Hours: (Corso) M 4-5, F 2-3


  • First meeting is Monday, 8/30, to discuss paper list and logistics.

Main Course Material

Course Overview: This is a seminar course covering spatiotemporal video analysis. We will read and discuss papers on this topic throughout the semester, with the students primarily in charge of leading the discussions.

Prerequisites: Students are assumed to have significant experience with computer vision, machine learning, and image analysis.

Grading: Grading is P/F unless a student specifically requests otherwise.

Course Outline

See the paper list below for the full citations; only the authors and venue are listed here.
Date  | 11-12                         | Speaker 1 | 12-1                    | Speaker 2
------+-------------------------------+-----------+-------------------------+----------
8/30  | Ifeoma talk.                  | Ifeoma    | Introduction            |
9/6   | No Meeting                    |           |                         |
9/13  | Research Talk                 | Jeff      |                         |
9/20  | Grundmann et al. CVPR 10      | Sagar     | Research Talk           | Jason
9/27  | Fei-Fei et al. BMVC 05        | Kushal    | Research Talk           | Caiming
10/4  | Laptev et al. CVPR 08         | Duygu     | No Meeting              |
10/11 | Badrinarayanan et al. CVPR 10 | Utkarsh   | Rav-Acha et al. CVPR 06 | Kevin
10/18 | Savarese et al. MVC 08        | Xin       | Research Talk           | Gang
10/25 | Bai et al. SIGGRAPH 09        | Albert    | Sun et al. CVPR 2009    | Caiming
11/1  | Ross et al. NIPS 05           | Aishwarya | CVPR Round-Up           | All
11/8  | No Meeting                    |           | No Meeting              |
11/15 | Bobick and Davis PAMI 2001    | Ananth    | Blank et al. ICCV 05    | Jeff
11/29 | Zhou et al. NIPS 06           | Kushal    | Research Talk           | Avik
12/6  | Zhu and Mumford FTCGV 07      | Avik      | Research Talk           | Sagar

Paper List
The paper list was circulated in class. This is a partial list and can be augmented by participants.
  • Video Segmentation
    • S. Paris. Edge-preserving smoothing and mean-shift segmentation of video streams. In ECCV, 2008.
    • W. Brendel and S. Todorovic. Video object segmentation by tracking regions. In ICCV, 2009.
    • Y. Huang, Q. Liu, and D. Metaxas. Video object segmentation by hypergraph cut. In CVPR, 2009.
    • M. Grundmann, V. Kwatra, M. Han, and I. Essa. Efficient hierarchical graph-based video segmentation. In CVPR, 2010.
  • Video Segmentation (Interactive)
    • B. Price, B. Morse, and S. Cohen. Livecut: Learning-based interactive video segmentation by evaluation of multiple propagated cues. In ICCV, 2009.
    • X. Bai, J. Wang, D. Simons, and G. Sapiro. Video snapcut: robust video object cutout using localized classifiers. ACM SIGGRAPH, 28, 2009.
  • Spatiotemporal Interest Points
    • I. Laptev and T. Lindeberg. Space-time Interest Points. ICCV 2003.
    • P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie. Behavior recognition via sparse spatio-temporal features. VS-PETS 2005.
    • Y. Ke, R. Sukthankar, and M. Hebert. Efficient Visual Event Detection using Volumetric Features. ICCV 2005.
    • A. Oikonomopoulos, I. Patras, and M. Pantic. Spatiotemporal Salient Points for Visual Recognition of Human Actions. SMC-B 36(3):710-719. 2006.
    • I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning realistic human actions from movies. In CVPR, pages 1–8, Anchorage, Alaska, June 2008.
  • Activity/Motion Recognition/Learning
    • R. Polana and R. C. Nelson. Detecting activities. CVPR 1993.
    • A. Madabhushi and J. K. Aggarwal. A bayesian approach to human activity recognition. In VS ’99: Workshop on Visual Surveillance, page 25, 1999.
    • A. F. Bobick and J. W. Davis. The recognition of human movement using temporal templates. IEEE PAMI, 23:257–267, 2001.
    • C. Schuldt, I. Laptev, and B. Caputo. Recognizing human actions: A local svm approach. In ICPR, pages 32–36, 2004.
    • M. Blank, L. Gorelick, E. Shechtman, M. Irani, and R. Basri. Actions as Space-Time Shapes. ICCV 2005. (or PAMI Version)
    • A. Bissacco and S. Soatto. Classifying Human Dynamics Without Contact Forces. CVPR 2006.
    • T. T. Truyen, D. Q. Phung, S. Venkatesh and H. H. Bui. AdaBoost.MRF: Boosted Markov Random Forests and Application to Multilevel Activity Recognition. CVPR 2006.
    • A. Veeraraghavan, R. Chellappa and A.K. Roy-Chowdhury. The Function Space of an Activity. CVPR 2006.
    • J. C. Niebles and L. Fei-Fei. A hierarchical model of shape and appearance for human action classification. CVPR 2007.
    • E. Shechtman and M. Irani. Space-time behavior based correlation -- OR -- How to tell if two underlying motion fields are similar without computing them? PAMI 2007. 29(11):2045-2056.
    • H. Jiang and D. R. Martin. Finding actions using shape flows. ECCV 2008.
    • J. Sun, X. Wu, S. Yan, L.-F. Cheong, T.-S. Chua, and J. Li. Hierarchical spatio-temporal context modeling for action recognition. In CVPR, 2009.
    • R. Messing, C. Pal, and H. Kautz. Activity recognition using the velocity histories of tracked keypoints. ICCV 2009.
  • Unsupervised Action Analysis
    • J. C. Niebles, H. Wang, and L. Fei-Fei. Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words. BMVC 2005.
    • G. Mori, H. Jiang, M. S. Drew, Y. Wang. Unsupervised Discovery of Action Classes. CVPR 2006.
    • S. Savarese, A. D. Pozo, J. C. Niebles, and L. Fei-Fei. Spatial-temporal correlations for unsupervised action classification. In Motion and Video Computing, 2008.
  • Video Summarization (only a few)
    • H.-W. Kang, Y. Matsushita, X. Tang and X.-Q. Chen. Space-Time Video Montage. CVPR 2006.
    • A. Rav-Acha, Y. Pritch and S. Peleg. Making a Long Video Short: Dynamic Video Synopsis. CVPR 2006.
  • Technical Background Papers
    • D. Zhou, J. Huang, and B. Schölkopf. Learning with hypergraphs: Clustering, classification, and embedding. In NIPS, 2006.
    • R. Zass and A. Shashua. Probabilistic graph and hypergraph matching. In CVPR, 2008.
    • D. Freedman and P. Kisilev. Fast mean shift by compact density representation. In CVPR 2009.
    • D. Ross, J. Lim, R.-S. Lin, and M.-H. Yang. Incremental Learning for Robust Visual Tracking. In NIPS 2005.

last updated: Sat Jun 21 07:38:45 2014; copyright jcorso