ContactPerson: aygun@cse.buffalo.edu
Remote host: paganini.cse.buffalo.edu
### Begin Citation ### Do not delete this line ###
%R 2003-07
%U /tmp/aygun_dissertation.pdf
%A Aygun, Ramazan Savas
%T Spatio-Temporal Browsing of Multimedia Presentations
%D May 08, 2003
%I Department of Computer Science and Engineering, SUNY Buffalo
%K Multimedia Presentations, Spatial Browsing, Temporal Browsing, Sprite Generation, Multimedia Synchronization
%X Emerging applications such as asynchronous distance learning and collaborative engineering require media streams to be organized as multimedia presentations. Browsing presentations enables interactive surfing of multimedia documents. We propose spatio-temporal browsing of multimedia presentations, in which browsing can be performed in both the spatial and temporal domains. Spatial browsing is provided by incorporating camera controls such as panning, tilting, and zooming. Panoramic images enable a kind of browsing by storing the image at high resolution from various angles. However, generating a high-resolution sprite (mosaic) from digital video is not an easy task. Since the video data may also exist in a compressed format, new features such as boundaries have to be extracted from the compressed video. We consider compressed data generated by the Discrete Cosine Transform (DCT), which is used in MPEG-1, MPEG-2, MPEG-4, and H.263. Global Motion Estimation (GME) has been improved for videos in which motion does not occur frequently. Motion sensors, which are pixels sensitive to motion, are proposed to indicate the existence of motion and to yield a quick approximation of the motion. Motion sensors reduce the amount of computation in the hierarchical evaluation of low-pass filtered images in iterative descent methods. The generated sprites are usually more blurred than the original frames because of the image warping stage and errors in motion estimation.
The temporal integration of images is performed using the histemporal filter, which is based on the histogram of values within an interval. The initial frame in the video sequence is registered at a higher resolution to generate a high-resolution sprite. Instead of warping every frame, frames are warped into the sprite at intervals to reduce blurring in the sprite. We also introduce a new kind of sprite, called the conservative sprite, in which only new pixels are mapped onto the sprite during the temporal integration phase. The sprite pyramid is introduced to handle the sprite at different resolutions. To measure the quality of the sprite, a new measure called sharpness is used to estimate the blurring in the sprite. The generated sprite is used for spatial browsing. Temporal browsing, on the other hand, is closely related to the synchronization of different streams. The power of synchronization models is limited to the synchronization specifications and user interactions that they support. The proposed synchronization model is an event-based model that can handle time-based actions while enabling user interactions such as backward and skip. The synchronization model processes synchronization rules based on Event-Condition-Action (ECA) rules. Since the structure of a synchronization rule is simple, the rules can be manipulated easily in the presence of user interactions. The synchronization model uses the Receiver-Controller-Actor (RCA) scheme to execute the rules. In the RCA scheme, receivers, controllers, and actors are objects that receive events, check conditions, and execute actions, respectively. The synchronization rules can easily be regenerated from SMIL expressions. The deduction of synchronization rules is based on the author's specification. A middle layer between the specification and the synchronization model assists the synchronization model in providing user interactions while keeping the synchronization specification minimal. We call this middle layer the middle-tier.
The middle-tier for multimedia synchronization handles synchronization rules that can be extracted explicitly from the user specification and synchronization rules that can be deduced implicitly from the explicit synchronization rules. The synchronization model also generates a virtual timeline to manage user interactions that change the course of the presentation. The verification and correctness of schedules are also important. The general methods for checking the correctness of a specification are theoretical verification, simulation, and testing. Model checking is a technique that automatically detects all the states that a model can enter and checks the truth of well-formed formulas. Moreover, model checking can present counterexamples if the formulas are not satisfied. The PROMELA/SPIN tool has been used for model checking to check LTL (Linear Temporal Logic) formulas. These formulas can be generated and verified automatically.
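
The Receiver-Controller-Actor (RCA) scheme described above can be illustrated with a minimal sketch in Python. This is not the dissertation's implementation: the class layout, the event names ("video.end", "audio.end"), the "direction" state key, and the run_demo helper are all hypothetical and chosen only to show how one Event-Condition-Action rule might be split across a receiver (event), a controller (condition), and an actor (action).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical presentation state shared by conditions and actions.
State = Dict[str, object]


@dataclass
class Actor:
    """Executes one action when triggered (the 'A' of an ECA rule)."""
    action: Callable[[State], None]

    def act(self, state: State) -> None:
        self.action(state)


@dataclass
class Controller:
    """Checks a condition and, if it holds, triggers its actors (the 'C')."""
    condition: Callable[[State], bool]
    actors: List[Actor] = field(default_factory=list)

    def notify(self, state: State) -> None:
        if self.condition(state):
            for actor in self.actors:
                actor.act(state)


@dataclass
class Receiver:
    """Receives one named event and forwards it to its controllers (the 'E')."""
    event: str
    controllers: List[Controller] = field(default_factory=list)

    def receive(self, event: str, state: State) -> None:
        if event == self.event:
            for controller in self.controllers:
                controller.notify(state)


def run_demo() -> List[str]:
    """Assumed rule: when the video ends and playback is forward, start audio."""
    log: List[str] = []
    state: State = {"direction": "forward"}

    start_audio = Actor(lambda s: log.append("start audio"))
    forward_only = Controller(lambda s: s["direction"] == "forward", [start_audio])
    video_ended = Receiver("video.end", [forward_only])

    video_ended.receive("video.end", state)  # condition holds: action runs
    state["direction"] = "backward"
    video_ended.receive("video.end", state)  # condition fails: no action
    video_ended.receive("audio.end", state)  # different event: ignored
    return log
```

In this sketch a user interaction such as backward only changes the shared state, so the same rule objects can be reused without being rewritten. That is one possible reading of why simple rule structures are easy to manipulate under user interaction.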