CSE 722: Selected Topics in Data Mining

Spring 2014

Basic Information
Overview

Data Mining is the process of discovering new and insightful knowledge from large bodies of data. The amount of data in our world has been exploding, and nearly every industry is desperate to infer actionable knowledge from data. There are great opportunities as well as numerous research challenges for data mining in social media analysis, medical domains, computer security and many other fields. This seminar will provide an overview of the state-of-the-art data mining techniques that arise in real applications. We will cover advanced techniques and algorithms for data mining as well as emerging data mining applications. This course will be highly beneficial to students whose research interests are in data mining, machine learning, bioinformatics, databases, information retrieval, artificial intelligence, and also to those who may need to apply data mining to any application.

Prerequisites

Students must have taken at least one course in data mining, machine learning, pattern recognition, information retrieval, or another data analytics related field.

Course Structure

We discuss two papers each week. The papers are selected from recent publications and surveys in top conferences and journals in data mining, machine learning, and other relevant domains. Grading is S/U, and each student is required to present and review one or two papers in class. Students registered for 3 credits are also required to complete a survey on a specific topic in data mining and to present and review more papers. (As stated in Prerequisites, you MUST have taken at least one course in a data mining related area.)

Course Topics and Schedule

This semester, we will focus on the following topics: crowdsourcing, stream mining, online learning, location data mining, trajectory mining, spatial-temporal data mining, graph mining, mining social media data, recommendation, and health data analysis. You can find three example presentation slides here: Slide 1, Slide 2, Slide 3.

Date | Paper | Presenter | Reviewers
Week 1 (January 29) | Introduction | Jing Gao | N/A
Week 2 (February 5) | Knowledge Graph Tutorial | Jing Gao | N/A
Week 3 (February 12) | Reactive Crowdsourcing | Pradnya Kulkarni | Vrushal Vijay Mhatre, Mahesh Jaliminche
Week 3 (February 12) | Recursive Fact-finding: A Streaming Approach to Truth Estimation in Crowdsourcing Applications | Vrushal Vijay Mhatre | Ramanan Parthasarathy, Taher Suterwala
Week 4 (February 19) | Mining Evolutionary Multi-Branch Trees from Text Streams | Mahesh Jaliminche | Sinchan Bhattacharya, Vrushal Vijay Mhatre
Week 4 (February 19) | Towards Never-Ending Learning from Time Series Streams | Wei Zheng | Chuishi Meng, Anirudha Karwa
Week 5 (February 26) | Online Data Fusion | Chuishi Meng | Ajinkya S, Ramanan Parthasarathy
Week 5 (February 26) | Cost-Sensitive Online Active Learning with Application to Malicious URL Detection | Ajinkya S | Yazhou Cao, Sayali Deshmukh
Week 6 (March 5) | Online Community Detection in Social Sensing | Vidya Ramachandran | Poonam Pradhan, Wei Zheng
Week 6 (March 5) | Modeling/Predicting the Evolution Trend of OSN-based Applications | Prachi Gokhale | Sayali Deshmukh, Amol Chandla
Week 7 (March 12) | Geo-Spotting: Mining Online Location-based Services for Optimal Retail Store Placement | Amol Chandola | Roshini Sebastian, Lei Lin
Week 7 (March 12) | Trade Area Analysis using User Generated Mobile Location Data | Sanjay Ramanathan | Prachi Gokhale, Dilip Pednekar
March 19 | Spring Break (No Class) | N/A | N/A
Week 8 (March 26) | TODMIS: Mining Communities from Trajectories | Ramanan Parthasarathy | Prachi Gokhale, Pradnya Kulkarni
Week 8 (March 26) | Mining Lines in the Sand: On Trajectory Discovery From Untrustworthy Data in Cyber-Physical System | Lei Lin | Poonam Pradhan, Mahesh Jaliminche
Week 9 (April 2) | U-Air: When Urban Air Quality Inference Meets Big Data | Roshini Sebastian | Vidya Ramachandran, Lei Lin
Week 9 (April 2) | Who, Where, When and What: Discover Spatio-Temporal Topics for Twitter Users | Rahul Tejwani | Sanjay Ramanathan, Vidya Ramachandran
Week 10 (April 9) | Modeling Dynamic Behavior in Large Evolving Graphs | Poonam Pradhan | Xiaowei Jia, Pradnya Kulkarni
Week 10 (April 9) | Efficient Processing of Streaming Graphs for Evolution-Aware Clustering | Sayali Deshmukh | Ajinkya S, Wei Zheng
Week 11 (April 16) | Inferring Anchor Links across Multiple Heterogeneous Social Networks | Sinchan Bhattacharya | Dilip Pednekar, Devashish Bharadwaj
Week 11 (April 16) | Connecting Users across Social Media Sites: A Behavioral-Modeling Approach | Dilip Pednekar | Devashish Bharadwaj, Sinchan Bhattacharya
Week 12 (April 23) | Unsupervised Feature Selection for Multi-View Data in Social Media | Houping Xiao | Xiaowei Jia, Rahul Tejwani
Week 12 (April 23) | What's Your Next Move: User Activity Prediction in Location-based Social Networks | Anirudha Karwa | Sanjay Ramanathan, Yazhou Cao
Week 13 (April 30) | From Amateurs to Connoisseurs: Modeling the Evolution of User Expertise through Online Reviews | Devashish Bharadwaj | Roshini Sebastian, Rahul Tejwani
Week 13 (April 30) | Making Recommendations from Multiple Domains | Yazhou Cao | Houping Xiao, Amol Chandla
Week 14 (May 7) | Multi-Source Learning with Block-wise Missing Data for Alzheimer's Disease Prediction | Taher Suterwala | Houping Xiao, Chuishi Meng
Week 14 (May 7) | Network Discovery via Constrained Tensor Analysis of fMRI Data | Xiaowei Jia | Taher Suterwala, Anirudha Karwa