CSE 722: Selected Topics in Data Mining
Spring 2013
Basic Information
- Instructor: Jing Gao (jing@buffalo.edu)
- Time: 1-3pm Wednesday
- Location: 113A Davis Hall
- Office Hours: 1:30-3:30pm Tuesday
- Office: 350 Davis Hall
Overview
Data mining is the process of discovering new and insightful knowledge from large bodies of data.
The amount of data in our world has been exploding, and nearly every industry is desperate to infer actionable
knowledge from data. There are great opportunities as well as numerous research challenges for data mining in social media analysis,
medical domains, computer security and many other fields. This seminar will provide an overview of the state-of-the-art data
mining techniques that arise in real applications. We will cover advanced techniques and algorithms for data mining as well
as emerging data mining applications. This course will be highly beneficial to students whose research interests are in data
mining, machine learning, bioinformatics, databases, information retrieval, artificial intelligence, and also to those who
may need to apply data mining to any application.
Prerequisites
Students must have taken at least one course in data mining, machine learning, pattern recognition, information retrieval, or another data-analytics-related field.
Course Structure
We will discuss two papers each week. The papers will be selected from recent publications and surveys from top conferences and journals in data mining, machine learning, or other relevant domains. Grading is S/U, and each student is required to present one or two papers in class. Students who register for 3 credits are also required to complete a research project in data mining. (As stated in Prerequisites, you MUST have taken at least one course in a data-mining-related area.)
Course Topics and Schedule
This semester, we will focus on topics relevant to mining big data. Three features define big data: 1) Volume (i.e., unprecedentedly large data sets), 2) Velocity (i.e., evolving and streaming data), and 3) Variety (i.e., data from multiple sources). The topics we will discuss include integration of multi-source data, evolutionary pattern discovery from streaming data, and parallel processing algorithms for large-scale data mining.