Knowledge Discovery and Data Mining (KDD) is an interdisciplinary area that draws on techniques from statistics and machine learning to extract useful information from large amounts of data. Many application areas, ranging from security to genome studies, benefit from the techniques introduced in this field.

Please sign up here at the beginning of class.

In this course, we will discuss the main concepts and techniques for processing and evaluating data. Throughout the semester, students will be exposed to various problems in data mining and the solutions proposed for them. The course will be oriented mostly towards the analysis of data.

Prerequisites: Good knowledge of linear algebra, probability, algorithms, and programming.

When emailing me, please put BIL713 in the subject line.


The final grade will consist of the following:
Paper Presentation: 15%
Participation (attendance, participation in discussions): 5%
Homework (two or three programming assignments): 20%
Project (progress report and final report): 40%
Final Exam: 20%

Paper presentations

Depending on enrollment, each student will need to present a few recent papers on data mining in class. The presentation should be clear and well rehearsed, and the student should read the assigned paper and related work in enough detail to be able to lead a discussion and answer questions. Extra credit will be given to students who also prepare a simple experimental demo highlighting how the method works in practice. A list of papers to select from will be provided.

A presentation should be roughly 20-25 minutes long (no overtime, please). You may use material from presentations on the web as long as you cite the source properly. In the presentation, also cite the paper you present and any related work you reference.

Deadline: The presentation should be handed in at least one day before the class (or earlier if you want feedback).

Structure of presentation:
  • Main motivation and clear statement of the problem
  • Overview of the technical approach
  • Overview of the experimental evaluation
  • Strengths/weaknesses of the paper
  • Discussion: future direction, links to other work

Project

Each student will need to write a short project proposal at the beginning of the course (in March). The projects will be research oriented. In the middle of the semester, you will need to hand in a progress report. The final report will be due at the end of classes and is expected to be of conference-paper quality.

Students can work on projects individually or in pairs. The project can be on any interesting topic that the student proposes independently or develops with the help of the instructor. The grade will depend on the ideas, how well you present them in the report, how well you position your work in the related literature, how thorough your experiments are, and how comprehensive your conclusions are.

Related courses:


Software:


Popular datasets:


Main conferences:


