Programme type: Doctoral
Data mining aims to reveal non-trivial, hidden and ultimately applicable knowledge in large data sets. This course focuses on two key data mining issues: data size and data heterogeneity. When dealing with large data, it is important to address both technical issues, such as distributed computing and hashing, and general algorithmic complexity. The first part of the course will be motivated mainly by case studies on web and social network mining. The second part will discuss approaches that merge heterogeneous prior knowledge with measured data. Bioinformatics will be the main application field here. It is assumed that students have completed the master's course on Machine Learning and Data Analysis (A4M33SAD).
Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman: Mining of Massive Datasets, Cambridge University Press, 2011.