PAKDD-98 Tutorials

Data Mining: An Overview from Database Perspective
Jiawei Han, Simon Fraser University, Canada

Data mining, or knowledge discovery in databases, is widely recognized as an important research area with broad applications. This tutorial presents a comprehensive survey, from the database perspective, of recently developed data mining techniques. Several kinds of data mining methods, including characterization, association, classification, clustering, time-series analysis, pattern analysis, visual data mining, and meta-rule guided mining, will be reviewed. OLAP mining, a technique that integrates mining and OLAP operations in data warehouses, will be examined in detail. Techniques for mining knowledge in different kinds of databases, including data warehouses and relational, transaction, object-oriented, spatial, and multimedia databases, as well as mining on the WWW, will be examined. Data mining systems, data mining applications, and some research issues will also be discussed.

Jiawei Han received his Ph.D. from the University of Wisconsin, Madison, in 1985. He is Director of the Intelligent Database Systems Research Laboratory, and a Professor in the School of Computing Science, at Simon Fraser University in British Columbia, Canada. He has conducted research in the areas of data mining and data warehousing, deductive and object-oriented databases, spatial databases, multimedia databases, and logic programming, with over 100 journal and conference publications. He is an editor of IEEE Transactions on Knowledge and Data Engineering, the Journal of Intelligent Information Systems, and Data Mining and Knowledge Discovery: An International Journal. He has served or is currently serving on the program committees of over 30 international conferences and workshops, including ICDE'95 (Program Committee Vice-Chair), DOOD'95, ACM-SIGMOD'96, VLDB'96, KDD'96 (Program Co-Chair), CIKM'97, SSD'97, KDD'97, and ICDE'98.

Applications of Minimum Message Length in Data Analysis
Chris Wallace, Monash University, Australia

Minimum Message Length is a technique based on information theory for discovering and confirming patterns in data. Essentially, Minimum Message Length considers a pattern to have occurred in data if the assumption of the pattern enables the data to be encoded more concisely. The relevant mathematical apparatus and detailed applications will be covered.
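
The encoding criterion above can be illustrated with a small Python sketch. It is not taken from the tutorial: the function names, the fixed 4-bit parameter grid, and the coin-flip data are all assumptions chosen for illustration, and the two-part code here is a crude stand-in for the full Minimum Message Length treatment, which also optimizes the precision to which parameters are stated. The sketch compares a "no pattern" encoding (one bit per outcome) with a two-part message that first states a bias and then encodes the data under that bias.

```python
import math

def null_length_bits(data):
    """Message length with no pattern assumed: a fair coin, 1 bit per outcome."""
    return float(len(data))

def two_part_length_bits(data, precision_bits=4):
    """Two-part message: state a bias p, then encode the data given p.

    Assumption for illustration: p is restricted to a grid of
    2**precision_bits values, so stating it costs precision_bits bits.
    """
    n = len(data)
    k = sum(data)  # number of 1s
    levels = 2 ** precision_bits
    best = math.inf
    for i in range(levels):
        p = (i + 0.5) / levels  # grid midpoints, never exactly 0 or 1
        data_bits = -(k * math.log2(p) + (n - k) * math.log2(1 - p))
        best = min(best, precision_bits + data_bits)
    return best

def pattern_detected(data, precision_bits=4):
    """Accept the 'biased coin' pattern only if it shortens the total message."""
    return two_part_length_bits(data, precision_bits) < null_length_bits(data)

skewed = [1] * 90 + [0] * 10   # strongly biased outcomes
balanced = [1, 0] * 50         # no bias to exploit

print(pattern_detected(skewed))
print(pattern_detected(balanced))
```

For the skewed sequence, the cost of stating the bias is repaid by a much shorter encoding of the data, so the pattern is accepted. For the balanced sequence, stating a bias only adds overhead, so it is rejected.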

Professor Chris Wallace received his PhD from the University of Sydney in 1959. He is a Fellow of the Association for Computing Machinery and the Australian Computer Society. His main research interests are in information theory and computer architecture. Among other distinguished contributions, Professor Wallace conceived and developed (initially with D. Boulton) a new theory for multivariate analysis based upon information and coding theory. This technique is now embodied in a large computer program used by research workers in several biological and social science disciplines in Australia and overseas. The basic theory and technique have also been applied to the testing and refinement of extremely complex hypotheses in archeology, and are currently being developed (with J. Patrick and P.R. Freeman) as a new and very general method for statistical and inductive inference. Some of the research results have been published in the Machine Learning journal (in 1993), the 1996 International Conference on Machine Learning, and the 1997 International Joint Conference on Artificial Intelligence.