ICDM '01
The 2001 IEEE International Conference on Data Mining

Sponsored by the IEEE Computer Society

Doubletree Hotel, San Jose, California, USA
November 29 - December 2, 2001

ICDM 2001 Program

Notes for poster paper authors:
  1. Each poster preview on the afternoon of November 30 is a 10-minute presentation.
  2. At the Posters and Software Demos session (November 30, 6:30pm - 9:00pm), each poster paper may post around 12 sheets (A4 or similar size). The conference will provide a set of white foam boards for each poster paper.

November 29


Tutorials (Monterey Room)

Workshop (Carmel Room)

Workshop (Santa Clara Room)


Text Mining for Bioinformatics
by Hinrich Schuetze
Integrating Data Mining and Knowledge Management
Text Mining (TextDM '2001)
2:00 Mining Time Series Data
by Eamonn Keogh
November 30
8:30 Opening/Awards (Cascade Sierra Room)
9:00 Keynote: Jim Gray, Microsoft Research, USA (1998 Turing Award winner).
The World Wide Telescope: Mining the Sky
Cascade Sierra Room
10:00 Catered Break (Bayshore Foyer)
10:30 Cascade Room
Research Track 1A:
Sierra Room
Research Track 2A:
Siskiyou Room
Research Track 3A
1 S115
G Richards, V J Rayward-Smith
Discovery of Association Rules in Tabular Data
Jose Balcazar, Yang Dai, Osamu Watanabe
Provably Fast Training Algorithms for Support Vector Machines
Steven Noel, Vijay Raghavan, C.-H. Henry Chu
Visualizing Association Mining Results through Hierarchical Clusters
2 S451
Masakazu Seno, George Karypis
LPMiner: An Algorithm for Finding Frequent Itemsets Using Length-Decreasing Support Constraint

David Maxwell Chickering, Christopher Meek, Robert Rounthwaite
Efficient Determination of Dynamic Split Points in a Decision Tree

Beitao Li, Wei-Cheng Lai, Edward Chang, Kwang-Ting Cheng
Mining Image Features for Efficient Query Processing
3 S426
Floris Geerts, Bart Goethals, Jan Van den Bussche
A Tight Upper Bound on the Number of Candidate Patterns
Tom Fawcett
Using Rule Sets to Maximize ROC Performance
Jong-Sheng Cherng, Mei-Jung Lo
A Hypergraph Based Clustering Algorithm for Spatial Data Sets
12:00 Lunch (ICDM Steering Committee Meeting with ICDM '02 Organizers)
1:30 Cascade Room
Research Track 1B:
Sierra Room
Research Track 2B:
Siskiyou Room
Research Track 3B
4 S458
Chang-Shing Perng, Haixun Wang, Sheng Ma, Joseph L. Hellerstein
FARM: A Framework for Exploring Mining Spaces with Multiple Attributes
Maria Halkidi, Michalis Vazirgiannis
Clustering Validity Assessment: Finding the Optimal Partitioning of a Data Set
Ming-Chuan Hung, Don-Lin Yang
An Efficient Fuzzy C-Means Clustering Algorithm

Karam Gouda, Mohammed J. Zaki
Efficiently Mining Maximal Frequent Itemsets

Manoranjan Dash, Kian Lee Tan, Huan Liu
Efficient Yet Accurate Clustering
Richard J Bolton, David J Hand
Significance Tests for Patterns in Continuous Data
6 S551
 J. Li, H. Shen, and R. Topor
Mining the Smallest Association Rule Set for Predictions
Valerie Guralnik, George Karypis
A Scalable Algorithm for Clustering Sequential Data
Wei Wang, Jiong Yang, Philip Yu
Meta-Patterns: Revealing Hidden Periodic Patterns
7 S179
Bing Liu, Yiming Ma, and R. Lee
Analyzing the Interestingness of Association Rules from the Temporal Dimension

Konstantinos Kalpakis, Dhiral Gada, Vasundhara Puttagunta
Distance Measures for Effective Clustering of ARIMA Time-Series

Wai-Ho Au, Keith C.C. Chan
Classification with Degree of Membership: A Fuzzy Approach

Chang-Hung Lee, Cheng-Ru Lin, Ming-Syan Chen
On Mining General Temporal Association Rules in a Publication Database

Hichem Frigui, Mohamed Ben Hadj Rhouma
A Synchronization Based Algorithm for Discovering Ellipsoidal Clusters in Large Datasets

Tzung-Pei Hong, Yeong-Chyi Lee
Mining Coverage-Based Fuzzy Rules by Evolutional Computation
4:00 Catered Break (Bayshore Foyer)
4:30   Cascade Room
Research Track 1C

Poster Previews
(See below)
Sierra Room
Research Track 2C

Poster Previews
(See below)
Siskiyou Room
Research Track 3C

Poster Previews
(See below)
5:45 Break
6:30 - 9:00 Posters and Software Demos (Donner Room)
December 1
8:30 Keynote: Jerome H. Friedman, Stanford University, USA.
Predictive Data Mining with Multiple Additive Regression Trees
Cascade Sierra Room
9:30 Catered Break (Bayshore Foyer)
10:00 Cascade Room
Research Track 1D:
Sierra Room
Research Track 2D:
Siskiyou Room
Research Track 3D:
9 S442
Marzena Kryszkiewicz
Concise Representation of Frequent Patterns Based on Disjunction-Free Generators
Haixun Wang, Philip S. Yu
SSDT: A Scalable Subspace-Splitting Classifier for Biased Data
Hisao Ishibuchi, Takashi Yamamoto, Tomoharu Nakashima
Fuzzy Data Mining: Effect of Fuzzy Discretization
10 S408
Christopher Jermaine
The Computational Complexity of High-Dimensional Correlation Search
Shoji Hirano, Shusaku Tsumoto
Indiscernibility Degrees of Objects for Evaluating Simplicity of Knowledge in the Clustering Procedure
Paul Munteanu, Mohamed Bendou
The EQ Framework for Learning Equivalence Classes of Bayesian Networks

Wenmin Li, Jiawei Han, Jian Pei
CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules

SungYoung Jung, Taek-Soo Kim
An Agglomerative Hierarchical Clustering Using Partial Maximum Array and Incremental Similarity Computation Method

Lior Rokach, Oded Maimon
Theory and Applications of Attribute Decomposition

Sheng Ma, Joseph L. Hellerstein
Mining Mutually Dependent Patterns
Tapio Elomaa, Juho Rousu
Preprocessing Opportunities in Optimal Numerical Range Partitioning
Rong Chen, Krishnamoorthy Sivakumar, Hillol Kargupta
Distributed Web Mining Using Bayesian Networks from Multiple Data Streams
12:00  Lunch    
1:30 Keynote: Pat Langley, Institute for the Study of Learning and Expertise, USA.
Knowledge and Data in Computational Scientific Discovery
Cascade Sierra Room
2:30 Catered Break (Bayshore Foyer)
3:00 Cascade Room
Research Track 1E:
Sierra Room
Research Track 2E:
Siskiyou Room
Research Track 3E:
13 S310
Sigal Sahar
Interestingness PreProcessing
Mahesh V. Joshi, Vipin Kumar, Ramesh Agarwal
Evaluating Boosting Algorithms to Classify Rare Classes: Comparison and Improvements
Joaquim Silva, Agra Coelho, Gabriel Lopes
Document Clustering and Cluster Topic Extraction in Multilingual Corpora
14 S160
Xiaohua Tony Hu
Using Rough Sets Theory and Database Operations to Construct a Good Ensemble of Classifiers for Data Mining Applications
Wei Fan, Matthew Miller, Salvatore J. Stolfo, Wenke Lee, and P. Chan
Using Artificial Anomalies to Detect Unknown and Known Network Intrusions
Henry E. Kyburg, Jr.
Statistical Considerations in Learning from Data
15 S283
Ying Sai, Yiyu Yao, Ning Zhong
Data Analysis and Mining in Ordered Information Tables
Ning Zhong, Y.Y. Yao, Muneaki Ohshima, Setsuo Ohsuga
Interestingness, Peculiarity, and Multi-database Mining
Johan Himberg, Kalle Korpiaho, Heikki Mannila, Johanna Tikanmäki, and Toivonen
Time Series Segmentation for Context Recognition in Mobile Devices

Thomas Knight, Jon Timmis
AINE: An Immunological Approach to Data Mining

Suhail Ansari, Ron Kohavi, Llew Mason, Zijian Zheng
Integrating E-Commerce and Data Mining: Architecture and Challenges
Charu Aggarwal, Philip Yu
On Effective Conceptual Indexing and Similarity Search in Text Data
5:00 Break    
6:30 – 8:30 Banquet (Cascade Sierra Room)
December 2
8:30 Keynote: Benjamin W. Wah, University of Illinois, Urbana-Champaign, USA (President, IEEE Computer Society).
Intelligent Mining for Time Series Predictions
Cascade Sierra Room
9:30 Catered Break (Bayshore Foyer)
10:00 Cascade Room
Research Track 1G:
Sierra Room
Research Track 2G:
Siskiyou Room
Research Track 3G:
17 S515
C. Ordonez, E. Omiecinski, L. de Braal, C. Santana, N. Ezquerra, J. Taboada, D. Cooke, E. Krawczynska, and E. Garcia
Mining Constrained Association Rules to Predict Heart Disease
Gary R. Livingston, John M. Rosenberg, Bruce G. Buchanan
Closing the Loop: An Agenda- and Justification-Based Framework for Selecting the Next Discovery Task to Perform
Henner Graubitz, Myra Spiliopoulou, Karsten Winkler
The DIAsDEM Framework for Converting Domain-Specific Texts into XML Documents with Data Mining Techniques
18 S422
Jian Pei, Jiawei Han, Hongjun Lu, Shojiro Nishio, S. Tang, and D. Yang
H-Mine: Hyper-Structure Mining of Frequent Patterns in Large Databases
Gary Livingston, John M. Rosenberg, Bruce G. Buchanan
Closing the Loop: Heuristics for Autonomous Discovery
Wen-Hsiang Lu, Lee-Feng Chien, Hsi-Jian Lee
Anchor Text Mining for Translation of Web Queries

Kimmo Raivio, Olli Simula, Jaana Laiho
Neural Analysis of Mobile Radio Access Network

Ching-Yao Wang, Tzung-Pei Hong, Shian-Shyong Tseng
Maintenance of Sequential Patterns for Record Deletion

Catherine Blake, Wanda Pratt
Better Rules, Few Features: A Semantic Approach to Selecting Features from Text
20 S390
Hillol Kargupta, Byung-Hoon Park
Mining Decision Trees from Data Streams in a Mobile Environment
Pierre-Yves Rolland
FlExPat: Flexible Extraction of Sequential Patterns
Zarrin Langari, Frank Wm. Tompa
Subject Classification in the Oxford English Dictionary
12:00  Lunch
1:30 Cascade Room
Research Track 1H:
Sierra Room
Research Track 2H:
Siskiyou Room
Research Track 3H:

Aixin Sun, Ee-Peng Lim
Hierarchical Text Classification and Evaluation

Eamonn Keogh, Selina Chu, David Hart, Michael Pazzani
An Online Algorithm for Segmenting Time Series

Xiaofeng He, Chris Ding, Hongyuan Zha, Horst Simon
Automatic Topic Identification Using Webpage Clustering

22 S410
Joao Gama
Functional Trees for Classification
Michael Anderson
Knowledge Discovery from Diagrammatically Represented Data
Krishna Bharat, Bay-Wei Chang, Monika Henzinger, Matthias Ruhl
Who Links to Whom: Mining Linkage between Web Sites
23 S561
Aijun An, Yuanyuan Wang
Comparisons of Classification Methods for Screening Potential Compounds
Chris Ding, Xiaofeng He, Hongyuan Zha, Ming Gu, and H. Simon
A Min-Max Cut Algorithm for Graph Partitioning and Data Clustering

Jung-Won Lee, Kiho Lee, Won Kim
Preparations for Semantics-Based XML Mining

24 S316
Virginia Wheway
Using Boosting to Simplify Classification Models
Michihiro Kuramochi, George Karypis
Frequent Subgraph Discovery


3:30 Catered Break (Bayshore Foyer)
4:00 Panel:  Data Mining: How Research Meets Practical Development?
Cascade Sierra Room
Chair: Rao Kotagiri, University of Melbourne, Australia.

Panel Members:

  • Pat Langley, Institute for the Study of Learning and Expertise, USA.
  • Gregory Piatetsky-Shapiro, KDnuggets, USA.
  • Philip Yu, IBM T.J. Watson Research Center, USA.
  • Benjamin W. Wah, University of Illinois, Urbana-Champaign, USA.
5:00  Adjourn

Cascade Room
Research Track 1C
S474 Osmar R. Zaiane, Mohammad El-Hajj, Paul Lu Fast Parallel Association Rule Mining without Candidacy Generation
S467 Wolfgang Gaul, Lars Schmidt-Thieme Mining Generalized Association Rules for Sequential and Path Data
S567 Honghua Dai Inexact Field Learning: An Approach to Induce High Quality Rules from Low Quality Data
S445 Viviane Crestana Jensen, Nandit Soparkar Heuristic Optimization for Decentralized Frequent Itemset Counting
S371 Viet Phan-Luong The Representative Basis for Association Rules
S162 Fan-Chen Tseng, Ching-Chi Hsu, H. Chen Mining Frequent Closed Itemsets with the Frequent Pattern List
S308 Hasan M. Jamil Ad Hoc Association Rule Mining as SQL3 Queries
S257 Show-Jane Yen, Yue-Shi Lee An Efficient Data Mining Technique for Discovering Interesting Sequential Patterns
S485 Carlotta Domeniconi, Dimitrios Gunopulos Incremental Support Vector Machine Construction
S464 Sherri K. Harms, Jitender Deogun, Jamil Saquer, Tsegaye Tadesse Discovering Representative Episodal Association Rules from Event Sequences Using Frequent Closed Episode Sets and Event Constraints

Sierra Room
Research Track 2C
S361 Jie Chen, Haiying Li, Shiwei Tang Association Rules Enhanced Classification of Underwater Acoustic Signal
S401 Samuel Steingold, Richard Wherry, Gregory Piatetsky-Shapiro Measuring Real-Time Predictive Models
S278 Pascal Soucy, Guy W. Mineau A Simple KNN Algorithm for Text Categorization
S513 Carlos Ordonez, Edward Omiecinski, N. Ezquerra A Fast Algorithm to Cluster High Dimensional Basket Data
S130 D. Zhang, Q. Ha, M. Lu Mining California Vital Statistics Data
S336 Fernando Alonso, Juan P. Caraça-Valente, Loïc Martínez, Cesar Montes Discovering Similar Patterns for Characterizing Time Series in a Medical Domain
S332 Tadashi Nomoto, Yuji Matsumoto An Experimental Comparison of Supervised and Unsupervised Approaches to Text Summarization
S338 Daniel Gillblad, Anders Holst Dependency Derivation in Industrial Process Data
S240 Xiong Wang α-Surface and Its Application to Mining Protein Data
S260 Petri Myllymaki, Tomi Silander, Henry Tirri, Pekka Uronen Bayesian Data Mining on the Web with B-Course
S347 Bernard Zenko, Ljupco Todorovski, Saso Dzeroski A Comparison of Stacking with Meta Decision Trees to Bagging, Boosting, and Stacking with other Methods

Siskiyou Room
Research Track 3C
S505 X. Liang and Y. Liang Applications of Data Mining in Hydrology
S202 Tobias Scheffer, Christian Decomain, Stefan Wrobel Mining the Web with Active Hidden Markov Models
S181 Gang Li, Fu Tong, Honghua Dai Evolutionary Structure Learning Algorithm for Bayesian Network and Penalized Mutual Information Metric
S206 June-Suh Cho, Nabil R. Adam Efficient Splitting Rules Based on the Probabilities of Pre-assigned Intervals
S258 Andreas Hotho, Alexander Maedche, Steffen Staab Text Clustering Based on Good Aggregations
S224 J. Paetz Metric Rule Generation with Septic Shock Patient Data
S403 Stefan Ruping Incremental Learning with Support Vector Machines
S191 Zhiyong Liu, Lei Xu RPCL-Based Local PCA Algorithm
S449 Rayid Ghani Combining Labeled and Unlabeled Data for Text Classification with a Large Number of Categories
S291 Nitesh Chawla, Steven Eschrich, Lawrence O. Hall Creating Ensembles of Classifiers

Last modified: November 21, 2001.