ICDM 2003 Accepted Papers


A total of 501 papers were submitted to ICDM this year, of which 58 regular papers, 61 short papers, and 9 industry-track papers were selected for presentation.

Research-Track Regular Papers

  1. R229 Bin Zhang, "Regression Clustering"
  2. R240 Dmitry Pavlov, "Sequence Modeling with Mixtures of Conditional Maximum Entropy Distributions"
  3. R253 Fengzhan Tian and Yuchang Lu, "Learning Bayesian Networks from Incomplete Data Based on EMI Method"
  4. R256 Levon Lloyd and Steven Skiena, "Parsing without a Grammar: Making Sense of Unknown File Formats"
  5. R267 Jian Pei, Xiaoling Zhang, Moonjung Cho, Haixun Wang, and Philip S. Yu, "MaPle: A Fast Algorithm for Maximal Pattern-based Clustering"
  6. R281 Jessica Lin, Eamonn Keogh, and Wagner Truppel, "Clustering of Streaming Time Series is Meaningless: Implications for Previous and Future Research"
  7. R297 Saharon Rosset and Einat Neumann, "Integrating Customer Value Considerations into Predictive Modeling"
  8. R314 Huidong Jin, Man-Leung Wong, and Kwong-Sak Leung, "Scalable Model-based Clustering by Working on Data Summaries"
  9. R338 Zehang Sun, George Bebis, and Ronald Miller, "Evolutionary Gabor Filter Optimization with Application to Vehicle Detection"
  10. R339 Raymond Chan, Qiang Yang, and Yi-Dong Shen, "Mining High Utility Itemsets"
  11. R341 Wei Fan, Haixun Wang, Philip Yu, and Sheng Ma, "Is random model better? Its accuracy and efficiency"
  12. R342 Raymond Chi-Wing Wong, Ada Wai-Chee Fu, and Ke Wang, "MPIS: Maximal-Profit Item Selection with Cross-Selling Considerations"
  13. R347 Shou-de Lin and Hans Chalupsky, "Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis"
  14. R356 Yongqiao Xiao, Jenq-Foung Yao, Zhigang Li, and Margaret Dunham, "Efficient Data Mining for Maximal Frequent Subtrees"
  15. R370 Jiuyong Li and Yanchun Zhang, "Generate Interesting Rules Directly"
  16. R373 Xingzhi Sun, Maria E. Orlowska, and Xue Li, "Introducing Uncertainty into Pattern Discovery in Temporal Event Sequences"
  17. R380 Mukund Deshpande, Michihiro Kuramochi, and George Karypis, "Frequent Sub-Structure-Based Approaches for Classifying Chemical Compounds"
  18. R387 Robert Munro, Sanjay Chawla, and Pei Sun, "Complex Spatial Relationships"
  19. R390 Noriaki Kawamae, "Semantic Log Analysis Based on A User's Query Behavior Model"
  20. R401 Ran Wolff and Assaf Schuster, "Association Rule Mining in Peer-to-Peer Systems"
  21. R405 Ran Wolff, Assaf Schuster, and Dan Trock, "A High-Performance Distributed Algorithm for Mining Association Rules"
  22. R419 Jinze Liu and Wei Wang, "OP-Cluster: Clustering by Tendency in High Dimensional Space"
  23. R432 Jörg Walter, Jörg Ontrup, and Helge Ritter, "Interactive Visualization and Navigation in Large Data Collections using the Hyperbolic Space"
  24. R433 Fabien De Marchi and Jean-Marc Petit, "Zigzag: a new algorithm for mining large inclusion dependencies in databases"
  25. R435 Jeremy Kubica and Andrew Moore, "Probabilistic Noise Identification and Data Cleaning"
  26. R442 Jeremy Kolter and Marcus Maloof, "Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift"
  27. R449 Qiang Yang and Hong Cheng, "Mining Plans for Customer-Class Transformation"
  28. R459 Akihiro Inokuchi and Hisashi Kashima, "Mining Significant Pairs of Patterns from Graph Structures with Class Labels"
  29. R462 Hua-Jun Zeng, Xuan-Hui Wang, Zheng Chen, and Wei-Ying Ma, "CBC: Clustering Based Text Classification Requiring Minimal Labeled Data"
  30. R472 Chihli Hung and Stefan Wermter, "A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering"
  31. R484 Jieping Ye, Ravi Janardan, Cheong Hee Park, and Haesun Park, "A new optimization criterion for generalized discriminant analysis on undersampled problems"
  32. R493 Petre Tzvetkov, Xifeng Yan, and Jiawei Han, "TSP: Mining Top-K Closed Sequential Patterns"
  33. R502 Sau Dan Lee and Luc De Raedt, "An Algebra for Inductive Query Evaluation"
  34. R522 Alexander Topchy, Anil Jain, and William Punch, "Combining Multiple Weak Clusterings"
  35. R527 Amihood Amir, Reuven Kashi, and Nathan Netanyahu, "Efficient Multidimensional Quantitative Hypotheses Generation"
  36. R528 Francesco Bonchi, Fosca Giannotti, Alessio Mazzanti, and Dino Pedreschi, "ExAMiner: Optimized Level-wise Frequent Pattern Mining with Monotone Constraints"
  37. R535 Taneli Mielikäinen, "Change Profiles"
  38. R542 Kang Peng, Slobodan Vucetic, Bo Han, Hongbo Xie, and Zoran Obradovic, "Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining"
  39. R553 Shi Zhong and Joydeep Ghosh, "Model-based Clustering with Soft Balancing"
  40. R557 Guizhen Yang, Saikat Mukherjee, and I. V. Ramakrishnan, "On Precision and Recall of Multi-Attribute Data Extraction from Semistructured Sources"
  41. R558 Olfa Nasraoui, Cesar Cardona, Carlos Rojas, and Fabio Gonzalez, "Mining Evolving Clusters in Noisy Data with a Scalable Immune System Learning Model"
  42. R560 Yinghui Yang and Balaji Padmanabhan, "Segmenting Customer Transactions Using a Pattern-Based Clustering Approach"
  43. R565 Robert Gwadera, Mikhail Atallah, and Wojciech Szpankowski, "Reliable Detection of Episodes in Event Sequences"
  44. R575 Cheong Hee Park and Haesun Park, "Efficient Nonlinear Dimension Reduction for Clustered Data Using Kernel Functions"
  45. R577 Srujana Merugu and Joydeep Ghosh, "Privacy-preserving Distributed Clustering using Generative Models"
  46. R587 Qi Li, Jieping Ye, and Chandra Kambhamettu, "Spatial Interest Pixels (SIPs): Useful Low-Level Features of Visual Media Data"
  47. R588 Lewis Frey, Douglas Fisher, Ioannis Tsamardinos, Constantin Aliferis, and Alexander Statnikov, "Identifying Markov Blankets with Decision Tree Induction"
  48. R598 Jeonghee Yi, Tetsuya Nasukawa, Razvan Bunescu, and Wayne Niblack, "Sentiment Analyzer: Extracting Sentiments About A Given Topic Using Natural Language Processing Techniques"
  49. R618 J Elble, C Heeren, and L Pitt, "Optimized Disjunctive Association Rules via Sampling"
  50. R619 Bianca Zadrozny, John Langford, and Naoki Abe, "Cost-sensitive learning by cost-proportionate example weighting"
  51. R620 Hillol Kargupta, Souptik Datta, Qi Wang, and Krishnamoorthy Sivakumar, "On the Privacy Preserving Properties of Random Data Perturbation Techniques"
  52. R622 Alexandrin Popescul, Lyle Ungar, Steve Lawrence, and David Pennock, "Statistical Relational Learning for Document Mining"
  53. R631 Eren Manavoglu, Dmitry Pavlov, and C. Lee Giles, "Probabilistic User Behavior Models"
  54. R637 Shusaku Tsumoto, "Visualization of Rules Similarity using Multidimensional Scaling"
  55. R654 Hui Xiong, Pang-Ning Tan, and Vipin Kumar, "Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution"
  56. R670 Bing Liu, Xiaoli Li, Wee Sun Lee, and Philip Yu, "Building Text Classifiers Using Positive and Unlabeled Data"
  57. R676 Einoshin Suzuki, Takeshi Watanabe, Hideto Yokoi, and Katsuhiko Takabayashi, "Detecting Interesting Exceptions from Medical Test Data with Visual Summarization"
  58. R688 Aleksandar Lazarevic, Ramdev Kanapady, Chandrika Kamath, Vipin Kumar, and Kumar Tamma, "Localized Prediction of Continuous Target Variables Using Hierarchical Clustering"

Research-Track Short Papers

  1. R211 Frederic Maire, "Balancing Board Machines"
  2. R244 Jiwen Guan and David Bell, "Rough set theory finding maximal association rules in mining for keyword co-occurrences"
  3. R259 Man Lung Yiu and Nikos Mamoulis, "Frequent-Pattern based Iterative Projected Clustering"
  4. R268 Joarder Kamruzzaman and Ruhul Sarker, "SVM based models for predicting foreign currency exchange rates"
  5. R269 Reda Alhajj and Mehmet Kaya, "Integrating Fuzziness into OLAP for Multidimensional Fuzzy Association Rules Mining"
  6. R288 Jin Huang and Charles Ling, "Accuracy vs AUC: Comparing Naive Bayes, Decision Trees and SVM"
  7. R290 Mehmet Kaya and Reda Alhajj, "Facilitating Fuzzy Association Rules Mining by Using Multi-Objective Genetic Algorithms for Automated Clustering"
  8. R298 Chun-Nan Hsu, Hao-Hsiang Chung, and Han-Shen Huang, "The Hybrid Poisson Aspect Model for Personalized Shopping Recommendation"
  9. R306 Yun Chi, Yirong Yang, and Richard R. Muntz, "Indexing and Mining Free Trees"
  10. R311 Jinyan Li and Huiqing Liu, "Ensembles of Cascading Trees"
  11. R320 Kai Ming Ting and Regina Jing Ying Quek, "Model Stability: A key factor in determining whether an algorithm produces an optimal model from a matching distribution"
  12. R348 Longin Jan Latecki, Rajagopal Venugopal, Marc Sobel, and Steve Horvath, "Tree-structured Partitioning Based on Splitting Histograms of Distances"
  13. R350 Raz Tamir and Reinhard Rapp, "Mining the Web to Discover the Meanings of an Ambiguous Word"
  14. R352 Lawrence Hall and Kevin Bowyer, "Comparing Pure Parallel Ensemble Creation Techniques against Bagging"
  15. R355 Hwanjo Yu, "General MC: Estimating Boundary of Positive Class from Small Positive Data"
  16. R358 Ping Chen, Chenyi Hu, Wei Ding, Heloise Lynn, and Simon Yves, "Icon-based Visualization of Large High-Dimensional Datasets"
  17. R360 Tao Li, "Using Discriminant Analysis for Multi-class Classification"
  18. R368 Stanley Oliveira and Osmar Zaiane, "Protecting Sensitive Knowledge By Data Sanitization"
  19. R381 Juan Velasquez, Hiroshi Yasuda, and Terumasa Aoki, "Combining the web content and usage mining to understand the visitor behavior in a web site"
  20. R382 Jyh-Jong Tsay, "Enhancing Techniques for Efficient Topic Hierarchy Integration"
  21. R399 Frans Coenen, Paul Leng, and Ahmed Shakil, "T-trees, Vertical Partitioning and Distributed Association Rule Mining"
  22. R403 Lemuel Waitman, Douglas Fisher, and Paul King, "Bootstrapping Rule Induction"
  23. R406 Rajaraman Kanagasabai and Ah-Hwee Tan, "Mining Semantic Networks for Knowledge Discovery"
  24. R437 Andreas Hotho, Steffen Staab, and Gerd Stumme, "Ontologies Improve Text Document Clustering"
  25. R438 Sriharsha Veeramachaneni and Paolo Avesani, "Active Sampling for Feature Selection"
  26. R443 Arkadiusz Wojna, "Center-Based Indexing for Nearest Neighbors Search"
  27. R451 Ed Heierman and Diane Cook, "Improving Home Automation by Discovering Regularly Occurring Device Usage Patterns"
  28. R452 Mark Krogel and Tobias Scheffer, "Effectiveness of Information Extraction, Multi-Relational, and Semi-Supervised Learning for Mining Microarray Data"
  29. R457 Hongwei Zhu and Otman Basir, "A K-NN Associated Fuzzy Evidential Reasoning Classifier With Adaptive Neighbor Selection"
  30. R465 Horia Nicolai Teodorescu and Lucian Iulian Fira, "A Hybrid Data-Mining Approach in Genomics and Text Structures"
  31. R469 Qiang Yang, Jie Yin, Charles Ling, and Tielin Chen, "Postprocessing Decision Trees to Extract Actionable Knowledge"
  32. R486 Pasi Fränti, Olli Virmajoki, and Ville Hautamäki, "Fast PNN-Based Clustering Using K-Nearest Neighbor Graph"
  33. R494 Julien Blanchard, Fabrice Guillet, and Henri Briand, "A user-driven and quality-oriented visualization for mining association rules"
  34. R496 Jeremy Kubica, Andrew Moore, and Jeff Schneider, "Tractable Group Detection on Large Data Sets"
  35. R503 Daniel Keim, Stephen North, Christian Panse, and Mike Sips, "PixelMaps: A New Visual Data Mining Approach for Analyzing Large Spatial Data Sets"
  36. R512 Matthew V. Mahoney and Philip K. Chan, "Learning Rules for Anomaly Detection of Hostile Network Traffic"
  37. R516 Francois Fouss, Jean-Michel Renders, and Marco Saerens, "Links between Kleinberg's hubs and authorities, correspondence analysis, and Markov Chains"
  38. R517 Hanchuan Peng and Chris Ding, "Structure Search and Stability Enhancement of Bayesian Networks"
  39. R525 Tassos Argyros and Charis Ermopoulos, "Efficient Subsequence Matching in Time Series Databases Under Time and Amplitude Transformations"
  40. R537 Amihood Amir, Reuven Kashi, Daniel Keim, Nathan Netanyahu, and Markus Wawryniuk, "Analyzing High-Dimensional Data by Subspace Validity"
  41. R541 Daniel Barbara, Carlotta Domeniconi, and Ning Kang, "Mining Relevant Text from Unlabelled Documents"
  42. R547 Keke Chen and Ling Liu, "Validating and Refining Clusters via Visual Rendering"
  43. R551 Zhaohui Zheng, Rohini Srihari, and Sargur Srihari, "A Feature Selection Framework for Text Filtering"
  44. R555 Chang-Tien Lu, Dechang Chen, and Yufeng Kou, "Algorithms for Spatial Outlier Detection"
  45. R556 Ching-Huang Yun, Kun-Ta Chuang, and Ming-Syan Chen, "Clustering Item Data Sets with Association-Taxonomy Similarity"
  46. R568 Michele Sebag and Jerome Aze, "Evolutionary Optimization of the ROC Curve: Application to Medical Data Mining"
  47. R572 Young-Koo Lee, Won-Young Kim, Y. Dora Cai, and Jiawei Han, "CoMine: Efficient Mining of Correlated Patterns"
  48. R586 Ricardo Vilalta, Murali-Krishna Achari, and Christoph Eick, "Class Decomposition Via Clustering: A New Framework For Low-Variance Classifiers"
  49. R604 Doina Caragea, Dianne Cook, and Vasant Honavar, "Towards Simple, Easy-to-Understand, yet Accurate Classifiers"
  50. R606 Huseyin Polat and Wenliang Du, "Privacy-Preserving Collaborative Filtering using Randomized Perturbation Techniques"
  51. R610 Tomoyuki Shibata, Takekazu Kato, and Toshikazu Wada, "K-D Decision Tree: An Accelerated and Memory Efficient Nearest Neighbor Classifier"
  52. R617 Jennifer Neville, David Jensen, and Brian Gallagher, "Simple Estimators for Relational Bayesian Classifiers"
  53. R621 Peng Zhang, Jing Peng, and Carlotta Domeniconi, "Dimensionality Reduction Using Kernel Pooled Local Discriminant Information"
  54. R629 Jun Huan, Wei Wang, and Jan Prins, "Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism"
  55. R634 Shusaku Tsumoto, "Pattern Discovery based on Rule Induction and Taxonomy Generation"
  56. R641 Aijun An, Shakil Khan, and Xiangji Huang, "Objective and Subjective Algorithms for Grouping Association Rules"
  57. R645 Matthew Otey, Adriano Veloso, Chao Wang, Srinivasan Parthasarathy, and Wagner Meira Jr., "Incremental Techniques for Mining Dynamic and Distributed Databases"
  58. R657 Inderjit Dhillon and Yuqiang Guan, "Information Theoretic Clustering of Sparse Co-Occurrence Data"
  59. R673 Yuefeng Li and Ning Zhong, "Interpretations of Association Rules by Granular Computing"
  60. R677 James Bailey, Thomas Manoukian, and Kotagiri Ramamohanarao, "A Fast Algorithm for Computing Hypergraph Transversals and its Application in Mining Emerging Patterns"
  61. R684 Sameer Pradhan, Kadri Hacioglu, Wayne Ward, James Martin, and Dan Jurafsky, "Semantic Role Parsing: Adding Semantic Structure to Unstructured Text"

Industry-Track Papers

  1. I203 Mingkun Li, Shuo Feng, Ishwar Sethi, Jason Luciow, and Keith Wagner, "Mining Production Data with Neural Network & CART"
  2. I208 Kaidi Zhao, Bing Liu, Tom Tirpak, and Andreas Schaller, "Detecting Patterns of Change Using Enhanced Parallel Coordinate Visualization"
  3. I213 Steve Selvaggio, Zach Zakharian, Jutta Kreyss, and Michael White, "Text Mining for a Clear Picture of Defect Reports: A Praxis Report"
  4. R220 Rajat Gupta, B.V.L. Narayana, P. Krishna Reddy, G.V. Ranga Rao, C.L.L. Gowda, Y.U.R. Reddy, and G. Rama Murthy, "Understanding Helicoverpa armigera Pest Population Dynamics related to Chickpea Crop Using Neural Networks"
  5. R285 Frank Dellmann, Holger Wulff, and Stefan Schmitz, "Statistical Analysis of Web Log Files of a German Automobile Producer: Findings from a Practical Project Concerning Web Usage Mining"
  6. R303 Qinghua Guo, Maggi Kelly, and Catherine Graham, "One-class support vector machines for predicting distribution of Sudden Oak Death in California"
  7. R402 Phuong Minh Tu, Doheon Lee, and Kwang-Hyung Lee, "Regulatory element discovery using tree-structured models"
  8. R531 Byung-Hoon Park, George Ostrouchov, Gong-Xin Yu, Al Geist, Andrey Gorin, and Nagiza Samatova, "Inference of Protein-Protein Interactions by Unlikely Profile Pair"
  9. R574 Choh Man Teng, "Applying Noise Handling Techniques to Genomic Data: A Case Study"
