Multi-Layer Induction in Large, Noisy Databases

Sponsor(s): U.S. Army Research Office, Grant No. DAAD19-02-1-0178.
Duration: June 1, 2002 - August 31, 2005.

Project Summary: This project aims to design a new data mining algorithm, multi-layer induction, which divides a database into subsets of approximately equal size, runs an existing induction algorithm on the first subset to obtain an initial set of rules, and then processes the remaining subsets one at a time, incorporating the induction results from the previous subset(s). In this way, multi-layer induction accumulates the rules discovered from each data subset at each layer and produces a final integrated output that is expected to represent the original data more accurately. Because partitioning spreads any noisy data in the original database across the small subsets, the effects of noise are expected to be diluted and induction efficiency to increase.
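The layered procedure described in the summary can be illustrated with a minimal Python sketch. It is not the project's actual algorithm: the toy rule learner `induce_rules`, the count-based merging of rules across layers, the round-robin `partition` helper, and the `min_conf` threshold are all assumptions introduced here for illustration.

```python
from collections import Counter, defaultdict


def partition(rows, k):
    """Split rows into k subsets of approximately equal size (round-robin)."""
    return [rows[i::k] for i in range(k)]


def induce_rules(rows):
    """Toy stand-in for an existing induction algorithm (an assumption).

    Each row is a tuple of attribute values followed by a class label.
    Returns class counts for every (attribute index, value) condition.
    """
    counts = defaultdict(Counter)
    for *attrs, label in rows:
        for i, value in enumerate(attrs):
            counts[(i, value)][label] += 1
    return counts


def multi_layer_induction(rows, k, min_conf=0.7):
    """Process the subsets one layer at a time, accumulating rule evidence.

    Merging by summing class counts is a hypothetical choice; the summary
    does not specify how earlier results are incorporated.
    """
    accumulated = defaultdict(Counter)
    for subset in partition(rows, k):
        # Fold the rules found in this layer into everything seen so far.
        for condition, class_counts in induce_rules(subset).items():
            accumulated[condition].update(class_counts)

    # Keep only conditions whose majority class is sufficiently dominant.
    rules = {}
    for condition, class_counts in accumulated.items():
        label, hits = class_counts.most_common(1)[0]
        if hits / sum(class_counts.values()) >= min_conf:
            rules[condition] = label
    return rules
```

Calling `multi_layer_induction(rows, k=3)` on a list of `(attribute, ..., label)` tuples returns a dictionary that maps each retained `(attribute index, value)` condition to its predicted class. Conditions whose class evidence is mixed, for example because of mislabeled rows, fall below `min_conf` and are dropped.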

Research Team:

Graduate Theses:

Publications:

  1. Xingquan Zhu and Xindong Wu, "Class Noise Handling for Effective Cost-Sensitive Learning by Cost-Guided Iterative Classification Filtering", IEEE Trans. on Knowledge and Data Engineering (TKDE), 18(2006), 10: 1435-1440. [abstract]

  2. Xingquan Zhu and Xindong Wu, "Bridging Local and Global Data Cleansing: Identifying Class Noise in Large, Distributed Data Datasets", Data Mining and Knowledge Discovery (DMKD), 12(2006), 2: 275-308. [abstract]

  3. Xingquan Zhu, Ying Yang and Xindong Wu, "Effective Classification of Noisy Data Streams with Attribute-oriented Dynamic Classifier Selection Perspective", Knowledge and Information Systems (KAIS), vol.9, no.3, 2006, 339-363. [abstract]

  4. Xingquan Zhu and Xindong Wu, "Cost-Constrained Data Acquisition for Intelligent Data Preparation", IEEE Trans. on Knowledge and Data Engineering (TKDE), vol.17, no.11, 2005. [abstract]

  5. Xingquan Zhu, Ahmed K. Elmagarmid, Xiangyang Xue, Lide Wu, Ann C. Catlin. "InsightVideo: Towards Hierarchical Video Content Organization for Efficient Browsing, Summarization and Retrieval", IEEE Trans. on Multimedia, vol.7, no.4, 2005. [abstract]

  6. Xingquan Zhu, Xindong Wu, Ahmed K. Elmagarmid, Zhe Feng, and Lide Wu, "Video Data Mining: Semantic Indexing and Event Detection from the Association Perspective", IEEE Trans. on Knowledge and Data Engineering (TKDE), vol.17, no.5, 2005. [abstract]

  7. Xingquan Zhu and Xindong Wu, "Data Acquisition with Active Impact-Sensitive Instance Selection", in Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), Boca Raton, FL, November 15 - 17, 2004. [abstract]

  8. Xingquan Zhu, Xindong Wu and Ying Yang, "Dynamic Classifier Selection for Effective Mining from Noisy Data Streams", in Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), Brighton, UK, November 1 - 4, 2004. [abstract]

  9. Xingquan Zhu and Xindong Wu, "Cost-guided Class Noise Handling for Effective Cost-sensitive Learning", in Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), Brighton, UK, November 1 - 4, 2004. [abstract]

  10. Ying Yang, Xindong Wu and Xingquan Zhu, "Dealing with Predictive-but-Unpredictable Attributes in Noisy Data Sources", in Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2004), Pisa, Italy, 2004. [abstract]

  11. Xingquan Zhu and Xindong Wu, "Class Noise vs Attribute Noise: A Quantitative Study of Their Impacts", Artificial Intelligence Review 22 (3-4):177-210, November 2004. [abstract]

  12. Xingquan Zhu, Xindong Wu and Ying Yang, "Error Detection and Impact-sensitive Instance Ranking in Noisy Datasets", in Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), July 25-29, 2004, San Jose, California. [abstract]

  13. Xingquan Zhu, Xindong Wu, Jianping Fan, Walid G. Aref, Ahmed K. Elmagarmid, "Exploring Video Content Structure for Hierarchical Summarization", Multimedia Systems, 10(2):98-115, 2004. [abstract]

  14. Xingquan Zhu, Xindong Wu and Qijun Chen, "Eliminating Class Noise in Large Datasets", in Proceedings of the 20th ICML International Conference on Machine Learning (ICML 2003), August 21-24, 2003, Washington D.C., 920-927. [abstract] [pdf]

  15. Xingquan Zhu and Xindong Wu, "Mining Video Association for Efficient Database Management", in Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), Acapulco, Mexico, August 12-15, 2003, 1422-1424. [abstract] [pdf]

  16. Shichao Zhang, Xindong Wu, and Chengqi Zhang, "Multi-Database Mining", The IEEE Computational Intelligence Bulletin, Volume 2, 5-13. [abstract] [pdf]

  17. Xingquan Zhu and Xindong Wu, "Sequential Association Mining for Video Summarization", in Proceedings of the 2003 IEEE International Conference on Multimedia & Expo (ICME 2003), Baltimore, MD, USA, July 6-9, 2003, Volume 3, 333-336. [abstract] [pdf]

  18. Xingquan Zhu, Jianping Fan, Ahmed K. Elmagarmid, and Xindong Wu, "Hierarchical Video Summarization and Content Description Joint Semantic and Visual Similarity", Multimedia Systems, 9(2003), 1: 31-53. [abstract] [pdf]


Last modified: August 18, 2006.