Predictive Data Mining with Multiple Additive Regression Trees

by Jerome H. Friedman, Stanford University

Predicting future outcomes from past observational data is a common application in data mining. The primary goal is usually predictive accuracy; secondary goals are speed, ease of use, and interpretability of the resulting predictive model. New automated procedures for predictive data mining, based on "boosting" CART regression trees, are described. The goal is a class of fast "off-the-shelf" procedures for classification and regression that are competitive in accuracy with more customized approaches, while being fairly automatic to use (little tuning) and highly robust, especially when applied to less-than-clean data. Tools are presented for interpreting and visualizing these multiple additive regression tree (MART) models.
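To make the idea of boosting regression trees concrete, the following is a minimal sketch and not the MART procedure itself. It assumes squared-error loss, a single numeric input, and one-split trees ("stumps") in place of full CART trees. The names `fit_stump`, `predict_stump`, `boost`, and `predict`, and the `n_trees` and `shrinkage` parameters, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]        # (threshold, left_value, right_value)
Model = Tuple[float, float, List[Stump]]  # (initial_value, shrinkage, stumps)


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Fit a one-split regression tree to the values r by least squares."""
    best = None
    xs = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = (sum((ri - lm) ** 2 for ri in left)
               + sum((ri - rm) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, (t, lm, rm))
    if best is None:
        # All x values are identical, so no split exists: predict the mean.
        m = sum(r) / len(r)
        return (float("inf"), m, m)
    return best[1]


def predict_stump(s: Stump, xi: float) -> float:
    t, lm, rm = s
    return lm if xi <= t else rm


def boost(x: List[float], y: List[float],
          n_trees: int = 100, shrinkage: float = 0.1) -> Model:
    """Build an additive model by repeatedly fitting stumps to residuals."""
    f0 = sum(y) / len(y)      # start from the constant (mean) prediction
    pred = [f0] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        # Under squared-error loss the residuals are the negative gradient.
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s = fit_stump(x, resid)
        stumps.append(s)
        # Add a shrunken copy of the new stump to the current model.
        pred = [pi + shrinkage * predict_stump(s, xi)
                for pi, xi in zip(pred, x)]
    return (f0, shrinkage, stumps)


def predict(model: Model, xi: float) -> float:
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(predict_stump(s, xi) for s in stumps)


if __name__ == "__main__":
    # Toy data: a step function with a jump between x = 4 and x = 5.
    x = [float(i) for i in range(10)]
    y = [0.0 if xi < 5 else 1.0 for xi in x]
    model = boost(x, y)
    print(predict(model, 2.0), predict(model, 7.0))
```

Each round fits a small tree to what the current model still gets wrong and adds a scaled-down copy of it, so the final predictor is a sum of many simple trees. A smaller `shrinkage` value generally needs more trees to reach the same fit on the training data.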
