TY - GEN
T1 - Scalable parallel algorithms for predictive modelling
AU - Christen, P.
AU - Hegland, M.
AU - Nielsen, O.
AU - Roberts, Stephen
AU - Altas, I.
PY - 2000
Y1 - 2000
N2 - Data Mining applications have to deal with increasingly large data sets and complexity. Only algorithms which scale linearly with data size are feasible. We present parallel regression algorithms which after a few initial scans of the data compute predictive models for data mining and do not require further access to the data. In addition, we describe various ways of dealing with the complexity (high dimensionality) of the data. Three methods are presented for three different ranges of attribute numbers. They use ideas from the finite element method and are based on penalised least squares fits using sparse grids and additive models for intermediate and very high dimensional data. Computational experiments confirm scalability both with respect to data size and number of processors.
AB - Data Mining applications have to deal with increasingly large data sets and complexity. Only algorithms which scale linearly with data size are feasible. We present parallel regression algorithms which after a few initial scans of the data compute predictive models for data mining and do not require further access to the data. In addition, we describe various ways of dealing with the complexity (high dimensionality) of the data. Three methods are presented for three different ranges of attribute numbers. They use ideas from the finite element method and are based on penalised least squares fits using sparse grids and additive models for intermediate and very high dimensional data. Computational experiments confirm scalability both with respect to data size and number of processors.
UR - http://www.scopus.com/inward/record.url?scp=0006940688&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0006940688&partnerID=8YFLogxK
M3 - Conference paper
AN - SCOPUS:0006940688
VL - 2
T3 - Management Information Systems
SP - 423
EP - 432
BT - Second International Conference on Data Mining, Data Mining II
T2 - Second International Conference on Data Mining, Data Mining II
Y2 - 5 July 2000 through 7 July 2000
ER -