Efficient algorithm for testing goodness-of-fit for classification of high dimensional data
Gintautas Jakimauskas
Institute of Mathematics and Informatics
Published 2009-12-20
https://doi.org/10.15388/LMR.2009.52

Keywords

Gaussian mixture model
goodness-of-fit

How to Cite

Jakimauskas, G. (2009) “Efficient algorithm for testing goodness-of-fit for classification of high dimensional data”, Lietuvos matematikos rinkinys, 50(proc. LMS), pp. 293–297. doi:10.15388/LMR.2009.52.

Abstract

Consider a sample from a d-dimensional Gaussian mixture model, where d is assumed to be large, and the problem of classifying this sample. Because the dimension is large, it is natural to project the sample onto k-dimensional (k = 1, 2, ...) linear subspaces using the projection pursuit method, which gives the best selection of these subspaces. Given an estimate of the discriminant subspace, classification can be performed on the projected sample, thus avoiding the "curse of dimensionality". An essential step in this method is testing the goodness-of-fit of the estimated d-dimensional model under the assumption that the distribution on the complement space is standard Gaussian. We present a simple, data-driven and computationally efficient procedure for testing goodness-of-fit. The procedure is based on the well-known interpretation of goodness-of-fit testing as a classification problem, a special sequential data partition procedure, randomization and resampling, and elements of sequential testing. Monte Carlo simulations are used to assess the performance of the procedure.
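
The interpretation of goodness-of-fit testing as a classification problem can be illustrated with a minimal sketch. It is not the procedure of the paper: the sequential data partition and sequential testing elements are omitted, and a leave-one-out nearest-neighbour classifier with a label-permutation test is swapped in as a simple stand-in. The function names, the choice of classifier and the parameters n_perm and seed are assumptions made for illustration only. The idea is to pool the observed sample with a sample simulated from the fitted model; if the model fits, a classifier should not separate the two groups better than it separates randomly relabelled groups.

```python
import random


def nearest_neighbours(points):
    """For each point, return the index of its nearest other point
    (squared Euclidean distance)."""
    nn = []
    for i, p in enumerate(points):
        best_j, best_d = -1, float("inf")
        for j, q in enumerate(points):
            if i == j:
                continue
            d = sum((a - b) ** 2 for a, b in zip(p, q))
            if d < best_d:
                best_j, best_d = j, d
        nn.append(best_j)
    return nn


def loo_accuracy(labels, nn):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier:
    the fraction of points whose nearest neighbour has the same label."""
    return sum(labels[i] == labels[j] for i, j in enumerate(nn)) / len(labels)


def classifier_gof_pvalue(observed, simulated, n_perm=200, seed=0):
    """Hypothetical helper: permutation p-value for the hypothesis that
    `observed` and `simulated` (drawn from the fitted model) share one
    distribution. Returns (classification accuracy, p-value)."""
    rng = random.Random(seed)
    pooled = list(observed) + list(simulated)
    labels = [1] * len(observed) + [0] * len(simulated)
    # Neighbours depend only on the points, so they are computed once
    # and reused for every relabelling.
    nn = nearest_neighbours(pooled)
    stat = loo_accuracy(labels, nn)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # randomization: break the label/point link
        if loo_accuracy(labels, nn) >= stat:
            exceed += 1
    return stat, (1 + exceed) / (1 + n_perm)
```

For example, with `simulated` drawn from a standard Gaussian and `observed` drawn from a Gaussian with a clearly shifted mean, the accuracy is well above 1/2 and the p-value is small, so the fit is rejected. Precomputing the neighbour indices keeps each permutation cheap, which is in the spirit of the computational efficiency the abstract emphasizes, although the paper's own efficiency devices are different.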
