The Iterative Bayesian Model Averaging Algorithm: an improved method for gene selection and classification using microarray data

Abstract

Classification is a supervised learning technique that, in the context of microarray analysis, is most frequently used to identify genes whose expression is correlated with specific phenotypes of the samples. Typically, the interest is in identifying genes that are predictive of disease. In such cases, both the accuracy of the prediction and the number of genes necessary to obtain a given accuracy are important. In particular, methods that select a small number of relevant genes and provide accurate classification of microarray samples aid in the development of simpler diagnostic tests. In addition, methods that take a weighted average over multiple models have the potential to provide more accurate predictions than methods that do not take model uncertainty into consideration. Typical gene selection and classification procedures ignore model uncertainty and use a single set of relevant genes (a model) to predict the class labels. Hence, we developed the Bayesian Model Averaging (BMA) method for gene selection and classification of microarray data (Yeung et al., 2005). BMA is a multivariate technique that takes both the interaction of variables (typically genes) and model uncertainty into consideration. In addition, the output of BMA contains a posterior probability for each prediction, which can be useful in assessing the correctness of a given call (diagnosis).
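The weighted-average idea can be illustrated with a minimal sketch. It assumes the standard BMA form, in which the predicted class probability is the sum over models of each model's predicted probability multiplied by that model's posterior probability. The function name `bma_predict`, the 0.5 decision threshold, and all numbers below are hypothetical and chosen only for illustration; they are not taken from the paper or from any published implementation.

```python
def bma_predict(model_probs, posterior_weights):
    """Average per-model class probabilities, weighted by posterior model probability.

    model_probs[k]       -- probability of the positive class under model k
    posterior_weights[k] -- posterior probability of model k given the training data
    """
    if len(model_probs) != len(posterior_weights):
        raise ValueError("need exactly one weight per model")
    total = sum(posterior_weights)
    if total <= 0:
        raise ValueError("posterior weights must sum to a positive value")
    # Divide by the total so the result is a valid probability even if the
    # weights were not normalized to sum to 1.
    weighted = sum(p * w for p, w in zip(model_probs, posterior_weights))
    return weighted / total


# Hypothetical example: three models (gene subsets) scoring one sample.
model_probs = [0.9, 0.6, 0.2]        # assumed per-model predictions
posterior_weights = [0.5, 0.3, 0.2]  # assumed posterior model probabilities

averaged = bma_predict(model_probs, posterior_weights)
predicted_label = int(averaged >= 0.5)  # assumed decision threshold
print(f"averaged probability: {averaged:.2f}, predicted label: {predicted_label}")
```

In this sketch the averaged probability itself plays the role of the posterior probability attached to each call: a value near 0 or 1 suggests the models agree, while a value near 0.5 flags a prediction that deserves less confidence.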
