A simulated annealing-based algorithm for iterative class discovery using fuzzy logic for informative gene selection
Within a gene expression matrix, the samples usually exhibit several macroscopic phenotypes related to diseases or drug effects, such as diseased, normal, or drug-treated samples. The goal of sample-based clustering is to find these phenotype structures or sample subgroups. We present a novel method for automatically discovering clusters of samples that are coherent from a genetic point of view. Each candidate cluster is characterized by a fuzzy pattern, which maintains a fuzzy discretization of the relevant gene expression values. Candidate clusters are constructed at random and then iteratively refined by a probabilistic search and optimization scheme based on simulated annealing. Evaluation of the proposed algorithm on publicly available microarray datasets shows high accuracy in spite of noise and the presence of other clusters. The results support the appropriateness of using fuzzy logic to represent and filter gene expression values within an iterative approach. The proposed method complements our previous GENECBR system, and both are freely available under the GNU General Public License from http://www.genecbr.org/fpclustering.htm and http://www.genecbr.org/, respectively.
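The two ingredients named above, fuzzy discretization of expression values and iterative refinement by simulated annealing, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three labels (LOW, MEDIUM, HIGH), the piecewise-linear membership functions, the breakpoints, the geometric cooling schedule, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def fuzzy_memberships(value, low, mid, high):
    """Membership degrees of an expression value in LOW/MEDIUM/HIGH.

    Assumed shapes (hypothetical, not taken from the paper): shoulder
    functions for LOW and HIGH and a triangle for MEDIUM, with
    breakpoints low < mid < high.
    """
    if value <= low:
        return {"LOW": 1.0, "MEDIUM": 0.0, "HIGH": 0.0}
    if value >= high:
        return {"LOW": 0.0, "MEDIUM": 0.0, "HIGH": 1.0}
    if value <= mid:
        m = (value - low) / (mid - low)
        return {"LOW": 1.0 - m, "MEDIUM": m, "HIGH": 0.0}
    h = (value - mid) / (high - mid)
    return {"LOW": 0.0, "MEDIUM": 1.0 - h, "HIGH": h}


def fuzzy_label(value, low, mid, high):
    """Discretize a value to the label with the largest membership."""
    memberships = fuzzy_memberships(value, low, mid, high)
    return max(memberships, key=memberships.get)


def accept(delta, temperature, rng=random):
    """Metropolis criterion: always accept a move that does not increase
    the cost; accept a worse move with probability exp(-delta / T)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(initial, cost, neighbor, t0=1.0, cooling=0.95, steps=500, seed=0):
    """Generic simulated annealing loop with geometric cooling (assumed).

    `cost` scores a candidate (lower is better) and `neighbor` proposes
    a random modification of the current candidate.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        if accept(candidate_cost - current_cost, temperature, rng):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost
```

In a setting like the one described, a candidate would be a set of sample indices, `neighbor` would add or remove one sample, and `cost` would measure how poorly the samples agree with a shared fuzzy pattern. Those choices are left open here because the abstract does not specify them.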