This method uses published gene lists to focus feature selection on genes that are likely to be predictive. Because the gene lists are selected independently of the dataset, the potential for over-fitting should be reduced. The approach assumes that similar phenotypes are likely to be mechanistically related, so a gene list published for one phenotype may be informative for a related one. The method requires a probe-to-gene mapping and a set of potentially relevant gene lists.

Gene Lists

Gene List Files

Gene list files are tab-delimited text files with one or more columns. Each file contains one line per feature, with the fields in the following order:

 PrimaryID [tab] GenBankID [tab] RefSeqID [tab] ProbesetID

Lines beginning with the character '#' are treated as comments and ignored. The fourth field is the probe set identifier, which matches the probe set identifiers on the chip.
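
The format described above can be read with a few lines of code. The following is a minimal sketch in Python, not the tool's own parser. The function name read_gene_list, the dictionary keys, and the sample identifiers are hypothetical and chosen only for illustration. It also assumes that blank lines are skipped and that missing trailing columns are reported as None, neither of which is stated in the format description.

```python
import io

# Column order taken from the format line above; the key names are hypothetical.
FIELDS = ("primary_id", "genbank_id", "refseq_id", "probeset_id")


def read_gene_list(handle):
    """Parse a tab-delimited gene list into a list of dictionaries."""
    records = []
    for line in handle:
        line = line.rstrip("\r\n")
        # Lines beginning with '#' are ignored, as described above.
        # Skipping blank lines is an assumption of this sketch.
        if not line.strip() or line.startswith("#"):
            continue
        values = line.split("\t")
        # Files may have fewer than four columns, so pad the missing ones.
        values += [None] * (len(FIELDS) - len(values))
        # Empty fields are also reported as None (an assumption).
        records.append(dict(zip(FIELDS, (v or None for v in values))))
    return records


# Made-up sample data; the identifiers are placeholders, not real accessions.
sample = (
    "# example gene list\n"
    "GENE_A\tGB_0001\tRS_0001\tPROBE_0001\n"
    "GENE_B\n"
)
records = read_gene_list(io.StringIO(sample))
```

With a real file, the same function could be called on an open file handle, for example `with open(path) as handle: records = read_gene_list(handle)`, where path is whatever gene list file is being loaded.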

