LeFE Build:8 Genomics and Bioinformatics Group

Welcome to The Home of LeFEMiner: The Learner of Functional Enrichment

Overview

LeFEMiner, based on the LeFE algorithm, is a novel tool for the interpretation of gene microarray data. LeFEMiner applies a random forest machine learning algorithm in conjunction with a permutation-based statistical method to determine which sets of functionally related genes (i.e., gene categories or ensembles) are most closely associated with the observed biology or state of the samples. LeFEMiner differs from previously published methods because it embraces the complexity of non-linear and multifactor gene regulation with flexible machine learning models. Applications of LeFEMiner to well-studied datasets produce meaningful results that are either consistent with previously demonstrated biological phenomena or represent plausible novel findings.

What is this web-tool?

This web application has been created to give non-technical end users access to the supercomputed LeFE algorithm. To access the tool, please go to the Submit LeFE Jobs section of the site.

What's the output of LeFEMiner?

LeFE ranks the analyzed gene categories by their likelihood of association with the experiments' high-level phenotype descriptions. The statistical significance of the results is assessed by repeating the same operation several times on a dataset in which the experiments' phenotypes have been permuted. The Example Output tab shows precomputed results of LeFE applied to a dataset comparing p53 mutant versus wild-type cell lines.
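The permutation step described above can be illustrated with a minimal Python sketch. It is not the LeFE implementation: LeFE scores categories with a random forest, whereas this example substitutes a deliberately simple stand-in score (the absolute difference between class means). The function names, the add-one correction, and the toy values are all assumptions made for illustration.

```python
import random
from statistics import mean


def mean_difference_score(values, labels):
    """Stand-in association score: absolute difference between class means.

    This is a placeholder only; LeFE itself derives its score from a
    random forest, which is not reproduced here.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    if len(groups) != 2:
        raise ValueError("this toy score expects exactly two classes")
    first, second = groups.values()
    return abs(mean(first) - mean(second))


def permutation_p_value(values, labels, score=mean_difference_score,
                        n_permutations=1000, seed=0):
    """Estimate how often permuted phenotypes score at least as well as the real ones."""
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    observed = score(values, labels)
    shuffled = list(labels)            # copy, so the caller's labels are untouched
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # permute the phenotype labels
        if score(values, shuffled) >= observed:
            hits += 1
    # Add-one correction (an assumption of this sketch) keeps the estimate above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up values for one hypothetical category summary across eight samples.
toy_values = [1.0, 1.1, 0.9, 1.2, 5.0, 5.1, 4.9, 5.2]
toy_labels = ["N", "N", "N", "N", "T", "T", "T", "T"]
p_value = permutation_p_value(toy_values, toy_labels)
```

A small p-value here means that few label permutations reproduce a score as extreme as the one seen with the true phenotypes, which is the general idea behind assessing significance on permuted data.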

How does it work?

Briefly, LeFE takes a gene expression matrix and a set of annotations, or signatures, for each sample. For example, your microarrays may contain gene expression profiles of tumors and adjacent normal tissue. Your signatures will then fall into two classes of data, N and T, for normal and tumor. LeFE then determines which functionally related gene ensembles are most associated with the higher-level sample annotations. The ensembles are defined as genes from the same Gene Ontology category or from other independent, predefined ensembles.
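To make the three inputs concrete, here is a small Python sketch of an expression matrix, per-sample signatures, and gene ensembles. The data structures, function names, ensemble names, and expression values are hypothetical and chosen only for illustration; they do not describe the file format that the web tool accepts.

```python
# Sample identifiers and their signatures (N = normal, T = tumor).
samples = ["s1", "s2", "s3", "s4"]
signatures = ["N", "N", "T", "T"]

# Toy expression matrix: gene symbol -> one value per sample (made-up numbers).
expression = {
    "TP53":   [5.1, 4.9, 2.0, 2.2],
    "BAX":    [3.0, 3.2, 1.1, 0.9],
    "CDKN1A": [6.0, 5.8, 2.5, 2.7],
    "GAPDH":  [9.0, 9.1, 9.0, 8.9],
}

# Hypothetical ensembles; in practice these might be Gene Ontology categories.
ensembles = {
    "hypothetical_ensemble_A": ["TP53", "BAX", "CDKN1A"],
    "hypothetical_ensemble_B": ["GAPDH", "NOT_ON_ARRAY"],
}


def validate_inputs(samples, signatures, expression):
    """Check that every sample has a signature and every gene has one value per sample."""
    if len(samples) != len(signatures):
        raise ValueError("each sample needs exactly one signature")
    for gene, values in expression.items():
        if len(values) != len(samples):
            raise ValueError(f"gene {gene!r} has {len(values)} values, "
                             f"expected {len(samples)}")


def ensemble_submatrix(expression, genes):
    """Return the rows of the expression matrix for genes measured on the array."""
    return {gene: expression[gene] for gene in genes if gene in expression}


validate_inputs(samples, signatures, expression)
submatrices = {name: ensemble_submatrix(expression, genes)
               for name, genes in ensembles.items()}
```

Each entry of `submatrices` is the slice of the data that an ensemble-level analysis would then relate to the N and T signatures. Genes that are not on the array are silently dropped in this sketch, which is a design choice of the example rather than documented LeFE behavior.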

For a graphical overview of how LeFE works, please see the ‘How It Works’ section of the website. A more technical overview can be found in the LeFE manuscript in Genome Biology (see Citing LeFE below).

Limitations

At its heart, LeFE uses a powerful random forest machine learning algorithm. The flexibility of the random forest requires that LeFE be run with a diverse set of samples. Therefore, we require that LeFE be run on at least 3 microarrays for classification problems and 15 microarrays for regression problems. The web-tool will reject jobs that don’t contain the minimum number of samples.
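The sample-count rule above can be expressed as a short pre-submission check. This Python sketch uses the thresholds stated in this section; the function name, the problem-type strings, and the message text are assumptions, and the web tool's actual validation may work differently.

```python
# Minimum microarray counts as stated in the Limitations section.
MIN_SAMPLES = {"classification": 3, "regression": 15}


def check_sample_count(n_samples, problem_type):
    """Return (accepted, message) for a job with the given number of microarrays."""
    if problem_type not in MIN_SAMPLES:
        raise ValueError(f"unknown problem type: {problem_type!r}")
    minimum = MIN_SAMPLES[problem_type]
    if n_samples < minimum:
        return False, (f"{problem_type} jobs need at least {minimum} "
                       f"microarrays, got {n_samples}")
    return True, "ok"


accepted, message = check_sample_count(10, "regression")
```

Running a check like this locally before submitting a job may save a round trip, since a regression job with 10 microarrays falls below the stated minimum of 15.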

What's new?

  • January 10th, 2007: This whole website.
  • May 15th, 2007: LeFE now supports FDR estimation.
  • August 6th, 2007: The manuscript describing LeFE has been accepted for publication in Genome Biology.

Citing LeFE

LeFE's citation is as follows: Eichler GS, Reimers M, Kane D, Weinstein JN. "The LeFE algorithm: embracing the complexity of gene expression in the interpretation of microarray data." Genome Biology, 2007 Sep 10;8(9):R187.


LeFE™ is a development of the Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology (LMP), Center for Cancer Research (CCR), National Cancer Institute (NCI). Please email us with any problems, questions or feedback on the tool.

Notice and Disclaimer