A Genetic Algorithm Guided Formation of Spaced Dyads Coupled with an EM Algorithm for Motif Discovery
Genome-wide analyses of protein binding sites generate large amounts of data; a ChIP dataset might contain 10,000 sites. Unbiased motif discovery in such datasets is not generally feasible using current methods without restricting the initial motif profiles. We propose an efficient method, GADEM, which combines spaced dyads and an expectation-maximization (EM) algorithm. In testing with six genome-wide ChIP datasets, GADEM proved a capable, efficient tool for de novo motif discovery and singularly adept at identifying long motifs (> 40 bp).
This program was developed by Leping Li (http://www.niehs.nih.gov/research/atniehs/labs/bb/staff/li/index.cfm) at the National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709.
This work is made available under the GPL v2.
Download the source code for the distribution of GADEM, along with usage documentation and examples: gadem_v1.2.tar.gz (http://www.niehs.nih.gov/research/resources/software/gadem/docs/gadem_v1.2.tar.gz, 699 KB)
In the main directory of the distribution, run the configure program, then build and install with make.
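A minimal sketch of the usual sequence, assuming the distribution follows the standard configure/make convention (the INSTALL file in the distribution remains the authoritative reference). The guard and its message are illustrative additions, not part of the package:

```shell
# Assumed standard autotools sequence; run from the distribution's main directory.
if [ -x ./configure ]; then
    ./configure &&   # detect the build environment and generate the Makefiles
    make &&          # compile the executables
    make install     # install to /usr/local/bin by default (may need root)
else
    echo "configure not found: change to the distribution's main directory first" >&2
fi
```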
By default, the configure program will direct the executable files to /usr/local/bin, which in most cases requires the user to "su" to root prior to the "make install" step. The target directory for the executable files can be overridden by specifying the --prefix option during the configure phase. For example, specifying --prefix=/home/GADEM_user will direct the executables into the /home/GADEM_user/bin directory.
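As a sketch of an installation that does not need root, assuming the usual autotools behavior in which executables are placed in the bin subdirectory of the prefix. The PREFIX variable and the guard are hypothetical conveniences added for illustration:

```shell
# Install under a user-owned prefix instead of /usr/local (no root required).
# PREFIX is a hypothetical shell variable; /home/GADEM_user is the example directory.
PREFIX="${PREFIX:-/home/GADEM_user}"   # executables end up in "$PREFIX/bin"
if [ -x ./configure ]; then
    ./configure --prefix="$PREFIX" && make && make install
else
    echo "configure not found: change to the distribution's main directory first" >&2
fi
```

After installation, adding the bin directory under the chosen prefix to PATH lets the executables be run without typing their full path.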
The configure application accepts several arguments to tailor the build and installation process. Please see the INSTALL file contained in the root directory of the distribution for further details.
The source code and package were developed using Windows and tested on Linux (Fedora). Although the intent was to make the code portable to most U*IX variants, you may encounter minor build issues on other platforms. Feedback regarding any difficulties you may experience will be very helpful in improving the distribution package.
Leping Li, NIEHS (http://www.niehs.nih.gov/research/atniehs/labs/bb/staff/li/index.cfm)