HIV sequence database

FindModel

Purpose: FindModel analyzes your alignment to determine which phylogenetic model best describes your data; this model can then be used to generate a better tree. Background information and references are given below.

File size limits: Finding the best evolutionary model is computationally intensive, both in its original implementation as the Modeltest PAUP* script and in our FindModel implementation. To reduce the load on our servers, default runs use a reduced set of models that excludes those without an obvious biological interpretation. (If you know of any system where Modeltest consistently returns a model that we do not include, please let us know.) To run the full set of models, check the checkbox below the input section. Currently, input files smaller than 6 Kb (reduced set) or 3 Kb (full set) are run immediately; larger files are run in batch, and you will receive an email containing a link to your results when the job has finished. Input files larger than 500 Kb (reduced set) or 350 Kb (full set) are currently too large for our machine to process.
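As an illustration only, the size thresholds above can be written as a small lookup. This is a sketch and not the server's actual code: the function name `classify_job`, the labels it returns, the reading of 1 Kb as 1024 bytes, and the handling of a file that sits exactly on a threshold are all assumptions.

```python
KB = 1024  # assumption: the page does not say whether 1 Kb means 1000 or 1024 bytes

# Thresholds quoted in the text, per model set.
LIMITS = {
    "reduced": {"immediate_below": 6 * KB, "max": 500 * KB},
    "full": {"immediate_below": 3 * KB, "max": 350 * KB},
}


def classify_job(size_bytes, model_set="reduced"):
    """Return 'immediate', 'batch', or 'too large' for a given input size."""
    limits = LIMITS[model_set]
    if size_bytes > limits["max"]:
        return "too large"
    if size_bytes < limits["immediate_below"]:
        return "immediate"
    # Files exactly at a threshold are treated as batch jobs here (assumption).
    return "batch"


if __name__ == "__main__":
    for size_kb in (2, 4, 400, 600):
        for model_set in ("reduced", "full"):
            print(size_kb, "Kb,", model_set, "->", classify_job(size_kb * KB, model_set))
```

A 4 Kb alignment, for example, would run immediately with the reduced set but go to batch with the full set.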

Formats: FindModel attempts to recognize the format of your input file automatically, using Format Conversion. If this fails, you can use the Sequence Conversion Tools interface to convert your alignment to Fasta and use that as input.

Background: FindModel was developed from a web implementation of the Modeltest script written by David Posada and Keith Crandall. It uses Bill Bruno's program Weighbor to generate the tree based on Jukes-Cantor distances. Weighbor is used because it is much faster than maximum likelihood, yet less biased and more robust than neighbor joining (NJ). Ziheng Yang's PAML is used to calculate the likelihood. The method from David Posada and Keith Crandall's MODELTEST paper is used to calculate AIC scores. One difference from the Modeltest evaluation is that we do not allow invariant sites; this feature is not implemented in PAML because estimates of the fraction of invariant sites tend to be very sensitive to the number of taxa. NOTE: There is a downloadable interface for the original Modeltest code available from Genedrift.org; thanks to Stuart Ray for this information.
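Two of the calculations named above have simple closed forms, sketched here for orientation. The code uses the textbook Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), and the standard AIC definition, AIC = -2 lnL + 2K. It does not call Weighbor or PAML and is not FindModel's own code. The function names, the choice to skip gapped columns, and the log-likelihood values are hypothetical. Whether FindModel counts branch lengths among the K parameters is not stated here, so only substitution-model parameters are counted.

```python
import math


def jc_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # proportion of differing sites
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def best_model(fits):
    """fits maps model name -> (lnL, number of free parameters)."""
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    return min(scores, key=scores.get), scores


# Made-up log-likelihoods for illustration; the parameter counts are the usual
# substitution-model counts (JC: 0, K80: 1, HKY: 4, GTR: 8).
example_fits = {
    "JC": (-2500.0, 0),
    "K80": (-2450.0, 1),
    "HKY": (-2440.0, 4),
    "GTR": (-2438.5, 8),
}

best, scores = best_model(example_fits)

if __name__ == "__main__":
    print("best model by AIC:", best)
    for name, score in sorted(scores.items(), key=lambda item: item[1]):
        print(name, score)
```

In this made-up example GTR has the highest likelihood, but its extra parameters cost more than the likelihood gain, so HKY has the lowest AIC.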

More information about the functionality, methods, and phylogenetic packages used, and about the performance of this tool, is available here [PDF].

Contributors and References: Contributors to this implementation of FindModel include Ning Tao, Russell Richardson, William Bruno and Carla Kuiken.

For background information on selecting evolutionary models, see (for example): Posada D, Crandall KA. Selecting the best-fit model of nucleotide substitution. Syst Biol. 2001 Aug;50(4):580-601.

Johan Nylander has written an implementation of Modeltest that is based on MrBayes rather than PAUP*.

last modified: Fri Apr 11 13:27 2008

Options: Use all 28 models. Construct initial tree using Weighbor, PAUP*, or MrBayes. Always email results.

Input: Paste your input here [Sample Input] or upload your file.