Analytical and Computational Tools
- The Broad Institute Genomic Center for Infectious Diseases
- ALLPATHS-LG
- AVA454
AssembleViral454 is an assembler designed for small and non-repetitive genomes sequenced at high depth.
- HUMAnN
The HMP Unified Metabolic Analysis Network is a pipeline for efficiently and accurately determining the presence/absence and abundance of microbial pathways in a community from metagenomic data.
- PASA
The Program to Assemble Spliced Alignments is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures and to maintain gene structure annotation consistent with the most recently available experimental sequence data.
- PriSM
A set of algorithms designed specifically to create degenerate primers for the amplification and sequencing of short viral genomes while maintaining sample population diversity.
- Prodigal
Prokaryotic Dynamic Programming Genefinding Algorithm is a microbial (bacterial and archaeal) gene finding program.
- QIIME
Quantitative Insights Into Microbial Ecology is an open source software package for comparison and analysis of microbial communities.
- RC454
ReadClean454 is a program that takes a set of 454 read and quality files as well as a consensus assembly for those reads and corrects for known 454 error modes.
- Trinity
This represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
- V-Phaser
Use this tool to call variants in genetically heterogeneous populations from ultra-deep sequence data.
- V-Profiler
This tool takes a read alignment and a list of accepted variants at each location in the alignment (such as would be generated by V-Phaser) and analyzes the intra-host diversity of a genome.
- VICUNA
This program is a de novo assembly program targeting populations with high mutation rates.
- NIH Center for Human Immunology, Autoimmunity and Inflammation
- Desktop cDNA Annotation System (dCAS)
dCAS automates large-scale cDNA sequence analysis.
- EuPathDB BLAST
- HIV Databases
- Codon Alignment
This tool takes any DNA alignment and returns a codon alignment and translation.
- Consensus Maker
Calculates consensus sequences.
- ElimDupes/Duplicate Sequence Removal
This tool compares the sequences and eliminates any duplicates or very similar sequences.
- Entropy/Shannon Entropy-Two
These tools apply Shannon Entropy as a measure of variation in DNA and protein sequence alignments.
- Format Converter v2.0.3
This program takes as input a sequence or sequences (e.g., an alignment) in an unspecified format and converts the sequence(s) to a different user-specified format.
- Gap Strip/Squeeze v 2.1.0
Use to delete aligned columns that contain a chosen percentage of gaps or other characters.
- Gene Cutter/Sequence Alignment and Protein Extraction
This is a sequence alignment and protein extraction tool.
- HDent/HDdist/Analysis of Heteroduplex Mobility Shifts
This is a Web executable version with a link to the full source release for HDent and HDdist, a pair of programs for analysing data from heteroduplex mobility and tracking assays (HMA and HTA).
- HIV Blast
Find the HIV database sequences most similar to your query sequence(s).
- HIV Sequence Locator
This tool finds the position of your nucleotide or protein sequence(s) relative to the appropriate viral reference strain.
- HIVAlign/VirAlign for HIV sequences
This tool takes aligned or unaligned sequences and returns the alignment of the region that the input sequences touch.
- Hypermut/Analysis and Detection of APOBEC-induced Hypermutation
Document the nature and context of nucleotide substitutions in a sequence population relative to a reference sequence.
- PCOORD/Principal Coordinate Analysis
This is a procedure to find meaningful patterns in sequence data.
- PhyloPlace/Phylogenetic Placement Service
This service reports phylogenetic relatedness of a query sequence with reference sequences in known clades.
- PhyML interface
This is a fast and flexible program that generates good maximum likelihood trees.
- Poisson-Fitter
This analyzes sets of homogeneous DNA sequences and performs statistical tests on the Hamming Distance frequency distributions.
- Quality Control/HIV-1 Sequence Quality Analysis
Examines sets of HIV-1 nucleotide sequences for common problems and prepares HIV-1 sequence sets, together with related data, for submission to GenBank.
- QuickAlign
Align a desired region to premade alignments or your own alignment.
- RIP/Recombinant Identification Program
Identifies recombination in query sequence(s) by calculating similarity to a background alignment in a sliding window.
- SeqPublish/Sequence Alignment Publishing Tool
This interface takes a sequence alignment (nucleotides or amino acids) and replaces residues identical to a reference sequence with dashes.
- SNAP/Synonymous Non-synonymous Analysis Program
Calculates synonymous and non-synonymous substitution rates based on a set of codon-aligned nucleotide sequences.
- SUDI Subtyping/SUbtyping Distance
This tool is designed to help determine if a newly defined clade of related sequences should most appropriately be considered a new subtype, a new sub-subtype, or part of a previously defined subtype.
- SynchAlign/Synchronize Alignments
This tool aligns two alignments to each other.
- Translate/Translate Nucleotide to Amino Acid Sequences
This tool translates nucleotide sequences and returns amino acid sequences.
- VESPA/Viral Epidemiology Signature Pattern Analysis
This program detects signature patterns (atypical amino acid or nucleotide residues) in a set of query sequences relative to a set of reference sequences.
- Immunology Database and Analysis Portal (ImmPORT)
- Influenza Research Database (IRD)
- Align Sequences (MSA)
- Analyze Sequence Variation (SNP)
- Annotate Nucleotide Sequences
This tool is an interactive version of the influenza annotation pipeline. You can submit any number of nucleotide sequences in FASTA format for validation and annotation.
- Identify Point Mutations
This tool will scan the specified proteins that are from type A viruses with the specified subtype found in the IRD database for the presence of the amino acid you specify at the coordinate (location) indicated.
- Identify Similar Sequences (BLAST)
Use BLAST algorithms to identify similar nucleotide or amino acid sequences in a variety of custom IRD databases.
- Infer Phylogenetic Relationships and Generate Trees
- Metadata-driven Comparative Analysis Tool for Sequences (meta-CATS)
The meta-CATS tool can perform customized comparative genomics analyses with minimal manual manipulation.
- Visualize Aligned Sequences
Interactive alignment viewer to visualize nucleotide or amino acid sequences.
- The JCVI Genomic Center for Infectious Diseases
- AutoCloser
- Celera Assembler
A de novo whole-genome shotgun DNA sequence assembler.
- CLOE
Software for the finishing stage, also known as closure.
- Elvira
A set of tools and procedures designed for high throughput assembly of small genomes such as viruses.
- Ergatis
A Web-based utility that is used to create, run, and monitor reusable computational analysis pipelines.
- PANDA
The Protein And Nucleotide Data Archive unifies the archival of the sequences from Taxonomy, GenBank, RefSeq, UniProt, PDB, and PRF on a regular interval to build and maintain a data archive.
- PanOCT
The Pan-genome Ortholog Clustering Tool is a program written in PERL for pan-genomic analysis of closely related prokaryotic species or strains.
- Primer Designer
This is a high-throughput PCR primer design software.
- SSS
This software organizes large chunks of data.
- Sybil
This is a Web-based software package for comparative genomics.
- VIGOR
The Viral Genome ORF Reader supports high throughput feature prediction and annotation.
- JOINSOLVER
This is a Web-based software program developed for human immunoglobulin V(D)J recombination analysis.
- jpHMM at GOBICS (jumping profile Hidden Markov Model)
This is a probabilistic approach to compare a sequence to a multiple alignment of a sequence family.
- Pathogen Portal RNA-Seq Pipeline (BRC)
- Pathosystems Resource Integration Center (PATRIC)
- BLAST
- Comparative Pathway Tool
Allows you to identify a set of pathways based on taxonomy, EC number, pathway ID, pathway name and/or specific annotation type.
- MG-RAST Metagenomics Analysis Server
The server is an automated analysis platform for metagenomes providing quantitative insights into microbial populations based on sequence data.
- Rapid Annotation using Subsystem Technology (RAST)
This is a fully automated service for annotating complete or nearly complete bacterial and archaeal genomes.
- Pathogen Functional Genomics Resource Center: Please note this program has ended. Software developed through the PFGRC program is available through http://www.pathogenportal.org/portal/portal/PathPort/ADB/ADB?action=a&windowstate=normal&c=pfgrcs.
- Gingko Application and Source Code
- Systems Biology for Infectious Disease Research
- CSDeconv
Computational method to locate TF binding from ChIP-seq data using blind deconvolution.
- GenomeView
Genome editor and viewer with dynamic visualization of aligned short read sequences.
- Vectorbase
- Virus Pathogen Resource (ViPR)
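Several of the tools listed above summarize variation in an alignment; the Entropy/Shannon Entropy-Two entry, for instance, applies Shannon entropy to DNA and protein alignments. The sketch below is only an illustration of that general measure, not the code behind the HIV Databases tool. The function names `column_entropy` and `alignment_entropy` are hypothetical, and treating gap characters as ordinary symbols is an assumption made for brevity.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the symbols in one alignment column.

    Assumption: every character, including '-', counts as a symbol.
    """
    counts = Counter(column)
    total = len(column)
    # H = sum over symbols of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(sequences):
    """Per-column entropy for a list of equal-length aligned sequences."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*sequences) walks the alignment column by column
    return [column_entropy(col) for col in zip(*sequences)]


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCA", "ACTA"]  # made-up example data
    for position, h in enumerate(alignment_entropy(toy_alignment), start=1):
        print(position, round(h, 3))
```

A fully conserved column scores 0 bits, and the score rises as the symbols in a column become more evenly mixed.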
Databases and Data Sets
- Database of Mutations Causing Human Hyper IgE Syndrome (STAT3base)
- EuPathDB
- Search
- Sequence Retrieval
Retrieve sequences by gene IDs.
- HIV Databases
- ADRA/Antiviral Drug Resistance Analysis Tool
This is a reference for researchers in the field of HIV drug resistance and is a summary of the drug-resistance mutations that have been defined in the literature.
- Epitope Alignments
The alignments are FASTA files of the unique epitope sequences aligned to the LANL HIV subtype reference alignments.
- HIV-1 Resistance Mutation Database
- HIV Sequence Alignments
- HLA Analysis Tools
These tools calculate and graph HLA frequencies in a population; search for HLA linkage disequilibrium in a population; compare the HLA frequencies in two populations.
- Intra-Patient Sequence Search
This interface retrieves sets of sequences containing more than N sequences per patient.
- Sequence Search Interface
- Immunology Database and Analysis Portal (ImmPORT)
- Influenza Research Database (IRD)
- Human Isolates with Clinical Metadata
Search samples collected from patients presenting to a physician with influenza symptoms.
- PCR Primer Probe Data
The data provide sequence information and assay metadata for commonly used primers and probes used in rapid detection and subtyping of influenza viruses and in diagnostic applications.
- Nucleotide Sequence Search
Search for influenza sequences, proteins, and strains using two types of searches.
- Phenotypes
Find influenza virus strains that carry specific phenotypic characteristics.
- Non-Obese Diabetic (NOD) Mouse BAC Library
- Pathosystems Resource Integration Center (PATRIC)
- Feature Finder
Use to locate specific feature(s) based on taxonomy (e.g., genus or species), feature type (e.g., CDS, rRNA, etc.), keyword, sequence status, and/or annotation type.
- Genomics Finder
Use to search for all PATRIC genomes based on genome names and available metadata.
- ID Mapping Tool
Use to locate synonymous identifiers across multiple-source databases.
- Protein Family Sorter Tool
- Rabbit Immunology and Infectious Disease Research
- Systems Approach to Immunology/Computational Core
Data is made available via a Web Portal, in two complementary forms: as bulk downloads (both raw and processed data) and via interactive exploration of data, allowing the identification of relationships among the different data types generated by the Cores.
- Systems Biology for Infectious Diseases Research
- TB ChipSeq regulatory binding sites
Search for TB regulatory binding sites.
- VectorBase Sequence Data
A resource focused on invertebrate vectors of human disease.
- ViPR Search Virus Pathogen Resource
The Virus Pathogen Resource is a useful resource for the virology research community designed to assist users with various projects involving sequence and structure analysis, comparative genomics, and virus phenotype studies among others.
Visualization and Modeling Tools
- NIH Center for Human Immunology, Autoimmunity and Inflammation
- Highlighter (for Nucleotide Sequences)
This tool highlights matches, mismatches, transition and transversion mutations, and silent and non-silent mutations.
- HIMMER
- HLA Analysis Tools
These tools calculate and graph HLA frequencies in a population; search for HLA linkage disequilibrium in a population; compare the HLA frequencies in two populations.
- ImmPORT
- Influenza Research Database (IRD)
- Networks on Circular Chromosome (NetCirChro)
Cytoscape plug-in for network visualization on circular chromosomes.
- Nexplorer
Evolutionary and comparative analyses are important tools used for the annotation of genomes.
- Papillomavirus Episteme (PaVE)
Information and analytical resources for scientific research on the papillomaviridae family of viruses.
- Pixel
This tool generates an image of an alignment using one or more colored pixels for each residue, thus allowing errors in large alignments to be easily seen.
- Single Nucleotide Polymorphism (SNP) Explorer
Analyze and visualize hyper-IgM/CVID re-sequencing microarray data.
- ViPR/Visualize Aligned Sequences
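The Highlighter entry above mentions marking matches, mismatches, and transition and transversion mutations. As a rough sketch of how such a per-site classification could work, and not the Highlighter tool's own code, the example below labels each position of a query sequence against a reference. The names `classify_site` and `classify_alignment` are hypothetical, and the silent versus non-silent distinction is left out because it needs codon context.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_site(ref_base, query_base):
    """Label one aligned position as match, transition, transversion, or other."""
    ref, qry = ref_base.upper(), query_base.upper()
    if ref == qry:
        return "match"
    if ref not in BASES or qry not in BASES:
        # Gaps and ambiguity codes are not classified in this sketch
        return "other"
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # a change between the two classes is a transversion.
    same_class = (ref in PURINES) == (qry in PURINES)
    return "transition" if same_class else "transversion"


def classify_alignment(reference, query):
    """Classify every position of two equal-length aligned sequences."""
    if len(reference) != len(query):
        raise ValueError("reference and query must be aligned to the same length")
    return [classify_site(r, q) for r, q in zip(reference, query)]


if __name__ == "__main__":
    # Made-up sequences for illustration only
    print(classify_alignment("ACGTA", "ATATC"))
```

A viewer could then map each label to a color when drawing the alignment.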
Clinical Genomics
Contact Information
For more information on Bioinformatics Genomics and DNA Analysis, email:
Bioinformatics and Computational Sciences Branch
Content last reviewed on December 10, 2015