Scientific Supercomputing at the NIH

EMBOSS on Helix
EMBOSS (The European Molecular Biology Open Software Suite) is an open-source software analysis package developed specifically for the needs of the molecular biology user community. EMBOSS is composed of over 100 high-quality applications. EMBOSS website.

EMBOSS can read and write sequences in all the common sequence formats, such as EMBL, GenBank, FASTA, SwissProt, PIR, GCG, MSF, Clustal and raw. As a result, sequences do not have to be converted from one format to another before they are used with EMBOSS programs.

Some of the areas covered by EMBOSS programs:

Version

When you type 'emboss' to initialize EMBOSS, a header is printed that lists the version of EMBOSS, the available databases, and the update status of each database.

How to Use

Documentation

Sample Session

(User input follows each helix% prompt; pressing Enter at a needle prompt accepts the default shown in brackets):
helix% emboss
************************************************************************
Welcome to EMBOSS 4.0.0
************************************************************************
Databases available:
genbank     Release 167 (19/Aug/08)
genpept     Release 167 (27/Aug/08)
est         Release 167 (19/Aug/08)
refseqaa    Release 30 (11/Jul/08)
refseqnt    Release 30 (11/Jul/08)
PROSITE     Release 20.36 (22/Jul/08)
Restriction Enzymes (REBASE) 809 (29/Aug/08)
Transfac    Release 11.4 (14/Dec/07)
prints      Release 38_1 (24/Oct/07)
uniprot     Release 14.0 (22/Jul/08)
allnt       including genbank,est,refseqnt,gbnew
allaa       including genpept,uniprot,refseqaa,gpnew
gpnew       01/Sep/08, 61516 entries since 19/Aug/08 rel 167
gbnew       31/Aug/08, 403077 entries since 19/Aug/08 rel 167

Type 'wossname keyword' to find a program
Type 'showdb' to display available databases
Type 'tfm programname' to display the program help
Type 'programname -help' to list command-line options

EMBOSS Web Interface at NIH: http://helixweb.nih.gov/emboss/
HELP! Helix Staff: 301-594-6248 or email: staff@helix.nih.gov
*********************************************************************

helix% needle maize_hb.fas rice_hb.fas
Needleman-Wunsch global alignment.
Gap opening penalty [10.0]:
Gap extension penalty [0.5]:
Output alignment [af291052.needle]:

helix% more af291052.needle
########################################
# Program: needle
# Rundate: Mon Jan 08 2007 15:59:43
# Commandline: needle
#    [-asequence] maize_hb.fas
#    [-bsequence] rice_hb.fas
# Align_format: srspair
# Report_file: af291052.needle
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: AF291052.1
# 2: OSU76029
# Matrix: EDNAFULL
# Gap_penalty: 10.0
# Extend_penalty: 0.5
#
# Length: 958
# Identity: 607/958 (63.4%)
# Similarity: 609/958 (63.6%)
# Gaps: 207/958 (21.6%)
# Score: 2034.5
#
#
#=======================================

AF291052.1         1 ATGGCACTCGCGGAGG---CCGACGACGGCGCGGTGGTCTTCGGCGAGGA     47
                     |||||.||||.|||||   .|.|.|.||..||||||..||||.|||||||
OSU76029           1 ATGGCTCTCGTGGAGGATAACAATGCCGTAGCGGTGAGCTTCAGCGAGGA     50

AF291052.1        48 GCAGGAGGCGCTGGTGCTCAAGTCGTGGGCCGTCATGAAGAAGGACGCCG     97
                     ||||||||||||||||||||||||.|||||..||.||||||||||..|||
OSU76029          51 GCAGGAGGCGCTGGTGCTCAAGTCATGGGCGATCTTGAAGAAGGATTCCG    100

AF291052.1        98 CCAACCTGGGCCTCCGCTTCTTCCTCAAGTAAGTACGTTTCCGTGCTACA    147
                     ||||..|.|.|||||||||||||.|.|||||.||||  .|.||||.|
OSU76029         101 CCAATATTGCCCTCCGCTTCTTCTTGAAGTATGTAC--ATGCGTGTT---    145

AF291052.1       148 CACTGCC-----------TGCG----CACGTGCGCTTGGGTT------GC    176
                      |||.||           ||||    ||   |.|.|||||||      ||
OSU76029         146 -ACTACCATTTCTCTTTTTGCGGAATCA---GAGATTGGGTTTGTGAAGC    191

AF291052.1       177 ACCTGCACCGGCGGCCATCGAGC-----------CTGCTCCTTGACTAAC    215
                     |  |..|         ||.||||           |||.|.|.||..|
OSU76029         192 A--TTAA---------ATTGAGCAATGCATTTCGCTGATACATGTGT---    227

[...]
#---------------------------------------
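
The markup line between the two sequences in the needle report can be read as follows: '|' marks an identical pair of residues, '.' a mismatch, and a blank a position where one sequence has a gap ('-'). The Python sketch below is only an illustration of that reading and is not part of EMBOSS. The function names match_line and percent are hypothetical, and the sketch ignores the ':' symbol that the report format can use for similar but non-identical residues. It rebuilds the markup for the first alignment block of the sample session and recomputes the summary percentages from the counts in the report header.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of EMBOSS.

# First alignment block from the sample needle report above.
ROW_A = "ATGGCACTCGCGGAGG---CCGACGACGGCGCGGTGGTCTTCGGCGAGGA"  # AF291052.1
ROW_B = "ATGGCTCTCGTGGAGGATAACAATGCCGTAGCGGTGAGCTTCAGCGAGGA"  # OSU76029


def match_line(seq_a: str, seq_b: str, gap: str = "-") -> str:
    """Return a simplified markup line for two aligned rows.

    '|' = identical residues, '.' = mismatch, ' ' = gap in either row.
    Assumption: no ':' (similar residue) class is modelled here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have the same length")
    marks = []
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            marks.append(" ")
        elif a.upper() == b.upper():
            marks.append("|")
        else:
            marks.append(".")
    return "".join(marks)


def percent(count: int, length: int) -> str:
    """Format a count the way the report header does, e.g. '607/958 (63.4%)'."""
    return f"{count}/{length} ({100 * count / length:.1f}%)"


markup = match_line(ROW_A, ROW_B)
print(ROW_A)
print(markup)
print(ROW_B)

# Per-block counts derived from the markup line.
print("identities:", markup.count("|"))
print("mismatches:", markup.count("."))
print("gap columns:", markup.count(" "))

# Header statistics recomputed from the counts shown in the report.
print("Identity:  ", percent(607, 958))
print("Similarity:", percent(609, 958))
print("Gaps:      ", percent(207, 958))
```

The percentages in the report header are simply each count divided by the alignment length (958 columns), so the Gaps value counts every column in which either sequence carries a '-'.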