Equivalent Applications for GCG programs
GCG on the Helix Systems was retired in Jan 2007. Most GCG
programs have an equivalent in EMBOSS. This table lists the
EMBOSS equivalent for each GCG program or, where applicable, an alternative program that
is available on the Helix Systems.
Full list of all EMBOSS programs.
GCG program | EMBOSS program | Description/Comments |
Assemble | merger union | Construct new sequences from pieces of existing sequences; merger only accepts 2 sequences while Assemble and union accept several. |
BackTranslate | backtranseq backtranambig | Backtranslate protein -> nucleotide sequence. backtranambig backtranslates to ambiguous codons. |
BestFit | water matcher | BestFit uses the Smith-Waterman algorithm to find the best local alignment between 2 sequences. water uses Smith-Waterman; matcher uses Pearson's lalign algorithm. |
Blast Psiblast | dbiblast | NCBI homology search between query and database |
Breakup | splitter | Splits a sequence into (overlapping) smaller sequences |
Chopup | - | Helps to convert a non-GCG sequence format. Not needed in EMBOSS because it reads most sequence formats without conversion. |
CodonFrequency | chips compseq cusp | CodonFrequency tabulates codon usage. compseq counts the composition of dimers/trimers in a sequence. chips calculates codon usage statistics. cusp creates a codon usage table. |
CodonPreference | syco wobble | Recognize protein coding sequences |
CoilScan | pepcoil | Predicts coiled-coil regions |
Compare + DotPlot | dottup + dotmatcher dotpath | 2-sequence comparison. dotpath does a non-overlapping wordmatch dotplot. |
Composition | compseq pepstats | Sequence composition |
compresstext | - | Removes extra whitespace in text files. Can be done via Unix shell script. |
comptable | - | Creates a scoring matrix |
consensus | prophecy | Creates a consensus sequence or matrices/profiles from multiple alignments |
correspond | codcmp | Codon usage table comparison |
corrupt | msbar | Randomly mutate sequence |
dataset | dbiflat dbiblast dbigcg | Creates searchable sequence database. GCG's Dataset requires sequences in GCG format, whereas dbiflat, dbiblast, dbigcg will take most formats between them. |
detab | - | Replaces tabs with spaces in sequence files. Can be performed by Unix shell command. |
distances | - | Calculates pairwise evolutionary distances between aligned sequences. The Phylip package can do this. |
diverge | - | Estimates pairwise substitutions per site between 2 or more coding sequences. The Phylip package can do this. |
dotplot | dottup dotmatcher | 2-sequence comparison |
extractpeptide | transeq | ExtractPeptide takes the output of Map and can write one or more of the reading-frame translations. transeq translates one or more of the frames or specific regions directly from an input nucleotide sequence. |
FastA FastX Tfasta TfastX | - | Pearson's homology-search program, available as a standalone program. |
fetch | seqret seqretsplit | Pull one or more sequences out of the databases. seqret/seqretsplit can save output in various sequence formats. |
figure | - | Generates plots from other GCG programs. The equivalent EMBOSS programs usually generate plots (e.g. plotorf). |
findpatterns | fuzznuc fuzzpro | Searches for patterns in a sequence or database |
fingerprint | - | Finds the products of T1 ribonuclease digestion. |
fitconsensus | - | Use after Consensus to find the best fits. |
framealign | - | Finds best local alignment including frame shifts between a protein and nucleotide sequence. |
frames | plotorf showorf | Show open reading frames. plotorf does this graphically. |
framesearch | - | Homology searches including frameshifts between protein and nucleotide sequences |
fromembl fromfasta fromgenbank fromig frompir fromstaden fromtrace | - | Converts from various formats to GCG sequence format. Unnecessary in EMBOSS because it can accept most sequence formats, but seqret can convert between formats if desired. |
Gap | needle stretcher | Uses the Needleman-Wunsch algorithm to align 2 sequences globally. stretcher, also a global alignment program, uses the Myers-Miller algorithm, which is more memory-efficient; it is the suggested choice for sequences larger than 10 kb. If one of your sequences is genomic and you are trying to align an EST sequence to it, consider the est2genome program. By contrast, water, matcher and supermatcher are local alignment programs for small, medium and large sequences, respectively. |
Gapshow | plotcon | Graphical representation of similarity of 2 sequences. |
GCGtoBlast | - | Makes a Blast database. Use NCBI's 'formatdb' instead. |
GelAssemble GelDisassemble GelEnter GelMerge GelStart GelView | megamerger merger union | Parts of GCG's gel assembly suite. |
GetSeq | seqret | Type in a new sequence |
GrowTree | - | Creates phylogenetic tree. Can use Phylip or Clustal instead. |
HelicalWheel | pepwheel | Plots peptide sequence as helical wheel to help recognize amphiphilic regions. |
HmmerAlign HmmerBuild HmmerCalibrate HmmerEmit HmmerFetch HmmerIndex HmmerPfam HmmerSearch | - | Sean Eddy's HMMER package, available on Helix and Biowulf. |
HTHScan | helixturnhelix | Finds HTH motifs in protein sequences. |
IsoElectric | iep | Calculates the isoelectric point of a protein. |
Lineup | - | Edits multiple sequence alignments |
ListFile | - | For printing. Can use Unix pcprint command instead. |
Lookup | - | Versatile program for finding sequences in a database. whichdb in EMBOSS can search for accession numbers, but GCG's Lookup is much more sophisticated. Use NCBI Entrez instead. |
Map Mapplot Mapsort | restrict remap restover | Finds restriction enzyme cleavage sites. GCG & EMBOSS may display different isoschizomers of the same enzyme, but the results are equivalent. The EMBOSS remap program may not display a few of the available isoschizomers. |
MeltTemp | dan | Computes melting temperature of oligos |
MEME | - | Finds conserved motifs in a group of unaligned sequences. Use the standalone Meme/Mast on Helix (short interactive jobs) or Biowulf (long batch jobs) |
MFold | - | Predicts nucleotide secondary structure. GCG's version is an old version of Zuker's MFOLD; use the standalone MFOLD instead. |
Moment | pepnet octanol hmoment | Makes a contour plot of the helical hydrophobic moment of a peptide sequence. hmoment prints the text output of the calculation. |
Motifs | patmatmotifs | Finds common Prosite motifs in a sequence. Use the '-full' tag to display abstract information when using EMBOSS patmatmotifs. Note that both these programs will only find Prosite 'Patterns' (e.g. cAMP Phosphorylation Site), and not Prosite 'Matrices' (e.g. Helix-turn-Helix). Use InterProScan to find all known domains and functional sites (http://www.ebi.ac.uk/InterProScan/). patmatmotifs can accept a file containing multiple sequences or patterns. |
Meme + Motifsearch | prophecy + profit | Search a sequence or database with a matrix or profile. |
Names | infoseq | Provides some info about sequence specifications. |
NetBlast Netfetch | - | Remote access to NCBI's Blast. Use standalone Blast on Helix instead. |
NoOverlap | diffseq | Finds differences between 2 sequences. NoOverlap can work with a group of sequences. |
OldDistances | - | Makes a table of the pairwise similarities within a group of sequences. |
onecase | - | Converts sequence into lower or upper case. Can be performed by Unix shell command. |
Overlap | - | Compares 2 sets of sequences using Wilbur-Lipman algorithm. |
Paupdisplay + Paupsearch | - | PAUP Phylogenetic Analysis. Use the standalone PAUP on Helix instead. |
Pepdata | getorf sixpack | Translates in all 6 reading frames. sixpack displays the DNA sequence with 6-frame translations and ORFs. |
Pepplot | pepinfo | Pepplot plots protein secondary structure and hydrophobicity. pepinfo plots hydrophobicity, and garnier does protein secondary structure prediction. |
Peptidemap | digest | Enzyme/reagent cleavage map of a protein. |
Peptidesort | digest pepstats | GCG Peptidesort sorts fragments from an enzyme/reagent cleavage of one or more proteins according to position, molecular weight, and HPLC retention. EMBOSS digest only processes one reagent cleavage at a time. EMBOSS pepstats can be used to determine the composition of the fragments afterwards. The EMBOSS programs do not provide the elution times from HPLC. If you need this data, try the UCSF MS-Digest program, which has an option for HPLC Indices. |
Peptidestructure Plotstructure | garnier antigenic pepwindow pepwindowall | Secondary structure prediction. garnier does not include Jameson-Wolf antigenic indexing. antigenic predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar and Tongaonkar. pepwindow displays Kyte-Doolittle protein hydropathy. pepwindowall produces a set of superimposed Kyte-Doolittle hydropathy plots from an aligned set of protein sequences. |
Pileup | emma | Multiple sequence alignment. emma is an interface to ClustalW. Can also use the standalone Clustal, or web ClustalW. |
PlasmidMap | cirdna lindna | Plot DNA constructs. |
PlotFold | - | Plots MFold output. Use the standalone MFOLD instead, which is more up-to-date and makes output plots in postscript. |
PlotSimilarity | plotcon | Graphical representation of the similarity along a set of aligned sequences. |
Pretty prettybox | cons prettyplot showalign | Calculates a consensus sequence from a multiple sequence alignment, and displays the alignment prettily. |
Prime | eprimer3 | Selects oligonucleotide primers. |
Profilegap Profilemake | prophecy prophet distmat | Creates matrices/profiles from multiple alignments. Gapped alignment for profiles and sequences. |
PrimePair | primersearch? | Evaluates individual primers to determine their compatibility for use as PCR primer pairs. |
Profilescan | patmatdb | Searches sequences or db for protein motifs. Profilescan uses Gribskov method. |
Profilesearch | profit | Scans a sequence or database with a matrix or profile. |
Profilesegments | - | Alignments for results of Profilesearch |
Publish | seqret showseq | Makes publication-quality displays of sequences. |
Reformat | seqret | GCG requires input sequences to be in GCG format, hence other formats need to be converted with 'reformat'. EMBOSS programs accept most sequence formats, so conversion is rarely required, but 'seqret' can be used to convert between formats if desired. |
Repeat | equicktandem etandem einverted palindrome | Finds tandem repeats in sequences. The equivalent group of EMBOSS programs will also look for inverted or palindromic repeats. |
Replace | biosed degapseq | Replaces characters in a text file. degapseq is specific for removing gap characters. Can be performed with Unix shell utilities like sed, awk or tr. |
Reverse | revseq | Reverse/complement a sequence. |
Sample | extractseq | Extract regions from a sequence. |
Seg | maskseq | Masks off low-complexity regions from a sequence. |
Seqed | biosed, cutseq, degapseq, descseq, entret, extractfeat, extractseq, listor, maskfeat, maskseq, newseq, noreturn, notseq, nthseq, pasteseq, revseq, seqret, seqretsplit, skipseq, splitter, trimest, trimseq, union, vectorstrip, yank | Sequence editor. EMBOSS has several tools for specific editing tasks. Alternatively, use a text editor (not a word processor!). Try the Jemboss alignment editor for editing multiple sequence alignments. |
SeqLab | - | X-windows interface to GCG. |
Setkeys | - | Redefines keyboard keys, mainly used for GCG's gel assembly programs. |
Shiftover | - | Moves text by column. Use the nedit editor instead. |
Shuffle | shuffleseq | Shuffles a sequence. |
Simplify | - | Reduce the number of symbols in a sequence. |
Spew | - | Sends a sequence from a remote computer (e.g. Helix) to your desktop. Use one of the File transfer mechanisms instead. |
SPScan | sigcleave | Predicts signal peptides in protein sequences. |
Ssearch | - | Part of Pearson's Fasta package, available as a standalone program on Helix. |
StatPlot | - | Plotting program. Rarely used. |
StemLoop | palindrome etandem | Finds inverted repeats. |
Stringsearch | textsearch | Finds text phrases in sequence or database. Use NCBI's Entrez instead. |
Terminator | - | Searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov. |
Testcode | wobble | Plots 3rd-position variability as an indicator of potential coding regions. |
ToFastA ToIG ToPIR ToStaden | seqret | EMBOSS accepts most sequence formats, therefore format conversion is rarely required. seqret can be used to convert between formats if desired. |
Translate | transeq | Translates nucleotide -> protein sequences |
Transmem | - | Predicts transmembrane helices. |
Window + Statplot | freak | Residue/base frequency table or plot. |
Wordsearch Segments | - | Homology search using Wilbur/Lipman algorithm. Segments displays the result. |
Xnu | - | Masks tandem repeats for future Blast search. |
- | abiview | Reads ABI file and displays trace |
- | antigenic | Finds antigenic sites in proteins |
- | banana | Bending and curvature plot in B-DNA |
- | btwisted | Calculates the twisting in a B-DNA sequence |
- | cai | CAI codon adaptation index, to measure synonymous codon usage bias. |
- | chaos | Create a chaos game representation plot for a sequence |
- | charge | Protein charge plot. |
- | checktrans | Reports STOP codons and ORF statistics of a protein |
- | coderet | Extract CDS, mRNA and translations from feature tables |
- | cpgplot cpgreport newcpgreport newcpgseek | Plots and reports CpG-rich regions. |
seqed | cutseq | Removes a specified section from a sequence. seqed is interactive, cutseq is command-line. |
seqed | descseq | Alter name/description of sequence. |
Findpatterns | dreg | Regular expression search of a sequence. Findpatterns is an approximate equivalent. |
- | emma | Interface to ClustalW program. |
- | emowse | Protein identification by Mass spectrometry. |
- | epestfind | Finds PEST motifs as potential proteolytic cleavage sites |
- | est2genome | Align EST and genomic DNA sequences. |
- | extractfeat | Extract features from a sequence. |
- | findkm | Find Km and Vmax for an enzyme reaction by a Hanes/Woolf plot |
- | fuzztran | Protein pattern search after translation |
- | geecee | Calculates the fractional GC content of nucleic acid sequences |
- | isochore | Plots isochores in large DNA sequences |
- | listor | Writes a list file of the logical OR of two sets of sequences |
- | makenucseq makeprotseq | Create random nucleotide and protein sequences |
- | marscan | Finds MAR/SAR sites in nucleic sequences |
- | maskfeat | Mask off features of a sequence. |
- | mwcontam | Shows molwts that match across a set of files |
- | mwfilter | Filter noisy molwts from mass spec output |
- | noreturn | Removes carriage returns from ASCII files. Can be performed by Unix utilities like 'tr'. |
Reformat | nthseq | Pulls one sequence out of a multiple set. Reformat will pull a sequence out of an MSF or RSF file. |
- | oddcomp | Finds protein sequence regions with a biased composition |
- | polydot | Displays all-against-all dotplots of a set of sequences |
- | printsextract | Extract data from PRINTS |
- | pscan | Scans proteins using PRINTS |
- | rebaseextract redata | Search and extract from REBASE. |
- | recoder | Remove restriction sites but maintain the same translation |
- | seqmatchall | All-against-all comparison of a set of sequences. |
- | showdb | Shows info about currently available databases. |
- | showfeat | Shows features of a sequence |
- | silent | Silent mutation restriction enzyme scan |
- | sirna | Finds siRNA duplexes in mRNA |
- | stssearch | Searches a DNA database for matches with a set of STS primers |
- | supermatcher | Finds a match of a large sequence against one or more sequences |
- | tfextract | Extract data from TRANSFAC database. |
gcghelp | tfm | Shows documentation for a program. |
- | tfscan | Scans DNA sequences for transcription factors |
- | tmap | Displays membrane spanning regions |
- | tranalign | Align nucleic coding regions given the aligned proteins |
- | trimest trimseq | Trim bits off ends of sequences. Can be done interactively with GCG's seqed. |
- | twofeat | Finds neighbouring pairs of features in sequences |
- | vectorstrip | Strips out DNA between a pair of vector sequences |
- | wordcount | Counts words of a specified size in a DNA sequence |
- | wordmatch | Finds all exact matches of a given size between 2 sequences |
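
Several rows in the table (compresstext, detab, onecase, noreturn) are plain text clean-up tasks that the table says can be handled with Unix shell commands. As an alternative illustration only, the same operations can be sketched in a few lines of Python. The function names below simply mirror the program names in the table and are not part of GCG or EMBOSS; the default tab width of 8 and the exact whitespace rules are assumptions, not documented GCG behaviour.

```python
import re


def detab(text: str, tabsize: int = 8) -> str:
    """Replace tabs with spaces (tab stops every `tabsize` columns; 8 is an assumed default)."""
    return text.expandtabs(tabsize)


def onecase(text: str, upper: bool = True) -> str:
    """Convert all letters to upper case (default) or lower case."""
    return text.upper() if upper else text.lower()


def compresstext(text: str) -> str:
    """Collapse runs of spaces/tabs to one space and drop trailing whitespace on each line."""
    lines = text.split("\n")
    return "\n".join(re.sub(r"[ \t]+", " ", line).rstrip() for line in lines)


def noreturn(text: str) -> str:
    """Remove carriage returns, e.g. to turn DOS line endings into Unix ones."""
    return text.replace("\r", "")


if __name__ == "__main__":
    raw = "acgt\tacgt   ttga  \r\nggcc\r\n"
    cleaned = onecase(compresstext(detab(noreturn(raw))))
    print(cleaned)
```

Note that these helpers act on raw text: applied to a whole FASTA or GenBank file, onecase would also change the case of header and annotation lines, so in practice it should be applied to the sequence lines only.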