HIV sequence database

N-GlycoSite

Purpose: Highlight and tally predicted N-linked glycosylation sites (Nx[ST] patterns, where x can be any amino acid).

Details:
During glycosylation, an oligosaccharide chain is attached to asparagine (N) occurring in the tripeptide sequence N-X-S or N-X-T, where X can be any amino acid except Pro. This sequence is called a glycosylation sequon. The N-GlycoSite tool marks and tallies the locations where this pattern occurs.
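The sequon definition above can be sketched as a short pattern search. This is a minimal illustration, not the tool's own implementation. The function name find_sequons and the 1-based position convention are assumptions made for this example.

```python
import re

# Zero-width lookahead so overlapping sequons (e.g. "NNTS") are all reported.
# [^P] is the X position: any residue except proline.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(seq):
    """Return the 1-based position of the N in each N-X-[ST] sequon (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

print(find_sequons("NNTSANPTNAS"))  # [1, 2, 9]
```

In the example string, the asparagine at position 6 is followed by a proline, so it is not reported.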

The likelihood that a particular site is N-glycosylated can be influenced by its sequence context. The pattern can therefore be expanded to four amino acids, NX[ST]Z, in which the residues at the X and Z positions can be important determinants of glycosylation efficiency. For example, a proline at position X or Z strongly disfavors N-linked glycosylation.
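To inspect the four-residue context of each candidate site, one could report the X and Z residues alongside the position. The sketch below is illustrative only. The function name sequon_contexts and the returned tuple layout are hypothetical, and it deliberately keeps NP[ST] matches so that a proline at X can be flagged rather than silently dropped.

```python
import re

def sequon_contexts(seq):
    """Return (position, X, Z, has_proline) for every N-X-[ST] match.

    position is the 1-based index of the N. Z is None when the sequon
    sits at the very end of the sequence. has_proline is True when
    X or Z is proline.
    """
    seq = seq.upper()
    results = []
    for m in re.finditer(r"(?=N[A-Z][ST])", seq):
        i = m.start()
        x = seq[i + 1]
        z = seq[i + 3] if i + 3 < len(seq) else None
        results.append((i + 1, x, z, x == "P" or z == "P"))
    return results

for row in sequon_contexts("ANGTPKNPSANAT"):
    print(row)
```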

O-linked glycosylation signals are more difficult to predict, but one can estimate their positions using the NetOGlyc program at the Center for Biological Sequence Analysis.

Input:
Input can be one amino acid sequence, or an alignment of amino acid sequences, from any organism. If you just want to tally the number of N-glycosylation sites, the protein sequences do not need to be aligned. Standard sequence alignment formats are recognized.
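As a rough illustration of tallying sites per sequence, the sketch below reads FASTA-formatted text and counts sequons in each record. It assumes FASTA input and removes gap characters ('-') before matching, so aligned and unaligned input give the same counts. Whether the tool itself matches across gaps in this way is an assumption here, and the helper names read_fasta and tally_sites are hypothetical.

```python
import re

def read_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def tally_sites(records):
    """Count N-X-[ST] sequons (X != Pro) per sequence, ignoring gaps."""
    counts = {}
    for name, seq in records.items():
        ungapped = seq.upper().replace("-", "")
        counts[name] = len(re.findall(r"(?=N[^P][ST])", ungapped))
    return counts

example = ">s1\nNAT--NGS\n>s2\nN-AT\nNPT\n"
print(tally_sites(read_fasta(example)))  # {'s1': 2, 's2': 1}
```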

Exclude NP[ST] pattern:
A proline in the second position (site pattern NP[ST]) strongly disfavors glycosylation, so these patterns are excluded by default. Uncheck the box to include them.
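The effect of this option can be pictured as a switch between two patterns. The following is only a sketch. The parameter name exclude_np is hypothetical and simply mirrors the checkbox described above.

```python
import re

def sequon_pattern(exclude_np=True):
    """Build the search pattern; exclude_np=True skips NP[ST] sites."""
    middle = "[^P]" if exclude_np else "."
    return re.compile(rf"(?=N{middle}[ST])")

def count_sequons(seq, exclude_np=True):
    """Count sequons in an ungapped amino acid sequence."""
    return sum(1 for _ in sequon_pattern(exclude_np).finditer(seq.upper()))

print(count_sequons("NPTANGS"))                    # 1
print(count_sequons("NPTANGS", exclude_np=False))  # 2
```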

Grouped Sequence Names:
If you are analyzing multiple sequences, you can choose how to group them in the analysis. If you are analyzing a single sequence, or you do not want to group your sequences, just ignore these options. Your sequences can be grouped by the first character in the sequence names, or by a set of characters delimiting the sequence names, or by providing a list of groups.

When providing a list of groups, put each sequence name on a separate line and separate groups with an empty line. The first item ending in ':' in a group is taken as the group name; this line is optional. If group names are omitted, names are assigned as Group-1, Group-2, etc. Sequences that are not listed in any group are placed in a group named 'Others' and colored gray. This is useful for highlighting selected groups of sequences within a larger set.

The following can be pasted in as the "grouped sequence names" for testing with the Sample Input:

Non-recombinants:
A1.KE.93.Q23-17
B.FR.HXB2
C.BR.92.92BR025
D.UG.94.94UG1141
O.CM.-.ANT70
CPZ.CM.-.CAM3

Recombinants:
01_AE.CF.90.90CF11697
02_AG.CM.97.97CM-MP807
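The grouping rules described above can be sketched as a small parser. This is an illustration of the stated rules, not the tool's code. The function names parse_groups and assign_group are hypothetical.

```python
import re

def parse_groups(text):
    """Parse a pasted group list into {group_name: [sequence names]}.

    Groups are separated by empty lines. The first item ending in ':'
    names the group; unnamed groups become Group-1, Group-2, ...
    """
    groups = {}
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for n, block in enumerate(blocks, start=1):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        label = next((i for i, ln in enumerate(lines) if ln.endswith(":")), None)
        if label is None:
            name, members = f"Group-{n}", lines
        else:
            name = lines[label][:-1].strip()
            members = lines[:label] + lines[label + 1:]
        groups[name] = members
    return groups

def assign_group(seq_name, groups):
    """Return the group containing seq_name, or 'Others' if none does."""
    for name, members in groups.items():
        if seq_name in members:
            return name
    return "Others"

sample = """Non-recombinants:
B.FR.HXB2
C.BR.92.92BR025

Recombinants:
01_AE.CF.90.90CF11697
"""
groups = parse_groups(sample)
print(assign_group("B.FR.HXB2", groups))     # Non-recombinants
print(assign_group("O.CM.-.ANT70", groups))  # Others
```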

Reference coordinates:
If your sequences are HIV-1, HIV-2, or SIV, your results can show the position of the glycosylation sites relative to the relevant reference sequence coordinates. For best accuracy of coordinates, we recommend having the reference sequence included in the alignment (as the first sequence).
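One simple way to express a site in reference coordinates is to count the non-gap reference residues up to the alignment column where the site's N falls. The sketch below assumes the reference is the first sequence of the alignment and that a site falling in an insertion relative to the reference is reported at the preceding reference position. Both the numbering convention and the function names are assumptions for illustration, not a description of how the tool computes coordinates.

```python
import re

def column_to_ref_position(ref_aligned):
    """For each alignment column, the count of non-gap reference
    residues up to and including that column (1-based numbering)."""
    positions = []
    count = 0
    for ch in ref_aligned:
        if ch != "-":
            count += 1
        positions.append(count)
    return positions

def sequon_ref_positions(ref_aligned, query_aligned):
    """Reference positions of the N of each sequon found in the query."""
    col_to_ref = column_to_ref_position(ref_aligned)
    # Alignment columns that hold a residue (not a gap) in the query.
    cols = [i for i, ch in enumerate(query_aligned) if ch != "-"]
    ungapped = "".join(query_aligned[i] for i in cols).upper()
    return [col_to_ref[cols[m.start()]]
            for m in re.finditer(r"(?=N[^P][ST])", ungapped)]

# The query carries a two-residue insertion relative to the reference.
print(sequon_ref_positions("MR--NATK", "MRGGNAT-"))  # [3]
```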

References:

Zhang M et al., Glycobiology. 14(12):1229-46 (2004) -- please cite this reference if you use our tool in a publication.
Marshall RD, Biochem Soc Symp. 40:17-26 (1974)
Kasturi et al., Biochem J. 323 (Pt 2):415-9 (1997)
Mellquist JL et al., Biochemistry. 37(19):6833-7 (1998)

last modified: Mon Apr 11 11:14 2016

Questions or comments? Contact us at seq-info@lanl.gov.
