HIV sequence database

GenBank Entry Generation

Make a Sequin file for HIV-1, HIV-2, or SIV sequences

Purpose: To prepare HIV-1, HIV-2, or SIV sequence sets, together with related data, for submission to GenBank.

Before you use this tool: Please verify that the sequences to be submitted are correct. For example, are you sure there are no sample mix-ups, contaminants, or hypermutants? If you have not verified the quality of the sequences, please use the Quality Control tool. From the QC tool, you can examine the data and then continue on to generate a Sequin file.

Step 1: Sequence information

Sequence information > Contact information > Manuscript information > Annotation data

Choose organism
Paste your sequence set
Or upload sequence set

Job info

Job title
Your email for job results

Details

Required information:

HIV-1, HIV-2, or SIV nucleotide sequences in Fasta format.
Author and manuscript information.
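
As a rough pre-check before pasting or uploading, a sequence set can be tested for basic Fasta structure and nucleotide-only content. The sketch below is not part of this tool; the function names are hypothetical, and the accepted alphabet (IUPAC nucleotide codes plus the gap character) is an assumption, since the tool's own validation rules are not described here.

```python
# Assumed alphabet: IUPAC nucleotide codes plus '-' for gaps.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN-")


def parse_fasta(text):
    """Parse Fasta text into {name: sequence}; reject duplicate names."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[name] += line
    return records


def non_nucleotide_chars(records):
    """Return {name: [unexpected characters]} for sequences that fail the check."""
    problems = {}
    for name, seq in records.items():
        bad = set(seq.upper()) - IUPAC_NUCLEOTIDES
        if bad:
            problems[name] = sorted(bad)
    return problems


records = parse_fasta(">s1\nACGTN\nacgt\n>s2\nMKVL\n")
print(non_nucleotide_chars(records))  # {'s2': ['L']}
```

A check like this only catches format problems, such as a protein sequence pasted by mistake. It does not replace the Quality Control tool, which addresses sample mix-ups, contaminants, and hypermutants.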

Optional information:
You will be prompted to enter annotation information for which GenBank has no dedicated field. These data can be loaded directly from a comma-delimited (.csv) file. To save Excel data in comma-delimited format, go to File > Save As and select CSV (Comma delimited) (*.csv).

Each row in the comma-delimited (.csv) file should correspond to a sequence in the Fasta file. The first column should contain the sequence names exactly as they appear in the Fasta file; any difference in a name will cause an error. The order of the rows need not match the order of sequences in the Fasta file, because annotation data are associated with sequences by name rather than by position.
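
Because a name mismatch leads to errors, it can help to compare the two files before submitting. The following is a minimal sketch, not the tool's own matching code. It assumes that a sequence name is everything after ">" on the Fasta header line and that the CSV file starts with a header row; the `has_header` parameter, the function names, and the column titles in the sample data are all hypothetical.

```python
import csv
import io


def fasta_names(fasta_text):
    """Return sequence names: the text after '>' on each header line."""
    return [line[1:].strip() for line in fasta_text.splitlines()
            if line.startswith(">")]


def csv_names(csv_text, has_header=True):
    """Return the first-column value of each non-empty data row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if has_header:  # assumption: the first row holds column titles
        rows = rows[1:]
    return [row[0] for row in rows]


def compare_names(fasta_text, csv_text, has_header=True):
    """Report names present in one file but not the other (order ignored)."""
    in_fasta = set(fasta_names(fasta_text))
    in_csv = set(csv_names(csv_text, has_header))
    return {
        "missing_from_csv": sorted(in_fasta - in_csv),
        "missing_from_fasta": sorted(in_csv - in_fasta),
    }


fasta = ">seqA\nACGT\n>seqB\nGGCC\n"
annotations = "name,subtype\nseqB,C\nseqa,B\n"  # rows in a different order
report = compare_names(fasta, annotations)
print(report)
# {'missing_from_csv': ['seqA'], 'missing_from_fasta': ['seqa']}
```

In this example the row order is irrelevant, but "seqa" and "seqA" are treated as different names, reflecting the requirement that names match exactly.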

Columns may contain sequence annotation data, such as the viral subtype, patient code, viral load, sample date, and sample country. For details about the supported annotations and their required format, see the Annotation fields. This information will be stored in human-readable form in the comment field of the GenBank entry, making it available to researchers worldwide. Once the GenBank record is released, this information will be loaded automatically into the Los Alamos Sequence Database, making the data searchable through our search interface.

See an example of Fasta sequences and their CSV annotation data.

Please note:

This tool does not deposit your sequences; it only prepares them for deposit. Your results e-mail will contain instructions for submission to GenBank.
After deposit to GenBank, your entries will automatically appear in the Los Alamos Sequence Database, typically a few weeks later.
This tool may not find the correct protein translations for some SIV sequences. It works well for SIVmac and SIVsmm, but less well for other SIVs.

last modified: Thu Jan 21 13:17 2016
