IMAGEne Release Notes
Release 4.9
http://image.llnl.gov/image/imagene/4.9/bin/search
The data for this release was obtained on the following dates:
Human - OCT-10-2006
Mouse - OCT-11-2006
Rat - OCT-12-2006
Bos taurus - OCT-09-2006
Xenopus tropicalis - OCT-10-2006
Zebrafish - OCT-12-2006
- October 13 2006 - The 'Bos taurus' and 'Xenopus tropicalis'
species were added to IMAGEne. The results are available in
this release.
- The definition for predicted fulls has been updated.
Release 4.8
http://image.llnl.gov/image/imagene/4.8/bin/search
The data for this release was obtained on the following dates:
Human - DEC-23-2005
Mouse - DEC-22-2005
Zebrafish - DEC-25-2005
Rat - DEC-22-2005
- December 21 2005 - The species rat was added to the IMAGEne build.
The results are available in this release.
- The Alignments Java Applet will now display in Mozilla/Firefox for the Mac.
Release 4.7 (January 10 2005)
The data for this release was obtained on the following dates:
Human - JAN-06-2005
Mouse - JAN-05-2005
Zebrafish - JAN-05-2005
- There were no changes to the IMAGEne software in this release.
Release 4.6 (June 2 2004)
The data for this release was obtained on the following dates:
Human - MAY-27-2004
Mouse - MAY-26-2004
Zebrafish - MAY-26-2004
- May 5 2004 - Corrected a semantic bug in the alignment process that allowed, in a small
set of cases, the reverse complement of a high quality sequence match to be aligned to
the reference sequence even if the reverse complement was a poorer match than the original.
As a side effect of the fix, the KG process now runs faster, because fewer calls to the SIM
alignment algorithm are needed to align ESTs and clones to the reference sequence.
- May 5 2004 - Updated the display JavaScript so that the red highlighting for the current
contig works in non-IE browsers (Mozilla, Netscape).
- May 5 2004 - Updated the display JavaScript to correct a problem with the pull-down
menus for the Java display. The "Show Alignments" menu will no longer automatically reset
to "by clone" if "by est" is selected, and the contig is changed.
- May 7 2004 - Fixed the format of data sent to the HTML and Java displays. The data displayed
is not affected.
- May 11 2004 - Rewrote part of the L2 clustering algorithm to remove a cluster-singleton
overlap problem.
- May 19 2004 - Updated the FTP README.
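The orientation fix from May 5 2004 can be illustrated with a minimal sketch. This is not the IMAGEne code: `identity_score` is a hypothetical toy stand-in for the score a real aligner such as SIM would produce, and `best_orientation` is an illustrative name. The only point shown is that the reverse complement is used when it is strictly the better match.

```python
# Illustrative sketch only; identity_score is a toy stand-in for a real
# alignment score such as the one produced by SIM.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def identity_score(seq: str, ref: str) -> int:
    """Toy score: number of positions where seq and ref agree (no gaps)."""
    return sum(a == b for a, b in zip(seq, ref))


def best_orientation(seq: str, ref: str) -> tuple[str, str]:
    """Pick the orientation of seq that matches ref best.

    The original orientation wins ties, so the reverse complement is used
    only when it is strictly the better match.
    """
    forward = identity_score(seq, ref)
    reverse = identity_score(reverse_complement(seq), ref)
    if reverse > forward:
        return "reverse", reverse_complement(seq)
    return "forward", seq
```

Scoring both orientations once and keeping the winner also avoids any repeated alignment calls for the losing orientation.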
Release 4.5 (March 24 2004)
The data for this release was obtained on the following dates:
Human - MAR-16-2004
Mouse - MAR-15-2004
Zebrafish - MAR-15-2004
- March 15 2004 - Added zebrafish to the list of species processed by IMAGEne.
- March 15 2004 - Fixed a bug that caused the data in the *-Candidate_gold and *-Master FTP files to be swapped.
Release 4.4 (January 09 2004)
http://image.llnl.gov/image/imagene/4.4/bin/search
The data for this release was obtained on the following dates:
Human - DEC-19-2003
Mouse - DEC-19-2003
- June 3 2003 - A bug in the generation of the Tissues in IMAGEne XML file was causing some lines to be omitted, resulting in an improperly formatted XML file. This affected both the web-accessible version and the downloadable version of the file. All versions of the XML files prior to 4.4 contain this error.
Release 4.3 (June 12 2003)
The data for this release was obtained on the following dates:
Human - JUN-08-2003
Mouse - JUN-07-2003
- June 3 2003 - A subtle bug was causing problem clones to be allowed into the CG build.
- June 3 2003 - Made minor changes to the queries that collect sequence data from the database. These were needed because of improvements made to the production pipeline schema in the last couple of months.
- June 12 2003 - Changed the alignments applet to handle the occurrence of duality sequences (for lack of a better term: the sequence codes that can represent one or another nucleotide, e.g. R for purine, which could be either an 'A' or a 'G'). These codes are extremely rare, so we've chosen to represent them with a dash. This allows us to maintain the highly compressed data stream used to transmit data to the applet (which is what allows it to load so quickly over the internet).
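The dash substitution described for the alignments applet can be sketched as follows. This is an assumption-laden illustration, not the applet's code: `mask_ambiguous` is a hypothetical name, and the sketch assumes that every symbol other than A, C, G or T is treated the same way as the R example given in the note.

```python
def mask_ambiguous(seq: str) -> str:
    """Replace every base other than A, C, G or T with a dash.

    Assumption: all ambiguity codes (R, Y, N, ...) are handled alike;
    the release note only names R as an example.
    """
    return "".join(base if base in "ACGTacgt" else "-" for base in seq)


print(mask_ambiguous("ACRGT"))
```

Keeping the alphabet down to four bases plus one placeholder is what lets a compact fixed encoding stay unchanged.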
Release 4.2 (Feb 4 2003)
The data for this release was obtained on the following dates:
Human - JAN-31-2003
Mouse - JAN-31-2003
- Jan 13 2003 - Fixed a problem with Megablast that allowed it to miss identifying sequences that were too closely related. Because the CG process compares each EST against other ESTs and against itself, it was possible for Megablast to miss identifying an EST as homologous to any sequence, including itself, which would cause it to fall out of the build. This happened primarily with ESTs that had too many repeats in them. We worked with NCBI to identify the problem and verified that the patch they gave us fixes it.
- Jan 13 2003 - Two portions of the CG build were optimized: the first, which previously took 10 hours, was reduced to 3, and the second, which took 14 hours, was reduced to 2. Both speedups were attained by decreasing I/O requirements (i.e. less hard disk access).
- Jan 16 2003 - Added code to the beginning of the build to update our list of predicted fulls to include any new clones in the IRAK or IRAL collections.
- Feb 4 2003 - Fixed a problem where blast searching against all species would not return results from human matches.
- Feb 4 2003 - Further optimizations were made to distributed portions of the build, reducing total build time to 55 hours for a full NxN build of both human and mouse.
Release 4.1 (Oct 15 2002)
The data for this release was obtained on the following dates:
Human - OCT-07-2002
Mouse - OCT-08-2002
- Oct 15 2002 - Megablast replaces BLAST as primary local alignment tool in initial similarity search for KG process
- Oct 15 2002 - Megablast replaces BLAST as primary local alignment tool in L1 clustering for CG process
- Oct 15 2002 - The Search Results page has been enhanced. It now searches the Singletons data set when searching by accession, clone ID or sequence.
When searching by clone ID or accession, it logically groups database hits by the section from which they come (i.e. KG, CG, Singleton or Not found).
For an example try searching version 4.1 for these accessions (AA101995 AA632285 BM951361 H15813 AA000000) and with "All" for species.
- Oct 15 2002 - The Build Procedure document was revised for clarity, and an overall
flowchart of the process has been added to complement the document.
- Oct 15 2002 - Singletons FASTA data added to the FTP site. See FTP data.
- Oct 15 2002 - We compiled a list of links related to IMAGEne.
- Oct 15 2002 - A number of enhancements were made to help reduce build times. Coupled with our adoption of Megablast and with hardware enhancements, they reduced
the time to 106 hours for a full NxN build of both human and mouse.
Release 4.0 (Aug 9 2002)
The data for this release was obtained on the following dates:
Human - JUL-10-2002
Mouse - MAY-28-2002
- Aug 9 2002 - We now have clusters for mouse sequences!
- Aug 9 2002 - The description line for candidate gene clusters now includes fields for species and number of contigs in the cluster
- Aug 9 2002 - The 'Show Alignments' pull down was moved from the search page to the applet frame of the display page, making it much more interactive.
- Aug 9 2002 - We created a new document describing the build process, which is linked from the FAQ and the navigation bar.
- Aug 9 2002 - Singletons data added to the FTP site. See FTP data.
Release 3.4 (May 20 2002)
The dbest data used for this release corresponds to:
GenBank Flat File Release 128.0
Release Date: Feb 15 2002
Close-of-Data: Feb 13 2002
NCBI's Reference Sequence data is dated Apr 12 2002.
- May 16 2002 - Additional information is available in the
alignments applet. In clone view you can see the coverage of the clone next
to the clone ID, e.g. 510973:U, and in EST view you can see the endedness of
the sequence next to the accession number, e.g. AA100175:5.
- May 16 2002 - Some minor build performance issues were addressed
and further automation enhancements made.
- May 16 2002 - The alignments applet now communicates with our
server using compressed data. The result is much faster load times.
- May 16 2002 - The alignments applet now color codes bases
for easier comparison of sequences.
- May 16 2002 - The method used for sequence queries has been
changed. The submitted sequence is blasted against the known gene sequences
and then the candidate gene sequences, instead of against the ESTs in the
Imagene set. The result is much faster blast searching.
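The applet labels introduced on May 16 2002 (for example 510973:U in clone view and AA100175:5 in EST view) can be split into their two parts with a small helper. This is a hypothetical sketch for anyone post-processing such labels: `AppletLabel` and `parse_label` are illustrative names, and the meaning of each annotation value is left to the FAQ rather than interpreted here.

```python
from typing import NamedTuple


class AppletLabel(NamedTuple):
    identifier: str  # clone ID or accession number
    annotation: str  # text after the colon, e.g. "U" or "5"


def parse_label(label: str) -> AppletLabel:
    """Split an applet label such as '510973:U' at its last colon."""
    identifier, sep, annotation = label.rpartition(":")
    if not sep or not identifier or not annotation:
        raise ValueError(f"expected 'id:annotation', got {label!r}")
    return AppletLabel(identifier, annotation)


print(parse_label("510973:U"))
print(parse_label("AA100175:5"))
```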
Release 3.3 (Mar 04 2002)
The dbest data used for this release corresponds to:
GenBank Flat File Release 127.0
Release Date: Dec 15 2001
Close-of-Data: Dec 17 2001
NCBI's Reference Sequence data is dated Jan 10 2002.
- Mar 04 2002 - The L1 clustering of the Candidate Gene process has been optimized (as well as put onto the Sun Grid Engine), helping reduce the time necessary for this portion from weeks to days.
- Mar 04 2002 - The Known Gene clustering process, and portions of the Candidate Gene clustering process, now make use of a system for running jobs on multiple computers called the Sun Grid Engine. It has allowed us to more than double the number of CPUs used in the process, thus decreasing the length of time necessary for a build.
- Mar 04 2002 - FASTA, used in the known gene portion of Imagene, was upgraded to version 3.0 for this build. Also, Arian Smit advised us on settings to help get the best homology matches. As a result, our known gene clusters should be of a higher quality this time around.
- Mar 04 2002 - The data structures in our database that track Imagene data have been redesigned. The new schema allows us to relate data in new ways and has helped to give the new functionality in this build.
- Mar 04 2002 - ESTs that have been pulled from the NCBI database at the time we build the Imagene clusters do not take part in our clustering algorithms and thus do not appear in the Imagene data set.
- Mar 04 2002 - ESTs and/or clones for which we have problems listed in our problem database at the time we build the Imagene clusters do not take part in our clustering algorithms and thus do not appear in the Imagene data set.
- Mar 04 2002 - Clones have now been categorized into four types: Fulls, Predicted Fulls, Unknowns and Partials. See the Imagene FAQ for definitions of the categories.
Release 3.2.1 (Mar 09 2001)
The dbest data used for this release corresponds to:
GenBank Flat File Release 121.0
Release Date: Dec 15 2000
Close-of-Data: Dec 20 2000
NCBI's Reference Sequence data is dated Jan 19 2001.
- Feb 14 2001 - Clones coming from the MGC project and predicted to be full-length by the NCBI have been introduced into Imagene and labeled 'Predicted Fulls'. They are labeled 'PF' in the Imagene display and are colored green in the clone list and the Java based alignments applet.
- Feb 14 2001 - Upgraded the Blast version used to 2.1.2, which fixed a bug found in Blast that caused random cluster members to be dropped.
- Feb 14 2001 - We found that a default Blast parameter was limiting result sets to 500. Once we set the value arbitrarily high, we started getting much larger clusters against the reference sequences, and hence fewer clones were being fed to the Candidate Gene process, providing a much more accurate picture in the Candidate Gene set.
- Feb 14 2001 - Because of the huge growth of the image collection and the possibility of very large and redundant coverage in a cluster, we have limited the web display to show a max of 500 clones.
- Feb 14 2001 - The Java based alignments applet was improved to allow better error detection and improved memory performance.
- Feb 14 2001 - Due to increased data from all points (i.e. predicted fulls, over 5000 new reference sequences and many more sequence verified clones), the quality of Imagene 3.2.1 clusters is vastly improved over Imagene 3.1 clusters.
- Feb 14 2001 - We are now tracking numerous cluster statistics from each build, and analyzing the changes in the clusters over time. This adds to the overall quality of Imagene clusters, and will aid us in the maintenance of the IMAGE collection.
- Feb 14 2001 - Web search and display results as well as build process databases have been implemented using XML to allow a more generalized communication protocol in anticipation of allowing Imagene to provide clusters from non-IMAGE datasets.
- Feb 14 2001 - We are now making use of a nightly process that repeat masks ESTs as we are retrieving them from GenBank. As a result we no longer repeat mask the reference sequences at build time but are making use of repeat masked ESTs instead. This has greatly improved the speed of the known gene build process.
Release 3.1 (Jun 29 2000)
The dbest data used for this release corresponds to:
GenBank Flat File Release 116.0
Release Date: Feb 15 2000
Close-of-Data: Feb 18 2000
NCBI's Reference Sequence data is dated Apr 30 2000.
- Jun 30 2000 - Additional functionality added to Imagene to distinguish candidate gene contigs from candidate gene clusters. IDs in the form of C##### represent clusters and IDs in the form of C######-## represent a contig of that cluster.
- Jun 30 2000 - Functionality added to the Search and Search Results page to allow any match to a contig to also show matches to its associated cluster. The user can then view just the contig or its entire cluster.
- Jun 30 2000 - New Display interface that vastly improves the way clusters are viewed. The differences between known gene clusters, candidate gene clusters and candidate gene contigs is now clearer and more intuitive.
- Jun 30 2000 - A concurrency problem with the server side cache that could allow display data to be garbled has been fixed.
- Jun 30 2000 - The set of singletons has been reduced greatly by filtering out any clone whose sequence does not contain at least 50 base pairs of contiguous non-repeat sequence.
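The singleton filter from Jun 30 2000 can be sketched roughly as below. This is an illustration under stated assumptions, not the build code: it assumes the sequence has already been repeat masked and that masked bases appear as N or X, and the names `longest_non_repeat_run` and `keep_singleton` are hypothetical.

```python
import re

# Assumption: masked (repeat) bases appear as N or X; anything else counts
# as non-repeat sequence.
_UNMASKED_RUN = re.compile(r"[^NXnx]+")


def longest_non_repeat_run(masked_seq: str) -> int:
    """Length of the longest contiguous stretch without masked bases."""
    runs = (len(m.group()) for m in _UNMASKED_RUN.finditer(masked_seq))
    return max(runs, default=0)


def keep_singleton(masked_seq: str, min_run: int = 50) -> bool:
    """True if the sequence has at least min_run contiguous non-repeat bases."""
    return longest_non_repeat_run(masked_seq) >= min_run
```

A sequence with 48 unmasked bases on each side of a masked stretch is rejected, because the requirement is on a single contiguous run and not on the total unmasked length.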
Release 3.0 (Feb 27 2000)
The dbest data used for this release corresponds to:
GenBank Flat File Release 115.0
Release Date: Dec 15 1999
Close-of-Data: Dec 10 1999
NCBI's Reference Sequence data is dated Jan 28 2000.
- May 31 2000 - An incorrect value of 244,409 was given on the search page as the number of singletons for this version. The correct value of 237,678 now appears in its place.
- Mar 27 2000 - A server side cache was developed to speed up delivery of IMAGEne clusters
- Mar 27 2000 - Additional information about Candidate Gene clusters can now be obtained by clicking on the cluster ID button in the display
- Jan 25 2000 - Imagene 3.0 now includes Candidate Gene Clusters (not associated with known genes)
- Jan 25 2000 - Imagene 3.0 has been almost completely redesigned internally to provide access to Candidate Gene Clusters, increased performance, reduced data for improved transmission times and improved search ability. Future versions will have improved functionality that takes advantage of the new internal structure.
- Jan 25 2000 - Additional search criteria includes Cluster ID and TIGR ID.
- Jan 25 2000 - Enhanced ranking now sorts first by length and then by library, which is reflected in the cluster display and the Master and Candidate Gold Listings
- Jan 25 2000 - The GenBank column now lists clone and then all Accession numbers that reference that clone. Click on the GenBank accession number to search using Entrez.
- Jan 25 2000 - Full insert sequences of I.M.A.G.E. clones are now obtained from the primate section of GenBank, and can be found in Known Gene and Candidate Gene Clusters.
Release 2.0 (Sep 16 1999)
The dbest data used for this release corresponds to:
GenBank Flat File Release 113.0
Release Date: Aug 15 1999
Close-of-Data: Aug 06 1999
NCBI's Reference Sequence data is dated Aug 24 1999.
- Oct 01 1999 - The bug that caused alignments in the alignments window to be cut off and to not list all of the clones or ESTs has been fixed.
- Sep 14 1999 - Due to increased parallelism and optimizations the time it takes to create an Imagene build has been decreased by about 75%.
- Sep 14 1999 - The clusters are now grouped using Blast v2.0.8. Previous grouping had been done with v1.4.
- Sep 14 1999 - The master gene list now comes from NCBI's Reference Sequence. To learn more about the Reference Sequence click here.
- Sep 14 1999 - A bug causing the first and last records of our blast database to be dropped from the cluster has been fixed.
- Sep 14 1999 - An error that caused a large number of 'High Quality Sequence stop' notations to be ignored has been fixed. Therefore, a large number of ESTs are now properly truncated to their High Quality length.
- Sep 14 1999 - This version now includes 'Reversed' clones
- Sep 14 1999 - This version throws out any EST whose 'High Quality' sequence length is less than 100.
- Sep 14 1999 - Three organizations who sequence verify IMAGE clones have been added to the 'Sequence Verified By' column of the display. They are: Genome Systems, Inc. (GS), Washington University (WASHU) and The Baylor College of Medicine Human Genome Sequencing Center (HGSC).
- Sep 14 1999 - This version employs improved error checking and automation in the build process. It also has an improved version tracking system.
- Sep 14 1999 - The master gene sequence was RepeatMasked using RepeatMasker version date 4-21-99.
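Two of the Sep 14 1999 changes, truncating ESTs at their 'High Quality Sequence stop' and discarding ESTs whose high-quality length is under 100, combine naturally into one step. The sketch below is a hypothetical illustration rather than the build code: it assumes the stop value is the 1-based position of the last high-quality base, and `truncate_to_high_quality` is an invented name.

```python
from typing import Optional

MIN_HQ_LENGTH = 100  # ESTs with a shorter high-quality region are dropped


def truncate_to_high_quality(seq: str, hq_stop: Optional[int]) -> Optional[str]:
    """Trim an EST to its high-quality length, or return None to discard it.

    hq_stop is assumed to be the 1-based position of the last high-quality
    base; None means the record carries no such notation.
    """
    trimmed = seq if hq_stop is None else seq[:hq_stop]
    return trimmed if len(trimmed) >= MIN_HQ_LENGTH else None
```

Ignoring the stop notation, as the earlier bug did, would leave low-quality tail sequence in the build and let ESTs through that should have failed the length check.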
Release 1.4 (Jan 28 1999)
The dbest data used for this release corresponds to:
GenBank Flat File Release 110.0
Release Date: Dec 15 1998
Close-of-Data: Dec 05 1998
NCBI's humannr known gene data is dated Jan 14 1998.
- Jul 14 1999 - A column called 'Seq Verified By' has been added to the Imagene display. If the clone or EST has been sequence verified by some organization, then a link to that organization's home page will appear. Currently, Research Genetics is the only participating organization.
- Jun 29 1999 - Recent problems with the 'Search by' options: 'IMAGE Clone ID', 'GB Accession' and 'Sequence' have been fixed.
- Jun 29 1999 - Clicking on a Clone ID in IMAGEne's search results now brings up an NCBI Entrez Nucleotide Query result for the corresponding ESTs (which is the method preferred by NCBI). It also fixes a problem with unwanted ESTs appearing in the list.
- Mar 19 1999 - This release updated the accession number links to the clusters. These links had been unchanged since Version 1.0.
Release 1.3 (Nov 05 1998)
The dbest data used for this release corresponds to:
GenBank Flat File Release 108.0
Release Date: Aug 15 1998
Close-of-Data: Aug 10 1998
NCBI's humannr known gene data is dated Oct 01 1998.
- Nov 05 1998 - Recent problems with searching by accession number should be resolved.
Release 1.2 (Oct 27 1998)
The dbest data used for this release corresponds to:
GenBank Flat File Release 108.0
Release Date: Aug 15 1998
Close-of-Data: Aug 10 1998
NCBI's humannr known gene data is dated Oct 01 1998.
Release 1.1 (Jul 29 1998)
The dbest data used for this release corresponds to:
GenBank Flat File Release 107.0
Release Date: Jun 15 1998
Close-of-Data: Jun 10 1998
NCBI's humannr known gene data is dated Jul 07 1998.
- Jul 29 1998 - In previous releases the gene data was directly supplied by NCBI staff.
Release 1.0 (Apr 25 1998)
The dbest data used for this release corresponds to:
GenBank Flat File Release 105.0
Release Date: Feb 15 1998
Close-of-Data: Feb 08 1998
NCBI's humannr known gene data is dated Feb 06 1998.
PROTOTYPE Release 0.6 (1997)
The dbest data used for this release corresponds to:
GenBank Flat File Release 101.0
Approximately June, 1997