Report from the Zebrafish Genomics Initiative
Grantees Meeting
March 16, 1999
Progress reports from the following grantees are available on
this site:
Mark Fishman
William Talbot
Len Zon
Stephen Johnson
Marnie Halpern
Monte Westerfield
Progress report of Mark Fishman, Massachusetts General
Hospital
Project: Construction of a Genetic Linkage Map of Zebrafish
Grant Number: R01DK55390
Stated aims:
Increasing the density of microsatellite markers on the zebrafish
genetic map to a total of 8,000 (providing an average inter-marker
distance of 0.3 cM).
Update:
Our map is constructed using microsatellite repeats, primarily CA
repeats. These markers are easy to use and occur frequently in the
zebrafish genome. We have made them readily available from Research
Genetics. We began about 8 years ago with a small effort, supported
by the NCRR, with the goal of placing 100 markers on the map. Because
of the sense from the community that this was to be the essential
genetic map for cloning and for anchoring of physical maps, we sought
and obtained industrial sponsorship. This allowed us to generate
a first edition map with 700 markers and, by the end of last year, a
next installment of 2,000. Assuming the zebrafish genome to be 2,400
cM, this gives a marker density of 0.8 markers per cM. The map is
freely available on the web, and all markers are available, without
restriction, from Research Genetics.
With the new NIH funding, we will further increase the density
and usefulness of our map. Our goals are: (a) to increase the density
of markers to about three per cM, and (b) to develop informatics
both for keeping, collating, and analyzing data and for distributing
it to the community in a "user-friendly" way.
Trajectory as of 3/16/99:
To reach our goal of 3 markers per cM, we will need approximately
7,200 markers on the map; thus, we need to add 5,200 additional
markers. From prior experience we know that approximately ten percent
of the microsatellite-containing clones we isolated from our genomic
library will end up on the map; the losses are taken early. After
sequencing, clones are omitted for a number of reasons. Many are
duplicates of clones we sequenced previously; others are too short
to give reliable primers or do not give PCR products in the 200-300
base pair range (a range we chose, more or less arbitrarily, so
that all markers could be resolved using the same gel conditions).
In other words, to generate 7,200 markers we will need to sequence
about 72,000 clones.
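The estimate above is simple arithmetic and can be checked with a short sketch. The function and variable names are illustrative only; the inputs (a 2,400 cM genome, 3 markers per cM, 2,000 markers already mapped, and a yield of roughly 10 percent) are the figures quoted in this report.

```python
def clones_needed(target_markers: int, yield_percent: int) -> int:
    """Clones to sequence so that target_markers survive to the map.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    return -(-target_markers * 100 // yield_percent)

GENOME_CM = 2400        # assumed genetic length of the genome, in cM
TARGET_DENSITY = 3      # goal: markers per cM
ALREADY_MAPPED = 2000   # markers on the current map
YIELD_PERCENT = 10      # roughly 10% of sequenced clones reach the map

total_markers = GENOME_CM * TARGET_DENSITY        # 7200
new_markers = total_markers - ALREADY_MAPPED      # 5200
print(total_markers, new_markers)
print(clones_needed(total_markers, YIELD_PERCENT))  # 72000
```

Because the yield is only approximate, the result should be read as an order-of-magnitude planning figure rather than an exact count.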
Progress to date:
Map                    | Clones Sequenced | Primer Sets | Polymorphic | Mapped
705 map                | (not available)  |             |             | 705
2k map                 | 12576            | 3250        | 1827        | 1295
TOTAL (previous maps)  | 16384            | 4183        | 1827        | 2000
Progress as of 3/16/99 | 3808             | 933         |             |
Project goals          | 69921            | 18069       | 10158       | 7200
Remaining              | 53537            | 13886       | 8331        | 5200
Our goal is to add 400-500 new markers to the map every quarter,
on average.
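As a rough check on what this rate implies, the sketch below divides the 5,200 markers still needed by the stated quarterly range. The function name is hypothetical, and the calculation assumes a constant rate with no ramp-up period.

```python
def quarters_to_goal(remaining_markers: int, markers_per_quarter: int) -> float:
    """Quarters needed to add remaining_markers at a constant rate."""
    return remaining_markers / markers_per_quarter

REMAINING = 5200  # additional markers needed to reach 7,200 on the map

for rate in (400, 500):  # stated range of new markers per quarter
    quarters = quarters_to_goal(REMAINING, rate)
    print(f"{rate} markers/quarter: {quarters:.1f} quarters "
          f"({quarters / 4:.2f} years)")
```

At the stated rates, the remaining markers would take between about 10 and 13 quarters to add.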
Update on informatics: Don Jackson (Massachusetts General):
While gearing up on the experimental side, we have been working
to improve our methods for information management and distribution.
Our goal is to develop a database system that will allow us to
easily track and consolidate information in-house, to submit it
easily to the community resources, and to present it on a local
server (providing multiple means of access to the same data). We
want to make sure that the data is presented as rapidly and accessibly
as possible. We have begun setting up an on-line database using
data on the existing 2,000 markers. We are using this data to test
and refine the design of our database so that as new markers are
mapped they can be shared with the community as soon as possible.
Our first decision was selecting a database program. In the past,
we used Filemaker Pro (a Macintosh-based system), but it could not
easily handle the amount of information for even a 2,000-marker
map. We needed a system with the power to search large data sets
rapidly, to consolidate data from all the steps of marker generation
into a single resource, and to allow simultaneous multi-user access,
with all of the functions involved, whether entering or accessing
data, automated. We also needed a system that would allow
us to display our data on the Web and at the same time be inexpensive.
We selected PostgreSQL (Postgres), a publicly available database program
for UNIX/Linux computers, for the following reasons:
Power: Postgres is a relational database, allowing rapid queries
of large data sets while accommodating the various types of experimental
data generated in the course of making the map.
Flexibility: Postgres is SQL compliant and internet-aware, allowing
direct connections to other databases. For example, we are coordinating
with Eck Doerry to allow ZFIN to directly query our database via
the internet. This will allow ZFIN to update map data in the fastest
and easiest possible manner.
Programmability: Interfaces are available for a number of programming
languages, including C, Perl, and Java. We will use CGI scripts
written in Perl to implement web-based user interfaces for local
and remote access to the database.
Cost: the Postgres software is free. Our database server is a
Pentium II computer running the FreeBSD UNIX operating system
and the Apache web server (cost: $1600).
Our database contains four primary tables: the first table contains
sequence information for microsatellite clones that we have isolated;
the second has information on primers (designed using STS-Pipeline,
an automated sequence analysis package); the third has all the genotyping
information; the fourth has position information for each marker
placed on the map. We also have supporting tables showing information
on homologies, targets, and eliminated sequences (undesired sequences,
duplicate hits, etcetera).
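As a rough illustration of the four-table layout described above, the following sketch builds a comparable schema and runs one query of the kind a map display would need. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server. All table names, column names, and sample rows are hypothetical and are not taken from the actual MGH database.

```python
import sqlite3

# Hypothetical schema mirroring the four primary tables in the text.
SCHEMA = """
CREATE TABLE clone_sequence (      -- isolated microsatellite clones
    clone_id       TEXT PRIMARY KEY,
    sequence       TEXT NOT NULL
);
CREATE TABLE primer_set (          -- primers designed for a clone
    primer_id      INTEGER PRIMARY KEY,
    clone_id       TEXT NOT NULL REFERENCES clone_sequence(clone_id),
    forward_primer TEXT NOT NULL,
    reverse_primer TEXT NOT NULL
);
CREATE TABLE genotype (            -- genotyping results per primer set
    primer_id      INTEGER NOT NULL REFERENCES primer_set(primer_id),
    individual     TEXT NOT NULL,
    allele         TEXT NOT NULL
);
CREATE TABLE map_position (        -- position of each mapped marker
    clone_id       TEXT PRIMARY KEY REFERENCES clone_sequence(clone_id),
    linkage_group  INTEGER NOT NULL,
    position_cm    REAL NOT NULL
);
"""

def markers_in_group(conn, linkage_group):
    """Return (clone_id, position_cm) for one linkage group, in map order."""
    rows = conn.execute(
        "SELECT clone_id, position_cm FROM map_position "
        "WHERE linkage_group = ? ORDER BY position_cm",
        (linkage_group,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up sample rows, inserted out of map order on purpose.
conn.execute("INSERT INTO clone_sequence VALUES ('example_a', 'CACACACA')")
conn.execute("INSERT INTO clone_sequence VALUES ('example_b', 'CACACACACA')")
conn.execute("INSERT INTO map_position VALUES ('example_b', 1, 12.5)")
conn.execute("INSERT INTO map_position VALUES ('example_a', 1, 3.0)")
print(markers_in_group(conn, 1))
```

Keeping map positions in their own table means a marker can be sequenced and genotyped before it is placed, and a query like the one above always reflects whatever has been mapped so far.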
Demonstration of Web site:
(URL: http://zebrafish.mgh.harvard.edu/mapping/ssr_map_index.html)
This site offers two display options:
(a) graphic display of all markers in a linkage group (or fragment
thereof).
(b) detailed report on a single marker.
Option (a) dynamically generates a linkage map based on information
in the database; i.e., as soon as a marker has been mapped and
added to the database, it will appear on the Web site when someone
pulls up that linkage group. Thus, the maps are always up to date.
The linkage group figures are clickable image maps, allowing users
to easily retrieve detailed information on any marker by clicking
on it in the map figures. This displays a detailed report (as in
option (b)). This option provides a choice of information: sequence
information, BLAST results, characterization information on different
strains, and detailed mapping information. This view includes links
to the relevant sequence entries in NCBI and to map marker entries
in ZFIN.
New information will be flagged according to the date it was added
to the map. So far, we have deposited 2,600 sequences; that includes
1,800 sequences for markers and 800 other EST sequences that we
are not calling markers.
Progress report of William Talbot, Stanford University
Project: Characterizing the Zebrafish Genome
Grant Number: R01DK55378
Stated aims:
1. Constructing a framework map in a homozygous diploid mapping
panel by assigning 500 publicly available simple sequence length
polymorphisms (SSLP; CA-repeat) markers. To be completed by end
of first year.
2. Genetically mapping 3,000 genes in a homozygous diploid mapping
panel by scoring single-strand conformational polymorphisms (SSCPs)
in 3'UTRs and other nonconserved regions. First-year goal: 500.
3. Implementing informatics: Streamlining data management and
allowing rapid public access to map information generated by the
project, including comparative analysis, by means of a WWW interface.
Update:
We are working to create an integrated linkage map for the zebrafish
with our collaborators: (1) John Postlethwait at University of Oregon,
(2) informatics: Ruben Abagyan's group at NYU (now re-located at
Scripps), (3) Ron Davis' group at the Stanford DNA Sequencing and
Technology Center.
Rationales include
1. Functional genomics. A dense gene map for the zebrafish will
facilitate the identification of candidates for mutations.
2. Comparative genomics. Genes are uniquely suited markers for
comparing the structures of vertebrate genomes. Chromosomal segments
conserved among vertebrates can be identified by comparing locations
of genes in the zebrafish and their counterparts in humans.
Specific goals and progress on them:
1. Framework map construction. We have assembled a mapping panel
with 47 homozygous diploid individuals (heat shock progeny from
2 C32/SJD F1 females) and collected enough genomic DNA to score
more than 10,000 markers. The Postlethwait lab has distributed the
panel to three additional mapping labs (Steve Johnson, Len Zon,
Dave Beier). 189 SSLP markers from the Fishman group have been scored
on the panel. We expect to complete the framework map by scoring
~300 more SSLP by fall 1999.
2. Mapping 3,000 genes and ESTs by SSCP. We have scored more than
140 SSCPs linked to genes and ESTs. 99 of these have been assigned
tentative map positions; analysis of the rest is in progress. We
plan to score at least 350 additional SSCPs by fall 1999.
We have also provided more than 600 of the primer sets we have
synthesized to the Haffter group radiation hybrid mapping project.
They have scored many of these markers, enabling a straightforward
comparison of the genetic and physical maps.
3. Informatics tools: We have completed an SQL database for target
selection and primer design. The database is accessible to project
members via the WWW (http://saturn.med.nyu.edu:8080/zfish/pub/).
We have designed primers for more than 2000 genes and ESTs in GenBank,
of which about half have been synthesized.
We have also developed a tool for semi-automated phylogenetic comparisons.
Starting with a mapped sequence, the system does BLAST searches,
retrieves the 50 closest neighbors, and assembles a phylogenetic
tree. At that point, we analyze the trees and decide which human
genes are likely orthologues, or closest counterparts, to the zebrafish
genes. Our comparative analysis has identified 28 chromosomal segments
conserved between zebrafish and human.
We envision two protocols for data release.
1. Refined maps extensively checked for discrepancies. We will
release the first of these (with 500+ markers) by July 1999 and
generate a new version every 4-6 months thereafter.
2. Between releases of refined maps, we will make our complete
map data set available for download in MapManager format. This will
allow any user to take the data and generate their own map using
simple features of the MapManager software. These interim releases
will contain preliminary map assignments, so it will be the responsibility
of users in the community to evaluate the map data and decide which
assignments are reliable enough for their purposes.
We will try to make it clear on the web site that preliminary assignments
are not as reliable as assignments in the refined maps, for which
all discrepant data points are identified and re-tested. We could
devote more effort toward releasing refined maps more frequently,
but this would divert effort from mapping additional genes. We hope
this two-tier release system will provide rapid access to the map
data for sophisticated users and at the same time allow us to release
refined maps at a reasonable frequency.
Regarding integration of genetic maps: We have created a new haploid
mapping panel (now published in Genome Research 9: 334-347). We
scored 389 polymorphisms for a total of more than 18,000 genotype
assays (about 10 percent of the scope of the current heat shock diploid
mapping project). These polymorphisms included 104 new genes/ESTs,
217 MGH SSLP markers, and 53 previously mapped genes. This facilitates
comparison between the SSLP and gene maps because there are now
quite a few common markers.
Recently we have mapped about 20 genes from the Thisses in Strasbourg.
Their group is systematically examining expression patterns of clones
in cDNA libraries, and they have provided 3' EST sequences for genes
with interesting patterns. We plan to expand this mapping effort
as part of our project, with the aim of adding map position to the
Thisses' database of expression patterns and sequences.
Progress report of Len Zon, Boston Children's Hospital
Project: Construction of a Genetic Linkage Map of Zebrafish
Grant Number: R01DK55381
Stated aims:
1. Developing an RH map of the zebrafish genome.
A. Comparing four RH panels for retention frequency and resolution.
B. Constructing an anchored framework map of microsatellite
markers and cloned cDNAs on the RH panel with the most appropriate
resolution.
C. Positioning 5,000 EST markers on the RH panel.
2. Distributing information and RH information by means of a
World Wide Web site.
Update:
Members of the team:
Marc Ekker
In charge of panel mapping: Yi Zhou
Regarding the RH panel mapping project: Over the next 2 months,
decisions will be made about which panel to use and how the project
will mature. In particular, we will have to choose between the
Ekker panel and the Goodfellow panel.
Why do we need radiation hybrid panels? The real reason is to establish
linkage to the zebrafish genes and candidates.
The goal: People in the community will want to know whether they
have a candidate for the mutation. We can take candidate genes and
put them on the panel, and integrate to see whether co-localization
occurs with a mutant gene.
We now have two positional cloning projects in which the RH panels
have helped. In one of these cases we have actually done a chromosomal
walk and found a candidate gene very close to the linked
marker. When we sequenced the candidate gene, we found
a stop codon, indicating that it is clearly the right gene.
We'd like this to happen more and more frequently. In other words,
the effort is to make the RH panel accessible to the community so
that they won't have to do too many positional cloning projects.
Update (Ekker):
The panel we made used zebrafish AB9 as donor and
mouse B78 as recipient.
We first made some tests with varying doses of radiation, since
this influenced the retention and the average fragment size in the
panel. We did preliminary characterization of three panels, produced
at doses of 3,000, 4,000, and 5,000 rads, and looked for retention
rate and fragment size.
From there we created what we felt would be an optimal panel, with
retention rate as determining factor. This panel of 93 lines was
extended so we had at least 10 mg DNA per line, with possibly one
or two exceptions. This panel has been distributed to the community.
(Funded under contract with the National Institute of Child Health
and Human Development (NICHD).)
Since the last meeting, we met for genotyping of this RH panel.
It was a collective effort, obtaining data from my lab, as well
as from Zon, Johnson, Dawid, Hudson in Montreal, and others.
We tested around 1,200 markers. The overall retention was 22 percent,
varying among linkage groups from the lowest, linkage
group 12 (13 percent), to the highest, linkage groups 20 and 14 (at
36 percent retention).
We had the opportunity to compare these with the Goodfellow panel,
and for many of the linkage groups there seems to be a good correlation.
But some linkage groups are more represented in our panel, and some
are better represented in the Goodfellow panel (e.g., linkage group
5).
All of the analysis was done by Igor Dawid's group using RHMAPPER.
(Note that analysis of the Goodfellow panel was done by another
program, called SA Mapper.)
From these data we determined the average fragment size at 14.8
Mb (1 cR = 148 kilobases). (The average fragment size in the Goodfellow
panel is about three times smaller.)
Retention was about the same: 22 percent for ours, 18 percent for
the Goodfellow panel. This confirms our thinking in September:
our panel probably contains a smaller number of larger fragments,
and the Goodfellow panel a larger number of smaller fragments.
With the data, we established a framework map with 703 markers
(total map size, 11,501 centirays). You can calculate potential
resolution to about 750 kb.
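The reported conversion factor (1 cR = 148 kb) lets the framework map length be restated in physical units. The sketch below does only that conversion and an average marker spacing. It does not reproduce the 750 kb resolution estimate, whose derivation is not given here. The function and constant names are illustrative.

```python
KB_PER_CR = 148           # from the report: 1 cR = 148 kilobases
MAP_LENGTH_CR = 11_501    # total framework map size in cR
FRAMEWORK_MARKERS = 703   # markers on the framework map

def cr_to_mb(centirays: float, kb_per_cr: float = KB_PER_CR) -> float:
    """Convert a distance in cR to megabases using a fixed kb-per-cR factor."""
    return centirays * kb_per_cr / 1000

map_length_mb = cr_to_mb(MAP_LENGTH_CR)
print(f"map length: {map_length_mb:.0f} Mb")
print(f"mean marker spacing: {map_length_mb / FRAMEWORK_MARKERS:.2f} Mb")
```

Under these figures the framework map spans roughly 1,700 Mb, with framework markers a little over 2 Mb apart on average.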
We tested about 300 EST sequences, CA repeats, and cloned cDNAs.
So far, the coverage (the percentage of times we were able to determine
map position for these sequences) is around 87 percent. (The Goodfellow
panel's coverage is 83-84 percent.)
In progress:
1. Establishing a Web site where people can see the data and also
utilize the framework map; collaborating closely with ZFIN.
2. In the coming 3 months, planning to work on more weakly covered
regions of the RH map (i.e., either a lower density of markers,
or where the linkage between markers doesn't give as strong a score
as others).
3. Will make more detailed comparisons between the Goodfellow (or
Research Genetics) panel and ours, among other things by analysis with
the same program (whether RHMAPPER or SA Mapper).
The groups working on the RH panel are working very closely together.
We have had at least one phone conference including NIH members.
The goal is to get data analyzed in a way that we can make sense
of it, to make a choice as to how to move further.
Our lab's major contribution is that we have done seven chromosomal
walks, so we have the ability to actually test the resolutions of
the panels. We have typed four of those walks. We are happy with
the data.
The one walk for which we have physical data as well as genetic
data is for the sauternes gene.
If you look at the Research Genetics panel and look at the order
that is generated from the markers, the order matches the order
that we got from Research Genetics and from the physical distances.
If you look at the Marc Ekker panel, there are inversions; so it
is not exact. In other words, for three of the four walks we have
looked at, the Goodfellow panel predicts the order more accurately
than does the Ekker panel (23.8 percent).
However, for one of the walks, the Research Genetics panel bunches
all these markers in roughly a 200 kb region, but the Ekker panel
can resolve that with the correct genetic order. In short, it is
good that we have two panels for positional cloning projects.
Although for positional cloning projects it is great to have the
two panels, in choosing the panel that will be utilized to put 5,000-10,000
ESTs on it, we are going to have to make a judgment call.
(Re genetic order: Research Genetics panel, 12.8 percent;
Marc Ekker panel, 14.7 percent.)
The data are almost available. In theory, we will analyze the data
with the same program and will use robots in 384-well format, but the
system will not be set up for another month or two.
Progress report of Stephen Johnson, Washington University
School of Medicine
Project: Integrated Zebrafish Genomic Resources
Grant Number: R01DK55379
Stated aims:
1. Generation of expressed sequence tags (ESTs) from various tissues
and stages of zebrafish development.
A. Oligonucleotide hybridization fingerprint clustering of 278,000
independent zebrafish cDNA clones from various tissues and developmental
stages to be analyzed to identify clusters, each likely to represent
a different zebrafish gene.
B. Generation of 5´ and 3´ sequencing reads from representative
cDNAs of up to 50,000 different clusters.
2. Sequence tagged site (STS) development from EST sequence and
generation of a radiation hybrid map.
A. Generation of 7,500 STS markers from 3´ EST reads.
B. Genotyping of 2,500 EST-based STSs on a radiation hybrid
(RH) panel. Provision of up to 5,000 EST-based STSs for collaborating
RH typing projects (Len Zon) to generate a 10,000-marker RH map.
Markers genotyped on the RH panel to include markers identified
as SNPs and genotyped on meiotic panels (i.e., aim 3), allowing
maximal integration between genetic and physical maps.
C. Improvement and maintenance of inbred genetic strains.
Update:
A. Informatics
Everyone has access to the data as soon as we do, and can visit
our home page (Washington University Zebra Fish Genome Resources) and
link to our EST project. We do not yet have the RH mapping project
on line or have access to a panel, but in a few weeks we will have
a link to an integrated map.
With NCBI or BLAST, you can search with a sequence, keyword, or
name of clone for what you might be interested in, but with NCBI
it is actually very difficult to browse. We developed a way to browse
groups of sequences on the web. This enables users to window-shop
for possibly interesting clones, or to detect possible errors in
submission that might interfere with accurate clone retrieval. Early
on, we found that when we did our annotation and assigned
submission numbers corresponding to well positions, the 3' reads
sometimes did not match the 5' reads in register. Instead, they
may correspond to reads apparently a few wells off, or there may
be no correspondence between 3' reads and 5' reads. This can result
from lane tracking errors on the sequencing gel, or alternatively
from labeling errors during sequencing chemistry. Our browsing mode
gives zebrafish researchers a chance to check that clones are
likely to correspond to their archived wells prior to
ordering. We have recently resequenced ~2% of the EST project to
identify lane tracking or labeling errors that affect large
numbers of clones, which will allow the appropriate corrections to
be made for more accurate clone retrieval.
B. EST Project
Total estimated zebrafish ESTs in database: 13,252 (100 percent):
Source           | ESTs  | Share
Fishman (heart)  | 1971  | 15%
Gong (Singapore) | 1080  | 8%
Talbot (NYU)     | 180   | 1%
WashU            | 10021 | 76%
Total            | 13252 | 100%
Currently we are shipping about 700 ESTs a week from NCBI to our
own databases. There is about a 1-week lag; we are working on reducing
it.
Now that we have done some sequencing, we can ask how well the
project is working. If you did not precluster the library, if you
did not normalize it somehow, you would find you were sequencing
some of the highly expressed genes over and over again; i.e., the
problem in the EST project is to remove the redundancy. That is
what Matt Clark's project in Berlin is doing; what we are asking
is how well it works.
Example: the fin:
About 3,500 reads, of which about 13 percent are annotated as ribosomal
protein, similar to Talbot's project (about 16 percent). By comparison, the fingerprinting
library we have been using has only about 3 percent ribosomal protein.
Some of these proteins are duplicated in the library (hit 2 or 3
times). So the project isn't working perfectly, but pretty well.
We are happy enough with it for the next year.
We are starting with two libraries for the first part of the project:
a late somitogenesis zebrafish embryo library and an adult liver
library. After that, the fingerprint project will add additional, novel
clones from shield stage, fin regeneration, kidney,
brain, and nose libraries. We would also like to find some other stages, e.g.,
when organogenesis is getting underway (2-3 days of embryonic development).
In part, the choice of libraries was up to Matt Clark: it was clear
we needed good representation at the early embryo stage, since most
people in zebrafish are working on early embryo. Others were just
what we could get into the pipeline at the beginning of the project.
We found some good brain libraries, for instance, but we do need
more. Part of the project is to get more into the pipeline.
C. Regarding RH panels
The goal is to type a lot of the genes on an RH panel in order
to create a high density of markers, and so that we know genes in
regions of chromosomes that will give us ideas to send.
When we compared two panels on a fairly small sample, we found
25.4 percent retention of markers on the Ekker panel and about 13-14
percent on the Goodfellow panel. Overall, regarding the whole genome,
Goodfellow has 18.3 percent retention, and Ekker 20.7 percent (1,000
markers). (That is not a random distribution.)
More recently on this panel, we have been typing ESTs (not mapping,
because we do not yet have access to the data), and finding 23.4
percent retention.
In addition, I've been told that approximately 90 percent of ESTs
we're typing are actually placed on the map and are located somewhere
with high confidence. With the Goodfellow, the figure is more like
80 percent.
We have good retention, good reliability, and, likely, good resolution.
We will probably use the Ekker panel.
Our contribution to the genotyping effort so far: We have to keep
things going through the "factory"; we have genotyped 59 CA repeats,
34 genes, and 125 ESTs, for a total of 218 markers.
We are currently funded for 800 markers a year, so we are a little
behind schedule. But we are now typing 20 markers a week. With a
little more money, by the end of the year we could probably scale
to 30 (in order to meet our original goal of 1,800 markers a year).
How we present the data:
At the last meeting on Non-mammalian Model Systems, many of us
who do genomics in zebrafish were "politely reprimanded" for not
integrating all the different maps, so I have been working hard
over the past month to integrate them.
Progress report:
365 STSs on the haploid map.
3,077 cM.
215 SSLP/SSR.
150 genes on this panel from John Postlethwait's lab and others
(120 from Washington University).
We will integrate gene map and Massachusetts General Hospital (MGH)
map; it, in turn, will give a framework of markers for presenting
the data in a way that you can look at the map and say, "I can now
draw information from other projects as well." (Time estimate: in
about 3 weeks.)
Progress report of Marnie Halpern, Carnegie Institution
of Washington
Andreas Fritz, Emory University
Project: Generation of a Deletion Panel for the Zebrafish
Genome
Grant Number: R01DK55390
Stated aims:
1. Collecting, preserving, and cataloguing existing deficiency
and translocation strains.
2. Recovering and characterizing recently isolated and newly
gamma-ray-induced deficiencies
A. Mapping existing potential deletion mutants.
B. Screening for new mutations.
3. Localizing and determining the extent of gamma-ray-induced
deficiencies for assembly of a deletion panel for the zebrafish
genome:
A. Refining the existing deletion map.
B. Mapping already recovered mutations.
C. Mapping new gamma-ray-induced mutations.
D. Assembling the deletion DNA panel
4. Correlating expressed sequences with mutant phenotypes:
A. Cataloguing deficiency phenotypes.
B. Retrieving deficiencies from cryopreserved sperm.
Update:
1. We have obtained and recovered carriers from a number of the
Oregon lines.
Our first aim was to collect and preserve existing deletion strains,
which were scattered among different stocks/labs. A lot of these
strains are not trivial to work with, because many are carried
as balanced reciprocal translocations that exhibit non-Mendelian
segregation frequencies. Deletion phenotypes are segregated out
upon haploid production.
2. We have obtained DNA for 13 previously identified deletions
(i.e., from haploid mutant phenotypes in sufficient amount that
it will end up part of a DNA panel that can be used in typing).
3. We have isolated DNA from 21 newly identified deletions.
We have carried out five mutagenesis screens so far (squeezed
about 500 fish); 102 females have been productively screened,
i.e., we obtained at least 50 embryos from each for phenotypic analysis
at 24 hours. Out of these, there are more than 43 potential new
lines. We have already recovered 12 with diploid phenotypes in
the F1. We recover potential deletions in next generation by back-crossing
to the initial mother or by F1 blind intercrosses. The quickest
and easiest way of recovering deletions is by producing haploids
from F1 females.
Summary of screen phenotypes:
CNS necrosis (localized or global brain degeneration): 33 (largest class)
Tail: 19
Early short axis: 10
Eyes: 6
General necrosis: 4
Sometimes in the initial haploid screen you don't realize all
of the phenotypes that are present; thus, new phenotypes (and
hence deletions) can be recovered in the next generation.
In our screen, we are only using gamma-irradiated females.
If zebrafish colleagues are interested in screening for specific
mutations, we're willing to send them gamma-irradiated males
and they can return recovered deletions for inclusion in the
panel. We have already sent gamma-irradiated males to several
other labs. For example, Dr. Solnica-Krezel at Vanderbilt was
interested in finding a deletion uncovering the bozozok mutation
to confirm that it was a null allele (described in Fekany et
al., 1999). She returned an allele to our lab that she had identified
and mapped.
In order to give out deletion strains to the community in the
future, it will be important to have a catalogue of what the
mutant phenotypes look like. We are documenting haploid and
diploid phenotypes by digital imaging.
4. We have commenced sperm freezing.
5. We have begun to assemble multiplexed sets of primers.
Andreas Fritz has been working out procedures to multiplex
either previously mapped genes or genes of interest that have
not been mapped yet. He is particularly focusing on using markers
for chromosomal regions that we do not already have deletions
for. We are also in the process of arraying Z-markers so we
will be able to take an unknown deletion and run a series of
markers on it to map it. We currently have prepared DNA from
over 20 new deletions that he is in the process of mapping.
We would like to come up with a quick, non-gel-based method
of mapping deletions (e.g., using fluorescent primers that will
give a simple "plus" or "minus").
On getting resources out to the community: We are not ready to
establish a formal database; nevertheless, we receive deletion
requests from researchers around the world. We send back either
fish or DNA (if available).
Responding to requests can be "incredibly labor-intensive" and
we're not at the stage where we can make this a general service.
In fact, the Halpern lab is unlikely to have staff to generalize
this service and it would be better dealt with by a stock center.
Ideally, we would like to get the deletion strains out of our labs
and into a stock center facility. Our principal goal at this point,
though we do respond to all requests in a timely fashion, is to
get more deletions generated and mapped.
Progress Report of Monte Westerfield, University of Oregon
Project: Informatics
Stated aim: To have all information presented in a centralized
way.
Grant Number: P40RR12546
Update:
Usage. Use of the ZFIN database has continued to increase
to over 80,000 "pages" of information requested each month.
Most users are located in the United States, England, and Germany.
The ZFIN staff conducted an email survey to learn how users feel
about ZFIN. Approximately 15% of the registered users responded.
The majority of users rated ZFIN as "very" or "somewhat" useful.
Contents.
ZFIN currently contains:
Record Type                            | No. of Records
Community Information                  |
  Person                               | 1740
  Lab                                  | 200
  Company/Supplier                     | 35
  Publications                         | 1991
Mutants and Phenotypes                 |
  Mutants                              | 1915
  Images                               | 1851
Genomics                               |
  Meiotic Panels                       | 3
  Rad. Hybrid Panels                   | 0
  Anonymous Markers (RAPD, SSLP, AFLP) | 3138
  Genes                                | 326
  ESTs                                 | 43
  Mutants                              | 178
Release of confidential data. Based on discussion at the
last awardees meeting, the ZFIN staff developed a method for information
to be made public in an anonymous way. This allows researchers to
post information about mapped genes and mutants prior
to publication, thus encouraging collaborations without jeopardizing
publication priority. The map positions of most genes and mutants
from the Tuebingen laboratory are now shown on ZFIN under "anonymous"
names with an email contact to obtain more information.
Development of tools for entering and viewing the genetic map.
Mapped genes and mutants are now completely integrated with anonymous
map markers. Thus, a user can search on a gene and get information
about the gene, including both primary information and information
on how it was mapped (i.e., data that support the map assignment)
and a tabular listing of linked markers. The long-term goal is to
provide graphical representations of the maps. However, given current
financial constraints, effort to date has been directed toward getting
the data into ZFIN and making it available, albeit in a simple
form. The gene and genomic data in ZFIN will be released publicly
in April.
Submission of data. The laboratories funded by the Zebrafish
Genome Initiative have been working closely with the ZFIN staff
to develop methods for bulk submission of data. Meiotic maps have
been submitted from the Fishman, Postlethwait and Talbot laboratories.
To date, no RH data has been submitted. However, the ZFIN staff
has been working with the Ekker and Dawid groups to develop tools
for bulk submission of RH data and for making the RHMAPPER and MAPMAKER
data available from ZFIN as flat files. They are also developing
tools for automatic submission of data from the Fishman meiotic
map.
Support of orthology/homology relationships. An immediate
goal is to link each zebrafish gene record to records of homologues
in other species where homologous relationships are known. A longer-term
goal is to provide information common to all model organisms in
a unified database. As a result of the NIH Model Organism Databases
Workshop held in December 1998, the first steps are being taken
to identify the minimal set of data shared by all the model organisms
and then to provide links to these data in ZFIN. A central database,
like NCBI, will probably ultimately handle this information.