Working Group Summary
The Next Step
Population Studies in the "-OMIC" Age
Bethesda Park Clarion
Bethesda, Maryland
June 14, 2005
Introductions
Dr. Jaquish opened the meeting at 8:30 AM and welcomed
participants. Introductions were made and the meeting was
turned over to the Co-Chairs Drs. Shea and Van Eyk.
CHARGE: Drs. Shea and Van Eyk outlined the charge to
the Working Group. The charge for the day was to:
- Assess the current implementation of high throughput
technology in population studies and clinical trials
- Assess what is needed to accelerate the integration of
“Omics” into population and clinical studies.
- Determine where the integration of high throughput
biology and population / clinical studies will produce the
highest yield.
- Specific areas to be discussed and assessed include:
bioinformatics, technology, samples, funding, consent /
notification.
Current Status of NHLBI Population Studies
Dr. Manolio gave an overview of the population based studies and resources supported by NHLBI. She outlined the current
approach to large scale studies in the Division of Epidemiology and Clinical Applications (DECA). The paradigm generally
includes assessment of family clustering, identification of novel and established risk factors, identification of
interventions, dissemination and application of knowledge to public health and clinical efforts.
Dr. Manolio gave an overview of the large observational studies and trials that include genetics. She provided a
list of family studies and trials/cohorts which have incorporated genotyping with a summary of the demographics of each
cohort and major phenotypes collected. Details are provided in her slides.
An overview of the Limited Access Data Policy was given. This policy is based on the idea that the NHLBI large cohorts
and trials provide a wealth of data which could not be completely mined by any one group. Therefore, studies are required
to provide a data set (including genetic data) which is accessible to the scientific community. Careful consideration to
participant confidentiality and privacy has been given in the development of this policy. The process for requesting these
data was reviewed.
Dr. Manolio also discussed a new initiative in the area of Large Scale Genotyping. This proposal provides dense SNP
genotyping in candidate genes in the large NHLBI cohorts as well as genome wide SNP typing in a sample
of cases and controls. These data along with phenotype data would be made available to investigators with IRB approval.
This project will present a bioinformatics challenge. Efforts are currently underway to develop a pilot database for NHLBI
studies with standard data definitions based on the NCI caBIG model. This will be a major challenge and requirement as
NHLBI moves more into “Omics” research.
Dr. Paltoo from the NHLBI Division of Heart and Vascular Disease (DHVD) presented an overview of clinical and
technology programs utilizing an “Omic approach”. She began with the NHLBI Shared Microarray Facilities
that provide study design guidance, analytical tools, knowledge / skills development and education in addition to the
technological service. Even with this service there is an ongoing need for support and microarray services because of
the rapidly evolving technology. Discussion expanded and the participants agreed that ongoing support and infrastructure
would be necessary for other Omic technology for the same reasons. Suggestions such as developing repositories and networks
and means of ensuring readily accessible data were discussed. It was stressed that this was NOT just
keeping existing grants and contracts alive, but rather extending them with the field as it grows and develops.
Dr. Paltoo then continued with a discussion of the new NHLBI Resequencing and Genotyping Service which will provide
sequencing and genotyping for fine mapping. The idea was based upon the success of the Mammalian Genotyping Service.
Dr. Paltoo provided a brief overview of the repositories available to NHLBI grantees. Details of these are in her
slides. Overviews of the Rat Genome Database, the Human Phenome Database and the Pharmacogenetics Knowledge Base were
given. Questions were raised regarding how long these resources would be supported. It was suggested that NHLBI needs
to have measures of productivity, importance and use for these web sites, databases and core resources.
Dr. Paltoo then presented the Programs in Genomic Applications which were designed to advance the field of functional
genomics. All tools and resources developed are freely accessible to the scientific community and training and education
have been built into the program. The Proteomics initiative develops innovative proteomic technologies while focusing
on NHLBI phenotypes. Dr. Paltoo then highlighted several programs in Pharmacogenetics and Gene by Environment Interaction
and the GeneLink Program, which is an effort to promote a collaborative approach to gene finding in NHLBI funded family studies.
Several group members commented that many of the initiatives are driven by technology and there appeared to be little
effort to integrate across “levels” of research. In addition, design and analysis often take a back seat to technology.
Investigators cannot expect to “deliver data” to analysts; research must be a collaboration from the beginning of the study.
Dr. Fahy observed that limitations in skill sets tend to drive people together and efforts are needed in education, particularly of young
investigators. Dr. Gibbons expressed surprise at the number and diversity of NHLBI Programs. He felt investigators need
more familiarity with the NHLBI portfolio and resources. Dr. Van Eyk stated that there was a need for coordination of
resources with a “big picture” approach.
The Role of NHLBI Population Studies in the "-Omic Age"
High Throughput Genotyping
Dr. Nickerson provided a discussion of the current status and future of high throughput genotyping. She focused on
SNP genotyping, beginning with the statement that this technology will probably change the most in the next 5 years.
She began with a discussion of the approach and design of studies. Investigators must have a clear hypothesis and
rationale in order to collaborate with genomics investigators to design a study. The heritability of the trait will
determine the approach and the power to detect genetic variants. The scientific question will also determine the number
of SNPs to be typed. Are questions related to candidate genes (5-10 SNPs), pathways (384-1500 SNPs), or genomes
(100K-500K SNPs)? The advent of whole genome amplification has made cell lines unnecessary in most situations.
Dr. Nickerson then discussed the technology involved in several of the available genotyping assays. The basic
technology for genotyping is solid. She reviewed the allele specific hybridization technology used by Affymetrix and
the oligonucleotide ligation approach used by the Illumina platform. Microtiter plates are generally used for small projects.
Electrophoresis has been used for moderate size genotyping, but will most likely be phased out soon because it is just as expensive as arrays. Various
types of arrays have become standard for large scale genotyping.
Dr. Nickerson presented a slide of cost for various genotyping approaches. Costs range from $0.60/SNP to $0.004/SNP.
The details appear in her slides. Generally the larger the scale the less ability you have to pick your SNPs.
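To make the tradeoff between scale and cost concrete, the following minimal sketch multiplies a per-SNP price by the number of SNPs typed per sample. The SNP counts and the two price endpoints come from this summary; pairing the highest price with the smallest panel and the lowest price with the largest panel is an assumption made only for illustration, and the function name is hypothetical.

```python
def cost_per_sample(n_snps: int, price_per_snp: float) -> float:
    """Return the genotyping cost in dollars for a single sample."""
    return n_snps * price_per_snp


# Assumed pairings of panel size and price; the actual figures for each
# platform are in Dr. Nickerson's slides and are not reproduced here.
scenarios = {
    "candidate gene panel (10 SNPs at $0.60/SNP)": (10, 0.60),
    "genome wide panel (500,000 SNPs at $0.004/SNP)": (500_000, 0.004),
}

for label, (n_snps, price) in scenarios.items():
    print(f"{label}: ${cost_per_sample(n_snps, price):,.2f} per sample")
```

Under these assumed pairings, the larger panel is far cheaper per SNP yet far more expensive per sample, which is one way cost can end up dictating study design.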
The great technology leap in genotyping has been the elimination of pre-PCR. Products can be derived directly
from genomic DNA. She then discussed the general technology of several platforms such as Illumina (bead based) and
Affymetrix and ParAllele (chip based). A recent innovation is the Affymetrix “Chiplet” which has a chip in each well of a
96 well plate.
Dr. Nickerson then posed the question: “Is SNP typing just an intermediate stop?” and answered: “yes”. The ultimate
goal is whole genome sequencing. She presented a new method of in situ sequencing where the fluorescence changes as
each base is added. This method allows simultaneous sequencing at millions of spots on a chip and could cost as
little as $1000 per sequence.
In Summary, many different genotyping approaches are available and we will probably need to utilize multiple
formats depending on the question and whether specific SNPs are targeted. Costs still tend to dictate study design and
there are tradeoffs in throughput, i.e. number of samples vs. SNPs.
Proteomics
Dr. Ware presented a discussion of Proteomics. She divided the current research into two general areas: (1) targeted
protein detection (investigation of biomarkers) and (2) discovery proteomics (unbiased proteomic investigation).
She gave an overview of the Acute Respiratory Distress Syndrome (ARDS) Network which has taken both approaches to
proteomic research. First in the targeted approach the investigators looked at biomarkers previously identified in other
studies. Individual markers did not have good prediction of outcomes. However, bivariate analyses showed that considering
two biomarkers simultaneously provided much better prediction of outcomes revealing the underlying complexity of the
system. The ARDS investigators also used blood, bronchial lavage, tissue, and urine to discover new biomarkers for the disease. Several different platforms exist
to perform these experiments.
Dr. Ware discussed several difficulties encountered when using blood samples for Proteomic research. First, there is a
large abundance of proteins, such as albumin and globulin which interfere with your analysis. Purification to remove these
proteins has been very time consuming and is now being automated.
In summary, Dr. Ware cited many of the current challenges in the field such as sample collection and storage.
The freeze and thaw problem makes it difficult to tack proteomics on to existing studies. However, there is a need for
existing samples for validation. A mechanism is needed to easily obtain basic sample information to assess if a sample
can be used for a specific question. There is a clear role for repositories to provide such information and to connect
investigators who have tools / samples. Sample processing / purification methods must be developed to remove high abundance
proteins. High quality clinical databases that can be integrated with proteomic data are essential to progress in the field.
Proteomics is far from having established standards, and lags behind genomics. It is also not clear how much of the
observed proteome is a function of an individual at time t. In addition, appropriate controls for proteomic studies must be
defined and biostatistical and bioinformatic algorithms developed to analyze complex proteomic data. In addition, there
is a need for integration of genetic and gene expression data with proteomic data.
The ultimate goal is translation of highly technical discovery proteomics platforms or multiplex biomarker assays to
clinical diagnostic laboratory tests. Some proteins such as PSA are great for managing disease, but terrible for diagnosis.
Therefore, translation may be protein specific.
Functional Genomics
Dr. Gibbons then discussed cardiovascular functional genomics specifically vascular remodeling with respect to
hypertension. He identified several core questions that a “user” of functional genomics would face and discussed the need
to address these questions to support research. The first challenge is locating a comparative gene profile data set.
Another question to be addressed is: “What is the translational significance of findings in model organisms?” Also how do
results translate to different disease contexts? One must be easily able to search information regarding genes such as
proximity to a QTL and existence of related transcription factors.
Dr. Gibbons mentioned some opportunities to collect human specimens for HLBS phenotypes. He proposed that the NHLBI
community needs ready access to a shared resource repository of gene expression data on human specimens
linked to clinical phenotypes.
Dr. Gibbons then discussed the importance of epigenomics in genome expression. DNA Methylation Patterns and chromatin
remodeling / histones can influence gene expression. He provided examples from the vascular transcriptome and hypertension.
Dr. Gibbons suggested that there may be a need for a systematic effort to characterize patterns of methylation in
banked human samples with well defined phenotypes.
Dr. Gibbons then discussed expression profiles of peripheral cells and their role in vascular cell differentiation.
He proposed that these cells, their phenotypic characterization and their mRNA expression profiles could be in a shared
data repository of tissue specimens relevant to HLBS disorders. Sources of such cells could include: NHLBI clinical networks,
population-based cohorts and clinical trials.
Dr. Gibbons then discussed use of candidate genes and resources in functional genomics research. He presented an example
of PPAR gamma in the knock out mouse to illustrate the need for a multidisciplinary approach to research. The search for
functional SNPs in this study necessitated going beyond initial 1-2 Kb. Dr. Gibbons suggested that we should look at highly
conserved regions of functional interest on a genome wide level.
Dr. Gibbons then suggested possible opportunities and resources for NHLBI “Omics” research. The first is the idea
of HLBS Molecular Pathology Tissue Sample Repositories. These could include target tissues (biopsies, lavage, surgery)
linked to phenotype (consomics; clinical phenotypes) and peripheral progenitor cell phenotypes. Epigenomics and disease
phenotypes, such as genome-wide DNA methylation patterns (links to expression profiles, QTLs) and genome-wide histone code
(links to expression profiles, QTLs) provide opportunities for functional genomics research. Data repositories and data
sharing and access will be critical to this effort. Bioinformatics tools would provide accessible links
between repositories and data-mining tools and promote data sharing between researchers. The discussion turned to how to
move to the next level. Investigators are not traditionally supported to do this and there is a need to promote
communication. Networks and RFAs have acted as band-aids because they do not provide the lasting infrastructure
necessary. There is a need for a sustainable infrastructure at the NIH level to promote sharing and generate resources.
Pharmacogenomics
Dr. Weiss began his discussion of Pharmacogenomics by describing the Pharmacogenetics Research Network (PGRN).
The PharmGKB is a national resource linking genomic, laboratory and clinical data. The data base
provides analytic functions to users and links with other data bases such as Genbank and dbSNP. Dr. Weiss stressed that
part of “Omics” is accessibility. Human subjects concerns are a critical consideration when addressing data accessibility.
Access to data in the PharmGKB was discussed.
Pharmacogenomics research can be performed in family based as well as case control data sets. As with many of the
“OMIC” approaches multiple comparisons pose a problem. The pros and cons of each approach were discussed.
Although difficult, Dr. Weiss preferred the family based association approach. He posed the question of how to move from
association to actually finding the functional variant? This is an important question and the answer is not straightforward.
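The multiple comparisons problem mentioned above can be illustrated with a Bonferroni correction, one common and conservative adjustment. The summary does not record which correction, if any, was discussed, so the choice of method, the 0.05 significance level and the function name are all assumptions. The numbers of tests reuse the SNP panel sizes given in the genotyping discussion.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Panel sizes taken from the genotyping discussion; alpha is assumed.
for n_tests in (10, 1_500, 500_000):
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_tests:>7} tests: per-test threshold = {threshold:.1e}")
```

The threshold shrinks in direct proportion to the number of SNPs tested, which is why sample size and multiple comparisons appear together in the list of open issues.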
Dr. Weiss ended with a list of issues which need to be addressed for the future of Pharmacogenomic research.
These included: sample size, multiple comparisons, patterns of linkage disequilibrium, epistasis, and
gene - drug interactions. New statistical methods for modeling epistasis and interaction are needed.
Other "-Omics"
Dr. Ordovas opened his discussion of “other –omics” by stating that there have always
been Omics, defined as the path to personalized prevention and therapy. The field has been structured around technology,
not concepts. There is a need to integrate the environmental factors into the OMICS scheme. Dr. Ordovas likened the
scheme to building blocks for health. One starts with the technology, annotated phenotypes in databases, builds
knowledge by linking metabolism to phenotypes and then makes recommendations for health care and prevention. Replication
problems can muddy this path. Dr. Ordovas used the example of LIPC genotype and fat intake. If you consider the environment
you can replicate results. However, the LIPC genotype is not a predictor by itself; it is still far from clinical use.
Dr. Ordovas used examples from the Framingham Heart Study to demonstrate the complexity of the interactions between
two genes, one dietary component and two risk factors. As you add more genes (even just one), the picture becomes too complex
to support clinical and public health recommendations.
Dr. Ordovas then discussed deep phenotyping of lipids. The “Omics” of lipids started with the measurement of
subfractions. Dr. Ordovas showed some examples of Lipomics utilizing true mass measures and Surveyer data visualization.
Dr. Ordovas then mentioned the field of Metabolomics, specifically pertaining to steroids. A question when looking at
the NMR spectrum of urine is what to extract? He gave an example of separating healthy subjects from those with vascular disease using the NMR
spectrum of urine. When environmental perturbations are assessed, there is a vector representing the response. The example
of statin response was given. It was noted that variability in response is huge even in inbred mice. Essentially there are
multiple vectors in hyperspace; some vectors can be modified and others can not.
In summary, the technology is available, the knowledge is there. The challenge is to extract the information and bring
people together. Dr. Ordovas gave the example of NUGO, the European Nutrigenomics Organization. This group was funded to
network, develop methods/ tools and set the foundation. The program is in its third year and time will tell.
NHLBI Population Studies for the Future
Population Study Design
Dr. Boerwinkle then
discussed the future of NHLBI population Studies. He began
with the premise that we have a strong foundation to build
upon. We have already made a strong investment in longitudinal
cohort studies with state of the art phenotyping. Omic studies
are costly, but we already have a control / comparison group
collected. In general the current studies have good genomics,
but other Omics are limited. We have DNA available; we need to
ensure that we have consent and the ability to share data.
Dr. Boerwinkle proposed that we need new population studies
because we may have been good at predicting risk, but not good
at predicting disease. Since disease progression is a lifelong
process perhaps we need a cohort beginning with children.
There is a changing pattern of disease, much of it due to
obesity and a changing pattern of health care, with an
emphasis on prevention which could be addressed by a new
cohort study.
Population studies of the future must be geographically,
demographically and ethnically representative. This will
require a large sample size to assess interactions, allow for
loss to follow up, and to be truly representative. To foster a
culture of data sharing requires social contracts in the
beginning with the participants, investigators, NIH and the
public.
Dr. Boerwinkle proposed an NHLBI population study for
the future that is big, longitudinal and inclusive with the
goal of identifying strategies for using Omics to keep people
healthy. All participants should be sequenced and a synthetic
cohort could be used to cover the life span. The family should
be the unit of sampling because it helps enrollment and
retention.
Dr. Boerwinkle proposed that NHLBI and NHGRI support a
pilot study with the goals of: working out protocols and
recruitment, measuring environmental exposures, fostering a
culture of sharing, building relationships with institutions
(e.g., schools, HMOs) and building data base infrastructure.
Dr. Boerwinkle proposed a sample size of 100,000 for the full
study and 10,000-20,000 for pilot study. If the study lasted
20 years it would amount to 2 million person years of
observation.
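The person-year figure follows from simple multiplication, as the minimal sketch below shows. It assumes all participants enroll at the start and that follow-up is complete. The optional retention parameter and the 0.98 value used with it are hypothetical, added only to show how loss to follow-up would reduce the total.

```python
def person_years(participants: int, years: int, annual_retention: float = 1.0) -> float:
    """Total person-years of observation, assuming everyone enrolls at
    year 0 and a fixed fraction of the cohort is retained each year."""
    return sum(participants * annual_retention ** year for year in range(years))


# Proposed full study: 100,000 participants followed for 20 years.
print(f"Complete follow-up: {person_years(100_000, 20):,.0f} person-years")

# Hypothetical 98% annual retention, for illustration only.
print(f"With 98% retention: {person_years(100_000, 20, 0.98):,.0f} person-years")
```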
Dr. Boerwinkle commented that we need a new study to build
a culture of sharing from the beginning; it can not be forced
on existing studies. Posting data to Web sites presents a
legitimate concern regarding interpretation, intellectual
property and Human Subjects issues. Dr. Shea added that the
conceptualization of studies has changed; they are now viewed as
resources.
Clinical Trial Study Design
Dr. Fahy discussed
incorporating Omics into clinical studies and trials, using
studies of lung disease as an example. The goal is to
evaluate the effect of an intervention on a sample and identify the
baseline that predicts response. NCI has been very successful
with this approach, much of it attributable to the
availability of good tissue samples. For lung studies sample
collection can be very invasive.
Dr. Fahy then identified barriers to incorporating Omics
into clinical trials. The two major barriers are complexity
and cost. Complexity influences sample collection. One needs
to be able to assess if the disease is expressed systemically
or if organ specific samples are needed. There are a variety
of biological samples available in Lung Disease Trials such
as: blood, urine, saliva, sputum, bronchial lavage, bronchial
brushings and bronchial biopsies. The technology platforms for
genomics and proteomics are also quite complex. The overall
complexity serves to increase cost, limit sample size, limit
the type of centers that can participate and require training
of physician scientists who are comfortable in the clinic and
laboratory.
Dr. Fahy then gave an example of incorporating gene
expression studies into an asthma trial. The goals were to
comprehensively characterize the gene expression changes in
the epithelial compartment in asthma and compare with changes
in smokers, and also to identify steroid-responsive genes in
asthmatics. The analysis identified differentially expressed
genes in asthmatic versus healthy subjects, as well as those
that are induced in asthma and are steroid responsive. The
study was accomplished using a team consisting of physicians,
NHLBI microarray core, biostatisticians, and lab experts. A
collaboration with industry and multiple funding sources were
utilized.
Barriers between physicians and “Omics scientists” were
encountered. A major barrier was to learn each other’s
language. The technology has been available for a while but
there is trouble applying it because there is a limited
knowledge base, it is labor intensive and there is a lack of
new investigators who have the necessary skills.
In summary, it is feasible to incorporate Omics into
clinical trials. To date there has been limited application,
most likely because of complexity and cost. Part of the
strategy for increasing the penetration of Omics into clinical
trials is the training of new physician scientists.
Biological Samples
Dr. Tracy then spoke about
biological specimens necessary for Omics research. The
research focuses on going from the individual molecule to
pathways. To do this one needs the full complement of protein,
lipid, carbohydrate and nucleic acid data. One also needs to
stimulate pathways and look at response, both the magnitude of
response and who responds.
There are a large variety of biological samples available
for research. Blood and urine will continue to be valuable;
RNA, DNA, and living tissue are also available for research.
Issues such as the effect of tissue collection and handling on
tissue behavior and the consistency of living tissue sample
response need to be addressed.
Many proteins are not sensitive to proteases; however,
peptides are sensitive. Oxidation and nuclease inhibition are
also issues for sample storage. There are many other post
translational changes; little is known about the importance of
these. Research into preparing cells so that metabolic
pathways remain intact is needed. However, this assumes we
know the pathways that are active in vivo and have assays for
these pathways. Research is also needed on ex vivo
modifications; we need a more complete knowledge of the
chemistries and their effects. Shipping of samples for Omics
research involves careful consideration of sample preparation
and the possible effect of freeze thaw cycles. Methods are
needed to preserve metabolic activity during shipping.
Dr. Tracy then discussed ways to improve repositories for
biological samples. We need better coordination of
repositories; it should be easy for researchers to use
samples. However, we must assure that the science produced
from these samples is good and that the originating
researchers receive proper recognition. It should be easy to
get samples from more than one repository at a time. It helps
to have coordination of studies contributing to the repository
from the start. Investigators need a comprehensive database of
what is currently available and improved communication to
shorten the time from inquiry to analysis.
Dr. Tracy presented some general issues for future
biological sample collection. The phenotypic work has become
more complex; therefore, central training facilities for
technical staff could be used to standardize methods and
provide certification. Method development for stimulation
tests is needed. A good example of one is the oral glucose
tolerance test. Others are needed for inflammation, immune
response, etc.
Human Subjects: Ethics and Feasibility
Dr. Guttmacher addressed the ethical concerns pertinent to adding
new technology and research on to existing cohorts and trials.
Informed consent is a major concern. The need for reconsent
depends on original consent and the process of reconsenting
participants can be difficult and costly. He suggested that we
seek to make consents at baseline appropriate for potential,
even unimagined, future applications.
In the “-Omics Age,” few if any data are truly anonymized.
Two criteria for truly anonymized data are: (1) the ID is
irretrievably removed or (2) it is impossible to identify data
under ANY circumstances. In cohorts and trials we may have to
settle for partially anonymized data with specific processes
in place to protect the identity of participants.
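One possible process for partial anonymization is to replace each participant ID with a keyed hash before data are shared. The sketch below is an illustration under stated assumptions, not a method described at the meeting: the ID format, the key and the function name are hypothetical, and the key would need to be held securely by a data custodian. Because anyone holding the key can recompute the mapping, the result is pseudonymized rather than truly anonymized.

```python
import hashlib
import hmac


def pseudonymize(participant_id: str, secret_key: bytes) -> str:
    """Return a stable study code derived from the participant ID.

    The same ID and key always give the same code, so records can be
    linked across data sets without exposing the original ID.
    """
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and IDs, for illustration only.
custodian_key = b"example-key-held-by-the-data-custodian"
for original_id in ("COHORT-000123", "COHORT-000124"):
    print(original_id, "->", pseudonymize(original_id, custodian_key)[:16], "...")
```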
In family studies, the manner in which relatives are
recruited can have ethical implications. It is not necessary
to recruit all relatives using the proband as an intermediate
contact. However, the means of contact must protect the family
members’ privacy.
Studies that produce data in a way that affects the
population groups from which subjects come may require design
input and/or consent from those populations, not just
individual subjects. In these cases it is essential to engage
participants and group leaders early in the design and
planning for the study. This can be very costly, but it
increases the population’s investment in the research. This
approach may make retention easier and can improve the
research. Investigators can engage groups by (1) including
them on the team and (2) providing the benefits of research and
infrastructure. The role of IRBs was briefly discussed. In the
age of complex, multi-center trials, we need to develop hybrid
systems that allow local IRBs to oversee local issues and
central IRBs to oversee central issues.
Dr. Guttmacher advocated as free and immediate access to
data as possible. He views research as a core resource, not a
study. However, one must weigh open access against the
participants’ rights and the rights of the PI who produced the
data.
Reporting of research results to participants has also
become an important issue. Participants should be involved in
determining what information they receive. However, the
difficulty arises because what is irrelevant today may be
relevant tomorrow. There are possible psychological
implications for family members of shared genetic research
results. Dr. Guttmacher stated that the perception of genetic
data as “different” is starting to change.
In summary, public consultation should be extensive in
planning a study. Researchers should strive for open-ended
informed consent, with encrypted databases to protect privacy
and confidentiality. A Central IRB would be highly
advantageous and data should be immediately accessible to
investigators who have IRB approval.
Bioinformatics
Dr. Quackenbush then spoke to the
bioinformatic challenges of “Omic based” population studies.
Bioinformatics is basically information management systems,
and can link all levels of biology from DNA to Ecologies.
Dr. Quackenbush identified several research areas where
Bioinformatics can play a major role such as gene
identification and prediction of protein structure and
function, reconstruction of pathways and information networks,
linking of genotype and phenotype and prediction of relevant
outcomes as well as cross species investigations. He then
listed some of the fallacies regarding bioinformatics such as
the assumption that researchers need cutting edge
bioinformatics tools. Most researchers need simple tools that
are 10 years old. Additionally, bioinformatics is not cheap;
it can be as expensive as technology and data collection. Once
computational tools are produced they require maintenance and
modification, which is often overlooked.
Dr. Quackenbush presented the “Omics Dream” as a fully
integrated resource of clinical samples indexed and anonymized
but linked to clinical records. Genomic, microarray, proteomic
and metabolomic data collected using standardized formats and
protocols would be available. All data would be housed in a
central, user friendly data base containing tools for data
integration and interpretation for scientists and
clinicians.
However, there are many challenges to implementing
bioinformatics in population and clinical studies. Challenges
in establishing the resource include establishing a standard
format for data collection and entry, a phenotypic ontology, a
mechanism for follow up and capture of data, secure access,
tracking and distribution of data and samples and the ability
to adapt to changing protocols. HIPAA regulations need to be
considered when developing such resources.
A user friendly, interoperable data base will enable access
as well as linking and referencing of data and analytic tools
from multiple sources. Bioinformatics tools are often
developed in a vacuum separate from needs of biologists. Tools
need to be developed in conjunction with biologists /
clinicians; it is clear that analysis in the absence of
biology is not a useful exercise.
In summary, engineering is not “sexy science” but
necessary. Much of this work falls outside of the traditional
funding, publishing and tenure realm. Bioinformatics should be
a balance of research and “consulting”. Bioinformatic tools
must be freely available and professionally documented. Data
should be freely accessible within the constraints of HIPAA.
Training can occur in different ways; training through
partnerships to solve problems is the most productive manner.
We need to keep in mind that mechanistic studies and biomarker
searches are different and require different tools.
Recommendations of Working Group
- Interactions between disciplines and integrated
research need to be promoted. Research should
incorporate the genome, proteome, environment/behavior and
phenotype and include epidemiologists, clinicians,
informaticians, statisticians, and –omics scientists.
- There is a need for training of interdisciplinary
investigators for the future and retooling of mid and
senior investigators entering emerging interdisciplinary
fields.
- Informatics components need greater emphasis, such as standardization across studies of
phenotypic definitions, data handling and sample tracking. New tools must be developed, for example:
methods for data and meta analysis/pooling. Computational analytic methods need to converge to
biological paradigms and bioinformatics tools need to be developed in concert with biologists and
clinicians.
- Basic aspects of the proteome need to be defined. Issues such as stability over time within
individuals, tissue specificities, post translational changes and reactivity to environmental
influences need to be addressed.
- Need to ensure data and samples are freely and widely accessible, to the extent practical and
possible. Efforts to ensure free and open access to data must consider the investigator rights,
human subject issues and notification of subjects of possible clinical implications of findings.
This will involve increasing leverage in existing studies. It will be necessary to establish
catalogues of studies and samples. Attention must be given to QC, data handling, and knowledge
of methods (phenotypic measurements, sample acquisition/handling/storage, genomic/proteomic) in
original studies.
- Statistical methodological development and expertise is needed in the OMICs field. Areas
identified for methodological development include: replication and independent validation of
findings, data integration and meta analysis, multiple comparison problems and pathway analysis.
There is a clash between the epidemiology tradition (hypothesis testing) and “Omic” discovery/data
mining. Communication and collaboration should be promoted to integrate the two approaches.
- Assure the continuity of Core Resources such as databases, web sites, genotyping and microarray
facilities. Metrics for assessment of use, needs and importance of NHLBI core resources are needed.
- New study or studies are essential for fresh look, fresh ideas, and fresh faces. Such a study would
need a large sample, long term follow-up (synthetic cohort) to capture progression of subclinical
disease and/or health state transitions. Due to the high cost there is a need for a new paradigm.
Pilot study and/or protocol development and testing prior to initiating the full study would be beneficial.
- An NHLBI Repository of gene expression data linked to clinical phenotypes is needed. This repository
should be openly accessible and easily searchable. Coordination between studies contributing to
these repositories is necessary when studies are designed and data are collected.
- There is a need for HLBS Molecular Pathology Tissue Sample Repositories. Coordination between
studies contributing to these repositories is necessary when samples are collected. The repository
should be openly accessible and easily searchable. Information allowing users to assess the utility
of samples for a variety of experiments is necessary.
- Establish a sustainable informatics infrastructure for HLBS studies, much like NCBI. This
would provide HLBS investigators with easily accessible and searchable, integrated information
regarding available samples, repositories, clinical and population study designs, phenotypes,
genotypes, clinical trials, etc.
- Need for methods development for stimulation tests in additional risk factor areas such as
inflammation and immune response. A good example of an existing test is the oral glucose tolerance
test. Stimulation tests could be developed on the cellular or organismal level.
- Develop fully integrated resources of anonymized clinical samples linked to clinical records.
In the “-Omics Age,” few if any data are truly anonymized. Therefore, we may have to settle for
partially anonymized data with specific processes in place to protect the identity of participants
in order to fully utilize existing clinical resources.