Working Group Summary
The Next Step: Population Studies in the "-OMIC" Age

Bethesda Park Clarion
Bethesda, Maryland

June 14, 2005

Introductions

Dr. Jaquish opened the meeting at 8:30 AM and welcomed participants. Introductions were made and the meeting was turned over to the Co-Chairs, Drs. Shea and Van Eyk.

CHARGE:
Drs. Shea and Van Eyk outlined the charge to the Working Group. The charge for the day was to:

Assess the current implementation of high throughput technology in population studies and clinical trials.
Assess what is needed to accelerate the integration of “Omics” into population and clinical studies.
Determine where the integration of high throughput biology and population / clinical studies will produce the highest yield.
Specific areas to be discussed and assessed include: bioinformatics, technology, samples, funding, consent / notification.

Current Status of NHLBI Population Studies

Dr. Manolio gave an overview of the population based studies and resources supported by NHLBI. She outlined the current approach to large scale studies in the Division of Epidemiology and Clinical Applications (DECA). The paradigm generally includes assessment of family clustering, identification of novel and established risk factors, identification of interventions, and dissemination and application of knowledge to public health and clinical efforts.

Dr. Manolio gave an overview of the large observational studies and trials that include genetics. She provided a list of family studies and trials/cohorts which have incorporated genotyping with a summary of the demographics of each cohort and major phenotypes collected. Details are provided in her slides.

An overview of the Limited Access Data Policy was given. This policy is based on the idea that the NHLBI large cohorts and trials provide a wealth of data which could not be completely mined by any one group. Therefore, studies are required to provide a data set (including genetic data) which is accessible to the scientific community. Careful consideration has been given to participant confidentiality and privacy in the development of this policy. The process for requesting these data was reviewed.

Dr. Manolio also discussed a new initiative in the area of Large Scale Genotyping. This proposal provides dense SNP genotyping in candidate genes in the large NHLBI cohorts as well as genome wide SNP typing in a sample of cases and controls. These data along with phenotype data would be made available to investigators with IRB approval. This project will present a bioinformatics challenge. Efforts are currently underway to develop a pilot database for NHLBI studies with standard data definitions based on the NCI caBIG model. This will be a major challenge and requirement as NHLBI moves more into “Omics” research.

Dr. Paltoo from the NHLBI Division of Heart and Vascular Disease (DHVD) presented an overview of clinical and technology programs utilizing an “Omic approach”. She began with the NHLBI Shared Microarray Facilities that provide study design guidance, analytical tools, knowledge / skills development and education in addition to the technological service. Even with this service there is an ongoing need for support and microarray services because of the rapidly evolving technology. Discussion expanded and the participants agreed that ongoing support and infrastructure would be necessary for other Omic technology for the same reasons. Suggestions such as developing repositories and networks and means of ensuring readily accessible data were discussed. It was stressed that this was NOT just keeping existing grants and contracts alive, but rather extending them with the field as it grows and develops.

Dr. Paltoo then continued with a discussion of the new NHLBI Resequencing and Genotyping Service which will provide sequencing and genotyping for fine mapping. The idea was based upon the success of the Mammalian Genotyping Service.

Dr. Paltoo provided a brief overview of the repositories available to NHLBI grantees. Details of these are in her slides. Overviews of the Rat Genome Database, the Human Phenome Database and the Pharmacogenetics Knowledge Base were given. Questions were raised regarding how long these resources would be supported. It was suggested that NHLBI needs to have measures of productivity, importance and use for these web sites, databases and core resources.

Dr. Paltoo then presented the Programs in Genomic Applications, which were designed to advance the field of functional genomics. All tools and resources developed are freely accessible to the scientific community, and training and education have been built into the program. The Proteomics initiative develops innovative proteomic technologies while focusing on NHLBI phenotypes. Dr. Paltoo then highlighted several programs in Pharmacogenetics and Gene by Environment Interaction and the GeneLink Program, which is an effort to promote a collaborative approach to gene finding in NHLBI funded family studies.

Several group members commented that many of the initiatives are driven by technology and there appeared to be little effort to integrate across “levels” of research. In addition, design and analysis often take a back seat to technology. Investigators cannot expect to “deliver data” to analysts; research must be a collaboration from the beginning of the study. Dr. Fahy observed that limitations in skill sets tend to drive people together and efforts are needed in education, particularly of young investigators. Dr. Gibbons expressed surprise at the number and diversity of NHLBI Programs. He felt investigators need more familiarity with the NHLBI portfolio and resources. Dr. Van Eyk stated that there was a need for coordination of resources with a “big picture” approach.

The Role of NHLBI Population Studies in the "-Omic Age"

High Throughput Genotyping

Dr. Nickerson provided a discussion of the current status and future of high throughput genotyping. She focused on SNP genotyping, beginning with the statement that this technology will probably change the most in the next 5 years. She began with a discussion of the approach and design of studies. Investigators must have a clear hypothesis and rationale in order to collaborate with genomics investigators to design a study. The heritability of the trait will determine the approach and the power to detect genetic variants. The scientific question will also determine the number of SNPs to be typed. Are questions related to candidate genes (5-10 SNPs), pathways (384-1500 SNPs), or genomes (100K-500K SNPs)? The advent of whole genome amplification has made cell lines unnecessary in most situations.

Dr. Nickerson then discussed the technology involved in several of the available genotyping assays. The basic technology for genotyping is solid. She reviewed the allele specific hybridization technology used by Affymetrix and the oligonucleotide ligation approach used by the Illumina platform. Microtiter plates are generally used for small projects. Electrophoresis has been used for moderate size genotyping, but will most likely be phased out soon because it is just as expensive as arrays. Various types of arrays have become standard for large scale genotyping.

Dr. Nickerson presented a slide of cost for various genotyping approaches. Costs range from $0.60/SNP to $0.004/SNP. The details appear in her slides. Generally, the larger the scale, the less ability you have to pick your SNPs.
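To make the cost range concrete, the following is a minimal sketch of a total genotyping cost estimate. It assumes the quoted figures are costs per genotype (one SNP typed in one sample), which the summary does not state explicitly. The function name `genotyping_cost` and the sample size of 1,000 are hypothetical; the SNP counts are taken from the candidate gene and genome-wide ranges mentioned earlier.

```python
def genotyping_cost(n_samples: int, n_snps: int, cost_per_snp: float) -> float:
    """Total cost in dollars, assuming cost_per_snp is charged per SNP per sample."""
    return n_samples * n_snps * cost_per_snp


# Hypothetical designs for a cohort of 1,000 samples (sample size is an assumption).
# Small candidate gene project at the high end of the quoted per-SNP cost.
candidate_gene = genotyping_cost(n_samples=1000, n_snps=10, cost_per_snp=0.60)

# Genome-wide project at the low end of the quoted per-SNP cost.
genome_wide = genotyping_cost(n_samples=1000, n_snps=500_000, cost_per_snp=0.004)

print(f"Candidate gene study: ${candidate_gene:,.0f}")
print(f"Genome-wide study:    ${genome_wide:,.0f}")
```

Under these assumptions the per-SNP cost drops by a factor of 150 at genome-wide scale, yet the total cost still rises sharply, which illustrates why cost tends to dictate study design.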

The great technology leap in genotyping has been the elimination of pre-PCR. Products can be derived directly from genomic DNA. She then discussed the general technology of several platforms such as Illumina (bead based) and Affymetrix and ParAllele (chip based). A recent innovation is the Affymetrix “Chiplet”, which has a chip in each well of a 96 well plate.

Dr. Nickerson then posed the question: “Is SNP typing just an intermediate stop?” and answered: “yes”. The ultimate goal is whole genome sequencing. She presented a new method of in situ sequencing where the fluorescence changes as each base is added. This method allows simultaneous sequencing at millions of spots on a chip and could cost as little as $1,000 per sequence.

In summary, many different genotyping approaches are available and we will probably need to utilize multiple formats depending on the question and whether specific SNPs are targeted. Costs still tend to dictate study design and there are tradeoffs in throughput, i.e., number of samples vs. number of SNPs.

Proteomics

Dr. Ware presented a discussion of Proteomics. She divided the current research into two general areas: (1) targeted protein detection (investigation of biomarkers) and (2) discovery proteomics (unbiased proteomic investigation).

She gave an overview of the Acute Respiratory Distress Syndrome (ARDS) Network which has taken both approaches to proteomic research. First in the targeted approach the investigators looked at biomarkers previously identified in other studies. Individual markers did not have good prediction of outcomes. However, bivariate analyses showed that considering two biomarkers simultaneously provided much better prediction of outcomes revealing the underlying complexity of the system. The ARDS investigators also used blood, bronchial lavage, tissue, and urine to discover new biomarkers for the disease. Several different platforms exist to perform these experiments.

Dr. Ware discussed several difficulties encountered when using blood samples for proteomic research. First, there is a large abundance of proteins, such as albumin and globulin, which interfere with the analysis. Purification to remove these proteins has been very time consuming and is now being automated.

In summary, Dr. Ware cited many of the current challenges in the field such as sample collection and storage. The freeze and thaw problem makes it difficult to tack proteomics on to existing studies. However, there is a need for existing samples for validation. A mechanism is needed to easily obtain basic sample information to assess if a sample can be used for a specific question. There is a clear role for repositories to provide such information and to connect investigators who have tools / samples. Sample processing / purification methods must be developed to remove high abundance proteins. High quality clinical databases that can be integrated with proteomic data are essential to progress in the field.

Proteomics is far from having established standards, and lags behind genomics. It is also not clear how much of the observed proteome is a function of an individual at time t. In addition, appropriate controls for proteomic studies must be defined and biostatistical and bioinformatic algorithms developed to analyze complex proteomic data. Finally, there is a need for integration of genetic and gene expression data with proteomic data.

The ultimate goal is translation of highly technical discovery proteomics platforms or multiplex biomarker assays to clinical diagnostic laboratory tests. Some proteins such as PSA are great for managing disease, but terrible for diagnosis. Therefore, translation may be protein specific.

Functional Genomics

Dr. Gibbons then discussed cardiovascular functional genomics, specifically vascular remodeling with respect to hypertension. He identified several core questions that a “user” of functional genomics would face and discussed the need to address these questions to support research. The first challenge is locating a comparative gene profile data set. Another question to be addressed is: “What is the translational significance of findings in model organisms?” Also, how do results translate to different disease contexts? One must be able to easily search information regarding genes, such as proximity to a QTL and existence of related transcription factors.

Dr. Gibbons mentioned some opportunities to collect human specimens for HLBS phenotypes. He proposed that the NHLBI community needs ready access to a shared resource repository of gene expression data on human specimens linked to clinical phenotypes.

Dr. Gibbons then discussed the importance of epigenomics in genome expression. DNA methylation patterns and chromatin remodeling / histones can influence gene expression. He provided examples from the vascular transcriptome and hypertension. Dr. Gibbons suggested that there may be a need for a systematic effort to characterize patterns of methylation in banked human samples with well defined phenotypes.

Dr. Gibbons then discussed expression profiles of peripheral cells and their role in vascular cell differentiation. He proposed that these cells, their phenotypic characterization and their mRNA expression profiles could be in a shared data repository of tissue specimens relevant to HLBS disorders. Sources of such cells could include: NHLBI clinical networks, population-based cohorts and clinical trials.

Dr. Gibbons then discussed use of candidate genes and resources in functional genomics research. He presented an example of PPAR gamma in the knock out mouse to illustrate the need for a multidisciplinary approach to research. The search for functional SNPs in this study necessitated going beyond initial 1-2 Kb. Dr. Gibbons suggested that we should look at highly conserved regions of functional interest on a genome wide level.

Dr. Gibbons then suggested possible opportunities and resources for NHLBI “Omics” research. The first is the idea of HLBS Molecular Pathology Tissue Sample Repositories. These could include target tissues (biopsies, lavage, surgery) linked to phenotype (consomics; clinical phenotypes) and peripheral progenitor cell phenotypes. Epigenomics and disease phenotypes, such as genome-wide DNA methylation patterns (links to expression profiles, QTLs) and genome-wide histone code (links to expression profiles, QTLs), provide opportunities for functional genomics research. Data repositories and data sharing and access will be critical to this effort. Bioinformatics tools would provide accessible links between repositories and data-mining tools and promote data sharing between researchers. The discussion turned to how to move to the next level. Investigators are not traditionally supported to do this and there is a need to promote communication. Networks and RFAs have acted as Band-Aids because they do not provide the lasting infrastructure necessary. There is a need for a sustainable infrastructure at the NIH level to promote sharing and generate resources.

Pharmacogenomics

Dr. Weiss began his discussion of Pharmacogenomics by describing the Pharmacogenetics Research Network (PGRN). The PharmGKB is a national resource linking genomic, laboratory and clinical data. The database provides analytic functions to users and links with other databases such as GenBank and dbSNP. Dr. Weiss stressed that part of “Omics” is accessibility. Human subjects concerns are a critical consideration when addressing data accessibility. Access to data in the PharmGKB was discussed.

Pharmacogenomics research can be performed in family based as well as case control data sets. As with many of the “OMIC” approaches, multiple comparisons pose a problem. The pros and cons of each approach were discussed. Although it is more difficult, Dr. Weiss preferred the family based association approach. He posed the question of how to move from association to actually finding the functional variant. This is an important question and the answer is not straightforward.

Dr. Weiss ended with a list of issues which need to be addressed for the future of Pharmacogenomic research. These included: sample size, multiple comparisons, patterns of linkage disequilibrium, epistasis, and gene-drug interactions. New statistical methods for modeling epistasis and interaction are needed.
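The scale of the multiple comparisons problem can be illustrated with a short calculation. The sketch below uses the Bonferroni correction, which is a common and conservative adjustment but was not named in the presentation; it is chosen here only for illustration. The SNP counts reuse the candidate gene, pathway and genome-wide figures cited in the genotyping discussion, the 0.05 significance level is an assumption, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of false positives if every test is run at alpha with no true effects."""
    return alpha * n_tests


ALPHA = 0.05  # conventional level, assumed for illustration

for label, n_snps in [("candidate gene", 10), ("pathway", 1500), ("genome-wide", 500_000)]:
    threshold = bonferroni_threshold(ALPHA, n_snps)
    false_pos = expected_false_positives(ALPHA, n_snps)
    print(f"{label:>14}: {n_snps:>7} tests, "
          f"uncorrected false positives ~{false_pos:,.1f}, "
          f"Bonferroni threshold {threshold:.2e}")
```

Bonferroni treats every test as independent, so it is likely too strict when SNPs are correlated through linkage disequilibrium, which is one reason the patterns of linkage disequilibrium appear on the list of open issues.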

Other "-Omics"

Dr. Ordovas opened his discussion of “other -omics” by stating that Omics, defined as the path to personalized prevention and therapy, has always existed. The field has been structured around technology, not concepts. There is a need to integrate environmental factors into the Omics scheme. Dr. Ordovas likened the scheme to building blocks for health. One starts with the technology and annotated phenotypes in databases, builds knowledge by linking metabolism to phenotypes and then makes recommendations for health care and prevention. Replication problems can muddy this path. Dr. Ordovas used the example of LIPC genotype and fat intake. If you consider the environment you can replicate results. However, the LIPC genotype is not a predictor by itself; it is still far from clinical use.

Dr. Ordovas used examples from the Framingham Heart Study to demonstrate the complexity of the interactions between two genes, one dietary component and two risk factors. As you add more genes (even just one), you cannot make clinical and public health recommendations; it becomes too complex.

Dr. Ordovas then discussed deep phenotyping of lipids. The “Omics” of lipids started with the measurement of subfractions. Dr. Ordovas showed some examples of Lipomics utilizing true mass measures and Surveyer data visualization.

Dr. Ordovas then mentioned the field of Metabolomics, specifically pertaining to steroids. A question when looking at the NMR spectrum of urine is what to extract. He gave an example of separating healthy subjects from those with vascular disease using the NMR spectrum of urine. When environmental perturbations are assessed, there is a vector representing the response. The example of statin response was given. It was noted that variability in response is huge even in inbred mice. Essentially there are multiple vectors in hyperspace; some vectors can be modified and others cannot.

In summary, the technology is available, the knowledge is there. The challenge is to extract the information and bring people together. Dr. Ordovas gave the example of NUGO, the European Nutrigenomics Organization. This group was funded to network, develop methods/ tools and set the foundation. The program is in its third year and time will tell.

NHLBI Population Studies for the Future

Population Study Design
Dr. Boerwinkle then discussed the future of NHLBI population studies. He began with the premise that we have a strong foundation to build upon. We have already made a strong investment in longitudinal cohort studies with state of the art phenotyping. Omic studies are costly, but we already have a control / comparison group collected. In general the current studies have good genomics, but other Omics are limited. We have DNA available; we need to ensure that we have consent and the ability to share data.

Dr. Boerwinkle proposed that we need new population studies because we may have been good at predicting risk, but not good at predicting disease. Since disease progression is a lifelong process perhaps we need a cohort beginning with children. There is a changing pattern of disease, much of it due to obesity and a changing pattern of health care, with an emphasis on prevention which could be addressed by a new cohort study.

Population studies of the future must be geographically, demographically and ethnically representative. This will require a large sample size to assess interactions, allow for loss to follow up, and to be truly representative. Fostering a culture of data sharing requires social contracts at the beginning with the participants, investigators, NIH and the public. Dr. Boerwinkle proposed an NHLBI population study for the future that is big, longitudinal and inclusive with the goal of identifying strategies for using Omics to keep people healthy. All participants should be sequenced and a synthetic cohort could be used to cover the life span. The family should be the unit of sampling because it helps enrollment and retention.

Dr. Boerwinkle proposed that NHLBI and NHGRI support a pilot study with the goals of: working out protocols and recruitment, measuring environmental exposures, fostering a culture of sharing, building relationships with institutions (e.g., schools, HMOs) and building database infrastructure. Dr. Boerwinkle proposed a sample size of 100,000 for the full study and 10,000-20,000 for the pilot study. If the study lasted 20 years it would amount to 2 million person years of observation.
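The person-year figure follows directly from the proposed sample size and duration. The sketch below reproduces that arithmetic and, as an illustration only, shows how loss to follow up would reduce the total. The function name `person_years` and the 98% annual retention rate are hypothetical and do not come from the presentation.

```python
def person_years(n_participants: int, years: int, annual_retention: float = 1.0) -> float:
    """Accumulate person-years, counting each year's participants at the start of that year.

    annual_retention is the fraction of participants who remain from one year
    to the next (1.0 means no loss to follow up).
    """
    total = 0.0
    remaining = float(n_participants)
    for _ in range(years):
        total += remaining
        remaining *= annual_retention
    return total


# Full study as proposed: 100,000 participants over 20 years with no attrition.
full_study = person_years(100_000, 20)
print(f"No loss to follow up: {full_study:,.0f} person-years")  # 2,000,000

# Same design with a hypothetical 2% annual loss to follow up.
with_attrition = person_years(100_000, 20, annual_retention=0.98)
print(f"With 2% annual loss:  {with_attrition:,.0f} person-years")
```

The first call matches the 2 million person years quoted above; the second shows why the proposal stresses a large sample that allows for loss to follow up.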

Dr. Boerwinkle commented that we need a new study to build a culture of sharing from the beginning; it cannot be forced on existing studies. Posting data to Web sites raises legitimate concerns regarding interpretation, intellectual property and Human Subjects issues. Dr. Shea added that the conceptualization of studies has changed; they are now viewed as resources.

Clinical Trial Study Design
Dr. Fahy discussed incorporating Omics into clinical studies and trials, using studies of lung disease as an example. The goal is to evaluate the effect of an intervention on a sample and identify the baseline that predicts response. NCI has been very successful with this approach, much of it attributable to the availability of good tissue samples. For lung studies, sample collection can be very invasive.

Dr. Fahy then identified barriers to incorporating Omics into clinical trials. The two major barriers are complexity and cost. Complexity influences sample collection. One needs to be able to assess if the disease is expressed systemically or if organ specific samples are needed. There are a variety of biological samples available in Lung Disease Trials such as: blood, urine, saliva, sputum, bronchial lavage, bronchial brushings and bronchial biopsies. The technology platforms for genomics and proteomics are also quite complex. The overall complexity serves to increase cost, limit sample size, limit the type of centers that can participate and require training of physician scientists who are comfortable in the clinic and laboratory.

Dr. Fahy then gave an example of incorporating gene expression studies into an asthma trial. The goals were to comprehensively characterize the gene expression changes in the epithelial compartment in asthma and compare with changes in smokers, and also to identify steroid-responsive genes in asthmatics. The analysis identified differentially expressed genes in asthmatic versus healthy subjects, as well as those that are induced in asthma and are steroid responsive. The study was accomplished using a team consisting of physicians, the NHLBI microarray core, biostatisticians, and lab experts. A collaboration with industry and multiple funding sources were utilized.

Barriers between physicians and “Omics scientists” were encountered. A major barrier was to learn each other’s language. The technology has been available for a while but there is trouble applying it because there is a limited knowledge base, it is labor intensive and there is a lack of new investigators who have the necessary skills.

In summary, it is feasible to incorporate Omics into clinical trials. To date there has been limited application, most likely because of complexity and cost. Part of the strategy for increasing the penetration of Omics into clinical trials is the training of new physician scientists.

Biological Samples
Dr. Tracy then spoke about biological specimens necessary for Omics research. The research focuses on going from the individual molecule to pathways. To do this one needs the full complement of protein, lipid, carbohydrate and nucleic acid data. One also needs to stimulate pathways and look at response, both the magnitude of response and who responds.

There are a large variety of biological samples available for research. Blood and urine will continue to be valuable; RNA, DNA, and living tissue are also available for research. Issues such as the effect of tissue collection and handling on tissue behavior and the consistency of living tissue sample response need to be addressed.

Many proteins are not sensitive to proteases; however, peptides are sensitive. Oxidation and nuclease inhibition are also issues for sample storage. There are many other post translational changes; little is known about the importance of these. Research into preparing cells so that metabolic pathways remain intact is needed. However, this assumes we know the pathways that are active in vivo and have assays for these pathways. Research is also needed on ex vivo modifications; we need a more complete knowledge of the chemistries and their effects. Shipping of samples for Omics research involves careful consideration of sample preparation and the possible effect of freeze thaw cycles. Methods are needed to preserve metabolic activity during shipping.

Dr. Tracy then discussed ways to improve repositories for biological samples. We need better coordination of repositories; it should be easy for researchers to use samples. However, we must assure that the science produced from these samples is good and that the originating researchers receive proper recognition. It should be easy to get samples from more than one repository at a time. It helps to have coordination of studies contributing to the repository from the start. Investigators need a comprehensive database of what is currently available and improved communication to shorten the time from inquiry to analysis.

Dr. Tracy presented some general issues for future biological sample collection. The phenotypic work has become more complex; therefore, central training facilities for technical staff could be used to standardize methods and provide certification. Method development for stimulation tests is needed. A good example of one is the oral glucose tolerance test. Others are needed for inflammation, immune response, etc.

Human Subjects: Ethics and Feasibility
Dr. Guttmacher addressed the ethical concerns pertinent to adding new technology and research on to existing cohorts and trials. Informed consent is a major concern. The need for reconsent depends on original consent and the process of reconsenting participants can be difficult and costly. He suggested that we seek to make consents at baseline appropriate for potential, even unimagined, future applications.

In the “-Omics Age,” few if any data are truly anonymized. Two criteria for truly anonymized data are: (1) the ID is irretrievably removed or (2) it is impossible to identify data under ANY circumstances. In cohorts and trials we may have to settle for partially anonymized data with specific processes in place to protect the identity of participants.

In family studies, the manner in which relatives are recruited can have ethical implications. It is not necessary to recruit all relatives using the proband as an intermediate contact. However, the means of contact must protect the family members’ privacy.

Studies that produce data in a way that affects the population groups from which subjects come may require design input and/or consent from those populations, not just individual subjects. In these cases it is essential to engage participants and group leaders early in the design and planning for the study. This can be very costly, but it increases the population’s investment in the research. This approach may make retention easier and can improve the research. Investigators can engage groups by (1) including them on the team and (2) providing the benefits of research and infrastructure. The role of IRBs was briefly discussed. In the age of complex, multi-center trials, we need to develop hybrid systems that allow local IRBs to oversee local issues and central IRBs to oversee central issues.

Dr. Guttmacher advocated as free and immediate access to data as possible. He views research as a core resource, not a study. However, one must weigh open access against the participants’ rights and the rights of the PI who produced the data.

Reporting of research results to participants has also become an important issue. Participants should be involved in determining what information they receive. However, the difficulty arises because what is irrelevant today may be relevant tomorrow. There are possible psychological implications for family members of shared genetic research results. Dr. Guttmacher stated that the perception of genetic data as “different” is starting to change.

In summary, public consultation should be extensive in planning a study. Researchers should strive for open-ended informed consent, with encrypted databases to protect privacy and confidentiality. A Central IRB would be highly advantageous and data should be immediately accessible to investigators who have IRB approval.

Bioinformatics
Dr. Quackenbush then spoke to the bioinformatic challenges of “Omic based” population studies. Bioinformatics is basically information management systems, and can link all levels of biology from DNA to Ecologies.

Dr. Quackenbush identified several research areas where Bioinformatics can play a major role such as gene identification and prediction of protein structure and function, reconstruction of pathways and information networks, linking of genotype and phenotype and prediction of relevant outcomes as well as cross species investigations. He then listed some of the fallacies regarding bioinformatics such as the assumption that researchers need cutting edge bioinformatics tools. Most researchers need simple tools that are 10 years old. Additionally, bioinformatics is not cheap; it can be as expensive as technology and data collection. Once computational tools are produced they require maintenance and modification, which is often overlooked.

Dr. Quackenbush presented the “Omics Dream” as a fully integrated resource of clinical samples indexed and anonymized but linked to clinical records. Genomic, microarray, proteomic and metabolomic data collected using standardized formats and protocols would be available. All data would be housed in a central, user friendly database containing tools for data integration and interpretation for scientists and clinicians.

However, there are many challenges to implementing bioinformatics in population and clinical studies. Challenges in establishing the resource include establishing a standard format for data collection and entry, a phenotypic ontology, a mechanism for follow up and capture of data, secure access, tracking and distribution of data and samples and the ability to adapt to changing protocols. HIPAA regulations need to be considered when developing such resources.

A user friendly, interoperable database will enable access as well as linking and referencing of data and analytic tools from multiple sources. Bioinformatics tools are often developed in a vacuum, separate from the needs of biologists. Tools need to be developed in conjunction with biologists / clinicians; it is clear that analysis in the absence of biology is not a useful exercise.

In summary, engineering is not “sexy science” but it is necessary. Much of this work falls outside of the traditional funding, publishing and tenure realm. Bioinformatics should be a balance of research and “consulting”. Bioinformatic tools must be freely available and professionally documented. Data should be freely accessible within the constraints of HIPAA. Training can occur in different ways; training through partnerships to solve problems is the most productive. We need to keep in mind that mechanistic studies and biomarker searches are different and require different tools.

Recommendations of Working Group

Interactions between disciplines and integrated research need to be promoted. Research should incorporate the genome, proteome, environment/behavior and phenotype and include epidemiologists, clinicians, informaticians, statisticians, and –omics scientists.
There is a need for training of interdisciplinary investigators for the future and retooling of mid and senior investigators entering emerging interdisciplinary fields.
Informatics components need greater emphasis, such as standardization across studies of phenotypic definitions, data handling and sample tracking. New tools must be developed, for example: methods for data and meta analysis/pooling. Computational analytic methods need to converge to biological paradigms and bioinformatics tools need to be developed in concert with biologists and clinicians.
Basic aspects of the proteome need to be defined. Issues such as stability over time within individuals, tissue specificities, post translational changes and reactivity to environmental influences need to be addressed.
Need to ensure data and samples are freely and widely accessible, to the extent practical and possible. Efforts to ensure free and open access to data must consider the investigator rights, human subject issues and notification of subjects of possible clinical implications of findings. This will involve increasing leverage in existing studies. It will be necessary to establish catalogues of studies and samples. Attention must be given to QC, data handling, and knowledge of methods (phenotypic measurements, sample acquisition/handling/storage, genomic/proteomic) in original studies.
Statistical methodological development and expertise is needed in the OMICs field. Areas identified for methodological development include: replication and independent validation of findings, data integration and meta analysis, multiple comparison problems and pathway analysis. There is a clash between the epidemiology tradition (hypothesis testing) and “Omic” discovery/data mining. Communication and collaboration should be promoted to integrate the two approaches.
Assure the continuity of Core Resources such as databases, web sites, genotyping and microarray facilities. Metrics for assessment of use, needs and importance of NHLBI core resources are needed.
A new study or studies are essential for a fresh look, fresh ideas, and fresh faces. Such a study would need a large sample and long term follow-up (synthetic cohort) to capture progression of subclinical disease and/or health state transitions. Due to the high cost there is a need for a new paradigm. Pilot study and/or protocol development and testing prior to initiating the full study would be beneficial.
An NHLBI Repository of gene expression data linked to clinical phenotypes is needed. This repository should be openly accessible and easily searchable. Coordination between studies contributing to these repositories is necessary when studies are designed and data are collected.
There is a need for HLBS Molecular Pathology Tissue Sample Repositories. Coordination between studies contributing to these repositories is necessary when samples are collected. The repository should be openly accessible and easily searchable. Information allowing users to assess the utility of samples for a variety of experiments is necessary.
Establish a sustainable informatics infrastructure for HLBS studies, much like NCBI. This would provide HLBS investigators with easily accessible and searchable, integrated information regarding available samples, repositories, clinical and population study designs, phenotypes, genotypes, clinical trials, etc.
Need for methods development for stimulation tests in additional risk factor areas such as inflammation and immune response. A good example of an existing test is the oral glucose tolerance test. Stimulation tests could be developed on the cellular or organismal level.
Develop fully integrated resources of anonymized clinical samples linked to clinical records. In the “-Omics Age,” few if any data are truly anonymized. Therefore, we may have to settle for partially anonymized data with specific processes in place to protect the identity of participants in order to fully utilize existing clinical resources.