HMS Press Release:

Harvard Medical School Researchers Create First Known Library of Reliably Expressible Proteins of a Human Disease, Breast Cancer
Publicly Available Library to Advance the Pace of Breast Cancer Research
First Use of Library in Model System of Mammalian Cells Opens New Research Pathways, Illustrates Power of Large Scale Testing of Disease-Associated Proteins in Simulated Environment

During the original screens, researchers used a viral vector to insert the cancer-causing cDNAs into breast epithelial cells by hand. The Harvard Institute of Proteomics has since developed an automated process for these screens, allowing researchers to run multiple screens. Photo by Marc Raila, HMS Office of Public Affairs

BOSTON, February 8, 2006 - In research that could significantly advance the pace of drug discovery in the fight against breast cancer, Harvard Medical School investigators announce in today's online Journal of Proteome Research that they have created the first publicly available library of reliably expressible proteins of a human disease, in this case breast cancer.

Perhaps more significantly, the researchers expressed a subset of the library's 1,300 protein-expressing complementary DNAs in a model system mimicking cells of the human breast, allowing them to study on a broad scale how these proteins might contribute to the development of breast cancer. Through this comprehensive approach, they identified potentially novel functional activities for both well-known and lesser-known breast cancer-associated proteins.

"The process of carcinogenesis is complex and involves the activation of many different cellular programs," says Joan Brugge, PhD, Chair, HMS Department of Cell Biology, and co-principal investigator of this initiative, called Breast Cancer 1000. "A significant limitation for breast cancer research has been the inability to distinguish whether certain proteins that are altered in breast tumor cells are the cause or the effect of onversion of normal breast cells to malignancy. The systematic approach that we've enabled and demonstrated will allow researchers to track cancer-causing proteins in simulated environments, with the goal of learning how to impede them."

"The availability of this collection will enable pilot experimentation and accelerate the development of faster techniques for studying breast cancer in a mammalian setting," says Joshua LaBaer, MD, PhD, director of the Harvard Institute of Proteomics (a division of Harvard Medical School), and also co-principal investigator. "To advance breast cancer research quickly, we are making the BC1000 library publicly available. It can be viewed from the Harvard Institute of Proteomics website.

"Drug design teams in the pharmaceutical industry traditionally have not used proteomics approaches to screen for potential targets, primarily because systematic proteomic tools are in their infancy," said Steven Carr, PhD, who was not part of this research team, and who leads the Proteomics group at the Broad Institute (of Harvard University and Massachusetts Institute of Technology). "While this work is highly in-vitro and needs further validation, the tools and approaches demonstrated in this study show a potentially valuable screening tool for drug companies, primarily as a means to triage for novel targets to design drugs around," said Carr, who prior to joining the Broad was director of Computational and Structural Sciences at SmithKline Pharmaceuticals and (now GlaxoSmithKline) and led protein science and proteomics groups at Millennium Pharmaceuticals. "This study helps lay the groundwork for new and refined proteomics tools for cancer and other diseases."

The American Cancer Society estimated that 211,240 new cases of invasive breast cancer would be diagnosed among women in 2005, as well as an estimated 58,490 additional cases of non-invasive (in situ) breast cancer. The ACS also estimated that approximately 40,410 women would die from breast cancer last year. Only lung cancer accounts for more cancer deaths in women.

ASSEMBLING THE LIBRARY

The Breast Cancer 1000 library is a collection of complementary DNAs (cDNAs) associated with breast cancer. Complementary DNA is generated from mRNA, which is produced by genes and contains the instructions on how to produce proteins. However, mRNAs are unstable outside of a cell, and therefore scientists convert them to cDNA for long-term use. Researchers in BC1000 created a sequence-validated collection of roughly 1,300 breast cancer-related cDNAs ranging from well-studied breast cancer-causing genes to less conspicuous breast cancer-associated cDNAs.

Selection of the cDNAs for inclusion in the BC1000 library was a multipart effort. The first 200 genes were suggested by Boston-area experts in breast cancer research. Another 50 genes were shown to be overexpressed in ductal carcinoma, one form of breast cancer. The remainder were identified by MedGene, a literature-mining software application developed by the Harvard Institute of Proteomics. MedGene searches all titles and abstracts in the Medline database to identify cDNAs co-cited with a particular disease, then uses statistical methods to rank the relative strengths of these gene-disease relationships based on the frequency of total citation and co-citation.
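For readers curious how co-citation ranking can work in principle, the following is a minimal sketch, not MedGene's actual code or statistics. It assumes a tiny invented list of abstracts, plain substring matching, and a simple observed-versus-expected ratio as the score. The function name `rank_genes_by_cocitation` and the sample abstracts are hypothetical.

```python
from collections import Counter


def rank_genes_by_cocitation(abstracts, genes, disease):
    """Rank genes by observed / expected co-citation with a disease.

    Illustrative assumption: the score is the number of abstracts that
    mention both the gene and the disease, divided by the number expected
    if gene and disease mentions were independent.
    """
    total = len(abstracts)
    gene_hits = Counter()   # abstracts mentioning each gene (total citation)
    co_hits = Counter()     # abstracts mentioning gene AND disease (co-citation)
    disease_hits = 0        # abstracts mentioning the disease

    for text in abstracts:
        lowered = text.lower()
        has_disease = disease.lower() in lowered
        if has_disease:
            disease_hits += 1
        for gene in genes:
            if gene.lower() in lowered:
                gene_hits[gene] += 1
                if has_disease:
                    co_hits[gene] += 1

    scores = {}
    for gene in genes:
        if co_hits[gene] == 0:
            continue  # never co-cited with the disease, so not ranked
        expected = gene_hits[gene] * disease_hits / total
        scores[gene] = co_hits[gene] / expected

    # Strongest gene-disease association first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy abstracts, for illustration only
toy_abstracts = [
    "ERBB2 amplification in breast cancer",
    "ERBB2 signaling in breast cancer cells",
    "TP53 mutations in colon tumors",
    "TP53 and breast cancer risk",
    "TP53 in lung tumors",
]
ranked = rank_genes_by_cocitation(toy_abstracts, ["ERBB2", "TP53"], "breast cancer")
for gene, score in ranked:
    print(gene, round(score, 2))
```

In this toy data, a gene cited only alongside the disease ranks above a gene that is cited more often overall but mostly in other contexts, which is the intuition behind weighing co-citation against total citation.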

"The work to isolate, sequence, and validate the BC1000 cDNAs was an immense undertaking, with multiple parties involved,î says LaBaer. "While the library covers a broad spectrum of breast cancer-related genes, it is not all inclusive," says LaBaer. "The addition of new genes to this collection, including genes more recently linked to breast cancer and genes more difficult to clone, is an ongoing effort."

OPENING THE BOOK ON PROTEINS INVOLVED IN CANCER

To assess the range and functionality of the cDNAs in the library, the investigators introduced the first 265 constructed cDNAs into a line of immortalized breast epithelial cells and subjected these cells to a single screen to examine their relationships to cell migration, proliferation and morphogenesis. From this screen, the researchers identified cDNAs already known to play roles in each, validating this approach as a means to identify relevant cDNAs. They also received hits from less-studied breast cancer genes, demonstrating the capability of using unbiased functional proteomics approaches to identify novel genes related to various aspects of disease biology.

The screen also identified novel functional activities for cDNAs known to be involved in other aspects of carcinogenesis. For example, researchers identified several proteins that stimulate migratory behavior, specifically IL4, IL11, and IL13. These results support previous findings implying that this class of proteins may also be involved in bone metastases, and reflect the diverse functional activities of proteins that contribute to migratory behavior.

"The migration findings are particularly important, as historically the roles of genes in the process of invasion and metastasis--the most devastating aspects of cancer--have been very difficult to test," says LaBaer, "But tools such as BC1000 make this research much more accessible."

Several unexpected cDNAs were also found capable of inducing migration cooperatively when a known cancer-associated cell-signaling pathway was also activated. For example, the proteins SGK (serum and glucocorticoid-regulated kinase-1) and TNFRSF10B (tumor necrosis factor receptor, 10B) were both identified as pro-migratory, although they were previously recognized for their involvement in cell survival. The finding that cDNAs known to be involved in other cellular processes may also play a role in migration suggests that this approach may help uncover unanticipated activities for previously identified proteins.

The screen also identified genes that predictably and strongly induce cell proliferation. But in addition to the known genes, several other proteins that had not previously been implicated in cell proliferation were identified.

The morphogenesis and migration screens produced the greatest number of hits from the BC1000 cDNAs. Of the 75 cDNAs that induced cellular migration in the preliminary, single-pass screen, 66 were retested and 41 of these reproducibly scored as valid hits. The 41 validated migration hits were also reassessed in a morphogenesis assay, in which breast epithelial cells are able to organize into structures that resemble the glandular units of the normal breast. Of these 41 migration hits, 20 induced alterations in the morphology of such structures. The majority of these cDNAs prevented the formation of the typical hollow, spherical masses, and many of the disorganized structures showed a protrusive behavior resembling certain aspects of invasive tumor cells.

"Our labs will be following up on these new hits and further characterizing their meaning to the field," say Brugge and LaBaer. "This open process is exciting and hopefully will lead to the development of new therapy concepts."

A defined cDNA collection like the BC1000 has several advantages over random, non-specific pooled cDNA libraries, whose limitations have historically restricted the kind of detailed investigations necessary to fully interpret the causes of disease. In the BC1000 library, the identity of each clone is known; each clone is known to be of good quality, i.e., full-length and free of mutations; complex phenotypic assays are feasible, as it is not necessary to sample millions of clones to compensate for the redundancy found in pooled cDNA libraries; and lastly, there is more assurance that rare cDNAs are represented.

This work received significant funding from The Breast Cancer Research Foundation, the Cell Migration Consortium, and a program project and SPORE grant from the National Cancer Institute.

Contact:
John Lacey
Harvard Medical School
617-432-0442
(public_affairs@hms.harvard.edu)