National Cancer Institute, U.S. National Institutes of Health
DCEG Linkage
November 2008 • Number 34
   

Kai Yu Loves to Solve Problems


“It was really love at first sight,” said Kai Yu, Ph.D., referring to the first time he saw the advertisement for his present position as an investigator in the Biostatistics Branch (BB). “I wasn’t even really looking. I was happy where I was, but I saw the job ad on the Internet, and it looked like the perfect opportunity.”

Dr. Yu completed his M.S. in applied mathematics at the Beijing University of Posts and Telecommunications in China before coming to the United States to study computational engineering. He changed direction because he “found applied mathematics too confining. You tend to focus on one technique and apply it to any relevant question.” Interested in biology but trained in mathematics, he combined the two disciplines, completing a Ph.D. in biostatistics at the University of Pittsburgh followed by postdoctoral work in statistical genetics at Stanford University.

Excited by rapid advances in medical technology and the development of large datasets, Dr. Yu saw statistical genetics as a very promising field. “I had little difficulty picking up the basics of genetics,” Dr. Yu said. “There are a few fundamental rules, and everything flows from them. Besides,” he chuckled, “one of the foundational concepts of genetics, the Hardy-Weinberg equilibrium, was first described by a mathematician.”

His first contact at NCI in 2005 was Sholom Wacholder, Ph.D. (BB), who quickly confirmed Dr. Yu’s initial impressions of the Institute as a place for groundbreaking research. Dr. Yu’s conversations with Dr. Wacholder convinced him to leave his position in the School of Medicine at Washington University in St. Louis and join NCI. “Dr. Wacholder has become my mentor,” said Dr. Yu. “I have worked with him on several projects and he is always extremely helpful. Of course my other colleagues in the branch are also very supportive.”

What Dr. Yu loves about his work at NCI is the opportunity to work on high-impact projects. “You’re not just working on abstract problems, but rather on something relevant to current research,” he said. For example, he recently teamed up with investigators in the Cancer Genetic Markers of Susceptibility (CGEMS) project to study the impact of population stratification in genome-wide association studies (GWAS) with different control selection strategies.

For this investigation, researchers generated two new studies using data from the CGEMS multistage studies of prostate and breast cancers. The first combined case data from the prostate cancer GWAS with control data from the breast cancer GWAS; the second compared case data from the breast cancer GWAS with control data from the prostate cancer GWAS. Analysis of the two original studies revealed very minor inflation of type I error when the cases were compared with their own controls. As expected, exchanging control groups increased this inflation factor. Dr. Yu and his colleagues developed a principal components selection procedure that allowed them to correct this inflation back to the original levels. As Dr. Yu explained, “our findings suggested that the reuse of controls from other studies can have acceptable type I error when a strategy to correct the effects of population stratification is employed. This opens the door to more cost-efficient GWAS with equally reliable results.”
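To make the idea of an inflation factor concrete, here is a minimal sketch of one common way to quantify it: the ratio of the observed median test statistic to the median expected under the null hypothesis. Whether the CGEMS analysis used exactly this measure is an assumption, and the simulated statistics, the scale value of 1.3, and the function names are hypothetical. The sketch does not implement the principal components selection procedure described above.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Observed median test statistic divided by the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def simulate_stats(n_snps, scale, rng):
    """Draw 1-df chi-square statistics; scale > 1 mimics inflation (hypothetical)."""
    return [scale * rng.gauss(0.0, 1.0) ** 2 for _ in range(n_snps)]


rng = random.Random(0)
own_controls = simulate_stats(20000, 1.0, rng)      # cases vs. their own controls
swapped_controls = simulate_stats(20000, 1.3, rng)  # cases vs. borrowed controls

lambda_own = inflation_factor(own_controls)
lambda_swapped = inflation_factor(swapped_controls)
print(f"own controls:     lambda = {lambda_own:.3f}")
print(f"swapped controls: lambda = {lambda_swapped:.3f}")
```

A value near 1 indicates little inflation of type I error; values well above 1 suggest that something other than chance, such as population stratification, is pushing the test statistics upward.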

Dr. Yu has also explored new approaches to more effective correction of population stratification, working with a postdoctoral fellow, Qizhai Li, Ph.D. (BB). Using multidimensional scaling techniques, they uncovered both clustered and continuous patterns of population substructure and adjusted the data with improved strategies. “Currently, in carefully designed studies of European Americans, we don’t see a big problem with population substructure,” he explained. “For more diverse populations, this could become a real issue.”

Explaining another of his recent investigations, Dr. Yu noted that “once you find a ‘hit’ (i.e., an association between a single nucleotide polymorphism and cancer), you want to design a second study to replicate it, to see if your results are reliable. You need to know how big your sample should be and whether to combine data from the second study with those from the first.” Dr. Yu and his colleagues used data from a published study of non-Hodgkin lymphoma to estimate the effect size for disease markers by a bootstrap procedure. “Because the naive estimate (the one observed from the first study) tends to be biased upward,” Dr. Yu explained, “your power calculation gives a smaller sample size than actually needed.” They showed that their bootstrap estimates were more accurate than the naive ones, and that the resulting sample-size calculations gave power closer to the nominal level. They also concluded that combining data from the first study with those from the second is generally superior to analyzing the second study alone. “This study is an important first step toward a solution to this question,” Dr. Yu said.
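The upward bias Dr. Yu describes can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true effect, first-study size, selection threshold, and the simple z-test sample-size formula are all hypothetical choices made for illustration, and the code does not implement the bootstrap correction used in the study.

```python
import random
from statistics import NormalDist, mean


def required_n(effect, alpha=0.05, power=0.8):
    """Approximate sample size for a two-sided z-test of a standardized effect."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ((z_alpha + z_beta) / effect) ** 2


rng = random.Random(1)
TRUE_EFFECT = 0.1               # hypothetical true standardized effect
N_FIRST = 400                   # hypothetical first-study sample size
se = 1 / N_FIRST ** 0.5         # standard error of the first-study estimate

# A marker counts as a "hit" only if its estimate clears a strict threshold.
threshold = NormalDist().inv_cdf(1 - 0.001 / 2) * se

# Simulate many first studies of the same marker and keep the positive hits.
estimates = [rng.gauss(TRUE_EFFECT, se) for _ in range(50000)]
hits = [b for b in estimates if b > threshold]

naive_estimate = mean(hits)     # what investigators would observe after a hit
n_naive = required_n(naive_estimate)
n_true = required_n(TRUE_EFFECT)

print(f"true effect {TRUE_EFFECT}, mean naive estimate {naive_estimate:.3f}")
print(f"sample size from naive estimate: {n_naive:.0f}; actually needed: {n_true:.0f}")
```

Because only estimates that happen to exceed the threshold are reported as hits, their average overstates the true effect, and a replication study sized from that average is underpowered.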

Perhaps the most exciting of Dr. Yu’s ongoing investigations is the search for ways to combine biological knowledge of genetic pathways and evidence from microarray studies with GWAS to uncover true genetic risk factors for cancer. Dr. Yu believes it is critical to better incorporate existing knowledge of the disease into the gene-finding algorithm. “We need a better, more powerful algorithm guided by existing knowledge that can search the entire space of all possible models more intelligently.”

“I love solving problems,” he said. “The good thing about my work here at NCI is that problem solving can have a direct impact on public health.”

—Terry Taylor, M.A.
