Interview with Dr. Atul Butte
October 17, 2007

Introduction: This Findings Podcast is brought to you by the National Institute of General Medical Sciences, part of the National Institutes of Health. The Findings Podcast series features the NIGMS-funded scientists profiled in each issue of the Findings magazine.

Machalek: Hi, I'm Alisa Machalek and I'm here with Atul Butte, who is both a physician and a researcher at Stanford School of Medicine. As a doctor, he treats kids who have growth problems and diabetes. And as a scientist, he develops new computational tools to analyze huge amounts of biological data. He has now combined these two passions into a new project in an area called computational biology, or bioinformatics. Dr. Butte, can you explain to us what computational biology or bioinformatics is?

Butte: So bioinformatics and computational biology to me are very similar terms. The nature of life science research today is increasingly digital, and that's not a new thing. The idea of sequencing DNA itself gets you a digitized set of letters or nucleotides of the genome. People have known for the past 20 or 30 years that you could treat DNA analysis digitally. The thing now is that a lot of the data in life sciences research is digital and it is growing exponentially. So when people think about how fast computers are getting each year, actually the data in life science is growing faster than that. In fact, we are going to reach a point where today’s computers are just not even going to be able to compute on all the life science data we have. And at that point we have to start thinking intelligently about what we are actually going to do with all this life sciences data we are collecting. And to me that is really the essence of bioinformatics.

Machalek: Dr. Butte, can you tell me what it is that you do in this area?

Butte: So I am a medical doctor. I am a pediatric endocrinologist. I take care of kids with diabetes and growth problems. I think about diseases. So I’ve been increasingly interested in thinking about all diseases. There are lots of diseases that affect us, diseases that affect the kids I treat, as well as elderly adults. And we’ve been thinking about how to classify diseases going back more than a thousand years. But most recently, instead of thinking about diseases the old-fashioned way, maybe about diseases that affect the heart or diseases that affect the lungs, we want to think about diseases based on hard numbers now. How do these diseases affect the entire genome? And now we have these technologies called gene chips that researchers all over the world are using that let us quantitate or measure every single gene in the genome and how much it changes in each disease. So what we do specifically in my lab is we go to the repositories, the banks of all this data, and we pull it all down into our databases and we sift through it and figure out all the different diseases that have ever been studied, using this 10-year-old technology. Using that, we start to build a brand new classification scheme for all of medicine. It’s a nice idea to do that, it makes some pretty pictures, but the real use for that is to think about diseases that maybe we don’t consider similar. For example, we know that elderly adults get myocardial infarctions, or heart attacks. Everyone knows what a heart attack is. Sometimes kids are born with a rare condition called muscular dystrophy. People know about the muscular dystrophy telethon; it’s a muscle weakness that kids are born with. And computationally now we can say that maybe these two diseases are actually similar to each other.
And that’s fun for us because that means that we can start to think about all those drugs that we use to treat heart attacks, maybe a few of them, maybe even one of them, could be useful in a kid with muscular dystrophy, because today we don’t have any drugs for that. So we are using computers and publicly available data that’s growing like crazy to start to make new predictions for existing drugs, new uses for those drugs.
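
Illustration (not part of the recorded interview): the comparison described above can be sketched, very roughly, as follows. The sketch assumes each disease is summarized by one list of per-gene expression changes and that similarity is scored with a plain Pearson correlation. Both are assumptions made for illustration; the interview does not say which measure Dr. Butte's lab uses. The function names, disease names, and numbers are all made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / (spread_x * spread_y)


def most_similar_pairs(signatures):
    """Rank every pair of diseases by signature correlation, highest first.

    `signatures` maps a disease name to a list of per-gene changes,
    with genes in the same order for every disease.
    """
    names = sorted(signatures)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = pearson(signatures[first], signatures[second])
            pairs.append((score, first, second))
    return sorted(pairs, reverse=True)


# Hypothetical toy numbers, NOT real measurements: one value per gene.
TOY_SIGNATURES = {
    "disease_a": [1.0, 2.0, -1.0, 0.5],
    "disease_b": [0.5, 1.0, -0.5, 0.25],   # same pattern as disease_a, half the size
    "disease_c": [-1.0, -2.0, 1.0, -0.5],  # opposite pattern to disease_a
}

if __name__ == "__main__":
    for score, first, second in most_similar_pairs(TOY_SIGNATURES):
        print(f"{first} vs {second}: {score:+.2f}")
```

In this toy data, disease_a and disease_b rise and fall across the same genes, so they come out as the closest pair, while disease_c moves in the opposite direction. A real analysis would cover thousands of genes and many samples per disease, and would need normalization and statistical testing that this sketch leaves out.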

Machalek: Dr. Butte, what advice would you have for young people who are interested in following in your footsteps, getting into this same kind of field of work?

Butte: I think this is an amazing time to go into computational biology and bioinformatics. It’s amazing because any high school kid today has enough Internet know-how to go to the same repositories I go to and download hundreds of thousands of these little gene chips. The data is all digitized. You don’t even have to learn how to handle cells in a wet biology lab. It’s all digital. Any kid on the Internet who knows how to click on Web sites and search for things using the Internet can easily find 5,000 samples on breast cancer today. Imagine doing that for a high school biology fair project. And those microarrays were priceless 10 years ago. So the fact is that so many researchers use them that NIH and the funding agencies put us under mandates to share that data with others. And that gets down to the level of high school students. So, the first recommendation I would have is to get excited about biology again. Because it doesn’t have to be about plants and animals, it can actually be about computer science. The second thing is that many colleges and universities now offer majors in biomedical computation. We offer one at Stanford; there are many other universities that offer this. If there had been a major like that when I was in college 10 or 15 years ago, I would have absolutely majored in it. It’s an exciting time to be able to take courses in biology and computer science and learn how to put those two things together. So really, the common theme is to get excited about biology and computer science at the same time. The data is growing like crazy. What we are really lacking is folks and kids and really researchers who can think about new questions we can ask of this data.