Skip navigation links
 
NIGMS Home | Site Map | Staff Search

The Virtual Cell Project

Dr. Les Loew
Workshop on the Development of Databases of Interacting Systems
November 11, 1999

Well, I couldn't have had a better introduction to my talk than Dr. Selkoff's--Dr. Selkoff's talk, as well as the comment by Dr. Paulson at the end there, because I think that the issue that Dr. Paulson raised is precisely the issue that we are, perhaps very over-ambitiously, trying to address.

Several people talked about the virtual cell as being sort of the ultimate goal of many of these database projects. We're being a little bit presumptuous, obviously, in calling our project The Virtual Cell.

[SLIDE 1] This is not what you're supposed to read, but it's kind of amusing if you are reading it. This was written by Auguste DeCompte, who was a mathematician and biologist in the middle of the 19th century.

Believe it or not, I think many of my biologist colleagues--experimental biologists--still hold to this view. It's our mission, I think, as computational biologists and bio-informaticists, to try to make people realize that that's not the appropriate view. Let me see if I can get to where I want to go.

[SLIDE 2] My acknowledgement is, first of all, to all of the people who have been working on it, and I can't list their names. There are about 10 or 12 people who have been working either on the biology side or the computational side.

The two primary developers are Jim Schaff, a computer scientist, and Boris Slepchenko, who is a physicist and expert in numerical methods.

[SLIDE 3] I also need to acknowledge NCRR and NIGMS. NCRR is sponsoring the development project, but NIGMS has been sponsoring the biology that's been going on in my lab for the last 21 years. So, since I'm here, obviously it's very political to acknowledge that long-term support.

[SLIDE 4] So what is The Virtual Cell Project? It's not really a virtual cell. We're not trying to build a virtual E. coli--learn everything there is about a virtual E. coli or virtual neuron or virtual hela cell.

We're trying to build a modeling environment to do just what Dr. Paulson said we should be doing--to take all this information that we're gathering and start being able to make predictions, prompt new experiments, and go back and refine models and hypotheses.

So the modeling system is designed to be used interactively with experiment. It enables construction and testing of complex models. We can also look at simple systems.

A key point is that this is a spatial system--a spatial model that can be built. So the geometry that we use to map our simulation can be taken directly from real images.

An investigator gets an image through a microscope and looks at, say, a calcium wave, which is one of the examples that I'll give. He can then map the simulation to exactly the same image and use the tools that a biologist would use to analyze such an image from an experiment to also analyze the results of the simulation.

The math, the physics, and numerics are transparent to the experiment list, so that goes to the issue that Dr. Paulson raised. But they're in there and they're fully accessible to a theorist, somebody who does professional modeling.

This is a screen, the last screen, really, from The Virtual Cell. It shows you that we're mapping our simulations to a very complex geometry. This is a geometry of a neuron with an extending neurite. The right-hand side, that little bulb at the end, is a growth cone. We can look at our simulations in that real biological context.

[SLIDE 5] The math and physics that underlies The Virtual Cell, the domain of The Virtual Cell, is described on this slide. It encompasses the Nernst-Planck equation shown at the top, and, of course, reactions.

It can be any kind of reaction. We don't have to specify it in terms of a mass action type of equation. A general kind of equation could describe it. The reaction could be a Michaelis-Menten equation, for example.

Then these are combined into a series of differential equations to define the change in concentration over time in terms of what's called a reaction diffusion equation--diffusion plus source term. That's the general domain of the system, and that encompasses quite a bit.

We can look at neuronal processes; we can look at changes in membrane potential. Some of the things that Christopher Hogue said his system can't do, our system can do. But, of course, our system is not a database.

It can look at the entire domain of signal transduction processes for metabolic pathways-- all of that is possible. What it can't do, and I'll say very clearly what it can't do at this point, although hopefully in the future we will add that to this list, is the kinds of cell biological processes that involve mechanical forces and changes in shape. We can't model the entire process of cytokinesis, for example. That's not in the system as of today, but hopefully in the future we'll be able to include that as well.

[SLIDE 6] The structure of the modeling process is shown here, and it's very modular. The reason that that's important, and especially can be appreciated by this group, is that it lends itself to a database sort of structure.

The physiology of a system is described in terms of the topological arrangement of the compartments of the cell, and I'm sure Dr. Selkoff has incorporated this kind of approach into his system as well. If you're looking at a complex process that involves several molecules from different compartments, say from the cytosol, the nucleus, the endoplasmic reticulum, you have to define the topology of those different compartments with respect to each other.

You have to obviously define the molecules that are involved, the enzymes, the substrates, the signal transduction molecules. You have to define how those molecules interact, you have to tell the system what the reactions are, and you have to include rate information. You have to include numbers because we're solving differential equations.

You also can define the geometry of the system, as I showed you in the previous slide, from a real geometry, or if you want to construct a spherical cell, you can do that by expressing analytical geometry.

The topology that you define is then mapped to the geometry. In specifying a specific model, the reactions are also analyzed. You specify diffusion coefficients, initial conditions, and boundary conditions.

Then this whole system is then converted to a math description language. So the input is in the form that biologists are used to, images and reactions, and then the combination of all that is converted into equations which are available but don't have to concern the biologist.

The equations are then--there's automatic co-generation to encode and to produce the C++ program that is solved by the solvers to produce simulations. Simulations can be analyzed and compared to experiment. This is a critical part--that there has to be a constant interplay with experiment, and that can then be used to refine the model. So, it's a very cyclical process.

The link to database resources is still unavailable, and I think this is a missing piece that obviously the group in this workshop can certainly help address. This is something that we're very interested in doing. We've made a little bit of a start on it. But we don't have any interest in reinventing the wheel, so if you guys can help us link our system to one of your systems, that would be great.

I'm just going to go through the first part of the talk giving a little overview of the system, and then the second half of the talk, I really wanted to show you a couple of examples to give you an idea that this is something that's concrete and that's available. If I were braver, I would have tried to show it to you all on the Web, because we have a Web site that allows you to enter models directly from anyplace in the world using Java technology.

[SLIDE 7] This is one of the screens from the system, and this is where you define the topology. In this case, extra-cellular is separated from cytosol by a plasma membrane, and the endoplasmic reticulum (ER) is inside the cytosol.

We also define molecules in this particular screen--calcium, inositol trisphosphate, etc. This is one of the models that I'll show you as an example.

We can double-click on any one of those compartments. Say you double-click on the cytosol to find a reaction. [SLIDE 8] So this is a simple reaction--just calcium interacting with a calcium binding protein. We can double-click on this little yellow dumbbell and [SLIDE 9] get a dialogue box that allows us to put in the equation for the reaction kinetics.

[SLIDE 10] We can also define fluxes across membranes. We double-click on any of the membranes, and we can link species from one compartment to the other. Here's calcium moving from the endoplasmic reticulum, which is a calcium storage organelle, to the cytosol through these channels.

[SLIDE 11] As I mentioned, we have the ability to use real geometries. So here's a little histology image from intestinal epithelium. We can take this little part here and do a segmentation to say where the nucleus is assigned a gray level of 128, call the cytosol 255, and call the extra-cellular 1. We segment the image, and then the model can be mapped, the topology can be mapped, directly to those regions.

[SLIDE 12] After all this is said and done, you click a button and it automatically generates the math code. This is our math interface language. If a modeler wants to get in there and change something at this level or write this code out from the beginning, he can do that as well.

[SLIDE 13] This is the overall architecture of the system as it stands today. As I've said, it's completely accessible through a Web browser using Java technology. The server, which resides in Connecticut, gives us the remote method interface services (RMI), the resident Java approach to dealing with distributed architecture. The image database is built in Oracle, but that's the only part of the system that's currently sort of a database system, and it's very primitive. The reaction kinetics, the models themselves, and the simulation results are all on a simple Unix file repository. So we plan to merge this information into a database ultimately as well, but we haven't done so yet, and we could certainly use help. Finally, the actual execution of the numerics takes place on an application server, which is a cluster of alpha computers.

[SLIDE 14] So this is usually the point where I try to convince experimentalists that this is a worthwhile enterprise. Why do we want to do this? Presumably, most of you can appreciate this. Obviously, if you build a model, you can explore how well that model--and model is really just a synonym for a hypothesis--how well does that hypothesis predict the experiment? So, you can explore how good your hypotheses are.

You can make predictions that can be tested experimentally. They could strengthen the model. Or actually, a more powerful result would be if the model is shot down because the prediction doesn't fit the experiment. That tells you that you have to go back to the model again to refine it.

You can look at molecules that can't be visualized experimentally. As you know, we can visualize calcium, for example, inside cells using fluorescent calcium indicators, but you can't visualize inositol trisphosphate or other sugars in living cells. So, since the IP3, the inositol trisphosphate, is a messenger that causes the release of calcium from the endoplasmic reticulum, this map gives us a handle on being able to watch that process, that particular variable in the system. In fact, we can look at any variable that is in our model just as easily as we can look at variables that are experimentally accessible. So, it gives us new insights.

These are the examples that we've been working on, either in our own labs or with collaborators. I'll talk about this calcium wave example. We've looked at long-term depression in cerebellar Purkinje cells. We've looked at a fertilization wave in frog eggs. We've looked at mitochondrial respiration. In fact, you can look at models that are just small parts of a cell or single organelles. You don't have to look at an entire cell. This is an example of that. We're looking at RNA granule trafficking, nuclear envelope breakdown, epithelial cell transport. These are all ongoing projects.

[SLIDE 15] I'll talk about the calcium wave. This is an experiment. This is the kind of experiment you can get by putting a fluorescent indicator inside a cell--this is a neuronal cell--and visualizing the change in calcium. This cell has been treated with a neuromodulator called bradykinin. It's a small peptide that has receptors on the plasma membrane. That triggers a chain of events, which ultimately causes the release of calcium from the endoplasmic reticulum. So we add bradykinin at time zero, and this is a little movie that shows what's going on. So you don't see anything for the first couple of seconds, and then there's this beautiful calcium wave that goes throughout the cell. The picture over here is just a series of frames from that movie. The critical pieces of information that you need to remember, is that the calcium wave starts off in the middle of the cell, in the neurite, and it spreads into the rest of the cells in both directions. The amplitude of the calcium wave is about the same everywhere in the cell. So it starts off at about one micromolar at peak in the neurite and ultimately reaches one micromolar in the cell body, the soma, where the nucleus is.

[SLIDE 16] This is the biochemistry underlying that calcium wave, and each one of these steps has to be modeled. Bradykinin, these little gray balls here, are representing the bradykinin molecule. It binds to the bradykinin receptor. That sets off a chain of events, ultimately causing the breakdown of a lipid molecule called phosphatidyl inositol trisphosphate, to release IP3, which is now water soluble and can diffuse through the cytosol. IP3 binds to IP3 receptors on the endoplasmic reticulum, which also happen to be calcium channels.

So, as I mentioned before, the ER is a calcium storage organelle. The calcium, these blue balls, is much higher in the ER than it is in the cytosol; when IP3 binds to this calcium channel, it can trigger the opening of the channel. That causes calcium to come spilling out into the cytosol, and that's the signal that we measure. That's the observable. That's the signal that we can measure. But, in addition to this, we also have to take into account the breakdown of IP3. IP3 can be degraded during the time course that we're measuring to inactive components.

We also have to worry about the binding of calcium to endogenous calcium binding proteins and to the indicator that we added to do the measurement, because that can also perturb the system.

Finally, the recovery phase is controlled by a calcium pump in the endoplasmic reticulum called a SERCA-pump. This is an ATPase that pumps calcium back in, and there are also some calcium extrusion mechanisms that send calcium out of the cell through the plasma membrane.

So all of these components have to be included and built into the model, and they're not just connect-the-dots. This one, in particular, the IP3 receptor, is an extraordinarily complex mechanism that involves both, obviously, activation by IP3, but also positive feedback by calcium itself and negative feedback by calcium in two separate steps. So the channel here is activated by calcium initially, but then as calcium builds up, it becomes inactivated by calcium. So, you can get the kinds of oscillations that Dr. Selkoff was talking about because of that mechanism. This particular cell that I've shown you, or this particular bradykinin-induced calcium wave, does not display those kinds of oscillations, but we are able to model them.

In addition to all of the biochemistry--this is a spatial model, after all--we have to worry about the spatial component. One obvious feature of the spatial component is the actual shape of the cell. It's not a simple wall--it has this very complex neuronal shape. In addition to just the shape of the cell, we have to worry about the distribution of the receptors. We have to worry about the distribution of the bradykinin receptor along the plasma membrane. [SLIDE 17] Here, we're showing you the distributions of the IP3 receptor on the endoplasmic reticulum membrane. You can see that it's much denser in the soma, the cell body--this is a cross section for the cell body--compared to the neurite. The SERCA distribution, the calcium ATPase, shows a similar distribution, much denser in the soma than the neurite. The reason for that, basically, is that the endoplasmic reticulum has this very uneven distribution. So we have to worry about that kind of factor as well and map that onto our model.

[SLIDE 18] So usually it takes me about 10 or 15 minutes to just go through all of the inputs to this very complex model, but I'm summarizing it very quickly on this slide. We look at IP3 development and degradation. This is biochemistry done in the lab. We get a lot of information from the literature, because this is going to very well study cell type. We measure the calcium indicator concentration, and we also take an image of the cell and map the distribution of these various receptors onto the image.

[SLIDE 19] So, when all is said and done, here are the results. Now this is the simulation. You see two color bars here, one for calcium and one for IP3. IP3 can't be measured, but we can visualize it with our simulation. IP3 comes up before calcium, and then calcium comes up afterwards. The calcium wave that you see here is virtually superimposable on the calcium wave from the experiment.

The match is very good. The reason that the match is good is not because we did all this and it worked, but because it was a real interplay between experiment and simulation. This took about 2 years of a graduate student's life to do to get to this point.

Now, I'll try to give you at least a feeling for some of that interplay. The IP3 goes up faster than the calcium, and it accumulates preferentially in the neurite, in the thin part of the cell. The IP3 plot is showing you that there's a much greater amount of IP3 in the neurite than there is in the soma.

The question is, if there's so much more IP3 accumulating in the neurite than the soma, then why is it that the calcium levels end up being about the same in both regions of the cell?

The answer takes us back to the uneven distribution of the endoplasmic reticulum. The endoplasmic reticulum is biased towards the soma, so the two things balance each other out. And it's that kind of fact that we were able to glean from doing this model.

For our initial model, of course, we took the simplest road--we had an even distribution of ER. We didn't have any reason to do all of those immunofluorescence experiments, so we initially just assumed the ER is probably of uniform density throughout the cell.

When we did that, we didn't get a calcium wave that was anywhere near what we saw in the experiment. There are a number of other examples of that kind of interactive process that makes this kind of modeling really, really valuable.

[SLIDE 20] Another kind of experiment that we did, was to--again, we did the simulation before the experiment--uncage IP3 uniformly throughout the cell. This involves microinjecting an IP3 that is covalently modified with a photolyzable group that renders it inactive. Flashing the cell with UV light releases active IP3 uniformly throughout the volume. It's not coming in from the membrane the way it is for the bradykinin. Well, then the calcium should go up more in the cell body because that's where we have a higher density of ER. The simulation, of course, shows that and its prediction is realized in the experiment.

[SLIDE 21] In a real physiological situation, of course, bradykinin isn't delivered to these long neuronal cells uniformly the way we did in our experiment--it's delivered to specific parts of the cell. And it occurred to us that, since the middle of the cell seems to be the active region or the region that gets the biggest IP3 response in the simulation, that maybe that's also the region that is sufficient for initiation of a calcium wave. If we only stimulate the middle of the cell, do we get a calcium wave? The answer is yes. We can sort of turn off all the bradykinin receptors in the soma and in the tip of the neurite and only have them active in the middle of the neurite and do this sort of thought experiment. And, sure enough, you get a complete calcium wave here.

If you only stimulate the tip of the neurite, you get a calcium increase, but not in a complete wave. If you only stimulate the soma, you get a calcium increase but not a complete wave.

[SLIDE 22] Of course, the next step is to test this experimentally, which we did by taking a micropipet drawn out to a fine capillary point and puffing on bradykinin to just these three regions of the cell, and the behavior is exactly as predicted by the simulation.

So we believe that our model is robust. But, again, I don't want to give you the impression that this is the first time we got there. This was an iterative process between experiment and model.

[SLIDE 23] This is Chuck Fink, the graduate student who did that 2 years' worth of work. This is his birthday party, but also his graduation party, so he looks very, very happy on this day. This is just a summary of some of the conclusions, not all of which I had time to really tell you about in detail.

[SLIDE 24] But I wanted to show you one other system. This is only two slides. The reason I want to show you a different system is because it's really very different. It illustrates the scope of The Virtual Cell framework. It shows that stochastic models can be addressed in our formulation. This is available in the framework, but it's not in the user interface yet. In the user interface, we can only deal with deterministic problems at this point.

This was a collaboration with John Carson, a colleague of mine, who is interested in RNA trafficking in oligodendrocytes. Oligodendrocytes are the cells that produce the myelin sheath that surrounds axons in the nervous system.

He was interested in a process by which RNA moves out of the nucleus into the cytosol, is captured by granules that are then transported down microtubules to the place where protein synthesis has to occur. This happens to be RNA that codes from myelin-basic protein, which is produced in those long membranous sheaths.

[SLIDE 25] I'll show you a picture of the cell. This is what one of these cells actually looks like. The nucleus is in here, and these are these long processes that culminate in the myelin sheaths. The myelin membranes are too thin to really see in this image. You'll see little pieces of them over here. But these are these long tubes that connect the nucleus to the myelin sheath.

Some branches don't have any RNA granules in them, so the question that John Carson was interested in is, how does the cell decide to send RNA granules down one path and not another? So now in addition to a stochastic model, the other thing that we're illustrating here is that we can model the motion--we can model uni-directional motion, because we're moving in one direction down these microtubules away from the nucleus. The other thing that we can model is a very simple geometry. We don't have to base our geometry on something as complex as this, although we could have.

So these are five microtubules sitting in a row. The blue objects at the beginning here are RNA granules that are waiting to pop onto the microtubule. They're blue when they're sitting in the cytosol; they turn red when they become bound to a microtubule. So let's start this movie. We have two target microtubules, or two sets of target microtubules with different on rates. The on rates over here are 0.1 per second; the on rate over here is 1 per second. So the question is, are these two different on rates sufficient to give you the kind of selectivity that we are seeing in the experiment? And it looks pretty good. It looks like most of the microtubules prefer to march down the target microtubule with the higher on rate. But if you actually count the granules, instead of being 10:1, according to this 10:1 ratio of the on rate, it's something like 3.5:1.

So why isn't it perfect? Well, one idea that we had was that you could change the on rate of the source microtubules so that when they reach the end over here, they can pop on and off a little bit more and sample the volume and spend a little more time making a decision. (They can't go backwards, because the motor doesn't go backwards.)

If you do that and count the number of granules that go down here versus here, it comes to 10:1. So the bottom line is that you can also create very simple models to test very simple ideas, in addition to the very complex models that I described before.

With that, I think I'll stop. Thank you.

This page last updated November 19, 2008