Proc Natl Acad Sci U S A. 2006 November 21; 103(47): 17807–17812.
Published online 2006 November 10. doi: 10.1073/pnas.0608512103.
PMCID: PMC1693828
Developmental Biology
Transcription factor profiling in individual hematopoietic progenitors by digital RT-PCR
Luigi Warren,* David Bryder, Irving L. Weissman,§ and Stephen R. Quake**
*Division of Biology, California Institute of Technology, Pasadena, CA 91125;
Hematopoietic Stem Cell Laboratory, Lund University, SE-22184 Lund, Sweden;
Departments of Pathology and Developmental Biology, Stanford Institute of Stem Cell Biology and Regenerative Medicine, and
Department of Bioengineering, Stanford University, Stanford, CA 94305; and
Howard Hughes Medical Institute, Chevy Chase, MD 20815
§To whom correspondence may be addressed. E-mail: irv/at/stanford.edu
**To whom correspondence may be addressed at: James H. Clark Center, E-300, 318 Campus Drive, Stanford, CA 94305. E-mail: quake/at/stanford.edu
Contributed by Irving L. Weissman, September 26, 2006.
Author contributions: L.W., D.B., and S.R.Q. designed research; L.W. and D.B. performed research; L.W., D.B., S.R.Q., and I.L.W. analyzed data; and L.W., D.B., S.R.Q., and I.L.W. wrote the paper.
Received July 2, 2006.
Abstract
We report here a systematic, quantitative population analysis of transcription factor expression within developmental progenitors, made possible by a microfluidic chip-based “digital RT-PCR” assay that can count template molecules in cDNA samples prepared from single cells. In a survey encompassing five classes of early hematopoietic precursor, we found markedly heterogeneous expression of the transcription factor PU.1 in hematopoietic stem cells and divergent patterns of PU.1 expression within flk2− and flk2+ common myeloid progenitors. The survey also revealed significant differences in the level of the housekeeping transcript GAPDH across the surveyed populations, which demonstrates caveats of normalizing expression data to endogenous controls and underscores the need to put gene measurement on an absolute, copy-per-cell basis.
Keywords: gene profiling, hematopoiesis, microfluidics, PU.1, stem cells
 
Stem cells give rise to terminally differentiated cells of diverse types through a stepwise process involving the production of intermediates of progressively restricted lineage potential. This unfolding program is controlled by a transcriptional regulatory network: a chemical state machine with sequencing logic implemented by cross-regulating transcription factors, the states of the network realized in the abundance profile of these regulatory molecules. Transitions between preferred states are brought on by intrinsic metastability, stochastic fluctuation, and external signals (1–3). Understanding the behavior of these networks is the key to understanding development itself. A prerequisite is the ability to characterize network states quantitatively, but the sensitivity of current gene profiling methods is not fully adequate to this task. Here we report on a study of hematopoietic stem cells (HSCs) and other early blood progenitors using an assay that overcomes the sensitivity problem.

Conventional gene expression assays typically require thousands of cells' worth of RNA as analyte. Developmentally interesting cells, especially stem cells, are not always easily isolated in such quantities. More fundamentally, population-average expression data provide an incomplete picture, because functionally significant variations in regulatory-network state undoubtedly exist in cell types defined on the basis of a few phenotypic criteria. One consequence of this is that population-averaged experiments are subject to systematic errors in interpretation: although one can reliably infer qualitative trends from their results, it is difficult if not impossible to generate precise, quantitative results. Modern theories of systems biology are able to make quantitative predictions, and to test these theories quantitative data are required. Flow cytometry has transformed the study of cellular differentiation by revealing diversity in the patterns of surface protein expression within populations of superficially similar cells. Similarly, one would like to survey transcriptional network states within populations cross-sectionally, which is possible only by measuring gene expression in individual cells.

In principle, RT-PCR has the sensitivity required for single-cell gene-expression analysis. However, the quantitation of rare messages, such as those for transcriptional regulators, pushes the limits of the art. Published single-cell protocols tend to be elaborate in terms of assay validation and practice (4). To address this problem, we have developed a highly sensitive quantitative RT-PCR assay based on standard 5′-nuclease probe (TaqMan) chemistry and primer–probe design rules. The method uses a commercially available microfluidic chip to partition individual cDNA molecules into discrete reaction chambers before PCR amplification (Fig. 1). In effect, the chip performs a massively parallel limiting-dilution assay, a form of “digital PCR” (5). In conventional quantitative PCR, quantitation is based on the number of amplification cycles required for dye fluorescence to reach a given threshold. Slight variations in amplification efficiency between reactions are magnified because of the exponential character of PCR; for this reason, interassay comparisons are only valid if gene-of-interest measurements are normalized to measurements on endogenous controls or synthetic standards (6). In digital PCR, quantitation relies on binary, positive/negative calls for each subreaction within the partitioned analyte, affording an absolute readout of DNA copy number with single-molecule resolution. Applying the chip assay to cDNA generated from synthetic RNA standards, we have demonstrated that the sensitivity and linearity of quantitation is sufficient to address transcript measurements on single cells (see Materials and Methods).

Fig. 1.
The Digital Array chip. (a) A PCR end-point scan of a chip. In this false-color image, the FAM signal (GAPDH) is shown in green and the Cy5 signal (PU.1) is shown in red. The 12 samples analyzed here correspond to cDNA preparations derived from individual (more ...)

We applied the digital assay to a single-cell gene expression survey focused on the early steps of hematopoiesis. After staining with fluorescent antibodies, flow cytometry can be used to fractionate blood progenitors based on membrane-protein expression. The lineage potential of many different subsets has been investigated by using clonal assays, resulting in schema for the prospective isolation of progenitors based on surface antigen profiles. Immunophenotyped cells are readily sorted into individual tubes for single-cell analysis (7). In our experiments, cells were sorted directly into RT-PCR buffer; we subsequently added primers for the genes of interest, reverse-transcribed the RNA, and quantitated the cDNA in the digital PCR chip (Fig. 2). The study encompassed murine blood progenitors belonging to the following canonical populations: HSCs, common lymphoid progenitors (CLPs), common myeloid progenitors (CMPs), and megakaryocyte–erythroid progenitors (MEPs) (8–13). (Fig. 5, which is published as supporting information on the PNAS web site, positions these cell types within the classical model of the hematopoietic lineage tree.) Some recent work argues that the CMP subset is heterogeneous, functional diversity being correlated with differential expression of the cytokine receptor flk2 (14, 15). We therefore decided to look at flk2+ and flk2− CMP subsets, to see whether their gene expression profiles were different. Our survey includes data from 116 individual cells, about two dozen from each of the five cell types of interest (HSC, CLP, CMP/flk2+, CMP/flk2−, and MEP).

Fig. 2.
Experimental procedure used in the single-cell survey. (a) Cells are harvested from mouse bone marrow, then enriched for c-kit+ early progenitors by immunomagnetic separation. (b) Purified cells are stained with a panel of fluorescent antibodies to surface (more ...)

We measured the levels of two transcripts within every cell: a transcription factor, PU.1, and a housekeeping gene, GAPDH. PU.1 is known to be a major regulator of hematopoiesis. Its best understood role is the promotion of granulocyte–macrophage fate: expressed at high levels, PU.1 activates granulocyte–macrophage differentiation gene batteries, and PU.1 up-regulation seems to be instrumental in funneling CMPs toward the granulocyte–macrophage progenitor (GMP) lineage (16, 17). PU.1 is also thought to play other, context-dependent roles in blood differentiation at intermediate levels of expression (18, 19). GAPDH encodes a glycolytic enzyme, glyceraldehyde-3-phosphate dehydrogenase. This gene commonly serves as an endogenous control in quantitative RT-PCR assays. In this practice, the readout for every gene of interest is normalized to the GAPDH signal, on the idealized assumption that GAPDH expression is uniform across cell types. Our assay reports absolute transcript levels, in copies per cell, so we did not need a reference for PU.1 quantification. However, we were interested in finding out to what extent GAPDH expression is truly independent of cell type. In addition, we expected levels of PU.1 to be so low that Poisson noise might obscure any clues our analysis would give to the general character of expression distributions. GAPDH is a high-abundance transcript, so we anticipated that its expression would be more informative in this regard.

Results and Discussion

We carried out on-chip assays using RNA runoff template to measure the efficiency and reproducibility of the digital RT-PCR assay. The estimated RNA-to-cDNA conversion efficiency was 0.50 ± 0.10 for PU.1 (CV = 20%) and 0.29 ± 0.09 for GAPDH (CV = 29%). We found an interassay CV of ≈10% in similar trials using DNA standards, so some of the variability in the efficiency estimates came from the chip itself. Variation in the loaded sample volume probably accounts for most of the chip-related technical noise. The limiting factor in the precision of the digital RT-PCR method is likely to remain the technical variability of reverse transcription (20).

The results of the single-cell survey are summarized in Fig. 3. All cell types gave mean readouts for GAPDH and PU.1 substantially exceeding the background of false positive signals detected in No Template and No RT control panels (Table 2, which is published as supporting information on the PNAS web site).

Fig. 3.
Gene expression in cDNA copies per cell, by cell type. The histograms show the number of individual cells in each subset that expressed PU.1 and GAPDH within the indicated bin ranges. PU.1 expression is heterogeneous in the stem cells, up-regulated in (more ...)

PU.1 expression is highly elevated in the CMP/flk2+ subset, and strongly down-regulated in the MEPs (Table 3, which is published as supporting information on the PNAS web site). The other three subsets show intermediate levels of expression, but the CMP/flk2− resembles the MEP, with a less pronounced downshift. We used the Kolmogorov–Smirnov (K-S) test to measure the resemblance between the PU.1 data sets (Table 4, which is published as supporting information on the PNAS web site). The HSC distribution bears a strong resemblance to the distributions in the CLP and CMP/flk2− cell types. The CMP/flk2+ stands alone: the P value for similarity was <0.01 in every comparison involving this set.

A previous, nonquantitative single-cell study of blood progenitors found that HSCs display variegated expression of transcripts normally associated with downstream lineages, including PU.1 (21). This could represent nonproductive, “leaky” transcription, if the loci for downstream lineages are kept in a default, open chromatin state until fate commitment (22). Alternatively, “noisy” transcription might be a mechanism for symmetry breaking, priming daughter cells toward diverse fates when the stem cell starts to proliferate and differentiate (23). The distinction between leaky or noisy transcription and regulated transcription may be hard to draw; our data suggests that wide variations in message abundance between individual cells are the rule rather than the exception. However, the K-S comparison results and the relatively broad profile of the PU.1 distribution in the HSC subset argue that PU.1 expression is either loosely regulated or heterogeneously regulated within the stem cell compartment.

A keystone of the classical model of hematopoietic differentiation is the division of progenitors into two major populations downstream of the multipotent progenitor (MPP): the lymphoid-restricted CLP, which gives rise to pro-B and pro-T cells, and the myeloerythroid-restricted CMP, which gives rise to granulocyte–macrophage progenitors and MEPs (24). It has recently been claimed that the canonical CMP population is internally heterogeneous with respect to lineage potential (14). According to this research, the expression level of the cytokine receptor flk2 is a marker for functional divergence: the flk2+ CMP compartment is PU.1hi, has lost MEP potential, and retains lymphoid as well as myeloid potential; flk2− CMPs comprise mostly PU.1lo cells with predominantly MEP potential. Our measurements reveal a sharp divergence in PU.1 expression within the flk2− and flk2+ CMP subsets. The GAPDH results for these subsets, discussed below, add further evidence that they are nontrivially distinct, as the two-dimensional gene-expression plot in Fig. 4 makes clear. The similarity between PU.1 expression in the CMP/flk2− cells and the MEPs meshes with the observation that the bulk of the flk2− CMP compartment is already megakaryocyte-erythroid-lineage restricted.

Fig. 4.
Resolution of flk2− and flk2+ CMP populations based on gene expression. (a) The sort gates used to fractionate CMP cells into flk2+ and flk2− subsets (biexponential plot). These gates were applied after first selecting Lineage (more ...)

The expression of GAPDH was not constant across the five cell types examined, with the subset mean expression levels varying over a 2-fold range (Table 5, which is published as supporting information on the PNAS web site). K-S tests show that the differences are statistically significant (Table 6, which is published as supporting information on the PNAS web site). In 4 of 10 pairwise comparisons between subsets, the hypothesis that the data came from the same underlying distribution had a P value below 0.05. The CMP/flk2+ subset had the highest GAPDH expression, and it was also the best resolved from the other subsets by the K-S measure.

The variation in GAPDH level within these closely related subsets highlights a problem with the use of endogenous controls in RT-PCR quantitation. Normalization to the GAPDH signal reduces the apparent magnitude of PU.1 up-regulation in the CMP/flk2+ cell type, and equalizes the differences in mean expression between the HSC, CLP and CMP/flk2− types (Table 1). It is impossible to say, at this level of analysis, whether such equalization is justified. Although normalized measurements are not necessarily less informative than absolute measurements, no two housekeeping genes can reasonably be expected to show the same dependence on cell type. No consensus exists as to the best choice of endogenous control and, indeed, no one gene is likely to be a good reference for every application (25). Weighted normalization schemes based on multiple housekeeping genes have been proposed (26). Still, it can be argued that this only makes the problem of standardization worse. The uncertainty which the practice of normalization introduces into gene measurement comparisons can only be resolved by a move to absolute quantitation, either through the use of quantitated synthetic controls (e.g., purified PCR product or RNA runoff transcript), or by the adoption of techniques which yield absolute measurements directly, such as the one described here.
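
The effect of normalization described above can be made concrete with a small numerical sketch. The copy numbers below are hypothetical values chosen only for illustration; they are not measurements from this study, and the cell-type labels and function name are likewise invented.

```python
# Minimal sketch: how normalizing to a housekeeping gene can shrink an
# apparent fold change. All copy numbers are HYPOTHETICAL illustration values.

def fold_change(value_b, value_a):
    """Return the ratio value_b / value_a."""
    if value_a <= 0:
        raise ValueError("reference value must be positive")
    return value_b / value_a

# Hypothetical mean cDNA copies per cell in two cell types.
type_a = {"PU1": 10.0, "GAPDH": 100.0}
type_b = {"PU1": 40.0, "GAPDH": 200.0}

# Absolute comparison: copies per cell of the gene of interest.
absolute_fold = fold_change(type_b["PU1"], type_a["PU1"])

# Normalized comparison: gene of interest divided by the endogenous control.
normalized_fold = fold_change(
    type_b["PU1"] / type_b["GAPDH"],
    type_a["PU1"] / type_a["GAPDH"],
)

print(f"absolute fold change:   {absolute_fold:.2f}")
print(f"normalized fold change: {normalized_fold:.2f}")
```

In this invented case the endogenous control is itself twice as abundant in the second cell type, so the normalized comparison reports half the fold change seen on an absolute, copy-per-cell basis.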

Table 1.
Gene expression by subset

It has recently been reported that the abundance of gene transcripts is lognormally distributed at the single-cell level (27). We used several standard normality tests to ask whether the expression of GAPDH within each population was compatible with a normal or lognormal distribution (Table 7, which is published as supporting information on the PNAS web site). In all but the CMP/flk2+ population, the lognormal model was clearly preferred. Lognormal distributions can arise when normally distributed variations compound multiplicatively, as might occur during the sequential steps of biochemical synthesis (28). Intermittent, exponentially distributed bursts of biosynthesis can also give rise to similar, nonnormal, positively skewed distributions (29) which are also consistent with our observations. The geometric standard deviations for the GAPDH data sets are in the range of 1.8–3.1 (Table 5), which indicates that transcript levels can routinely fluctuate over a full one-log range. It might seem surprising that robust behavior can be achieved by a system in which signal levels vary so widely. It must be remembered, however, that a snapshot of mRNA transcript level is not necessarily a true measure of the abundance of the corresponding protein. Messenger transcripts generally turn over much faster than the proteins they encode, which implies that protein expression may be buffered against stochastic fluctuations at the mRNA level.
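
For readers unfamiliar with the geometric standard deviation, the following is a minimal sketch of how it can be computed from per-cell copy numbers. The function name is hypothetical, and the handling of zero-copy cells (dropped here, because the logarithm is undefined) is an assumption rather than the procedure used in the study.

```python
import math
import statistics

def geometric_stats(copies):
    """Return (geometric mean, geometric SD) of positive copy numbers.

    Assumption: cells with zero copies are dropped, since log(0) is undefined.
    """
    logs = [math.log(c) for c in copies if c > 0]
    if len(logs) < 2:
        raise ValueError("need at least two positive values")
    gmean = math.exp(statistics.mean(logs))
    # Sample standard deviation of the log-transformed values, back-transformed.
    gsd = math.exp(statistics.stdev(logs))
    return gmean, gsd

# Illustrative input only (not data from the survey).
gmean, gsd = geometric_stats([10, 100, 1000])
print(f"geometric mean = {gmean:.1f}, geometric SD = {gsd:.1f}")
```

Under a lognormal model, roughly 95% of cells fall between the geometric mean divided by the square of the geometric SD and the geometric mean multiplied by it, a span of the geometric SD raised to the fourth power. For a geometric SD of 1.8 that span is about 10-fold, which is one way to read the one-log range mentioned above.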

Redundancy and distributed control are additional strategies by which cells could make up for the inherent sloppiness of biochemical signaling (30). If so, efforts to “reverse-engineer” the transcriptional circuits controlling development must ultimately address the synthesis of quantitative observations on multiple transcription factors within single cells. The power of flow cytometric population analysis has increased as the technology for multiplexing has improved; we expect the same to be true of single-cell surveys conducted at the transcriptional network level. In hematopoiesis, cell fate decisions depend on the coordinate activity of multiple transcription factors (31). When the targets for quantification are present on the order of ten copies, a subdivision of the sample to permit independent single-plex assays introduces substantial measurement noise. For several reasons, digital PCR offers improved scope for high-order multiplexing relative to conventional quantitative PCR. In a standard multiplex PCR, mismatched transcript levels can lead to competitive inhibition of reactions involving less abundant targets, which is typically addressed by adjusting primer concentrations so that the amplification of abundant targets is primer-limited. Such fine-tuning may not be practical in single-cell surveys, if transcript levels vary widely on a cell-by-cell basis, and is obviated in the digital assay. A second benefit arises from the concentration of template molecules because of reaction partitioning, which ameliorates the impact of primer–dimer side reactions. On the readout side, the digital assay lends itself to bar-coding schemes, whereby distinct probe color combinations are assigned to each target (32).
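
The sampling noise introduced by subdividing a low-copy sample can be estimated with a simple binomial model. The sketch below assumes that each template molecule lands independently and with equal probability in any one of the aliquots; the function name is hypothetical.

```python
import math

def aliquot_cv(total_copies, n_aliquots):
    """Coefficient of variation of the copy count in one of n equal aliquots.

    Assumption: each molecule is assigned to an aliquot independently,
    so the count in one aliquot is Binomial(total_copies, 1 / n_aliquots).
    """
    if total_copies <= 0 or n_aliquots < 1:
        raise ValueError("need positive copies and at least one aliquot")
    p = 1.0 / n_aliquots
    mean = total_copies * p
    sd = math.sqrt(total_copies * p * (1.0 - p))
    return sd / mean

# Ten copies split into two or four single-plex reactions.
for k in (2, 4):
    print(f"{k} aliquots: CV = {aliquot_cv(10, k):.2f}")
```

Under this model the CV simplifies to the square root of (k − 1)/N for N copies split k ways, so ten copies divided between two reactions already carry a sampling CV of roughly 32% before any assay noise is added.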

Conclusion

If cross-sectional analysis of cell populations at the transcriptional network level were to become routine, the impact on developmental studies could be profound. In the near term, PCR-based methods cannot be expected to yield single-cell expression data with the speed and economy of flow cytometry. This must be set against the consideration that transcription factor studies provide data bearing directly on the internal decision-making machinery of the cell. In a small-scale survey we could easily resolve two subpopulations within a progenitor type, the CMP, which has until recently been considered homogeneous. Here we were able to focus the analysis based on recent findings from the hematopoiesis literature. In principle, however, the heterogeneity in PU.1 levels within the CMP compartment could have been detected in a “blind” single-cell survey. The scale of survey required to detect network state diversity will depend on several factors, including (i) the relative frequencies of divergent subsets, (ii) the magnitude and sharpness of the expression differences, and (iii) the extent to which such differences are correlated across transcripts and surface markers analyzed in the survey. If the case of PU.1 expression in the CMP is representative, indications of heterogeneity should emerge after looking at a few tens of cells, and surveys at the 100- to 1,000-cell level may offer significant insight into population substructure.

We have shown that it is possible to extend the sensitivity of quantitative RT-PCR to permit profiling of transcription factor expression within individual cells. This opens the door to sophisticated regulatory network analysis on even the rarest developmental progenitors. The dynamic range of the chip assay is suited to measuring the gamut of expression levels for regulatory genes, whether working from single-cell samples or from higher numbers of cells prepared at appropriately scaled concentration. By combining flow cytometry and digital RT-PCR, we can put gene expression measurements on an absolute, copy-number-per-cell basis. The attainment of this “gold standard” should facilitate the spread of public databases cataloguing cell-type-specific expression data. Our assay can also support the progressive refinement of the taxonomies underlying such resources through the single-cell survey approach, helping to uncover diversity at the level of the cell's most delicate apparatus, the transcriptional regulatory network.

Materials and Methods

Microfluidic Digital PCR Chip. The single-cell measurements reported here were made using the Digital Array chip (Fluidigm, South San Francisco, CA). This single-use device supports the simultaneous analysis of 12 samples. Within each sample panel, fluid is distributed into parallel, dead-end channels under pneumatic pressure. After the load step, a comb valve with teeth at right angles to these channels is actuated, deflecting an elastomeric membrane down to partition the panel into 1,200 isolated reaction chambers. The chip is then thermocycled, carrying out a total of 14,400 PCRs at once. A microarray scanner is used to image the chip at the end point of PCR. If a panel holds ≪1,200 copies of template at the start of the PCR, the copy number can be read out accurately just by counting positive reactions. At higher DNA titers, a significant fraction of the reaction chambers capture more than one copy of template, and there is no longer a simple correspondence between positive reactions and individual template molecules. However, unless a panel is at or near saturation, template abundance can still be calculated with acceptable precision (Supporting Text, which is published as supporting information on the PNAS web site). The quantitative dynamic range of the Digital Array is therefore about three logs: from a single copy to on the order of a thousand copies. This should be sufficient for single-cell quantification of all but the most abundant mRNA species (33).
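
The calculation used at higher titers is given in the Supporting Text and is not reproduced here. As a hedged illustration, the sketch below applies the standard Poisson correction for limiting-dilution assays, which assumes that template molecules are distributed independently and uniformly among the chambers; it may differ in detail from the authors' own procedure, and the function name is hypothetical.

```python
import math

def estimate_template_copies(positive, chambers=1200):
    """Estimate template copies in a panel from the count of positive chambers.

    Assumption: copies per chamber follow a Poisson distribution, so the
    fraction of negative chambers is exp(-copies / chambers).
    """
    if not 0 <= positive <= chambers:
        raise ValueError("positive count must lie between 0 and chambers")
    if positive == chambers:
        raise ValueError("panel is saturated; copy number cannot be estimated")
    if positive == 0:
        return 0.0
    return -chambers * math.log(1.0 - positive / chambers)

# At low occupancy the estimate is close to the raw count; at high
# occupancy the correction becomes substantial.
for k in (12, 600):
    print(f"{k} positive chambers -> {estimate_template_copies(k):.1f} copies")
```

The estimate diverges as the positive count approaches the number of chambers, which is consistent with the statement that precision is lost when a panel is at or near saturation.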

Synthetic Standards. PU.1 and GAPDH RNA runoff transcripts were made for use in evaluating the RT efficiency and PCR amplification efficiency of our assays. The transcripts were designed to flank the amplicon regions of the TaqMan assays by at least 100 bases on each side, so that the secondary structure context seen by the reverse transcriptase would be similar to that in lysate-based reactions. PCR products incorporating a T7 RNA polymerase promoter were used as template for the runoff reactions, which were done with the MEGAscript T7 kit (Ambion, Austin, TX). The concentration of the purified RNA was measured by UV absorbance spectroscopy, and the percentage of full-length template was estimated with a capillary electrophoresis system (Experion; Bio-Rad, Hercules, CA).

The PCRs used to make templates for runoff transcription were based on Mouse GAPDH DECAtemplate and Mouse Thymus PCR-Ready cDNA (Ambion). The PCR primers were as follows (T7 promoter tails underlined): for PU.1, 5′-TAATACGACTCACTATAGGGAGACTGACCCACGACCGTCCAGT-3′ (forward) and 5′-TTGTCCTTGTCCACCCACCA-3′ (reverse); for GAPDH, 5′-TAATACGACTCACTATAGGGAGCCCATCACCATCTTCC-3′ (forward) and 5′-CTGTAGCCGTATTCATTGTC-3′ (reverse).

TaqMan Assay Design. All RT-PCR data reported here were obtained by using duplex PU.1/GAPDH TaqMan assays. Primers and probes were designed with commercial software (Beacon Designer; Premier Biosoft, Palo Alto, CA), accepting the default Tm criteria for TaqMan assay design, which are based on a standard 60°C annealing/extension step during the PCR. Primers were chosen so that the amplicon range was free of predicted secondary structure, as this is thought to impede efficient reverse transcription. To minimize background signal from genomic template, the PU.1 assay was designed so that the forward primer straddled an exon splice site. Assays were validated empirically using conventional quantitative PCR standard curve analysis with runoff transcript as template. Primers were at 100 nM concentration and probes at 50 nM concentration in all of the reported experiments. Oligonucleotides were synthesized by Integrated DNA Technologies (Coralville, IA).

The oligonucleotides used in the TaqMan assays were as follows: for PU.1, 5′-CATAGCGATCACTACTGGGATTTC-3′ (forward primer), 5′-GGTTCTCAGGGAAGTTCTCAAA-3′ (reverse/RT primer), and 5′-CGCACACCATGTCCACAACAACGA-3′ (Cy5-labeled probe); for GAPDH, 5′-CCAATGTGTCCGTCGTGGATC-3′ (forward primer), 5′-GCTTCACCACCTTCTTGATGTC-3′ (reverse/RT primer), and 5′-CGTGCCGCCTGGAGAAACCTGCC-3′ (FAM-labeled probe).

RT Efficiency Measurements. For on-chip standard curve assays, equimolar mixtures of PU.1 and GAPDH runoff transcript were added to RT-PCR buffer and the same digital RT-PCR protocol used on the cell lysates (described below) was executed on the samples. Three identical on-chip standard curve experiments were run, with each chip bearing four sets of three samples, at nominal template concentrations of 12.5, 25, 50, and 100 copies per microliter. All samples were derived from the same master mix; each individual sample was reverse-transcribed in a separate tube. Data were recovered from 35 of the 36 panels in these chips.

Cell Isolation and Staining. A bone marrow cell suspension was prepared from five 8- to 12-week-old C57BL/6 mice. The suspension was filtered through a nylon membrane and contaminating red cells were lysed with ACK. The isolate was enriched for c-kit-positive early progenitor cells by immunomagnetic separation with anti-c-kit MACS beads (Miltenyi Biotec, Auburn, CA). The cells were next stained for other surface markers by using the following fluorescent antibodies: CD34 FITC, flk2 phycoerythrin, Lineage phycoerythrin-Cy5 (a mixture of antibodies including CD3, CD4, CD5, CD8, B220, Mac1, Gr1, and Ter119), Sca-1 Cy5.5-phycoerythrin, FcGr allophycocyanin, c-kit allophycocyanin-Cy7, and IL7Ra biotin. All antibodies were from Ebiosciences (San Diego, CA), except CD34 FITC (Pharmingen, San Diego, CA) and IL7Ra biotin (I.L.W.'s laboratory). The cells were then stained with streptavidin QD605 (Quantum Dot, Hayward, CA) to tag the IL7Ra antibodies and resuspended in PBS plus 2% FCS, with propidium iodide added to mark apoptotic cells.

Cell Sorting and Lysis. Cells were sorted to 0.2-ml sample tubes in 12-tube strips by using the FACSAria cell sorter (BD Biosciences, San Jose, CA). Doublets and dead or apoptotic cells were excluded based on forward scatter/side scatter and propidium iodide staining. All cells were sorted using Lineage− and c-kit+ gates; additional sort criteria used to fractionate the cells into specific progenitor subsets were as indicated in Fig. 5. Individual cells were dispensed into 10-μl aliquots of RT-PCR buffer. The buffer components included a commercial RT-PCR mix (Platinum One-Step Reaction Buffer; Invitrogen, Carlsbad, CA), an RNase inhibitor (Ambion SUPERase-In), and 0.15% Tween 20 detergent. The latter was included as a surfactant to prevent nucleic acids binding the PDMS walls of the Digital Array during the PCR assay. No special cell lysis reagents were added. In tests, the efficiency and reproducibility of cDNA recovery was at least as good using direct hypotonic/detergent lysis in the RT-PCR buffer as obtained using chaotropic lysis with subsequent RNA purification.

Reverse Transcription. Reverse transcription reactions were done at 55°C for 15 min, and followed by a 5-min, 70°C step to heat-denature the reverse transcriptase. Completed reactions were stored at −20°C for later PCR analysis. The RT step was carried out in a 96-well block thermocycler. Three microliters of 5× primer–probe mix was added to each frozen lysate, after which the samples were spun down and transferred to the preheated thermocycler block. As a precaution to minimize primer–dimer extension by the reverse transcriptase, the samples were warmed to 55°C before adding 2-μl aliquots of enzyme to the reactions. MMLV RT/Taq polymerase enzyme blend (CellsDirect SuperScript III/Platinum Taq; Invitrogen) was diluted in RT-PCR buffer to stabilize the enzyme; the final reaction concentration was as directed by the manufacturer.

cDNA Quantitation. Digital Array chips mounted on 75 × 50-mm glass slides were primed for use by filling the control layer with osmolyte (35% PEG 3,350). For each assay, 12 stored reverse transcription reactions were thawed, drawn into gel tips, and loaded into a chip under 15-psi pneumatic pressure. The load was performed in a cold room at 4°C and took ≈30 min. The sample load step was controlled manually with a 12-port manifold fed from a house air supply and connected to the gel tips by Tygon tubing with custom-made hose adaptors. After the load, the chip was transferred to a flat-block thermocycler; paraffin oil was used to improve thermal contact between the block and the glass-slide base. Samples were partitioned by applying 27.5-psi pneumatic pressure to the control layer comb valve via a gel tip filled with osmolyte solution. The PCR profile included a 3-min, 95°C hot-start to activate the Taq, followed by 40 cycles of a two-step program: 15 seconds at 95°C (denaturation) and 60 seconds at 60°C (annealing and extension). After PCR, chips were imaged with an ArrayWoRx microarray scanner (Applied Precision, Issaquah, WA), adapted to accept a chip carrier. Scans were done at 13-μm resolution by using FAM and Cy5 filter sets. We wrote our own software to process the PCR end-point images and compute the template concentrations in each panel. The loaded sample volume in the chip, 7.5 μl (manufacturer's data), represents half the volume of the reverse-transcribed lysate. All of the single-cell cDNA copy number data reported here was derived by multiplying the calculated template concentration in copies per microliter by the total lysate reaction volume, 15 μl.
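
The volume scaling in the last two sentences can be written out as a short sketch. The two volumes are taken from the text above; the function and constant names are hypothetical, and no correction for reverse transcription efficiency is applied, since the reported values are cDNA copies.

```python
# Volumes stated in the text (microliters).
PANEL_LOAD_VOLUME_UL = 7.5   # sample volume loaded into one chip panel
LYSATE_VOLUME_UL = 15.0      # total reverse-transcribed lysate per cell

def cdna_copies_per_cell(panel_copies):
    """Scale the template copies measured in one panel to the whole lysate."""
    if panel_copies < 0:
        raise ValueError("copy number cannot be negative")
    concentration = panel_copies / PANEL_LOAD_VOLUME_UL  # copies per microliter
    return concentration * LYSATE_VOLUME_UL

# Example: 50 template copies estimated in a panel.
print(cdna_copies_per_cell(50))
```

Because only half of each lysate is loaded, every reported single-cell value is twice the panel estimate, and the sampling of that half contributes to the measurement noise for low-abundance transcripts.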

Data Analysis. Normality tests were done with XLSTAT (Addinsoft, New York, NY). The test input consisted of the reported cDNA copy number for each cell in the set (normal test), or the log of the copy number (lognormal test). We wrote software to perform expression data set comparisons based on the K-S routines given in ref. 34. When used to compare two data sets, the algorithm calculates the maximum absolute distance between their respective cumulative distributions, and computes from this measure the significance of the null hypothesis that both data sets came from the same underlying distribution. The analysis does not involve any assumptions about the character of the underlying distributions involved.
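
The two-sample comparison described above can be sketched as follows. This is an independent illustration rather than the authors' software: the function names are hypothetical, and the significance is computed from the asymptotic series and effective-sample-size approximation commonly given for the K-S test in numerical-methods texts.

```python
import math

def ks_statistic(a, b):
    """Maximum absolute distance between the two empirical cumulative distributions."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both data sets must be non-empty")
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        # Step both distributions past every value equal to x.
        while i < n1 and a[i] <= x:
            i += 1
        while j < n2 and b[j] <= x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    return d

def q_ks(lam):
    """Asymptotic K-S significance: 2 * sum_{j>=1} (-1)^(j-1) exp(-2 j^2 lam^2)."""
    sign, total, prev = 2.0, 0.0, 0.0
    for j in range(1, 101):
        term = sign * math.exp(-2.0 * lam * lam * j * j)
        total += term
        if abs(term) <= 1e-3 * prev or abs(term) <= 1e-8 * total:
            return min(max(total, 0.0), 1.0)
        sign = -sign
        prev = abs(term)
    return 1.0  # series did not converge: the distributions are indistinguishable

def ks_two_sample(a, b):
    """Return (D, P) for the null hypothesis that a and b share one distribution."""
    d = ks_statistic(a, b)
    n_eff = math.sqrt(len(a) * len(b) / (len(a) + len(b)))
    return d, q_ks((n_eff + 0.12 + 0.11 / n_eff) * d)

# Illustrative input only: two clearly separated sets of copy numbers.
d, p = ks_two_sample(list(range(10)), list(range(100, 110)))
print(f"D = {d:.2f}, P = {p:.2g}")
```

The approximation is least reliable for very small data sets, so P values for subsets of about two dozen cells are best treated as approximate.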

Supplementary Material
Supporting Information
Acknowledgments

This work was supported by U.S. Public Health Service Grant 5 P01 DK 053074 (from the National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health) (I.L.W. and D.B.), National Science Foundation Grant 0535870 (S.R.Q. and L.W.), and the National Institutes of Health Director's Pioneer Award (S.R.Q. and L.W.).

Abbreviations

K-S, Kolmogorov–Smirnov
CLP, common lymphoid progenitor
HSC, hematopoietic stem cell
CMP, common myeloid progenitor
MEP, megakaryocyte–erythroid progenitor

Footnotes
Conflict of interest statement: S.Q. is a founder, shareholder, and consultant for Fluidigm Corporation.
References
1. Sprinzak, D; Elowitz, MB. Nature. 2005;438:443–448.
2. Revilla-i-Domingo, R; Davidson, EH. Int J Dev Biol. 2003;47:695–703.
3. Gilman, A; Arkin, AP. Annu Rev Genomics Hum Genet. 2002;3:341–369.
4. Dixon, AK; Richardson, PJ; Pinnock, RD; Lee, K. Trends Pharmacol Sci. 2000;21:65–70.
5. Vogelstein, B; Kinzler, KW. Proc Natl Acad Sci USA. 1999;96:9236–9241.
6. Bustin, SA. J Mol Endocrinol. 2000;25:169–193.
7. Gaynor, EM; Mirsky, ML; Lewin, HA. Biotechniques. 1996;21:286–291.
8. Adolfsson, J; Borge, OJ; Bryder, D; Theilgaard-Monch, K; Astrand-Grundstrom, I; Sitnicka, E; Sasaki, Y; Jacobsen, SE. Immunity. 2001;15:659–669.
9. Akashi, K; Traver, D; Miyamoto, T; Weissman, IL. Nature. 2000;404:193–197.
10. Kondo, M; Weissman, IL; Akashi, K. Cell. 1997;91:661–672.
11. Nakauchi, H. Rinsho Ketsueki. 1995;36:400–405.
12. Suda, J; Suda, T; Ogawa, M. Blood. 1984;64:393–399.
13. Christensen, JL; Weissman, IL. Proc Natl Acad Sci USA. 2001;98:14541–14546.
14. Nutt, SL; Metcalf, D; D'Amico, A; Polli, M; Wu, L. J Exp Med. 2005;201:221–231.
15. Karsunky, H; Merad, M; Cozzio, A; Weissman, IL; Manz, MG. J Exp Med. 2003;198:305–313.
16. Nerlov, C; Graf, T. Genes Dev. 1998;12:2403–2412.
17. Laiosa, CV; Stadtfeld, M; Graf, T. Annu Rev Immunol. 2006;24:705–738.
18. Dahl, R; Simon, MC. Blood Cells Mol Dis. 2003;31:229–233.
19. Rothenberg, EV; Anderson, MK. Dev Biol. 2002;246:29–44.
20. Stahlberg, A; Hakansson, J; Xian, X; Semb, H; Kubista, M. Clin Chem. 2004;50:509–515.
21. Miyamoto, T; Iwasaki, H; Reizis, B; Ye, M; Graf, T; Weissman, IL; Akashi, K. Dev Cell. 2002;3:137–147.
22. Akashi, K; He, X; Chen, J; Iwasaki, H; Niu, C; Steenhard, B; Zhang, J; Haug, J; Li, L. Blood. 2003;101:383–389.
23. Enver, T; Heyworth, CM; Dexter, TM. Blood. 1998;92:348–351.
24. Kondo, M; Wagers, AJ; Manz, MG; Prohaska, SS; Scherer, DC; Beilhack, GF; Shizuru, JA; Weissman, IL. Annu Rev Immunol. 2003;21:759–806.
25. Bustin, SA; Benes, V; Nolan, T; Pfaffl, MW. J Mol Endocrinol. 2005;34:597–601.
26. Vandesompele, J; De Preter, K; Pattyn, F; Poppe, B; Van Roy, N; De Paepe, A; Speleman, F. Genome Biol. 2002;3:RESEARCH0034.
27. Bengtsson, M; Stahlberg, A; Rorsman, P; Kubista, M. Genome Res. 2005;15:1388–1392.
28. Koch, AL. J Theor Biol. 1966;12:276–290.
29. Cai, L; Friedman, N; Xie, XS. Nature. 2006;440:358–362.
30. Levsky, JM; Singer, RH. Trends Cell Biol. 2003;13:4–6.
31. Cantor, AB; Orkin, SH. Curr Opin Genet Dev. 2001;11:513–519.
32. Levsky, JM; Shenoy, SM; Pezo, RC; Singer, RH. Science. 2002;297:836–840.
33. Carter, MG; Sharov, AA; VanBuren, V; Dudekula, DB; Carmack, CE; Nelson, C; Ko, MS. Genome Biol. 2005;6:R61.
34. Press, WH; Flannery, BP; Teukolsky, SA; Vetterling, WT. Numerical Recipes in C. 2nd Ed. Cambridge, UK: Cambridge Univ Press; 1992.