Abstract
We present a search for electroweak single top quark production using
2.7 fb⁻¹ of CDF II data collected between February 2002 and
April 2008 at the Tevatron in proton-antiproton collisions at a
center-of-mass energy of 1.96 TeV. The analysis employs a
matrix-element technique which calculates event probability densities
for signal and background hypotheses. We combine the probabilities to
form a discriminant variable which is evaluated for signal and background
Monte Carlo events. The resulting template distributions are fit to the
data using a binned likelihood approach. We search for a combined
single top s- and t-channel signal and measure a cross section of
$2.7^{+0.8}_{-0.7}$ pb, assuming a top quark mass of 175 GeV/c².
The probability that the observed excess originated
from a background fluctuation (p-value) is 9×10⁻⁶ (4.2σ) and the expected (median)
p-value in pseudo-experiments is 8×10⁻⁷ (4.8σ).
Event Selection
This analysis uses events from leptonic decay of the W boson.
We require a single, well-isolated, high-transverse-energy lepton,
large missing transverse energy (from the neutrino), and exactly two
or three high-transverse-energy jets. Of these jets, we require at least one
to be identified as originating from a b-quark by secondary
vertex tagging. The secondary vertex tag identifies tracks associated
with the jet originating from a vertex displaced from the primary
vertex. We further require the missing transverse energy and the jets
not to be collinear for low values of missing transverse energy. This
requirement removes a large fraction of the non-W background
while retaining most of the signal.
Our major backgrounds are: W + heavy-flavor jets (Wbb-bar,
Wcc-bar, and Wc+jet); mistags, which are W + light-quark/gluon
events that are mistakenly tagged as b-jets due to detector resolution effects;
non-W events, which are mostly multijet events in which a jet is mistakenly
identified as a lepton and jets are mismeasured, providing a false missing
transverse energy signature; and top pair production events in which one
lepton or two jets are lost due to detector acceptance.
Predicted event yield with 2.7 fb⁻¹
Process | Two-jet events | Three-jet events |
s-channel | 49.3 ± 7.0 | 16.3 ± 2.3 |
t-channel | 74.3 ± 10.9 | 22.3 ± 3.2 |
Single top | 123.6 ± 17.9 | 38.6 ± 5.5 |
W+bottom | 549.1 ± 165.5 | 169.8 ± 51.3 |
W+charm | 453.5 ± 139.9 | 126.7 ± 39.0 |
W+light | 410.7 ± 51.0 | 125.5 ± 15.8 |
tt-bar | 173.5 ± 24.8 | 410.5 ± 58.4 |
Diboson/Z+jets | 105.6 ± 12.1 | 39.0 ± 4.6 |
Non-W | 75.6 ± 30.2 | 27.4 ± 11.0 |
Total background | 1768.0 ± 311.9 | 898.9 ± 108.2 |
Total prediction | 1891.6 ± 312.4 | 937.5 ± 108.3 |
Observed | 1874 | 902 |
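As a quick cross-check of the predicted yields, the sketch below sums the individual processes and compares the sums with the quoted totals. The dictionary layout and function names are illustrative choices, not part of the analysis software. Only central values are summed, because the quoted total uncertainties include correlations that cannot be reproduced from the table alone.

```python
# Central values from the predicted-yield table: (two-jet, three-jet) events.
SIGNAL = {
    "s-channel": (49.3, 16.3),
    "t-channel": (74.3, 22.3),
}
BACKGROUND = {
    "W+bottom": (549.1, 169.8),
    "W+charm": (453.5, 126.7),
    "W+light": (410.7, 125.5),
    "tt-bar": (173.5, 410.5),
    "Diboson/Z+jets": (105.6, 39.0),
    "Non-W": (75.6, 27.4),
}
OBSERVED = (1874, 902)


def column_sums(yields):
    """Sum the two-jet and three-jet columns, rounded to the table precision."""
    two_jet = sum(values[0] for values in yields.values())
    three_jet = sum(values[1] for values in yields.values())
    return round(two_jet, 1), round(three_jet, 1)


signal = column_sums(SIGNAL)
background = column_sums(BACKGROUND)
prediction = (
    round(signal[0] + background[0], 1),
    round(signal[1] + background[1], 1),
)

if __name__ == "__main__":
    print("single top      :", signal)
    print("total background:", background)
    print("total prediction:", prediction)
    print("observed        :", OBSERVED)
```

The sums reproduce the Single top, Total background, and Total prediction rows of the table.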
Jet multiplicity distribution
for signal and background processes. We compare the predicted number of
events in each W+jet bin to the number of events observed in data.
Uncertainties on the data are statistical; the hatch marks represent systematic errors in the background estimate.
Analysis Method
This analysis is based on a Matrix-Element method in order to maximize
the use of information in the events [2,3]. We calculate event probability densities
under the signal and background hypotheses as follows. Given a set of
measured variables of each event (the 4-vectors of the lepton and the
two jets), we calculate the probability densities that these variables
could result from a given underlying interaction (signal and
background). The probability is constructed by integrating over the
parton-level differential cross-section, which includes the matrix
element for the process, the parton distribution functions, and the
detector resolutions. This analysis calculates probabilities for seven
different underlying processes: s-channel,
t-channel, Wbb-bar, tt-bar, Wcc-bar, Wc+jet
and Wgg.
Transfer functions are used to include detector effects. Lepton
quantities and jet angles are considered to be well measured.
However, jet energies are not, and their resolution is parameterized
from Monte Carlo simulation to create a jet resolution transfer
function. We integrate over the quark energies and over the
z-momentum of the neutrino to create a final probability density.
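The role of the transfer function in the integration can be illustrated with a one-dimensional toy, shown here only as a sketch. Everything in it is an assumption made for illustration: the Gaussian shape and 15% relative resolution of `jet_transfer`, the falling exponential `toy_parton_density` that stands in for the matrix element and parton distribution functions, and the integration range. The analysis itself uses Monte Carlo derived transfer functions and integrates over more variables.

```python
import math


def jet_transfer(e_jet, e_parton, rel_resolution=0.15):
    """Toy Gaussian transfer function W(E_jet | E_parton), normalized in E_jet."""
    sigma = rel_resolution * e_parton
    z = (e_jet - e_parton) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def toy_parton_density(e_parton, scale=40.0):
    """Hypothetical stand-in for the differential cross section (falling spectrum)."""
    return math.exp(-e_parton / scale) / scale


def event_probability(e_jet, e_min=5.0, e_max=400.0, steps=4000):
    """Integrate density * transfer over the unmeasured parton energy (trapezoid rule)."""
    h = (e_max - e_min) / steps
    total = 0.0
    for i in range(steps + 1):
        e_parton = e_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * toy_parton_density(e_parton) * jet_transfer(e_jet, e_parton)
    return total * h


if __name__ == "__main__":
    for e_jet in (30.0, 50.0, 100.0, 200.0):
        print(e_jet, event_probability(e_jet))
```

The measured jet energy enters only through the transfer function, so a hypothesis with a different underlying parton spectrum yields a different probability density for the same measured event.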
We use the probabilities to construct a discriminant variable for each
event. The two single-top channels are combined to form a single
signal probability. We also introduce extra non-kinematic information
by using the output (b) of a neural network b-tagger which assigns a
probability (0 < b < 1) for each b-tagged jet to originate from a
b quark. The event probability discriminant variable (EPD) is
then constructed as:
We evaluate the event probability discriminant in the W + 2 jets and W + 3 jets samples.
The corresponding templates (normalized to unit area) are shown below for W + 2 jets events (left)
and W + 3 jets events (right).
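The exact expression used in the analysis is not reproduced in this text. As a sketch of how such a discriminant can be built, the function below forms a ratio of the signal probability to the sum of all hypotheses. It assumes that hypotheses containing a true b jet are weighted by the tagger output b, and that hypotheses without one are weighted by (1 - b). The function name, argument layout, and this weighting are illustrative assumptions.

```python
def event_probability_discriminant(p_signal, p_b_backgrounds, p_non_b_backgrounds, b):
    """Ratio-type discriminant in [0, 1] (illustrative form, not the analysis formula).

    p_signal            -- combined s- plus t-channel probability density
    p_b_backgrounds     -- densities for hypotheses with a true b jet (e.g. Wbb-bar, tt-bar)
    p_non_b_backgrounds -- densities for hypotheses without one (e.g. Wcc-bar, Wc+jet, Wgg)
    b                   -- neural-network b-tagger output, between 0 and 1
    """
    if not 0.0 <= b <= 1.0:
        raise ValueError("b-tagger output must lie in [0, 1]")
    numerator = b * p_signal
    denominator = (
        numerator
        + b * sum(p_b_backgrounds)
        + (1.0 - b) * sum(p_non_b_backgrounds)
    )
    if denominator == 0.0:
        return 0.0  # no hypothesis describes the event
    return numerator / denominator


if __name__ == "__main__":
    # A signal-like event with a confidently tagged b jet scores close to 1.
    print(event_probability_discriminant(5.0, [0.5, 0.2], [0.3, 0.1, 0.1], b=0.95))
```

With this construction, signal-like events accumulate near 1 and background-like events near 0, which is the behaviour the templates are meant to show.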
To quantify the single top content in the data, we perform a binned
maximum likelihood fit to the data. We fit a linear combination of signal
and background shapes of the event probability discriminant to the data.
The background normalizations are Gaussian-constrained in the fit. The fit determines the most
probable value of the single-top cross section. All sources of systematic
uncertainty are included as nuisance parameters in the likelihood
function. Sources of systematic uncertainties can affect
the normalization and shape for a given process. Correlations between
both are taken into account through a common nuisance parameter (δi).
The likelihood function is reduced through a standard Bayesian marginalization technique.
Here βj is the template fit parameter for each process, indexed by j;
δi are the nuisance parameters for each systematic effect, indexed by i, with
(relative) normalization uncertainty εji and (relative) shape uncertainty κjik;
k indexes the bins of the event probability discriminant.
H(δi) denotes the Heaviside function, used to treat
asymmetric uncertainties properly.
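A simplified version of such a likelihood is sketched below under stated assumptions. It includes only rate (normalization) systematics and omits the shape terms. It also assumes unit-Gaussian constraints on the nuisance parameters and a sign convention in which `eps_down` holds the relative rate change for a negative shift. The function evaluates the negative log-likelihood at one point in parameter space and does not perform the marginalization.

```python
import math


def heaviside(x):
    """H(x) = 1 for x > 0, else 0."""
    return 1.0 if x > 0 else 0.0


def rate_factor(deltas, eps_up, eps_down):
    """Product over systematics i of 1 + |delta_i| * (eps+ H(delta_i) + eps- H(-delta_i))."""
    factor = 1.0
    for delta, up, down in zip(deltas, eps_up, eps_down):
        factor *= 1.0 + abs(delta) * (up * heaviside(delta) + down * heaviside(-delta))
    return factor


def negative_log_likelihood(data, templates, betas, beta_sigmas, deltas, eps_up, eps_down):
    """Binned Poisson likelihood with Gaussian constraints (rate systematics only).

    data        -- observed counts per discriminant bin
    templates   -- {process: predicted counts per bin at beta = 1}
    betas       -- {process: normalization parameter beta_j}
    beta_sigmas -- {process: width of the Gaussian constraint, or None if unconstrained}
    deltas      -- nuisance parameter values delta_i
    eps_up/down -- {process: relative rate shifts per systematic}
    """
    nll = 0.0
    for k, n_obs in enumerate(data):
        mu = sum(
            betas[p] * rate_factor(deltas, eps_up[p], eps_down[p]) * templates[p][k]
            for p in templates
        )
        # -log Poisson(n_obs | mu)
        nll += mu - n_obs * math.log(mu) + math.lgamma(n_obs + 1)
    for p in templates:
        sigma = beta_sigmas.get(p)
        if sigma is not None:
            nll += 0.5 * ((betas[p] - 1.0) / sigma) ** 2
    nll += 0.5 * sum(delta * delta for delta in deltas)
    return nll
```

Leaving the signal normalization unconstrained (`None` in `beta_sigmas`) lets the fit determine it freely, while the constrained backgrounds are pulled toward their predictions.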
The plot below shows the linearity scan for various assumed multiples (beta) of
the SM single top quark cross-sections. Each point corresponds to 1500 pseudo-experiments.
Validation of the Method
Several validation tests have been performed for this analysis.
We compare the distributions of many kinematic variables
predicted by Monte Carlo simulation for signal and
background to the data. In particular, we compare the distributions
of the input variables to
ensure the data match the Monte Carlo prediction. We evaluate the event probability
discriminant in the untagged W + 2 jets sample, a high-statistics control sample with
very little single-top content (<0.5%). We also evaluate the event
probability discriminant
in the tagged dilepton + 2 jets sample (using only the most energetic
lepton) and in the tagged lepton + 4 jets sample (using only the two most
energetic jets as input to the discriminant), which
should agree well with tt-bar Monte Carlo. In all control samples, the data agree well with the Monte Carlo prediction.
Evaluation of the event probability discriminant in the high-statistics
taggable but untagged W + 2 jets control sample.
The discriminant evaluated in the tagged lepton + 4 jets sample (2.7 fb⁻¹), a control sample which is mainly
composed of tt-bar events.
The input variables to the signal and background event probability calculations in the b-tagged W + 2 jet data sample.
Systematic Uncertainties
Each source of systematic uncertainty can possess a normalization uncertainty and a shape
uncertainty. The normalization uncertainty includes changes to the
event yield due to the systematic effect, and the shape uncertainty
includes changes to the template histograms. Both of these effects are
included in the likelihood function as shown above.
Listed below are systematic uncertainties estimated from various Monte
Carlo samples.
- The jet energy scale systematic is obtained by changing the jet
energy scale by 1 standard deviation (SD) and recalculating the event yield and
the discriminant template histograms. This affects both normalization and shape.
- We increase or decrease the amount of initial state radiation in
the Monte Carlo to assign a systematic from this effect.
- We increase or decrease the amount of final state radiation in
the Monte Carlo to assign a systematic from this effect.
- We vary the eigenvectors in the CTEQ parton distribution function
tables to determine the uncertainty from this effect. We also include
the effect of using different versions of CTEQ and of using MRST with
different values of ΛQCD.
- We include a systematic error to account for the modeling
of the single top sample (MadEvent).
- We include an uncertainty on event detection efficiency due to the
scale factors that we apply to our Monte Carlo samples (mainly
b-tagging and lepton ID scale factors).
- We include a 6% uncertainty on our measured luminosity.
- We include a systematic uncertainty which accounts for variations
of the neural network b tagger output.
- We use an alternative mistag model and take
the difference from the default model as a systematic uncertainty.
- We use an alternative model for our non-W
background. We also assign a systematic uncertainty to the flavor
composition of this background, which is needed as input to the
neural-net b tagger.
- We vary the factorization and renormalization scale (Q²) in the Monte Carlo samples that
have been created with the ALPGEN Monte Carlo program.
Systematic uncertainty | Rate | Shape |
Jet energy scale | 0...16% | X |
Initial state radiation | 0...11% | X |
Final state radiation | 0...15% | X |
Parton distribution functions | 2...3% | X |
Monte Carlo generator | 1...5% | |
Event detection efficiency | 0...9% | |
Luminosity | 6.0% | |
Neural-net b tagger | N/A | X |
Mistag model | N/A | X |
Non-W model | N/A | X |
Q² scale in Alpgen MC | N/A | X |
Monte Carlo mismodeling | N/A | X |
W+bottom normalization | 30% | |
W+charm normalization | 30% | |
Mistag normalization | 17...29% | |
tt-bar normalization | 23% | |
Systematic uncertainties. The numbers here are given for the combined
single-top channel. Jet energy scale and neural network b tagger
systematics are applied to all processes (not shown here).
Results
- Cross Section Measurement
The result of the binned maximum likelihood fit is shown below. All
sources of systematic uncertainties (normalization and shape) are
included in the result.
Results from the full dataset (all W + 2/3 jets candidate events):
$\sigma_{\text{single top}} = 2.7^{+0.8}_{-0.7}$ pb
![](epd.gif)
Event probability discriminant distribution for signal and background
processes. All templates are normalized to the prediction.
The inset shows the most sensitive bins of the analysis (EPD>0.7).
- Vtb Measurement:
We use the measured single top cross section to directly measure the CKM matrix element |Vtb|.
- Hypothesis Test:
We have calculated the signal significance of this result using a standard
likelihood ratio technique [4]. In this approach, pseudo-experiments are generated
from background-only events. The likelihood ratio is used as the test statistic.
We then calculate the p-value, which is the probability for the background-only
hypothesis (b) to fluctuate to a result at least as signal-like as the one observed in data.
We estimate the expected p-value by taking the median of the
test hypothesis (signal + background) distribution as the 'observed' value (dashed red line).
![](pvalue.png)
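The relation between a p-value and a significance in units of σ can be sketched with the Python standard library. The one-sided Gaussian tail convention used here is an assumption, as is the convention that larger values of the test statistic are more signal-like. Because the quoted p-values are rounded, the converted significances should only be expected to land close to the quoted ones.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, width 1


def p_value_to_sigma(p_value):
    """One-sided Gaussian significance corresponding to a p-value."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    return -_STANDARD_NORMAL.inv_cdf(p_value)


def sigma_to_p_value(n_sigma):
    """One-sided tail probability beyond n_sigma standard deviations."""
    return _STANDARD_NORMAL.cdf(-n_sigma)


def empirical_p_value(background_only_stats, observed_stat):
    """Fraction of background-only pseudo-experiments at least as signal-like as data."""
    n_extreme = sum(1 for q in background_only_stats if q >= observed_stat)
    return n_extreme / len(background_only_stats)


if __name__ == "__main__":
    for label, p in (("observed", 9e-6), ("expected", 8e-7)):
        print(f"{label}: p = {p:.0e} -> {p_value_to_sigma(p):.2f} sigma")
```

Resolving p-values of order 10⁻⁶ with `empirical_p_value` requires many millions of background-only pseudo-experiments, since only about one in a million of them is expected to exceed the observed test statistic.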
Single Top Signal Features
We enrich the sample with signal events by
making increasingly tight cuts on our event probability discriminant (EPD) and
look for characteristic changes in variables sensitive to single top production. Although
the uncertainties are large, there is good agreement between data
and the Monte Carlo simulation. For shape comparisons, the predicted Monte Carlo
shapes are normalized to the data.
Increasing cuts on the EPD for the product of the lepton charge and
the pseudo-rapidity of the untagged jet, a variable known to be
sensitive to the t-channel (left), and the invariant mass of the W
and the b-tagged jet, a quantity which is close to the top quark mass (right).
The top row includes data with discriminant scores EPD > 0.8
and the bottom row includes data with discriminant scores EPD > 0.95.
Conclusions
We have updated our search for single top using a Matrix-Element
based analysis and applied it to 2.7 fb⁻¹ of data
taken by the CDF experiment. We include rate and shape systematic
uncertainties in our analysis method. We measure a single top cross section of
$\sigma_{\text{single top}} = 2.7^{+0.8}_{-0.7}$ pb.
We use a likelihood ratio method
to calculate the signal significance. The observed p-value in 2.7 fb⁻¹
of CDF data is 9×10⁻⁶ (4.2σ). The expected (median) p-value
in pseudo-experiments is 8×10⁻⁷ (4.8σ). The cross section measurement is used
to directly determine the CKM matrix element Vtb, and we measure
|Vtb| = 0.97 ± 0.13 (experiment) ± 0.07 (theory).
The 95% confidence level lower limit
on |Vtb| is 0.71.
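The step from the cross section to |Vtb| can be illustrated with a back-of-the-envelope calculation, given here as a sketch rather than the procedure used in the analysis. It assumes that the single top cross section scales with |Vtb|², and it uses a Standard Model prediction of 2.86 pb for a top quark mass of 175 GeV/c². That value is not quoted in this text and is an assumption of the example. The uncertainty is propagated to first order only.

```python
import math

SIGMA_MEASURED = 2.7   # pb, measured cross section quoted in the text
SIGMA_ERR_UP = 0.8     # pb
SIGMA_ERR_DOWN = 0.7   # pb
SIGMA_SM = 2.86        # pb, ASSUMED prediction for |Vtb| = 1 (not from this text)


def vtb_from_cross_section(sigma_measured, sigma_sm):
    """|Vtb| assuming the cross section is proportional to |Vtb|^2."""
    return math.sqrt(sigma_measured / sigma_sm)


vtb = vtb_from_cross_section(SIGMA_MEASURED, SIGMA_SM)

# First-order propagation: d|Vtb| / |Vtb| = 0.5 * d(sigma) / sigma
vtb_err_up = 0.5 * vtb * SIGMA_ERR_UP / SIGMA_MEASURED
vtb_err_down = 0.5 * vtb * SIGMA_ERR_DOWN / SIGMA_MEASURED

if __name__ == "__main__":
    print(f"|Vtb| = {vtb:.2f} +{vtb_err_up:.2f} -{vtb_err_down:.2f} (experiment only)")
```

Under these assumptions the central value and the size of the experimental uncertainty come out close to the quoted result. The theory uncertainty and the 95% lower limit need the uncertainty on the prediction and the full treatment in the analysis, which this sketch does not attempt.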
References
- Z. Sullivan, Understanding single-top-quark production, Phys. Rev. D 70, 114012 (2004).
- B. Stelzer, Ph.D. thesis, University of Toronto, FERMILAB-THESIS-2005-79.
- DØ Collaboration, V.M. Abazov et al., Nature 429 (2004); DØ Collaboration, V.M. Abazov et al., Phys. Lett. B 617 (2005); P. Dong, Ph.D. thesis, University of California, Los Angeles (2008); F. Canelli, Ph.D. thesis, University of Rochester (2003).
- T. Junk, Nucl. Instrum. Meth. A 434, 435 (1999); L. Read, J. Phys. G 28, 2693 (2002).