Combination of CDF Single Top Searches with 3.2 fb⁻¹ of Data

Analysis contacts: Catalin Ciobanu, Kevin Lannon

Authors:


[PDF][EPS][PNG]

Summary


[EPS][GIF]
Expected significance estimated to be > 5.9 σ
Observed p-value = 3.1 × 10⁻⁷ (5.0 σ)
σSingle Top = 2.3 +0.6/−0.5 pb
|Vtb| = 0.91 ± 0.11 (exp.) ± 0.07 (theory)
|Vtb| > 0.71 at 95% confidence level

Abstract

There are currently six separate CDF searches for single top production. Five of the analyses use a data set selected for having a reconstructed electron or muon, missing transverse energy, and jets (at least one of which is consistent with a b-jet). Four of these lepton + jets analyses are an analysis using boosted decision trees (BDT), an analysis using neural networks (NN), an analysis using a multivariate likelihood function technique (LF), and an analysis using matrix element discriminants (ME). The fifth lepton + jets analysis, the s-channel likelihood function analysis (SLF), considers only events consistent with having exactly two b-jets and uses an s-channel optimized multivariate likelihood function. In addition, one analysis uses a data sample selected for having missing transverse energy and jets (at least one consistent with a b-jet), in order to pick up signal events that do not contain a reconstructed electron or muon, including events where a hadronic tau is present. This MET + jets analysis uses a neural network to distinguish between the single top signal and the background.

The combination of these analyses proceeds in two steps. First, the discriminant outputs from the five lepton + jets analyses described above are combined into a single, more powerful super discriminant (SD) using neural networks. The weights for the combination neural networks are chosen using genetic algorithms to provide optimal sensitivity. Then, a simultaneous fit is performed to the super discriminant output in the lepton + jets channel and the neural network output in the MET + jets channel. More details on this combination technique are given below.

This analysis uses 3.2 fb⁻¹ of CDF Run II data collected between February 2002 and August 2008 at the Tevatron in proton-antiproton collisions at a center-of-mass energy of 1.96 TeV. We measure a combined single top s- and t-channel cross section of 2.3 +0.6/−0.5 pb. The observed signal has a significance of 5.0σ, which is sufficient for observation. The median significance in pseudo-experiments is estimated to be greater than 5.9σ. These sensitivities represent an improvement of approximately 16% for the observed significance and 13% for the expected significance over the best single analysis. From the cross section measurement we extract a value for |Vtb| of 0.91 ± 0.11 (exp.) ± 0.07 (theory).

Technique

We use the same data sample, event selection, background estimate and systematic uncertainties used by each of the individual CDF single top analyses (see the boosted decision tree, neural network, multivariate likelihood function, or matrix element analysis pages as well as the MET + jets analysis page for more details). The outputs of the individual analyses are combined into a single super discriminant using a neural network. The neural network weights and topology are optimized using a technique known as neuro-evolution of augmenting topologies (NEAT). In addition, through a careful choice of the initial neural network configuration, we enable NEAT to optimize the binning as well.
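To illustrate the idea of evolving a combination network, the sketch below combines five toy discriminant outputs with a single sigmoid node and tunes its weights by random mutation and selection. It is not NEAT: the topology is fixed, there is no crossover or speciation, the binning is not optimized, and the fitness is a simple stand-in for the sensitivity-based figure of merit. All function names, the toy data and the mutation settings are assumptions made for illustration only.

```python
import math
import random


def super_discriminant(weights, bias, inputs):
    """Single sigmoid node combining the individual discriminant outputs."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    # tanh form of the sigmoid; avoids overflow for large |z|
    return 0.5 * (1.0 + math.tanh(0.5 * z))


def fitness(genome, signal, background):
    """Toy figure of merit: mean output on signal minus mean on background."""
    weights, bias = genome[:-1], genome[-1]
    mean_s = sum(super_discriminant(weights, bias, e) for e in signal) / len(signal)
    mean_b = sum(super_discriminant(weights, bias, e) for e in background) / len(background)
    return mean_s - mean_b


def evolve(signal, background, n_inputs, generations, offspring, rng):
    """Mutate the best genome; keep a child only if it improves the fitness."""
    best = [0.0] * (n_inputs + 1)  # weights followed by the bias
    best_fit = fitness(best, signal, background)
    for _ in range(generations):
        for _ in range(offspring):
            child = [g + rng.gauss(0.0, 0.5) for g in best]
            child_fit = fitness(child, signal, background)
            if child_fit > best_fit:
                best, best_fit = child, child_fit
    return best, best_fit


def toy_events(rng, n, shift):
    """Invented toy data: five independent discriminant outputs in [0, 1]."""
    return [
        [min(1.0, max(0.0, rng.gauss(0.5 + shift, 0.25))) for _ in range(5)]
        for _ in range(n)
    ]


rng = random.Random(0)
signal = toy_events(rng, 200, +0.15)      # signal-like events score higher
background = toy_events(rng, 200, -0.15)  # background-like events score lower
genome, score = evolve(signal, background, n_inputs=5,
                       generations=30, offspring=10, rng=rng)
print(f"toy fitness after evolution: {score:.3f}")
```

The starting genome is all zeros, which gives every event an output of 0.5 and a fitness of zero, so any positive final score reflects separation gained through the mutations.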

In order to make optimal use of the collected data, both the lepton + jets and the MET + jets analyses subdivide their data sets into sub-samples of differing purity. For the lepton + jets analysis, selected events are first categorized by whether they were collected on one of the dedicated electron or muon triggers (the trigger lepton coverage or TLC dataset) or were collected on the MET + jets trigger. In the latter case, events are considered part of the lepton + jets data set if they contain a reconstructed muon that failed to fire the dedicated muon triggers. This subsample is referred to as the extended muon coverage (EMC) sample. Each of the above two subsamples is further subdivided based on the number of jets and b-tagged jets in the event. The full list of lepton + jets channels is given below. A separate super discriminant is optimized for each of these channels, using the listed discriminants from the individual analyses as inputs.
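The categorization just described can be sketched as a small lookup. The channel labels follow the plot titles used on this page; the function name, its boolean trigger argument and the exact-match treatment of jet and tag counts are simplifying assumptions, not the analysis code.

```python
from itertools import product

TRIGGER_SAMPLES = ("TLC", "EMC")  # trigger lepton coverage, extended muon coverage
JET_COUNTS = (2, 3)
TAG_COUNTS = (1, 2)

# The eight lepton + jets channels, labelled as in the plots on this page.
LEPTON_JETS_CHANNELS = [
    f"{n_jets}-Jet, {n_tags}-Tag {sample}"
    for sample, n_jets, n_tags in product(TRIGGER_SAMPLES, JET_COUNTS, TAG_COUNTS)
]


def assign_channel(fired_lepton_trigger, n_jets, n_btags):
    """Return the channel label for an event, or None if it fits no channel.

    Assumption: events that did not fire a dedicated lepton trigger are
    taken to be EMC events (reconstructed muon, MET + jets trigger).
    """
    if n_jets not in JET_COUNTS or n_btags not in TAG_COUNTS:
        return None
    sample = "TLC" if fired_lepton_trigger else "EMC"
    return f"{n_jets}-Jet, {n_btags}-Tag {sample}"


print(assign_channel(True, 2, 1))
```

Together with the three MET + jets subsamples, these eight channels make up the eleven channels used in the simultaneous fit.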

In addition, the MET + Jets data (without a reconstructed electron or muon) is divided into three subsamples, depending on the number of b-tagged jets each event has:

We fit the discriminant outputs from the above eleven channels simultaneously to signal and background templates from Monte Carlo using a binned likelihood technique to extract a cross section and a limit on |Vtb|. In the TLC and EMC channels, we fit the super discriminant output, while in the MET + jets channels we use the MET + jets neural network output distributions. In addition, we compare the data to two hypotheses: H0 assumes that there is no single top production, while H1 assumes the Standard Model rate of single top. The likelihood ratio Q = −2 ln(p(H1)/p(H0)), where p(H) is the probability of the data under hypothesis H, is used to perform this comparison. We calculate a p-value for the observed data assuming the null hypothesis (H0) and compare it to our expected p-value, evaluated from ensembles of pseudo-experiments constructed assuming H1 (SM amount of single top).
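A minimal sketch of the hypothesis test described above, assuming independent Poisson bins and no systematic uncertainties (the real analysis includes them). The templates, the "observed" counts and all function names are invented for illustration. With the sign convention of Q given above, more signal-like data give a more negative Q, so the p-value is the fraction of H0 pseudo-experiments with Q at or below the value being tested.

```python
import math
import random


def q_statistic(counts, signal, background):
    """Q = -2 ln(L(H1)/L(H0)) for independent Poisson bins, no systematics.

    H0 predicts `background` per bin; H1 predicts `signal + background`.
    The n! terms cancel in the likelihood ratio.
    """
    return 2.0 * sum(
        s - n * math.log(1.0 + s / b)
        for n, s, b in zip(counts, signal, background)
    )


def poisson(rng, mean):
    """Draw one Poisson count (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def pseudo_q(rng, expected, signal, background):
    """Q for one pseudo-experiment thrown from the `expected` bin contents."""
    toy = [poisson(rng, mu) for mu in expected]
    return q_statistic(toy, signal, background)


def p_value(q_value, signal, background, n_pseudo, rng):
    """Fraction of H0 pseudo-experiments at least as signal-like as q_value."""
    hits = sum(
        pseudo_q(rng, background, signal, background) <= q_value
        for _ in range(n_pseudo)
    )
    return hits / n_pseudo


# Toy templates and data: invented numbers, not taken from the analysis.
signal = [0.5, 1.0, 2.0, 4.0]
background = [40.0, 20.0, 8.0, 3.0]
observed = [39, 22, 11, 9]
rng = random.Random(1)

q_obs = q_statistic(observed, signal, background)
p_obs = p_value(q_obs, signal, background, 20000, rng)

# Expected p-value: evaluate at the median Q of pseudo-experiments thrown under H1.
h1_expected = [s + b for s, b in zip(signal, background)]
q_h1 = sorted(pseudo_q(rng, h1_expected, signal, background) for _ in range(2001))
p_exp = p_value(q_h1[1000], signal, background, 20000, rng)
print(f"Q_obs = {q_obs:.2f}, observed p = {p_obs:.4f}, expected p = {p_exp:.4f}")
```

Counting pseudo-experiments in this way cannot resolve p-values much smaller than one over the number of pseudo-experiments, so a p-value of order 10⁻⁷ requires a very large ensemble or an extrapolation.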

Results

  1. Neural Network Output
  2. Cross Section and |Vtb|
  3. P-Value
  4. Cross Sections by Channel
  5. Interesting Events

Neural Network Output

The plot below collects the neural network output for events from all channels (2 and 3 jets, single and double tag, trigger lepton and extended muons) into a single plot. For this plot, we do not use the optimized binning chosen during neuro-evolution, but simply choose a convenient binning for display purposes.
All Channels

Shown on a linear scale
[EPS][GIF]
All Channels

Shown on a log scale
[EPS][GIF]

Below, we show the output of the superdiscriminant applied to data, compared to the prediction for the SM amount of single top plus backgrounds, separately for each channel. Recall that the neuro-evolution technique employed here optimizes not only the shape, but also the binning. The plots shown below use the optimized binning chosen during the evolution. Note: The MET + Jets channels are not shown here because the discriminant used in those channels is simply the MET + jets NN.
2-Jet, 1-Tag TLC Channel

Shown on a linear scale
[EPS][GIF]
2-Jet, 1-Tag TLC Channel

Shown on a log scale
[EPS][GIF]
2-Jet, 2-Tag TLC Channel

Shown on a linear scale
[EPS][GIF]
2-Jet, 2-Tag TLC Channel

Shown on a log scale
[EPS][GIF]
3-Jet, 1-Tag TLC Channel

Shown on a linear scale
[EPS][GIF]
3-Jet, 1-Tag TLC Channel

Shown on a log scale
[EPS][GIF]
3-Jet, 2-Tag TLC Channel

Shown on a linear scale
[EPS][GIF]
3-Jet, 2-Tag TLC Channel

Shown on a log scale
[EPS][GIF]
2-Jet, 1-Tag EMC Channel

Shown on a linear scale
[EPS][GIF]
2-Jet, 1-Tag EMC Channel

Shown on a log scale
[EPS][GIF]
2-Jet, 2-Tag EMC Channel

Shown on a linear scale
[EPS][GIF]
2-Jet, 2-Tag EMC Channel

Shown on a log scale
[EPS][GIF]
3-Jet, 1-Tag EMC Channel

Shown on a linear scale
[EPS][GIF]
3-Jet, 1-Tag EMC Channel

Shown on a log scale
[EPS][GIF]
3-Jet, 2-Tag EMC Channel

Shown on a linear scale
[EPS][GIF]
3-Jet, 2-Tag EMC Channel

Shown on a log scale
[EPS][GIF]

Cross Section and |Vtb|

Cross Section
Limit on |Vtb|

[EPS][GIF]

[EPS][GIF]
σSingle Top = 2.3 +0.6/−0.5 pb
|Vtb| = 0.91 ± 0.11 (exp.) ± 0.07 (theory)
Limit on |Vtb| assuming a flat prior on |Vtb|².
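As a rough cross-check of the experimental uncertainty on |Vtb|, one can assume that the single top cross section scales as |Vtb|², so that to first order the relative uncertainty on |Vtb| is half the relative uncertainty on the cross section. The sketch below applies this linear propagation to the numbers quoted on this page; the function name is hypothetical, and the approximation is not the procedure used in the analysis, which extracts |Vtb| from the fit.

```python
def vtb_experimental_uncertainty(vtb, sigma, err_up, err_down):
    """First-order propagation assuming sigma is proportional to |Vtb|^2.

    d|Vtb| / |Vtb| = 0.5 * d(sigma) / sigma
    """
    up = 0.5 * vtb * err_up / sigma
    down = 0.5 * vtb * err_down / sigma
    return up, down


# Values quoted on this page: |Vtb| = 0.91, sigma = 2.3 +0.6/-0.5 pb
up, down = vtb_experimental_uncertainty(0.91, 2.3, 0.6, 0.5)
print(f"+{up:.2f} / -{down:.2f}")
```

The result, roughly +0.12/−0.10, is comparable to the quoted experimental uncertainty of ± 0.11.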

P-Value

The expected and observed p-values are given below. The expected significance for the combination represents a 13% improvement over the best single analysis. The observed significance is 16% better than the best single analysis.
Expected and Observed P-Value

[EPS][GIF]
Expected significance estimated to be > 5.9 σ
Observed p-value = 3.1 × 10⁻⁷ (5.0 σ)
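The conversion between a p-value and a significance in units of σ can be sketched with the standard library, assuming the one-sided Gaussian convention (which reproduces the numbers quoted above). The function name is an illustrative choice.

```python
from statistics import NormalDist


def significance_from_p_value(p):
    """One-sided Gaussian significance, in sigma, for a p-value in (0, 1)."""
    # The upper-tail quantile equals minus the lower-tail quantile by symmetry.
    return -NormalDist().inv_cdf(p)


z_obs = significance_from_p_value(3.1e-7)
print(f"{z_obs:.1f} sigma")
```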

Cross Section by Channel


[EPS][GIF]
This plot shows the results of fitting for σSingle Top in each of the eleven channels separately, as well as the results of the global fits to all lepton + jets channels, all MET + jets channels, and all channels.

Interesting Events

The table below lists the five most single-top-like events from the purest analysis channel, TLC, 2-Jet, 1-Tag. The quantities shown in the table include:


[PDF][EPS][PNG]

Below are event displays for some of the interesting events.

Run 148916, Event 792764
Run 206282, Event 3294678
Run 242557, Event 1564229
Run 262776, Event 4920497

Figures from the PRL

Inputs to Super Discriminant
Super Discriminant and MET+jets

[EPS][GIF]

[EPS][GIF]
The five analyses that provide inputs for the super discriminant. The super discriminant and MET+jets shapes that contribute to the final fit (summed over all channels).

Last modified: Thu Mar 5 06:14:10 CST 2009