Search for Electroweak Single Top-Quark Production using Neural Networks with 2.7 fb⁻¹ of CDF II data

Dominic Hirschbühl, Jan Lück, Thomas Müller, Adonis Papaikonomou, Manuel Renz, Wolfgang Wagner

KIT, Universität Karlsruhe

 


Abstract
Results
Event Selection
Neural Network Input Variables
Templates for Combined Search
Templates for Separate Search
Systematic Uncertainties
Expected Significance for Combined Search
Binned Likelihood Fit to Data for Combined Search
Binned Likelihood Fit to Data for Separate Search
Observed Significance for Combined Search
 


 

Abstract

We report on a search for electroweak single top-quark production with CDF II data corresponding to 2.7 fb⁻¹ of integrated luminosity. We apply neural networks to construct discriminants that distinguish between single top-quark and background events. Two analyses are performed, assuming a top-quark mass of 175 GeV/c².

In the first analysis, we combine s- and t-channel events into one single top-quark signal under the assumption that the ratio of the two processes is given by the standard model (SM). The expected significance under the assumption of the SM cross section is determined to be 5.0 σ (p-value of 0.26 × 10⁻⁶). A binned likelihood fit to the data measures a single top-quark production cross section of 2.1 +0.7/-0.6 pb. The observed p-value is 103.94 × 10⁻⁶, which corresponds to a significance of 3.7 σ.

In the second analysis, we separate the two single top-quark production modes, s-channel and t-channel. A binned likelihood fit performed simultaneously to two-dimensional and one-dimensional distributions of neural network outputs yields most probable values for the cross sections of 2.0 +0.7/-0.7 pb for the s-channel and 0.8 +0.6/-0.5 pb for the t-channel production mode.

 

Results
Combined s- and t-channel Search
The sum of the NN outputs of all eight channels. Background and signal templates are normalized to the SM prediction.
For the combined search the observed single top-quark production cross section is: 2.1 +0.7/-0.6 pb

Separate s- and t-channel Search
The likelihood fit estimate for the simultaneous s- and t-channel cross section measurement. The contours of the 1σ, 2σ, and 3σ uncertainties are valid for both production channels simultaneously. The error bars represent the 1σ, 2σ, and 3σ uncertainties of the given production channel without any assumptions on the other production channel.
For the separate search the observed s- and t-channel single top-quark production cross sections are:

 

 

 

Event Selection

The CDF event selection exploits the kinematic features of the signal final state, which contains a top quark, a bottom quark, and possibly additional light-quark jets. To reduce multijet backgrounds, the W boson originating from the top quark is required to decay leptonically. One therefore demands a single high-energy electron (ET(e) > 20 GeV) or muon (PT(μ) > 20 GeV/c) and large missing transverse energy (MET > 25 GeV) from the undetected neutrino.

The backgrounds belong to the following categories: Wbb, Wcc, Wc, mistags (light-quark jets misidentified as heavy-flavor jets), top pair production (ttbar events in which one lepton or two jets are lost due to detector acceptance), non-W (QCD multijet events where a jet is erroneously identified as a lepton), Z→ll, and diboson production (WW, WZ, and ZZ). We remove a large fraction of the backgrounds by demanding that exactly two jets with ET > 20 GeV and |η| < 2.8 be present in the event. At least one of these two jets has to be tagged as a b-quark jet using displaced-vertex information from the silicon vertex detector (SVX). The non-W content of the selected dataset is further reduced by several requirements on MET, MET significance, the transverse W boson mass, and several angles between the MET vector, lepton vectors, and jet vectors.
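The basic requirements quoted above can be sketched as a simple event filter. This is only an illustration: the `Event` and `Jet` containers are hypothetical, "single lepton" is interpreted here as exactly one lepton above threshold, and the additional non-W requirements (MET significance, transverse W mass, angular cuts) are left out because their values are not given on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Jet:
    et: float        # transverse energy in GeV
    eta: float       # pseudorapidity
    b_tagged: bool   # True if tagged by the secondary-vertex algorithm


@dataclass
class Event:
    lepton_pts: List[float]  # ET (electrons) or PT (muons) of identified leptons, in GeV
    met: float               # missing transverse energy in GeV
    jets: List[Jet]


def passes_selection(event: Event) -> bool:
    """Lepton, MET, jet-multiplicity and b-tag requirements (hypothetical helper)."""
    # Exactly one lepton above 20 GeV (assumed meaning of "single").
    leptons = [pt for pt in event.lepton_pts if pt > 20.0]
    if len(leptons) != 1:
        return False
    # Large missing transverse energy from the undetected neutrino.
    if event.met <= 25.0:
        return False
    # Exactly two jets with ET > 20 GeV and |eta| < 2.8.
    jets = [j for j in event.jets if j.et > 20.0 and abs(j.eta) < 2.8]
    if len(jets) != 2:
        return False
    # At least one of the two jets must carry a b tag.
    return any(j.b_tagged for j in jets)


good = Event([35.0], 40.0, [Jet(50.0, 0.5, True), Jet(30.0, -1.2, False)])
print(passes_selection(good))
```

An event with a third jet above threshold, with MET below 25 GeV, or without any tagged jet fails this filter.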

 

Neural Network Input Variables
Using neural networks, kinematic or event shape variables are combined into a powerful discriminant. In the combined search we use four different networks: one for the 2jet1tag category, one for 2jet2tag events, one for 3jet1tag events, and one for 3jet2tag events. We divide each of the four categories into two separate channels: one containing triggered electrons and muons, called Triggered Lepton Coverage (TLC), and the other containing muons from an Extended Muon Coverage (EMC) accepted through the MET + 2 jets trigger. For the separate search we include an additional network in the 2jet1tag category to build a 2D discriminant. This improves the a priori sensitivity to the s-channel by about 15%.
One of the variables is the output of the KIT flavor separator. The KIT flavor separator gives an additional handle to reduce the large background components that contain no real b quarks, namely mistags and charm backgrounds. These backgrounds amount to about 50% of the W+2 jets data sample even after imposing the requirement that one jet is identified by the secondary vertex tagger of CDF. The following plots show the 14 variables for the TLC 2jet1tag channel. The plots in the third column show the variables in the "0 Tag" sample (for cross-check).

For each variable the three columns show: MC distributions, data - MC comparison, data - MC comparison in the "0 Tag" sample.

the mass of the reconstructed top-quark
the neural network output of the KIT flavor separator for the b-tagged jet (first two columns only)
the invariant mass of the two jets
the product of the lepton-charge and the pseudorapidity of the light-quark jet
the transverse mass of the reconstructed top-quark
the cosine of the polar angle between the tight lepton and the light-quark jet in the top-quark rest-frame
the transverse energy of the light-quark jet
the cosine of the polar angle between the charged lepton in the W-boson rest-frame and the direction of the W-boson
the pseudorapidity of the reconstructed W-boson
the transverse mass of the reconstructed W-boson
the sum of the pseudorapidities of the two jets
the transverse momentum of the charged lepton
the scalar sum of transverse energies
the cosine of the angle between the charged lepton in the W-boson rest-frame and the W-boson momentum in the top-quark rest-frame
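Two of the input variables above have compact textbook definitions, sketched here for orientation. The transverse-mass formula assumes a massless lepton and neutrino, and the function names are hypothetical; the exact reconstruction used in the analysis is not spelled out on this page.

```python
import math


def w_transverse_mass(lepton_pt, met, delta_phi):
    """Transverse mass of the W boson from lepton PT, MET and their azimuthal separation.

    Standard massless approximation: M_T = sqrt(2 * PT * MET * (1 - cos(delta_phi))).
    """
    return math.sqrt(2.0 * lepton_pt * met * (1.0 - math.cos(delta_phi)))


def pseudorapidity(px, py, pz):
    """eta = asinh(pz / pT), equivalent to -ln(tan(theta / 2))."""
    return math.asinh(pz / math.hypot(px, py))


def charge_times_eta(lepton_charge, jet_px, jet_py, jet_pz):
    """Product of the lepton charge and the pseudorapidity of the light-quark jet."""
    return lepton_charge * pseudorapidity(jet_px, jet_py, jet_pz)


# Lepton and MET back to back in azimuth, 40 GeV each.
print(w_transverse_mass(40.0, 40.0, math.pi))
```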

 

Templates for Combined Search
We use four different neural networks: one for the 2jet1tag category, one for the 2jet2tag category, one for the 3jet1tag category, and one for the 3jet2tag category. We divide each of the four categories into two separate channels: one containing triggered electrons and muons, called Triggered Lepton Coverage (TLC), and the other containing muons from an Extended Muon Coverage (EMC) accepted through the MET + 2 jets trigger. Since this is a combined search, we have one fit template for single top-quark events, which is the combination of the s-channel template and the t-channel template according to the ratio of the cross sections predicted by the SM.

Fit templates of the TLC 2jet1tag channel. Fit templates of the EMC 2jet1tag channel.

Fit templates of the TLC 2jet2tag channel. Fit templates of the EMC 2jet2tag channel.

Fit templates of the TLC 3jet1tag channel. Fit templates of the EMC 3jet1tag channel.

Fit templates of the TLC 3jet2tag channel. Fit templates of the EMC 3jet2tag channel.
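Merging the s- and t-channel templates into one signal template can be sketched as a weighted sum of shapes. The weights below are placeholders for the SM-predicted relative contributions of the two processes (in practice the expected event yields after selection); none of the numbers are taken from the analysis.

```python
import numpy as np


def combined_signal_template(shape_s, shape_t, weight_s, weight_t):
    """Weighted sum of two unit-normalized templates, renormalized to unit area."""
    combined = (weight_s * np.asarray(shape_s, dtype=float)
                + weight_t * np.asarray(shape_t, dtype=float))
    return combined / combined.sum()


# Hypothetical three-bin shapes and equal weights, for illustration only.
shape_s = [0.5, 0.3, 0.2]
shape_t = [0.1, 0.3, 0.6]
template = combined_signal_template(shape_s, shape_t, 1.0, 1.0)
print(template)
```

With the ratio fixed in this way, the fit has a single free signal normalization instead of two.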

 

Templates for Separate Search
For the separate search we use five neural networks. In the most sensitive category, 2jet1tag, two independent neural network outputs are combined into a 2D discriminant. One network is trained for s-channel and the other for t-channel production, which provides the following 2D templates to search for both production channels simultaneously:

2D template of s-channel single top-quark production in the TLC 2 Jet 1 Tag channel 2D template of t-channel single top-quark production in the TLC 2 Jet 1 Tag channel

2D template of top pair production in the TLC 2 Jet 1 Tag channel 2D template of Wbb+Wcc production in the TLC 2 Jet 1 Tag channel
2D template of Wc production in the TLC 2 Jet 1 Tag channel 2D template of Wqq production in the TLC 2 Jet 1 Tag channel

2D template of Diboson production in the TLC 2 Jet 1 Tag channel 2D template of Z+jets production in the TLC 2 Jet 1 Tag channel

2D template of QCD multijet production in the TLC 2 Jet 1 Tag channel

The 2D neural network outputs are unrolled bin by bin to obtain one-dimensional templates, which are fitted to data simultaneously with the templates of the network outputs in the remaining channels:

Fit templates of the TLC 2jet1tag channel. Fit templates of the EMC 2jet1tag channel.

Fit templates of the TLC 2jet2tag channel. Fit templates of the EMC 2jet2tag channel.

Fit templates of the TLC 3jet1tag channel. Fit templates of the EMC 3jet1tag channel.

Fit templates of the TLC 3jet2tag channel. Fit templates of the EMC 3jet2tag channel.
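Turning a 2D template into a one-dimensional one amounts to flattening the histogram in a fixed bin order. A minimal sketch, assuming row-by-row ordering (the ordering actually used in the analysis is not stated here; any fixed order works as long as data and all templates use the same one):

```python
import numpy as np


def unroll(template_2d):
    """Flatten a 2D histogram row by row: bin (i, j) maps to index i * n_cols + j."""
    return np.asarray(template_2d, dtype=float).reshape(-1)


# Hypothetical 2 x 3 template with distinct bin contents.
template = np.arange(6.0).reshape(2, 3)
flat = unroll(template)
print(flat)
```

The total content is unchanged, so the normalization of each template carries over to its one-dimensional version.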

 

Systematic Uncertainties
Systematic uncertainties can shift the event detection efficiency for the different physics processes (rate uncertainties) and can also change the shape of the template distributions (shape uncertainties). The rate uncertainties for the eight different channels are summarized in the tables. Below you find three examples of systematic shape uncertainties in the 2jet1tag channel: jet energy scale (JES) for the single top-quark template, factorization and renormalization scale (Q2) for Wbb events, and modeling uncertainty on the KIT flavor separator output (KIT opt.).

Systematic rate uncertainties for the TLC 2jet1tag channel. Systematic rate uncertainties for the TLC 2jet2tag channel.
Systematic rate uncertainties for the TLC 3jet1tag channel. Systematic rate uncertainties for the TLC 3jet2tag channel.



The JES systematic uncertainty for the TLC channels.


Systematic shape uncertainties in the TLC 2jet 1tag channel: jet energy scale (JES) for the single top-quark template. Systematic shape uncertainties in the TLC 2jet 1tag channel: factorization and renormalization scale (Q2) for Wbb events. Systematic shape uncertainties in the TLC 2jet 1tag channel: modeling uncertainty on the KIT flavor separator output (KIT opt.).

 

Expected Significance for Combined Search
To compute the significance of a potentially observed signal, we perform a hypothesis test. Two hypotheses are considered. The first one, H0, assumes that the single-top cross section is zero (β1 = 0) and is called the null hypothesis. The second hypothesis, H1, assumes that the single-top production cross section is the one predicted by the standard model (β1 = 1). The objective of our analysis is to observe single-top production, that is, to reject the null hypothesis.

The hypothesis test is based on the Q-value, Q = -2 (ln L_red(β1=1) - ln L_red(β1=0)), where L_red(β1=1) is the value of the reduced likelihood function at the standard model prediction and L_red(β1=0) is the value of the reduced likelihood function for a single-top cross section of zero. Using two ensemble tests, the distribution of Q-values is determined for the case with single-top included at the standard model rate, q1, and for the case of zero single-top cross section, q0. The two Q-value distributions are shown below.

To quantify how compatible an outcome is with the null hypothesis, we define the p-value, often also named 1-CLb: the probability under H0 of obtaining a Q-value at least as signal-like as a given one. To quantify the sensitivity of our analysis we define the expected p-value p_exp = p(Q1_med), where Q1_med is the median of the Q-value distribution q1 for the hypothesis H1. The meaning of p_exp is the following: under the assumption that H1 is correct, one expects to observe a p-value of p_exp or smaller with a probability of 50%. We find p_exp = 0.26 × 10⁻⁶, including all systematic uncertainties. In other words, assuming the predicted single-top cross section, we expect with a probability of 50% to see enough single-top events that the observed excess over the background corresponds to at least a 5.0σ background fluctuation.


Distributions of Q-values for two ensemble tests, one with single-top events present at the expected standard model rate, one without any single-top events. The expected significance under the assumption of a SM cross-section is determined to be 5.0 σ.
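The two ensemble tests and the expected p-value can be illustrated with a toy model. This sketch uses a plain Poisson likelihood in place of the reduced likelihood (no systematic uncertainties), and the four-bin signal and background templates are invented for illustration. With Q defined as above, signal-like outcomes have small Q, so the p-value is taken as the fraction of H0 pseudo-experiments with Q at or below the reference value.

```python
import numpy as np


def q_value(counts, signal, background):
    """Q = -2 (ln L(beta1=1) - ln L(beta1=0)) for Poisson-distributed bin counts.

    The ln(n!) terms are identical in both likelihoods and cancel in the difference.
    """
    mu1 = signal + background  # H1: signal at the predicted rate
    mu0 = background           # H0: background only
    ln_l1 = np.sum(counts * np.log(mu1) - mu1, axis=-1)
    ln_l0 = np.sum(counts * np.log(mu0) - mu0, axis=-1)
    return -2.0 * (ln_l1 - ln_l0)


def ensemble_test(signal, background, n_pseudo=100_000, seed=1):
    """Return the Q distributions under H0 and H1 and the expected p-value."""
    rng = np.random.default_rng(seed)
    shape = (n_pseudo, background.size)
    q0 = q_value(rng.poisson(background, size=shape), signal, background)
    q1 = q_value(rng.poisson(signal + background, size=shape), signal, background)
    # Expected p-value: fraction of H0 outcomes at least as signal-like as the H1 median.
    p_exp = np.mean(q0 <= np.median(q1))
    return q0, q1, p_exp


# Hypothetical templates (expected events per bin), not taken from the analysis.
signal = np.array([1.0, 2.0, 3.0, 4.0])
background = np.array([50.0, 30.0, 15.0, 5.0])
q0, q1, p_exp = ensemble_test(signal, background)
print(p_exp)
```

Resolving a p-value as small as the one quoted above by direct counting requires far more pseudo-experiments than this toy generates.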

 

Binned Likelihood Fit to Data for Combined Search
Finally, the templates for all eight channels are fitted simultaneously to the observed distributions using a binned likelihood function. The fit yields a single top-quark production cross section of 2.1 +0.7/-0.6 pb. Below you find the distributions of observed data and MC normalized to the SM prediction for all eight networks and for the sum.

NN Output for the TLC 2jet1tag channel. NN Output for the EMC 2jet1tag channel.

NN Output for the TLC 2jet2tag channel. NN Output for the EMC 2jet2tag channel.

NN Output for the TLC 3jet1tag channel. NN Output for the EMC 3jet1tag channel.

NN Output for the TLC 3jet2tag channel. NN Output for the EMC 3jet2tag channel.

The sum of the NN Outputs of the eight different channels with inset zoom. The sum of the NN Outputs of the eight different channels without inset zoom.



Summary of the results for the eight different channels and the final result of the simultaneous fit in all channels.
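The principle of a binned likelihood fit can be sketched with a one-parameter toy: a signal strength beta scales the signal template, the negative log-likelihood is scanned over beta, and the 1σ interval is read off where it rises by 0.5 above the minimum. The templates are invented, systematic uncertainties are ignored, and a grid scan stands in for a proper minimizer.

```python
import numpy as np


def negative_log_likelihood(beta, data, signal, background):
    """Poisson -ln L for expectation beta * signal + background (constant terms dropped)."""
    mu = beta * signal + background
    return np.sum(mu - data * np.log(mu))


def fit_signal_strength(data, signal, background, grid=None):
    """Grid scan: return the best-fit beta and the Delta ln L = 0.5 interval bounds."""
    if grid is None:
        grid = np.linspace(0.0, 3.0, 3001)
    nll = np.array([negative_log_likelihood(b, data, signal, background) for b in grid])
    best = int(np.argmin(nll))
    inside = grid[nll - nll[best] <= 0.5]
    return grid[best], inside.min(), inside.max()


# Hypothetical templates; the "data" are set to the expectation for beta = 1.
signal = np.array([1.0, 2.0, 4.0, 8.0])
background = np.array([50.0, 30.0, 15.0, 5.0])
data = signal + background
beta_hat, beta_lo, beta_hi = fit_signal_strength(data, signal, background)
print(beta_hat, beta_lo, beta_hi)
```

The interval is generally asymmetric around the best-fit value, which is why fit results are quoted with separate upward and downward uncertainties.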

 

Binned Likelihood Fit to Data for Separate Search
The templates are fitted simultaneously in all eight channels to the observed distributions using a binned likelihood function. The fit yields single top-quark production cross sections of 1.6 +0.8/-0.9 pb for the s-channel and 0.8 +0.7/-0.6 pb for the t-channel production mode. Below you find the resulting likelihood as a function of the s- and t-channel cross sections.



The likelihood fit estimate for the simultaneous s- and t-channel production cross section measurement. The contours of the 1σ (Δln(L)=1.15), 2σ (Δln(L)=3.09), and 3σ (Δln(L)=5.92) uncertainties are valid for both production channels simultaneously. The error bars represent the 1σ (Δln(L)=0.5), 2σ (Δln(L)=2.0), and 3σ (Δln(L)=4.5) uncertainties of the given production channel without any assumptions on the other production channel.
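The Δln(L) thresholds in the caption follow from the usual Gaussian (chi-square) approximation, which is assumed in this sketch. For one parameter, nσ corresponds to Δln(L) = n²/2. For two parameters simultaneously, the chi-square distribution with two degrees of freedom has the cumulative distribution 1 - exp(-x/2), so the threshold is -ln(1 - CL), with CL the two-sided Gaussian coverage of nσ.

```python
import math


def delta_lnl_1d(n_sigma):
    """Delta ln L for an n-sigma interval on a single parameter."""
    return 0.5 * n_sigma ** 2


def delta_lnl_2d(n_sigma):
    """Delta ln L for a joint two-parameter contour with n-sigma coverage.

    1 - CL equals erfc(n / sqrt(2)); for two degrees of freedom Delta ln L = -ln(1 - CL).
    """
    return -math.log(math.erfc(n_sigma / math.sqrt(2.0)))


for n in (1, 2, 3):
    print(n, delta_lnl_1d(n), round(delta_lnl_2d(n), 3))
```

The one-parameter values are exactly 0.5, 2.0, and 4.5; the two-parameter values agree with 1.15, 3.09, and 5.92 to within rounding.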

 

Observed Significance for Combined Search


The observed Q-value (indicated by the arrow) yields a p-value of 103.94 × 10⁻⁶, which corresponds to an observed significance of 3.7 σ.
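The relation between a p-value and a significance in units of σ can be checked with a short calculation. The sketch assumes the one-sided Gaussian tail convention, which is consistent with the numbers quoted on this page.

```python
from statistics import NormalDist


def significance(p_value):
    """Number of standard deviations whose one-sided Gaussian tail equals p_value."""
    return -NormalDist().inv_cdf(p_value)


print(f"{significance(0.26e-6):.1f}")    # expected p-value
print(f"{significance(103.94e-6):.1f}")  # observed p-value
```

Rounded to one decimal place, the two p-values give 5.0 and 3.7, matching the expected and observed significances.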

 

Our single top-quark results were approved (blessed) by CDF on Thursday 7/24/2008.