The ATLAS and CMS combination of Higgs search results
1st December 2011 | By
The Higgs boson is the only missing piece in the Standard Model of particle physics, and the search for it is undoubtedly one of the most important in the history of physics. The Higgs boson is the generator of all elementary particle masses in nature. Its own mass is unknown; it was searched for in experiments before the LHC but not found.
LHC experiments have produced excellent results since the start of data taking. About a year ago, a discussion was initiated in ATLAS and CMS to combine the Higgs search results from both experiments. The framework and the procedure for combining results had to be defined and agreed upon before the combined analysis could proceed.
The LHC Higgs working groups of ATLAS and CMS are made up of a few hundred people, and combining Higgs boson results from different experiments was not just a technical challenge but also a sociological one. Each experiment sent its experts and representatives to hold discussions with the experts from the other experiment. From the beginning of 2011, statistical experts and representatives from both experiments started meeting regularly to discuss and converge on the combination procedure. This was the first example of this kind of scientific collaboration at the LHC. I was involved in these discussions as the ATLAS contact person for the combined analyses. Many different issues needed to be addressed for a successful combination.
The ATLAS and CMS analyses need to use consistent Higgs boson production cross-sections and decay branching ratios. The cross-sections determine the production rate of the Higgs boson, and the branching ratios give the relative weight of its various decay patterns. The Higgs boson has a very short lifetime, so it is searched for through its decay products.
The LHC Higgs cross-section group, formed much earlier, includes members of ATLAS, CMS and the theory community. A lot of effort has gone into providing common tools to compute Higgs cross-sections and decay branching ratios, and their uncertainties. Common tools to estimate the background (noise) cross-sections were also discussed and used. A background is a process that looks similar to the Higgs boson signal, and therefore sophisticated analysis tools are used to separate, as much as possible, the Higgs boson signal from the background noise. Most backgrounds were obtained from data control region measurements (meaning that the data itself was used to estimate the background); only a handful of the background estimations relied upon theoretical predictions.
The combined analyses require a unified framework for common statistical tools and data exchange. After some validation efforts, RooStats (a statistical analysis framework developed at CERN) was adopted for the combination. RooStats is built upon a common platform for exchanging information, known as the WorkSpace. The WorkSpace contains all the information needed for the statistical analyses and simplifies the logistics of data exchange between the collaborations.
There were many discussions to identify the systematic uncertainties that should be correlated between the experiments, and the degree of correlation. Uncertainties affect our theoretical computations and experimental measurements, often because of our limited understanding, the complexity of the computations and/or the precision of our measurements. When uncertainties come from the same source in both experiments or in different analyses, we say that they are correlated. Theoretical uncertainties from PDF (Parton Density Functions), αs (strong coupling constant), and QCD (Quantum Chromodynamics) renormalization and factorization scales (uncertainties resulting from the precision of the theoretical computations of the cross-sections) were correlated between the experiments and between processes. The following uncertainties were also correlated between the experiments:
- The modeling of the underlying event (the proton is made of many constituents, namely quarks and gluons; when two protons collide, additional interactions take place besides the main one, and these make up the underlying event).
- The parton shower (the process by which the basic constituents of matter, such as quarks and gluons, manifest themselves in nature).
- Experimental uncertainties on luminosity measurements (the luminosity is related to the total intensity of the collisions).
QCD scale uncertainties on jet counting in Higgs boson production (jets are the result of parton showers) were also treated, as were QCD uncertainties in data-driven background (noise) estimations where the extrapolation factors from data control regions to signal regions were taken from theory.
If particular uncertainties (within one or both experiments) were taken to be 100% correlated, they were given the same name. Different names imply no correlation. Any two sources of uncertainty that were believed to be only partially correlated were either broken down further into independent sub-contributions or declared to be fully correlated or uncorrelated, whichever was believed to be more appropriate or more conservative. To avoid accidental correlations in the combination, uncertainties specific to each experiment were given the prefix ATLAS or CMS. Uncertainties without such prefixes were assumed to be 100% correlated between the two experiments.
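The naming convention can be illustrated with a small Python sketch. The source names below are hypothetical and chosen only for illustration; the real combination encoded this in its statistical model rather than in plain lists.

```python
from collections import defaultdict

def group_by_name(atlas_sources, cms_sources):
    """Map each uncertainty name to the set of experiments that declare it."""
    owners = defaultdict(set)
    for name in atlas_sources:
        owners[name].add("ATLAS")
    for name in cms_sources:
        owners[name].add("CMS")
    return owners

def shared_names(owners):
    """Names declared by both experiments are treated as 100% correlated."""
    return sorted(name for name, exps in owners.items() if len(exps) == 2)

def misplaced_prefixes(owners):
    """Names carrying one experiment's prefix but declared by the other."""
    bad = []
    for name, exps in owners.items():
        for exp in ("ATLAS", "CMS"):
            if name.startswith(exp + "_") and exps != {exp}:
                bad.append(name)
    return sorted(bad)

# Hypothetical source names, for illustration only.
atlas = ["pdf_gg", "QCDscale_ggH", "lumi", "ATLAS_jet_energy_scale"]
cms = ["pdf_gg", "QCDscale_ggH", "lumi", "CMS_jet_energy_scale"]

owners = group_by_name(atlas, cms)
shared = shared_names(owners)
print("correlated between experiments:", shared)
print("misplaced prefixes:", misplaced_prefixes(owners))
```

Because correlation is decided purely by name, a check like `misplaced_prefixes` is one way an accidental correlation could be caught before the inputs are merged.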
The only quantity that interests us is the Higgs boson production rate; all other related parameters are called nuisance parameters. In the statistical analyses of the combination, systematic uncertainties on observables were handled by introducing nuisance parameters associated with probability density functions (pdfs), each characterized by a best estimate (mean, median or peak) and additional parameters describing the shape of the pdf. Many different pdfs were considered. Ultimately, log-normal pdfs were used for uncertainties that were correlated between the experiments, and Gamma pdfs were used for uncorrelated uncertainties such as those on Monte Carlo statistics or on the number of events in data control regions.
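To give a feel for the two pdf families, here is a minimal Python sketch. It assumes one common parametrization: a log-normal uncertainty multiplies a nominal yield by `kappa**theta` with `theta` drawn from a unit Gaussian, and a control-region count `n_control` constrains the underlying rate through a Gamma distribution with shape `n_control + 1`. The function names and all numbers are hypothetical, not taken from the combination.

```python
import random
import statistics

def lognormal_yield(nominal, kappa, rng):
    """Sample nominal * kappa**theta with theta ~ N(0, 1).

    kappa = 1.10 corresponds to roughly a 10% uncertainty; the sampled
    yield is always positive and its median is the nominal value.
    """
    theta = rng.gauss(0.0, 1.0)
    return nominal * kappa ** theta

def gamma_background(n_control, extrapolation, rng):
    """Sample a signal-region background from a control-region count.

    The rate behind n_control observed events is drawn from
    Gamma(shape=n_control + 1, scale=1), then scaled by the
    extrapolation factor from control region to signal region.
    """
    return extrapolation * rng.gammavariate(n_control + 1, 1.0)

rng = random.Random(2011)  # fixed seed so the sketch is reproducible
ln_samples = [lognormal_yield(100.0, 1.10, rng) for _ in range(20000)]
ga_samples = [gamma_background(25, 0.2, rng) for _ in range(20000)]

print("log-normal median:", statistics.median(ln_samples))
print("gamma mean:", statistics.mean(ga_samples))
```

Both choices keep the sampled yields positive, which a plain Gaussian would not guarantee for large uncertainties or small counts.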
The Higgs boson test masses used in the combination were discussed. A grid of Higgs boson masses ranging from 110 to 600 GeV was chosen, with step sizes driven by the mass resolution in Higgs boson decays to two photons or four leptons (leptons are either muons or electrons in this context). Results for Higgs boson mass points for which there was no simulation were provided by interpolation.
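As a rough illustration of filling in a mass point that was not simulated, the sketch below interpolates linearly between the two neighbouring simulated masses. The choice of linear interpolation, the function name and the efficiency values are all assumptions made for this example; the interpolation scheme actually used is not described here.

```python
from bisect import bisect_right

def interpolate(mass, simulated):
    """Linearly interpolate a quantity between the nearest simulated masses.

    `simulated` maps a Higgs boson mass in GeV to the simulated value.
    """
    masses = sorted(simulated)
    if not masses[0] <= mass <= masses[-1]:
        raise ValueError("mass outside the simulated range")
    if mass in simulated:
        return simulated[mass]
    upper = bisect_right(masses, mass)  # index of first mass above `mass`
    m_lo, m_hi = masses[upper - 1], masses[upper]
    frac = (mass - m_lo) / (m_hi - m_lo)
    return simulated[m_lo] + frac * (simulated[m_hi] - simulated[m_lo])

# Hypothetical signal efficiencies at simulated mass points (not real values).
efficiency = {110: 0.30, 120: 0.34, 130: 0.36, 150: 0.38}

print(interpolate(125, efficiency))  # halfway between the 120 and 130 points
```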
The procedure for setting limits or quantifying an excess was discussed and agreed upon. In the limit-setting procedure, one compares the compatibility of the data with a background-only hypothesis and with a signal+background hypothesis. Monte Carlo pseudo-experiments were carried out to generate the pdfs, and the probabilities (p-values) associated with the actual observation under the background-only and signal+background hypotheses were constructed. The ratio of these probabilities (known as CLs) was used to set the exclusion limit at some confidence level. For example, when CLs < 0.05, the signal, at its nominal production rate, is excluded at 95% confidence level (this can be roughly interpreted as a less than 5% chance of the Higgs boson still being there if it was excluded by mistake). Results using asymptotic formulae, which are normally valid when the number of events analyzed is large, were also provided and were in good agreement with the main results.
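The CLs idea can be sketched in Python for the simplest possible case, a single counting experiment. This is a toy under stated assumptions: the observed event count stands in for the test statistic of the real analysis, there are no nuisance parameters, and the function names and numbers are hypothetical.

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson-distributed count (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def cls_value(n_obs, signal, background, n_toys, rng):
    """Estimate CLs = CL(s+b) / CL(b) from pseudo-experiments.

    CL(s+b) and CL(b) are the fractions of pseudo-experiments that give
    a count no larger than the observed one under each hypothesis.
    """
    sb_toys = [poisson(signal + background, rng) for _ in range(n_toys)]
    b_toys = [poisson(background, rng) for _ in range(n_toys)]
    cl_sb = sum(n <= n_obs for n in sb_toys) / n_toys
    cl_b = sum(n <= n_obs for n in b_toys) / n_toys
    if cl_b == 0:
        raise ValueError("too few toys to estimate CL(b)")
    return cl_sb / cl_b

rng = random.Random(1)
# Hypothetical numbers: 10 events observed, 10 expected from background.
for signal in (3.0, 10.0):
    value = cls_value(10, signal, 10.0, 5000, rng)
    verdict = "excluded" if value < 0.05 else "not excluded"
    print(f"signal = {signal}: CLs = {value:.3f} ({verdict} at 95% CL)")
```

Dividing by CL(b) protects against excluding a signal to which the experiment has little sensitivity, for instance when the background happens to fluctuate downwards.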
To quantify an excess, a test is normally defined to characterize the probability that the background (noise) could fluctuate upwards to produce an excess of events as large as or larger than the observed one. Through Monte Carlo pseudo-experiments, the pdf for the background-only hypothesis was constructed for the Higgs boson mass being tested, and one could then evaluate the significance of a potential excess at that mass. Subsequently, the probability or the significance of observing an excess anywhere in the search range can be computed.
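For a single counting experiment, the background-only probability can be written down exactly, so the sketch below uses a Poisson sum in place of pseudo-experiments. The step from one mass point to the whole search range assumes a number of independent search regions, which is a strong simplification, and all names and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def local_p_value(n_obs, background):
    """P(n >= n_obs) for a Poisson count with the background-only mean."""
    below = sum(
        math.exp(-background) * background ** k / math.factorial(k)
        for k in range(n_obs)
    )
    return 1.0 - below

def significance(p_value):
    """One-sided Gaussian significance Z corresponding to a p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)

def global_p_value(p_local, n_regions):
    """Chance of at least one such excess among n independent regions."""
    return 1.0 - (1.0 - p_local) ** n_regions

# Hypothetical numbers: 20 events observed where 10 are expected.
p_local = local_p_value(20, 10.0)
p_global = global_p_value(p_local, 10)
print(f"local:  p = {p_local:.4f}, Z = {significance(p_local):.2f}")
print(f"global: p = {p_global:.4f}, Z = {significance(p_global):.2f}")
```

The global significance is always lower than the local one, which is why an excess at one mass point looks less striking once the whole search range is taken into account.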
Once the procedure for carrying out the combined analysis was defined and agreed upon, validation and cross-checks were performed to ensure agreement in the implementation and interpretation of the procedure. In the validation tests, both collaborations prepared their data by building the individual WorkSpaces mentioned previously. The individual WorkSpaces were then shared with the other collaboration. Each collaboration built its own version of the combined WorkSpaces and performed the statistical calculations on them. Both groups then met to compare and discuss the results. In all cases, these results were in excellent agreement. After the validation, the main physics results were prepared and submitted to the collaborations for review and approval.
The first successful ATLAS and CMS combined Higgs results were presented at the HCP conference in Paris. No evidence of the Standard Model Higgs boson was found. The results exclude the Standard Model Higgs boson in the mass range of 141-476 GeV at 95% confidence level, and the sensitivity is within a factor of two for all the Higgs boson mass points tested. The ATLAS and CMS collaborations have each collected much more data than was used for the current combined results. With this increased amount of data, the next combined analysis will improve upon the current results.
It has been a pleasure for me to be involved in the analyses of these combined data and to be a part of a team that includes ATLAS and CMS representatives.