1 Introduction

Hidden valleys, referring to hidden sectors with a confining gauge group, can produce dark quarks in proton–proton collisions at the Large Hadron Collider (LHC), leading to a dark shower and the production of a large number of dark hadrons (\(\phi _{\textrm{D}}\)), analogous to QCD jets [1,2,3]. Depending on the details of the theory, the dark shower can proceed through large-angle emissions, in which case the dark hadrons do not arrange themselves into narrow QCD-like jets. The dark hadrons decay to dark photons (\(Z_{\textrm{D}}\)), which in turn decay to low-energy Standard Model particles with transverse energy (\(E_{\textrm{T}}\)) of \(\mathcal {O}(10^2)\) \(\,\text {MeV}\); the resulting experimental signature consists of high-multiplicity, spherically symmetric Soft Unclustered Energy Patterns, or SUEPs. We focus on a well-motivated scenario in which the SUEP is produced in exotic Higgs (H) boson decays via gluon–gluon fusion and all dark hadrons decay promptly and exclusively to pions and leptons, an experimental nightmare scenario because of the overwhelming multi-jet QCD background.

We have developed an Anomaly Detection (AD) technique [4, 5] that exploits unsupervised deep learning for real-time SUEP detection in the High-Level Trigger (HLT) system [6] of the Compact Muon Solenoid (CMS) experiment [7] at the LHC. Most AD techniques developed for high-energy physics applications rely on the online or offline reconstruction of collision events, which could be ineffective for new physics signatures. We instead use raw quantities reconstructed in the HLT system, i.e., the energy deposits in the different sub-detectors that serve as input to the online particle reconstruction. We have trained a deep convolutional neural autoencoder network (ConvAE) on QCD events; it is designed to detect events that differ significantly from the training data, without prior knowledge of the signal characteristics. The data are three-channel images: two-dimensional (in the \(\eta \)-\(\phi \) plane, where \(\eta \) is the pseudorapidity) \(E_{\textrm{T}}\) deposits in the inner tracker, electromagnetic calorimeter (ECAL), and hadron calorimeter (HCAL) sub-detectors of CMS. Although our work targets SUEP event tagging, the trained model could identify any other non-QCD-like signature, such as semi-visible jets [8] and emerging jets [9], spanning a large part of the hidden-valley landscape. We plan to cover those new physics signals in future work.

This paper is structured as follows: Sect. 2 describes the QCD and SUEP event generation, simulation, and reconstruction, as well as the event preselection. The ConvAE architecture and training are detailed in Sect. 3. In Sect. 4, the trained ConvAE is applied to the QCD background and to the SUEP signal for several Higgs boson mediator mass scenarios, and the performance results are presented. Section 5 concludes the paper.

2 Datasets

2.1 Generation, simulation, and reconstruction

Both multi-jet QCD and SUEP events are generated using pythia 8.244 [10]. Detector effects are simulated and events are reconstructed with delphes 3.5 [11] using the CMS detector configuration. Proton–proton collisions at the LHC Run 3 center-of-mass energy of 13.6\(\,\text {TeV}\) are considered, with approximately 50 collisions per bunch crossing (pileup). A pileup reduction is applied to the reconstructed tracks: the \(E_{\textrm{T}}\) deposits left in the inner tracker by pileup tracks are removed. No pileup reduction is applied to the calorimeters, in line with the HLT system's configuration during Run 3 [6].

The model parameters for the QCD event generation are identical to those in [12]. For the SUEP event generation, we used the SUEP_Generator plugin [13] with the following parameter settings:

  • mediator masses (\(m_{H}\)) = 125\(\,\text {GeV}\), 400\(\,\text {GeV}\), 700\(\,\text {GeV}\), and 1000\(\,\text {GeV}\)

  • dark meson mass (\(m_{\phi _{\textrm{D}}}\)) = 2\(\,\text {GeV}\)

  • branching ratio of \(\phi _{\textrm{D}} \rightarrow 2 Z_{\textrm{D}} \) = 100%

  • dark photon mass (\(m_{Z_{\textrm{D}}}\)) = 0.7\(\,\text {GeV}\)

  • branching ratios of \(Z_{\textrm{D}} \rightarrow \pi ^+\pi ^-,\) \(e^+e^-,\) \(\mu ^+\mu ^-\) = 70%, 15%, 15%

  • dark temperature (\(T_{\textrm{D}}\)) = 2\(\,\text {GeV}\)  [2]

In the rest of the paper, a SUEP sample generated with a mediator mass of \(m_{H}\)\(\,\text {GeV}\) is denoted as SUEP(\(m_{H}\)\(\,\text {GeV}\)).

2.2 Event preselection

Preselection criteria are applied to both QCD and SUEP events based on the detector geometry and the typical characteristics of QCD events. The \(|\eta |\) of the \(E_{\textrm{T}}\) deposits is required to be less than 2.5, as defined by the inner tracker coverage. The scalar \(E_{\textrm{T}}\) sum of reconstructed electrons, muons, photons, and jets in an event (denoted \(H_{\textrm{T}}\)) is required to exceed 500\(\,\text {GeV}\), a typical \(H_{\textrm{T}}\) requirement at the HLT after the L1 trigger selection during Run 3. The event preselection efficiency is \(\sim \)85% for QCD and \(\sim \)2.1%, \(\sim \)10.4%, \(\sim \)18.4%, and \(\sim \)22.7% for SUEP(125\(\,\text {GeV}\)), SUEP(400\(\,\text {GeV}\)), SUEP(700\(\,\text {GeV}\)), and SUEP(1000\(\,\text {GeV}\)), respectively.
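For illustration, a minimal sketch of this preselection is given below. The event structure and field names are hypothetical; only the \(|\eta |<2.5\) and \(H_{\textrm{T}}>500\,\text {GeV}\) requirements follow the text.

```python
# Sketch of the event preselection described above (hypothetical event format).
def passes_preselection(event, ht_min=500.0, eta_max=2.5):
    """Keep E_T deposits with |eta| < 2.5 and require H_T > 500 GeV."""
    # restrict deposits to the inner-tracker coverage
    event["deposits"] = [d for d in event["deposits"] if abs(d["eta"]) < eta_max]
    # H_T: scalar E_T sum of reconstructed electrons, muons, photons, and jets
    ht = sum(obj["et"] for obj in (event["electrons"] + event["muons"]
                                   + event["photons"] + event["jets"]))
    return ht > ht_min
```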

Figures 1 and 2 show the \(E_{\textrm{T}}\) deposits in the inner tracker, ECAL, and HCAL for a QCD event and a SUEP(1000\(\,\text {GeV}\)) event, respectively, both satisfying the preselection criteria. The ECAL is 25 times more granular (0.0174\(\times \)0.0174 \(\mathrm{rad^2}\) in \(\eta \)-\(\phi \)) than the HCAL (0.087\(\times \)0.087 \(\mathrm{rad^2}\) in \(\eta \)-\(\phi \)); hence, each HCAL pixel is divided into 25 equal pixels to match the ECAL granularity. The final shape of the images is (288, 360, 3). For this particular QCD event, the typical two-jet signature is clearly visible in the \(\eta \)-\(\phi \) plane. The data are also very sparse: only \(\sim \)1.5k of the 311,040 image pixels have nonzero values, an observation that holds for QCD events in general. For the SUEP(1000\(\,\text {GeV}\)) event, in contrast, there is a clear signature of a large number of low-\(E_{\textrm{T}}\) particles, spherically symmetric in \(\phi \). The \(E_{\textrm{T}}\) values are normalized to be of \(\mathcal {O}(1)\), which makes the ConvAE training faster and more stable.
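The following sketch illustrates how such (288, 360, 3) images could be built from lists of \((\eta , \phi , E_{\textrm{T}})\) deposits. The binning follows the granularity quoted above, while the function name, input format, and normalization constant are our own assumptions; the 25-fold splitting of HCAL cells is approximated by filling the fine grid directly at the deposit position.

```python
import numpy as np

N_ETA, N_PHI = 288, 360   # 0.0174-wide bins over |eta| < 2.5 and the full phi range

def to_image(deposits_trk, deposits_ecal, deposits_hcal, norm=100.0):
    """Build a (288, 360, 3) image from per-sub-detector (eta, phi, et) deposits.

    The normalization constant `norm` is hypothetical; it only needs to bring
    the pixel values to O(1).
    """
    image = np.zeros((N_ETA, N_PHI, 3), dtype=np.float32)
    for channel, deposits in enumerate((deposits_trk, deposits_ecal, deposits_hcal)):
        for eta, phi, et in deposits:
            i = int((eta + 2.5) / 5.0 * N_ETA)               # eta bin
            j = int((phi + np.pi) / (2.0 * np.pi) * N_PHI)   # phi bin
            if 0 <= i < N_ETA and 0 <= j < N_PHI:
                image[i, j, channel] += et / norm
    return image
```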

Fig. 1 \(E_{\textrm{T}}\) deposits for a QCD event in the inner tracker (left), ECAL (middle), and HCAL (right)

Fig. 2 \(E_{\textrm{T}}\) deposits for a SUEP(1000\(\,\text {GeV}\)) event in the inner tracker (left), ECAL (middle), and HCAL (right)

3 Autoencoders

Our ultimate target is to reject QCD jets and identify anomalous signatures, and the unsupervised learning nature of autoencoders (AEs) is well suited to this purpose [14]. In its simplest form, an AE is a neural network (NN) that maps the input (ground truth) to a compressed latent representation (encoding), the so-called bottleneck, and then back to the input dimensionality (decoding), providing an output (reconstruction) that approximates the input. The quality of the AE training is measured with a distance metric, the so-called reconstruction loss; for images, e.g., the loss can be defined as the mean squared pixel-wise difference between input and output, summed over all pixels. Once the AE is trained on background events with the objective of minimizing the reconstruction loss, it is expected to reconstruct anomalous signatures poorly, yielding large loss values. A threshold on the reconstruction loss can therefore be imposed to tag anomalies.
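As a minimal sketch of this tagging step (the quantile-based threshold choice below is our own illustration, not necessarily the procedure used online):

```python
import numpy as np

def tag_anomalies(event_losses, qcd_reference_losses, qcd_mistag_rate=0.01):
    """Flag events whose reconstruction loss exceeds a threshold chosen so that
    only a small fraction of QCD reference events would be (mis)tagged."""
    threshold = np.quantile(qcd_reference_losses, 1.0 - qcd_mistag_rate)
    return event_losses > threshold
```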

3.1 Architecture and training

The ConvAE architecture is detailed in Table 1. The encoder consists of five two-dimensional Convolutional NN (CNN) [15] layers with a kernel_size of (3, 3) and strides of (3, 3), (2, 2), or (2, 3). Up to the bottleneck, the input shape of (288, 360, 3) is compressed to (6, 5, 8). The decoder, an exact mirror image of the encoder, is composed of five two-dimensional Transposed CNN [16] layers with the same kernel_size and the encoder strides in reverse order. The padding is set to 'same'. Each (Transposed) CNN layer is followed by a BatchNormalization layer, which makes the training faster and more stable by bringing the mean of the layer output close to zero and its standard deviation close to unity. The Parametric Rectified Linear Unit (PReLU) and Rectified Linear Unit (ReLU) [17] activation functions are used for the hidden layers and the output layer, respectively. This architecture results in a total of 3,574,065 trainable parameters. The model was implemented in Keras/TensorFlow [18, 19] and trained for 100 epochs (with early stopping enabled) with a batch_size of 128. The learning rate was set to \(10^{-2}\), and the optimizer found to yield the best performance was Adam [20]. The training was performed on an NVIDIA® Tesla® V100 PCIe 32 GB Graphics Processing Unit [21] and took \(\sim \)25 min per epoch.

Table 1 The ConvAE architecture
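A minimal Keras sketch of an architecture consistent with this description is given below. The filter counts are hypothetical (chosen only so that the bottleneck shape is (6, 5, 8)), and the stride ordering is one possibility compatible with the quoted stride values, so the parameter count will not match Table 1 exactly.

```python
from tensorflow.keras import layers, models

def build_convae(input_shape=(288, 360, 3)):
    strides = [(3, 3), (2, 3), (2, 2), (2, 2), (2, 2)]  # compresses 288x360 -> 6x5
    filters = [16, 16, 32, 32, 8]                       # assumed; last value gives (6, 5, 8)

    inputs = layers.Input(shape=input_shape)
    x = inputs
    # Encoder: Conv2D + BatchNormalization + PReLU blocks
    for f, s in zip(filters, strides):
        x = layers.Conv2D(f, kernel_size=(3, 3), strides=s, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.PReLU()(x)

    # Decoder: five Conv2DTranspose layers mirroring the encoder strides
    dec_filters = filters[-2::-1] + [3]   # final layer restores the 3 input channels
    for i, (f, s) in enumerate(zip(dec_filters, strides[::-1])):
        x = layers.Conv2DTranspose(f, kernel_size=(3, 3), strides=s, padding="same")(x)
        if i < len(dec_filters) - 1:
            x = layers.BatchNormalization()(x)
            x = layers.PReLU()(x)
        else:
            x = layers.ReLU()(x)          # ReLU activation on the output layer
    return models.Model(inputs, x, name="convae")
```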

The biggest challenge in accomplishing this task was the data sparsity mentioned in Sect. 2.2: the model had a strong tendency to learn the zeros rather than the nonzero values. Standard loss functions, such as the Mean Squared Error [22] and Cross-Entropy [23], did not work well. We instead use the inverse of the so-called Dice Loss [24] as the loss function, given in Eq. (1), where \(x_{\text {i}}\) and \(x_{\text {i}}'\) denote the \(\text {i}^{\text {th}}\) pixel values of the input and output, respectively, and N is the total number of pixels. The denominator contains the pixel-wise product of input and output, which forces the output to have the same nonzero pixels as the input in order to minimize the reconstruction loss; the numerator and the subtraction of 1 ensure that the loss reaches its minimum of zero when the output equals the input.

$$\begin{aligned} L(x, x') = \frac{\sum _{\text {i}}^{\text {N}} x_{\text {i}}^{2} + \sum _{\text {i}}^{\text {N}} \left( x_{\text {i}}'\right) ^{2}}{2\sum _{\text {i}}^{\text {N}} x_{\text {i}}\, x_{\text {i}}'} - 1 \end{aligned}$$
(1)
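A direct TensorFlow implementation of Eq. (1) could look like the following sketch; the small constant added to the denominator to avoid division by zero for all-zero predictions is our own addition.

```python
import tensorflow as tf

def inverse_dice_loss(x_true, x_pred, eps=1e-8):
    """Inverse Dice loss of Eq. (1), computed per image."""
    axes = [1, 2, 3]  # sum over the eta, phi, and channel dimensions
    num = (tf.reduce_sum(tf.square(x_true), axis=axes)
           + tf.reduce_sum(tf.square(x_pred), axis=axes))
    den = 2.0 * tf.reduce_sum(x_true * x_pred, axis=axes) + eps
    return num / den - 1.0
```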

We used two independent sets of 50k QCD events for model training and validation. No overfitting was observed during training; the loss minimization was smooth and saturated after 57 epochs.
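For concreteness, the training configuration described above could be expressed as in the sketch below, reusing build_convae and inverse_dice_loss from the earlier snippets; x_train and x_val stand for the 50k-event QCD image arrays (not provided here), and the early-stopping patience is an assumption.

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

model = build_convae()
model.compile(optimizer=Adam(learning_rate=1e-2), loss=inverse_dice_loss)
history = model.fit(
    x_train, x_train,                      # autoencoder: the input is also the target
    validation_data=(x_val, x_val),
    epochs=100,
    batch_size=128,
    callbacks=[EarlyStopping(monitor="val_loss", patience=10,
                             restore_best_weights=True)],  # patience assumed
)
```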

4 Results

4.1 Autoencoder reconstruction

The trained ConvAE is tested using 4k events each of QCD and SUEP, all satisfying the event preselection criteria detailed in Sect. 2.2. Figure 3 compares the true \(E_{\textrm{T}}\) (left) and reconstructed \(E_{\textrm{T}}\) (right), summed over all pixels, between QCD and SUEP(1000 \(\,\text {GeV}\)). The model reconstructs the \(E_{\textrm{T}}\) of QCD events quite well but struggles with the SUEP(1000 \(\,\text {GeV}\)) \(E_{\textrm{T}}\).

Fig. 3 Comparisons of true \(E_{\textrm{T}}\) (left) and reconstructed \(E_{\textrm{T}}\) (right), summed over all pixels, between QCD and SUEP(1000 \(\,\text {GeV}\))

The left plot of Fig. 4 shows the combined true \(E_{\textrm{T}}\) in the inner tracker, ECAL, and HCAL for a QCD event (the same event as in Fig. 1, now with the three sub-detectors combined); the right plot shows the reconstructed \(E_{\textrm{T}}\) for the same event. The model reconstructs the nonzero input pixels quite well. Similar plots for a SUEP(1000 \(\,\text {GeV}\)) event are given in Fig. 5. In this case, the model reconstructs only the largest \(E_{\textrm{T}}\) clusters (in the top left quadrant) and fails to reconstruct the full spherically symmetric signature of a large number of particles in \(\phi \).

Fig. 4 Combined true \(E_{\textrm{T}}\) (left) and reconstructed \(E_{\textrm{T}}\) (right) for a QCD event in the inner tracker, ECAL, and HCAL

Fig. 5 Combined true \(E_{\textrm{T}}\) (left) and reconstructed \(E_{\textrm{T}}\) (right) for a SUEP(1000 \(\,\text {GeV}\)) event in the inner tracker, ECAL, and HCAL

4.2 Autoencoder performance

The ConvAE reconstruction loss distributions for the QCD background and the SUEP signal in the considered mediator mass scenarios are compared in Fig. 6, and the corresponding Receiver Operating Characteristic (ROC) curves are given in Fig. 7. The reconstruction loss provides decent discrimination power: the areas under the ROC curves (AUCs) are \(\sim \)70% and depend only weakly on the mediator mass. For a target signal-selection efficiency of 40%, the QCD event mistagging rate ranges from 10.3% for SUEP(1000 \(\,\text {GeV}\)) to 16% for SUEP(125 \(\,\text {GeV}\)).
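The ROC-based figures of merit quoted here can be computed as in the following sketch, an illustration using scikit-learn in which the function and variable names are our own.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def roc_summary(scores_qcd, scores_suep, target_eff=0.40):
    """AUC and QCD mistagging rate at a fixed signal-selection efficiency."""
    labels = np.concatenate([np.zeros(len(scores_qcd)), np.ones(len(scores_suep))])
    scores = np.concatenate([scores_qcd, scores_suep])
    fpr, tpr, _ = roc_curve(labels, scores)   # fpr corresponds to the QCD mistagging rate
    mistag_at_target = np.interp(target_eff, tpr, fpr)
    return auc(fpr, tpr), mistag_at_target
```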

Fig. 6 Comparisons of ConvAE reconstruction loss for QCD and SUEP(125 \(\,\text {GeV}\)) (top left), SUEP(400 \(\,\text {GeV}\)) (top right), SUEP(700 \(\,\text {GeV}\)) (bottom left), and SUEP(1000 \(\,\text {GeV}\)) (bottom right)

Fig. 7 ROC curves for ConvAE reconstruction loss as an AD proxy. The right plot is the same as the left plot but with a logarithmic y-scale

We investigated other ConvAE reconstruction parameters to see whether they could further enhance the performance and found that the inverse of the reconstructed \(E_{\textrm{T}}\) provides much better discrimination power. The distributions of the inverse of the reconstructed \(E_{\textrm{T}}\) for the QCD background and the SUEP signal in the considered mediator mass scenarios are compared in Fig. 8, and the corresponding ROC curves are given in Fig. 9. The AUCs range from 75.6% for SUEP(125 \(\,\text {GeV}\)) to 86.9% for SUEP(1000 \(\,\text {GeV}\)). For a target signal-selection efficiency of 40%, the QCD mistagging rate ranges from 2% for SUEP(1000 \(\,\text {GeV}\)) to 11.9% for SUEP(125 \(\,\text {GeV}\)).
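This alternative AD proxy, the inverse of the summed reconstructed \(E_{\textrm{T}}\), can be obtained per event as in the sketch below (our own illustration; the small constant guards against all-zero reconstructions).

```python
import numpy as np

def inverse_reco_et(model, images, eps=1e-8):
    """1 / (summed reconstructed E_T) per event; larger values are more SUEP-like."""
    reco = model.predict(images)                       # shape (N, 288, 360, 3)
    et_sum = reco.reshape(len(reco), -1).sum(axis=1)   # sum over all pixels and channels
    return 1.0 / (et_sum + eps)
```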

Fig. 8 Comparisons of the inverse of reconstructed \(E_{\textrm{T}}\) for QCD and SUEP(125 \(\,\text {GeV}\)) (top left), SUEP(400 \(\,\text {GeV}\)) (top right), SUEP(700 \(\,\text {GeV}\)) (bottom left), and SUEP(1000 \(\,\text {GeV}\)) (bottom right)

Fig. 9 ROC curves for the inverse of reconstructed \(E_{\textrm{T}}\) as an AD proxy. The right plot is the same as the left plot but with a logarithmic y-scale

The model inference time has been quantified on a CPU, an Intel® Core\(^{\texttt {TM}}\) i5-9600KF processor [25], and found to be \(\sim \)20 ms, well within the processing-time budget of the HLT system (\(\mathcal {O}(10^2)~\textrm{ms}\)).
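A rough way to reproduce such a per-event latency measurement is sketched below; this is our own illustration, reusing build_convae from the architecture sketch, and the actual numbers depend on the hardware and the TensorFlow configuration.

```python
import time
import numpy as np

model = build_convae()
dummy = np.zeros((1, 288, 360, 3), dtype=np.float32)   # a single empty event image
_ = model.predict(dummy)                                # warm-up call (graph tracing)
start = time.perf_counter()
_ = model.predict(dummy)
print(f"single-event inference time: {(time.perf_counter() - start) * 1e3:.1f} ms")
```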

5 Conclusions

A deep convolutional neural autoencoder network, trained on background events using their signatures in different sub-detectors of a Large Hadron Collider experiment as n-channel image data, is highly effective as an Anomaly Detection (AD) technique. Training the autoencoder on unlabeled background events makes the technique highly model-agnostic. We have developed such an autoencoder for AD with the Compact Muon Solenoid experiment, which can easily be adapted to other Large Hadron Collider experiments. We exploit the raw reconstructed signature in the High-Level Trigger system, with the clear benefit of not relying on online particle reconstruction, which could fail to reconstruct new physics signatures.

The autoencoder has been trained on QCD events, taking the transverse energy deposits in the inner tracker, electromagnetic calorimeter, and hadron calorimeter sub-detectors as 3-channel image data. The biggest challenge was the sparsity of the data: only \(\sim \)0.5% of the \(\sim \)300k pixels per image have nonzero values, and the model had a strong tendency to learn the zeros rather than the nonzero values. We explored several loss functions, and ultimately a nonstandard one served the purpose: the inverse of the so-called Dice Loss, which contains the pixel-wise product of input and output and thereby forces the output images to have the same nonzero pixels as the input.

The trained model has been tested on the detection of Soft Unclustered Energy Patterns (SUEPs), a new physics signal with an experimental signature of a high multiplicity of spherically symmetric Standard Model particles, anomalous with respect to QCD jets. To check the robustness of the autoencoder, we considered several signal scenarios depending on the parameters of the new physics model. The model can detect 40% of the SUEP events at the expense of a QCD event mistagging rate as low as 2%. The model inference time on an Intel® Core\(^{\texttt {TM}}\) i5-9600KF processor is measured to be \(\sim \)20 ms, which comfortably satisfies the latency requirements of the HLT system.