Search by triplet: An efficient local track reconstruction algorithm for parallel architectures

https://doi.org/10.1016/j.jocs.2021.101422

Highlights

  • Local track reconstruction algorithms can be efficiently designed for parallel architectures.

  • Search by triplet is an efficient local track reconstruction algorithm optimized for CPU and GPU parallel architectures.

  • Search by triplet performs track reconstruction of the LHCb VELO detector at a rate of up to 592 kHz on a single GPU.

  • Search by triplet achieves an average physics reconstruction efficiency of 98.52% for the LHCb VELO detector.

  • Search by triplet is one of the main track reconstruction algorithms of the first software trigger stage of LHCb.

Abstract

Millions of particles are collided every second at the LHCb detector, placed inside the Large Hadron Collider at CERN. The particles produced as a result of these collisions pass through various detecting devices, which will produce a combined raw data rate of up to 40 Tbps by 2021. These data will be fed through a data acquisition system which reconstructs individual particles and filters the collision events in real time. This process will occur in a heterogeneous farm employing exclusively off-the-shelf CPU and GPU hardware, in a two-stage process known as the High Level Trigger.

The reconstruction of charged particle trajectories in physics detectors, also referred to as track reconstruction or tracking, determines the position, charge and momentum of particles as they pass through detectors. The Vertex Locator subdetector (VELO) is the closest such detector to the beamline, placed outside of the region where the LHCb magnet produces a sizable magnetic field. It is used to reconstruct straight particle trajectories, which serve as seeds for track reconstruction in other subdetectors and to locate collision vertices. The VELO subdetector will detect up to 10⁹ particles every second, which need to be reconstructed in real time in the High Level Trigger.

We present Search by triplet, an efficient track reconstruction algorithm. Our algorithm is designed to run efficiently across parallel architectures. We extend previous work and explain the evolution of the algorithm since its inception. We show the scaling of our algorithm under various conditions, analyse the amortized time complexity of each of its constituent parts, and profile its performance. Our algorithm is the current state-of-the-art in VELO track reconstruction on SIMT architectures, and we quantify its improvements over previous results.

Introduction

The LHCb detector is a large physics detector situated at the Large Hadron Collider at CERN [1]. The detector is being upgraded for the restart of data taking scheduled for 2021 [2]. The full collision data rate of 40 Tbps will be piped through a data acquisition system that will perform data filtering in real time, prior to committing data to long-term storage for posterior analysis. The filtering will occur in two stages: the first stage, or High Level Trigger 1 (HLT1), will reduce the data rate according to particle kinematics by a factor of 40× in a computing farm composed of 170 servers equipped with GPUs [3]. The second filter stage, or High Level Trigger 2 (HLT2), will perform a full event reconstruction and reduce data by an additional factor of 20× in a computing farm composed of thousands of servers [4]; given the quoted factors, this corresponds to roughly 1 Tbps after HLT1 and 50 Gbps after HLT2. The introduction of a heterogeneous computing infrastructure in LHCb is motivating the creation of parallel algorithms that are portable and efficient across architectures.

Track reconstruction or tracking is a pattern recognition problem that consists of finding particle trajectories from measurements (hits) left in detectors along their path. The problem is equivalent to finding a partition into disjoint sets of measurements that are compatible with the laws of motion of particles traversing a detector, accounting for the fact that some measurements may be noise, and for the presence of sizable magnetic fields, which curve the trajectories of charged particles depending on their momentum. Track reconstruction yields the momentum and trajectory of reconstructed particles, which play an essential role in the trigger systems of physics experiments. Fig. 1 exemplifies the track reconstruction problem.
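To make this formulation concrete, the following minimal C++ sketch shows one way measurements and track candidates could be represented. The Hit and Track types and their fields are illustrative assumptions, not the data layout used by LHCb.

```cpp
#include <cstdint>
#include <vector>

// A measurement (hit) left by a particle on a detection plane.
// Fields are illustrative; a real detector also stores cluster information.
struct Hit {
    float x, y, z;     // position of the measurement
    uint32_t plane;    // index of the detection plane that recorded it
};

// A track candidate: a subset of hit indices compatible with a physical
// trajectory. Tracking partitions the hits into such disjoint subsets,
// leaving noise hits unassigned.
struct Track {
    std::vector<uint32_t> hit_indices;
};
```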

The Vertex Locator (VELO) is a tracking detector of LHCb consisting of 52 planes of silicon pixel chips surrounding the LHC interaction point and beamline, shown in Fig. 2. As particles pass through the detection planes, they leave detectable measurements in the form of pixel clusters. The VELO is the first subdetector of LHCb to be reconstructed, and tracks found in the VELO are used to locate the originating collision vertices and serve as seeds for subsequent track reconstruction. Reconstructing the VELO is therefore of vital importance to the correct functioning of LHCb.

Track reconstruction validation is typically performed with Monte Carlo (MC) simulated samples, where reconstructed tracks should match MC particles, which establish the ground truth. The matching of tracks with particles is done on a hit-by-hit basis. The physics performance of found tracks can be evaluated according to five indicators [5] (a sketch computing them follows this list):

  • The track reconstruction efficiency is the ratio of reconstructed, reconstructible particles to all reconstructible particles:

\[
\frac{N_{\text{reconstructed and reconstructible}}}{N_{\text{reconstructible}}}
\]
  • A fake track (ghost track) is a reconstructed track whose hits do not come from a single real particle. In LHCb, at least 70% of the hits in a track must belong to the same MC particle for the track to be associated to that particle in the validation process. The fake track fraction is the ratio of fake tracks to all reconstructed tracks:

\[
\frac{N_{\text{fake tracks}}}{N_{\text{reconstructed tracks}}}
\]
  • The clone track fraction refers to the fraction of tracks associated to the same MC particle as another reconstructed track:

\[
\frac{N_{\text{clone tracks}}}{N_{\text{reconstructed tracks}}}
\]
  • The hit purity in a track refers to the fraction of track hits that belong to the same MC particle:

\[
\frac{N_{\text{track hits in MC particle hits}}}{N_{\text{track hits}}}
\]
  • Finally, the hit efficiency is the fraction of the MC particle's hits that are correctly found on the track:

\[
\frac{N_{\text{track hits in MC particle hits}}}{N_{\text{MC particle hits}}}
\]
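The following C++ sketch illustrates how these five indicators could be computed from a hit-by-hit MC association. All types, fields and names are hypothetical, and only the 70% association rule is taken from the text; this is a minimal sketch, not LHCb's validation code.

```cpp
#include <cstdio>
#include <unordered_map>
#include <vector>

// Hypothetical summaries of a matched sample; not LHCb's validation types.
struct MCParticle { int n_hits; bool reconstructible; };
struct Track {
    int n_hits;    // hits on the reconstructed track
    int best_mc;   // MC particle owning most of its hits, -1 if none
    int n_shared;  // hits shared with that MC particle
};

// Computes the five indicators. Assumes non-empty inputs.
void report(const std::vector<Track>& tracks, const std::vector<MCParticle>& mc) {
    int fakes = 0, clones = 0;
    double purity_sum = 0.0, hit_eff_sum = 0.0;
    std::unordered_map<int, int> matched;  // MC particle -> associated tracks
    for (const auto& t : tracks) {
        // 70% rule from the text: associate the track if enough hits agree.
        if (t.best_mc < 0 || t.n_shared < 0.7 * t.n_hits) { ++fakes; continue; }
        if (++matched[t.best_mc] > 1) ++clones;  // extra tracks are clones
        purity_sum  += double(t.n_shared) / t.n_hits;             // hit purity
        hit_eff_sum += double(t.n_shared) / mc[t.best_mc].n_hits; // hit efficiency
    }
    int reconstructible = 0, reconstructed = 0;
    for (std::size_t i = 0; i < mc.size(); ++i) {
        if (!mc[i].reconstructible) continue;
        ++reconstructible;
        if (matched.count(int(i))) ++reconstructed;
    }
    const double n_assoc = double(tracks.size() - fakes);
    std::printf("efficiency     %.4f\n", reconstructed / double(reconstructible));
    std::printf("fake fraction  %.4f\n", fakes / double(tracks.size()));
    std::printf("clone fraction %.4f\n", clones / double(tracks.size()));
    std::printf("hit purity     %.4f\n", purity_sum / n_assoc);
    std::printf("hit efficiency %.4f\n", hit_eff_sum / n_assoc);
}
```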

The VELO detector lies outside the range of effect of the LHCb magnet, and therefore trajectories within it can be considered straight lines. Once the detector restarts operation, a sustained rate of 10⁹ particle trajectories per second will have to be reconstructed at the VELO in the trigger system, while delivering good track reconstruction performance indicators. VELO reconstruction is therefore a real-time software challenge, whereby the design performance of the system must be met within the hardware constraints of the data acquisition system.
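Since VELO trajectories are straight, a track can be modeled as two independent lines, x(z) and y(z). The sketch below is a standard unweighted least-squares line fit given that model; it is shown only as an illustration, is not necessarily the fit used in the paper, and all names are assumed.

```cpp
#include <vector>

struct Hit { float x, y, z; };  // illustrative hit type

// Straight-line track state: position at z = 0 and slopes dx/dz, dy/dz.
struct LineState { float x0, y0, tx, ty; };

// Unweighted least-squares fit of x(z) and y(z) through the hits.
// Assumes at least two hits at distinct z positions.
LineState fit_line(const std::vector<Hit>& hits) {
    double sz = 0, szz = 0, sx = 0, sxz = 0, sy = 0, syz = 0;
    const double n = double(hits.size());
    for (const auto& h : hits) {
        sz += h.z; szz += h.z * h.z;
        sx += h.x; sxz += h.x * h.z;
        sy += h.y; syz += h.y * h.z;
    }
    const double det = n * szz - sz * sz;
    LineState s;
    s.tx = float((n * sxz - sx * sz) / det);
    s.ty = float((n * syz - sy * sz) / det);
    s.x0 = float((sx - s.tx * sz) / n);
    s.y0 = float((sy - s.ty * sz) / n);
    return s;
}
```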

Tracking techniques

Due to the interest in tracking by many particle physics experiments, there is a rich literature on track reconstruction techniques [6]. Local tracking methods find tracks iteratively, whereas global methods adopt an equivalent formulation of the problem, typically including all measurements, where solutions map to tracks.

The most common local tracking method consists of finding a track seed and extending it to other detector planes, in a process known as track following. The track seed is
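As a minimal illustration of track following, the sketch below extrapolates a seed's straight-line state to each remaining plane and adopts the closest hit within a tolerance window. The window size, the closest-hit policy and all names are assumptions for illustration, not the specific procedure of any cited package.

```cpp
#include <cmath>
#include <optional>
#include <vector>

struct Hit { float x, y, z; };               // illustrative types, as before
struct LineState { float x0, y0, tx, ty; };  // straight-line track state

// Extend a seed through the remaining planes: extrapolate the line to each
// hit's z and accept the closest hit inside the tolerance window.
std::vector<Hit> follow(const LineState& seed,
                        const std::vector<std::vector<Hit>>& planes,
                        float window = 0.5f) {
    std::vector<Hit> track;
    for (const auto& plane : planes) {
        std::optional<Hit> best;
        float best_d = window;
        for (const auto& h : plane) {
            // Predicted track position at this hit's z.
            const float px = seed.x0 + seed.tx * h.z;
            const float py = seed.y0 + seed.ty * h.z;
            const float d = std::hypot(h.x - px, h.y - py);
            if (d < best_d) { best_d = d; best = h; }
        }
        // Extend the track; a real tracker would also refit the state here.
        if (best) track.push_back(*best);
    }
    return track;
}
```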

Search by triplet

Search by triplet is a local track following algorithm, optimized to reconstruct the LHCb VELO detector, that exploits the task parallelism inherent to the LHCb data taking regime and the data parallelism of track reconstruction. LHCb detects 30 million events per second, each independent of the others. Therefore, we assign a different task to the processing of each individual event. Within each event, track reconstruction exposes various levels of parallelism that we tackle in a
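As the algorithm's name suggests, its seeding stage searches for triplets of hits on consecutive planes that are compatible with a straight line. The sketch below shows a deliberately simple, brute-force version of such a triplet search; the tolerance, the loop structure and all names are illustrative assumptions, and a production implementation would prune candidates and run these searches in parallel.

```cpp
#include <cmath>
#include <vector>

struct Hit { float x, y, z; };       // illustrative hit type
struct Triplet { int i0, i1, i2; };  // hit indices on three consecutive planes

// Keep the hit triples whose middle hit lies close to the straight line
// through the outer two. Written O(n^3) for clarity only.
std::vector<Triplet> find_triplets(const std::vector<Hit>& p0,
                                   const std::vector<Hit>& p1,
                                   const std::vector<Hit>& p2,
                                   float tol = 0.1f) {
    std::vector<Triplet> out;
    for (int i = 0; i < int(p0.size()); ++i) {
        for (int k = 0; k < int(p2.size()); ++k) {
            const Hit& a = p0[i];
            const Hit& c = p2[k];
            const float dzinv = 1.0f / (c.z - a.z);  // planes at distinct z
            const float tx = (c.x - a.x) * dzinv;
            const float ty = (c.y - a.y) * dzinv;
            for (int j = 0; j < int(p1.size()); ++j) {
                const Hit& b = p1[j];
                // Interpolate the outer line at the middle hit's z.
                const float px = a.x + tx * (b.z - a.z);
                const float py = a.y + ty * (b.z - a.z);
                if (std::fabs(b.x - px) < tol && std::fabs(b.y - py) < tol)
                    out.push_back({i, j, k});
            }
        }
    }
    return out;
}
```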

Results

The algorithms composing Search by triplet do not run as a standalone application; rather, they are embedded as part of the VELO reconstruction sequence. An in-depth discussion of the other algorithms involved in the sequence is beyond the scope of this paper. Nevertheless, to provide an overall perspective of our work, we present the VELO reconstruction sequence, which is composed of the following algorithms:

  • Global event cut – Rejects the 10% most densely populated events.
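As a hedged illustration of this first step, the sketch below rejects the most densely populated events by hit multiplicity. The per-batch percentile computation and all names are assumptions; in practice such a threshold could equally be a fixed, calibrated cut.

```cpp
#include <algorithm>
#include <vector>

// Illustrative: an event summarized by its total hit multiplicity.
struct EventSummary { int id; int n_hits; };

// Global event cut: keep the 90% least densely populated events.
std::vector<EventSummary> global_event_cut(std::vector<EventSummary> events) {
    const std::size_t keep = events.size() * 9 / 10;
    // Partition so the 'keep' events with lowest multiplicity come first.
    std::nth_element(events.begin(), events.begin() + keep, events.end(),
                     [](const EventSummary& a, const EventSummary& b) {
                         return a.n_hits < b.n_hits;
                     });
    events.resize(keep);  // drop the 10% with the highest multiplicity
    return events;
}
```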

Conclusions

We have presented Search by triplet, a fast algorithm for VELO reconstruction on parallel architectures. Our algorithm exploits various degrees of parallelism in LHCb VELO track reconstruction, and makes efficient use of resources in heterogeneous architectures. The algorithm is written in a single C++ codebase, and we have developed architecture-specific optimizations for hot sections of the code.

Our algorithm employs a local tracking technique to detect particle trajectories. The

Authors’ contribution

Daniel Hugo Cámpora Pérez: conceptualization, methodology, software, data curation, draft preparation, writing, visualization, investigation, validation. Niko Neufeld and Agustín Riscos Núñez: supervision, writing – reviewing.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgements

The authors would like to acknowledge the support of the LHCb collaboration throughout the development of the Search by triplet algorithm. We thank the LHCb Online team for the hardware support during our tests. We would also like to thank the LHCb computing, RTA and simulation teams for their support and for producing the simulated LHCb samples used to develop and benchmark our algorithm. We thank R. Schwemmer for fruitful discussions about the performance of our algorithm. We thank R. Aaij

References (28)

  • R. Fruhwirth et al., Data Analysis Techniques for High-Energy Physics (2000).

  • O. Callot, FastVelo, A Fast and Efficient Pattern Recognition Package for the Velo (2011).

  • D. Funke et al., Parallel track reconstruction in CMS using the cellular automaton approach, J. Phys.: Conf. Ser. (2014).

  • A. Fröhlich et al., MARC – Track Finding in the Split Field Magnet Facility (1976).
Daniel Campora obtained his degree in Computer Engineering from the University of Sevilla in 2010. He then worked at CERN for 10 years on diverse topics such as data acquisition systems and network administration, and specialized as a developer of efficient software for parallel architectures. During his Ph.D. he worked on parallelizing physics software on GPUs for LHCb and earned an LHCb Early Career Scientist Award for this work. He is now Assistant Professor at the University of Maastricht and closely collaborates with LHCb. His research includes high-throughput programming, GPU accelerators, quantum computing and machine learning.

Niko Neufeld studied engineering physics and computer science at TU Wien in Austria. After a PhD in experimental particle physics, he switched to computing for his first post-doc position at CERN, where he co-developed the first high-throughput data acquisition system based on Ethernet for the LHCb experiment. Later he worked for the University of Lausanne (UNIL) and the EPF Lausanne on electronics and embedded SoCs. He became a staff scientist at CERN in 2005 and has been working in many areas of high-throughput computing and networking since. He is now leading the project for the next-generation LHCb data acquisition, which will be the biggest system of its kind.

Agustín Riscos Núñez is an associate professor at the Department of Computer Science and Artificial Intelligence, guarantor researcher at the Smart Computer Systems Research and Engineering Lab (SCORE), head of the Research Group on Natural Computing, and founding member and Secretary of the Research Institute of Computer Engineering (I3US) at Universidad de Sevilla, Spain. He is a founding member of the International Membrane Computing Society (IMCS) and an IEEE Member. His main areas of expertise are bio-inspired computing and artificial intelligence. His research interests focus on computational modeling of complex systems and population dynamics, as well as other practical applications in the fields of bioinformatics, biomedicine, high-performance computing and robotics.

The code (and data) in this article has been certified as Reproducible by Code Ocean (https://codeocean.com/).
