
Application of Machine Learning Methods in Baikal-GVD: Background Noise Rejection and Selection of Neutrino-Induced Events

  • MACHINE LEARNING IN FUNDAMENTAL PHYSICS
  • Moscow University Physics Bulletin

Abstract

Baikal-GVD is a large (\(\sim\)1 km\({}^{3}\)) underwater neutrino telescope located in Lake Baikal, Russia. In this report, we present two machine learning techniques developed for its data analysis. First, we introduce a neural network for an efficient rejection of noise hits, emerging due to natural water luminescence. Second, we develop a neural network for distinguishing muon- and neutrino-induced events. By choosing an appropriate classification threshold, we preserve \(90\%\) of neutrino-induced events, while muon-induced events are suppressed by a factor of \(10^{-6}\). Both of the developed neural networks employ the causal structure of events and surpass the precision of standard algorithmic approaches.
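
As an illustration of how such a working point can be chosen, the minimal sketch below uses synthetic classifier scores (the distributions and numbers are hypothetical, not those of the Baikal-GVD analysis) to pick the threshold that retains a target fraction of signal events and then measures the fraction of background passing it:

```python
import numpy as np

# Synthetic classifier scores (hypothetical stand-ins for the network output):
# higher score = more neutrino-like.
rng = np.random.default_rng(0)
signal_scores = rng.beta(8, 2, size=100_000)      # neutrino-induced events
background_scores = rng.beta(2, 8, size=100_000)  # EAS (muon)-induced events

target_efficiency = 0.90  # keep 90% of neutrino-induced events

# Threshold at the (1 - 0.90) quantile of signal scores, so 90% of signal passes.
threshold = np.quantile(signal_scores, 1.0 - target_efficiency)

efficiency = np.mean(signal_scores >= threshold)           # fraction of signal kept
background_rate = np.mean(background_scores >= threshold)  # background passing

print(f"threshold = {threshold:.3f}")
print(f"signal efficiency   = {efficiency:.3f}")
print(f"background fraction = {background_rate:.2e}")
```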



Funding

This work was supported by the Russian Science Foundation, grant no. 22-22-20063.

Author information

Corresponding authors

Correspondence to A. V. Matseiko or I. V. Kharuk.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s Note.

Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

APPENDIX

1.1 DERIVATION OF Eq. (3)

In this section, we derive the distributions, expected values, and dispersions of the random variables entering Eq. (2). Using them, we then evaluate the error of the estimate of the number of neutrino events obtained from Eq. (2). Throughout, \(P\) denotes the probability of some outcome for a random variable, and \(M\) and \(D\) denote its expected value and dispersion, respectively.

We start by discussing the properties of the random variables describing the experimental dataset. Let the latter contain \(n^{0}\) events in total, \(n_{\nu}^{0}\) of which are neutrino-induced and \(n_{\mu}^{0}\equiv n^{0}-n_{\nu}^{0}\) are EAS-induced. Here \(n_{\nu}^{0}\) and \(n_{\mu}^{0}\) are random variables distributed according to the Poisson law with parameters \(\nu\) and \(\mu\), respectively. Hence,

$$P(n_{\nu}^{0}=k)=\frac{\nu^{k}e^{-\nu}}{k!},$$
(A1)
$$P(n_{\mu}^{0}=k)=\frac{\mu^{k}e^{-\mu}}{k!}.$$
(A2)

Since \(n^{0}\) is the sum of the independent variables \(n^{0}_{\nu}\) and \(n^{0}_{\mu}\), it also follows a Poisson distribution,

$$P(n^{0}=k)=\frac{(\nu+\mu)^{k}e^{-(\nu+\mu)}}{k!}.$$
(A3)

Its expected value and dispersion are:

$$M(n^{0})=D(n^{0})=\nu+\mu.$$
(A4)
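
The following minimal sketch (with arbitrarily chosen rates \(\nu\) and \(\mu\), not values from the analysis) numerically cross-checks Eqs. (A3) and (A4) by sampling the two Poisson counts and comparing the mean and variance of their sum with \(\nu+\mu\):

```python
import numpy as np

rng = np.random.default_rng(1)
nu, mu = 50.0, 2000.0          # arbitrary Poisson rates, for illustration only
trials = 200_000

n_nu0 = rng.poisson(nu, size=trials)   # neutrino-induced counts
n_mu0 = rng.poisson(mu, size=trials)   # EAS-induced counts
n0 = n_nu0 + n_mu0                     # total counts

# Both the mean and the variance of n0 should be close to nu + mu (Eq. (A4)).
print(n0.mean(), n0.var(), nu + mu)
```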

Let us now address the classification of events by the neural network. A trained neural network can be treated as a black box. As discussed in Subsection 4.3, for a fixed classification threshold \(\xi\), the network classifies a neutrino-induced event correctly with some probability \(E\), while an EAS-induced event is misclassified as neutrino-induced with probability \(S\). Hence, the numbers of true and false neutrino-induced events identified by the network are independent random variables with binomial distributions:

$$P(n_{\nu}=k|n_{\nu}^{0}=m)=Bin(m,E)(k),$$
(A5)
$$P(n_{\mu}=k|n_{\mu}^{0}=m)=Bin(m,S)(k).$$
(A6)

Here \(n_{\nu}(\xi)\equiv n_{\nu}\), \(n_{\mu}(\xi)\equiv n_{\mu}\), and \(Bin(m,p)(k)\) stands for the binomial distribution with \(m\) trials and success probability \(p\):

$$Bin(m,p)(k)=C_{m}^{k}p^{k}(1-p)^{m-k}.$$
(A7)

The number of neutrino-induced events identified by the neural network on a test dataset is subject to both of the random processes described above. Hence, the full probability distributions, \(P_{\text{f}}\), of \(n_{\nu}\) and \(n_{\mu}\) are obtained by combining the corresponding Poisson and binomial distributions via the law of total probability,

$$P_{\text{f}}(n_{\nu}=k)$$
$${}=\sum_{m=0}^{\infty}P(n_{\nu}=k|n_{\nu}^{0}=m)P(n_{\nu}^{0}=m),$$
(A8)
$$P_{\text{f}}(n_{\mu}=k)$$
$${}=\sum_{m=0}^{\infty}P(n_{\mu}=k|n_{\mu}^{0}=m)P(n_{\mu}^{0}=m).$$
(A9)

Using Eqs. (A1), (A2), (A8), and (A9), one can evaluate the expected values and dispersions of \(n_{\nu}\) and \(n_{\mu}\):

$$M(n_{\nu})=D(n_{\nu})=E\nu,$$
(A10)
$$M(n_{\mu})=D(n_{\mu})=S\mu.$$
(A11)

For the random variable \(n\), which is a sum of \(n_{\nu}\) and \(n_{\mu}\), one has:

$$M(n)=D(n)=E\nu+S\mu.$$
(A12)
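
Extending the same toy simulation, one can verify Eqs. (A10)-(A12): binomially thinning the Poisson counts with hypothetical selection probabilities \(E\) and \(S\) yields counts whose mean and variance both approach \(E\nu+S\mu\):

```python
import numpy as np

rng = np.random.default_rng(2)
nu, mu = 50.0, 2000.0     # arbitrary Poisson rates (illustration only)
E, S = 0.90, 1e-2         # hypothetical selection probabilities
trials = 200_000

n_nu0 = rng.poisson(nu, size=trials)
n_mu0 = rng.poisson(mu, size=trials)

# Each event passes the classifier independently: binomial thinning.
n_nu = rng.binomial(n_nu0, E)
n_mu = rng.binomial(n_mu0, S)
n = n_nu + n_mu

# Expected: mean ≈ variance ≈ E*nu + S*mu  (Eq. (A12)).
print(n.mean(), n.var(), E * nu + S * mu)
```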

Next, let us consider the distributions of \(\tilde{E}\) and \(\tilde{S}\), treated as random variables. Let the test Monte Carlo dataset contain \(\tilde{n}^{0}\) events, \(\tilde{n}_{\nu}^{0}\) of which are neutrino-induced and \(\tilde{n}_{\mu}^{0}\) are EAS-induced. Since the measured exposure and suppression, \(\tilde{E}\) and \(\tilde{S}\), are given by Eq. (1), they follow the probability distributions

$$P(\tilde{E}=\alpha)=P(\tilde{n}_{\nu}=\alpha\tilde{n}_{\nu}^{0})$$
$${}=Bin(\tilde{n}_{\nu}^{0},E)(\alpha\tilde{n}_{\nu}^{0}),$$
(A13)
$$P(\tilde{S}=\beta)=P(\tilde{n}_{\mu}=\beta\tilde{n}_{\mu}^{0})$$
$${}=Bin(\tilde{n}_{\mu}^{0},S)(\beta\tilde{n}_{\mu}^{0}),$$
(A14)

where \(\alpha\) and \(\beta\) are discrete variables such that \(\alpha\tilde{n}_{\nu}^{0}\) and \(\beta\tilde{n}_{\mu}^{0}\) are integers. Hence,

$$M(\tilde{E})=E;\quad D(\tilde{E})=\frac{E(1-E)}{\tilde{n}_{\nu}^{0}},$$
(A15)
$$M(\tilde{S})=S;\quad D(\tilde{S})=\frac{S(1-S)}{\tilde{n}_{\mu}^{0}},$$
(A16)

which proves that \(\tilde{E}\) and \(\tilde{S}\) are unbiased estimates of the true values of \(E\) and \(S\), respectively.
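
A similar sketch (with an illustrative Monte Carlo sample size, not that of the actual test dataset) checks Eq. (A15) by sampling \(\tilde{E}=\tilde{n}_{\nu}/\tilde{n}_{\nu}^{0}\) and comparing its mean and variance with \(E\) and \(E(1-E)/\tilde{n}_{\nu}^{0}\); Eq. (A16) can be checked in the same way:

```python
import numpy as np

rng = np.random.default_rng(3)
E = 0.90                   # hypothetical true efficiency
n_nu0_mc = 10_000          # illustrative number of MC neutrino-induced events
trials = 100_000

# tilde{E} = (passing MC neutrino events) / (all MC neutrino events).
E_tilde = rng.binomial(n_nu0_mc, E, size=trials) / n_nu0_mc

# Expected: mean ≈ E, variance ≈ E*(1-E)/n_nu0_mc  (Eq. (A15)).
print(E_tilde.mean(), E_tilde.var(), E * (1 - E) / n_nu0_mc)
```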

We are now ready to evaluate the dispersion of \(N_{\xi}\) estimated via Eq. (2). For this purpose, we use the standard formula for the dispersion of a function of random variables,

$$\sigma_{N_{\xi}}^{2}=\sum_{v}\left(\frac{\partial N_{\xi}}{\partial v}\right)^{2}\sigma_{v}^{2}$$
$${}+2\sum_{v\neq u}\left(\frac{\partial N_{\xi}}{\partial v}\right)\left(\frac{\partial N_{\xi}}{\partial u}\right)Cov_{v,u}.$$
(A17)

Here, \(v\) and \(u\) run over the arguments of the function \(N_{\xi}\), namely \(n(\xi)\equiv n\), \(n(0)\equiv n^{0}\), \(\tilde{E}(\xi)\), and \(\tilde{S}(\xi)\); \(\sigma_{v}^{2}\) stands for the variance of \(v\), and \(Cov_{v,u}\) denotes the covariance between \(v\) and \(u\).

Let us explicitly write out the estimates of the variances and covariances. According to Eq. (A4), \(\sigma_{n^{0}}^{2}\) can be estimated as

$$\sigma_{n^{0}}^{2}=n^{0}.$$
(A18)

Further, from Eq. (A12), one gets

$$\sigma_{n}^{2}=n.$$
(A19)

Next, \(\sigma_{\tilde{E}}^{2}\) and \(\sigma_{\tilde{S}}^{2}\) can be obtained from Eq. (A15) and Eq. (A16),

$$\sigma_{\tilde{E}}^{2}=\frac{\tilde{E}(1-\tilde{E})}{\tilde{n}_{\nu}^{0}},\quad\sigma_{\tilde{S}}^{2}=\frac{\tilde{S}(1-\tilde{S})}{\tilde{n}_{\mu}^{0}}.$$
(A20)

Finally, note that Eq. (2) contains only two mutually dependent random variables, \(n\) and \(n^{0}\). Their covariance can be calculated using Eqs. (A3), (A8), and (A9),

$$Cov_{n^{0},n}=E\nu+S\mu=M(n).$$
(A21)

Therefore this covariance can be estimated as \(n\).

By calculating the partial derivatives of \(N_{\xi}\) in Eq. (A17) and substituting the obtained expressions for variances and covariances, we arrive at the final result:

$$\sigma_{N_{\xi}}^{2}=\frac{(n-n^{0}\tilde{S})^{2}}{(\tilde{E}-\tilde{S})^{4}}\frac{\tilde{E}(1-\tilde{E})}{\tilde{n}_{\nu}^{0}}$$
$${}+\frac{(n-n^{0}\tilde{E})^{2}}{(\tilde{E}-\tilde{S})^{4}}\frac{\tilde{S}(1-\tilde{S})}{\tilde{n}_{\mu}^{0}}$$
$${}+\frac{n-2n{\tilde{S}}+n^{0}(\tilde{S})^{2}}{(\tilde{E}-\tilde{S})^{2}}.$$
(A22)
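
For completeness, a minimal sketch implementing Eq. (A22) is given below. Eq. (2) itself is not reproduced in this excerpt; the point estimate in the code assumes the form \(N_{\xi}=(n-n^{0}\tilde{S})/(\tilde{E}-\tilde{S})\), which is consistent with the partial derivatives entering Eq. (A22), and all input numbers are hypothetical:

```python
def neutrino_estimate(n, n0, E_t, S_t, n_nu0_mc, n_mu0_mc):
    """Point estimate and variance of the number of neutrino events.

    Assumes Eq. (2) has the form N_xi = (n - n0*S_t) / (E_t - S_t),
    consistent with the partial derivatives behind Eq. (A22).
      n        -- events classified as neutrino-induced in data
      n0       -- all events in data
      E_t, S_t -- efficiency and suppression measured on MC (tilde E, tilde S)
      n_nu0_mc, n_mu0_mc -- MC neutrino- and EAS-induced event counts
    """
    N_xi = (n - n0 * S_t) / (E_t - S_t)
    sigma2 = (
        (n - n0 * S_t) ** 2 / (E_t - S_t) ** 4 * E_t * (1 - E_t) / n_nu0_mc
        + (n - n0 * E_t) ** 2 / (E_t - S_t) ** 4 * S_t * (1 - S_t) / n_mu0_mc
        + (n - 2 * n * S_t + n0 * S_t ** 2) / (E_t - S_t) ** 2
    )
    return N_xi, sigma2

# Hypothetical numbers, for illustration only:
N_xi, sigma2 = neutrino_estimate(n=60, n0=5000, E_t=0.90, S_t=1e-3,
                                 n_nu0_mc=10_000, n_mu0_mc=1_000_000)
print(N_xi, sigma2 ** 0.5)
```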

About this article


Cite this article

Matseiko, A.V., Kharuk, I.V. Application of Machine Learning Methods in Baikal-GVD: Background Noise Rejection and Selection of Neutrino-Induced Events. Moscow Univ. Phys. Bull. 78 (Suppl 1), S71–S79 (2023). https://doi.org/10.3103/S0027134923070226

