Abstract
We introduce the infix inclusion problem for two languages S and T, which asks whether S is a subset of the set of all infixes of T. The problem is motivated by the need to identify malicious computation patterns according to their semantics; such patterns are often disguised by additional surrounding sub-patterns, so that the malicious pattern is embedded as an infix of the whole pattern. We examine the infix inclusion problem when the source S and the target T are finite, regular or context-free languages. We prove that the problem is 1) co-NP-complete when one of the languages is finite, 2) PSPACE-complete when both S and T are regular, 3) EXPTIME-complete when S is context-free and T is regular, 4) undecidable when S is either regular or context-free and T is context-free, and 5) undecidable when one of S and T belongs to a language class whose emptiness problem is undecidable, even if the other is finite. Furthermore, we explore the infix inclusion problem for visibly pushdown languages, a subclass of context-free languages.
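To make the definition concrete, the following is a minimal sketch of the infix inclusion check for the simplest case, where both S and T are given explicitly as finite sets of strings. The function name `infix_included` and the set-of-strings representation are assumptions made for illustration; they do not come from the paper, and this brute-force check says nothing about the automaton- or grammar-based inputs that the complexity results concern.

```python
def infix_included(source, target):
    """Return True if every word in `source` is an infix of some word in `target`.

    Both arguments are finite languages given as explicit sets of strings
    (an illustrative assumption, not the paper's input representation).
    A word s is an infix of t when t = x + s + y for some strings x and y,
    which is exactly Python's substring test `s in t`.
    """
    return all(any(s in t for t in target) for s in source)


# Hypothetical example: a short pattern hidden inside a longer one.
S = {"ab", "b"}
T = {"xaby"}
print(infix_included(S, T))        # True: "ab" and "b" both occur inside "xaby"
print(infix_included({"ba"}, T))   # False: "ba" does not occur inside "xaby"
```

Note that the empty language is infix-included in every target, while the empty word is an infix of T only if T contains at least one word.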
References
Chapman, C., Stolee, K.T.: Exploring regular expression usage and context in Python. In: Proceedings of the 25th international symposium on software testing and analysis, pp 282–293 (2016)
Davis, J.C., Michael IV, L.G., Coghlan, C.A., Servant, F., Lee, D.: Why aren’t regular expressions a lingua franca? an empirical study on the reuse and portability of regular expressions. In: Proceedings of the 27th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, pp 443–454 (2019)
McNaughton, R., Yamada, H.: Regular expressions and state graphs for automata. IRE Trans. Elect. Comput. EC-9(1), 39–47 (1960)
Thompson, K.: Programming techniques: Regular expression search algorithm. Commun. ACM 11(6), 419–422 (1968)
Spencer, H.: A regular-expression matcher. In: Software Solutions in C, Academic Press Professional, Inc., MA, USA, pp 35–71 (1994)
Berglund, M., Drewes, F., van der Merwe, B.: Analyzing catastrophic backtracking behavior in practical regular expression matching. In: Proceedings of the 14th international conference on automata and formal languages, pp 109–123 (2014)
Davis, J.C., Coghlan, C.A., Servant, F., Lee, D.: The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale. In: Proceedings of the 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, pp 246–256 (2018)
Wüstholz, V., Olivo, O., Heule, M.J.H., Dillig, I.: Static detection of DoS vulnerabilities in programs that use regular expressions. In: Proceedings of the 23rd international conference on tools and algorithms for the construction and analysis of systems, Part II, pp 3–20 (2017)
Meyer, A.R., Stockmeyer, L.J.: The equivalence problem for regular expressions with squaring requires exponential space. In: Proceedings of the 13th annual symposium on switching and automata theory, pp 125–129 (1972)
Câmpeanu, C., Moreira, N., Reis, R.: Distinguishability operations and closures. Fund. Informaticae 148(3–4), 243–266 (2016)
Gao, Y., Moreira, N., Reis, R., Yu, S.: A survey on operational state complexity. J. Automata, Language Combinator. 21(4), 251–310 (2017)
Pribavkina, E.V., Rodaro, E.: State complexity of prefix, suffix, bifix and infix operators on regular languages. In: Proceedings of the 14th international conference on developments in language theory, pp 376–386 (2010)
Pribavkina, E.V., Rodaro, E.: State complexity of code operators. Int. J. Found. Comput. Sci. 22(7), 1669–1681 (2011)
Alur, R., Madhusudan, P.: Visibly pushdown languages. In: Proceedings of the 36th annual ACM symposium on theory of computing, pp 202–211 (2004)
Geffert, V., Bednárová, Z., Szabari, A.: Input-driven pushdown automata for edit distance neighborhood. Theoretical Comput. Sci. 918, 105–122 (2022)
Sipser, M.: Introduction to the Theory of Computation, 3rd edn. Cengage Learning, MA, USA (2013)
Wood, D.: Theory of Computation. Harper & Row, NY, USA (1987)
Arora, S., Barak, B.: Computational Complexity - A Modern Approach. Cambridge University Press, UK (2009)
Chandra, A.K., Stockmeyer, L.J.: Alternation. In: Proceedings of the 17th annual symposium on foundations of computer science, pp 98–108 (1974)
Clemente, L.: On the complexity of the universality and inclusion problems for unambiguous context-free grammars. In: Proceedings of the 8th international workshop on verification and program transformation and 7th workshop on horn clauses for verification and synthesis, pp 29–43 (2020)
Bousquet, N., Löding, C.: Equivalence and inclusion problem for strongly unambiguous Büchi automata. In: Proceedings of the 4th international conference on language and automata theory and applications, pp 118–129 (2010)
Bruyère, V., Ducobu, M., Gauwin, O.: Visibly pushdown automata: Universality and inclusion via antichains. In: Proceedings of the 7th international conference on language and automata theory and applications, pp 190–201 (2013)
Champavère, J., Gilleron, R., Lemay, A., Niehren, J.: Efficient inclusion checking for deterministic tree automata and XML schemas. Inf. Comput. 207(11), 1181–1208 (2009)
Clemente, L., Mayr, R.: Efficient reduction of nondeterministic automata with application to language inclusion testing. Logical Methods Comput. Sci. 15(1), 12:1–12:73 (2019)
Cheon, H., Hahn, J., Han, Y.-S.: On the decidability of infix inclusion problem. In: Proceedings of the 26th international conference on developments in language theory, pp 115–126 (2022)
Acknowledgements
A preliminary version of the paper appeared in Proceedings of the 26th International Conference on Developments in Language Theory, DLT 2022 [25]. This research was supported by the NRF grant (RS-2023-00208094) and the AI Graduate School Program (No. 2020-0-01361) funded by the Korean government (MSIT). The first two authors contributed equally to this work.
Author information
Contributions
H. Cheon and J. Hahn wrote the main manuscript text, and H. Cheon prepared Figures 1–5 and Table 1. Y.-S. Han supervised the project. All authors reviewed and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Cheon, H., Hahn, J. & Han, YS. On the Decidability of Infix Inclusion Problem. Theory Comput Syst (2024). https://doi.org/10.1007/s00224-023-10160-w