Random Access in Persistent Strings and Segment Selection
Theory of Computing Systems (IF 0.5), Pub Date: 2022-12-17, DOI: 10.1007/s00224-022-10109-5
Philip Bille , Inge Li Gørtz

We consider compact representations of collections of similar strings that support random access queries. The collection of strings is given by a rooted tree in which each edge is labeled by an edit operation (inserting, deleting, or replacing a character) and each node represents the string obtained by applying the sequence of edit operations on the path from the root to that node. The goal is to compactly represent the entire collection while supporting fast random access to any part of any string in the collection. This problem captures natural scenarios such as representing the history of an edited document or representing highly repetitive collections. Given a tree with n nodes, we show how to represent the corresponding collection in O(n) space with \(O(\log n/ \log \log n)\) query time. This improves the previous time-space trade-offs for the problem. Additionally, we show a lower bound proving that the query time is optimal for any solution using near-linear space. To achieve our bounds for random access in persistent strings, we show how to reduce the problem to the following natural geometric selection problem on line segments. Consider a set of horizontal line segments in the plane. Given parameters i and j, a segment selection query returns the jth smallest segment (the segment with the jth smallest y-coordinate) among the segments crossing the vertical line through x-coordinate i. The segment selection problem is to preprocess a set of horizontal line segments into a compact data structure that supports fast segment selection queries. We present a solution that uses O(n) space and supports segment selection queries in \(O(\log n/ \log \log n)\) time, where n is the number of segments. Furthermore, we prove that this query time is also optimal for any solution using near-linear space.
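To make the two query types in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: it only pins down what a query should return and runs in linear time or worse per query. The names `segment_select`, `apply_edit`, and `access` are hypothetical, as are the encodings: segments as `(x_left, x_right, y)` tuples, edits as `(op, position, char)` tuples with 0-indexed positions, and a root (node 0) that represents the empty string.

```python
from typing import List, Optional, Tuple

# Hypothetical encodings, chosen only for this illustration.
Segment = Tuple[int, int, int]   # (x_left, x_right, y)
Edit = Tuple[str, int, str]      # (op, position, char), op in {"ins", "del", "rep"}


def segment_select(segments: List[Segment], i: int, j: int) -> Optional[Segment]:
    """Return the segment with the j-th smallest y (j is 1-indexed) among
    the segments crossing the vertical line x = i, or None if fewer than
    j segments cross it. Brute force: filter, sort, index."""
    crossing = sorted((s for s in segments if s[0] <= i <= s[1]),
                      key=lambda s: s[2])
    return crossing[j - 1] if 1 <= j <= len(crossing) else None


def apply_edit(s: str, edit: Edit) -> str:
    """Apply a single-character edit to s."""
    op, pos, ch = edit
    if op == "ins":
        return s[:pos] + ch + s[pos:]
    if op == "del":
        return s[:pos] + s[pos + 1:]
    if op == "rep":
        return s[:pos] + ch + s[pos + 1:]
    raise ValueError(f"unknown edit operation: {op}")


def access(parent: List[int], edits: List[Optional[Edit]], node: int, k: int) -> str:
    """Return character k of the string at `node` by replaying the edits on
    the root-to-node path. Assumes node 0 is the root and represents the
    empty string; edits[v] labels the edge from parent[v] to v."""
    path = []
    while node != 0:
        path.append(edits[node])
        node = parent[node]
    s = ""
    for e in reversed(path):   # replay from the root downwards
        s = apply_edit(s, e)
    return s[k]
```

For example, with `segments = [(0, 5, 3), (2, 4, 1), (3, 8, 2)]`, all three segments cross the line x = 3, so `segment_select(segments, 3, 2)` selects the one with the second-smallest y-coordinate. The results in the abstract concern answering such queries in \(O(\log n/ \log \log n)\) time using O(n) space, which this baseline does not attempt.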




Updated: 2022-12-17