Usage-aware representation learning for critical information identification in transportation networks,Transportation Research Part C: Emerging Technologies


Transportation Research Part C: Emerging Technologies (IF 8.3), Pub Date: 2024-02-27, DOI: 10.1016/j.trc.2024.104538
Ran Sun, Yueyue Fan

Extracting meaningful information from noisy high-dimensional data is attracting increasing attention as richer, higher-resolution data is collected and used for transportation system planning and management. Discovering critical information through effective data representation learning not only helps reduce data dimension but also enables a deeper understanding of the underlying properties of noisy data, which could in turn lead to better planning and operations decisions. In this study, we present a new perspective: unlike most existing approaches in the general data science literature, the design of a data representation should go beyond the data itself and incorporate an understanding of how the data is used in domain-specific applications. We further argue that this design philosophy is particularly important for transportation data because of the high spatial correlations that network interdependence induces in such data. We propose a usage-aware representation learning framework that incorporates the information loss for the downstream application into the data encoding-decoding process. The proposed approach is formulated as a Stiefel manifold optimization problem. The effectiveness of the proposed framework is demonstrated in two network applications: modeling transportation network flows and estimating network-level vehicular emissions. The learned representation is compared with those of existing approaches in multiple evaluation contexts, including data reconstruction quality, clustering, anomaly detection, and critical information identification, through case studies on the Sioux Falls, Boston, and San Jose networks. The consistently good performance of our approach in these experiments indicates the importance of incorporating downstream data usage into the process of data representation learning.
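The abstract does not spell out the formulation, so the following is only a minimal sketch of what "adding downstream information loss to an encoding-decoding objective on the Stiefel manifold" could look like. It assumes a linear encoder `W` with orthonormal columns (`W.T @ W = I`), a linear downstream map `A` standing in for the application (for example, per-link emission factors), and a weight `lam`; all three, along with the function names, are illustrative assumptions and not the authors' model. The solver is plain Riemannian gradient descent with a QR retraction and backtracking, chosen for brevity.

```python
import numpy as np


def usage_aware_loss(W, X, A, lam):
    """Reconstruction loss plus weighted downstream loss (assumed form)."""
    R = X - X @ W @ W.T              # residual after encode (X W) and decode (W^T)
    return np.sum(R ** 2) + lam * np.sum((R @ A) ** 2)


def euclidean_grad(W, X, A, lam):
    """Gradient of usage_aware_loss with respect to W, ignoring the constraint."""
    d = X.shape[1]
    S = X.T @ X
    M = np.eye(d) + lam * (A @ A.T)  # folds both loss terms into one weight matrix
    Q = np.eye(d) - W @ W.T
    G = M @ Q @ S + S @ Q @ M
    return -2.0 * G @ W


def retract(Y):
    """Map a full-rank d x k matrix back onto the Stiefel manifold via QR."""
    Qf, Rf = np.linalg.qr(Y)
    signs = np.sign(np.diag(Rf))
    signs[signs == 0] = 1.0
    return Qf * signs                # fix column signs so the factorization is unique


def fit_usage_aware_encoder(X, A, k, lam=1.0, iters=200, seed=0):
    """Riemannian gradient descent with backtracking (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    W = retract(rng.standard_normal((X.shape[1], k)))
    loss = usage_aware_loss(W, X, A, lam)
    history = [loss]
    step = 1.0
    for _ in range(iters):
        E = euclidean_grad(W, X, A, lam)
        WtE = W.T @ E
        # Project the Euclidean gradient onto the tangent space at W.
        rgrad = E - W @ (WtE + WtE.T) / 2.0
        t, accepted = step, False
        while t > 1e-12:
            W_new = retract(W - t * rgrad)
            new_loss = usage_aware_loss(W_new, X, A, lam)
            if new_loss < loss:      # accept only steps that reduce the loss
                W, loss, accepted = W_new, new_loss, True
                break
            t *= 0.5
        if not accepted:
            break                    # no improving step found; treat as converged
        history.append(loss)
        step = 2.0 * t               # let the step size grow again after a success
    return W, history


if __name__ == "__main__":
    # Synthetic data only; the shapes and values carry no meaning.
    rng = np.random.default_rng(42)
    X_demo = rng.standard_normal((100, 8))   # 100 observations on 8 links
    A_demo = rng.standard_normal((8, 1))     # assumed linear usage map
    W_demo, hist = fit_usage_aware_encoder(X_demo, A_demo, k=3, lam=2.0)
    print(f"loss: {hist[0]:.3f} -> {hist[-1]:.3f}")
```

With `lam = 0` this objective reduces to ordinary PCA-style reconstruction; raising `lam` steers the learned subspace toward directions that matter for the downstream quantity `X @ A`, which is the intuition the abstract describes. A dedicated manifold optimization library would be a more robust choice than this hand-rolled loop.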


Updated: 2024-02-27
