Abstract
In practice, heavy-duty diesel vehicles are a major source of urban nitrogen oxides (NOx), accounting for more than 80% of NOx and more than 90% of particulate matter (PM) in total vehicle emissions. Detecting and controlling heavy-duty diesel emissions is therefore critical for protecting public health. Currently, vehicles on the road must be tested at vehicle inspection stations every six months or once a year to filter out high-emission mobile sources. However, the long interval between inspections makes it difficult to screen high-emission vehicles in time, and a fixed threshold cannot adapt to dynamic changes in driving conditions. An on-board diagnostic (OBD) device is installed inside the vehicle and can continuously track and record the vehicle's emission data in real time. In this paper, we propose a temporal optimization long short-term memory (LSTM) and adaptive dynamic threshold approach that uses OBD data to identify heavy-duty high-emitters. First, a temporal optimization LSTM emission prediction model is established to address the attention bias across time steps that is caused by the large volume of OBD data streams in practice. Then, the concentration prediction error sequence is checked against flexible criteria to distinguish anomalous emission contexts; these criteria are computed by an adaptive dynamic threshold that changes with driving conditions. Finally, a similarity metric strategy for time series is introduced to correct some pseudo-anomalous results. Experiments on three real OBD time-series emission datasets demonstrate that our method identifies anomalous emissions with high accuracy.
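Abstract (translated from Chinese)
In real-world scenarios, heavy-duty diesel vehicles are an important source of urban nitrogen oxides: they account for more than 80% of the nitrogen oxides (NOx) and more than 90% of the particulate matter (PM) in total vehicle emissions. Detecting and controlling heavy-duty diesel vehicle emissions is critical for protecting public health. At present, vehicles on the road must be tested regularly, every 6 months or every year, at vehicle inspection stations to filter out high-emission mobile sources. However, because the interval between annual inspections is long, it is difficult to screen out high-emission vehicles in a timely and effective manner, and a fixed threshold cannot adapt to dynamic changes in vehicle driving conditions. An on-board diagnostic (OBD) device is installed inside the vehicle and can continuously track and record emission data in real time. This paper proposes a temporal optimization long short-term memory (LSTM) and adaptive dynamic threshold method that uses OBD data to identify heavy-duty high-emission vehicles. First, a temporal optimization LSTM emission prediction model is established to address the time-step attention bias caused by the large volume of OBD data streams in practice. Then, the concentration prediction error sequence is checked against a flexible threshold criterion to distinguish anomalous emission conditions; this threshold is computed adaptively as driving conditions change. Finally, a similarity metric strategy for time series is introduced to correct some false anomaly results. Experiments on 3 real OBD time-series emission datasets show that the method achieves excellent high-emitter identification results.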
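As a rough illustration of the thresholding step described in the abstract, the sketch below flags points in a prediction error sequence against a threshold that is recomputed from a sliding window of recent errors. The abstract does not give the threshold formula, so the box-plot rule used here (upper fence Q3 + k * IQR) is an assumption suggested only by the supplementary material titles; the function name `dynamic_threshold_flags` and the parameters `window` and `k` are hypothetical.

```python
from statistics import quantiles


def dynamic_threshold_flags(errors, window=30, k=1.5):
    """Flag errors[i] when it exceeds an upper fence built from recent errors.

    Assumption (not from the paper): the fence is the box-plot rule
    Q3 + k * IQR over the `window` errors that precede position i.
    """
    flags = []
    for i, e in enumerate(errors):
        history = errors[max(0, i - window):i]
        if len(history) < 4:
            # Too little history to estimate quartiles; do not flag.
            flags.append(False)
            continue
        q1, _, q3 = quantiles(history, n=4)
        upper = q3 + k * (q3 - q1)
        flags.append(e > upper)
    return flags


# Hypothetical prediction errors with one spike at index 8.
errors = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.95, 1.05, 9.0, 1.0]
flags = dynamic_threshold_flags(errors, window=8)
```

Because the fence is recomputed at every step from the preceding window, it follows slow shifts in the error level instead of applying one fixed cut-off to the whole series.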
Data availability
The data that support the findings of this study are available from the corresponding authors upon reasonable request.
Author information
Contributions
Zhenyi XU designed the research. Zhenyi XU and Renjun WANG processed the data. Zhenyi XU drafted the paper. Renjun WANG helped organize the paper. Yu KANG helped with data control and project management. Yang CAO and Yu KANG acquired the funding, and revised and finalized the paper.
Ethics declarations
Zhenyi XU, Renjun WANG, Yang CAO, and Yu KANG declare that they have no conflict of interest.
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 62033012 and 62103124) and the Major Special Science and Technology Project of Anhui Province, China (No. 202003a07020009).
List of supplementary materials
1 Overview of anomaly detection
2 Datasets for OBD
3 Algorithms and the box plot
4 Extended experimental analysis
Table S1 Description of the properties of the OBD data streams
Table S2 Detailed time information about NES and PES
Fig. S1 Raw time-series data stream for OBD
Fig. S2 Principle of the box plot
Fig. S3 TSAO-LSTM prediction errors for different split ratios on OBDs
Algorithm S1 TSAO-LSTM optimization of time-step attention weights
Algorithm S2 Similarity metric algorithm between emission prediction error series using DTW
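Algorithm S2 itself is not reproduced on this page. For orientation only, the sketch below shows the textbook dynamic time warping (DTW) distance between two numeric series, which is the kind of similarity metric the title of Algorithm S2 refers to. How the paper uses this distance to correct pseudo-anomalous results is not stated here, so the function name `dtw_distance` and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute difference as the local cost.

    Uses two rows of the cumulative cost table, so memory is O(len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # prev[j] holds the cumulative cost D[i-1][j]; D[0][0] = 0.
    prev = [0.0] + [inf] * m
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

A small distance means the two error series have a similar shape even when one is stretched or shifted in time, which is why DTW is often preferred over a point-by-point comparison for series of unequal length.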
About this article
Cite this article
Xu, Z., Wang, R., Cao, Y. et al. High-emitter identification for heavy-duty vehicles by temporal optimization LSTM and an adaptive dynamic threshold. Front Inform Technol Electron Eng 24, 1633–1646 (2023). https://doi.org/10.1631/FITEE.2300005
DOI: https://doi.org/10.1631/FITEE.2300005