Time series clustering using trend, seasonal and autoregressive components to identify maximum temperature patterns in the Iberian Peninsula
Environmental and Ecological Statistics (IF 3.8), Pub Date: 2023-07-15, DOI: 10.1007/s10651-023-00572-9
Arnobio Palacios Gutiérrez, Jose Luis Valencia Delfa, María Villeta López

Time series (TS) clustering is a key area of data mining used to identify patterns shared across series. This study introduces a novel approach that clusters TS by representing each one with a feature vector describing its trend, seasonality and noise components. The goal is to identify areas of the Iberian Peninsula (IP) that followed the same pattern of change in maximum temperature during 1931–2009. The representation reduces dimensionality and is obtained by applying singular spectrum analysis (SSA) decomposition sequentially. SSA is a well-developed methodology for TS analysis and forecasting, with applications ranging from the decomposition and filtering of nonparametric TS to parameter estimation and forecasting. In this approach, the trend, seasonal and residual components of each TS, which corresponds to a specific area of the IP, are first extracted with the proposed SSA methodology. The feature vector of each TS is then obtained by modelling the extracted components and estimating their parameters. Finally, a clustering algorithm groups the TS into clusters, which are characterised by their centroids. Applied to a climate database, the methodology gives reasonable results that align with the defined characteristics and enables a spatial exploration of the IP. The results identify three differentiated zones that describe how maximum temperature varied: the northern and central zones showed an increase in temperature over time, whereas the southern zone showed a slight decrease. Moreover, seasonal variation differed across the zones.
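
The pipeline described in the abstract (sequential SSA decomposition, then a feature vector per series, then clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The window length of 24, the choice of features (linear slope of the trend, standard deviation of the seasonal part, lag-1 autocorrelation of the residual), the farthest-point initialisation and the function names `ssa_components`, `feature_vector` and `kmeans` are all assumptions made for illustration; the abstract does not specify the component models or the clustering algorithm.

```python
import numpy as np


def ssa_components(x, window):
    """Basic SSA: one reconstructed series per singular triple."""
    x = np.asarray(x, dtype=float)
    k = x.size - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    comps = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: average each anti-diagonal back into a series.
        flipped = elem[::-1]
        comps.append([flipped.diagonal(d).mean() for d in range(-window + 1, k)])
    return np.array(comps)


def feature_vector(x, window=24):
    """Hypothetical features: [trend slope, seasonal std, residual AR(1)]."""
    x = np.asarray(x, dtype=float)
    # Step 1 (assumed): the leading SSA component is taken as the trend.
    trend = ssa_components(x, window)[0]
    detrended = x - trend
    # Step 2 (assumed): SSA on the detrended series; the leading pair
    # of components is taken as the seasonal part.
    seasonal = ssa_components(detrended, window)[:2].sum(axis=0)
    noise = detrended - seasonal
    slope = np.polyfit(np.arange(x.size), trend, 1)[0]
    e = noise - noise.mean()
    denom = (e ** 2).sum()
    ar1 = (e[1:] * e[:-1]).sum() / denom if denom > 0 else 0.0
    return np.array([slope, seasonal.std(), ar1])


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    points = np.asarray(points, dtype=float)
    centroids = [points[0]]
    while len(centroids) < k:
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(iters):
        dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

In use, one would compute `feature_vector` for the series of each area, stack the results into a matrix, standardise its columns so that no single feature dominates the distance, and pass it to `kmeans` with `k=3` to mirror the three zones reported above. Each series of any length is reduced to three numbers, which is the dimensionality reduction the abstract refers to.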




Updated: 2023-07-16