Abstract
The advent of high-dimensional data has heightened the importance of variable selection. Regularization is a popular technique for simultaneous variable selection and parameter estimation, but spatial data are more intricate than ordinary data because of spatial correlation and non-stationarity. This article proposes a robust regularized regression estimator based on the Huber loss and a generalized Lasso penalty to overcome these obstacles. Moreover, linear equality and inequality constraints are incorporated to improve the efficiency and accuracy of model estimation. To evaluate the proposed model, we derive its Karush-Kuhn-Tucker (KKT) conditions and establish a set of indicators, including a formula for the degrees of freedom. These indicators are used to construct AIC and BIC information criteria, which guide the choice of the tuning parameters in numerical simulations. Using the classic Boston Housing dataset, we compare the proposed model with its squared-loss counterpart in scenarios with and without outliers. The results demonstrate that the proposed model achieves robust variable selection. This work provides a novel approach to spatial data analysis with broad applications in fields such as economics, ecology, and medicine.
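For reference, the Huber loss on which the estimator is built takes the standard piecewise form (with threshold \(\delta > 0\); the article's own threshold notation may differ):

\[
\rho_\delta(r) \;=\;
\begin{cases}
\tfrac{1}{2} r^2, & |r| \le \delta, \\[4pt]
\delta |r| - \tfrac{1}{2}\delta^2, & |r| > \delta,
\end{cases}
\]

which is quadratic for small residuals and grows only linearly for large ones, and is what makes the resulting estimator resistant to outliers.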
Data Availability
The data that support the findings of this study are available from the corresponding author upon request.
Funding
This research was supported by the National Key Research and Development Program of China (2021YFA1000102).
Author information
Authors and Affiliations
Contributions
Yunquan Song conceived the idea and developed the theory. Minmin Zhan and Yue Zhang conceived of the presented idea; Yue Zhang performed the computations; Minmin Zhan verified the analytical methods. Yunquan Song, Minmin Zhan, and Yue Zhang contributed to the final version of the manuscript. Yongxin Liu supervised the project.
Corresponding author
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
This research was supported by the Fundamental Research Funds for the Central Universities (No. 23CX03012A) and the National Key Research and Development Program of China (2021YFA1000102).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Proof of Theorem 1 Let \(P_{\text{null}}\) be the projection matrix onto \({\text{null}}\left(G_{-A, B}\right)\), so that \(P_{\text{null}} G_{-A, B}^T=0\). Multiplying both sides of Eq. (2.21) by \(P_{\text{null}}\) yields:
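As a numerical illustration of the projector used here (a sketch, not the paper's code; `G` stands in for a generic constraint matrix such as \(G_{-A,B}\)), the projection matrix onto \({\text{null}}(G)\) can be computed via the Moore-Penrose pseudoinverse, and the identity \(P_{\text{null}} G^T = 0\) checked directly:

```python
import numpy as np

def null_space_projector(G):
    """Projection matrix onto null(G): P = I - G^+ G.
    P annihilates the row space of G (so P @ G.T == 0) and is idempotent."""
    n = G.shape[1]
    return np.eye(n) - np.linalg.pinv(G) @ G

# Example: a 2 x 4 constraint matrix
G = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
P = null_space_projector(G)
print(np.allclose(P @ G.T, 0))  # True: P kills the columns of G^T
print(np.allclose(P @ P, P))    # True: P is idempotent, as a projector must be
```

Multiplying an equation by this `P` removes any term lying in the row space of `G`, which is exactly the simplification step performed in the proof.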
\(\hat{\beta }^*\) can be decomposed into the sum of two parts as follows:
The last equality holds because \(G_{-A, B} \hat{\beta }^*=g_{-A, B}\). Substituting the expression for \(\hat{\beta }^*\) into (A.1) and simplifying, we obtain:
The last equality holds because, from (A.1), we can deduce that \(P_{\text{null}} D_A^* \lambda s_A - P_{\text{null}} X_v^{*T} t s_v \in {\text{col}}\left(P_{\text{null}} X_{-v}^{*T}\right)\). Further, we can derive:
Therefore,
Therefore, the fitted values \(X_{-v}^* \hat{\beta }^*\) can be expressed as:
Proof of Theorem 2 \(\hat{\mu }(y)=\hat{y}\) is continuous and almost everywhere differentiable with respect to y. Therefore, we can use Stein’s lemma to calculate the degrees of freedom for the Space Huber Lcg-Lasso fitted values. Thus,
As \(\hat{\beta }^*\) depends only on \(y_{-v}\), the derivatives of the fitted values \(\hat{y}_i\) with respect to \(y_i\) are 0 for \(i \in \mathcal {V}\), i.e.,
Thus, the expression for the degrees of freedom for the fitted values \(\hat{\mu }\) becomes
Considering the expression of \(\hat{y}_{-v}\) from Theorem 1, we have
The first term on the right-hand side depends directly on y, while the remaining parts depend only on the boundary sets \(\mathcal {A}\), \(\mathcal {B}\), \(\mathcal {V}\) and the signs \(s_{\mathcal {A}}\), \(s_{\mathcal {V}}\). These sets and signs are locally constant in a neighborhood of y, so their derivatives with respect to y are zero. Therefore,
Since the trace of the projection matrix equals the dimension of the corresponding linear space, the theorem is established.
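As a numerical complement to this closing step (a sketch under the simplifying assumption of a plain linear smoother \(\hat{y} = H y\), not the constrained estimator of the theorem), Stein's divergence formula \(\mathrm{df} = \sum_i \partial \hat{y}_i / \partial y_i\) reduces to \(\operatorname{tr}(H)\), and for a projection matrix the trace equals the dimension of the space projected onto:

```python
import numpy as np

# Hypothetical linear smoother: ordinary least squares on a small design,
# whose hat matrix H = X (X^T X)^{-1} X^T projects onto col(X).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Stein / divergence form of the degrees of freedom: for the linear map
# y -> H y, the sum of partial derivatives d(yhat_i)/d(y_i) is tr(H).
df = np.trace(H)
print(round(df, 6))  # 3.0: the trace of the projection equals dim(col(X))
```

This mirrors the theorem's conclusion: once the fitted values are shown to be a projection of y (plus locally constant terms), the degrees of freedom fall out as the dimension of the corresponding linear space.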
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Song, Y., Zhan, M., Zhang, Y. et al. Huber Loss Meets Spatial Autoregressive Model: A Robust Variable Selection Method with Prior Information. Netw Spat Econ 24, 291–311 (2024). https://doi.org/10.1007/s11067-024-09614-6
DOI: https://doi.org/10.1007/s11067-024-09614-6