Abstract
The partially linear varying coefficient spatial autoregressive model is a semiparametric spatial autoregressive model in which the coefficients of some explanatory variables are allowed to vary while the coefficients of the remaining explanatory variables are constant. The vector of coefficient functions in the nonparametric part is estimated by local linear smoothing. To address the variable selection problem, this paper proposes a penalized robust regression estimator based on the exponential squared loss, which estimates the parameters while selecting the important explanatory variables. A solution algorithm is constructed by combining the block coordinate descent (BCD) algorithm with the concave-convex procedure (CCCP). The robustness of the proposed variable selection method is demonstrated by numerical simulations and illustrated on housing data from Airbnb.
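The key robustness ingredient is the exponential squared loss \(\rho_\gamma(r)=1-\exp(-r^2/\gamma)\), which is bounded, so gross outliers have limited influence. The following is a minimal sketch of this idea, not the authors' algorithm: it fits a plain linear model (no spatial lag, varying coefficients, or penalty) by iteratively reweighted least squares, where the weights \(w_i=\exp(-r_i^2/\gamma)\) arise from the gradient of the exponential squared loss; the names `exp_squared_loss`, `robust_fit`, and the tuning value `gamma=1.0` are illustrative choices.

```python
import numpy as np

def exp_squared_loss(r, gamma):
    """Exponential squared loss rho_gamma(r) = 1 - exp(-r^2 / gamma).

    Bounded above by 1, so a single large residual contributes at most 1
    to the objective -- this is what limits the influence of outliers.
    """
    return 1.0 - np.exp(-r**2 / gamma)

def robust_fit(X, y, gamma=1.0, n_iter=200):
    """Minimize sum_i rho_gamma(y_i - x_i' beta) over beta by iteratively
    reweighted least squares: at each step solve the weighted normal
    equations with weights w_i = exp(-r_i^2 / gamma), which shrink toward
    zero for observations with large residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS start
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.exp(-r**2 / gamma)        # outliers get near-zero weight
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    return beta

# Toy data: linear model with a few gross outliers in the response.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=100)
y[:5] += 20.0                            # contaminate 5 observations
beta_hat = robust_fit(X, y)
```

With the contaminated data, ordinary least squares is pulled toward the outliers, while the reweighted fit recovers coefficients close to `beta_true`, because the weights of the five contaminated points decay to essentially zero after the first iteration.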
References
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Cliff A, Ord J (1973) Spatial autocorrelation. Pion, London
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Forsythe GE, Moler CB, Malcolm MA (1977) Computer methods for mathematical computations. Prentice-Hall, Hoboken
Guo S, Wei CH (2015) Variable selection for spatial autoregressive models. J Minzu Univ China (Nat Sci Ed)
Kelejian HH (2008) A spatial j-test for model specification against a single or a set of non-nested alternatives. Lett Spat Resour Sci 1(1):3–11
Kelejian HH, Piras G (2011) An extension of Kelejian’s j-test for non-nested spatial models. Reg Sci Urban Econ 41(3):281–292
Kelejian HH, Piras G (2014) An extension of the j-test to a spatial panel data framework. J Appl Econom 31(2):387–402
Li T, Yin Q, Peng J (2020) Variable selection of partially linear varying coefficient spatial autoregressive model. J Stat Comput Simul 90(15):2681–2704
Liu X, Chen J, Cheng S (2018) A penalized quasi-maximum likelihood method for variable selection in the spatial autoregressive model. Spat Stat 25:86–104
Ma Y, Pan R, Zou T, Wang H (2020) A naive least squares method for spatial autoregression with covariates. Stat Sin 30(2):653–672
Mu J, Wang G, Wang L (2020) Spatial autoregressive partially linear varying coefficient models. J Nonparametric Stat 32(2):428–451
Song Y, Liang X, Zhu Y, Lin L (2021) Robust variable selection with exponential squared loss for the spatial autoregressive model. Comput Stat Data Anal 155(1):107094
Su L, Jin S (2010) Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. J Econom 157(1):18–33
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288
Wang H, Li G, Jiang G (2007) Robust regression shrinkage and consistent variable selection through the lad-lasso. J Bus Econ Stat 25(3):347–355
Wang X, Jiang Y, Huang M, Zhang H (2013) Robust variable selection with exponential squared loss. J Am Stat Assoc 108:632–643
Yuille AL, Rangarajan A (2003) The concave-convex procedure. Neural Comput 15(4):915–936
Zhang X, Yu J (2018) Spatial weights matrix selection and model averaging for spatial autoregressive models. J Econom 203(1):1–18
Author information
Authors and Affiliations
Contributions
JY and YS wrote the main manuscript text and JD prepared Tables 1–3. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interest
The authors declare no competing interests.
Additional information
Handling Editor: Luiz Duczmal.
This research was supported by the Fundamental Research Funds for the Central Universities (No. 23CX03012A) and the National Key Research and Development Program of China (2021YFA1000102).
Appendices
A Proof of Theorem 1
Let \(\xi =n^{-1/2}+a_n\). Similar to Fan and Li (2001), we first prove that for any given \(\varepsilon >0\), there exists a constant \(C\) such that
\(P\left\{ \sup _{\Vert u\Vert =C}\ell _n(\theta _0+\xi u)<\ell _n(\theta _0)\right\} \ge 1-\varepsilon ,\quad (\mathrm{A.1})\)
where \(u\) is a \((p+1)\)-dimensional vector with \(\Vert u\Vert =C\). This implies that there exists a local maximizer \(\hat{\theta }_n\) such that \(\Vert \hat{\theta }_n-\theta _0\Vert =O_p(\xi )\). Note that minimizing (10) is equivalent to maximizing
Let
Since \(p_{\lambda _j}(0)=0\) for \(j=1,\ldots ,p\) and \(\gamma _n-\gamma _0=o_p(1)\), by Taylor’s expansion we have
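The display (A.4) referred to below appears to have been lost in extraction. A plausible reconstruction, following the standard decomposition in the proof of Fan and Li (2001) (with \(D(\theta _0,\gamma _0)\) the score vector, \(I(\theta _0,\gamma _0)\) the information-type matrix, and the last sum bounding the penalty difference over the first \(s\) components), is:

```latex
\begin{aligned}
\ell_n(\theta_0+\xi u)-\ell_n(\theta_0)
&\le \xi\, D(\theta_0,\gamma_0)^{T}u
  -\tfrac{1}{2}\,n\xi^{2}\,u^{T} I(\theta_0,\gamma_0)\, u\,\{1+o_p(1)\}\\
&\quad +\sum_{j=1}^{s}\left[\,n\xi a_n |u_j| + n\xi^{2} b_n u_j^{2}\,\right].
\qquad (\mathrm{A.4})
\end{aligned}
```

The three terms of this bound are exactly the first, second, and third terms discussed in the next paragraph; this form should be checked against the published version.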
Note that \(n^{-1/2}D(\theta _0,\gamma _0)=O_p(1)\). Hence the first term on the right-hand side of (A.4) is of order \(O_p\left( n^{1/2}\xi \right) =O_p\left( n\xi ^2\right)\). By choosing a sufficiently large \(C\), the second term dominates the first term uniformly in \(\left\| u\right\| =C\). Since \(b_n=o_p(1)\), the third term of (A.4) is also dominated by the second term. Therefore, (A.1) holds by choosing a sufficiently large \(C\).
B Proof of Theorem 2
Proof of Theorem 2(1)
We now show the sparsity. By Theorem 1, it is sufficient to show that, for any \(\theta _1\) satisfying \(\theta _1-\theta _{01}=O_p\left( n^{-1/2}\right)\), some \(\varepsilon _n=Cn^{-1/2}\), and \(j=s+1,\ldots ,p\), we have \(\partial \ell /\partial \theta _j<0\) for \(0<\theta _j<\varepsilon _n\), and \(\partial \ell /\partial \theta _j>0\) for \(-\varepsilon _n<\theta _j<0\). Let
By Taylor’s expansion we have
where \(\theta ^{*}\) lies between \(\theta\) and \(\theta _{0}\). Note that
Since \(b_{n}=o_{p}(1)\) and \(\sqrt{n} a_{n}=o_{p}(1)\) , we obtain \(\theta -\theta _{0}=O_{p}\left( n^{-1 / 2}\right)\). By \(\sqrt{n}\left( \gamma _{n}-\gamma _{0}\right) =O_{p}(1)\) , we have
Since \(\frac{1}{\min _{s+1 \le j \le p+1} \sqrt{n} \lambda _{j}}=O_{p}(1)\) and \(\liminf _{n \rightarrow \infty }\liminf _{t \rightarrow 0^{+}}\min _{s+1\le j\le p}p_{\lambda _j}^{\prime }(|t|)/\lambda _j>0\) with probability 1, the sign of the derivative is determined entirely by that of \(\theta _j\). \(\square\)
Proof of Theorem 2(2)
We have shown that there exists a \(\hat{\theta }_{n1}\) that is a \(\sqrt{n}\)-consistent local maximizer of \(\ell _n\left\{ (\theta _1,0)\right\}\) satisfying \(\partial \ell \{(\hat{\theta }_{n1},0)\}/ \partial \theta _j=0\), for \(j=1,\ldots ,s\).
Since \(\hat{\theta }_{n1}\) is a consistent estimator, we have
The above equation can be rewritten as follows
and
Since \(\sqrt{n}\left( \gamma _{n}-\gamma _{0}\right) =o_{p}(1)\), invoking Slutsky's lemma and the Lindeberg–Feller central limit theorem, we have \(\Sigma _{1}={\text {diag}}\left\{ p_{\lambda _{1}}^{\prime \prime }\left( \left| \theta _{01}\right| \right) , \ldots , p_{\lambda _{s}}^{\prime \prime }\left( \left| \theta _{0 s}\right| \right) \right\}\), \(\Sigma _{2}={\text {cov}}\left( \exp \left( -r^{2} / \gamma _{0}\right) \frac{2r}{\gamma _{0}}\tilde{G}_{i 1}\right)\), \(\Delta =\left( p_{\lambda _{j}}^{\prime }\left( \left| \theta _{01}\right| \right) {\text {sign}}\left( \theta _{01}\right) , \ldots , p_{\lambda _{j}}^{\prime }\left( \left| \theta _{0 s}\right| \right) {\text {sign}}\left( \theta _{0 s}\right) \right) ^{T}\), \(I_{1}\left( \theta _{01}, \gamma _{0}\right) =\frac{2}{\gamma _{0}} E\left[ \exp \left( -r^{2} / \gamma _{0}\right) \left( \frac{2 r^{2}}{\gamma _{0}}-1\right) \right] \times \left( E \tilde{G}_{i 1} \tilde{G}_{i 1}^{T}\right)\). \(\square\)
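The distributional conclusion that these quantities feed into appears to have been dropped in extraction. With \(\Sigma _1\), \(\Sigma _2\), \(\Delta\), and \(I_1\) as defined above, the standard oracle-style asymptotic normality statement would read as follows (a reconstruction, to be checked against the published version):

```latex
\sqrt{n}\,\bigl(\Sigma_1 + I_1(\theta_{01},\gamma_0)\bigr)
\Bigl\{\hat{\theta}_{n1}-\theta_{01}
 +\bigl(\Sigma_1 + I_1(\theta_{01},\gamma_0)\bigr)^{-1}\Delta\Bigr\}
\;\xrightarrow{\;d\;}\; N\bigl(0,\;\Sigma_2\bigr).
```

Here \(\Sigma _1\) and \(\Delta\) capture the penalty's bias contribution, \(I_1\) is the information-type matrix of the exponential squared loss, and \(\Sigma _2\) is the covariance of the loss's score.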
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yu, J., Song, Y. & Du, J. Robust variable selection with exponential squared loss for the partially linear varying coefficient spatial autoregressive model. Environ Ecol Stat 31, 97–127 (2024). https://doi.org/10.1007/s10651-024-00603-z