1 Introduction

Characterizing the correlation between two random variables is a classical and fundamental topic in statistics. Although the Pearson correlation is widely used in practice, it is well known to be effective only for detecting linear relationships. To describe nonlinear correlation, in a seminal paper, Székely et al. (2007) introduced the distance correlation, which is zero if and only if the two random vectors are independent. The introduction of distance correlation stimulated strong interest in developing new nonlinear correlations. For instance, Zhu et al. (2017) proposed the projection correlation, which does not require any moment condition. Pan et al. (2020) developed the Ball covariance to describe dependence in Banach spaces. Chatterjee (2021) introduced a new correlation with a simple expression. There are also other important nonparametric correlation methods. For instance, Gretton et al. (2005) introduced the Hilbert-Schmidt independence criterion (HSIC), which measures statistical dependence through distribution embeddings in a Hilbert space. Reshef et al. (2011) proposed the maximal information coefficient (MIC) to evaluate relationships between two random variables, characterized by two properties: generality and equitability. For other important methods, see the reviews by Josse and Holmes (2016) and Tjøstheim et al. (2022).

Though the recently developed correlations above are powerful for describing nonlinear relationships, they involve all aspects of the distributions of the two random vectors. However, in many applications such as regression problems, only a certain aspect of the distribution is of interest. Following this line, Shao and Zhang (2014) introduced the martingale difference correlation, which is a conditional mean version of the distance correlation. It is zero if and only if the two random vectors are conditional mean independent, i.e., the conditional mean of one variable given the other is equal to its unconditional mean. However, when its value is equal to one, the interpretation is not clear. Zheng et al. (2012) proposed the Generalized Measure of Correlation (GMC), which is a nonparametric extension of the classical \(R^2\) in linear regression models and fully describes the conditional mean dependence. Instead of the conditional mean, correlation in terms of conditional quantiles is also of great importance and interest. Compared with conditional mean regression, conditional quantile regression (Koenker 2005) is robust to outliers and heavy-tailed distributions, and can capture more information about the distribution. As an extension of the classical Pearson correlation, Li et al. (2015) introduced a novel quantile correlation. This quantile correlation has been applied and extended by many authors. For instance, Ma et al. (2017) adopted it to screen features, and Ma and Zhang (2016) proposed a composite quantile correlation. However, as pointed out by Xu (2017), both the quantile correlation and the composite quantile correlation can be zero even when there is a clear relationship.

In this paper, we also introduce correlations in terms of quantiles. Different from the quantile correlation in Li et al. (2015), however, our correlation can fully capture the conditional quantile dependence. It is in fact a quantile version of the Generalized Measure of Correlation and is thus called the quantile Generalized Measure of Correlation (qGMC). The qGMC has several appealing properties. It takes values between zero and one; the value is zero if and only if the two random vectors are conditionally quantile independent, and is one if and only if one variable is equal to the conditional quantile function. When there is a third random vector, it is also of great interest to develop a partial correlation. To this aim, Wang et al. (2015) introduced the conditional distance correlation, Azadkia and Chatterjee (2021) extended the correlation in Chatterjee (2021) and introduced a new partial correlation, and Park et al. (2015) introduced the partial martingale difference correlation. We also extend our quantile Generalized Measure of Correlation and introduce a quantile partial Generalized Measure of Correlation (qPGMC). Compared with these other recently developed partial correlations, the qPGMC is concerned with the quantile aspect.

We also make inference for these new measures. To estimate them, we need to estimate unknown conditional quantile functions. When the dimension is high, classical local polynomial methods do not work well, so modern flexible machine learning methods are adopted to estimate the conditional quantile functions. We show that when these conditional quantile functions can be consistently estimated, our new measures can also be consistently estimated. To construct confidence intervals, sample splitting is required to further reduce the asymptotic bias. We establish the asymptotic normalities of the relevant estimators under mild conditions. We further consider the composite quantile GMC, which integrates information from different quantile levels; its properties and estimators are investigated. We illustrate our proposed quantile GMC in a real data analysis.

The paper is organized as follows. In Sect. 2, we introduce our quantile GMC and quantile partial GMC and discuss their properties. In Sect. 3, we make inference for these two new measures. In Sect. 4, we extend to composite quantile GMC. In Sect. 5, numerical studies are conducted. A real data analysis is given in Sect. 6. We then give some discussions and conclusions in Sect. 7. Proofs are given in the Appendix.

2 Quantile generalized measure of correlation and its partial version

Now let \(Y\in \mathbb {R}\) be a one-dimensional random variable and \({\textbf{X}}=({\textbf{Z}}^\top ,\varvec{W}^\top )^\top \in \mathbb {R}^p\) be a p-dimensional random vector. Here \({\textbf{Z}}\in \mathbb {R}^{p_1}\) and \(\varvec{W}\in \mathbb {R}^{p_2}\) with \(p_1+p_2=p\) are two subvectors of \({\textbf{X}}\). In the following, we introduce new correlations to quantify the dependence between \({\textbf{X}}\) and Y and the partial dependence between \(\varvec{W}\) and Y given \({\textbf{Z}}\).

To quantify the relationship between \({\textbf{X}}\) and Y, we propose the following quantile Generalized Measure of Correlation (qGMC)

$$\begin{aligned} \hbox {qGMC}_{\tau }(Y,{\textbf{X}})=1-\frac{\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]}{\mathbb {E}[\rho _{\tau }(Y-\xi _{\tau })]}. \end{aligned}$$
(1)

Here \(\tau \in (0,1)\), \(\rho _{\tau }(u)=(\tau -1/2)u+|u|/2=u\{\tau -I(u<0)\}\ge 0\) is the check loss function, \(q_{\tau }({\textbf{X}})=\arg \min _{\theta }\mathbb {E}[\rho _{\tau }(Y-\theta )|{\textbf{X}}]\) is the \(\tau \)th conditional quantile function of Y given \({\textbf{X}}\), and \(\xi _{\tau }=\arg \min _{\theta }\mathbb {E}[\rho _{\tau }(Y-\theta )]\) is the \(\tau \)th unconditional quantile of Y. In other words, we adopt the relative check loss with and without \({\textbf{X}}\) to measure the correlation between Y and \({\textbf{X}}\). From the definitions of \(q_{\tau }({\textbf{X}})\) and \(\xi _{\tau }\), it is clear that \(\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]\le \mathbb {E}[\rho _{\tau }(Y-\xi _{\tau })]\). Accordingly \(0\le \hbox {qGMC}_{\tau }\le 1\). Recall the definition of the GMC in Zheng et al. (2012), that is,

$$\begin{aligned} \hbox {GMC}(Y,{\textbf{X}})=1-\frac{\mathbb {E}[\{Y-\mathbb {E}(Y|{\textbf{X}})\}^2]}{\mathbb {E}[\{Y-\mathbb {E}(Y)\}^2]}. \end{aligned}$$

It is clear that \(\hbox {qGMC}_{\tau }\) is the quantile version of the GMC and can be interpreted as a quantile version of the nonparametric \(R^2\). Compared with the GMC, the qGMC is robust to outliers and heavy-tailed distributions, and can capture more information by allowing different quantile levels \(\tau \). Further, the non-smoothness of the check loss function \(\rho _{\tau }(u)\) makes the inference for the qGMC more challenging. We will investigate its inference in the next section.
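To make the definition concrete, here is a minimal numerical sketch (ours, not from the paper) evaluating the plug-in version of (1) when the true conditional quantile is known; the location model \(Y=2X+\varepsilon\), the sample size, and the seed are illustrative assumptions:

```python
import numpy as np

def check_loss(u, tau):
    # rho_tau(u) = u * (tau - I(u < 0)), the check (pinball) loss
    return u * (tau - (u < 0))

def qgmc(y, cond_quantile, tau):
    """Plug-in qGMC_tau = 1 - E[rho_tau(Y - q_tau(X))] / E[rho_tau(Y - xi_tau)]."""
    xi = np.quantile(y, tau)                 # unconditional tau-quantile of Y
    num = check_loss(y - cond_quantile, tau).mean()
    den = check_loss(y - xi, tau).mean()
    return 1.0 - num / den

rng = np.random.default_rng(0)
n, tau = 50_000, 0.5
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)             # conditional median q_0.5(X) = 2X
print(qgmc(y, 2.0 * x, tau))                 # strictly between 0 and 1
```

Plugging in \(Y\) itself for the conditional quantile drives the numerator to zero and returns exactly one, matching the boundary interpretations discussed below.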

We further note that our quantile GMC has a genetic interpretation. As discussed in Visscher et al. (2008), the GMC above is also known as the heritability in genetics: the proportion of variation in a phenotype that is attributable to genetic factors. It is a fundamental quantity in genetic applications (Yang et al. 2010; Evans et al. 2018; Speed and Balding 2019). Accordingly, our quantile GMC can be interpreted as a quantile heritability. Compared with the classical heritability, this quantile heritability is robust to outliers and can capture more information.

Besides the overall quantile correlation of \({\textbf{X}}\) and Y, we are also interested in the partial correlation of Y and \(\varvec{W}\) after adjusting for (or controlling for) \({\textbf{Z}}\). Inspired by the above qGMC, we consider the following partial correlation:

$$\begin{aligned} \hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})=1-\frac{\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]}{\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{Z}}))]}. \end{aligned}$$

Here \(q_{\tau }({\textbf{Z}})=\arg \min _{\theta }\mathbb {E}[\rho _{\tau }(Y-\theta )|{\textbf{Z}}]\). If the above defined \(\hbox {qPGMC}_{\tau }\) is small, then \(\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]\) and \(\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{Z}}))]\) are close, and thus the additional introduction of \(\varvec{W}\) leads to little increase in explanatory power. Thus, the \(\hbox {qPGMC}_{\tau }\) can be used to measure the importance of \(\varvec{W}\) in explaining Y given \({\textbf{Z}}\).

Denote \(\epsilon =Y-q_{\tau }({\textbf{X}})\). Let \(f_{\epsilon |{\textbf{X}}}\) be the conditional density function of \(\epsilon \) given \({\textbf{X}}\). Denote \(\delta = Y-\xi _{\tau }\) and let \(f_{\delta }\) be the density function of \(\delta \). Similarly define \(\eta =Y-q_{\tau }({\textbf{Z}})\) and \(f_{\eta |{\textbf{Z}}}\). Before presenting the properties and interpretations of \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\), the following condition is assumed.

  • (C1). There exist constants \(0<c<C<\infty \) such that, for a in a neighborhood of zero, \(c< f_{\epsilon |{\textbf{X}}}(a)< C\). The same assumption is imposed for \(f_{\eta |{\textbf{Z}}}\) and \(f_{\delta }\).

The boundedness of the conditional density function in (C1) is often assumed in quantile regression models; see, for instance, Wang et al. (2009, 2012), Zhang et al. (2020) and Tang et al. (2022).

Now we are ready to present the properties and interpretations of \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\) in the following proposition. Its proof is given in the Appendix.

Proposition 1

Under condition (C1), we have

  1.

    \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\) are two indicators lying between 0 and 1, that is,

    $$\begin{aligned} 0\le \hbox {qGMC}_{\tau }, \hbox {qPGMC}_{\tau }\le 1. \end{aligned}$$
  2.

    \(\hbox {qGMC}_{\tau }=0\) if and only if \(q_{\tau }({\textbf{X}})=\xi _{\tau },\) a.s. That is, \({\textbf{X}}\) does not add any information about the \(\tau \)th conditional quantile of Y.

  3.

    \(\hbox {qPGMC}_{\tau }=0\) if and only if \(q_{\tau }({\textbf{X}})=q_{\tau }({\textbf{Z}}),\) a.s. That is, given \({\textbf{Z}}\), \(\varvec{W}\) does not add any further information about the \(\tau \)th conditional quantile of Y.

  4.

    \(\hbox {qGMC}_{\tau }=1\) and \(\hbox {qPGMC}_{\tau }=1\) if and only if \(Y=q_{\tau }({\textbf{X}})\), a.s. That is, Y is almost surely equal to its \(\tau \)th conditional quantile function \(q_{\tau }({\textbf{X}})\).

  5.

    Further we have

    $$\begin{aligned} \hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})=\frac{\hbox {qGMC}_{\tau }(Y,{\textbf{X}})-\hbox {qGMC}_{\tau }(Y,{\textbf{Z}})}{1-\hbox {qGMC}_{\tau }(Y,{\textbf{Z}})}. \end{aligned}$$
  6.

    Suppose \({\textbf{Z}}\) is a subset of \({\textbf{Z}}'\) and \({\textbf{X}}=({\textbf{Z}}'^\top , \varvec{W}'^\top )^\top \), then we have:

    $$\begin{aligned} \hbox {qGMC}_{\tau }(Y,{\textbf{X}})&\ge \hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})\\&\ge \hbox {qPGMC}_{\tau }(Y,\varvec{W}'|{\textbf{Z}}'). \end{aligned}$$

From the above proposition, we know that the introduced quantile GMC and quantile partial GMC enjoy many interesting features. Firstly, they have simple expressions, which makes them easy to interpret. Secondly, they lie between zero and one, and the values zero and one have clear interpretations as stated in properties (2)–(4). Thirdly, they can capture nonlinear (partial) quantile dependence. Fourthly, the dimensions of \({\textbf{X}}\) and \({\textbf{Z}}\) are not limited to one and are allowed to be very large. The relationship between the quantile partial GMC and the quantile GMC is established in property (5). Since the quantile GMC can be viewed as a quantile generalization of the classical \(R^2\), the quantile partial GMC can then be viewed as the relative difference between the population quantile \(R^2\) obtained using the full set \({\textbf{X}}\) and that obtained using the subset \({\textbf{Z}}\) only. Moreover, property (6) shows that the more variables we condition on, the less additional information the remaining variables provide about the response Y. Further, the quantile partial GMC is always bounded above by the quantile GMC.

In the following, we make a comparison between the quantile correlation and the quantile partial correlation introduced by Li et al. (2015). Actually the quantile correlation takes the following form

$$\begin{aligned} \hbox {qcor}_{\tau }(Y,{\textbf{X}})&=\frac{\mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })({\textbf{X}}-\mathbb {E}{\textbf{X}})]}{\sqrt{(\tau -\tau ^2)\sigma ^2_X}}\\&=\hbox {corr}(I(Y>\xi _{\tau }), {\textbf{X}}), \end{aligned}$$

where \( \psi _{\tau } (u) =\tau -I(u<0)\) and \(\sigma ^2_X=\hbox {var}({\textbf{X}})\). From the above formula, it is clear that the quantile correlation is actually the Pearson correlation of \(I(Y>\xi _{\tau })\) and \({\textbf{X}}\). Besides, the quantile partial correlation is defined as follows,

$$\begin{aligned} \text {qpcor}_{\tau }(Y, \varvec{W}| {\textbf{Z}}) =\frac{\mathbb {E}\left[ \psi _{\tau }\left( Y-\alpha _{2}-\varvec{\beta }_{2}^{\top } {\textbf{Z}}\right) \varvec{W}\right] }{\sqrt{\left( \tau -\tau ^{2}\right) \sigma _{\varvec{W}\mid {\textbf{Z}}}^{2}}}, \end{aligned}$$

where \({(\alpha _{2}, \varvec{\beta }_{2}^{\top })}^{\top }=\arg \min _{\alpha ,\varvec{\beta }} \mathbb {E}\left[ \rho _{\tau }\left( Y-\alpha -\varvec{\beta }^{\top } {\textbf{Z}}\right) \right] \) and \(\sigma _{\varvec{W}\mid {\textbf{Z}}}^{2}=\mathbb {E}(\varvec{W}-\alpha _{1}\) \(-\varvec{\beta }_{1}^{\top } {\textbf{Z}})^{2}\) with \({(\alpha _{1}, \varvec{\beta }_{1}^{\top })}^{\top }=\arg \min _{\alpha , \varvec{\beta }} \mathbb {E}\left( \varvec{W}-\alpha -\varvec{\beta }^{\top } {\textbf{Z}}\right) ^{2}\).

Compared with our introduced quantile GMC, the quantile correlation has two main drawbacks. Firstly and most importantly, \(\hbox {qcor}_{\tau }\) can only measure the linear relationship between Y and \({\textbf{X}}\) at the \(\tau \)th quantile, while our quantile GMC can detect nonlinear quantile correlation. Secondly, \(\hbox {qcor}_{\tau }\) is only applicable to one-dimensional \({\textbf{X}}\), while the dimension of \({\textbf{X}}\) is not limited for our correlation; we can even allow the dimension of \({\textbf{X}}\) to diverge. A similar comparison can be made between \(\hbox {qpcor}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\), and the details are omitted here.

Now we present a simple example to illustrate the drawbacks of \(\hbox {qcor}_{\tau }\) and \(\hbox {qpcor}_{\tau }\).

Example 1

Let \(Y=\varvec{W}^2+ {\textbf{Z}}^4\), where \(\varvec{W}\) and \({\textbf{Z}}\) are independent and both follow N(0, 1). From property (4) in Proposition 1, it is clear that \(\hbox {qGMC}_{\tau }(Y,{\textbf{X}})=\hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})=1\). However \(\hbox {qcor}_{\tau }(Y,\varvec{W})=\hbox {qcor}_{\tau }(Y,{\textbf{Z}})=\hbox {qpcor}_{\tau }(Y,\varvec{W}|{\textbf{Z}})=0\).

In fact note that

$$\begin{aligned} \mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })(\varvec{W}-\mathbb {E}\varvec{W})]&=\mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })\varvec{W}I(\varvec{W}<0)] \\&\quad +\mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })\varvec{W}I(\varvec{W}>0)]\\&=\mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })\varvec{W}I(\varvec{W}<0)] \\&\quad - \mathbb {E}[\psi _{\tau }(Y-\xi _{\tau })\varvec{W}I(\varvec{W}<0)]\\&= 0. \end{aligned}$$

We then have \(\hbox {qcor}_{\tau }(Y,\varvec{W})=0\). Similarly, \(\hbox {qcor}_{\tau }(Y,{\textbf{Z}})=0\). Further, \(\mathbb {E}[\psi _{\tau }(Y-\alpha _{2}\) \(-\varvec{\beta }_{2}^{\top } {\textbf{Z}}) \varvec{W}] = 0\), so \(\hbox {qpcor}_{\tau }(Y, \varvec{W}| {\textbf{Z}}) = 0\). This example illustrates that \(\hbox {qcor}_{\tau }\) and \(\hbox {qpcor}_{\tau }\) cannot detect nonlinear quantile correlation, while our proposed measures can.
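A quick Monte Carlo check of this example (our sketch; the sample size and seed are arbitrary): the sample version of \(\hbox {qcor}_{\tau }(Y,\varvec{W})=\hbox {corr}(I(Y>\xi _{\tau }),\varvec{W})\) is near zero, while the plug-in qGMC with the true conditional quantile \(q_{\tau }({\textbf{X}})=Y\) equals one exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, tau = 200_000, 0.5
w = rng.normal(size=n)
z = rng.normal(size=n)
y = w**2 + z**4                       # Y is a deterministic function of X = (W, Z)

xi = np.quantile(y, tau)
# empirical qcor_tau(Y, W) = corr(I(Y > xi_tau), W); vanishes by symmetry of W
qcor = np.corrcoef((y > xi).astype(float), w)[0, 1]

# plug-in qGMC_tau(Y, X): since q_tau(X) = Y a.s., the numerator is exactly 0
check = lambda u: u * (tau - (u < 0))
qgmc = 1.0 - check(y - y).mean() / check(y - xi).mean()

print(round(qcor, 3), qgmc)           # qcor ~ 0, qgmc = 1.0
```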

3 Statistical inference

In this section, we investigate how to estimate \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\) and construct confidence intervals for them.

Since both the qGMC and qPGMC involve expectations of check losses, it is natural to consider sample averages of the relevant quantities. For the qGMC and qPGMC, we also need to estimate the conditional quantile functions \(q_{\tau }({\textbf{X}})\) and \(q_{\tau }({\textbf{Z}})\). When the dimensions of \({\textbf{X}}\) and \({\textbf{Z}}\) are small, classical nonparametric methods such as kernel smoothing or local polynomial smoothing (Yu and Jones 1998; Kai et al. 2010) can be adopted. However, these classical nonparametric methods generally do not perform well when the dimension is large. To be more flexible, we adopt modern machine learning methods (Meinshausen 2006; Athey et al. 2019; Patidar et al. 2023), which perform well in nonlinear and high-dimensional settings. Now suppose \(\mathcal {D}=\{{\textbf{X}}_i, Y_i\}_{i=1}^N\) is an i.i.d. sample from the population \(({\textbf{X}}, Y)\). Further suppose \(\hat{\xi }_{\tau }\), \(\hat{q}_{\tau }({\textbf{X}}_i)\) and \(\hat{q}_{\tau }({\textbf{Z}}_i)\) are estimators of \(\xi _{\tau }\), \(q_{\tau }({\textbf{X}}_i)\) and \(q_{\tau }({\textbf{Z}}_i)\), respectively. We then estimate \(\hbox {qGMC}_{\tau }(Y,{\textbf{X}})\) and \(\hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})\) as follows:

$$\begin{aligned} \widetilde{\hbox {qGMC}}_{\tau }&=\frac{\sum _{i=1}^N[\rho _{\tau }(Y_i-\hat{\xi }_{\tau })-\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{X}}_i))]}{\sum _{i=1}^N\rho _{\tau }(Y_i-\hat{\xi }_{\tau })};\nonumber \\ \widetilde{\hbox {qPGMC}}_{\tau }&=\frac{\sum _{i=1}^N[\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{Z}}_i))-\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{X}}_i))]}{\sum _{i=1}^N\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{Z}}_i))}. \end{aligned}$$
(2)
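The estimators in (2) are ratios of averaged check losses. A sketch (ours) that takes fitted conditional quantiles from any learner as inputs; to keep it self-contained we plug in the true quantiles of an illustrative linear model:

```python
import numpy as np

def check_loss(u, tau):
    return u * (tau - (u < 0))

def qgmc_qpgmc(y, qx_hat, qz_hat, tau):
    """Plug-in estimators (2). qx_hat / qz_hat: fitted tau-th conditional
    quantiles of Y given the full X and given the subset Z only."""
    xi_hat = np.quantile(y, tau)
    loss_xi = check_loss(y - xi_hat, tau).sum()
    loss_x = check_loss(y - qx_hat, tau).sum()
    loss_z = check_loss(y - qz_hat, tau).sum()
    return (loss_xi - loss_x) / loss_xi, (loss_z - loss_x) / loss_z

# toy model: Y = Z + W + eps, all standard normal; at tau = 0.5 the true
# conditional medians are q(X) = Z + W and q(Z) = Z
rng = np.random.default_rng(2)
n = 100_000
z_, w_ = rng.normal(size=n), rng.normal(size=n)
y = z_ + w_ + rng.normal(size=n)
g, pg = qgmc_qpgmc(y, z_ + w_, z_, tau=0.5)
print(g, pg)   # both in (0, 1), with qGMC >= qPGMC as in Proposition 1(6)
```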

To establish the asymptotic properties of \(\widetilde{\hbox {qGMC}}_{\tau }\) and \(\widetilde{\hbox {qPGMC}}_{\tau }\), the following condition is imposed.

  • (A1). \(\sum _{i=1}^N[\hat{q}_{\tau }({\textbf{X}}_i)-q_{\tau }({\textbf{X}}_i)]^2/N=o_p(1)\). The same assumption is imposed for \(\hat{q}_{\tau }({\textbf{Z}})\) and \(\hat{\xi }_{\tau }\).

Clearly condition (A1) here is mild; it only requires that the conditional quantile functions can be estimated consistently. In their condition (A3), Zhao et al. (2019) imposed a certain convergence rate on the estimators of \(q_{\tau }({\textbf{X}})\). The following theorem establishes the consistency of the estimators \(\widetilde{\hbox {qGMC}}_{\tau }\) and \(\widetilde{\hbox {qPGMC}}_{\tau }\).

Theorem 2

Under condition (A1), we have

$$\begin{aligned} {\widetilde{\hbox {qGMC}}}_{\tau }&=\hbox {qGMC}_{\tau }+o_p(1);\\ \widetilde{\hbox {qPGMC}}_{\tau }&=\hbox {qPGMC}_{\tau }+o_p(1). \end{aligned}$$

Although the above estimators enjoy consistency, they cannot be adopted to construct confidence intervals since they do not have asymptotic normality. Actually, terms like \(\sum _{i=1}^N[\hat{q}_{\tau }({\textbf{X}}_i)-q_{\tau }({\textbf{X}}_i)][I(\epsilon _i\le 0)-\tau ]/N\) are not \(o_p(N^{-1/2})\). Here \(\epsilon =Y-q_{\tau }({\textbf{X}})\). To make these terms of smaller order, a sample-splitting strategy can be adopted. To be precise, the data are randomly split into two parts \(\mathcal {D}_1\) and \(\mathcal {D}_2\) with sample size \(n=N/2\). We use \(\mathcal {D}_1\) to estimate \(q_{\tau }({\textbf{X}})\) and \(q_{\tau }({\textbf{Z}})\), and denote the estimators as \(\hat{q}_{\tau ,1}({\textbf{X}}_i)\) and \(\hat{q}_{\tau ,1}({\textbf{Z}}_i)\). We still estimate \(\xi _{\tau }\) using the whole data set \(\mathcal {D}\) and denote the estimator as \(\hat{\xi }_{\tau }\). We then estimate \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\) as follows:

$$\begin{aligned} \widehat{\hbox {qGMC}}_1&=\frac{\sum _{i\in \mathcal {D}_2}[\rho _{\tau }(Y_i-\hat{\xi }_{\tau })-\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{X}}_i))]}{\sum _{i\in \mathcal {D}_2}\rho _{\tau }(Y_i-\hat{\xi }_{\tau })};\nonumber \\ \widehat{\hbox {qPGMC}}_1&=\frac{\sum _{i\in \mathcal {D}_2}[\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{Z}}_i))-\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{X}}_i))]}{\sum _{i\in \mathcal {D}_2}\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{Z}}_i))}. \end{aligned}$$
(3)

We further adopt the refitted cross-validation (RCV) idea in Fan et al. (2012) to enhance estimation efficiency. Actually, one can reverse the training set and the validation set, construct two estimators of \({\hbox {qGMC}}_{\tau }\) and \({\hbox {qPGMC}}_{\tau }\) and average the pairs. To be more specific, define \(\widehat{\hbox {qGMC}}_{\tau }=(\widehat{\hbox {qGMC}}_1+ \widehat{\hbox {qGMC}}_2)/2\), where \(\widehat{\hbox {qGMC}}_2\) is similarly defined by swapping the role of \(\mathcal {D}_1\) and \(\mathcal {D}_2\). The same procedure can be applied to estimate \({\hbox {qPGMC}}_{\tau }\) and the details are omitted here.

We summarize our procedure of computing \({\hbox {qGMC}}_{\tau }\) and \({\hbox {qPGMC}}_{\tau }\) in Algorithm 1.

Algorithm 1

Estimation procedures based on data splitting
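A compact sketch of the data-splitting procedure with refitted cross-validation (our illustration; the k-nearest-neighbour quantile learner, the model, and all constants are stand-ins for whichever machine learning quantile method is used in practice):

```python
import numpy as np

def check_loss(u, tau):
    return u * (tau - (u < 0))

def knn_quantile(x_train, y_train, x_eval, tau, k=50):
    """Crude k-nearest-neighbour conditional tau-quantile: a stand-in
    for any machine-learning quantile learner."""
    preds = np.empty(len(x_eval))
    for i, xi in enumerate(x_eval):
        d = np.linalg.norm(x_train - xi, axis=1)
        nn = np.argpartition(d, k)[:k]
        preds[i] = np.quantile(y_train[nn], tau)
    return preds

def qgmc_crossfit(x, y, tau, seed=0):
    """Algorithm 1 in spirit: split, fit on one half, evaluate the check
    losses on the other half, swap the halves, and average (RCV)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    halves = np.array_split(idx, 2)
    xi_hat = np.quantile(y, tau)          # xi_tau uses the whole sample
    est = []
    for a, b in [(0, 1), (1, 0)]:
        tr, te = halves[a], halves[b]
        q_hat = knn_quantile(x[tr], y[tr], x[te], tau)
        num = check_loss(y[te] - q_hat, tau).sum()
        den = check_loss(y[te] - xi_hat, tau).sum()
        est.append(1.0 - num / den)
    return 0.5 * (est[0] + est[1])

rng = np.random.default_rng(3)
x = rng.normal(size=(2000, 2))
y = x[:, 0] + rng.normal(size=2000)
print(qgmc_crossfit(x, y, tau=0.5))       # positive and below 1
```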

To establish the asymptotic distributions of \(\widehat{\hbox {qGMC}}_{\tau }\) and \(\widehat{\hbox {qPGMC}}_{\tau }\), the following condition is imposed.

  • (C2). \(\mathbb {E}[\hat{q}_{\tau ,k}({\textbf{X}})-q_{\tau }({\textbf{X}})]^2=o(n^{-1/2}), k=1,2\). The same assumption is imposed for \(\hat{q}_{\tau ,k}({\textbf{Z}})\) and \(\hat{\xi }_{\tau }\).

The conditions \(\mathbb {E}[\hat{q}_{\tau ,k}({\textbf{X}})-q_{\tau }({\textbf{X}})]^2=o(n^{-1/2})\) and \(\mathbb {E}[\hat{q}_{\tau ,k}({\textbf{Z}})-q_{\tau }({\textbf{Z}})]^2=o(n^{-1/2})\) in (C2) are mild conditions on the estimation errors of the estimators of the unknown functions, and they are attainable by many machine learning methods. Among many others, see condition (A3) in Zhao et al. (2019), Theorem 2 in Shen et al. (2021), and Theorem 3.1 in Zhong and Wang (2023).

Theorem 3

Under conditions (C1)–(C2), we have

$$\begin{aligned} \sqrt{N}(\widehat{\hbox {qGMC}}_{\tau }-{\hbox {qGMC}}_{\tau })&=\frac{1}{\sqrt{N}} \sum _{i=1}^NL_i+o_p(1)\\&\overset{d}{\rightarrow }\ N(0,\hbox {Var}(L)); \\ \sqrt{N}(\widehat{\hbox {qPGMC}}_{\tau }-{\hbox {qPGMC}}_{\tau })&=\frac{1}{\sqrt{N}} \sum _{i=1}^N K_i+o_p(1)\\&\overset{d}{\rightarrow }\ N(0,\hbox {Var}(K)). \end{aligned}$$

Here \(\overset{d}{\rightarrow }\) denotes convergence in distribution. Further

$$\begin{aligned} L_i&=\frac{\rho _{\tau }(Y_i-\xi _{\tau })-\rho _{\tau }(Y_i-q_{\tau }({\textbf{X}}_i)) }{\mathbb {E}[\rho _{\tau }(Y-\xi _{\tau })]}\\&\quad -\frac{\mathbb {E}[\rho _{\tau }(Y-\xi _{\tau })-\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]}{\mathbb {E}^2[\rho _{\tau }(Y-\xi _{\tau })]}\rho _{\tau }(Y_i-\xi _{\tau });\\ K_i&=\frac{\rho _{\tau }(Y_i-q_{\tau }({\textbf{Z}}_i))-\rho _{\tau }(Y_i-q_{\tau }({\textbf{X}}_i)) }{\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{Z}}))]}\\&\quad -\frac{\mathbb {E}[\rho _{\tau }(Y-q_{\tau }({\textbf{Z}}))-\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]}{\mathbb {E}^2[\rho _{\tau }(Y-q_{\tau }({\textbf{Z}}))]} \rho _{\tau }(Y_i-q_{\tau }({\textbf{Z}}_i)). \end{aligned}$$

The above theorem establishes the asymptotic normalities of \(\widehat{\hbox {qGMC}}_{\tau }\) and \(\widehat{\hbox {qPGMC}}_{\tau }\). Based on it, we can construct confidence intervals for our introduced correlations. However, to make inference for \({\hbox {qGMC}}_{\tau }\), we also need to estimate \(\hbox {Var}(L)\), and \(L_i\) involves unknown parameters and unknown functions. To this aim, for \(i\in \mathcal {D}_2\), we can estimate \(L_i\) by

$$\begin{aligned} \hat{L}_i&=\frac{\rho _{\tau }(Y_i-\hat{\xi }_{\tau })-\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{X}}_i)) }{\mathbb {E}_n[\rho _{\tau }(Y-\xi _{\tau })]}\\&\quad -\frac{\mathbb {E}_n[\rho _{\tau }(Y-\xi _{\tau })-\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]}{\mathbb {E}_n^2[\rho _{\tau }(Y-\xi _{\tau })]}\rho _{\tau }(Y_i-\hat{\xi }_{\tau }). \end{aligned}$$

Here

$$\begin{aligned} \mathbb {E}_n[\rho _{\tau }(Y-\xi _{\tau })]&=\frac{1}{N}\sum _{i=1}^N\rho _{\tau }(Y_i-\hat{\xi }_{\tau });\\ \mathbb {E}_n[\rho _{\tau }(Y-q_{\tau }({\textbf{X}}))]&=\frac{1}{n}\sum _{i\in \mathcal {D}_2}\rho _{\tau }(Y_i-\hat{q}_{\tau ,1}({\textbf{X}}_i)). \end{aligned}$$

We can estimate \(L_i, i\in \mathcal {D}_1\) similarly by swapping the role of \(\mathcal {D}_1\) and \(\mathcal {D}_2\). Then we estimate \(\hbox {Var}(L)\) by

$$\begin{aligned} \widehat{\hbox {Var}}(L)=\frac{1}{N}\sum _{i=1}^N \hat{L}_i^2. \end{aligned}$$

Then by Slutsky's theorem, a \((1-\alpha )\) confidence interval for \({\hbox {qGMC}}_{\tau }\) can be constructed as follows:

$$\begin{aligned} \left( \widehat{\hbox {qGMC}}_{\tau }-z_{\alpha /2}\cdot \sqrt{\frac{\widehat{\hbox {Var}}(L)}{N}},\ \widehat{\hbox {qGMC}}_{\tau }+z_{\alpha /2}\cdot \sqrt{\frac{\widehat{\hbox {Var}}(L)}{N}}\right) . \end{aligned}$$

The same steps can be applied to construct a confidence interval for \({\hbox {qPGMC}}_{\tau }\); the details are omitted here.
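The variance estimator and the resulting interval can be sketched as follows (our sketch; `qx_hat` stands for cross-fitted conditional quantiles \(\hat q_{\tau ,k}({\textbf{X}}_i)\), and for self-containedness the usage example plugs in the true quantiles of a toy model):

```python
import numpy as np

def check_loss(u, tau):
    return u * (tau - (u < 0))

def qgmc_ci(y, qx_hat, tau):
    """Point estimate and normal 95% CI for qGMC_tau based on the
    estimated influence values L_hat_i."""
    xi_hat = np.quantile(y, tau)
    r_xi = check_loss(y - xi_hat, tau)        # rho_tau(Y_i - xi_hat)
    r_x = check_loss(y - qx_hat, tau)         # rho_tau(Y_i - qhat(X_i))
    m_xi, m_x = r_xi.mean(), r_x.mean()
    est = 1.0 - m_x / m_xi
    L_hat = (r_xi - r_x) / m_xi - (m_xi - m_x) / m_xi**2 * r_xi
    half = 1.96 * np.sqrt(np.mean(L_hat**2) / len(y))   # z_{0.025} ~ 1.96
    return est, (est - half, est + half)

rng = np.random.default_rng(4)
n = 20_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                    # q_0.5(X) = X
est, (lo, hi) = qgmc_ci(y, x, tau=0.5)
print(round(est, 3), (round(lo, 3), round(hi, 3)))
```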

4 Composite quantile GMC

The quantile correlation measures introduced above focus on a specific quantile level. To be more comprehensive, we can propose the following composite quantile GMC and composite quantile partial GMC, denoted as cqGMC and cqPGMC:

$$\begin{aligned} {\hbox {cqGMC}}&=\int _0^1 {\hbox {qGMC}}_{\tau }\,\omega (\tau )d\tau ,\nonumber \\ {\hbox {cqPGMC}}&=\int _0^1 {\hbox {qPGMC}}_{\tau }\,\omega (\tau )d\tau . \end{aligned}$$
(4)

Here, \(\omega (\tau )>0\) is a suitable weight function over (0, 1) satisfying \(\int _0^1\omega (\tau )d\tau =1\). There are different choices for the weight function. If we want to weight the central quantile levels more heavily, we may consider \(\omega (\tau )=c\tau (1-\tau )\), where c is a normalizing constant. If we do not have any prior preference, we may consider \(\omega (\tau )=1\) for \(\tau \in (0,1)\), which results in an unweighted average over all quantile levels. This choice has been adopted and recommended in Zou and Yuan (2008) and Kai et al. (2010). It is difficult to determine the optimal weight. For simplicity, we follow Zou and Yuan (2008) and Kai et al. (2010) and take \(\omega (\tau )=1\).
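With \(\omega (\tau )=1\), the integral in (4) is simply an average of \(\hbox {qGMC}_{\tau }\) over quantile levels. A discretized sketch (ours; the grid \(\tau _j=j/30\), the location model, and the seed are illustrative assumptions):

```python
import numpy as np
from statistics import NormalDist

def check_loss(u, tau):
    return u * (tau - (u < 0))

def cqgmc(y, cond_quantiles, taus):
    """Composite qGMC with omega = 1: average qGMC_tau over a level grid.
    cond_quantiles[j] holds the tau_j-th conditional quantiles of Y given X."""
    vals = []
    for q_j, tau in zip(cond_quantiles, taus):
        xi = np.quantile(y, tau)
        vals.append(1.0 - check_loss(y - q_j, tau).mean()
                        / check_loss(y - xi, tau).mean())
    return float(np.mean(vals))

rng = np.random.default_rng(5)
n = 50_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)            # q_tau(X) = X + z_tau with z_tau = Phi^{-1}(tau)
taus = np.arange(1, 30) / 30          # a grid of B = 29 interior levels
quants = [x + NormalDist().inv_cdf(t) for t in taus]
print(cqgmc(y, quants, taus))
```

For this Gaussian location model the population \(\hbox {qGMC}_{\tau }\) is the same at every level, so the composite value is close to the common value \(1-1/\sqrt{2}\).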

From Proposition 1, we have the following results.

Proposition 4

For the above introduced \({\hbox {cqGMC}}\) and \({\hbox {cqPGMC}}\), under condition (C1), we have

  1.

    \({\hbox {cqGMC}}\) and \({\hbox {cqPGMC}}\) are two indicators lying between 0 and 1, that is,

    $$\begin{aligned} 0\le {\hbox {cqGMC}},\quad {\hbox {cqPGMC}}\le 1. \end{aligned}$$
  2.

    \({\hbox {cqGMC}}=0\) if and only if \(q_{\tau }({\textbf{X}})=\xi _{\tau }\) for all quantile levels, a.s. That is, \({\textbf{X}}\) is independent of Y.

  3.

    \({\hbox {cqPGMC}}=0\) if and only if \(q_{\tau }({\textbf{X}})=q_{\tau }({\textbf{Z}})\) for all quantile levels, a.s. That is, given \({\textbf{Z}}\), \(\varvec{W}\) is conditionally independent of Y.

  4.

    \({\hbox {cqGMC}}=1\) and \({\hbox {cqPGMC}}=1\) if and only if \(Y=q_{\tau }({\textbf{X}})\) for all quantile levels a.s.

From the above proposition, we know that by integrating information from different quantile levels, the composite quantile GMC and composite quantile partial GMC can characterize independence and conditional independence.

We can estimate \({\hbox {cqGMC}}\) and \({\hbox {cqPGMC}}\) by

$$\begin{aligned} \widehat{\hbox {cqGMC}}&=\frac{1}{B}\sum _{j=1}^B\widehat{\hbox {qGMC}}_{\tau _j}\,\omega (\tau _j);\\ \widehat{\hbox {cqPGMC}}&=\frac{1}{B}\sum _{j=1}^B \widehat{\hbox {qPGMC}}_{\tau _j}\,\omega (\tau _j). \end{aligned}$$

Here \(0<\tau _1<\tau _2<\cdots<\tau _B<1\) are B chosen quantile levels. In practice we set B to be 29. The asymptotic normalities of \(\widehat{\hbox {cqGMC}}\) and \(\widehat{\hbox {cqPGMC}}\) can be derived as follows.

Define \(\epsilon _j=Y-q_{\tau _j}({\textbf{X}})\). Let \(f_{\epsilon _j|{\textbf{X}}}\) be the conditional density function of \(\epsilon _j\) given \({\textbf{X}}\). Let \(\delta _j = Y-\xi _{\tau _j}\) and \(f_{\delta _j}\) be the density function of \(\delta _j\). Similarly define \(\eta _j=Y-q_{\tau _j}({\textbf{Z}})\) and \(f_{\eta _j|{\textbf{Z}}}\). Further let \(\hat{q}_{k\tau _j}({\textbf{X}}), \hat{q}_{k\tau _j}({\textbf{Z}})\) be estimators of \(q_{\tau _j}({\textbf{X}}), q_{\tau _j}({\textbf{Z}})\) based on \(\mathcal {D}_k\). Also let \(\hat{\xi }_{\tau _j}\) be the estimator of \(\xi _{\tau _j}\) based on the whole data. To establish the asymptotic distributions of \(\widehat{\hbox {cqGMC}}\) and \(\widehat{\hbox {cqPGMC}}\), the following two conditions are required.

  • (C1’). There exist constants \(0<c<C<\infty \) such that, for a in a neighborhood of zero, \(c< f_{\epsilon _j|{\textbf{X}}}(a)< C\). The same assumption is imposed for \(f_{\eta _j|{\textbf{Z}}}\) and \(f_{\delta _j}\).

  • (C2’). \(\mathbb {E}[\hat{q}_{k\tau _j}({\textbf{X}})-q_{\tau _j}({\textbf{X}})]^2=o(n^{-1/2}), k=1,2\). The same assumption is imposed for \(\hat{q}_{k\tau _j}({\textbf{Z}})\) and \(\hat{\xi }_{\tau _j}\).

Clearly these two conditions (C1’) and (C2’) are in the same spirit as conditions (C1) and (C2).

Theorem 5

Under conditions (C1’)–(C2’), we have

$$\begin{aligned} \sqrt{N}(\widehat{\hbox {cqGMC}}-{\hbox {cqGMC}})&=\frac{1}{\sqrt{N}} \sum _{i=1}^N L_{ci}+o_p(1)\\&\overset{d}{\rightarrow }\ N(0,\hbox {Var}(L_{c}));\\ \sqrt{N}(\widehat{\hbox {cqPGMC}}-{\hbox {cqPGMC}})&=\frac{1}{\sqrt{N}} \sum _{i=1}^N K_{ci}+o_p(1)\\&\overset{d}{\rightarrow }\ N(0,\hbox {Var}(K_{c})). \end{aligned}$$

Here

$$\begin{aligned} L_{ci}&= \frac{1}{B}\sum _{j=1}^B\left[ \frac{\rho _{\tau _j}(Y_i-\xi _{\tau _j})-\rho _{\tau _j}(Y_i-q_{\tau _j}({\textbf{X}}_i)) }{\mathbb {E}\{\rho _{\tau _j}(Y-\xi _{\tau _j})\}}\right. \\&\quad \left. -\frac{\mathbb {E}\{\rho _{\tau _j}(Y-\xi _{\tau _j})-\rho _{\tau _j}(Y-q_{\tau _j}({\textbf{X}}))\}}{\mathbb {E}^2\{\rho _{\tau _j}(Y-\xi _{\tau _j})\}}\rho _{\tau _j}(Y_i-\xi _{\tau _j})\right] \omega (\tau _j);\\ K_{ci}&= \frac{1}{B}\sum _{j=1}^B\left[ \frac{\rho _{\tau _j}(Y_i-q_{\tau _j}({\textbf{Z}}_i))-\rho _{\tau _j}(Y_i-q_{\tau _j}({\textbf{X}}_i)) }{\mathbb {E}\{\rho _{\tau _j}(Y-q_{\tau _j}({\textbf{Z}}))\}}\right. \\&\quad \left. -\frac{\mathbb {E}\{\rho _{\tau _j}(Y-q_{\tau _j}({\textbf{Z}}))-\rho _{\tau _j}(Y-q_{\tau _j}({\textbf{X}}))\}}{\mathbb {E}^2\{\rho _{\tau _j}(Y-q_{\tau _j}({\textbf{Z}}))\}}\rho _{\tau _j}(Y_i-q_{\tau _j}({\textbf{Z}}_i))\right] \omega (\tau _j). \end{aligned}$$

This theorem can be proven based on Theorem 3, and thus the proof is omitted here. Confidence intervals can be constructed similarly and the details are omitted.

Remark 1

Motivated by regional quantile regression methods (Zheng et al. 2015; Park and He 2017; Yoshida 2021; Park et al. 2023), we can further explore the applicability of (4) within a specific quantile-level region \(\Delta \subseteq (0,1)\), instead of over all quantile levels in (0, 1). To be precise, we introduce the following regional quantile GMC and regional quantile partial GMC, denoted as rqGMC and rqPGMC, for any quantile-level region \(\Delta \):

$$\begin{aligned} {\hbox {rqGMC}}&=\frac{\int _\Delta {\hbox {qGMC}}_{\tau }\,\omega (\tau )d\tau }{\int _{\Delta }\omega (\tau )d\tau },\nonumber \\ {\hbox {rqPGMC}}&=\frac{\int _\Delta {\hbox {qPGMC}}_{\tau }\,\omega (\tau )d\tau }{\int _{\Delta }\omega (\tau )d\tau }. \end{aligned}$$
(5)
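In practice, the integrals in (5) can be approximated by a Riemann sum over a grid in \(\Delta \). The following is a minimal sketch, assuming a fitted per-level estimate `qgmc` and a weight function `weight` are available; both names are hypothetical stand-ins, not part of our implementation:

```python
# Sketch: approximate rqGMC in Eq. (5) by a midpoint Riemann sum over Delta.
# `qgmc` and `weight` are hypothetical callables standing in for an estimate
# of qGMC_tau and the weight function omega(tau).

def rqgmc(qgmc, weight, delta=(0.05, 0.2), grid_size=50):
    lo, hi = delta
    # midpoints of grid_size equal subintervals of Delta
    taus = [lo + (hi - lo) * (k + 0.5) / grid_size for k in range(grid_size)]
    num = sum(qgmc(t) * weight(t) for t in taus)
    den = sum(weight(t) for t in taus)
    return num / den
```

Since both numerator and denominator use the same grid, the interval length cancels and only the weighted average over \(\Delta \) remains.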

In this way, we can focus on specific quantile-level regions of interest. For instance, to examine the lower quantiles, we may choose \(\Delta =[0.05, 0.2]\); conversely, if we are interested in the higher quantiles, the interval \(\Delta =[0.75, 0.9]\) can be considered. The properties and confidence intervals for \({\hbox {rqGMC}}\) and \({\hbox {rqPGMC}}\) can be developed similarly. The details are omitted here.

5 Numerical studies

In this section, we present some numerical results to assess the finite sample performance of our proposed measures.

Example 1

Linear model:

$$\begin{aligned} Y_{i}&={\textbf{Z}}_{i}^\top \varvec{\beta }_1 + \varvec{W}_{i}^\top \varvec{\theta }_1 + \frac{1}{4} W_{i1}\varepsilon _{i}\\&\quad + \frac{1}{5} W_{i2}\varepsilon _{i},\quad i=1,2,\dots ,N. \end{aligned}$$

Example 2

Linear model:

$$\begin{aligned} Y_{i}&={\textbf{Z}}_{i}^\top \varvec{\beta }_2 + \varvec{W}_{i}^\top \varvec{\theta }_2 + \frac{1}{4} W_{i1}\varepsilon _{i} \\&\quad + \frac{1}{5} W_{i2}\varepsilon _{i},\quad i=1,2,\dots ,N. \end{aligned}$$

Example 3

Nonlinear model:

$$\begin{aligned} Y_{i}&=\sin ({\textbf{Z}}_{i}^\top \varvec{\beta }_2) +\sin (\varvec{W}_{i}^\top \varvec{\theta }_2) + \frac{1}{4} W_{i1}\varepsilon _{i} \\&\quad + \frac{1}{5} W_{i2}\varepsilon _{i}, \quad i=1,2,\dots ,N. \end{aligned}$$

The predictor \({\textbf{X}}_{i}=({\textbf{Z}}_{i}^\top , \varvec{W}_{i}^\top )^\top \), with \({\textbf{Z}}_{i}\in \mathbb {R}^{p_1}\), \(\varvec{W}_{i}\in \mathbb {R}^{p_2}\) and \(p_1=p_2=p/2\), is drawn from the multivariate normal distribution \(N_p({\varvec{0}}_p,{\varvec{\Sigma }})\), and \(W_{ij}\) denotes the jth element of \(\varvec{W}_{i}\). The covariance matrix \({\varvec{\Sigma }}\) is a \(p\times p\) matrix with the Toeplitz structure \(({\varvec{\Sigma }})_{ij}= 0.5^{|i-j|} \) for \(i,j=1,2,\ldots ,p\). Furthermore, the random error term \(\varepsilon _{i}\) is independent of \(({\textbf{Z}}_{i}, \varvec{W}_{i})\). There are two settings for the distribution of \(\varepsilon _{i}\):

  • Setting 1: \(\varepsilon _{i}\) follows the Gaussian mixture distribution \( b N(0,1) + (1-b)N(0,8)\) with \(b \sim B(1,0.95)\), i.e. a Bernoulli variable with success probability 0.95.

  • Setting 2: \(\varepsilon _{i}\) is simulated from the t-distribution with 3 degrees of freedom, i.e. \( \varepsilon _{i}\sim t_3\).

The regression coefficients \(\varvec{\beta }_1\) and \(\varvec{\theta }_1\) are set as \(\beta _{1j} = b_0\) for \(1\le j\le p_1\) and \(\theta _{1j} = c_0\) for \(1\le j\le p_2\). Similarly, let \({\beta }_{2j} = d_0\) for \(1\le j\le 3\) and \({\beta }_{2j} = 0\) otherwise, and set \(\theta _{2j} = e_0\) for \(1\le j\le 3\) and \(\theta _{2j} = 0\) otherwise. Throughout the simulation study, let \(\left\| \varvec{\beta }_1 \right\| = \left\| \varvec{\beta }_2 \right\| =1/\sqrt{3}\) and \(\left\| \varvec{\theta }_1 \right\| = \left\| \varvec{\theta }_2 \right\| =1/\sqrt{2}\). We use the Python package xgboost (version 2.0.3) to estimate the unknown quantile functions \(q_{\tau }({\textbf{X}})\) and \(q_{\tau }({\textbf{Z}})\). Specifically, in the xgboost configuration, we set objective to reg:quantileerror, tree_method to hist, and quantile_alpha to \(\tau \). The default hyperparameter settings for estimating \(q_{\tau }({\textbf{X}})\) are eta = 0.04 and max_depth = 1; for estimating \(q_{\tau }({\textbf{Z}})\), we use eta = 0.01 and max_depth = 1.
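For reference, the xgboost configuration described above can be sketched as the following parameter dictionary; the commented training call is only a hypothetical usage sketch, assuming xgboost (version \(\ge \) 2.0) is installed:

```python
# Sketch of the xgboost quantile-regression configuration described above
# for estimating q_tau(X); parameter values follow the text.

tau = 0.25  # quantile level

params_qx = {
    "objective": "reg:quantileerror",  # pinball (check) loss objective
    "tree_method": "hist",
    "quantile_alpha": tau,             # target quantile level tau
    "eta": 0.04,                       # learning rate; 0.01 when estimating q_tau(Z)
    "max_depth": 1,                    # depth-1 trees (stumps)
}

# Hypothetical usage (requires xgboost >= 2.0):
# import xgboost as xgb
# booster = xgb.train(params_qx, xgb.DMatrix(X, label=y))
# q_hat = booster.predict(xgb.DMatrix(X_new))
```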

Due to the nonlinearity of the quantiles, it is not straightforward to calculate the true value of \(\hbox {qGMC}_\tau \) or the other criteria. To obtain the true values, we therefore use a sampling approach: we generate a sample of very large size, say \(N_0=100{,}000 \), from the data-generating model. Based on this large sample, we estimate \(\xi _{\tau }, q_{\tau }({\textbf{X}})\) and \(q_{\tau }({\textbf{Z}})\), and then approximate \(\hbox {qGMC}_{\tau }(Y,{\textbf{X}})\) and \(\hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})\) as follows:

$$\begin{aligned}&{\hbox {qGMC}}_{\tau }\approx \frac{\sum _{i=1}^{N_0}[\rho _{\tau }(Y_i-\hat{\xi }_{\tau })-\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{X}}_i))]}{\sum _{i=1}^{N_0}\rho _{\tau }(Y_i-\hat{\xi }_{\tau })};\\&{\hbox {qPGMC}}_{\tau }\approx \frac{\sum _{i=1}^{N_0}[\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{Z}}_i))-\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{X}}_i))]}{\sum _{i=1}^{N_0}\rho _{\tau }(Y_i-\hat{q}_{\tau }({\textbf{Z}}_i))}. \end{aligned}$$

That is, we approximate the true values of \(\hbox {qGMC}_{\tau }(Y,{\textbf{X}})\) and \(\hbox {qPGMC}_{\tau }(Y,\varvec{W}|{\textbf{Z}})\) with the same estimation formulas as in Eq. (2), but with a very large sample size. Since the data-generating model is known in the simulations, we can easily obtain a sample of arbitrarily large size. When \(N_0\) is sufficiently large, the sample is close to the underlying population, and thus the above values can be regarded as the true values.
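The approximation above amounts to evaluating plug-in ratio estimators built from the check loss \(\rho _\tau (u)=u(\tau -{\textbf{1}}\{u<0\})\). A minimal sketch in plain Python, where the fitted quantiles are passed in as lists (in our simulations they come from xgboost as described above):

```python
# Minimal sketch of the plug-in estimators above: rho_tau is the check
# (pinball) loss, and qgmc/qpgmc are the ratio estimators; xi_hat, qx_hat,
# and qz_hat are pre-computed fitted quantiles.

def rho_tau(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def qgmc(y, xi_hat, qx_hat, tau):
    """qGMC_tau: relative reduction in check loss from using q_tau(X)."""
    base = sum(rho_tau(yi - xi_hat, tau) for yi in y)
    fit = sum(rho_tau(yi - qi, tau) for yi, qi in zip(y, qx_hat))
    return (base - fit) / base

def qpgmc(y, qz_hat, qx_hat, tau):
    """qPGMC_tau: reduction in check loss of q_tau(X) relative to q_tau(Z)."""
    base = sum(rho_tau(yi - qi, tau) for yi, qi in zip(y, qz_hat))
    fit = sum(rho_tau(yi - qi, tau) for yi, qi in zip(y, qx_hat))
    return (base - fit) / base
```

Note the boundary behavior: if \(\hat{q}_\tau ({\textbf{X}}_i)\) fits the data perfectly the value is 1, and if it offers no improvement over \(\hat{\xi }_\tau \) the value is 0.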

In our simulation study, our goal is to build confidence intervals for \(\hbox {qGMC}_{\tau }\), \(\hbox {qPGMC}_{\tau }\), \(\hbox {cqGMC}\) and \(\hbox {cqPGMC}\) under different circumstances. In all simulations, we evaluate the coverage probabilities (CP) and the average lengths (AL) of 95% confidence intervals. We set \(p = 4, 6, 200, 500\) and vary the values of sample size \(N=2n\) over \(n = 200, 500\). To make inference for the \({\hbox {qGMC}}_{\tau }\) and \({\hbox {qPGMC}}_{\tau }\) of the simulated data in high-dimensional settings with \(p = 200\) and 500, we employ stability selection (Meinshausen and Bühlmann 2010) on the simulated data. This process aims to reduce the dimensionality of predictors, enabling us to train models using the reduced set of predictors.
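The two evaluation criteria can be sketched as follows; `intervals` is a hypothetical list of (lower, upper) pairs, one confidence interval per simulation run:

```python
# Sketch of the evaluation criteria: CP is the fraction of simulated 95%
# intervals containing the true value, AL is the mean interval length.

def coverage_and_length(intervals, true_value):
    cover = [lo <= true_value <= hi for lo, hi in intervals]
    lengths = [hi - lo for lo, hi in intervals]
    cp = sum(cover) / len(intervals)
    al = sum(lengths) / len(intervals)
    return cp, al
```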

We use the quantile levels \(\tau = 0.25, 0.5, 0.75\) in \(\hbox {qGMC}_{\tau }\) and \(\hbox {qPGMC}_{\tau }\), and the quantiles \(\tau _j = j/(B+1) \) for \(j=1,2,\dots ,B\) with \(B = 29\) in \(\hbox {cqGMC}\) and \(\hbox {cqPGMC}\). The results are calculated based on 500 simulation runs.
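The quantile grid and the composite averaging can be sketched as below; the equal-weight default mirrors the construction above, while any other weight vector is an assumption for illustration:

```python
# Sketch of the composite construction above: the equispaced grid
# tau_j = j/(B+1), j = 1, ..., B, with cqGMC as the weighted average of the
# per-level qGMC values (equal weights by default).

B = 29
taus = [j / (B + 1) for j in range(1, B + 1)]

def cqgmc(qgmc_values, weights=None):
    w = weights if weights is not None else [1.0] * len(qgmc_values)
    return sum(g * wi for g, wi in zip(qgmc_values, w)) / sum(w)
```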

The corresponding numerical results are presented in Tables 1, 2, 3 and 4. They provide strong evidence corroborating the asymptotic theory. Indeed, for all examples, settings, and quantile levels, the coverage probabilities for qGMC, qPGMC, cqGMC, and cqPGMC are all close to the nominal level 95%, and the average lengths of these confidence intervals are very short. When the sample size n increases, the empirical coverage probabilities move closer to 95% and the average lengths become smaller. We also find that our procedures still work well even when the dimension is as high as 500.

Table 1 The coverage probabilities (CP) and the average lengths (AL) of 95% confidence intervals for \( {\hbox {qGMC}}\) and \( {\hbox {qPGMC}}\) at \(\tau = 0.25, 0.50\) and 0.75 under Example 1
Table 2 The coverage probabilities (CP) and the average lengths (AL) of 95% confidence intervals for \( {\hbox {cqGMC}}\) and \( {\hbox {cqPGMC}}\) under Example 1
Table 3 The coverage probabilities (CP) and the average lengths (AL) of 95% confidence intervals for \( {\hbox {qGMC}}\) and \( {\hbox {qPGMC}}\) at \(\tau = 0.25, 0.50\) and 0.75 under Example 2
Table 4 The coverage probabilities (CP) and the average lengths (AL) of 95% confidence intervals for \( {\hbox {qGMC}}\) and \( {\hbox {qPGMC}}\) at \(\tau = 0.25, 0.50\) and 0.75 under Example 3

6 Real data analysis

We consider a real dataset consisting of 1200 male Carworth Farms White (CFW) mice, referred to as CFW data, to illustrate the effectiveness of our new measures. This dataset is made publicly accessible by Parker et al. (2016) and has been analyzed by Nicod et al. (2016) and Li and Auwerx (2020). It includes information on 92,734 SNPs of 1200 male CFW mice. The CFW mice in this dataset have been subjected to various phenotyping assessments, such as conditioned fear, anxiety behavior, methamphetamine sensitivity, prepulse inhibition, fasting glucose levels, body weight, tail length, and testis weight. These phenotypes can be categorized into three main groups: behavioral, physiological, and expression quantitative traits.

Table 5 The first five important SNPs and their marginal \(\hbox {qGMC}_\tau \)’s (in parentheses) at \(\tau = 0.25, 0.50\) and 0.75 in CFW data
Table 6 The \(\hbox {qGMC}_\tau \)’s and their 95% confidence intervals (in parentheses) at \(\tau = 0.25, 0.50\) and 0.75 in CFW data

We aim to illustrate that our new measures can be adopted for feature screening and for inferring heritability in the CFW data. Feature screening methods (Fan and Fan 2008; Zhu et al. 2011; Li et al. 2012) are used to discard as many noise features as possible while retaining all important features. Heritability is a fundamental quantity in genetic applications (Yang et al. 2010; Evans et al. 2018; Speed and Balding 2019) which specifies the contribution of genetic factors to the variation of a phenotype. For simplicity, we analyze three distinct phenotypes in this paper: “D1totaldist0to30”, “tibialength”, and “fastglucose”. “D1totaldist0to30” represents the “baseline” measurement of locomotor response on Day 1 of methamphetamine sensitivity testing; locomotor activity is measured as the distance traveled (in cm) over a 30-min interval immediately after an injection of saline solution. “tibialength” denotes the length of the tibia bone (in mm). “fastglucose” denotes the plasma glucose level after fasting. We normalize the SNPs and the phenotypes separately, after identifying missing values and removing them from the dataset. This normalization ensures that the data are standardized and ready for further analysis.

Fig. 1
figure 1

Box plot for D1totaldist0to30, tibialength and fastglucose after data standardization

We first conduct feature screening based on our quantile GMC. To be precise, we calculate the marginal \(\hbox {qGMC}_\tau \) value for each SNP and phenotype; these values represent the marginal associations between each SNP and the phenotype at different quantile levels. We then select the \(N/\log (N)\) SNPs with the largest \(\hbox {qGMC}_\tau \) values. For simplicity, we present the first five SNPs and their corresponding \(\hbox {qGMC}_\tau \) values at \(\tau = 0.25, 0.50\) and 0.75 in Table 5.
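The screening step above, keeping the \(N/\log (N)\) SNPs with the largest marginal values, can be sketched as follows; the SNP names and marginal values in the test are toy inputs, not taken from the CFW data:

```python
import math

# Sketch of marginal screening: keep the N/log(N) features with the largest
# marginal qGMC_tau values. `marginal_qgmc` maps feature name -> marginal value.

def screen(marginal_qgmc, n_samples):
    k = int(n_samples / math.log(n_samples))  # screening threshold N/log(N)
    ranked = sorted(marginal_qgmc, key=marginal_qgmc.get, reverse=True)
    return ranked[:k]
```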

Next, we use all \(N/\log (N)\) retained SNPs to calculate the overall \(\hbox {qGMC}_\tau \) values for the three phenotypes. To be more specific, these \(\hbox {qGMC}_\tau \) values represent the correlation between all retained SNPs and each of the three phenotypes at different quantile levels. We also compute the 95% confidence intervals for these \(\hbox {qGMC}_\tau \) values. The results are shown in Table 6. Each \(\hbox {qGMC}_\tau \) value in Table 6 represents the heritability of the SNPs for each phenotype at a specific quantile level \(\tau \), namely 0.25, 0.50, and 0.75.

The marginal \(\hbox {qGMC}_\tau \) values in Table 5 also indicate the proportion of the heritability that can be attributed to a specific SNP at various quantile levels. This information provides valuable insights into the relationship between genetic markers and phenotypic expression. We observe that the SNP rs265727287 is the most important SNP for the phenotype “tibialength” at quantile level 0.75; this SNP is also reported in Parker et al. (2016). We also find that the SNP rs244502760 plays an important role in the phenotype “D1totaldist0to30” at both the 0.25 and 0.5 quantile levels. Table 6 illustrates the ability of our new measures to quantify the heritability of the SNPs for each phenotype at different quantile levels. Furthermore, our results may differ somewhat from those obtained in Parker et al. (2016). This discrepancy arises from the use of different correlation measures: while they employed traditional correlation measures, we adopt the quantile GMC. This choice is justified by Fig. 1, which indicates that the distributions of these phenotypes are heavy-tailed. Therefore, our quantile GMC is better suited for variable screening and the investigation of heritability, particularly for heavy-tailed or peaked data.

7 Conclusions and discussions

In this paper, we propose the quantile GMC to measure nonlinear quantile correlation. The newly introduced measure has many desirable properties. Notably, it takes values between 0 and 1: the value is 0 if and only if the conditional quantile function is equal to the unconditional quantile, and 1 if and only if Y is almost surely equal to the conditional quantile function of \({\textbf{X}}\). A partial extension is also introduced. Statistical inference for the introduced measures is investigated, and flexible machine learning methods are adopted to estimate the unknown conditional quantile functions. Under very mild conditions, we establish the estimators’ consistency; however, sample splitting is required for the construction of confidence intervals. We further consider the composite quantile GMC and composite quantile partial GMC, which integrate information from different quantile levels. Their properties and estimators are investigated.

Our proposed dependence measures can have many applications. For instance, we may adopt our new measure for variable screening, or extend it to censored data. Besides, we may replace the check loss with other loss functions, such as the Huber loss. Moreover, when Y is multivariate, how to define a suitable measure and make inference is also of great interest. These topics will be pursued in the near future.