-
HETEROGENEITY ANALYSIS VIA INTEGRATING MULTI-SOURCES HIGH-DIMENSIONAL DATA WITH APPLICATIONS TO CANCER STUDIES. Stat. Sin. (IF 1.4) Pub Date : 2023-12-1 Tingyan Zhong, Qingzhao Zhang, Jian Huang, Mengyun Wu, Shuangge Ma
This study has been motivated by cancer research, in which heterogeneity analysis plays an important role and can be roughly classified as unsupervised or supervised. In supervised heterogeneity analysis, the finite mixture of regression (FMR) technique is used extensively, under which the covariates affect the response differently in subgroups. High-dimensional molecular and, very recently, histopathological
-
HIGH-DIMENSIONAL FACTOR REGRESSION FOR HETEROGENEOUS SUBPOPULATIONS. Stat. Sin. (IF 1.4) Pub Date : 2023-10-19 Peiyao Wang, Quefeng Li, Dinggang Shen, Yufeng Liu
In modern scientific research, data heterogeneity is commonly observed owing to the abundance of complex data. We propose a factor regression model for data with heterogeneous subpopulations. The proposed model can be represented as a decomposition of heterogeneous and homogeneous terms. The heterogeneous term is driven by latent factors in different subpopulations. The homogeneous term captures common
-
Use of random integration to test equality of high dimensional covariance matrices. Stat. Sin. (IF 1.4) Pub Date : 2023-10-6 Yunlu Jiang, Canhong Wen, Yukang Jiang, Xueqin Wang, Heping Zhang
Testing the equality of two covariance matrices is a fundamental problem in statistics, and especially challenging when the data are high-dimensional. Through a novel use of random integration, we can test the equality of high-dimensional covariance matrices without assuming parametric distributions for the two underlying populations, even if the dimension is much larger than the sample size. The asymptotic
-
Globally Adaptive Longitudinal Quantile Regression with High Dimensional Compositional Covariates. Stat. Sin. (IF 1.4) Pub Date : 2023-7-24 Huijuan Ma, Qi Zheng, Zhumin Zhang, Huichuan Lai, Limin Peng
In this work, we propose a longitudinal quantile regression framework that enables a robust characterization of heterogeneous covariate-response associations in the presence of high-dimensional compositional covariates and repeated measurements of both response and covariates. We develop a globally adaptive penalization procedure, which can consistently identify covariate sparsity patterns across a
-
Marginal Bayesian Posterior Inference using Recurrent Neural Networks with Application to Sequential Models. Stat. Sin. (IF 1.4) Pub Date : 2023-7-6 Thayer Fisher, Alex Luedtke, Marco Carone, Noah Simon
In Bayesian data analysis, it is often important to evaluate quantiles of the posterior distribution of a parameter of interest (e.g., to form posterior intervals). In multi-dimensional problems, when non-conjugate priors are used, this is often difficult, generally requiring either an analytic or sampling-based approximation, such as Markov chain Monte Carlo (MCMC), Approximate Bayesian computation
-
An Efficient Greedy Search Algorithm for High-dimensional Linear Discriminant Analysis. Stat. Sin. (IF 1.4) Pub Date : 2023-05-01 Hannan Yang,D Y Lin,Quefeng Li
High-dimensional classification is an important statistical problem that has applications in many areas. One widely used classifier is the Linear Discriminant Analysis (LDA). In recent years, many regularized LDA classifiers have been proposed to solve the problem of high-dimensional classification. However, these methods rely on inverting a large matrix or solving large-scale optimization problems
-
Feature-weighted elastic net: using "features of features" for better prediction. Stat. Sin. (IF 1.4) Pub Date : 2023-04-27 J Kenneth Tay, Nima Aghaeepour, Trevor Hastie, Robert Tibshirani
In some supervised learning settings, the practitioner might have additional information on the features used for prediction. We propose a new method which leverages this additional information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net
-
Sieve estimation of a class of partially linear transformation models with interval-censored competing risks data. Stat. Sin. (IF 1.4) Pub Date : 2023-04-01 Xuewen Lu,Yan Wang,Dipankar Bandyopadhyay,Giorgos Bakoyannis
In this paper, we consider a class of partially linear transformation models with interval-censored competing risks data. Under a semiparametric generalized odds rate specification for the cause-specific cumulative incidence function, we obtain optimal estimators of the large number of parametric and nonparametric model components via maximizing the likelihood function over a joint B-spline and Bernstein
-
PENALIZED REGRESSION FOR MULTIPLE TYPES OF MANY FEATURES WITH MISSING DATA. Stat. Sin. (IF 1.4) Pub Date : 2023-04-01 Kin Yau Wong,Donglin Zeng,D Y Lin
Recent technological advances have made it possible to measure multiple types of many features in biomedical studies. However, some data types or features may not be measured for all study subjects because of cost or other constraints. We use a latent variable model to characterize the relationships across and within data types and to infer missing values from observed data. We develop a penalized-likelihood
-
Interval estimation for operating characteristic of continuous biomarkers with controlled sensitivity or specificity. Stat. Sin. (IF 1.4) Pub Date : 2023-01-01 Yijian Huang,Isaac Parakati,Dattatraya H Patil,Martin G Sanda
The receiver operating characteristic (ROC) curve provides a comprehensive performance assessment of a continuous biomarker over the full threshold spectrum. Nevertheless, a medical test often dictates to operate at a certain high level of sensitivity or specificity. A diagnostic accuracy metric directly targeting the clinical utility is specificity at the controlled sensitivity level, or vice versa
-
An Online Projection Estimator for Nonparametric Regression in Reproducing Kernel Hilbert Spaces. Stat. Sin. (IF 1.4) Pub Date : 2023-01-01 Tianyu Zhang,Noah Simon
The goal of nonparametric regression is to recover an underlying regression function from noisy observations, under the assumption that the regression function belongs to a prespecified infinite-dimensional function space. In the online setting, in which the observations come in a stream, it is generally computationally infeasible to refit the whole model repeatedly. As yet, there are no methods that
-
Robust Inference for Partially Observed Functional Response Data. Stat. Sin. (IF 1.4) Pub Date : 2022-10-01 Yeonjoo Park,Xiaohui Chen,Douglas G Simpson
Irregular functional data in which densely sampled curves are observed over different ranges pose a challenge for modeling and inference, and sensitivity to outlier curves is a concern in applications. Motivated by applications in quantitative ultrasound signal analysis, this paper investigates a class of robust M-estimators for partially observed functional data including functional location and quantile
-
Prior Knowledge Guided Ultra-high Dimensional Variable Screening with Application to Neuroimaging Data. Stat. Sin. (IF 1.4) Pub Date : 2022-9-3 Jie He, Jian Kang
Variable screening is a powerful and efficient tool for dimension reduction under ultrahigh dimensional settings. However, most existing methods overlook useful prior knowledge in specific applications. In this work, from a Bayesian modeling perspective, we develop a unified variable screening procedure for the linear regression model. We discuss different constructions of posterior mean screening
-
A spline-based nonparametric analysis for interval-censored bivariate survival data. Stat. Sin. (IF 1.4) Pub Date : 2022-7-8 Yuan Wu, Ying Zhang, Junyi Zhou
In this manuscript we propose a spline-based sieve nonparametric maximum likelihood estimation method for joint distribution function with bivariate interval-censored data. We study the asymptotic behavior of the proposed estimator by proving the consistency and deriving the rate of convergence. Based on the sieve estimate of the joint distribution, we also develop an efficient nonparametric test for
-
Maximum Likelihood Estimation for Cox Proportional Hazards Model with a Change Hyperplane. Stat. Sin. (IF 1.4) Pub Date : 2022-4-19 Yu Deng, Jianwen Cai, Donglin Zeng
We propose a Cox proportional hazards model with a change hyperplane to allow the effect of risk factors to differ depending on whether a linear combination of baseline covariates exceeds a threshold. The proposed model is a natural extension of the change-point hazards model. We maximize the partial likelihood function for estimation and suggest an m-out-of-n bootstrapping procedure for inference
-
STRUCTURED CORRELATION DETECTION WITH APPLICATION TO COLOCALIZATION ANALYSIS IN DUAL-CHANNEL FLUORESCENCE MICROSCOPIC IMAGING. Stat. Sin. (IF 1.4) Pub Date : 2022-1-21 Shulei Wang, Jianqing Fan, Ginger Pocock, Ellen T Arena, Kevin W Eliceiri, Ming Yuan
Current workflows for colocalization analysis in fluorescence microscopic imaging introduce significant bias in terms of the user's choice of region of interest (ROI). In this work, we introduce an automatic, unbiased structured detection method for correlated region detection between two random processes observed on a common domain. We argue that although intuitive, using the maximum log-likelihood
-
Hypothesis Testing for Network Data with Power Enhancement. Stat. Sin. (IF 1.4) Pub Date : 2022-1-11 Yin Xia, Lexin Li
Comparing two population means of network data is of paramount importance in a wide range of scientific applications. Numerous existing network inference solutions focus on global testing of entire networks, without comparing individual network links. The observed data often take the form of vectors or matrices, and the problem is formulated as comparing two covariance or precision matrices under a
-
SMOOTH DENSITY SPATIAL QUANTILE REGRESSION. Stat. Sin. (IF 1.4) Pub Date : 2022-1-7 Halley Brantley, Montserrat Fuentes, Joseph Guinness, Eben Thoma
We derive the properties and demonstrate the desirability of a model-based method for estimating the spatially-varying effects of covariates on the quantile function. By modeling the quantile function as a combination of I-spline basis functions and Pareto tail distributions, we allow for flexible parametric modeling of the extremes while preserving non-parametric flexibility in the center of the distribution
-
SEMIPARAMETRIC DOSE FINDING METHODS FOR PARTIALLY ORDERED DRUG COMBINATIONS. Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Matthieu Clertant,Nolan A Wages,John O'Quigley
We investigate a statistical framework for Phase I clinical trials that test the safety of two or more agents in combination. For such studies, the traditional assumption of a simple monotonic relation between dose and the probability of an adverse event no longer holds. Nonetheless, the dose toxicity (adverse event) relationship will obey an assumption of partial ordering in that there will be pairs
-
Robust inference of conditional average treatment effects using dimension reduction. Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Ming-Yueh Huang,Shu Yang
Personalized treatment aims at tailoring treatments to individual characteristics. An important step is to understand how a treatment effect varies across individual characteristics, known as the conditional average treatment effect (CATE). In this study, we make robust inferences of the CATE from observational data, which becomes challenging with a multivariate confounder. To reduce the curse of dimensionality
-
Construction of strong group-orthogonal arrays Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Chunyan Wang,Jinyu Yang,Min-Qian Liu
Space-filling designs with low-dimensional stratifications are desirable choices for computer experiments. In addition, column orthogonality is an important property of designs for such experiments, because it allows the estimates of the main effects in linear models to be uncorrelated with each other. However, few works have examined space-filling designs with both properties. This paper proposes
-
Variable Selection for Multiple Function-on-Function Linear Regression Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Xiong Cai,Liugen Xue,Jiguo Cao
We introduce a variable selection procedure for function-on-function linear models with multiple functional predictors, using the functional principal component analysis (FPCA)-based estimation method with the group smoothly clipped absolute deviation regularization. This approach enables us to select significant functional predictors and estimate the bivariate functional coefficients simultaneously
-
Communication-Efficient Distributed Linear Discriminant Analysis for Binary Classification Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Mengyu Li,Junlong Zhao
Large-scale data are common when the sample size n is large, and these data are often stored on k different local machines. Distributed statistical learning is an efficient way to deal with such data. In this study, we consider the binary classification problem for massive data based on a linear discriminant analysis (LDA) in a distributed learning framework. The classical centralized LDA requires
-
Conditional Test for Ultrahigh Dimensional Linear Regression Coefficients Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Wenwen Guo,Wei Zhong,Sunpeng Duan,Hengjian Cui
This paper is concerned with a conditional test for regression coefficients in ultrahigh dimensional linear models. Conditioning on a subset of important predictors in the model, we test the overall significance of regression coefficients of the remaining ultrahigh dimensional predictors. We first propose a conditional U-statistic test (CUT) based on an estimated U-statistic for a high dimensional
-
A Sequential Probability Ratio Test for Sparse Gaussian Mixtures Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Wenhua Jiang,Cun-Hui Zhang
We develop a one-sided sequential probability ratio test (SPRT) for detecting the presence of rare and weak signals. We prove that the test is consistent throughout the detectable region of the sparse Gaussian mixtures. This makes an interesting connection between the test of power one and the higher criticism. Unlike existing methods that use simulated critical values, for the SPRT, the actual size
-
STATISTICAL INFERENCE IN QUANTILE REGRESSION FOR ZERO-INFLATED OUTCOMES Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Wodan Ling,Bin Cheng,Ying Wei,Joshua Willey,Ying Kuen Cheung
-
A Position-Based Approach for Design and Analysis of Order-of-Addition Experiments Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Zack Stokes,Hongquan Xu
In many physical and computer experiments, the order in which the steps of a process are performed may have a substantial impact on the measured response. Often, the goal in these situations is to uncover the order that optimizes the response according to some metric. However, the brute force approach of performing all permutations quickly becomes impractical as the number of components in the process
-
Dynamic Penalized Splines for Streaming Data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Dingchuan Xue,Fang Yao
-
A proximal dual semismooth Newton method for zero-norm penalized quantile regression estimator Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Dongdong Zhang,Shaohua Pan,Shujun Bi
-
SEAMLESS PHASE II/III CLINICAL TRIALS WITH COVARIATE ADAPTIVE RANDOMIZATION Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Wei Ma,Mengxi Wang,Hongjian Zhu
-
An iterative algorithm to learn from positive and unlabeled examples Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Xin Liu,Qingle Zheng,Xiaotong Shen,Shaoli Wang
-
Efficient estimation and computation in generalized varying coefficient models with unknown link and variance functions for large-scale data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Huazhen Lin,Jiaxin Liu,Haoqi Li,Lixian Pan,Yi Li
-
Hypothesis Testing for Block-structured Correlation for High Dimensional Variables Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Shurong Zheng,Xuming He,Jianhua Guo
-
A Note on Endogeneity Resolution in Regression Models for Comparative Studies Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Ravi Kashyap
We provide a justification for why, and when, endogeneity will not cause a bias in the interpretation of the coefficients in a regression model. This technique can be a viable alternative to, or even used alongside, the instrumental variable method. We show that, when performing any comparative study, it is possible to measure the true change in the coefficients under a broad set of conditions. Our
-
Causal Inference from Possibly Unbalanced Split-Plot Designs: A Randomization-based Perspective Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Rahul Mukerjee,Tirthankar Dasgupta
Split-plot designs find wide applicability in multifactor experiments with randomization restrictions. Practical considerations often warrant the use of unbalanced designs. This paper investigates randomization based causal inference in split-plot designs that are possibly unbalanced. Extension of ideas from the recently studied balanced case yields an expression for the sampling variance of a treatment
-
Sparse Composite Quantile Regression with Ultra-high Dimensional Heterogeneous Data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Lianqiang Qu,Meiling Hao,Liuquan Sun
-
Power Analysis of Projection-Pursuit Independence Tests Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Kai Xu,Liping Zhu
-
A new nonparametric extension of ANOVA via projection mean variance measure Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Jicai Liu,Yuefeng Si,Wenchao Xu,Riquan Zhang
-
CLT For U-statistics With Growing Dimension Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Cyrus DiCiccio,Joseph Romano
The purpose of this paper is to present a general triangular array Central Limit Theorem for U-statistics, where the kernel $h_k(x_1, \ldots, x_k)$ and its dimension $k$ may increase with the sample size. Some motivating examples which require such a general result are presented. The examples include a class of Hodges-Lehmann estimators, subsampling estimators, and combining p-values through data splitting
-
Relationship between orthogonal and baseline parameterizations and its applications to design constructions Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Cheng-Yu Sun,Boxin Tang
-
Consistent Screening Procedures in High-dimensional Binary Classification Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Hangjin Jiang,Xingqiu Zhao,Ronald C.W. Ma,Xiaodan Fan
In this paper, we consider variable screening in high-dimensional binary classification. First, we propose non-parametric test statistics for the problem of two-sample distribution comparison that combine the merits of the Chi-square statistic and the Kolmogorov-Smirnov statistic and provide new insights into the equality test of the unspecified distributions underlying two independent samples. Based
-
Bayesian Estimation of Gaussian Conditional Random Fields Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Lingrui Gan,Naveen Narisetty,Feng Liang
-
Spectral distribution of the sample covariance of high-dimensional time series with unit roots Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Alexei Onatski,Chen Wang
We study the empirical spectral distributions of two sample-covariance-type matrices associated with high-dimensional time series with unit roots. The first matrix is $S = XX'/T$, where $X$ is an $n \times T$ data matrix with rows represented by $n$ i.i.d. copies of $T$ consecutive observations of a difference-stationary process. The second matrix is $W = n \int_0^1 W_n(t) W_n(t)' \, dt$, where $W_n(t)$ is an $n$-dimensional vector
-
Efficient Estimation for Dimension Reduction with Censored Survival Data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Ge Zhao,Yanyuan Ma,Wenbin Lu
We propose a general index model for survival data, that generalizes many commonly used semiparametric survival models and belongs to the framework of dimension reduction. Using a combination of a geometric approach in semiparametrics and a martingale treatment in survival data analysis, we devise estimation procedures that are feasible and do not require covariate-independent censoring, as assumed
-
Double Happiness: Enhancing the Coupled Gains of L-lag Coupling via Control Variates Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Radu V. Craiu,Xiao-Li Meng
The recently proposed L-lag coupling for unbiased MCMC (Biswas et al., 2019; Jacob et al., 2020) calls for a joint celebration by MCMC practitioners and theoreticians. For practitioners, it circumvents the thorny issue of deciding the burn-in period or when to terminate an MCMC iteration, and opens the door for safe parallel implementation. For theoreticians, it provides a powerful tool to establish
-
A Bayesian Approach to Envelope Quantile Regression Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Minji Lee,Saptarshi Chakraborty,Zhihua Su
The enveloping approach employs sufficient dimension-reduction techniques to gain estimation efficiency, and has been used in several multivariate analysis contexts. However, its Bayesian development has been sparse, and the only Bayesian envelope construction is in the context of a linear regression. In this paper, we propose a Bayesian envelope approach to a quantile regression, using a general framework
-
Estimation for nonignorable missing response or covariate using semi-parametric quantile regression imputation and a parametric response probability model Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Emily Berg,Cindy Yu
We address the problem of imputation when a response or covariate may be subject to a nonignorable (or, equivalently, missing not at random) nonresponse, meaning the response probability may depend on a variable that is not always observed. We discuss model identification and develop a novel estimator of the parameters of the response probability. We use a propensity score adjustment to incorporate
-
Space-Time Estimation and Prediction under Infill Asymptotics with Compactly Supported Covariance Functions Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Tarik Faouzi,Emilio Porcu,Moreno Bevilacqua
-
Gaussian Process Prediction using Design-Based Subsampling Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Linglin He,Ying Hung
-
A stable and more efficient doubly robust estimator Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Min Zhang,Baqun Zhang
Under the assumption of missing at random, doubly robust (DR) estimators are consistent when either the propensity score or the outcome model is correctly specified. However, despite its appealing theoretic properties, Kang and Schafer (2007) show that the usual augmented inverse probability weighted (AIPW) DR estimator may sometimes exhibit unsatisfying behavior. We propose an alternative DR method
-
A Permutation Test for Two-Sample Means and Signal Identification of High-dimensional Data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Efang Kong,Lengyang Wang,Yingcun Xia,Jin Liu
-
Sufficient cause interactions for categorical and ordinal outcomes Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Jaffer Zaidi,Tyler VanderWeele
The sufficient cause model is extended from binary to categorical and ordinal outcomes to formalize the concept of sufficient cause interaction and synergism in this setting. This extension allows us to derive counterfactual and empirical conditions for detecting the presence of sufficient cause interactions for ordinal and categorical outcomes. Some of these conditions are entirely novel in that they
-
Order Determination for Spiked Type Models Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Yicheng Zeng,Lixing Zhu
Motivated by dimension reduction in the context of regression analysis and signal detection, we investigate the order determination for large-dimensional matrices, including spiked-type models, in which the numbers of covariates are proportional to the sample sizes for different models. Because the asymptotic behaviors of the estimated eigenvalues of the corresponding matrices differ from those in
-
Infinite Arms Bandit: Optimality via Confidence Bounds Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Hock Peng Chan,Shouri Hu
Berry et al. (1997) initiated the development of the infinite arms bandit problem. They derived a regret lower bound of all allocation strategies for Bernoulli rewards with uniform priors, and proposed strategies based on success runs. Bonald and Proutiere (2013) proposed a two-target algorithm that achieves the regret lower bound, and extended optimality to Bernoulli rewards with general priors. We
-
Adaptive Change Point Monitoring for High-Dimensional Data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Teng Wu,Runmin Wang,Hao Yan,Xiaofeng Shao
In this paper, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by the recent U-statistic based retrospective tests developed by Wang et al. (2019) and Zhang et al. (2020), we advance the U-statistic based approach to the sequential monitoring problem by developing a new adaptive monitoring procedure that can detect both dense and sparse
-
A Simple and Efficient Estimation of Average Treatment Effects in Models with Unmeasured Confounders Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Chunrong Ai,Lukang Huang,Zheng Zhang
-
A METHOD OF LOCAL INFLUENCE ANALYSIS IN SUFFICIENT DIMENSION REDUCTION Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Fei Chen,Lei Shi,Lin Zhu,Lixing Zhu
-
A UNIFIED FRAMEWORK FOR MINIMUM ABERRATION Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Ming-Chung Chang
-
Correction: Derivative principal components for representing the time dynamics of longitudinal and functional data Stat. Sin. (IF 1.4) Pub Date : 2022-01-01 Xiongtao Dai,Hans-Georg Müller,Wenwen Tao