Sukanya Intarapak, Thidaporn Supapakorn



In regression analysis of clustered data, errors within the same cluster are correlated, which violates the independence assumption. Consequently, test statistics based on the ordinary least squares method lead to incorrect inferences. To overcome this issue, the observations must be transformed. In this paper, we propose an alternative matrix transformation, based on the Householder matrix, that adjusts for the intra-cluster correlation, and we apply it to the F test statistic based on generalized least squares procedures for hypotheses on the regression coefficients. Monte Carlo simulations with balanced and unbalanced data show that the F test statistic based on generalized least squares procedures with the Adjusted Householder transformation performs well in terms of type I error rate and power.
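The abstract does not spell out the Adjusted Householder construction, so the following is only a minimal sketch of the general idea, under stated assumptions: errors within a cluster of size m are taken to have an exchangeable covariance sigma2 * ((1 - rho) * I + rho * J), and rho is treated as known. A Householder reflection that maps the normalized vector of ones to the first unit vector diagonalizes this covariance, and rescaling the first row then yields uncorrelated, equal-variance errors. The function names `householder_to_e1`, `decorrelating_transform`, and `exchangeable_cov` are hypothetical and not taken from the paper.

```python
import numpy as np

def householder_to_e1(m):
    """Symmetric orthogonal Householder matrix H with H @ (ones(m)/sqrt(m)) = e1."""
    u = np.ones(m) / np.sqrt(m)        # unit vector along the cluster mean direction
    e1 = np.zeros(m)
    e1[0] = 1.0
    v = u - e1                          # reflection direction
    norm2 = v @ v
    if norm2 == 0.0:                    # m == 1: u already equals e1
        return np.eye(m)
    return np.eye(m) - 2.0 * np.outer(v, v) / norm2

def exchangeable_cov(m, rho, sigma2=1.0):
    """Assumed within-cluster covariance: sigma2 * ((1 - rho) * I + rho * J)."""
    return sigma2 * ((1.0 - rho) * np.eye(m) + rho * np.ones((m, m)))

def decorrelating_transform(m, rho):
    """T = D @ H such that T @ V @ T.T = sigma2 * (1 - rho) * I.

    Requires -1/(m - 1) < rho < 1 so that V is positive definite.
    """
    H = householder_to_e1(m)
    scale = np.ones(m)
    # H @ V @ H has first diagonal entry sigma2 * (1 + (m - 1) * rho);
    # rescale that row so every diagonal entry equals sigma2 * (1 - rho).
    scale[0] = np.sqrt((1.0 - rho) / (1.0 + (m - 1) * rho))
    return scale[:, None] * H

if __name__ == "__main__":
    m, rho = 4, 0.3
    T = decorrelating_transform(m, rho)
    V = exchangeable_cov(m, rho)
    # Numerically check that the transformed covariance is (1 - rho) * I.
    print(np.allclose(T @ V @ T.T, (1.0 - rho) * np.eye(m)))
```

With unbalanced data, a sketch like this would be applied cluster by cluster, using each cluster's own size m, to both the response and the design matrix before computing a least squares F statistic. In practice rho is unknown and would have to be estimated, which this sketch does not cover.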


adjusted Householder, clustered data, F test statistic, generalized least squares, intra-cluster correlation



© 2019–2023 Copyright by Statistics Poland, some rights reserved. Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)