Randomisation tests (R-tests) are regularly proposed as an alternative method of hypothesis testing when the assumptions of classical statistical methods are violated. In this paper, the robustness (in terms of the type I error rate) and the power of the R-test were evaluated and compared with those of the F-test in the analysis of a single-factor repeated measures design. The study considered normal and non-normal data (skewed: exponential, lognormal, chi-squared, and Weibull distributions), the presence and absence of outliers, and cases in which the sphericity assumption was or was not met, under varied sample sizes and numbers of treatments. A Monte Carlo approach was used in the simulation study. The results showed that when the data were normal, the R-test was approximately as sensitive and robust as the F-test, and that it was more sensitive than the F-test when the data had skewed distributions. The R-test was more sensitive and robust than the F-test in the presence of an outlier. When the sphericity assumption was met, the R-test and the F-test were approximately equally sensitive, whereas the R-test was more sensitive and robust than the F-test when the sphericity assumption was not met.
randomisation test, repeated measures design, sensitivity, robustness, Monte Carlo
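As an illustration only, the following is a minimal sketch of how an R-test for a single-factor repeated measures design might be set up, together with a small Monte Carlo estimate of its rejection rate. It is not the procedure used in the study: the function names (`rm_f_statistic`, `randomisation_test`, `rejection_rate`), the choice of the repeated measures F ratio as the test statistic, the within-subject shuffling scheme, the exponential error model and all default settings are assumptions made for the example.

```python
import numpy as np

def rm_f_statistic(data):
    """F ratio for a single-factor repeated measures layout.

    data: 2-D array, rows = subjects, columns = treatments.
    """
    n, k = data.shape
    grand = data.mean()
    ss_treat = n * np.sum((data.mean(axis=0) - grand) ** 2)
    ss_subj = k * np.sum((data.mean(axis=1) - grand) ** 2)
    ss_total = np.sum((data - grand) ** 2)
    ss_error = ss_total - ss_treat - ss_subj
    return (ss_treat / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

def randomisation_test(data, n_perm=999, seed=0):
    """Approximate R-test p-value (assumed scheme: shuffle within subjects)."""
    rng = np.random.default_rng(seed)
    observed = rm_f_statistic(data)
    count = 0
    for _ in range(n_perm):
        # Reassign treatment labels independently within each subject (row).
        permuted = rng.permuted(data, axis=1)
        if rm_f_statistic(permuted) >= observed:
            count += 1
    # Count the observed arrangement too, so the p-value is never zero.
    return observed, (count + 1) / (n_perm + 1)

def rejection_rate(n_subjects, n_treatments, effect=0.0,
                   n_sims=200, n_perm=199, alpha=0.05, seed=1):
    """Monte Carlo rejection rate under skewed (exponential) errors.

    effect = 0 estimates the type I error rate; effect > 0 estimates power.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for i in range(n_sims):
        errors = rng.exponential(size=(n_subjects, n_treatments))
        subject_effects = rng.normal(size=(n_subjects, 1))
        data = errors + subject_effects
        data[:, -1] += effect  # shift the last treatment only
        _, p_value = randomisation_test(data, n_perm=n_perm, seed=i)
        rejections += int(p_value <= alpha)
    return rejections / n_sims
```

Calling `rejection_rate(10, 3)` would give a rough estimate of the type I error rate under these assumed settings, and a positive `effect` would give a rough estimate of power. Shuffling within subjects keeps each subject's set of scores intact, which is one common way to respect the repeated measures structure.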