Sakshi Kaushik https://orcid.org/0000-0002-4219-1488, Alka Sabharwal https://orcid.org/0000-0002-8252-8284, Gurprit Grover https://orcid.org/0000-0003-2051-4810

© Sakshi Kaushik, Alka Sabharwal, Gurprit Grover. Article available under the CC BY-SA 4.0 licence




Mental disorders are common non-communicable diseases whose occurrence is rising at epidemic rates globally. Determining the severity of a mental illness has important clinical implications and serves as a prognostic factor for effective intervention planning and management. This paper aims to identify the relevant predictors of the severity of mental illnesses (measured by psychiatric rating scales) from a wide range of clinical variables comprising both laboratory test results and psychiatric factors. The laboratory test results comprise measurements of 23 components derived from vital signs and from blood test results for the evaluation of the complete blood count. Eight psychiatric factors known to affect the severity of mental illnesses are considered, viz. the family history, course and onset of an illness, etc. Retrospective data on 78 patients diagnosed with mental and behavioural disorders were collected from the Lady Hardinge Medical College & Smt. S.K. Hospital in New Delhi, India. Missing observations are imputed using the non-parametric random forest algorithm. Multicollinearity is detected using the variance inflation factor. Owing to the presence of multicollinearity, regularisation techniques, namely ridge regression and two extensions of the least absolute shrinkage and selection operator (LASSO), viz. the adaptive and group LASSO, are used for fitting the regression model. The optimal tuning parameter λ is obtained through 13-fold cross-validation. It was observed that the coefficients of the quantitative predictors extracted by the adaptive LASSO and the group of predictors extracted by the group LASSO were comparable to the coefficients obtained through ridge regression.
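The multicollinearity check and the cross-validated choice of λ described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (78 rows, three predictors, two of them nearly collinear), the helper names `vif`, `ridge_fit` and `cv_ridge` are hypothetical, the grid of λ values is an arbitrary assumption, and only ridge regression is shown, not the adaptive or group LASSO.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column: 1 / (1 - R_j^2)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centred = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centred @ centred)
        out[j] = 1.0 / (1.0 - r2)
    return out

def ridge_fit(X, y, lam):
    """Ridge coefficients on centred data; the intercept is not penalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return beta, y_mean - x_mean @ beta

def cv_ridge(X, y, lambdas, k=13, seed=0):
    """Return the lambda with the lowest k-fold mean squared error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for lam in lambdas:
        sse = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
            beta, b0 = ridge_fit(X[train_idx], y[train_idx], lam)
            pred = X[test_idx] @ beta + b0
            sse += np.sum((y[test_idx] - pred) ** 2)
        errors.append(sse / len(y))
    return lambdas[int(np.argmin(errors))], errors

# Synthetic stand-in for the clinical data (assumption, not the study data).
rng = np.random.default_rng(42)
n = 78
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly a copy of x1: collinear
x3 = rng.normal(size=n)               # independent predictor
X = np.column_stack([x1, x2, x3])
y = 2.0 * x1 + 1.0 * x3 + rng.normal(scale=0.5, size=n)

vifs = vif(X)
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, cv_errors = cv_ridge(X, y, lambdas)
print("VIF per predictor:", np.round(vifs, 1))
print("lambda chosen by 13-fold CV:", best_lam)
```

With 78 observations, 13 folds leave exactly six observations in each held-out fold. In practice the predictors would usually be standardised before penalisation so that λ acts on comparable scales; that step is omitted here for brevity.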


adaptive LASSO, group LASSO, mental disorder, multicollinearity, random forest imputation, ridge regression, severity of an illness


