Application of Machine Learning Techniques for Classifying Household Standards of Living in Sinnar State, Sudan: A Comparative Study of Discriminant Analysis and Decision Trees
DOI:
https://doi.org/10.59992/IJFAES.2025.v4n4p12
Keywords:
Standard of Living Classification, Discriminant Analysis, Decision Tree Model, Socio-Economic Factors, SPSS Analysis, Income Sufficiency
Abstract
This study classifies households in Sinnar State, Sudan, into high, medium, and low standard-of-living groups based on selected economic, demographic, and social variables. Primary data were collected with a structured questionnaire from 800 households across 23 administrative units, selected by two-stage cluster sampling, and analyzed in SPSS. Discriminant analysis and decision tree models were applied to identify the key factors affecting household classification and to assess model accuracy. The groups differed significantly, which supports the use of discriminant analysis. The most influential variables in the first discriminant function were income sufficiency, car ownership, and sources of health expenditure. The decision tree model slightly outperformed discriminant analysis, with 72% classification accuracy against 71.6%. The two methods thus performed almost identically, and both distinguished household living standards effectively.
The study concludes that advanced statistical techniques such as discriminant analysis and decision trees are useful for socio-economic classification and can support better-targeted development policies. It recommends applying these models to guide government interventions and focusing on income-generating initiatives to enhance household welfare.
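To put the reported accuracy figures in context, the following minimal Python sketch shows how classification accuracy is usually derived from a confusion matrix, and how large a 0.4 percentage-point gap is in households. It is an illustration only: the study itself used SPSS, the confusion matrix counts are invented, the function names are hypothetical, and the gap calculation assumes that all 800 sampled households were scored by both models, which the abstract does not state.

```python
# Hypothetical 3x3 confusion matrix (rows = actual group, columns = predicted
# group), ordered high, medium, low. The counts are invented for illustration
# and are NOT the study's results.
confusion = [
    [150, 40, 10],
    [45, 260, 45],
    [15, 50, 185],
]


def accuracy(matrix):
    """Share of all cases that lie on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """For each actual group, the share of its cases predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Accuracies reported in the abstract.
tree_accuracy = 0.72
discriminant_accuracy = 0.716

# Assumption: both models were evaluated on all 800 sampled households.
n_households = 800
gap_households = (tree_accuracy - discriminant_accuracy) * n_households

if __name__ == "__main__":
    print(f"Illustrative overall accuracy: {accuracy(confusion):.4f}")
    print("Illustrative per-group recall:", per_class_recall(confusion))
    print(f"Gap between models in households: {gap_households:.1f}")
```

Under that assumption, the difference between the two models amounts to roughly three households out of 800, which is consistent with describing their performance as nearly identical.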