OpinionML: An Interpretable Machine Learning Framework for Opinion Mining on Social Networking Sites

Harith Hamoodat; Soud Mohamed Amen; Firas Aswad

doi:10.59992/IJCI.2026.v5n4p2

Keywords:

Opinion Analysis, Sentiment Analysis, Aspect Analysis, Topic Modeling, Fake Review Detection, Machine Learning

Abstract

In recent years, the volume of user-generated content on social media has increased dramatically, making effective opinion analysis systems more essential than ever. Tasks such as sentiment analysis, aspect extraction, topic modeling, and fake review detection are often addressed separately, although they are closely related. This isolation limits the potential for leveraging shared information. In this paper, we present OpinionML, a machine learning framework designed to unify these tasks in a single system. The framework uses a common feature engineering pipeline that combines several feature types: TF-IDF, lexical, syntactic, behavioral, and topic-based. Topic information is extracted using latent Dirichlet allocation (LDA) and augmented to provide additional contextual cues. Because different problems require different modeling approaches, we use support vector machines, random forests, and conditional random fields, depending on suitability. The proposed framework is evaluated on standard datasets (SemEval-2016, Yelp, Amazon, and Sentiment140). The results show that our approach achieves competitive performance compared to traditional machine learning methods while remaining interpretable and computationally feasible, and that it can make opinion analysis systems more effective and flexible in practice.
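To make the idea of a common feature engineering pipeline concrete, the following is a minimal sketch, not the authors' implementation. It concatenates two of the feature groups named in the abstract (TF-IDF and lexical) into one vector that several downstream classifiers could share. The function names, the toy sentiment lexicon, and the sample reviews are all hypothetical, and the syntactic, behavioral, and LDA topic features are omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the abstract does not say which lexicon is used.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "slow", "terrible"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (t.strip(".,!?") for t in text.lower().split())
    return [t for t in tokens if t]


def fit_idf(docs):
    """Compute a smoothed inverse document frequency per vocabulary term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_features(text, idf, vocab):
    """Return a TF-IDF vector ordered by vocab; unseen terms are ignored."""
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1
    return [counts[t] / total * idf[t] for t in vocab]


def lexical_features(text):
    """Return [positive word count, negative word count, exclamation marks]."""
    tokens = tokenize(text)
    return [
        sum(t in POSITIVE for t in tokens),
        sum(t in NEGATIVE for t in tokens),
        text.count("!"),
    ]


def build_features(text, idf, vocab):
    """Concatenate the feature groups into one shared vector."""
    return tfidf_features(text, idf, vocab) + lexical_features(text)


# Hypothetical sample reviews, used only to exercise the sketch.
reviews = [
    "Great food, love the service!",
    "Terrible battery and slow screen.",
    "Good price but bad packaging.",
]
idf = fit_idf(reviews)
vocab = sorted(idf)
features = [build_features(r, idf, vocab) for r in reviews]
```

Because every task reads from the same vector, a sentiment classifier and a fake review detector could in principle be trained on identical inputs, which is one way to share information across tasks.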

Author Biographies

Harith Hamoodat

PhD, Computer Science, Technical College of Management/Mosul, Northern Technical University, Iraq

Soud Mohamed Amen

PhD, Computer Science, Institute of Technical Management - Nineveh, Northern Technical University, Iraq

Firas Aswad

PhD, Computer Science, College of Computer Science and Mathematics, University of Mosul, Iraq

