Advancements in Detecting and Mitigating Fake Reviews: A Comprehensive Review and Analysis

Kurupudi Nishith Narayana

Authors

Kurupudi Nishith Narayana, Cybersecurity Analyst, USA

Keywords:

E-commerce, online reviews, fake review detection, machine learning, behavioral feature engineering

Abstract

As e-commerce systems evolve, online reviews play a crucial role in building and maintaining reputations and in supporting consumer decisions. Positive reviews significantly influence customer attraction and sales. However, deceptive or fake reviews posted to inflate online reputations are prevalent and pose a challenge. Detecting fake reviews is an active research area, and detection depends on both the characteristics of the review and the behavior of the reviewer. This paper proposes a machine learning approach for identifying fake reviews. In addition to extracting features from the review text, it applies behavioral feature engineering to capture diverse reviewer behaviors. The method is evaluated on a real-world Yelp dataset of restaurant reviews using five classifiers: K-Nearest Neighbors (KNN), Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression, and Random Forest, combined with n-gram language models, specifically bigrams and trigrams. KNN with K=7 achieves the highest F-score, 82.40%, outperforming the other classifiers. Adding the extracted behavioral features increases the F-score by 3.80%, which indicates that they improve fake review detection.
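The pipeline outlined in the abstract (bigram and trigram features, reviewer behavior features, a KNN classifier, and F-score evaluation) can be illustrated with a minimal sketch. This is not the paper's implementation, which the abstract does not specify. The function names, the cosine similarity measure, the `reviews_per_day` behavioral feature, and the toy reviews are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def ngram_counts(text, n_values=(2, 3)):
    """Count word bigrams and trigrams in a review (whitespace tokenization)."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def featurize(text, behavior):
    """Merge n-gram counts with behavioral features (names are hypothetical).

    No feature scaling is applied here; a real system would likely need it.
    """
    features = Counter(ngram_counts(text))
    for name, value in behavior.items():
        features["behavior:" + name] = value
    return features


def cosine(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, k=7):
    """Majority vote among the k training items most similar to the query.

    train is a list of (features, label) pairs. K=7 mirrors the abstract;
    the similarity measure is an assumption.
    """
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def f_score(y_true, y_pred, positive="fake"):
    """F1 score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration; not taken from the Yelp dataset.
train = [
    (featurize("best place ever best place ever", {"reviews_per_day": 6.0}), "fake"),
    (featurize("the soup was cold but service was kind", {"reviews_per_day": 0.2}), "real"),
]
query = featurize("best place ever", {"reviews_per_day": 5.0})
print(knn_predict(train, query, k=1))
```

With only two training items the example uses `k=1`; the K=7 setting reported in the abstract needs a correspondingly larger training set to be meaningful.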
