Using a New Data Mining Method for Automobile Insurance Fraud Detection: A Case Study by a Real Data from an Iranian Insurance Company
- Insurance Research Center, Tehran, Iran
Received: 15-01-2024
Accepted: 25-03-2024
Published in Issue 30-03-2024
Copyright (c) 2024 International Journal of Mathematical Modeling & Computations

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Car insurance fraud is among the most important problems facing insurance companies because it can impose heavy financial losses on them. Early detection of suspicious claims can therefore prevent much of this loss. Over the last decade, many studies have applied data mining techniques to this problem. In this article, we first examine the challenge of imbalanced data and, after addressing it, apply XGBoost, an algorithm only recently introduced to the field of fraud detection, to a real data set. Finally, we compare this method with an older one, the Random Forest algorithm, and show that the new method performs well.
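The imbalance problem mentioned above can be illustrated with a minimal sketch. The abstract does not say which balancing technique the study used, so plain random oversampling of the minority class is assumed here purely for illustration. The toy claims, the feature names and the helper functions `random_oversample` and `recall` are hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def recall(y_true, y_pred, positive=1):
    """Share of truly positive (fraudulent) cases that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up claims: (claim_amount, days_since_policy_start); label 1 = fraud.
claims = [(1200, 300), (800, 150), (950, 400), (700, 220),
          (1100, 365), (1000, 90), (640, 500), (9000, 5)]
is_fraud = [0, 0, 0, 0, 0, 0, 0, 1]

# A model that never flags fraud looks accurate on imbalanced data
# while catching no fraudulent claim at all.
always_legit = [0] * len(is_fraud)
accuracy = sum(t == p for t, p in zip(is_fraud, always_legit)) / len(is_fraud)
fraud_recall = recall(is_fraud, always_legit)

# Balance the classes before training any classifier.
balanced_claims, balanced_labels = random_oversample(claims, is_fraud)
class_counts = Counter(balanced_labels)
```

In a fuller pipeline, the balanced set would then be used to train the two classifiers being compared, for example `xgboost.XGBClassifier` and scikit-learn's `RandomForestClassifier`. Oversampling should be applied to the training split only, so that duplicated fraud cases do not leak into the evaluation data.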
Keywords
- Fraud detection
- Imbalanced data
- XGBoost algorithm
- Random forest algorithm
10.71932/IJM.2024.1126834