Prediction of Lung Cancer Using Supervised Machine Learning

Document Type : Original Article


1 Information Systems Department, Faculty of Computers and Information, Menoufia University

2 Information Systems Department, Faculty of Computers and Information, Menoufia University, Egypt

3 Information Systems Department, Faculty of Computers and Information, Menoufia University, Shebin El Kom, Menofia, Egypt


Lung cancer is a silent killer. Like liver or pancreatic cancer, it is typically discovered only when it is far advanced, and doctors can find it difficult to recognize the disease in its early stages. For this reason, we focus on an effective cancer prediction system that helps doctors and patients determine cancer risk at a lower cost and make appropriate decisions according to that risk. The goal of this paper is a practical method for determining whether or not a patient has lung cancer. The proposed approach was tested on the standardized Kaggle data set Survey Lung Cancer. We used real data collected from hospitals in Egypt. The proposal consists of two main processes: data preprocessing and prediction. Data preprocessing prepares the data for the prediction process. In the prediction process, we applied five machine learning classifiers to the data set and compared their results: Decision Tree, Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machine, and Naïve Bayes. The four criteria used to evaluate the classifiers were accuracy, recall, precision, and F1-score. The Support Vector Machine achieved the highest prediction accuracy, 98%.
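The four evaluation criteria named above can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch in Python, assuming a binary label in which 1 marks a lung cancer case. The names `Metrics` and `evaluate` and the example labels are illustrative assumptions and are not taken from the study or its data.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a metric is undefined (zero denominator).
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1-score for a binary task.

    `positive` is the label treated as the positive class (assumed here
    to mean "has lung cancer").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = _ratio(tp + tn, tp + tn + fp + fn)
    precision = _ratio(tp, tp + fp)   # of predicted positives, how many are correct
    recall = _ratio(tp, tp + fn)      # of actual positives, how many are found
    f1 = _ratio(2 * precision * recall, precision + recall)
    return Metrics(accuracy, precision, recall, f1)


# Made-up labels for illustration only (not results from the paper).
example_true = [1, 1, 1, 0, 0, 1, 0, 1]
example_pred = [1, 1, 0, 0, 1, 1, 0, 1]
example_metrics = evaluate(example_true, example_pred)
```

In a screening setting, recall is often watched as closely as accuracy, because a false negative corresponds to a missed cancer case. Reporting all four criteria, as the paper does, guards against a classifier that looks accurate only because one class dominates the data set.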