Sentiment Analysis on Twitter Using Machine Learning Techniques and TF-IDF Feature Extraction: A Comparative Study

Document Type: Original Article


1 Wesam Ahmed, Information Technology, University of South Valley, Mansoura, Egypt

2 Information Technology Dept., Faculty of Computers and Information, Menoufia University, Menoufia, Egypt

3 Information Technology dept., Faculty of Computers and Information, Menoufia University, Egypt

4 Faculty of Computers and Information, Menoufia University


Machine learning is a branch of artificial intelligence (AI) that enables software applications to improve their predictive accuracy without being explicitly programmed to do so. To predict future output values, machine learning algorithms require historical data as input. This research falls within the scope of sentiment analysis, a field that is increasingly used to extract people's opinions on political, economic, and social issues. The purpose of sentiment classification is to categorize users' opinions as positive, negative, or neutral based on textual input alone. Despite its usefulness, the accuracy and effectiveness of sentiment analysis are limited by open challenges in natural language processing (NLP). Recent research has shown that machine learning algorithms can help address these challenges. In this research, we investigate a range of machine learning strategies for sentiment analysis. The models were applied to two datasets using term frequency-inverse document frequency (TF-IDF) features. A comparative study of the models was conducted to evaluate their experimental performance. In terms of accuracy and F1 score, logistic regression outperformed the other algorithms.
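
To make the TF-IDF feature extraction mentioned above concrete, the following is a minimal sketch in plain Python. It assumes one common weighting variant (term frequency normalized by document length, inverse document frequency as the natural log of N/df), since the abstract does not state which variant was used. The function name `tf_idf` and the toy tweets are hypothetical and are not drawn from the study's datasets.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    The source does not specify its exact weighting scheme.
    """
    # Naive whitespace tokenization; real tweets would need more cleaning.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tweets, for illustration only.
tweets = ["great phone great battery", "terrible battery", "great service"]
vectors = tf_idf(tweets)
```

Terms that appear in few documents (such as "terrible" here) receive higher weights than terms spread across many documents, which is what makes these vectors useful as input to a classifier such as logistic regression.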