C5050: An Efficient Framework for Author Identification Using Deep Learning

Mohamed Anwar, Ahmed Mahmoud; Keshk, Arabi Elsayed; Mohamed, Eman M

doi:10.21608/ijci.2023.235242.1125

Document Type : Original Article

Authors

¹ Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shebin El Kom, Egypt

² Computer Science, Faculty of Computers and Information, Menoufia University

³ Computer Science Dept., Faculty of Computers and Information, Menoufia University, Egypt

Abstract

Author identification aims to determine who wrote a given text. It is a growing field of research with applications in literary analysis, cybersecurity, forensics, and social media investigations. The primary goal of this paper is to analyze approaches to author identification, and the study has two main parts. The first part applies six machine learning (ML) techniques, namely Decision Trees (DT), Logistic Regression (LR), k-Nearest Neighbors (K-NN), Random Forests (RF), Support Vector Machines (SVM), and Naive Bayes (NB), using the TF-IDF method for feature extraction. The second part experiments with two Deep Learning (DL) models, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), using word embeddings as the input vectors. To validate our approach, we conducted an experimental study on the Reuters 50_50 dataset under two learning modes: hold-out and 10-fold cross-validation. The results, measured in terms of Accuracy (ACC), Precision (PREC), Recall (REC), and F1-score (F1), show that the DL techniques, when evaluated with 10-fold cross-validation, outperform current state-of-the-art methods. In these experiments, the proposed DL models yield the best results for author identification.
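As an illustration of the classical pipeline the abstract outlines (TF-IDF features, a K-NN classifier, k-fold splitting, and the four reported metrics), the following is a minimal standard-library sketch. It is not the authors' implementation: the function names, the smoothed IDF variant, the choice of k = 1 with cosine similarity, the macro averaging, and the two-author toy corpus are all assumptions made for illustration, and a real experiment on Reuters 50_50 would shuffle or stratify the folds and use proper tokenization.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Learn smoothed inverse document frequencies from tokenized documents."""
    n_docs = len(train_docs)
    doc_freq = Counter(term for doc in train_docs for term in set(doc))
    # Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """Turn one tokenized document into an L2-normalized sparse TF-IDF vector."""
    counts = Counter(t for t in doc if t in idf)  # terms unseen in training are dropped
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors (a dot product)."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def predict_1nn(train_vecs, train_labels, vec):
    """Return the label of the most similar training vector (K-NN with k = 1)."""
    best = max(range(len(train_vecs)), key=lambda i: cosine(train_vecs[i], vec))
    return train_labels[best]


def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds (no shuffling)."""
    start = 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not start <= i < start + size]
        yield train_idx, test_idx
        start += size


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true))
    precs, recs, f1s = [], [], []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    acc = sum(t == p for t, p in pairs) / len(pairs)
    return acc, sum(precs) / len(labels), sum(recs) / len(labels), sum(f1s) / len(labels)


# Hypothetical two-author toy corpus, standing in for the 50-author dataset.
train = [
    ("alice", "stocks fell as markets closed lower"),
    ("alice", "markets rallied while stocks gained"),
    ("bob", "the team won the final match"),
    ("bob", "the coach praised the team after the match"),
]
train_docs = [text.split() for _, text in train]
train_labels = [author for author, _ in train]
idf = fit_idf(train_docs)
train_vecs = [tfidf_vector(doc, idf) for doc in train_docs]
predicted = predict_1nn(train_vecs, train_labels, tfidf_vector("stocks and markets".split(), idf))
```

In a hold-out run, `fit_idf` would be called on the training split only and the test documents scored once. In a 10-fold run, the same steps would be repeated for each pair yielded by `kfold_indices(n, 10)` and the metrics from `macro_scores` averaged across folds. Fitting the IDF weights inside each fold keeps test documents from influencing the features.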

IJCI. International Journal of Computers and Information

Volume 10, Issue 3
Special issue for the proceedings of ICCI 2023 conference
November 2023
Pages 34-43
