Deep Learning-based Method for Enhancing the Detection of Arabic Authorship Attribution using Acoustic and Textual-based Features

Al-Sarem, Mohammed and Saeed, Faisal and Qasem, Sultan Noman and Albarrak, Abdullah M (2023) Deep Learning-based Method for Enhancing the Detection of Arabic Authorship Attribution using Acoustic and Textual-based Features. International Journal of Advanced Computer Science and Applications, 14 (7). ISSN 2158-107X

IJACSA_Paper.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Authorship attribution (AA) is defined as the identification of the original author of an unseen text. Although an author’s writing style can change from one topic to another, the author’s habits remain consistent across different texts. Authorship attribution has been extensively studied for texts written in languages such as English. However, few studies have investigated Arabic authorship attribution (AAA) due to the special challenges posed by Arabic script. Additionally, there is a need to identify the authors of texts extracted from livestream broadcasts and recorded speeches in order to protect these authors’ intellectual property. This paper aims to enhance the detection of Arabic authorship attribution by extracting different features and fusing the outputs of two deep learning models. The dataset used in this study was collected from the weekly livestreamed and recorded Arabic sermons that are publicly available on the official website of Al-Haramain in Saudi Arabia. Acoustic, textual and stylometric features were extracted for five authors. The data were then pre-processed and fed into the deep learning models (a CNN architecture and a pre-trained ResNet34). After that, hard and soft voting ensemble methods were applied to combine the outputs of the two models and improve overall performance. The experimental results showed that the CNN with textual data achieved acceptable performance across all evaluation metrics, while the ResNet34 model with acoustic features outperformed the other models, reaching an accuracy of 90.34%. Finally, the results showed that the soft voting ensemble method further enhanced AAA performance and outperformed the other methods in terms of accuracy and precision, achieving 93.19% and 0.9311 respectively.
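The fusion step described in the abstract — combining the two models' per-author outputs by hard and soft voting — can be sketched as follows. This is an illustrative sketch, not the authors' code: the probability values, and the tie-breaking rule for hard voting, are assumptions.

```python
import numpy as np

# Hypothetical per-class probabilities over the 5 authors from the two models.
# In the paper, one model consumes textual features (CNN) and the other
# acoustic features (ResNet34); these arrays are illustrative stand-ins.
cnn_probs = np.array([0.10, 0.05, 0.60, 0.15, 0.10])     # CNN on textual features
resnet_probs = np.array([0.05, 0.10, 0.70, 0.10, 0.05])  # ResNet34 on acoustic features

# Soft voting: average the class probabilities, then take the argmax.
soft_vote = int(np.argmax((cnn_probs + resnet_probs) / 2))

# Hard voting: each model casts one vote (its own argmax); with only two
# voters, ties need a rule -- here, majority with an arbitrary fallback.
votes = [int(np.argmax(cnn_probs)), int(np.argmax(resnet_probs))]
hard_vote = max(set(votes), key=votes.count)
```

Soft voting retains each model's confidence, which is why it can outperform hard voting when one model is systematically better calibrated, as the reported 93.19% accuracy suggests here.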

Item Type: Article
Identification Number: https://doi.org/10.14569/IJACSA.2023.0140705
Dates:
Accepted: 15 July 2023
Published Online: 31 July 2023
Uncontrolled Keywords: Authorship attribution, acoustic features, fusion approach, deep learning, CNN, ResNet34
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Depositing User: Gemma Tonks
Date Deposited: 28 Sep 2023 13:50
Last Modified: 28 Sep 2023 13:50
URI: https://www.open-access.bcu.ac.uk/id/eprint/14797
