An Improved Multiple Features and Machine Learning-Based Approach for Detecting Clickbait News on Social Networks

Al-Sarem, Mohammed and Saeed, Faisal and Al-Mekhlafi, Zeyad and Mohammed, Badiea and Hadwan, Mohammed and Al-Hadhrami, Tawfik and Alshammari, Mohammad and Alreshidi, Abdulrahman and Alshammari, Talal (2021) An Improved Multiple Features and Machine Learning-Based Approach for Detecting Clickbait News on Social Networks. Applied Sciences, 11 (20). p. 9487. ISSN 2076-3417

Text
applsci-11-09-2021.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL: https://www.mdpi.com/2076-3417/11/20/9487

Abstract

The widespread use of social media has increased the popularity of online advertisements, which has been accompanied by a disturbing spread of clickbait headlines. Clickbait dissatisfies users because the article content does not match their expectations. Detecting clickbait posts in online social networks is therefore an important task. Clickbait posts use phrases written mainly to attract a user’s attention so that the user clicks through to a specific fake link or website. In other words, clickbait headlines use misleading titles, which can hide important information about the target website. It is very difficult to recognize these clickbait headlines manually, so there is a need for an intelligent method to detect clickbait and fake advertisements on social networks. Several machine learning methods have been applied for this purpose. However, the reported accuracy has only reached 87% and still needs to be improved. In addition, most existing studies were conducted on English headlines and content, and few focused specifically on detecting clickbait headlines in Arabic. Therefore, this study constructed the first Arabic clickbait headline news dataset and presents an improved multiple-feature-based approach for detecting clickbait news on social networks in the Arabic language. The proposed approach includes three main phases: data collection, data preparation, and machine learning model training and testing. The collected dataset included 54,893 Arabic news items from Twitter (after preprocessing). Among these news items, 23,981 were clickbait news (43.69%) and 30,912 were legitimate news (56.31%). The dataset was preprocessed, and the most important features were then selected using the ANOVA F-test. Several machine learning (ML) methods were then applied with hyperparameter tuning to find the optimal settings. Finally, the ML models were evaluated, and the overall performance is reported in this paper. The experimental results show that the Support Vector Machine (SVM) with the top 10% of ANOVA F-test features (user-based features (UFs) and content-based features (CFs)) obtained the best performance, achieving a detection accuracy of 92.16%.
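The feature selection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`anova_f`, `select_top_fraction`, `make_toy_data`), the synthetic data, and the choice of which toy features are informative are all assumptions made for illustration. The sketch computes a one-way ANOVA F-statistic per feature against the class label and keeps the highest-scoring 10% of features, using only the Python standard library.

```python
import random
from statistics import fmean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = fmean(values)
    ss_between = 0.0  # spread of class means around the grand mean
    ss_within = 0.0   # spread of samples around their own class mean
    for g in groups.values():
        m = fmean(g)
        ss_between += len(g) * (m - grand_mean) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_fraction(X, y, fraction=0.10):
    """Return (sorted indices of the top-scoring features, all F-scores)."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    keep = max(1, round(n_features * fraction))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep]), scores


def make_toy_data(n_samples=200, n_features=20, informative=(3, 11), seed=0):
    """Hypothetical data: only the `informative` columns depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 1 stands in for "clickbait", 0 for "legitimate"
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift the class mean of informative features
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, scores = select_top_fraction(X, y, fraction=0.10)
print("selected feature indices:", selected)
```

In a fuller pipeline, the selected columns would then be passed to a classifier such as an SVM whose hyperparameters are tuned by cross-validation. In scikit-learn, `SelectPercentile` with `f_classif` provides the same kind of ANOVA-based selection; whether the authors used those particular utilities is not stated in this record.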

Item Type:

Article

Identification Number:

https://doi.org/10.3390/app11209487

Dates:

Date	Event
6 October 2021	Accepted
13 October 2021	Published Online

Uncontrolled Keywords:

ANOVA-test; clickbait news; feature selection; social network

Subjects:

CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
CAH11 - computing > CAH11-01 - computing > CAH11-01-05 - artificial intelligence

Divisions:

Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology

Depositing User:

Faisal Saeed

Date Deposited:

05 Jan 2022 14:07

Last Modified:

05 Jan 2022 14:07

URI:

https://www.open-access.bcu.ac.uk/id/eprint/12586
