DeepEGFR a graph neural network for bioactivity classification of EGFR inhibitors

Malik, Aijaz Ahmad and Khyriem, Costerwell and Hauns, Sven and Khan, Imran and Pinto, Frederico G. and Al-Sadi, Azzat and Mohammad, Rasheed and Tran, Van Dinh and Backofen, Rolf and Soares, Nelson and Uddin, Mohammed and Alkhnbashi, Omer S. (2025) DeepEGFR a graph neural network for bioactivity classification of EGFR inhibitors. Scientific Reports, 15 (1). ISSN 2045-2322

s41598-025-22126-8.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Epidermal Growth Factor Receptor (EGFR) plays a critical role in the development of several cancers. Thus, modulation or inhibition of EGFR activity is an appealing strategy for developing novel cancer therapeutics. With the advent of modern machine learning technologies, it is now possible to simulate interactions between EGFR and small molecules with high precision and to predict inhibitory or modulatory activity at an unprecedented scale. In this work, we propose DeepEGFR, a novel multi-class graph neural network (GNN) model for fast and precise classification of small compounds as Active, Intermediate, or Inactive in inhibiting or modulating EGFR activity. DeepEGFR leverages complementary molecular representations, combining SMILES strings and molecular fingerprint matrices (Klekota-Roth and PubChem) to capture both structural and property-based features of compounds. The model constructs a molecular graph whose nodes and edges encode atom type, formal charge, bond type, and bond order. DeepEGFR achieved superior performance compared to baseline machine learning algorithms (e.g., SVM, Random Forest, ANN), with F1-scores of approximately 94% on both training and test datasets for all activity classes. To ensure interpretability, the top 20 features identified by DeepEGFR were validated against the five key characteristics of FDA-approved EGFR inhibitors (Afatinib, Gefitinib, Osimertinib, Dacomitinib, Erlotinib), confirming the biological relevance of the features. Moreover, DeepEGFR successfully identified 300 underexplored EGFR-targeting compounds, demonstrating its potential to accelerate the discovery of therapeutic agents. These results highlight the effectiveness of graph neural networks in advancing molecular activity classification, setting a potential new benchmark for EGFR inhibitor prediction.
These findings demonstrate DeepEGFR's ability to highlight promising EGFR inhibitors that have received limited prior investigation, thereby supporting its role in facilitating the rational development of targeted therapies for precision oncology.
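
The abstract says the molecular graph carries atom type and formal charge on nodes, and bond type and bond order on edges. The sketch below illustrates one plausible way such a graph could be encoded as plain feature lists. It is not the authors' implementation: the vocabularies `ATOM_TYPES` and `BOND_ORDERS`, the `Atom` and `Bond` classes, and the function names are all hypothetical, and the paper's actual featurization may differ.

```python
from dataclasses import dataclass

# Hypothetical vocabularies chosen for illustration only.
ATOM_TYPES = ["C", "N", "O", "F", "Cl"]
BOND_ORDERS = {"single": 1.0, "double": 2.0, "triple": 3.0, "aromatic": 1.5}
BOND_TYPES = list(BOND_ORDERS)


@dataclass(frozen=True)
class Atom:
    symbol: str
    formal_charge: int = 0


@dataclass(frozen=True)
class Bond:
    begin: int  # index of the first atom
    end: int    # index of the second atom
    kind: str = "single"


def node_features(atom):
    """One-hot atom type followed by the formal charge.

    Symbols outside ATOM_TYPES get an all-zero one-hot part.
    """
    one_hot = [1.0 if atom.symbol == t else 0.0 for t in ATOM_TYPES]
    return one_hot + [float(atom.formal_charge)]


def edge_features(bond):
    """One-hot bond type followed by the numeric bond order."""
    one_hot = [1.0 if bond.kind == t else 0.0 for t in BOND_TYPES]
    return one_hot + [BOND_ORDERS[bond.kind]]


def build_graph(atoms, bonds):
    """Return node features, directed edge list, and edge features.

    Each chemical bond is stored as two directed edges, a common
    convention for message-passing GNNs.
    """
    x = [node_features(a) for a in atoms]
    edge_index, edge_attr = [], []
    for b in bonds:
        feats = edge_features(b)
        for src, dst in ((b.begin, b.end), (b.end, b.begin)):
            edge_index.append((src, dst))
            edge_attr.append(feats)
    return x, edge_index, edge_attr


# Heavy-atom skeleton of acetamide, SMILES CC(=O)N.
atoms = [Atom("C"), Atom("C"), Atom("O"), Atom("N")]
bonds = [Bond(0, 1), Bond(1, 2, "double"), Bond(1, 3)]
x, edge_index, edge_attr = build_graph(atoms, bonds)
```

In practice the atoms and bonds would be parsed from a SMILES string by a cheminformatics toolkit rather than written by hand, and the resulting lists would be converted to tensors before being passed to a GNN.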

Item Type: Article
Identification Number: 10.1038/s41598-025-22126-8
Dates: Accepted 25 September 2025; Published Online 31 October 2025
Uncontrolled Keywords: EGFR inhibitors, Graph neural networks (GNN), Fingerprints, Molecular docking, Substructures, Targeted cancer therapy
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Architecture, Built Environment, Computing and Engineering > Computer Science
Depositing User: Gemma Tonks
Date Deposited: 10 Nov 2025 14:28
Last Modified: 10 Nov 2025 14:28
URI: https://www.open-access.bcu.ac.uk/id/eprint/16712
