DeepEGFR a graph neural network for bioactivity classification of EGFR inhibitors
Malik, Aijaz Ahmad and Khyriem, Costerwell and Hauns, Sven and Khan, Imran and Pinto, Frederico G. and Al-Sadi, Azzat and Mohammad, Rasheed and Tran, Van Dinh and Backofen, Rolf and Soares, Nelson and Uddin, Mohammed and Alkhnbashi, Omer S. (2025) DeepEGFR a graph neural network for bioactivity classification of EGFR inhibitors. Scientific Reports, 15 (1). ISSN 2045-2322
s41598-025-22126-8.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
Epidermal Growth Factor Receptor (EGFR) plays a critical role in the development of several cancers. Modulation or inhibition of EGFR activity is therefore an appealing strategy for developing novel cancer therapeutics. With the advent of modern machine learning technologies, it is now possible to simulate interactions between EGFR and small molecules with high precision and to predict inhibitory/modulatory activity at an unprecedented scale. In this work, we propose a novel machine-learning method for fast and precise classification of small compounds as active, intermediate, or inactive in inhibiting/modulating EGFR activity. We developed DeepEGFR, a novel multi-class graph neural network (GNN) model, to classify compounds into Active, Inactive, and Intermediate functional categories. DeepEGFR leverages complementary molecular representations, combining SMILES strings and molecular fingerprint matrices (Klekota-Roth and PubChem) to capture both structural and property-based features of compounds. The model constructs a molecular graph whose nodes and edges represent atom type, formal charge, bond type, and bond order. DeepEGFR achieved superior performance compared to baseline machine learning algorithms (e.g., SVM, Random Forest, ANN), with F1-scores of approximately 94% on both training and test datasets for all activity classes. To ensure interpretability, the top 20 features identified by DeepEGFR were validated against the five key characteristics of FDA-approved EGFR inhibitors (Afatinib, Gefitinib, Osimertinib, Dacomitinib, Erlotinib), confirming the biological relevance of the features. Moreover, DeepEGFR successfully identified 300 underexplored EGFR-targeting compounds, demonstrating its potential to accelerate the discovery of therapeutic agents. These results highlight the effectiveness of graph neural networks in advancing molecular activity classification, setting a potential new benchmark for EGFR inhibitor prediction.
These findings demonstrate DeepEGFR's ability to highlight promising EGFR inhibitors that have received limited prior investigation, thereby supporting its role in facilitating the rational development of targeted therapies for precision oncology.
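As an illustration of the graph representation described in the abstract, the following is a minimal sketch of how atom type and formal charge could be stored as node features, and bond type and bond order as edge features. It is not the authors' implementation: the vocabularies `ATOM_TYPES` and `BOND_TYPES`, the `MolecularGraph` class, and the hand-built acetamide example are all hypothetical, and a real pipeline would normally derive the graph from SMILES with a cheminformatics toolkit rather than by hand.

```python
from dataclasses import dataclass, field

# Hypothetical vocabularies; the abstract does not specify the actual encoding.
ATOM_TYPES = ["C", "N", "O", "F", "Cl"]
BOND_TYPES = ["single", "double", "triple", "aromatic"]
BOND_ORDER = {"single": 1.0, "double": 2.0, "triple": 3.0, "aromatic": 1.5}


def one_hot(value, vocabulary):
    """Return a one-hot list for value; all zeros if value is not in vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


@dataclass
class MolecularGraph:
    node_features: list = field(default_factory=list)  # one vector per atom
    edges: list = field(default_factory=list)          # (source, target) index pairs
    edge_features: list = field(default_factory=list)  # one vector per directed edge

    def add_atom(self, symbol, formal_charge=0):
        """Add a node encoding atom type (one-hot) and formal charge; return its index."""
        self.node_features.append(one_hot(symbol, ATOM_TYPES) + [float(formal_charge)])
        return len(self.node_features) - 1

    def add_bond(self, i, j, bond_type):
        """Add a bond encoding bond type (one-hot) and bond order."""
        feature = one_hot(bond_type, BOND_TYPES) + [BOND_ORDER[bond_type]]
        # Store both directions so information can flow either way along a bond.
        for src, dst in ((i, j), (j, i)):
            self.edges.append((src, dst))
            self.edge_features.append(feature)


def build_acetamide():
    """Hand-built graph for acetamide, CC(=O)N, heavy atoms only."""
    g = MolecularGraph()
    c1 = g.add_atom("C")
    c2 = g.add_atom("C")
    o = g.add_atom("O")
    n = g.add_atom("N")
    g.add_bond(c1, c2, "single")
    g.add_bond(c2, o, "double")
    g.add_bond(c2, n, "single")
    return g


graph = build_acetamide()
```

Node and edge feature lists of this shape are the kind of input a GNN layer consumes; the fingerprint matrices mentioned in the abstract would be supplied as separate per-compound features.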
| Item Type: | Article |
|---|---|
| Identification Number: | 10.1038/s41598-025-22126-8 |
| Dates: | 25 September 2025: Accepted; 31 October 2025: Published Online |
| Uncontrolled Keywords: | EGFR inhibitors, Graph neural networks (GNN), Fingerprints, Molecular docking, Substructures, Targeted cancer therapy |
| Subjects: | CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science |
| Divisions: | Architecture, Built Environment, Computing and Engineering > Computer Science |
| Depositing User: | Gemma Tonks |
| Date Deposited: | 10 Nov 2025 14:28 |
| Last Modified: | 10 Nov 2025 14:28 |
| URI: | https://www.open-access.bcu.ac.uk/id/eprint/16712 |