Advancing Urdu named entity recognition: deep learning for aspect targeting
Aziz, Kamran and Ahmed, Naveed and Yu, Yaoxiang and Hadi, Hassan Jalil and Alshara, Mohammaed Ali and Tariq, Umair and Ji, Donghong (2025) Advancing Urdu named entity recognition: deep learning for aspect targeting. Complex & Intelligent Systems, 11 (12). ISSN 2199-4536
Text: s40747-025-02066-6.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
This study presents a Named Entity Recognition (NER) system designed specifically for Urdu news headlines, aimed at bridging gaps in the language's linguistic resources. We developed a comprehensive corpus from diverse news sources, tailored to reflect Urdu’s orthographic and morphological characteristics. Our approach combines state-of-the-art (SOTA) neural components: transformers for deep contextual embeddings, Graph Convolutional Networks (GCN) for syntactic analysis, and Biaffine Attention mechanisms to model inter-token relationships. A Conditional Random Field (CRF) layer further ensures accurate and consistent entity labeling, improving the system’s precision. We first benchmarked the model with established transformer models such as XLM-R, mBERT, and XLNet to set baseline performance. We then integrated the encoders of generative models such as mBART and mT5, allowing a thorough comparative evaluation of these encoders against the baselines. This phase aimed to assess their potential for detecting implicit entities, thus enhancing the model’s usefulness for complex searches and automated content categorization on Urdu digital platforms. These improvements contribute to computational linguistics by extending SOTA language technologies to under-resourced languages and promoting greater inclusivity in Natural Language Processing (NLP).
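To illustrate how a CRF-style decoding step can keep entity labels consistent, the following is a minimal sketch of Viterbi decoding with BIO transition constraints. It is not the authors' implementation: the tag set, the `allowed` and `viterbi` helpers, and the emission scores are all hypothetical, and a trained CRF would learn transition scores rather than use the hard allow/forbid rule assumed here.

```python
# Illustrative sketch only: tags, scores, and helper names are assumptions,
# not taken from the paper.
NEG_INF = float("-inf")
TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]


def allowed(prev, curr):
    """Hard BIO constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        entity = curr[2:]
        return prev in ("B-" + entity, "I-" + entity)
    return True


def viterbi(emissions, tags=TAGS):
    """Return the best-scoring tag sequence that respects the BIO constraint.

    emissions: list with one dict per token, mapping tag -> score.
    """
    if not emissions:
        return []
    # An I-X tag cannot open a sequence.
    score = {t: NEG_INF if t.startswith("I-") else emissions[0][t] for t in tags}
    backpointers = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for curr in tags:
            best_prev, best = None, NEG_INF
            for prev in tags:
                if allowed(prev, curr) and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[curr] = NEG_INF if best_prev is None else best + em[curr]
            ptr[curr] = best_prev
        backpointers.append(ptr)
        score = new_score
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-token headline. Picking the top tag per token
# would give B-PER, I-LOC, O, which is not a valid BIO sequence.
example = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5, "B-LOC": 0.0, "I-LOC": 0.0},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 0.3, "B-LOC": 0.0, "I-LOC": 1.5},
    {"O": 1.0, "B-PER": 0.0, "I-PER": 0.0, "B-LOC": 0.0, "I-LOC": 0.0},
]
print(viterbi(example))  # ['B-PER', 'I-PER', 'O']
```

In this toy case the constrained decoder prefers `I-PER` over the locally higher-scoring `I-LOC` for the second token, because `I-LOC` cannot follow `B-PER`.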
| Item Type: | Article |
|---|---|
| Identification Number: | 10.1007/s40747-025-02066-6 |
| Dates: | 18 August 2025 (Accepted); 29 October 2025 (Published Online) |
| Uncontrolled Keywords: | Named Entity Recognition, Data mining, NLP, Entity extraction, XLM-R, Deep learning |
| Subjects: | CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science |
| Divisions: | Architecture, Built Environment, Computing and Engineering > Computer Science |
| Depositing User: | Gemma Tonks |
| Date Deposited: | 18 Nov 2025 13:49 |
| Last Modified: | 18 Nov 2025 13:49 |
| URI: | https://www.open-access.bcu.ac.uk/id/eprint/16725 |