Retrieval-Augmented Large Language Model for AWS Cloud Threat Detection and Modelling: Cloudtrail Mitre ATT&CK Mapping
Adediran, Goodness and Awuson-David, Kenny and Ahmed, Yussuf (2026) Retrieval-Augmented Large Language Model for AWS Cloud Threat Detection and Modelling: Cloudtrail Mitre ATT&CK Mapping. Computers, Materials & Continua. ISSN 1546-2226
Text: TSP_CMC_77606.pdf (7MB) - Published Version. Available under License Creative Commons Attribution.
Abstract
The Amazon Web Services (AWS) CloudTrail auditing service provides detailed records of operational and security events, enabling cloud administrators to monitor user activity and manage compliance. Although signature-based threat detection methods have been enhanced with machine learning and Large Language Models (LLMs), these approaches remain limited in addressing emerging threats. This study evaluates a two-step Retrieval-Augmented Generation (RAG) approach using Gemini 2.5 Pro to enhance threat detection accuracy and contextual relevance. The RAG system integrates external cybersecurity knowledge sources, including the MITRE ATT&CK framework, the AWS Threat Technique Catalogue, and threat reports, to overcome the limitations of static pre-trained LLMs. We constructed an evaluation dataset of 200 unique CloudTrail events (122 malicious, 78 benign) using the Stratus Red Team adversary emulation framework, covering 9 MITRE ATT&CK techniques across 8 tactics. The events were drawn from a pool of 1724 events using stratified sampling. Ground truth labels were created through systematic expert annotation with 90% inter-annotator agreement. The RAG-enabled model achieved an estimated 78% accuracy, 85% precision, and 79% F1-score, representing a 70.5% relative improvement in accuracy and a 76.4% relative improvement in F1-score over baseline Gemini 2.5 Pro (46% accuracy, 45% F1-score). These performance figures are based on evaluation results on the 200-event dataset. Cost-latency analysis revealed a processing time of 4.1 s and a cost of $0.00376 per event, comparable to commercial SIEM solutions while providing superior MITRE ATT&CK attribution. The findings demonstrate that RAG substantially enhances context-aware threat detection, providing actionable insights for cloud security operations.
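The improvement and cost figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative and not taken from the paper: it assumes the quoted improvements are relative gains over the baseline, and that events are processed one at a time so per-event time and cost scale linearly. The function names `relative_improvement` and `batch_cost` are hypothetical. With the rounded percentages from the abstract, the gains come out at about 69.6% (accuracy) and 75.6% (F1-score), slightly below the reported 70.5% and 76.4%, which were presumably computed from unrounded scores.

```python
# Figures quoted in the abstract (rounded percentages and per-event costs).
BASELINE = {"accuracy": 0.46, "f1": 0.45}
RAG = {"accuracy": 0.78, "f1": 0.79}
SECONDS_PER_EVENT = 4.1
USD_PER_EVENT = 0.00376


def relative_improvement(new: float, old: float) -> float:
    """Return the gain of `new` over `old` as a percentage of `old`."""
    return (new - old) / old * 100.0


def batch_cost(n_events: int) -> tuple[float, float]:
    """Return (total seconds, total USD), assuming strictly sequential processing."""
    return n_events * SECONDS_PER_EVENT, n_events * USD_PER_EVENT


if __name__ == "__main__":
    for metric in ("accuracy", "f1"):
        gain = relative_improvement(RAG[metric], BASELINE[metric])
        print(f"{metric}: {gain:.1f}% relative improvement")
    seconds, usd = batch_cost(200)
    print(f"200 events: {seconds:.0f} s, ${usd:.3f}")
```

Under these assumptions, running the full 200-event evaluation set would take roughly 820 s and cost about $0.75.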
| Item Type: | Article |
|---|---|
| Identification Number: | 10.32604/cmc.2026.077606 |
| Dates: | 14 February 2026 (Accepted); 28 February 2026 (Published Online) |
| Uncontrolled Keywords: | Retrieval-augmented generation, Amazon web services, LLM, cloud service provider, threat detection, threat modelling, MITRE ATT&CK, RAG-enabled model, RAG-enabled LLM system |
| Subjects: | CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science |
| Divisions: | Architecture, Built Environment, Computing and Engineering > Computer Science |
| Depositing User: | Gemma Tonks |
| Date Deposited: | 10 Mar 2026 11:11 |
| Last Modified: | 10 Mar 2026 11:11 |
| URI: | https://www.open-access.bcu.ac.uk/id/eprint/16916 |