Retrieval-Augmented Large Language Model for AWS Cloud Threat Detection and Modelling: Cloudtrail Mitre ATT&CK Mapping

Adediran, Goodness and Awuson-David, Kenny and Ahmed, Yussuf (2026) Retrieval-Augmented Large Language Model for AWS Cloud Threat Detection and Modelling: Cloudtrail Mitre ATT&CK Mapping. Computers, Materials & Continua. ISSN 1546-2226

TSP_CMC_77606.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

The Amazon Web Services (AWS) CloudTrail auditing service provides detailed records of operational and security events, enabling cloud administrators to monitor user activity and manage compliance. Although signature-based threat detection methods have been enhanced with machine learning and Large Language Models (LLMs), these approaches remain limited in addressing emerging threats. This study evaluates a two-step Retrieval-Augmented Generation (RAG) approach using Gemini 2.5 Pro to enhance threat detection accuracy and contextual relevance. The RAG system integrates external cybersecurity knowledge sources, including the MITRE ATT&CK framework, the AWS Threat Technique Catalogue, and threat reports, to overcome the limitations of static pre-trained LLMs. We constructed an evaluation dataset of 200 unique CloudTrail events (122 malicious, 78 benign) using the Stratus Red Team adversary emulation framework, covering 9 MITRE ATT&CK techniques across 8 tactics. The events were drawn from 1724 total events using stratified sampling. Ground truth labels were created through systematic expert annotation with 90% inter-annotator agreement. The RAG-enabled model achieved an estimated 78% accuracy, 85% precision, and 79% F1-score, a relative improvement of 70.5% in accuracy and 76.4% in F1-score over baseline Gemini 2.5 Pro (46% accuracy, 45% F1-score). These performance figures are based on evaluation results on the 200-event dataset. Cost-latency analysis revealed a processing time of 4.1 s and a cost of $0.00376 per event, comparable to commercial SIEM solutions while providing superior MITRE ATT&CK attribution. The findings demonstrate that RAG substantially enhances context-aware threat detection, providing actionable insights for cloud security operations.
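
The headline figures in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the paper: it takes the rounded values quoted above, and the names `relative_improvement` and `summarize` are hypothetical helpers introduced here. It assumes the reported gains are relative (percentage change over the baseline) and that events are processed one at a time when totalling latency; neither assumption is stated in the abstract.

```python
# Rounded figures as quoted in the abstract (not the unrounded study data).
BASELINE = {"accuracy": 0.46, "f1": 0.45}
RAG = {"accuracy": 0.78, "f1": 0.79}

N_EVENTS = 200
N_MALICIOUS = 122
COST_PER_EVENT_USD = 0.00376
LATENCY_PER_EVENT_S = 4.1


def relative_improvement(new: float, old: float) -> float:
    """Return the percentage change from old to new, relative to old."""
    return (new - old) / old * 100.0


def summarize() -> dict:
    """Derive summary quantities from the rounded abstract figures."""
    return {
        # Relative gains recomputed from the rounded scores.
        "accuracy_gain_pct": round(
            relative_improvement(RAG["accuracy"], BASELINE["accuracy"]), 1
        ),
        "f1_gain_pct": round(relative_improvement(RAG["f1"], BASELINE["f1"]), 1),
        # Class balance of the evaluation set.
        "malicious_share_pct": round(N_MALICIOUS / N_EVENTS * 100.0, 1),
        # Totals for the whole dataset, assuming sequential processing.
        "total_cost_usd": round(COST_PER_EVENT_USD * N_EVENTS, 3),
        "total_latency_s": round(LATENCY_PER_EVENT_S * N_EVENTS, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

Recomputed from the rounded scores, the gains come out at roughly 69.6% (accuracy) and 75.6% (F1-score), slightly below the reported 70.5% and 76.4%. That gap would be consistent with the reported gains having been computed from unrounded scores, though the abstract does not say so.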

Item Type: Article
Identification Number: 10.32604/cmc.2026.077606
Dates: Accepted 14 February 2026; Published Online 28 February 2026
Uncontrolled Keywords: Retrieval-augmented generation, Amazon web services, LLM, cloud service provider, threat detection, threat modelling, MITRE ATT&CK, RAG-enabled model, RAG-enabled LLM system
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Architecture, Built Environment, Computing and Engineering > Computer Science
Depositing User: Gemma Tonks
Date Deposited: 10 Mar 2026 11:11
Last Modified: 10 Mar 2026 11:11
URI: https://www.open-access.bcu.ac.uk/id/eprint/16916
