Data Augmentation and Deep Learning Methods in Sound Classification: A Systematic Review

Abayomi-Alli, Olusola O. and Damaševičius, Robertas and Qazi, Atika and Adedoyin-Olowe, Mariam and Misra, Sanjay (2022) Data Augmentation and Deep Learning Methods in Sound Classification: A Systematic Review. Electronics, 11 (22). p. 3795. ISSN 2079-9292

electronics-11-03795.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

The aim of this systematic literature review (SLR) is to identify and critically evaluate current research advances concerning small data and the use of data augmentation methods to increase the amount of data available to deep learning classifiers for sound classification (including voice, speech, and related audio signals). Methodology: This SLR followed standard SLR guidelines based on PRISMA, and three bibliographic databases were examined: Web of Science, SCOPUS, and IEEE Xplore. Findings: The initial search, using a variety of keyword combinations and covering the last five years (2017–2021), returned a total of 131 papers. To select relevant articles within the scope of this study, we applied screening exclusion criteria and snowballing (forward and backward), which resulted in 56 selected articles. Originality: Shortcomings of previous research studies include the lack of sufficient data, weakly labelled data, unbalanced datasets, noisy datasets, poor representations of sound features, and the lack of an effective augmentation approach, all of which affect the overall performance of classifiers and are discussed in this article. Following the analysis of the identified articles, we give an overview of the sound datasets, feature extraction methods, and data augmentation techniques, and of their applications in different areas of the sound classification research problem. Finally, we conclude with a summary of the SLR, answers to the research questions, and recommendations for the sound classification task.

Item Type: Article
Identification Number: https://doi.org/10.3390/electronics11223795
Dates:
16 November 2022: Accepted
18 November 2022: Published Online
Uncontrolled Keywords: sound data, audio data, data augmentation, feature extraction, deep learning
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Depositing User: Gemma Tonks
Date Deposited: 17 Jan 2023 10:10
Last Modified: 17 Jan 2023 10:10
URI: https://www.open-access.bcu.ac.uk/id/eprint/14120
