Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks
Southall, Carl and Stables, Ryan and Hockman, Jason (2017) Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks. In: Proceedings of the International Society of Music Information Retrieval Conference. International Society of Music Information Retrieval. ISBN 978-0-692-75506-8
Text: Automatic drum transcription for polyphonic.pdf - Accepted Version. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Automatic drum transcription (ADT) is the process of generating symbolic notation for percussion instruments within audio recordings. To date, recurrent neural network (RNN) systems have achieved the highest evaluation accuracies for both drum solo and polyphonic recordings; however, the accuracies within a polyphonic context remain relatively low. To improve accuracy for polyphonic recordings, we present two approaches to the ADT problem. First, to capture the dynamism of features in multiple time-step hidden layers, we propose the use of soft attention mechanisms (SA) and an alternative RNN configuration containing additional peripheral connections (PC). Second, to capture these same trends at the input level, we propose the use of a convolutional neural network (CNN), which uses a larger set of time-step features. In addition, we propose the use of a bidirectional recurrent neural network (BRNN) in the peak-picking stage. The proposed systems are evaluated along with two state-of-the-art ADT systems in five evaluation scenarios, including a newly proposed evaluation methodology designed to assess the generalisability of ADT systems. The results indicate that all of the newly proposed systems achieve higher accuracies than the state-of-the-art RNN systems for polyphonic recordings and that the additional BRNN peak-picking stage offers slight improvement in certain contexts.
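As a rough illustration of the soft attention idea mentioned in the abstract, the sketch below weights a short sequence of time-step hidden states by a softmax over their scores and sums them into a single context vector. It is a generic dot-product formulation written for this record, not the architecture used in the paper; the `query` vector, the toy `hidden_states`, and both function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(hidden_states, query):
    """Combine time-step hidden states into one context vector.

    hidden_states: list of equal-length vectors, one per time step.
    query: vector of the same length (assumed here; in a trained
           network this would be learned or derived from the model state).
    Returns (context_vector, attention_weights).
    """
    # Score each time step by its dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalise the scores so the weights are positive and sum to one.
    weights = softmax(scores)
    # The context vector is the weighted sum of the hidden states.
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy example: three time steps, two hidden units each (made-up values).
hidden_states = [[0.1, 0.0], [0.9, 0.2], [0.3, 0.1]]
query = [2.0, 0.0]
context, weights = soft_attention(hidden_states, query)
```

Because the weights are a smooth function of the scores, every time step contributes to the context vector to some degree, which is what distinguishes soft attention from selecting a single time step.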
Item Type: | Book Section |
---|---|
Dates: | 2017 (Published) |
Subjects: | CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science |
Divisions: | Faculty of Computing, Engineering and the Built Environment; Faculty of Computing, Engineering and the Built Environment > College of Computing |
Depositing User: | Oana-Andreea Dumitrascu |
Date Deposited: | 28 Jun 2017 08:32 |
Last Modified: | 07 Feb 2025 15:26 |
URI: | https://www.open-access.bcu.ac.uk/id/eprint/4747 |