Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks

Southall, Carl and Stables, Ryan and Hockman, Jason (2017) Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks. In: Proceedings of the International Society for Music Information Retrieval Conference. International Society for Music Information Retrieval. ISBN 978-0-692-75506-8

Automatic drum transcription for polyphonic.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Automatic drum transcription (ADT) is the process of generating symbolic notation for percussion instruments within audio recordings. To date, recurrent neural network (RNN) systems have achieved the highest evaluation accuracies for both drum solo and polyphonic recordings; however, accuracies within a polyphonic context remain relatively low. To improve accuracy for polyphonic recordings, we present two approaches to the ADT problem. First, to capture the dynamism of features in multiple time-step hidden layers, we propose the use of soft attention mechanisms (SA) and an alternative RNN configuration containing additional peripheral connections (PC). Second, to capture these same trends at the input level, we propose the use of a convolutional neural network (CNN), which uses a larger set of time-step features. In addition, we propose the use of a bidirectional recurrent neural network (BRNN) in the peak-picking stage. The proposed systems are evaluated along with two state-of-the-art ADT systems in five evaluation scenarios, including a newly proposed evaluation methodology designed to assess the generalisability of ADT systems. The results indicate that all of the newly proposed systems achieve higher accuracies than the state-of-the-art RNN systems for polyphonic recordings and that the additional BRNN peak-picking stage offers a slight improvement in certain contexts.
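
As a rough illustration of the soft attention idea mentioned in the abstract, the sketch below computes a softmax-weighted sum over per-time-step hidden states. It is a generic formulation, not the authors' configuration: the function name `soft_attention`, the dot-product scoring, and the parameter `score_vector` are assumptions made for this example only.

```python
import numpy as np

def soft_attention(hidden_states, score_vector):
    """Generic soft attention over time steps (illustrative only).

    hidden_states: array of shape (T, H), one hidden vector per time step.
    score_vector:  array of shape (H,), a hypothetical learned parameter
                   used to score each time step.
    Returns the context vector (H,) and the attention weights (T,).
    """
    scores = hidden_states @ score_vector        # one scalar score per time step
    scores = scores - scores.max()               # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time steps
    context = weights @ hidden_states            # weighted sum of hidden states
    return context, weights

# Example with random stand-in values for five time steps of four features each.
rng = np.random.default_rng(0)
h = rng.standard_normal((5, 4))
v = rng.standard_normal(4)
context, weights = soft_attention(h, v)
```

The weights are positive and sum to one, so the context vector is a convex combination of the hidden states; with an all-zero `score_vector` it reduces to their plain average.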

Item Type: Book Section
Dates: 2017 (Published)
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Faculty of Computing, Engineering and the Built Environment
Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Depositing User: Oana-Andreea Dumitrascu
Date Deposited: 28 Jun 2017 08:32
Last Modified: 22 Mar 2023 12:01
URI: https://www.open-access.bcu.ac.uk/id/eprint/4747
