Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks

Hockman, Jason and Southall, Carl and Stables, Ryan (2017) Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks. In: Proceedings of the International Society of Music Information Retrieval Conference. International Society of Music Information Retrieval. ISBN 978-0-692-75506-8 (In Press)

Automatic drum transcription for polyphonic.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Automatic drum transcription (ADT) is the process of generating symbolic notation for percussion instruments within audio recordings. To date, recurrent neural network (RNN) systems have achieved the highest evaluation accuracies for both drum solo and polyphonic recordings; however, the accuracies within a polyphonic context remain relatively low. To improve accuracy for polyphonic recordings, we present two approaches to the ADT problem. First, to capture the dynamism of features in multiple time-step hidden layers, we propose the use of soft attention mechanisms (SA) and an alternative RNN configuration containing additional peripheral connections (PC). Second, to capture these same trends at the input level, we propose the use of a convolutional neural network (CNN), which uses a larger set of time-step features. In addition, we propose the use of a bidirectional recurrent neural network (BRNN) in the peak-picking stage. The proposed systems are evaluated along with two state-of-the-art ADT systems in five evaluation scenarios, including a newly proposed evaluation methodology designed to assess the generalisability of ADT systems. The results indicate that all of the newly proposed systems achieve higher accuracies than the state-of-the-art RNN systems for polyphonic recordings and that the additional BRNN peak-picking stage offers slight improvement in certain contexts.
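
For readers unfamiliar with the term, the general idea behind soft attention over several time-step hidden states can be sketched as follows. This is only an illustrative toy, not the configuration used in the paper: the dot-product scoring, the `query` vector, the function names, and the example values are all assumptions made here for demonstration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(hidden_states, query):
    """Return attention weights and the weighted sum of hidden states.

    Scoring by dot product with `query` is an assumption for this sketch;
    the abstract does not specify how attention scores are computed.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for three consecutive time steps (2 units each).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # hypothetical query vector

weights, context = soft_attention(hidden_states, query)
```

Because the weights come from a softmax, they are non-negative and sum to one, so every time step contributes to the context vector in proportion to its score rather than a single step being selected outright.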

Item Type: Book Section
Subjects: G400 Computer Science
Divisions: Faculty of Computing, Engineering and the Built Environment
Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology
Faculty of Computing, Engineering and the Built Environment > School of Computing and Digital Technology > Digital Media Technology
UoA Collections > UoA11: Computer Science and Informatics
Depositing User: Oana-Andreea Dumitrascu
Date Deposited: 28 Jun 2017 08:32
Last Modified: 15 Aug 2017 09:45
URI: http://www.open-access.bcu.ac.uk/id/eprint/4747
