Applications of loudness models in audio engineering

Ward, Dominic (2017) Applications of loudness models in audio engineering. Doctoral thesis, Birmingham City University.


This thesis investigates the application of perceptual models to areas of audio engineering, with a particular focus on music production. The goal was to establish efficient and practical tools for the measurement and control of the perceived loudness of musical sounds. Two types of loudness model were investigated: the single-band model and the multiband excitation pattern (EP) model. The heuristic single-band devices were designed to be simple but sufficiently effective for real-world application, whereas the multiband procedures were developed to give a reasonable account of a large body of psychoacoustic findings according to a functional model of the peripheral hearing system. The research addresses the extent to which current models of loudness generalise to musical instruments, and whether they can be successfully employed in music applications. The domain-specific disparity between the two types of model was first tackled by reducing the computational load of state-of-the-art EP models to allow for fast but low-error auditory signal processing. Two elaborate hearing models were analysed and optimised using musical instruments and speech as test stimuli. It was shown that, after significantly reducing the complexity of both procedures, estimates of global loudness, such as peak loudness, as well as the intermediate auditory representations, can be preserved with high accuracy. Based on the optimisations, two real-time applications were developed: a binaural loudness meter and an automatic multitrack mixer. The mixer was designed to work independently of the loudness measurement procedure, and therefore supports both linear and nonlinear models. This allowed a single mixing device to be assessed using different loudness metrics, which was demonstrated by evaluating three configurations through subjective assessment.
Unexpectedly, when asked to rate both the overall quality of a mix and the degree to which instruments were equally loud, listeners preferred mixes generated using heuristic single-band models over those produced using a multiband procedure. A series of more systematic listening tests was conducted to investigate this finding further. Subjective loudness matches of musical instruments commonly found in Western popular music were collected to evaluate the performance of five published models. The results were in accord with the application-based assessment, namely that current EP procedures do not generalise well when estimating the relative loudness of musical sounds that have marked differences in spectral content. Model-specific issues were identified relating to the calculation of spectral loudness summation (SLS) and the method used to determine the global-loudness percept of time-varying musical sounds; associated refinements were proposed. It was shown that a new multiband loudness model with a heuristic loudness transformation yields superior performance to existing methods. This supports the idea that a revised model of SLS is needed, and therefore that modification of this stage in existing psychoacoustic procedures is an essential step towards the goal of achieving real-world deployment.
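
The idea of a mixer that works independently of the loudness measurement procedure can be illustrated with a minimal sketch. This is not the implementation described in the thesis: the function names, the RMS-based stand-in for a loudness model, and the iterative update rule are all assumptions made for illustration. The sketch treats the loudness measure as a black box that returns a level in dB-like units (for example phons or LUFS) and adjusts each track's gain until the measure reaches a common target. The additive update assumes the measure rises by roughly 1 dB per dB of gain, which is exact for a linear level measure and only approximate for a nonlinear model, hence the loop.

```python
import math
from typing import Callable, List, Sequence

Signal = Sequence[float]
# Any loudness measure that maps a signal to a level in dB-like units.
LoudnessFn = Callable[[Signal], float]


def rms_level_db(x: Signal) -> float:
    """Hypothetical stand-in for a loudness model: RMS level in dB."""
    mean_square = sum(s * s for s in x) / len(x)
    return 10.0 * math.log10(mean_square + 1e-12)


def equal_loudness_gains(
    tracks: Sequence[Signal],
    loudness: LoudnessFn,
    target_db: float,
    tol_db: float = 0.1,
    max_iter: int = 20,
) -> List[float]:
    """Return one gain (in dB) per track so each track measures near target_db.

    The loudness function is only ever called, never inspected, so a
    single-band or multiband model could be swapped in (assumption:
    the model reports a level in dB-like units).
    """
    gains = []
    for track in tracks:
        gain_db = 0.0
        for _ in range(max_iter):
            scale = 10.0 ** (gain_db / 20.0)
            error = target_db - loudness([scale * s for s in track])
            if abs(error) < tol_db:
                break
            gain_db += error  # one step suffices for a linear measure
        gains.append(gain_db)
    return gains


# Usage: two tones that differ by 20 dB are brought to the same level.
tone = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]
quiet = [0.1 * s for s in tone]
gains = equal_loudness_gains([tone, quiet], rms_level_db, target_db=-20.0)
```

Because the mixer only calls `loudness(...)`, comparing different loudness metrics amounts to passing a different function while the gain-setting logic stays the same.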

Item Type: Thesis (Doctoral)
Additional Information: Firstly, I must acknowledge my supervisor Cham Athwal. He has guided me throughout this long journey and always made himself available when I needed support, even at difficult times. Cham's continued encouragement, reassurance, motivation and belief in me were paramount to my completing this thesis. I would like to express my gratitude to Joshua Reiss for his involvement in my work; his ideas and advice have been invaluable. Furthermore, a big thank you to Munevver Kokuer for her attention to detail and for teaching me the ways of LaTeX. I'd like to acknowledge my colleagues at DMT Lab for fun times amidst interesting technical discussions: Ryan Stables, Matthew Cheshire, Yonghao Wang, Gregory Hough, Ian Williams, Sam Smith, Alan Dolhasz and Izzy MacLachlan. A special thanks to Sean Enderby for clarifying many technical issues and being a robust sounding board when battling concepts out with myself. I would also like to thank the Sound Engineering students at Birmingham City University, especially those who took the time to participate in my listening experiments. Thanks to Esben Skovenborg for his patience and maintained interest, despite being attacked by my perpetual questions about regression models he fit 10 years ago. Similarly, thanks to Brecht De Man for many fruitful discussions about our research. Furthermore, I am grateful for the invaluable techniques and skills I have developed during my time at The University of Birmingham. Alan Wing, Mark Elliot, Winnie Chua and Caroline Palmer, thanks for all your support. I would also like to acknowledge the following people for sharing code and clarifying implementation-level details: Harish Krishnamoorthi, Brian Moore and Brian Glasberg. A massive thank you to my family for their continued love and support; without you I would not be writing this final page. Special thanks to Charlie for reading through my work, despite telling me you didn't understand the majority of it.
Finally, thanks to Frank, a nightmare of a dog but my best friend.
Uncontrolled Keywords: Loudness, loudness models, psychoacoustics, audio engineering, automatic mixing
Subjects: H600 Electronic and Electrical Engineering
H900 Others in Engineering
Divisions: REF UoA Output Collections > Doctoral Theses Collection
Depositing User: Kip Darling
Date Deposited: 13 Mar 2019 15:25
Last Modified: 13 Mar 2019 15:25
