This week’s blog post features a podcast exploring the use of machine learning algorithms to assist audio producers in identifying and classifying specific aspects of an audio file. The podcast reviews and contextualises recent research carried out by academics from the Acoustics Research Centre at the University of Salford.
The podcast can also be found at the following link:
The topic was chosen after reviewing current research into the use of artificial intelligence in audio and identifying this area as one worth highlighting and exploring further. The research for the podcast was drawn from several papers listed in the sources section of this post, with a particular focus on the recent work of Dr Paul Kendrick and his co-authors.
His work on using artificial intelligence to automatically identify different species of bird in a complex field recording was of particular interest, as were its ecological implications. However, it was decided to place greater focus on his work on the automatic detection and identification of artifacts in audio files, such as wind noise and microphone handling noise, because it has more direct relevance to the work of audio production professionals.
The podcast format comprises a lone presenter delivering a monologue, with sound effects added to illustrate select points. The sound effects were sourced from freesound.org and were available under a Creative Commons CC0 1.0 Universal licence. The intro and outro music was taken from a remix I produced of "Let Up" by the band "The Slim".
The podcast was recorded using an Electro-Voice RE20 dynamic cardioid microphone with a pop filter and with the microphone's low-cut filter engaged. The microphone signal was recorded through an Audient ID44 audio interface straight into Logic Pro X. An instance of the Softube FET compressor plugin, set to a quick attack and release and a reasonably high ratio, was used to control the faster transients present in the recording.
The default Logic Pro X limiter was then used to apply further, more subtle level control and achieve a more consistent output level across the recording. Finally, the completed podcast was run through the Logic Pro X Loudness Meter to measure its integrated (averaged) loudness. The master gain was then adjusted so that the measured loudness was around -23 LUFS, the target level given in the EBU R 128 loudness recommendation.
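The final gain adjustment comes down to simple arithmetic, which the short Python sketch below illustrates. It is not part of the Logic Pro X workflow described above: the function names and the example meter reading of -19.5 LUFS are hypothetical, and it assumes the gain is applied after all dynamics processing, so that a static gain change shifts the measured loudness by the same number of decibels.

```python
TARGET_LUFS = -23.0  # loudness target named in this post


def gain_to_target(measured_lufs: float, target_lufs: float = TARGET_LUFS) -> float:
    """Return the gain change in dB that moves the measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    measured = -19.5  # hypothetical integrated loudness reading from the meter
    gain_db = gain_to_target(measured)
    # A negative result means the master gain should be turned down.
    print(f"Adjust master gain by {gain_db:+.1f} dB "
          f"(linear factor {db_to_linear(gain_db):.3f})")
```

If the gain were applied before the limiter instead, the limiter would react differently to the new level, so the result would need to be measured again.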
Sources
Kendrick, P., Cox, T., Li, F., Fazenda, B. and Jackson, I. (2013). Wind-induced microphone noise detection - automatically monitoring the audio quality of field recordings. 2013 IEEE International Conference on Multimedia and Expo (ICME).
Kendrick, P., Cox, T., Li, F., Fazenda, B. and Jackson, I. (2014). Automatic detection of microphone handling noise. 2014 4th International Workshop on Cognitive Information Processing (CIP).
Kendrick, P., Cox, T., Li, F., Fazenda, B. and Jackson, I. (2014). Perception and automatic detection of wind-induced microphone noise. The Journal of the Acoustical Society of America, 136(3), pp.1176-1186.
Kendrick, P., Cox, T., Li, F., Fazenda, B. and Jackson, I. (2016). Perception and automated assessment of audio quality in user generated content: An improved model. 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX).
Kendrick, P., Wood, M. and Barçante, L. (2017). Automated assessment of bird vocalization activity. The Journal of the Acoustical Society of America, 141(5), pp.3963-3964.