Cosmopolite Sound Monitoring (CoSMo): A Study of Urban Sound Event Detection Systems Generalizing to Multiple Cities
Abstract
Measuring noise in cities and automatically identifying the corresponding sound sources are crucial challenges for policymakers. Such information helps address noise pollution and improve the well-being of urban dwellers. In recent years, researchers have provided annotated datasets recorded in two major cities to foster the development of urban sound event detection (SED) systems. This paper presents an in-depth study of the behaviour of state-of-the-art SED systems well suited to our problem, combining three datasets of real far-field recordings that can be used jointly during training. In our evaluation, we highlight the performance gaps between simple and hard recording examples, based on the salience of sound events and the polyphony of the recordings. We provide new proximity annotations for this analysis. We evaluate the ability of urban SED systems to generalize across cities with varying degrees of training supervision. We show that such generalization is hindered mostly by the difficulty current urban SED systems have in detecting sound events with low salience and sound events in highly polyphonic soundscapes.
| Origin | Files produced by the author(s) |
|---|---|
| Licence | Copyright |