OpenSoundscape: Free, open-source software for automated animal sound identification using convolutional neural networks
Wednesday, August 4, 2021
ON DEMAND
Tessa Rhinehart and Samuel Lapp, Biological Sciences, University of Pittsburgh, Pittsburgh, PA; Barry E. Moore II, Center for Research Computing, University of Pittsburgh, Pittsburgh, PA; Zhongqi Miao, Department of Environmental Science, Policy, and Management, University of California, Berkeley, Berkeley, CA; Justin Kitzes, Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA
Presenting Author(s)
Tessa Rhinehart
Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
Background/Question/Methods
Autonomous acoustic recording is becoming an increasingly popular way to survey sound-producing animals such as birds, bats, and frogs. As the scale of acoustic data collection grows, more ecologists are turning to automated data processing techniques, especially machine learning classifiers. However, automated methods for identifying biological sounds remain in their infancy, in part because of the significant programming experience and procedural knowledge required to use state-of-the-art algorithms. Classifiers are not available for most species, and those that are often perform poorly on datasets recorded under different field conditions, or with different recording hardware, than the data the model was trained on. The code and data used to create these classifiers frequently have not been released open-source. To democratize the creation and application of machine learning models for bioacoustics, we created OpenSoundscape, an open-source Python package that reduces the technical barriers to automated sound identification. We developed simple but customizable functions for creating and using convolutional neural networks for bioacoustic identification. These functions build on audio processing operations that are also available to users directly, such as splitting audio data and creating spectrograms.
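As an illustration only (not taken from the abstract), the sketch below shows how such built-in audio operations might be used to split a recording into clips and convert them to spectrograms. The module paths, method names, and the file name field_recording.wav are assumptions based on typical OpenSoundscape usage and may differ between package versions; https://opensoundscape.org documents the current API.

    # Illustrative sketch; exact OpenSoundscape module paths and method
    # signatures vary between releases, so treat these calls as assumptions.
    from opensoundscape.audio import Audio
    from opensoundscape.spectrogram import Spectrogram

    # Load a field recording (hypothetical file name) and split it into 5-second clips
    audio = Audio.from_file("field_recording.wav")
    clip_length = 5.0
    starts = [i * clip_length for i in range(int(audio.duration() // clip_length))]
    clips = [audio.trim(start, start + clip_length) for start in starts]

    # Convert each clip to a spectrogram, then to an image suitable for a CNN
    spectrograms = [Spectrogram.from_audio(clip) for clip in clips]
    images = [spec.to_image(shape=(224, 224)) for spec in spectrograms]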
Results/Conclusions
OpenSoundscape simplifies the process of creating and using convolutional neural networks for bioacoustic identification. Its wrapper routines enable users with acoustic data labeled in CSV or Raven format to train binary or multiclass convolutional neural networks with any PyTorch-implemented model architecture (e.g., ResNet50, Inception v3, or custom implementations). Training includes optional data augmentation to improve model generalization (e.g., weighted audio overlay, random clip extraction, and Google Brain's SpecAugment). Users can predict species presence in acoustic data with their own classifiers or with pretrained classifiers; we provide pretrained ResNet18 binary classifiers for 500 North American bird species. OpenSoundscape's machine learning functionality scales from personal computers to computing clusters, is fully documented, and is open-source. Its built-in PyTorch parallelization enabled species-presence prediction for 7 TB (~30,000 hours) of field recordings in 3.5 hours using 20 GPUs and 100 CPUs. Demonstrations of OpenSoundscape that illustrate machine learning principles are available at https://opensoundscape.org, and its source code is available at https://github.com/kitzeslab/opensoundscape. OpenSoundscape lowers the technical barriers to using state-of-the-art classification techniques, empowering more ecologists to extract ecological insight from large-scale bioacoustic data.
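To make the training-and-prediction workflow above concrete, here is a minimal sketch assuming an API similar to the CNN class in recent OpenSoundscape releases. The import path, constructor arguments, and file names (train_labels.csv, valid_labels.csv, the .wav files) are assumptions for illustration; class names and arguments have changed across versions, so the documentation at https://opensoundscape.org is the authoritative reference.

    # Minimal sketch, not the exact API of any specific OpenSoundscape release.
    import pandas as pd
    from opensoundscape import CNN  # assumed import path; older releases exposed CNN classes elsewhere

    # Label tables indexed by audio clip, with one 0/1 column per class
    # (e.g., produced from CSV or Raven annotations); file names are hypothetical.
    train_df = pd.read_csv("train_labels.csv", index_col=0)
    validation_df = pd.read_csv("valid_labels.csv", index_col=0)

    # Create a CNN with any PyTorch-implemented architecture (here ResNet-18)
    model = CNN(
        architecture="resnet18",
        classes=train_df.columns.tolist(),
        sample_duration=5.0,  # seconds of audio per training sample
    )

    # Train with the built-in augmentation pipeline (audio overlay, random clips, SpecAugment)
    model.train(train_df, validation_df, epochs=10, batch_size=64, num_workers=8)

    # Predict species presence scores for new field recordings
    scores = model.predict(["new_recording_1.wav", "new_recording_2.wav"])
    print(scores.head())

Because prediction takes a plain list of audio files and uses PyTorch data loaders internally, the same call can in principle be scaled up by adding workers and GPUs, which is the kind of parallelization the cluster-scale prediction figures above refer to.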