|Publication Type:||Conference Paper|
|Year of Publication:||2019|
|Authors:||Abbasi, Balazs, Noll, Nicolakis, Marconi, Zala, Penn|
|Keywords:||Deep learning, Image-based classification, Ultrasonic vocalizations (USVs)|
Many species in diverse taxonomic groups, including rodents, bats, and insects, communicate with complex ultrasonic vocalizations (USVs; >20 kHz). Two main steps in processing and analyzing USV recordings are the detection of vocalizations and the classification of syllable types. We recently developed an efficient algorithm for detecting mouse USVs, the Automatic Mouse Ultrasound Detector (A-MUD). The main challenge is detecting USVs under low signal-to-noise conditions while minimizing the rate of false positives (FPs); mice produce many short USVs (<10 ms), which inflate the FP rate. We aimed to improve the detection of mouse USVs with A-MUD by classifying detected vocalizations into three discrete syllable types (with 0, 1, or ≥2 frequency jumps) or FPs. Supervised convolutional neural networks (CNNs) were trained on 2D gammatone spectrograms (GSs) adapted to the frequency range of mouse USVs. Performance evaluation showed that the CNNs achieved an overall accuracy of 95±1.2% and a macro-F1 score of 90±2.7%. In contrast, multilayer feed-forward neural networks fed vectorized spectrograms reached an overall accuracy of only 85.4±1.9% and a macro-F1 score of 75.4±2.9%; the chosen CNNs therefore outperformed this conventional classification method.
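The evaluation metrics reported above (overall accuracy and macro-F1 over the four output classes) can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the class label names are hypothetical placeholders for the three syllable types and the FP class.

```python
def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of predictions matching the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight,
    so rare classes (e.g. syllables with >=2 frequency jumps) count as much
    as common ones."""
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical label set: three syllable types plus the false-positive class.
LABELS = ["0-jump", "1-jump", ">=2-jump", "FP"]
```

Because macro-F1 weights all classes equally, a classifier that ignores a rare class is penalized more than under plain accuracy, which is likely why both metrics are reported.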
|Title:||Applying convolutional neural networks to the analysis of mouse ultrasonic vocalizations|