Deep Learning for Environmental Sound Classification: A BirdCLEF 2025 Case Study
Students: Martin Bolx, Emily Hernandez, Ryan Nguyen
Faculty Mentor: Gurman Gill
Computer Science
College of Science, Technology, and Business
Our project focuses on automated species identification from bioacoustic recordings collected in the Middle Magdalena Valley of Colombia, one of the most biodiverse regions in the world. As part of the BirdCLEF 2025 challenge, we aim to develop a machine learning system capable of identifying not only birds but also mammals, amphibians, and insects in real-world soundscape recordings. Building on the BirdNET framework developed by the K. Lisa Yang Center for Conservation Bioacoustics at Cornell University, we customized the model to recognize a broader range of species specific to this region. Using the BirdNET-Analyzer tools, we trained the model on curated audio datasets spanning diverse environmental conditions and recording equipment to improve robustness. From spectrogram representations of the audio, we fine-tuned pre-trained deep learning models such as ResNet and EfficientNet to classify species by their vocalizations. To improve accuracy and generalization, we applied data augmentation techniques, incorporated metadata, and compared single-label and multi-label classification strategies. Our approach emphasizes real-world application: building a lightweight model suitable for offline deployment in remote forest environments such as El Silencio Nature Reserve. Such a tool could allow researchers and conservationists to monitor species presence and biodiversity without constant human oversight or internet connectivity. Through this project, we explore the intersection of ecology, machine learning, and sustainability, demonstrating how artificial intelligence can contribute meaningfully to conservation efforts and ecological research.
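
As a rough illustration of the spectrogram-based fine-tuning described above, the following Python sketch converts a waveform into a log-mel spectrogram with torchaudio and attaches a new classification head to a pretrained EfficientNet from torchvision. The library choices, the 32 kHz sample rate, the 206-class label set, and all spectrogram parameters are assumptions for illustration, not the project's actual configuration.

import torch
import torch.nn as nn
import torchaudio
import torchvision

NUM_SPECIES = 206      # assumed size of the BirdCLEF 2025 label set
SAMPLE_RATE = 32000    # assumed recording sample rate

# Log-mel spectrogram front end (parameters are illustrative)
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE, n_fft=1024, hop_length=512, n_mels=128)
to_db = torchaudio.transforms.AmplitudeToDB()

def waveform_to_spectrogram(waveform: torch.Tensor) -> torch.Tensor:
    # waveform: (1, samples) mono audio -> (3, n_mels, frames) image-like tensor
    spec = to_db(mel(waveform))
    return spec.repeat(3, 1, 1)  # replicate channels for an ImageNet backbone

# Pretrained ImageNet backbone with a new species-classification head
model = torchvision.models.efficientnet_b0(weights="IMAGENET1K_V1")
model.classifier[1] = nn.Linear(model.classifier[1].in_features, NUM_SPECIES)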
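
The data augmentation mentioned above could, for example, combine waveform-level noise mixing with SpecAugment-style masking on the spectrogram. The sketch below uses torchaudio utilities; the abstract does not detail the exact augmentation recipe, so the functions and parameters here are assumed examples.

import torch
import torchaudio

freq_mask = torchaudio.transforms.FrequencyMasking(freq_mask_param=24)
time_mask = torchaudio.transforms.TimeMasking(time_mask_param=64)

def add_background_noise(clean: torch.Tensor, noise: torch.Tensor,
                         snr_db: float = 10.0) -> torch.Tensor:
    # Mix a background-noise clip into a clean clip at a chosen SNR in dB.
    # Assumes the noise clip is at least as long as the clean clip.
    noise = noise[..., : clean.shape[-1]]
    return torchaudio.functional.add_noise(clean, noise, torch.tensor([snr_db]))

def augment_spectrogram(spec: torch.Tensor) -> torch.Tensor:
    # Randomly mask frequency bands and time steps (SpecAugment-style)
    return time_mask(freq_mask(spec))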
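
The contrast between single-label and multi-label training comes down to the target encoding and loss function: a single-label setup assigns each clip exactly one species and uses cross-entropy, while a multi-label setup allows several species to co-occur in a soundscape window and uses per-class sigmoid outputs with binary cross-entropy. The batch size and class count below are illustrative.

import torch
import torch.nn as nn

logits = torch.randn(8, 206)                 # batch of 8 clips, 206 classes (assumed)

# Single-label: one integer species index per clip
single_targets = torch.randint(0, 206, (8,))
single_loss = nn.CrossEntropyLoss()(logits, single_targets)

# Multi-label: a 0/1 vector per clip, one entry per species heard in the window
multi_targets = torch.zeros(8, 206)
multi_targets[0, [3, 17]] = 1.0              # e.g. two species vocalize in the same clip
multi_loss = nn.BCEWithLogitsLoss()(logits, multi_targets)

# At inference time, multi-label predictions are thresholded per class
predicted_species = torch.sigmoid(logits) > 0.5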
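
For the offline-deployment goal, one plausible path is to export the fine-tuned network to a portable format such as ONNX so it can run on a field laptop or single-board computer without internet access. The sketch below is an assumed export recipe, not the project's actual deployment pipeline; the model stand-in, input shape, and file name are placeholders.

import torch
import torch.nn as nn
import torchvision

# Stand-in for the fine-tuned model (206 assumed classes)
model = torchvision.models.efficientnet_b0()
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 206)
model.eval()

dummy_spec = torch.randn(1, 3, 128, 313)     # assumed input: batch x 3 x n_mels x frames
torch.onnx.export(model, dummy_spec, "soundscape_classifier.onnx",
                  input_names=["spectrogram"], output_names=["species_logits"],
                  dynamic_axes={"spectrogram": {0: "batch"}})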