|Toni Heittola||Baseline system, DCASE Framework, Documentation|
|Aleksandr Diment||Dataset synthesis (Task 2)|
This document describes the baseline system for the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge tasks.
The baseline system is intended to lower the barrier to participating in the DCASE challenges. It provides an entry-level approach that is simple yet close enough to state-of-the-art systems to give reasonable performance on all the tasks. High-end performance is left for the challenge participants to find.
In the baseline, a single low-level approach is shared across the tasks, with application-specific extensions added where needed. The aim is to show how similar the task settings are, and how easily one can move between tasks during system development.
The main baseline system implements the following approach:
- Acoustic features: log mel-band energies extracted in 40 ms windows with a 20 ms hop size.
- Machine learning: a neural network approach using a multilayer perceptron (MLP) with 2 layers of 50 neurons each and 20% dropout between layers.
In addition, a Gaussian mixture model (GMM) based system is included for comparison.
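The figures in the list above translate into concrete sizes once a sample rate and layer dimensions are fixed. The following is a minimal sketch of that arithmetic, not code from the baseline itself: the function names, the 44.1 kHz sample rate, and the input and output sizes (200 and 15) are assumptions chosen only for illustration.

```python
def frame_layout(n_samples, sample_rate, win_s=0.040, hop_s=0.020):
    """Return (window length, hop length, frame count), lengths in samples.

    Defaults follow the 40 ms window and 20 ms hop stated above.
    Only complete windows are counted (no padding at the signal end).
    """
    win = int(round(win_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    if n_samples < win:
        return win, hop, 0
    n_frames = 1 + (n_samples - win) // hop
    return win, hop, n_frames


def mlp_parameter_count(n_inputs, n_outputs, hidden=(50, 50)):
    """Count weights and biases of a fully connected network.

    The default hidden sizes follow the 2 x 50 neuron layout stated above.
    Dropout adds no trainable parameters, so it does not appear here.
    """
    sizes = [n_inputs, *hidden, n_outputs]
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Assumed example: 10 seconds of audio sampled at 44.1 kHz.
win, hop, n_frames = frame_layout(n_samples=441000, sample_rate=44100)

# Assumed example: 200 input features and 15 output classes (hypothetical).
n_params = mlp_parameter_count(n_inputs=200, n_outputs=15)
```

Under these assumptions each analysis window overlaps the next by half its length, and the network stays small enough to train quickly, which fits the entry-level role of the baseline.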
Applications are specialized versions of the main system for specific tasks. The baseline system includes the following applications tailored for the DCASE2017 challenge:
- Task 1, Acoustic scene classification
- Task 2, Detection of rare sound events
The baseline system is built on top of the DCASE Framework, a collection of utility classes designed to ease DCASE-related research. The framework provides tools for system parameter handling, acoustic feature extraction, data storage, acoustic model learning, and system evaluation. In addition to the utility classes, the framework provides application classes that help build research code for sound classification and sound event detection. Application classes handle the full development pipeline: feature extraction, feature normalization, model learning, model testing, and system evaluation. These application classes can be extended easily to accommodate different research problems.
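To make one stage of that pipeline concrete, the sketch below shows feature normalization done the usual way: statistics are accumulated over the training material and then applied unchanged to any other data. It is an illustration under stated assumptions, not the DCASE Framework's API; the class name `FeatureNormalizer` and its methods are hypothetical, and zero-mean, unit-variance scaling is assumed as the normalization scheme.

```python
import math


class FeatureNormalizer:
    """Hypothetical helper: per-dimension zero-mean, unit-variance scaling."""

    def __init__(self, n_dims):
        self.n = 0
        self.s1 = [0.0] * n_dims  # running sum of values
        self.s2 = [0.0] * n_dims  # running sum of squared values

    def accumulate(self, frames):
        """Add the feature frames of one training file to the statistics."""
        for frame in frames:
            self.n += 1
            for i, value in enumerate(frame):
                self.s1[i] += value
                self.s2[i] += value * value

    def finalize(self):
        """Compute mean and standard deviation from the accumulated sums."""
        self.mean = [s / self.n for s in self.s1]
        # A constant dimension has zero deviation; fall back to 1.0 so that
        # normalization never divides by zero.
        self.std = [
            math.sqrt(max(s2 / self.n - m * m, 0.0)) or 1.0
            for s2, m in zip(self.s2, self.mean)
        ]

    def normalize(self, frames):
        """Scale frames with the statistics learned from the training data."""
        return [
            [(v - m) / s for v, m, s in zip(frame, self.mean, self.std)]
            for frame in frames
        ]


# Usage: learn statistics on training features, then apply them to test data.
normalizer = FeatureNormalizer(n_dims=2)
normalizer.accumulate([[1.0, 10.0], [3.0, 10.0]])
normalizer.finalize()
normalized = normalizer.normalize([[3.0, 10.0]])
```

Keeping accumulation and application separate means the statistics are computed only from training data, so the test material does not influence the scaling.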
See the EULA for details.