Audio Research Group / Tampere University of Technology
Authors
Toni Heittola | Baseline system, DCASE Framework, Documentation
Aleksandr Diment | Dataset synthesis (Task 2)
Annamaria Mesaros | Documentation
Introduction
This document describes the baseline system for the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge tasks.
Baseline system
The baseline system is intended to lower the barrier to participating in the DCASE challenges. It provides an entry-level approach that is simple yet relatively close to state-of-the-art systems, so that it gives reasonable performance on all the tasks. High-end performance is left for the challenge participants to find.
The baseline shares a single low-level approach across the tasks and adds application-specific extensions. The aim is to show the parallels between the task settings and how easily one can switch between tasks during system development.
The main baseline system implements the following approach:
- Acoustic features: log mel-band energies extracted in 40 ms windows with a 20 ms hop size.
- Machine learning: a neural network approach using a multilayer perceptron (MLP) with 2 layers of 50 neurons each and 20% dropout between layers.
In addition, a Gaussian mixture model (GMM) based system is included for comparison.
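To make the numbers in the list above concrete, the following minimal sketch computes the frame geometry for 40 ms windows with a 20 ms hop and runs one forward pass through an MLP of the stated shape (two hidden layers of 50 units, 20% dropout). It is not the baseline code. The sampling rate (44.1 kHz), the number of mel bands (40), the number of classes (15), the ReLU and softmax activations, and all function names are assumptions made for illustration, and the weights are random rather than trained. It uses only the standard library so that it runs anywhere; a real system would use a numerical library.

```python
import math
import random

SAMPLE_RATE = 44100     # assumption: sampling rate in Hz
WIN_SECONDS = 0.040     # from the text: 40 ms analysis window
HOP_SECONDS = 0.020     # from the text: 20 ms hop size
N_MELS = 40             # assumption: number of mel bands per frame
N_CLASSES = 15          # assumption: number of output classes
HIDDEN_UNITS = 50       # from the text: 50 neurons per layer
N_HIDDEN_LAYERS = 2     # from the text: 2 layers
DROPOUT_RATE = 0.2      # from the text: 20% dropout


def frame_geometry(n_samples, sample_rate=SAMPLE_RATE):
    """Return (window length, hop length, number of full frames) in samples."""
    win = int(round(WIN_SECONDS * sample_rate))
    hop = int(round(HOP_SECONDS * sample_rate))
    n_frames = 0 if n_samples < win else 1 + (n_samples - win) // hop
    return win, hop, n_frames


def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training):
    """Inverted dropout: zero units at random during training only."""
    if not training:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(x):
    peak = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_mlp(n_inputs, rng):
    """Layer sizes: input -> 50 -> 50 -> classes."""
    sizes = [n_inputs] + [HIDDEN_UNITS] * N_HIDDEN_LAYERS + [N_CLASSES]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def forward(x, layers, rng, training=False):
    """Hidden layers with ReLU and dropout, then a softmax output layer."""
    for layer in layers[:-1]:
        x = dropout(relu(dense(x, layer)), DROPOUT_RATE, rng, training)
    return softmax(dense(x, layers[-1]))


rng = random.Random(0)
win, hop, n_frames = frame_geometry(10 * SAMPLE_RATE)  # a 10-second clip
mlp = build_mlp(N_MELS, rng)
# Stand-in for one frame of log mel-band energies (random values, not real audio).
frame = [rng.uniform(-1.0, 0.0) for _ in range(N_MELS)]
probs = forward(frame, mlp, rng)
print(win, hop, n_frames, len(probs))
```

For a 10-second clip at the assumed sampling rate, this gives 1764-sample windows, an 882-sample hop, and 499 full frames, each of which would be mapped to a vector of 15 class probabilities.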
The system is developed for Python 2.7 and Python 3.6, and it can be used on Linux, Windows, and Mac platforms.
More about the baseline system.
Applications
Applications are specialized versions of the main system for specific tasks. The baseline system includes the following applications tailored for the DCASE2017 challenge:
Task 1, Acoustic scene classification
Task 2, Detection of rare sound events
Task 3, Sound event detection in real life audio
More about applications.
Getting started
- Clone the repository from GitHub or download the latest release.
- Install the requirements (see the installation details):
    pip install -r requirements.txt
- Run the applications with default settings (see the usage details):
    python applications/task1.py
    python applications/task2.py
    python applications/task3.py
DCASE Framework
The baseline system is built on top of the DCASE Framework, a collection of utility classes designed to ease DCASE-related research. The framework provides tools for system parameter handling, acoustic feature extraction, data storage, acoustic model learning, and system evaluation. In addition to the utility classes, the framework provides application classes that help build research code for sound classification and sound event detection applications. Application classes handle the full development pipeline: feature extraction, feature normalization, model learning, model testing, and system evaluation. These application classes can be extended easily to accommodate different research problems.
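The pipeline idea can be illustrated with a small, self-contained sketch. The classes below are hypothetical and are not the DCASE Framework API: PipelineApp only runs the five stages named above in order, and ToyApp shows how a subclass might fill in the stages for a concrete problem. The toy data, the two hand-made features, the nearest-class-mean model, and every method name are assumptions chosen to keep the example runnable.

```python
import math


class PipelineApp:
    """Hypothetical base class that runs the pipeline stages in a fixed order."""

    def run(self, train_items, test_items):
        # 1. Feature extraction
        train = [(self.extract_features(x), y) for x, y in train_items]
        test = [(self.extract_features(x), y) for x, y in test_items]
        # 2. Feature normalization, using training statistics only
        normalize = self.fit_normalizer([f for f, _ in train])
        train = [(normalize(f), y) for f, y in train]
        test = [(normalize(f), y) for f, y in test]
        # 3. Model learning
        model = self.learn(train)
        # 4. Model testing
        predictions = [self.test(model, f) for f, _ in test]
        # 5. System evaluation
        return self.evaluate(predictions, [y for _, y in test])

    def extract_features(self, item):
        raise NotImplementedError

    def learn(self, train):
        raise NotImplementedError

    def test(self, model, feature):
        raise NotImplementedError

    def fit_normalizer(self, features):
        """Return a function that scales features to zero mean, unit variance."""
        n, dims = len(features), len(features[0])
        means = [sum(f[d] for f in features) / n for d in range(dims)]
        stds = [
            max(math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / n), 1e-8)
            for d in range(dims)
        ]
        return lambda f: [(v - m) / s for v, m, s in zip(f, means, stds)]

    def evaluate(self, predictions, references):
        """Accuracy: fraction of predictions that match the reference labels."""
        correct = sum(p == r for p, r in zip(predictions, references))
        return correct / len(references)


class ToyApp(PipelineApp):
    """Hypothetical application: classify toy 'signals' as quiet or loud."""

    def extract_features(self, item):
        magnitudes = [abs(v) for v in item]
        return [sum(magnitudes) / len(magnitudes), max(magnitudes)]

    def learn(self, train):
        # The "model" is the mean feature vector of each class.
        by_class = {}
        for feature, label in train:
            by_class.setdefault(label, []).append(feature)
        return {
            label: [sum(col) / len(col) for col in zip(*feats)]
            for label, feats in by_class.items()
        }

    def test(self, model, feature):
        # Predict the class whose mean is closest in squared Euclidean distance.
        return min(
            model,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(feature, model[label])),
        )


train_items = [
    ([1, -1, 2, -2], "quiet"), ([2, -1, 1, -2], "quiet"),
    ([20, -20, 30, -30], "loud"), ([30, -20, 20, -30], "loud"),
]
test_items = [([1, -2, 2, -1], "quiet"), ([25, -25, 20, -20], "loud")]

accuracy = ToyApp().run(train_items, test_items)
print(accuracy)
```

The point of the design is that the stage order lives in one place, so adapting the pipeline to a different research problem means overriding individual stages rather than rewriting the whole flow.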
More about the DCASE Framework.
License
The DCASE Framework and the baseline system are released for academic research use only, under the EULA from Tampere University of Technology.
See the EULA for details.