
Audio Research Group / Tampere University of Technology

Authors

Toni Heittola: Baseline system, DCASE Framework, Documentation
Aleksandr Diment: Dataset synthesis (Task 2)
Annamaria Mesaros: Documentation

Introduction

This document describes the baseline system for the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE2017) challenge tasks.

Baseline system

The baseline system is intended to lower the barrier to participating in the DCASE challenges. It provides a simple entry-level approach that is nonetheless relatively close to state-of-the-art systems and gives reasonable performance on all tasks. High-end performance is left for the challenge participants to find.

In the baseline, a single low-level approach is shared across the tasks through application-specific extensions. The aim is to show the parallelism in the task settings and how easily one can move between tasks during system development.

The main baseline system implements the following approach:

  • Acoustic features: log mel-band energies extracted in 40 ms windows with a 20 ms hop size.
  • Machine learning: a neural network approach using a multilayer perceptron (MLP) type of network (2 layers with 50 neurons each, and 20% dropout between layers).
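
To make the numbers in the list above concrete, here is a minimal sketch in Python with NumPy. It is not the baseline's actual code. The 44.1 kHz sample rate, the 40 mel bands, the 15 output classes, the ReLU activations, the softmax output, and the function names `frame_params`, `init_mlp`, and `forward` are all assumptions made for illustration; only the 40 ms window, the 20 ms hop, the two 50-neuron layers, and the 20% dropout come from the description above.

```python
import numpy as np


def frame_params(sample_rate_hz, win_ms=40, hop_ms=20):
    """Convert the window and hop durations (ms) to sample counts."""
    win = int(round(sample_rate_hz * win_ms / 1000))
    hop = int(round(sample_rate_hz * hop_ms / 1000))
    return win, hop


def init_mlp(n_in, n_out, n_hidden=50, n_layers=2, seed=0):
    """Create random weights for an MLP with two 50-neuron hidden layers."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [n_hidden] * n_layers + [n_out]
    return [
        (rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x, dropout=0.2, rng=None):
    """Forward pass; dropout is applied only when an rng is given (training)."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(0.0, h @ w + b)  # ReLU activation (assumed)
        if rng is not None:
            # Inverted dropout: drop 20% of units, rescale the rest.
            keep = rng.random(h.shape) >= dropout
            h = h * keep / (1.0 - dropout)
    w, b = params[-1]
    logits = h @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)  # softmax (assumed)


# Assumed 44.1 kHz audio: 40 ms and 20 ms expressed in samples.
win, hop = frame_params(44100)

# Assumed 40 mel bands per frame and 15 target classes.
params = init_mlp(n_in=40, n_out=15)
frames = np.zeros((3, 40))        # three placeholder feature frames
probs = forward(params, frames)   # inference mode, no dropout
```

Passing a generator, for example `forward(params, frames, rng=np.random.default_rng(1))`, switches the sketch to training mode with dropout enabled.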

In addition, a Gaussian mixture model (GMM) based system is included for comparison.

The system is developed for Python 2.7 and Python 3.6, and it can be used on Linux, Windows, and Mac platforms.

More about the baseline system.

Applications

Applications are specialized versions of the main system for specific tasks. The baseline system includes the following applications tailored for the DCASE2017 challenge:

task1 Task 1, Acoustic scene classification

task2 Task 2, Detection of rare sound events

task3 Task 3, Sound event detection in real life audio

More about applications.

Getting started

  1. Clone the repository from GitHub or download the latest release.
  2. Install the requirements with the command pip install -r requirements.txt (see installation details).
  3. Run the applications with default settings: python applications/task1.py, python applications/task2.py, and python applications/task3.py (see usage details).
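
If it is convenient to launch all three applications in one go, step 3 could be scripted roughly as follows. This is only a sketch: it assumes the script is run from the repository root, and the helper names `build_commands` and `run_all` are hypothetical, not part of the baseline system.

```python
import subprocess
import sys

# Script paths as given in step 3, relative to the repository root (assumed).
APPLICATIONS = [
    "applications/task1.py",
    "applications/task2.py",
    "applications/task3.py",
]


def build_commands(python=sys.executable):
    """Return one command line per application, using default settings."""
    return [[python, script] for script in APPLICATIONS]


def run_all(dry_run=True):
    """Run the applications in order; with dry_run, only print the commands."""
    for cmd in build_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first application that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The dry-run default only prints the commands; call `run_all(dry_run=False)` to execute them.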

DCASE Framework

The baseline system is built on top of the DCASE Framework, a collection of utility classes designed to ease the DCASE-related research process. The framework provides tools for system parameter handling, acoustic feature extraction, data storage, acoustic model learning, and system evaluation. In addition to the utility classes, the framework provides application classes that help build research code specialized for sound classification and sound event detection applications. Application classes handle the full development pipeline: feature extraction, feature normalization, model learning, model testing, and system evaluation. These application classes can be extended easily to accommodate different research problems.
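
The idea of an application class that runs a fixed pipeline and can be extended stage by stage can be illustrated with a small sketch. The classes `PipelineSketch` and `CustomFeatures` below are hypothetical stand-ins and do not reflect the real DCASE Framework API; only the order of the five stages is taken from the description above.

```python
class PipelineSketch:
    """Hypothetical stand-in for an application class (NOT the framework API)."""

    # Stage order as listed in the framework description.
    STAGES = (
        "feature_extraction",
        "feature_normalization",
        "model_learning",
        "model_testing",
        "system_evaluation",
    )

    def __init__(self):
        self.log = []  # records which stages ran, in order

    def run(self):
        """Run every stage in order and return the execution log."""
        for stage in self.STAGES:
            getattr(self, stage)()
        return self.log

    def feature_extraction(self):
        self.log.append("feature_extraction")

    def feature_normalization(self):
        self.log.append("feature_normalization")

    def model_learning(self):
        self.log.append("model_learning")

    def model_testing(self):
        self.log.append("model_testing")

    def system_evaluation(self):
        self.log.append("system_evaluation")


class CustomFeatures(PipelineSketch):
    """Extension that overrides a single stage and inherits the rest."""

    def feature_extraction(self):
        self.log.append("feature_extraction (custom)")


default_log = PipelineSketch().run()
custom_log = CustomFeatures().run()
```

In this sketch a subclass replaces only the stage it cares about, which is one common way to realize the kind of extensibility described above.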

More about the DCASE Framework.

License

The DCASE Framework and the baseline system are released for academic research only, under the EULA from Tampere University of Technology.

See the EULA for details.
