Parameterization¶

The baseline system supports multi-level parameter overwriting, to enable flexible switching between different system setups. Parameter changes are tracked with hashes calculated from parameter sections. These parameter hashes are used in the storage file paths when saving data (features, model, or results). By using this approach, the system will compute features, models and results only once for the specific parameter set, and after that it will reuse this precomputed data.

Parameter overwriting¶

Parameters are stored in YAML-formatted files, which are handled internally in the system as Dictionaries. Default parameters is the set of all possible parameters recognized by the system. These default parameters are defined in applications/parameters/task?.defaults.yaml. Parameter set is a smaller set of parameters used to overwrite values of the default parameters. This can be used to select methods for processing, or tune parameters.

Parameter file¶

Parameters files are YAML-formatted files, containing the following three blocks:

active_set, default parameter set id
sets, list of dictionaries
defaults dictionary containing default parameters which are overwritten by the sets[active_set]

At the top level of the parameter dictionary there are parameter sections; depending on the name of the section, the parameters inside it are processed sometimes differently (See below more information.)

Example file:

active_set: SET1

sets:
    - set_id: SET1
      flow:
        task1: false
        task2: false
        task3: true
      section1:
        enable: true
        field1: 11025
      section2
        enable: true
        field2: 44100
    - set_id: SET2
      section1:
        enable: false
      section2
        enable: false

defaults:
    flow:
        task1: true
        task2: true
        task3: true

    section1:
        enable: false
        field1: 44100
        field2: 22050

    section2:
        enable: true
        field1: 44100
        field2: 22050

Parameter hash¶

Parameter hashes are MD5 hashes calculated for each parameter section. In order to make these hashes more robust, some pre-processing is applied before hash calculation:

If section contains field enable with value False, all fields inside this section are excluded from the parameter hash calculation. This will avoid recalculating the hash if the section is not used but some of these unused parameters are changed.
If section contains fields with value False, these fields are excluded from the parameter hash calculation. This will enable to add new flag parameters without changing the hash. Define the new flag such that the previous behaviour is happening when this field is set to false.
All non_hashable_fields fields are excluded from the parameter hash calculation. These fields are set when ParameterContainer is constructed, and they usually are fields used to print various values to the console. These fields do not change the system output to be saved onto disk, and hence they are excluded from hash.

Parameter sections¶

The functionality of the parameters depending on the section name.

Flow¶

The processing blocks of the system can be controlled through this section. Usually all of them can be kept on.

Example section:

flow:
    initialize: true
    extract_features: true
    feature_normalizer: true
    train_system: true
    test_system: true
    evaluate_system: true

Field name	Value type	Description
extract_features	bool	Initialize the system
feature_normalizer	bool	Extract acoustic features for all data at once.
train_system	bool	Train the system with training material
test_system	bool	Test the system with testing material
evaluate_system	bool	Evaluate correctness of the system outputs produced in the `test_system` block.

General¶

This section contains general settings, mostly related to printing and logging.

Example section:

general:
    overwrite: false

    challenge_submission_mode: false

    print_system_progress: true
    log_system_parameters: false
    log_system_progress: false

Field name	Value type	Description
overwrite	bool	Overwrite all pre-calculated data. Enable this when changing system implementation.
challenge_submission_mode	bool	Save results to path location defined in `path->challenge_results`. Use this mode when preparing a submission to the challenge.
print_system_progress	bool	Print the system progress into console using carriage return.
use_ascii_progress_bar	bool	Force ASCII progres bars, use this if your console does not support UTF-8 character set. Under Windows this is set automatically True.
log_system_parameters	bool	Save system parameters into system log file.
log_system_progress	bool	Save system progress into system log file.
scene_handling	string {scene-dependent \| scene-independent}	Scene handling type, can be used in sound event detection application to control how audio material from multiple acoustic scene classes are handled.
active_scenes	list	List of active scene classes in the processing. This can be used to speed up processing when debugging.
event_handling	string {event-dependent \| event-independent}	Event handling type, can be used in binary sound event detection application to control how audio material from multiple event classes are handled.
active_events	list	List of active event classes in the processing. This can be used to speed up processing when debugging.

Path¶

This section defines all paths for the system. Paths can be defined either as absolute or relative to the application code file. Relative paths are converted into absolute before they are used.

Example section:

path:
    data: data/

    system_base: system/task1/
    feature_extractor: feature_extractor/
    feature_normalizer: feature_normalizer/
    learner: learner/
    recognizer: recognizer/
    evaluator: evaluator/

    recognizer_challenge_output: challenge_submission/task1/
    logs: logs/

Field name	Value type	Description
data	string	Path to store all audio datasets.
system_base	string	Base path for the system to store all data.
feature_extractor	string	Directory name under system_base for extracted features
feature_normalizer	string	Directory name under system_base for feature normalization values
learner	string	Directory name under system_base for learned acoustic models
recognizer	string	Directory name under system_base for predicted system outputs
evaluator	string	Directory name under system_base for evaluated metric values
recognizer_challenge_output	string	Path to store system output in challenge mode.
logs	string	Path to save system logs.

Dataset¶

This section defines the dataset use in development mode and in challenge mode.

Example section:

dataset:
    method: development

dataset_method_parameters:
    development:
        name: TUTAcousticScenes_2017_DevelopmentSet
        fold_list: [1, 2, 3, 4]
        evaluation_mode: folds

    challenge_train:
        name: TUTAcousticScenes_2017_DevelopmentSet
        evaluation_mode: full

    challenge_test:
        name: TUTAcousticScenes_2017_EvaluationSet
        evaluation_mode: full

dataset->method is used to select the active dataset.

Field name	Value type	Description
dataset->method	string	Active dataset, used to select parameter set from dataset_method_parameters
dataset_method_parameters->method->name	string	Dataset class name, use `./task1.py -show_datasets` to see valid ones
dataset_method_parameters->method->fold_list	list of ints	List of active folds. If nothing set, all available folds are used. Use this to run the system on a subset of cross-validation folds.
dataset_method_parameters->method->evaluation_mode	string {full\|folds}	System evalution mode. With `folds`, cross-evaluation folds are used. With `full` all the data is used for training and testing.

Feature extractor¶

This section defines the general feature extraction parameters and extractor specific parameters. feature_stacker->stacking_recipe is used to select active feature extractors.

Example section:

feature_extractor:
    fs: 44100                               # Sampling frequency
    win_length_seconds: 0.04                # Window length
    hop_length_seconds: 0.02                # Hop length

Field name	Value type	Description
fs	int	Sampling frequency. If different sampling frequency is encountered during audio file loading, resampling is used.
win_length_seconds	float	Analysis window length in seconds.
hop_length_seconds	float	Analysis window hop length in seconds.

Example section:

feature_extractor_method_parameters:
    mel:                                    # Mel band energy
        mono: true                          # [true, false]
        window: hamming_asymmetric          # [hann_asymmetric, hamming_asymmetric]
        spectrogram_type: magnitude         # [magnitude, power]
        n_mels: 40                          # Number of MEL bands used
        normalize_mel_bands: false          # [true, false]
        n_fft: 2048                         # FFT length
        fmin: 0                             # Minimum frequency when constructing MEL bands
        fmax: 22050                         # Maximum frequency when constructing MEL band
        htk: false                          # Switch for HTK-styled MEL-frequency equation
        log: true                           # Logarithmic

    mfcc:                                   # Mel-frequency cepstral coefficients
        mono: true                          # [true, false]
        window: hamming_asymmetric          # [hann_asymmetric, hamming_asymmetric]
        spectrogram_type: magnitude         # [magnitude, power]
        n_mfcc: 20                          # Number of MFCC coefficients
        n_mels: 40                          # Number of MEL bands used
        n_fft: 2048                         # FFT length
        fmin: 0                             # Minimum frequency when constructing MEL bands
        fmax: 22050                         # Maximum frequency when constructing MEL band
        htk: false                          # Switch for HTK-styled MEL-frequency equation

    mfcc_delta:                             # MFCC delta coefficients
        width: 9                            #

    mfcc_acceleration:                      # MFCC acceleration coefficients
        width: 9                            #

Field name	Value type	Description
feature_extractor_method_parameters->mel
mono	bool	If true, multi-channel audio input is averaged into single channel.
window	string {hann_asymmetric \| hamming_asymmetric}	Analysis window function.
spectrogram_type	string {magnitude \| power}	Spectrogram type.
n_mels	int	Number of mel bands used.
normalize_mel_bands	bool	Normalize mel bands.
n_fft	int	FFT length.
fmin	int	Minimum frequency when constructing mel bands
fmax	int	Maximum frequency when constructing mel band
htk	bool	Switch for HTK-style mel-frequency equation
log	bool	Logarithmic
feature_extractor_method_parameters->mfcc
mono	bool	If true, multi-channel audio input is averaged into single channel.
window	string {hann_asymmetric \| hamming_asymmetric}	Analysis window function.
spectrogram_type	string {magnitude \| power}	Spectrogram type.
n_mfcc	int	Number of mfcc coefficients. Zeroth coefficient is always returned.
n_mels	int	Number of mel bands used.
n_fft	int	FFT length.
fmin	int	Minimum frequency when constructing mel bands
fmax	int	Maximum frequency when constructing mel band
htk	bool	Switch for HTK-style mel-frequency equation
feature_extractor_method_parameters->mfcc_delta
width	int	Delta window length.
feature_extractor_method_parameters->mfcc_acceleration
width	int	Delta-delta window length.

Feature stacker¶

This section defines how the extracted features are combined to form the feature vector (and feature matrix). Stacking recipe is ; limited string with stacking recipe item in specific format:

[extractor (string)], default channel 0 and full vector
[extractor (string)]=[start index (int)]-[end index (int)], default channel 0 and vector [start:end]
[extractor (string)]=[channel (int)]:[start index (int)]-[end index (int)], specified channel and vector [start:end]
[extractor (string)]=1,2,3,4,5, default channel 0 and vector [1,2,3,4,5]
[extractor (string)]=0, specified channel and full vector

For example to get feature vector with mfcc and omitting zeroth coefficient use:

stacking_recipe: mfcc=1-19;mfcc_delta;mfcc_acceleration

For example to get features from both channels (make sure feature_extractor_method_parameters->mel->mono field is set to false:

stacking_recipe: mel=0;mel=1

Example section:

feature_stacker:
    stacking_recipe: mel

Field name	Value type	Description
stacking_recipe	string	Stacking recipe to form feature vector.
feature_hop	int {1}	Debugging parameter to strip data by taking every Nth feature vector. Use this only for classification tasks, as it will break synchronization of the meta data.

Feature normalizer¶

This section defines the feature normalization.

Example section:

feature_normalizer:
    enable: true
    type: global

Field name	Value type	Description
enable	bool	Switch to enable feature normalization.
type	string {global}	Normalization type. Currently only global normalization supported.

Feature aggregator¶

This section defines the feature aggregation. The feature aggregator can be used to process the feature matrix inside the processing window. It can be used for example to collapse features within the window by calculating mean and std per feature item, or to flatten the matrix into single longer feature vector.

Supported processing methods:

flatten
mean
std
cov
kurtosis
skew

The processing methods can combined with ;.

For example, to calculate mean and std:

aggregation_recipe: mean;std

Example section:

feature_aggregator:
    enable: false
    aggregation_recipe: flatten
    win_length_seconds: 0.1
    hop_length_seconds: 0.02

Field name	Value type	Description
enable	bool	Switch to enable feature aggregation.
aggregation_recipe	string	Aggregation recipe. See formatting above.
win_length_seconds	float	Aggregation processing window length.
hop_length_seconds	float	Aggregation processing window hop length.

Learner¶

This section defines the learner stage of the system.

Example section:

learner:
    method: mlp

    audio_error_handling: false
    show_model_information: false

Field name	Value type	Description
method	string	Learner method name. Used to select parameters from `learner_method_parameters`.
audio_error_handling	bool	Switch to skip frames annotated to contain errors. Only used in Task1 application
show_model_information	bool	Switch to show extra information about the learned model. Used only with keras learners.
file_hop	int {1}	Debugging parameter to strip data by taking every Nth file when collecting training data. Use this for debugging when dealing with large datasets.

MLP

Example section for MLP based learner:

learner_method_parameters:
    mlp:
        seed: 1

        keras:
            backend: theano
            backend_parameters:
                floatX: float32
                device: cpu
                fastmath: false

        validation:
            enable: true
            setup_source: generated_scene_balanced
            validation_amount: 0.10
            seed: 1

        training:
            nb_epoch: 100
            batch_size: 256
            shuffle: true
            callbacks:
                - type: EarlyStopping
                  parameters:
                      monitor: val_categorical_accuracy
                      min_delta: 0.001
                      patience: 10
                      verbose: 0
                      mode: max
        model:
            config:
                - class_name: Dense
                  config:
                    units: 50
                    kernel_initializer: uniform
                    activation: relu

                - class_name: Dropout
                  config:
                    rate: 0.2

                - class_name: Dense
                  config:
                    units: 50
                    kernel_initializer: uniform
                    activation: relu

                - class_name: Dropout
                  config:
                    rate: 0.2

                - class_name: Dense
                  config:
                    units: CLASS_COUNT
                    kernel_initializer: uniform
                    activation: softmax

            loss: categorical_crossentropy

            optimizer:
                type: Adam

            metrics:
                - categorical_accuracy

This learner is using Keras neural network implementation. See documentation.

Field name	Value type	Description
seed	int	Randomization seed. Use this to make learner behaviour deterministic.
mlp->keras
backend	string {theano \| tensorflow}	Keras backend selector.
mlp->keras->backend_parameters
device	string {cpu \| gpu}	Device selector. `cpu` is best option to produce deterministic results. All baseline results are calculated in cpu mode.
floatX	string	Float number type. Usually float32 used since that is compatible with GPUs. Valid only for `theano` backend.
fastmath	bool	If true, will enable fastmath mode when CUDA code is compiled. Div and sqrt are faster, but precision is lower. This can cause numerical issues some in cases. Valid only for `theano` backend and GPU mode.
optimizer	string {fast_run \| merge \| fast_compile None}	Compilation mode for theano functions.
openmp	bool	If true, Theano will use multiple cores, see more.
threads	int	Number of threads used. Use one to disable threading.
CNR	bool	Conditional numerical reproducibility for MKL BLAS. When set to True, compatible mode used. See more.
mlp->validation
enable	bool	If true, validation set is used during the training procedure.
setup_source	string	Validation setup source. Valid sources: `generated_scene_balanced`, balanced based on scene labels, used for Task1. `generated_event_file_balanced`, balanced based on events, used for Task2. `generated_scene_location_event_balanced`, balanced based on scene, location and events. Used for Task3. `dataset`, validation set specified by dataset in use.
validation_amount	float	Percentage of training data selected for validation. Use value between 0.0-1.0. Valid only if validation setup is generated.
seed	int	Validation set generation seed. If Null, learner seed will be used.
mlp->training
epochs	int	Number of epochs.
batch_size	int	Batch size.
shuffle	bool	If true, training samples are shuffled at each epoch.
mlp->training->callbacks, list of parameter sets in following format. Callback called during the model training.
type	string	Callback name, use standard keras callbacks callbacks or ones defined by dcase_framework (Plotter, Stopper, Stasher).
parameters	dict	Place inside this all parameters for the callback.
mlp->training->model->config, list of dicts. Defining network topology.
class_name	string	Layer name. Use standard keras core layers, convolutional layers, pooling layers, recurrent layers, or normalization layers.
config	dict	Place inside this all parameters for the layer. See Keras documentation. Magic parameter values: `FEATURE_VECTOR_LENGTH`, feature vector length. This automatically inserted for input layer. `CLASS_COUNT`, number of classes.
input_shape	list of ints	List of integers which is converted into tuple before giving to Keras layer.
mlp->training->model
loss	string	Keras loss function name. See Keras documentation.
metrics	list of strings	Keras metric function name. See Keras documentation.
mlp->training->model->optimizer
type	string	Keras optimizer name. See Keras documentation.
parameters	dict	Place inside this all parameters for the optimizer.

Keras sequential

Example section for Keras sequential learner:

learner_method_parameters:
  keras_seq:
    seed: 0
    keras:
      backend: theano
      backend_parameters:
        floatX: float32
        device: gpu
        fastmath: true
        optimizer: fast_run
        openmp: true
        threads: 4
        CNR: true

    input_sequencer:
      enable: false

    temporal_shifter:
      enable: false

    generator:
      enable: false
      method: feature_generator
      max_q_size: 1
      workers: 1
      parameters:
        buffer_size: 10

    validation:
      enable: true
      setup_source: generated_event_file_balanced
      validation_amount: 0.10
      seed: 123

    training:
      epochs: 200
      batch_size: 256
      shuffle: true

      epoch_processing:
        enable: true

        external_metrics:
          enable: true
          evaluation_interval: 1
          metrics:
            - name: sed_eval.event_based.overall.error_rate.error_rate
              label: ER
              parameters:
                evaluate_onset: true
                evaluate_offset: false
                t_collar: 0.5
                percentage_of_length: 0.5
            - name: sed_eval.event_based.overall.f_measure.f_measure
              label: F1
              parameters:
                evaluate_onset: true
                evaluate_offset: false
                t_collar: 0.5
                percentage_of_length: 0.5

      callbacks:
        - type: Plotter
          parameters:
            interactive: true
            save: false
            output_format: pdf
            focus_span: 10
            plotting_rate: 5

        - type: Stopper
          parameters:
            monitor: sed_eval.event_based.overall.error_rate.error_rate
            initial_delay: 20
            min_delta: 0.01
            patience: 10

        - type: Stasher
          parameters:
            monitor: sed_eval.event_based.overall.error_rate.error_rate
            initial_delay: 20

    model:
      constants:
        LAYER_SIZE: 50
        LAYER_INIT: uniform
        LAYER_ACTIVATION: relu
        DROPOUT: 0.2

      config:
        - class_name: Dense
          config:
            units: LAYER_SIZE
            kernel_initializer: LAYER_INIT
            activation: LAYER_ACTIVATION

        - class_name: Dropout
          config:
            rate: DROPOUT

        - class_name: Dense
          config:
            units: LAYER_SIZE
            kernel_initializer: LAYER_INIT
            activation: LAYER_ACTIVATION

        - class_name: Dropout
          config:
            rate: DROPOUT

        - class_name: Dense
          config:
            units: CLASS_COUNT
            kernel_initializer: LAYER_INIT
            activation: sigmoid

      loss: binary_crossentropy

      optimizer:
        type: Adam

      metrics:
        - binary_accuracy

This learner is using Keras neural network implementation. See documentation.

Field name	Value type	Description
seed	int	Randomization seed. Use this to make learner behaviour deterministic.
keras_seq->keras
backend	string {theano \| tensorflow}	Keras backend selector.
keras_seq->keras->backend_parameters
device	string {cpu \| gpu}	Device selector. `cpu` is best option to produce deterministic results. All baseline results are calculated in cpu mode.
floatX	string	Float number type. Usually float32 used since that is compatible with GPUs. Valid only for `theano` backend.
fastmath	bool	If true, will enable fastmath mode when CUDA code is compiled. Div and sqrt are faster, but precision is lower. This can cause numerical issues some in cases. Valid only for `theano` backend and GPU mode.
optimizer	string {fast_run \| merge \| fast_compile None}	Compilation mode for theano functions.
openmp	bool	If true, Theano will use multiple cores, see more.
threads	int	Number of threads used. Use one to disable threading.
CNR	bool	Conditional numerical reproducibility for MKL BLAS. When set to True, compatible mode used. See more.
keras_seq->input_sequencer, transforming input data into sequences
enable	bool	If true, input sequencing is used during the training procedure.
frames	int	Frames per sequence
hop	int	Hop (in frames) between sequences
padding	bool	Replicating data when sequence is not full
keras_seq->temporal_shifter, shift data on temporal axis for each epoch
enable	bool	If true, temporal data shifting per epoch is applied during the training procedure.
border	string {roll \| push }	How border is handled: `roll`, data matrix is rolled (data moved from end to the begin) `push`, unused material is not used.
step	int	How much sequence start is shifted per epoch
max	int	Maximum shift, after which shift is returned to zero.
keras_seq->generator, data generator to read training data directly from disk during the training procedure.
enable	bool	If true, data generator is used to provide training material.
method	string {feature}	Generator method: - feature, feature based generator
max_q_size	int	Maximum size for the generator queue
workers	int	Maximum number of generator processes to start up.
keras_seq->generator->parameters
buffer_size	int	Size of internal buffer. How many items (files) will be stored in the memory.
keras_seq->validation
enable	bool	If true, validation set is used during the training procedure.
setup_source	string	Validation setup source. Valid sources: `generated_scene_balanced`, balanced based on scene labels, used for Task1. `generated_event_file_balanced`, balanced based on events, used for Task2. `generated_scene_location_event_balanced`, balanced based on scene, location and events. Used for Task3.
validation_amount	float	Percentage of training data selected for validation. Use value between 0.0-1.0.
seed	int	Validation set generation seed. If Null, learner seed will be used.
keras_seq->training
epochs	int	Number of epochs.
batch_size	int	Batch size.
shuffle	bool	If true, training samples are shuffled at each epoch.
keras_seq->training->epoch_processing, epoch by epoch processing outside Keras.
enable	bool	If true, training is done in smaller segments to allow evaluation of external metrics for validation data.
keras_seq->training->epoch_processing->external_metrics
enable	bool	If true, external metrics are evaluated.
evaluation_interval	int	Evaluation is done every Nth epoch.
keras_seq->training->epoch_processing->external_metrics->metrics, list of dicts. Defining external metrics.
label	string	Metric label, use unique label.
evaluator	string	Evaluaor names: `sed_eval.scene`, acoustic scene classification metrics `sed_eval.segment_based`, segment based sound event detection metrics `sed_eval.event_based`, event based sound event detection metrics
name	string	Metric name, dict path to fetch metric value: `overall.accuracy`, accuracy `overall.f_measure.f_measure`, F1 `overall.error_rate.error_rate`, ER
parameters	dict	Parameters for the evaluator. See sed_eval documentation.
keras_seq->training->callbacks, list of parameter sets in following format. Callback called during the model training.
type	string	Callback name, use standard keras callbacks callbacks or ones defined by dcase_framework (Plotter, Stopper, Stasher).
parameters	dict	Place inside this all parameters for the callback.
keras_seq->training->model->constants, Defined constant to be used in while defining network topology.
keras_seq->training->model->config, list of dicts. Defining network topology.
class_name	string	Layer name. Use standard keras core layers, convolutional layers, pooling layers, recurrent layers, or normalization layers.
config	dict	Place inside this all parameters for the layer. See Keras documentation. Magic parameter values: `FEATURE_VECTOR_LENGTH`, feature vector length. This automatically inserted for input layer. `CLASS_COUNT`, number of classes. `INPUT_SEQUENCE_LENGTH`, input sequence length Addition constants defined in keras_seq->training->model->constants can be used.
input_shape	list of ints	List of integers which is converted into tuple before giving to Keras layer.
kernel_size	list of ints	List of integers which is converted into tuple before giving to Keras layer.
pool_size	list of ints	List of integers which is converted into tuple before giving to Keras layer.
dims	list of ints	List of integers which is converted into tuple before giving to Keras layer.
target_shape	list of ints	List of integers which is converted into tuple before giving to Keras layer.
keras_seq->training->model
loss	string	Keras loss function name. See Keras documentation.
metrics	list of strings	Keras metric function name. See Keras documentation.
keras_seq->training->model->optimizer
type	string	Keras optimizer name. See Keras documentation.
parameters	dict	Place inside this all parameters for the optimizer.

GMM

Example section for GMM based learner:

learner_method_parameters:
    gmm:
        n_components: 1
        covariance_type: diag
        tol: 0.001
        reg_covar: 0
        max_iter: 40
        n_init: 1
        init_params: kmeans
        random_state: 0

This learner is using sklearn.mixture.GaussianMixture implementation. See documentation.

Field name	Value type	Description
n_components	int	The number of mixture components.
covariance_type	string { full \| tied \| diag \| spherical }	Covariance type.
tol	float	Covariance threshold.
reg_covar	float	Non-negative regularization added to the diagonal of covariance.
max_iter	int	The number of EM iterations.
n_init	int	The number of initializations.
init_params	string { kmeans \| random }	The method used to initialize model weights.
random_state	int	Random seed.

Recognizer¶

This section defines the recognizer stage of the system.

Example section for Task 1:

recognizer:
    enable: true
    audio_error_handling: false

    frame_accumulation:
        enable: false
        type: sum

    frame_binarization:
        enable: false
        type: frame_max
        threshold: null

    decision_making:
        enable: true
        type: majority_vote

Field name	Value type	Description
enable	bool	Section selector
audio_error_handling	bool	Switch to skip frames annotated to contain errors. Only used in Task1 application. This used to exclude temporary microphone failure and radio signal interferences from mobile phones.
frame_accumulation, Defining frame probability accumulation.
enable	bool	Enable frame probability accumulation.
type	string {sum \| prod \| mean }	Operator type used to accumulate.
frame_binarization, Defining frame probability binarization.
enable	bool	Enable frame probability binarization.
type	string {frame_max \| global_threshold }	Type of binarization: `frame_max`, each frame is treated individually, max of each frame is set to one, other zero. `global_threshold`, global threshold, all values over the threshold are set to one.
threshold	float	Threshold value. Set to null if not used.
decision_making, Defining final decision making.
enable	bool	Enable final decision making.
type	string {maximum \| majority_vote }	Type of decision: `maximum`, maximum probability is choosen. `majority_vote`, majority vote among binarized frame decisions.

Example section for Task 2 and Task 3:

recognizer:
    enable: true

    frame_accumulation:
        enable: false
        type: sliding_sum
        window_length_seconds: 1.0

    frame_binarization:
        enable: true
        type: global_threshold
        threshold: 0.5

    event_activity_processing:
        enable: true
        type: median_filtering
        window_length_seconds: 0.54

    event_post_processing:
        enable: true
        minimum_event_length_seconds: 0.1
        minimum_event_gap_second: 0.1

Field name	Value type	Description
enable	bool	Section selector
frame_accumulation, Defining frame probability accumulation.
enable	bool	Enable frame probability accumulation.
type	string {sliding_sum \| sliding_mean \| sliding_median }	Operator type used to accumulate.
window_length_seconds	float	Window length in seconds for sliding accumulation.
frame_binarization, Defining frame probability binarization.
enable	bool	Enable frame probability binarization.
type	string {frame_max \| global_threshold }	Type of binarization: `frame_max`, each frame is treated individually, max of each frame is set to one, all others to zero. `global_threshold`, global threshold, all values over the threshold are set to one.
threshold	float	Threshold value. Set to null if not used.
event_activity_processing, Event activity processing per frame.
enable	bool	Enable activity processing.
type	string {median_filtering}	Type of decision: `median_filtering`, median filtering of decision inside window.
window_length_seconds	float	Length of sliding window in seconds for activity processing.
event_post_processing, Event post processing per event.
enable	bool	Enable event processing.
minimum_event_length_seconds	float	Minimum allowed event length. Shorter events will be removed.
minimum_event_gap_second	float	Minimum allowed gap between events. Smaller gaps between events will cause events to be merged together.

Evaluator¶

This section defines the evaluation stage of the system.

Example section:

evaluator:
    enable: true
    show_details: false

    saving:
        enable: true
        filename: eval_[{parameter_hash}].yaml

Field name	Value type	Description
enable	bool	Section selector
show_details	bool	Show more detailed metrics (class-wise, scene-wise, event-wise)
saving, Saving evaluation results
enable	bool	Enable result saving into yaml-file.
filename	string	Filename for the evalution results. Following magic fields can be used: `{parameter_hash}` `{parameter_set}` `{dataset_name}`

Logging¶

This section defines the system logging.

Example section:

logging:
    enable: true
    colored: true

    parameters:
        version: 1
        disable_existing_loggers: false
        formatters:
            simple:
                format: "[%(levelname)-8s] %(message)s"
            normal:
                format: "%(asctime)s\t[%(name)-20s]\t[%(levelname)-8s]\t%(message)s"
            extended:
                format: "[%(asctime)s] [%(name)s]\t [%(levelname)-8s]\t %(message)s \t(%(filename)s:%(lineno)s)"
        handlers:
            console:
                class: logging.StreamHandler
                level: DEBUG
                formatter: simple
                stream: ext://sys.stdout

            info_file_handler:
                class: logging.handlers.RotatingFileHandler
                level: INFO
                formatter: normal
                filename: task1.info.log
                maxBytes: 10485760
                backupCount: 20
                encoding: utf8

            debug_file_handler:
                class: logging.handlers.RotatingFileHandler
                level: DEBUG
                formatter: normal
                filename: task1.debug.log
                maxBytes: 10485760
                backupCount: 20
                encoding: utf8

            error_file_handler:
                class: logging.handlers.RotatingFileHandler
                level: ERROR
                formatter: extended
                filename: task1.errors.log
                maxBytes: 10485760
                backupCount: 20
                encoding: utf8

        loggers:
            my_module:
                level: ERROR
                handlers: [console]
                propagate: no

        root:
            level: INFO
            handlers: [console, error_file_handler, info_file_handler, debug_file_handler]

Field name	Value type	Description
enable	bool	Enable logging
colored	bool	Enable colored logging when printing it on console.
parameters, Logging parameters `logging.config.dictConfig(parameters)`, see documentation.