dcase_framework.learners.SceneClassifierGMM

class dcase_framework.learners.SceneClassifierGMM(*args, **kwargs)

Scene classifier with GMM

This learner uses the sklearn.mixture.GaussianMixture implementation. See the scikit-learn documentation for details on the underlying model.
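To illustrate the general idea behind classifying scenes with per-class Gaussian models, here is a minimal pure-Python sketch. It assumes one model per class label and covers only the simplest case (a single diagonal-covariance component, i.e. 'n_components': 1 and 'covariance_type': 'diag'). The helper names, the variance floor, and the toy data are hypothetical; this is not the code used by SceneClassifierGMM or scikit-learn.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Fit one diagonal-covariance Gaussian to a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # var_floor is an assumed safeguard against zero variance.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def frame_log_likelihood(frame, model):
    """Log density of one frame under a diagonal Gaussian."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


# Toy 2-dimensional "features" per scene class (made-up numbers).
training_frames = {
    'SceneA': [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    'SceneB': [[5.0, 5.1], [4.8, 5.2], [5.1, 4.9]],
}

# Assumption: one model is trained per class label.
models = {
    label: fit_diag_gaussian(frames)
    for label, frames in training_frames.items()
}

# Score an unseen frame against every class model and pick the best.
test_frame = [4.9, 5.0]
scores = {
    label: frame_log_likelihood(test_frame, model)
    for label, model in models.items()
}
best_label = max(scores, key=scores.get)
print(best_label)
```

With more than one mixture component, fitting requires the EM algorithm, which is what the 'tol', 'max_iter', and 'n_init' parameters control.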

Usage example:

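# Imports (module paths assumed from the dcase_framework package layout)
from dcase_framework.features import FeatureExtractor
from dcase_framework.learners import SceneClassifierGMM
from dcase_framework.metadata import MetaDataItem
from dcase_framework.recognizers import SceneRecognizer

# Audio files
files = ['example1.wav', 'example2.wav', 'example3.wav']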

# Meta data
annotations = {
    'example1.wav': MetaDataItem(
        {
            'file': 'example1.wav',
            'scene_label': 'SceneA'
        }
    ),
    'example2.wav': MetaDataItem(
        {
            'file': 'example2.wav',
            'scene_label': 'SceneB'
        }
    ),
    'example3.wav': MetaDataItem(
        {
            'file': 'example3.wav',
            'scene_label': 'SceneC'
        }
    ),
}

# Extract features
feature_data = {}
for file in files:
    feature_data[file] = FeatureExtractor().extract(
        audio_file=file,
        extractor_name='mfcc',
        extractor_params={
            'mfcc': {
                'n_mfcc': 10
            }
        }
    )['mfcc']

# Learn acoustic model
learner_params = {
    'n_components': 1,
    'covariance_type': 'diag',
    'tol': 0.001,
    'reg_covar': 0,
    'max_iter': 40,
    'n_init': 1,
    'init_params': 'kmeans',
    'random_state': 0,
}

gmm_learner = SceneClassifierGMM(
    filename='gmm_model.cpickle',
    class_labels=['SceneA', 'SceneB', 'SceneC'],
    params=learner_params,
)

gmm_learner.learn(
    data=feature_data,
    annotations=annotations
)

# Recognition
recognizer_params = {
    'frame_accumulation': {
        'enable': True,
        'type': 'sum'
    },
    'decision_making': {
        'enable': True,
        'type': 'maximum',
    }
}
correctly_predicted = 0
for file in feature_data:
    frame_probabilities = gmm_learner.predict(
        feature_data=feature_data[file],
    )

    # Scene recognizer
    current_result = SceneRecognizer(
        params=recognizer_params,
        class_labels=gmm_learner.class_labels,
    ).process(
        frame_probabilities=frame_probabilities
    )

    if annotations[file].scene_label == current_result:
        correctly_predicted += 1
    print(current_result, annotations[file].scene_label)

print('Accuracy = {:3.2f} %'.format(correctly_predicted/float(len(feature_data))*100))
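The recognizer parameters in the example combine 'sum' frame accumulation with a 'maximum' decision. As a rough illustration of what that combination means, the sketch below sums per-frame scores for each class and then picks the class with the largest total. The function names and the data layout (one row of frame scores per class) are assumptions made for this example; it is not the SceneRecognizer implementation.

```python
def accumulate_sum(frame_scores):
    """Sum the per-frame scores of each class over all frames."""
    return [sum(class_row) for class_row in frame_scores]


def decide_maximum(accumulated, class_labels):
    """Return the label whose accumulated score is largest."""
    best_index = max(range(len(accumulated)), key=lambda i: accumulated[i])
    return class_labels[best_index]


class_labels = ['SceneA', 'SceneB', 'SceneC']

# Assumed layout: rows are classes, columns are frames.
# The values are made-up log-likelihoods.
frame_scores = [
    [-12.0, -11.5, -13.0, -12.5],  # SceneA
    [-9.0, -14.0, -8.5, -9.5],     # SceneB
    [-15.0, -10.0, -16.0, -14.0],  # SceneC
]

accumulated = accumulate_sum(frame_scores)
result = decide_maximum(accumulated, class_labels)
print(result)
```

Note that SceneC has the best score in the second frame, yet the decision over the whole file still goes to SceneB, because accumulation smooths out single-frame outliers.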

Learner parameters

Field name	Value type	Description
n_components	int	The number of mixture components.
covariance_type	string { full | tied | diag | spherical }	Covariance type.
tol	float	Convergence threshold.
reg_covar	float	Non-negative regularization added to the diagonal of covariance.
max_iter	int	Maximum number of EM iterations to perform.
n_init	int	The number of initializations to perform.
init_params	string { kmeans | random }	The method used to initialize the weights, means, and precisions.
random_state	int	Random seed.
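Since a mistyped field name or value only surfaces once training starts, it can be handy to check a parameter dict against the table above beforehand. The following is a small sketch of such a check. The helper check_learner_params and its messages are hypothetical and not part of dcase_framework, and "documented" here means only the fields listed in the table.

```python
# Field names, types, and allowed values as listed in the table above.
FIELD_TYPES = {
    'n_components': int,
    'covariance_type': str,
    'tol': float,
    'reg_covar': float,
    'max_iter': int,
    'n_init': int,
    'init_params': str,
    'random_state': int,
}
ALLOWED_VALUES = {
    'covariance_type': ('full', 'tied', 'diag', 'spherical'),
    'init_params': ('kmeans', 'random'),
}


def check_learner_params(params):
    """Hypothetical helper: return a list of problems, empty if none found."""
    problems = []
    for name, value in params.items():
        if name not in FIELD_TYPES:
            problems.append('{}: not a documented field'.format(name))
            continue
        expected = FIELD_TYPES[name]
        # Accept plain ints for float fields (the usage example passes
        # 'reg_covar': 0); bool is rejected because it subclasses int.
        accepted = (int, float) if expected is float else (expected,)
        if isinstance(value, bool) or not isinstance(value, accepted):
            problems.append('{}: expected {}'.format(name, expected.__name__))
        elif name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            problems.append(
                '{}: must be one of {}'.format(name, ALLOWED_VALUES[name])
            )
        elif name == 'reg_covar' and value < 0:
            problems.append('reg_covar: must be non-negative')
    return problems


learner_params = {
    'n_components': 1,
    'covariance_type': 'diag',
    'tol': 0.001,
    'reg_covar': 0,
    'max_iter': 40,
    'n_init': 1,
    'init_params': 'kmeans',
    'random_state': 0,
}
problems_ok = check_learner_params(learner_params)
problems_bad = check_learner_params(
    {'covariance_type': 'round', 'tol': 'small', 'reg_covar': -1.0}
)
print(problems_ok)
print(problems_bad)
```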

__init__(*args, **kwargs)

Methods

`__init__`(*args, **kwargs)
`clear`()	Remove all items from D.
`copy`()	Return a shallow copy of D.
`detect_file_format`(filename)	Detect file format from extension
`empty`()	Check if file is empty
`exists`()	Checks that file exists
`fromkeys`(...)	v defaults to None.
`get`(k[, d])	D[k] if k in D, ...
`get_dump_content`(data)	Clean internal content for saving
`get_file_information`()	Get file information, filename
`get_hash`([data])	Get unique hash string (md5) for given parameter dict
`get_hash_for_path`([dotted_path])
`get_path`(dotted_path[, default, data])	Get value from nested dict with dotted path
`has_key`(k)	True if D has a key k, else False
`items`()	List of D’s (key, value) pairs, ...
`iteritems`()	An iterator over the (key, ...
`iterkeys`()	An iterator over the keys of D
`itervalues`(...)
`keys`()	List of D’s keys
`learn`(data, annotations[, data_filenames])	Learn based on data and annotations
`load`(*args, **kwargs)	Load file
`log`([level])	Log container content
`merge`(override[, target])	Recursive dict merge
`pop`(k[, d])	Remove the specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised
`popitem`()	Remove and return some (key, value) pair as a 2-tuple; raise KeyError if D is empty.
`predict`(feature_data)	Predict frame probabilities for given feature matrix
`save`(*args, **kwargs)	Save file
`set_path`(dotted_path, new_value[, data])	Set value in nested dict with dotted path
`set_seed`([seed])	Set randomization seeds
`setdefault`(k[, d])	D.get(k, d), ...
`show`()	Print container content
`update`([E, ...)	If E present and has a .keys() method, does: for k in E: D[k] = E[k]
`values`()	List of D’s values
`viewitems`(...)
`viewkeys`(...)
`viewvalues`(...)
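Several of the methods listed above (get_path, set_path, merge, get_hash) operate on nested parameter dicts. To make their one-line descriptions more concrete, here is a sketch of the same ideas as standalone functions on a plain dict. The function names, argument order, and the JSON-based serialization used before hashing are assumptions made for illustration; the library's own methods may behave differently in detail.

```python
import hashlib
import json


def get_by_dotted_path(data, dotted_path, default=None):
    """Walk a nested dict along 'a.b.c'; return default if a key is missing."""
    node = data
    for key in dotted_path.split('.'):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def set_by_dotted_path(data, dotted_path, new_value):
    """Set a value along 'a.b.c', creating intermediate dicts as needed."""
    keys = dotted_path.split('.')
    node = data
    for key in keys[:-1]:
        if not isinstance(node.get(key), dict):
            node[key] = {}
        node = node[key]
    node[keys[-1]] = new_value


def recursive_merge(target, override):
    """Merge override into target in place, descending into nested dicts."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            recursive_merge(target[key], value)
        else:
            target[key] = value
    return target


def md5_of_params(data):
    """MD5 hex digest of a dict, independent of key order (via sorted JSON)."""
    serialized = json.dumps(data, sort_keys=True).encode('utf-8')
    return hashlib.md5(serialized).hexdigest()


params = {'frame_accumulation': {'enable': True, 'type': 'sum'}}

accumulation_type = get_by_dotted_path(params, 'frame_accumulation.type')
missing = get_by_dotted_path(params, 'decision_making.type', default='n/a')

set_by_dotted_path(params, 'decision_making.type', 'maximum')
recursive_merge(params, {'frame_accumulation': {'type': 'mean'}})

hash_a = md5_of_params({'tol': 0.001, 'max_iter': 40})
hash_b = md5_of_params({'max_iter': 40, 'tol': 0.001})
print(params, hash_a == hash_b)
```

A hash that ignores key order is useful for naming cached models: two parameter dicts with the same content map to the same identifier.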

Attributes

`class_labels`	Class labels
`feature_aggregator`	Feature aggregator instance
`feature_masker`	Feature masker instance
`feature_normalizer`	Feature normalizer instance
`feature_stacker`	Feature stacker instance
`learner_params`	Get learner parameters from parameter container
`method`	Learner method label
`model`	Acoustic model
`params`	Parameters
`valid_formats`