dcase_framework.learners.SceneClassifierGMM

class dcase_framework.learners.SceneClassifierGMM(*args, **kwargs)

Scene classifier with GMM

This learner uses the sklearn.mixture.GaussianMixture implementation from scikit-learn. See the scikit-learn documentation for details on the underlying model.

Usage example:

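# Imports needed by this example (module paths as laid out in dcase_framework)
from dcase_framework.features import FeatureExtractor
from dcase_framework.learners import SceneClassifierGMM
from dcase_framework.metadata import MetaDataItem
from dcase_framework.recognizers import SceneRecognizer

# Audio files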
files = ['example1.wav', 'example2.wav', 'example3.wav']

# Meta data
annotations = {
    'example1.wav': MetaDataItem(
        {
            'file': 'example1.wav',
            'scene_label': 'SceneA'
        }
    ),
    'example2.wav': MetaDataItem(
        {
            'file': 'example2.wav',
            'scene_label': 'SceneB'
        }
    ),
    'example3.wav': MetaDataItem(
        {
            'file': 'example3.wav',
            'scene_label': 'SceneC'
        }
    ),
}

# Extract features
feature_data = {}
for file in files:
    feature_data[file] = FeatureExtractor().extract(
        audio_file=file,
        extractor_name='mfcc',
        extractor_params={
            'mfcc': {
                'n_mfcc': 10
            }
        }
    )['mfcc']

# Learn acoustic model
learner_params = {
    'n_components': 1,
    'covariance_type': 'diag',
    'tol': 0.001,
    'reg_covar': 0,
    'max_iter': 40,
    'n_init': 1,
    'init_params': 'kmeans',
    'random_state': 0,
}

gmm_learner = SceneClassifierGMM(
    filename='gmm_model.cpickle',
    class_labels=['SceneA', 'SceneB', 'SceneC'],
    params=learner_params,
)

gmm_learner.learn(
    data=feature_data,
    annotations=annotations
)

# Recognition
recognizer_params = {
    'frame_accumulation': {
        'enable': True,
        'type': 'sum'
    },
    'decision_making': {
        'enable': True,
        'type': 'maximum',
    }
}
correctly_predicted = 0
for file in feature_data:
    frame_probabilities = gmm_learner.predict(
        feature_data=feature_data[file],
    )

    # Scene recognizer
    current_result = SceneRecognizer(
        params=recognizer_params,
        class_labels=gmm_learner.class_labels,
    ).process(
        frame_probabilities=frame_probabilities
    )

    if annotations[file].scene_label == current_result:
        correctly_predicted += 1
    print(current_result, annotations[file].scene_label)

print('Accuracy = {:3.2f} %'.format(correctly_predicted/float(len(feature_data))*100))
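The recognizer settings used above ('frame_accumulation' of type 'sum' followed by 'decision_making' of type 'maximum') can be illustrated with a minimal pure-Python sketch. This is not the SceneRecognizer implementation; the helper names `accumulate_sum` and `decide_maximum`, the example values, and the assumed layout (one row per class, one column per frame) are hypothetical.

```python
def accumulate_sum(frame_probabilities):
    """Sum each class row over all frames (assumed layout: classes x frames)."""
    return [sum(row) for row in frame_probabilities]


def decide_maximum(accumulated, class_labels):
    """Return the label whose accumulated score is the largest."""
    best_index = max(range(len(accumulated)), key=lambda i: accumulated[i])
    return class_labels[best_index]


class_labels = ['SceneA', 'SceneB', 'SceneC']

# Hypothetical frame-wise log-likelihoods: one row per class, one column per frame.
frame_probabilities = [
    [-12.0, -11.5, -13.0],   # SceneA
    [-9.0, -10.0, -9.5],     # SceneB
    [-14.0, -12.5, -13.5],   # SceneC
]

accumulated = accumulate_sum(frame_probabilities)
predicted = decide_maximum(accumulated, class_labels)
print(accumulated, predicted)
```

Summing over frames before taking the maximum makes the decision depend on the whole recording rather than on any single frame.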

Learner parameters

Field name       Value type                                  Description
n_components     int                                         The number of mixture components.
covariance_type  string { full | tied | diag | spherical }   Covariance type.
tol              float                                       Convergence threshold.
reg_covar        float                                       Non-negative regularization added to the diagonal of covariance.
max_iter         int                                         The maximum number of EM iterations to perform.
n_init           int                                         The number of initializations to perform.
init_params      string { kmeans | random }                  The method used to initialize the weights, means, and precisions.
random_state     int                                         Random seed.
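Since the learner parameters share their names with the constructor arguments of sklearn.mixture.GaussianMixture, they can be passed to that class directly. The sketch below assumes one mixture is fitted per scene label and that frame scores come from `score_samples`; it uses synthetic 10-dimensional features in place of MFCCs and should be read as an illustration of the idea, not as the internals of SceneClassifierGMM.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

learner_params = {
    'n_components': 1,
    'covariance_type': 'diag',
    'tol': 0.001,
    'reg_covar': 1e-6,      # scikit-learn's default; the usage example sets 0
    'max_iter': 40,
    'n_init': 1,
    'init_params': 'kmeans',
    'random_state': 0,
}

class_labels = ['SceneA', 'SceneB', 'SceneC']
# Hypothetical, well-separated feature means standing in for real MFCC data.
class_means = {'SceneA': 0.0, 'SceneB': 5.0, 'SceneC': 10.0}

rng = np.random.RandomState(0)

# Assumption: one GaussianMixture is fitted per class label.
models = {}
for label in class_labels:
    frames = rng.normal(loc=class_means[label], scale=1.0, size=(200, 10))
    models[label] = GaussianMixture(**learner_params).fit(frames)

# Score unseen frames drawn from the SceneB distribution against every model.
test_frames = rng.normal(loc=5.0, scale=1.0, size=(50, 10))
frame_log_likelihoods = np.vstack(
    [models[label].score_samples(test_frames) for label in class_labels]
)

# One row per class, one column per frame.
print(frame_log_likelihoods.shape)
print(dict(zip(class_labels, frame_log_likelihoods.mean(axis=1))))
```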

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)
clear() -> None. Remove all items from D.
copy() -> a shallow copy of D
detect_file_format(filename) Detect file format from extension
empty() Check if file is empty
exists() Checks that file exists
fromkeys(...) v defaults to None.
get(k[,d]) -> D[k] if k in D, ...
get_dump_content(data) Clean internal content for saving
get_file_information() Get file information, filename
get_hash([data]) Get unique hash string (md5) for given parameter dict
get_hash_for_path([dotted_path])
get_path(dotted_path[, default, data]) Get value from nested dict with dotted path
has_key(k) -> True if D has a key k, else False
items() -> list of D’s (key, value) pairs, ...
iteritems() -> an iterator over the (key, ...
iterkeys() -> an iterator over the keys of D
itervalues(...)
keys() -> list of D’s keys
learn(data, annotations[, data_filenames]) Learn based on data and annotations
load(*args, **kwargs) Load file
log([level]) Log container content
merge(override[, target]) Recursive dict merge
pop(k[,d]) -> v, ... If key is not found, d is returned if given, otherwise KeyError is raised
popitem() -> (k, v), ... 2-tuple; but raise KeyError if D is empty.
predict(feature_data) Predict frame probabilities for given feature matrix
save(*args, **kwargs) Save file
set_path(dotted_path, new_value[, data]) Set value in nested dict with dotted path
set_seed([seed]) Set randomization seeds
setdefault(k[,d]) -> D.get(k,d), ...
show() Print container content
update([E, ...) If E present and has a .keys() method, does: for k in E: D[k] = E[k]
values() -> list of D’s values
viewitems(...)
viewkeys(...)
viewvalues(...)
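To make the terms "dotted path" and "recursive dict merge" from the method list concrete, here is a minimal sketch on a plain nested dict. The function names `get_by_dotted_path`, `set_by_dotted_path`, and `merge_recursive` are hypothetical stand-ins written for illustration; they are not the library's get_path, set_path, or merge, and their behaviour in edge cases may differ.

```python
def get_by_dotted_path(data, dotted_path, default=None):
    """Follow 'a.b.c' through nested dicts; return default if a key is missing."""
    current = data
    for key in dotted_path.split('.'):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def set_by_dotted_path(data, dotted_path, new_value):
    """Set a nested value, creating intermediate dicts as needed.

    Assumes every existing intermediate value on the path is a dict.
    """
    keys = dotted_path.split('.')
    current = data
    for key in keys[:-1]:
        current = current.setdefault(key, {})
    current[keys[-1]] = new_value


def merge_recursive(target, override):
    """Merge override into target in place, descending into nested dicts."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge_recursive(target[key], value)
        else:
            target[key] = value
    return target


params = {'frame_accumulation': {'enable': True, 'type': 'sum'}}

accumulation_type = get_by_dotted_path(params, 'frame_accumulation.type')
set_by_dotted_path(params, 'decision_making.type', 'maximum')
merge_recursive(params, {'frame_accumulation': {'type': 'mean'}})
print(accumulation_type, params)
```

Note that the merge only replaces the keys present in the override, so 'enable' under 'frame_accumulation' is preserved while 'type' is updated.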

Attributes

class_labels Class labels
feature_aggregator Feature aggregator instance
feature_masker Feature masker instance
feature_normalizer Feature normalizer instance
feature_stacker Feature stacker instance
learner_params Get learner parameters from parameter container
method Learner method label
model Acoustic model
params Parameters
valid_formats