nnAudio.Spectrogram.MFCC

class nnAudio.Spectrogram.MFCC(sr=22050, n_mfcc=20, norm='ortho', device='cpu', verbose=True, **kwargs)

Bases: torch.nn.modules.module.Module

This class calculates the Mel-frequency cepstral coefficients (MFCCs) of the input signal. It only supports the type-II DCT at the moment. The input signal should have one of the following shapes:

(len_audio)
(num_audio, len_audio)
(num_audio, 1, len_audio)

The correct shape will be inferred automatically if the input has one of these three shapes. Most of the arguments follow the librosa convention. This class inherits from torch.nn.Module, so it is used like any other torch.nn.Module.
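The shape rule above can be sketched on plain shape tuples, so the example runs without PyTorch. The helper name `infer_batched_shape` is hypothetical and not part of nnAudio, and the check that the middle dimension equals 1 is an assumption made for illustration; the library's own handling may differ.

```python
def infer_batched_shape(shape):
    """Map an accepted input shape to (num_audio, 1, len_audio).

    Hypothetical helper for illustration only; not an nnAudio function.
    """
    shape = tuple(shape)
    if len(shape) == 1:
        # (len_audio): a single waveform becomes a batch of one
        return (1, 1, shape[0])
    if len(shape) == 2:
        # (num_audio, len_audio): insert the singleton channel dimension
        return (shape[0], 1, shape[1])
    if len(shape) == 3 and shape[1] == 1:
        # (num_audio, 1, len_audio): already in the target layout
        return shape
    # Assumption: anything else is rejected
    raise ValueError(f"unsupported input shape: {shape}")
```

With an actual tensor, the same effect is usually obtained by indexing with `None`, for example `x[None, None, :]` for a one-dimensional input.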

Parameters

sr (int) – The sampling rate of the input audio. It is used to calculate the correct fmin and fmax, so an incorrect sampling rate yields incorrect frequencies.
n_mfcc (int) – The number of Mel-frequency cepstral coefficients.
norm (string) – Normalization for the DCT basis. The default value is ‘ortho’.
**kwargs – Other arguments for Melspectrogram, such as n_fft, n_mels, hop_length, and window.

Returns

MFCCs – A tensor of MFCCs with shape = (num_samples, n_mfcc, time_steps).

Return type

torch.Tensor

Examples

>>> from nnAudio import Spectrogram
>>> spec_layer = Spectrogram.MFCC()
>>> mfcc = spec_layer(x)  # x: a waveform tensor in one of the accepted shapes

Methods

`__init__`	Initializes internal Module state, shared by both nn.Module and ScriptModule.
`dct`	Refer to https://github.com/zh217/torch-dct for the original implementation.
`forward`	Convert a batch of waveforms to MFCC.

dct(x, norm=None)

Refer to https://github.com/zh217/torch-dct for the original implementation.
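For reference, a type-II DCT with optional 'ortho' normalization can be sketched in pure Python as below. This is a direct O(N²) evaluation of the textbook formula on a single list of values, not the FFT-based method used by torch-dct, and the function name `dct2` is hypothetical. The factor of 2 in the unnormalized case follows the common SciPy-style convention, which is assumed here rather than confirmed from the source.

```python
import math


def dct2(x, norm=None):
    """Type-II DCT of a sequence of floats (illustrative sketch).

    y[k] = 2 * sum_n x[n] * cos(pi * k * (2n + 1) / (2N))
    With norm='ortho', y[0] is scaled by sqrt(1 / (4N)) and every
    other y[k] by sqrt(1 / (2N)), which makes the transform orthonormal.
    """
    n = len(x)
    out = []
    for k in range(n):
        s = 2.0 * sum(
            x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i in range(n)
        )
        if norm == "ortho":
            s *= math.sqrt(1.0 / (4 * n)) if k == 0 else math.sqrt(1.0 / (2 * n))
        out.append(s)
    return out
```

One useful sanity check is that the 'ortho' variant preserves energy: the sum of squared coefficients equals the sum of squared input values.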

forward(x)

Convert a batch of waveforms to MFCC.

Parameters

x (torch tensor) – Input signal. It should have one of the following shapes:

(len_audio)
(num_audio, len_audio)
(num_audio, 1, len_audio)

It will be automatically broadcast to the right shape.

class power_to_db(ref=1.0, amin=1e-10, top_db=80.0, device='cpu')

Bases: object

Refer to https://librosa.github.io/librosa/_modules/librosa/core/spectrum.html#power_to_db for the original implementation.

forward(S)
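As a rough guide to what the conversion does, here is a minimal pure-Python sketch of the power-to-decibel formula from the librosa function referenced above. It works on a flat list of power values and reuses the parameter names ref, amin, and top_db from the class signature. How this class applies the clipping across a batch of tensors is not shown here, and the standalone function below is an illustration, not the library's code.

```python
import math


def power_to_db(S, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values to decibels (illustrative sketch).

    S is a flat list of non-negative power values.
    """
    # Floor each value at amin so that log10 never receives zero
    log_spec = [10.0 * math.log10(max(amin, s)) for s in S]
    # Express the result relative to the reference power
    ref_db = 10.0 * math.log10(max(amin, ref))
    log_spec = [v - ref_db for v in log_spec]
    if top_db is not None:
        # Limit the dynamic range to top_db below the loudest value
        floor = max(log_spec) - top_db
        log_spec = [max(v, floor) for v in log_spec]
    return log_spec
```

With the default settings, a value equal to ref maps to 0 dB, and any value more than 80 dB below the maximum is raised to that floor.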