nnAudio.Spectrogram.MFCC

class nnAudio.Spectrogram.MFCC(sr=22050, n_mfcc=20, norm='ortho', device='cpu', verbose=True, **kwargs)

Bases: torch.nn.modules.module.Module

This class calculates the Mel-frequency cepstral coefficients (MFCCs) of the input signal. Only the type-II DCT is supported at the moment. The input signal should have one of the following shapes.

  1. (len_audio)

  2. (num_audio, len_audio)

  3. (num_audio, 1, len_audio)

The correct shape is inferred automatically if the input has one of these three shapes. Most of the arguments follow the librosa conventions. This class inherits from torch.nn.Module, so it is used in the same way as any other torch.nn.Module.
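The shape handling described above can be sketched as follows. This is a minimal illustration, assuming the three accepted layouts are all mapped to (num_audio, 1, len_audio); the helper name to_batch_channel_time is hypothetical and is not part of the nnAudio API.

```python
import torch


def to_batch_channel_time(x: torch.Tensor) -> torch.Tensor:
    """Reshape a waveform tensor to (num_audio, 1, len_audio).

    Hypothetical helper for illustration only; not an nnAudio function.
    """
    if x.dim() == 1:  # (len_audio)
        return x[None, None, :]
    if x.dim() == 2:  # (num_audio, len_audio)
        return x[:, None, :]
    if x.dim() == 3:  # (num_audio, 1, len_audio), already in the target layout
        return x
    raise ValueError(f"Expected a 1D, 2D, or 3D tensor, got {x.dim()}D")


# One clip, a batch of four clips, and a batch with an explicit channel axis.
for waveform in (torch.zeros(22050), torch.zeros(4, 22050), torch.zeros(4, 1, 22050)):
    print(tuple(to_batch_channel_time(waveform).shape))
```

Inputs with any other number of dimensions raise an error in this sketch; how nnAudio itself reports unsupported shapes is not stated here.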

Parameters
  • sr (int) – The sampling rate of the input audio. It is used to calculate the correct fmin and fmax, so an incorrect sampling rate gives incorrect frequencies.

  • n_mfcc (int) – The number of Mel-frequency cepstral coefficients.

  • norm (string) – Normalization for the DCT basis. The default value is ‘ortho’.

  • **kwargs – Other arguments for Melspectrogram, such as n_fft, n_mels, hop_length, and window.

Returns

MFCCs – A tensor of MFCCs with shape (num_samples, n_mfcc, time_steps).

Return type

torch.tensor

Examples

>>> from nnAudio import Spectrogram
>>> spec_layer = Spectrogram.MFCC()
>>> mfcc = spec_layer(x)  # x: waveform tensor in one of the shapes listed above

Methods

__init__

Initializes internal Module state, shared by both nn.Module and ScriptModule.

dct

Refer to https://github.com/zh217/torch-dct for the original implementation.

forward

Convert a batch of waveforms to MFCC.

dct(x, norm=None)

Refer to https://github.com/zh217/torch-dct for the original implementation.
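For reference, the type-II DCT with orthonormal scaling can be written as a direct sum. The sketch below is a plain-Python illustration of that textbook definition, not the FFT-based routine from torch-dct, and the function name dct_ii_ortho is hypothetical. In the usual MFCC formulation (as in librosa), this transform is applied across the mel bands of the log-power mel spectrogram and the first n_mfcc coefficients are kept; that this class does exactly the same is an assumption here.

```python
import math


def dct_ii_ortho(x):
    """Orthonormal type-II DCT of a 1D sequence (direct O(N^2) definition).

    Hypothetical reference function for illustration only.
    """
    n = len(x)
    coeffs = []
    for k in range(n):
        # Unscaled DCT-II sum for coefficient k.
        total = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)
        )
        # 'ortho' scaling makes the transform matrix orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


signal = [1.0, 2.0, 3.0, 4.0]
coeffs = dct_ii_ortho(signal)
# With orthonormal scaling, the energy of the signal is preserved.
print(sum(v * v for v in signal), sum(c * c for c in coeffs))
```

Because the scaled basis is orthonormal, the sum of squares is the same before and after the transform, up to floating-point error.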

forward(x)

Convert a batch of waveforms to MFCC.

Parameters

x (torch tensor) –

The input signal should have one of the following shapes.

  1. (len_audio)

  2. (num_audio, len_audio)

  3. (num_audio, 1, len_audio)

It will be automatically broadcast to the right shape.

class power_to_db(ref=1.0, amin=1e-10, top_db=80.0, device='cpu')

Bases: object

Refer to https://librosa.github.io/librosa/_modules/librosa/core/spectrum.html#power_to_db for the original implementation.

forward(S)
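The conversion that the referenced librosa function performs can be sketched in PyTorch as below. This is an illustration under assumptions: it follows librosa's formula with a scalar ref and clips against the maximum of the whole tensor, and the function name power_to_db_sketch is hypothetical. The class documented here may differ in details such as per-item clipping or device handling.

```python
import math

import torch


def power_to_db_sketch(S, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert a power spectrogram to decibels, following librosa's formula.

    Hypothetical function for illustration only; not the nnAudio class.
    """
    S = torch.as_tensor(S, dtype=torch.float32)
    # 10 * log10(S / ref), with both terms floored at amin to avoid log(0).
    log_spec = 10.0 * torch.log10(torch.clamp(S, min=amin))
    log_spec = log_spec - 10.0 * math.log10(max(amin, ref))
    if top_db is not None:
        # Limit the dynamic range to top_db below the loudest value.
        log_spec = torch.clamp(log_spec, min=log_spec.max().item() - top_db)
    return log_spec


print(power_to_db_sketch([1.0, 0.1, 1e-12]))
```

With the default arguments, the three powers above map to approximately 0 dB, -10 dB, and -80 dB: the smallest value is first floored at amin (-100 dB) and then raised to the top_db limit.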