nnAudio.features.gammatone.Gammatonegram
- class nnAudio.features.gammatone.Gammatonegram(sr=44100, n_fft=2048, n_bins=64, hop_length=512, window='hann', center=True, pad_mode='reflect', power=2.0, htk=False, fmin=20.0, fmax=None, norm=1, trainable_bins=False, trainable_STFT=False, verbose=True)
 Bases: torch.nn.modules.module.Module
This class calculates the Gammatonegram of the input signal.
The input signal should have one of the following shapes: 1. (len_audio), 2. (num_audio, len_audio), 3. (num_audio, 1, len_audio). The correct shape is inferred automatically if the input follows one of these three shapes. This class inherits from nn.Module, so it is used the same way as any nn.Module.
- Parameters
 sr (int) – The sampling rate of the input audio. It is used to calculate the correct fmin and fmax. Setting the correct sampling rate is essential for calculating the correct frequencies.
 n_fft (int) – The window size for the STFT. Default value is 2048.
 n_bins (int) – The number of Gammatone filter banks. The filter banks map the STFT output (computed with n_fft) to Gammatone bins. Default value is 64.
 hop_length (int) – The hop (or stride) size. Default value is 512.
 window (str) – The windowing function for the STFT. It uses scipy.signal.get_window; please refer to the SciPy documentation for the available windowing functions. Default value is ‘hann’.
 center (bool) – Whether to place the STFT kernel at the center of the time step. If False, the time index is the beginning of the STFT kernel; if True, the time index is the center of the STFT kernel. Default value is True.
 pad_mode (str) – The padding method. Default value is ‘reflect’.
 htk (bool) – When False is used, the Mel scale is quasi-logarithmic. When True is used, the Mel scale is logarithmic. Default value is False.
 fmin (float) – The starting frequency for the lowest Gammatone filter bank. Default value is 20.0.
 fmax (float) – The ending frequency for the highest Gammatone filter bank. Default value is None.
 trainable_bins (bool) – Determines whether the Gammatone filter banks are trainable. If True, the gradients for the Gammatone filter banks are also calculated and the filter banks are updated during model training. Default value is False.
 trainable_STFT (bool) – Determines whether the STFT kernels are trainable. If True, the gradients for the STFT kernels are also calculated and the STFT kernels are updated during model training. Default value is False.
 verbose (bool) – If True, it shows layer information. If False, it suppresses all prints. Default value is True.
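To illustrate why sr matters for fmin and fmax, the sketch below works out which STFT bins a given frequency band covers. It is plain Python and not part of nnAudio: the helper name stft_bin_range is hypothetical, and treating fmax=None as the Nyquist frequency (sr / 2) is an assumption that the parameter descriptions do not confirm.

```python
import math


def stft_bin_range(sr=44100, n_fft=2048, fmin=20.0, fmax=None):
    """Return (first_bin, last_bin, bin_width_hz) for the band [fmin, fmax].

    Hypothetical helper, for illustration only.
    """
    if fmax is None:
        fmax = sr / 2.0  # assumption: None is treated as the Nyquist frequency
    bin_width = sr / n_fft  # spacing between neighbouring STFT bins, in Hz
    first_bin = math.ceil(fmin / bin_width)   # first bin at or above fmin
    last_bin = math.floor(fmax / bin_width)   # last bin at or below fmax
    return first_bin, last_bin, bin_width


# The same fmin lands on a different bin when the sampling rate changes.
print(stft_bin_range(sr=44100))
print(stft_bin_range(sr=22050))
```

With the same n_fft, halving sr halves the bin spacing, so a wrong sr shifts every filter to the wrong frequency.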
- Returns
 spectrogram – A tensor of spectrograms with shape (num_samples, freq_bins, time_steps).
- Return type
 torch.tensor
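The returned shape can be estimated before running the layer. The following is a minimal sketch in plain Python, not nnAudio code: the helper name expected_output_shape is hypothetical, freq_bins is assumed to equal n_bins, and the frame count assumes the common STFT framing convention (with center=True the signal is padded by n_fft // 2 on each side; with center=False only complete windows are counted). Treat the result as an estimate and check it against the actual output.

```python
def expected_output_shape(num_audio, len_audio, n_fft=2048, n_bins=64,
                          hop_length=512, center=True):
    """Estimate (num_samples, freq_bins, time_steps) for a batch of audio.

    Hypothetical helper; assumes the common STFT framing convention.
    """
    if center:
        # Assumed: n_fft // 2 samples of padding on each side before framing.
        padded_len = len_audio + 2 * (n_fft // 2)
    else:
        padded_len = len_audio
    if padded_len < n_fft:
        raise ValueError("audio is shorter than one STFT window")
    # One frame per hop, counting only windows that fit in the padded signal.
    time_steps = (padded_len - n_fft) // hop_length + 1
    return (num_audio, n_bins, time_steps)


# Four one-second clips at 44100 Hz with the default settings.
print(expected_output_shape(num_audio=4, len_audio=44100))
```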
Examples
>>> from nnAudio.features.gammatone import Gammatonegram
>>> spec_layer = Gammatonegram()
>>> specs = spec_layer(x)  # x: audio tensor in one of the three accepted shapes
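The three accepted input shapes can be summarized with a small helper. This is only a sketch of the shape rule stated in the class description, written in plain Python over shape tuples. The function name canonical_input_shape and the choice of (num_audio, 1, len_audio) as the common form are assumptions made for illustration; they do not describe nnAudio's internal code.

```python
def canonical_input_shape(shape):
    """Map an accepted input shape to (num_audio, 1, len_audio).

    Hypothetical helper illustrating the three documented input shapes.
    """
    shape = tuple(shape)
    if len(shape) == 1:                      # (len_audio,)
        return (1, 1, shape[0])
    if len(shape) == 2:                      # (num_audio, len_audio)
        return (shape[0], 1, shape[1])
    if len(shape) == 3 and shape[1] == 1:    # (num_audio, 1, len_audio)
        return shape
    raise ValueError(f"unsupported input shape: {shape}")


for s in [(44100,), (8, 44100), (8, 1, 44100)]:
    print(s, "->", canonical_input_shape(s))
```

Any other shape, such as stereo audio stored as (num_audio, 2, len_audio), falls outside the three documented cases and is rejected by this sketch.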
Methods
__init__ – Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward