nnAudio.features.gammatone.Gammatonegram
- class nnAudio.features.gammatone.Gammatonegram(sr=44100, n_fft=2048, n_bins=64, hop_length=512, window='hann', center=True, pad_mode='reflect', power=2.0, htk=False, fmin=20.0, fmax=None, norm=1, trainable_bins=False, trainable_STFT=False, verbose=True)
Bases: torch.nn.modules.module.Module
This class calculates the Gammatonegram of the input signal.
The input signal should have one of the following shapes: 1. (len_audio), 2. (num_audio, len_audio), 3. (num_audio, 1, len_audio). The correct shape is inferred automatically if the input follows one of these three shapes. This class inherits from nn.Module, so it is used in the same way as nn.Module.
- Parameters
sr (int) – The sampling rate of the input audio. It is used to calculate the correct fmin and fmax. Setting the correct sampling rate is very important for calculating the correct frequency.
n_fft (int) – The window size for the STFT. Default value is 2048.
n_bins (int) – The number of Gammatone filter banks. The filter banks map the n_fft to Gammatone bins. Default value is 64.
hop_length (int) – The hop (or stride) size. Default value is 512.
window (str) – The windowing function for the STFT. It uses scipy.signal.get_window; please refer to the scipy documentation for possible windowing functions. The default value is ‘hann’.
center (bool) – Whether to put the STFT kernel at the center of the time step. If False, the time index is the beginning of the STFT kernel; if True, the time index is the center of the STFT kernel. Default value is True.
pad_mode (str) – The padding method. Default value is ‘reflect’.
htk (bool) – When False is used, the Mel scale is quasi-logarithmic. When True is used, the Mel scale is logarithmic. The default value is False.
fmin (int) – The starting frequency for the lowest Gammatone filter bank
fmax (int) – The ending frequency for the highest Gammatone filter bank
trainable_bins (bool) – Determines whether the Gammatone filter banks are trainable. If True, the gradients for the Gammatone filter banks will also be calculated and the filter banks will be updated during model training. Default value is False.
trainable_STFT (bool) – Determines whether the STFT kernels are trainable. If True, the gradients for the STFT kernels will also be calculated and the STFT kernels will be updated during model training. Default value is False.
verbose (bool) – If True, it shows layer information. If False, it suppresses all prints.
- Returns
spectrogram – It returns a tensor of spectrograms. shape = (num_samples, freq_bins, time_steps).
- Return type
torch.tensor
Examples
>>> spec_layer = Spectrogram.Gammatonegram()
>>> specs = spec_layer(x)
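The shape of the returned tensor can be estimated before running the layer. The sketch below is not part of nnAudio: expected_output_shape is a hypothetical helper, and it rests on two assumptions, namely that freq_bins equals n_bins and that frames follow the usual STFT convention (n_fft // 2 samples of padding on each side when center=True, no padding otherwise). It needs only plain Python.

```python
def expected_output_shape(num_audio, len_audio, n_fft=2048, n_bins=64,
                          hop_length=512, center=True):
    """Estimate (num_audio, freq_bins, time_steps) for a batch of signals.

    Hypothetical helper, not an nnAudio function. Assumes standard STFT
    framing: center=True pads n_fft // 2 samples on each side before
    framing; center=False frames the signal as-is.
    """
    padded_len = len_audio + 2 * (n_fft // 2) if center else len_audio
    if padded_len < n_fft:
        raise ValueError("signal is shorter than one STFT window")
    # One frame at offset 0, then one more per full hop that still fits.
    time_steps = (padded_len - n_fft) // hop_length + 1
    # Assumption: one output row per Gammatone bin.
    return (num_audio, n_bins, time_steps)


# Batch of 4 one-second clips at 44.1 kHz with the default settings.
print(expected_output_shape(4, 44100))                # (4, 64, 87)
print(expected_output_shape(4, 44100, center=False))  # (4, 64, 83)
```

If the shape of specs differs from this estimate, the framing or padding assumption above does not match the installed version, and the actual tensor shape should be trusted instead.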
Methods
__init__
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward