nnAudio.utils.create_fourier_kernels
- nnAudio.utils.create_fourier_kernels(n_fft, win_length=None, freq_bins=None, fmin=50, fmax=6000, sr=44100, freq_scale='linear', window='hann', verbose=True)
This function creates the Fourier kernels for STFT, Melspectrogram and CQT. Most of the parameters follow librosa conventions. Part of the code comes from pytorch_musicnet: https://github.com/jthickstun/pytorch_musicnet
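To make the idea of a Fourier kernel concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the helper name `sketch_fourier_kernels`, the periodic Hann window, and the use of nested lists in place of NumPy arrays are all assumptions made for illustration. It covers only linearly spaced bins starting at bin 0.

```python
import math

def sketch_fourier_kernels(n_fft, freq_bins=None):
    """Hypothetical helper: windowed sine/cosine kernels for bins 0..freq_bins-1."""
    if freq_bins is None:
        freq_bins = n_fft // 2 + 1  # documented default number of bins

    # Periodic Hann window (an assumption for this sketch).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * s / n_fft) for s in range(n_fft)]

    wsin, wcos = [], []
    for k in range(freq_bins):
        # Each bin gets one row; the extra list level gives a singleton
        # middle axis, mirroring the documented (freq_bins, 1, n_fft) shape.
        wsin.append([[window[s] * math.sin(2 * math.pi * k * s / n_fft)
                      for s in range(n_fft)]])
        wcos.append([[window[s] * math.cos(2 * math.pi * k * s / n_fft)
                      for s in range(n_fft)]])
    return wsin, wcos

wsin, wcos = sketch_fourier_kernels(n_fft=8)
print(len(wsin), len(wsin[0]), len(wsin[0][0]))  # 5 1 8
```

Taking the inner product of an audio frame with row k of the cosine and sine kernels gives the real and imaginary parts of that frame's windowed DFT at bin k, up to sign convention.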
- Parameters
n_fft (int) – The window size.
freq_bins (int) – Number of frequency bins. Default is None, which means n_fft//2+1 bins.
fmin (int) – The starting frequency for the lowest frequency bin. If freq_scale is ‘no’, this argument does nothing.
fmax (int) – The ending frequency for the highest frequency bin. If freq_scale is ‘no’, this argument does nothing.
sr (int) – The sampling rate for the input audio. It is used to calculate the correct fmin and fmax. Setting the correct sampling rate is very important for calculating the correct frequency.
freq_scale ('linear', 'log', or 'no') – Determines the spacing between frequency bins. When ‘linear’ or ‘log’ is used, the bin spacing can be controlled by fmin and fmax. If ‘no’ is used, the bins start at 0 Hz and end at the Nyquist frequency with linear spacing.
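The three freq_scale options can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper name `sketch_bin_frequencies` is hypothetical, and the endpoint convention (the first bin sits at fmin and the spacing is chosen so the next bin after the last would land on fmax) is an assumption that may differ from the actual implementation.

```python
import math

def sketch_bin_frequencies(n_fft, freq_bins=None, fmin=50, fmax=6000,
                           sr=44100, freq_scale="linear"):
    """Hypothetical helper: centre frequency in Hz of each bin."""
    if freq_bins is None:
        freq_bins = n_fft // 2 + 1

    if freq_scale == "linear":
        # Equal steps in Hz between fmin and fmax (fmax excluded: assumption).
        step = (fmax - fmin) / freq_bins
        return [fmin + k * step for k in range(freq_bins)]
    if freq_scale == "log":
        # Equal frequency ratios between fmin and fmax (fmax excluded: assumption).
        rate = math.log(fmax / fmin) / freq_bins
        return [fmin * math.exp(k * rate) for k in range(freq_bins)]
    if freq_scale == "no":
        # Plain DFT bins: fmin and fmax are ignored, only sr and n_fft matter.
        return [k * sr / n_fft for k in range(freq_bins)]
    raise ValueError("freq_scale must be 'linear', 'log', or 'no'")

print(sketch_bin_frequencies(8, sr=8000, freq_scale="no"))
# [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
```

With the default number of bins, the 'no' case runs from 0 Hz up to sr/2, which is why a wrong sr shifts every reported frequency.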
- Returns
wsin (numpy.array) – Imaginary Fourier Kernel with the shape (freq_bins, 1, n_fft)
wcos (numpy.array) – Real Fourier Kernel with the shape (freq_bins, 1, n_fft)
bins2freq (list) – Mapping each frequency bin to frequency in Hz.
binslist (list) – The normalized frequency k in digital domain. This k is in the Discrete Fourier Transform equation $$