nnAudio.Spectrogram.Combined_Frequency_Periodicity

class nnAudio.Spectrogram.Combined_Frequency_Periodicity(fr=2, fs=16000, hop_length=320, window_size=2049, fc=80, tc=0.001, g=[0.24, 0.6, 1], NumPerOct=48)

Bases: torch.nn.modules.module.Module

Vectorized version of the code in https://github.com/leo-so/VocalMelodyExtPatchCNN/blob/master/MelodyExt.py. The feature is described in the paper ‘Combining Spectral and Temporal Representations for Multipitch Estimation of Polyphonic Music’ (https://ieeexplore.ieee.org/document/7118691).

Parameters

fr (int) – Frequency resolution. Larger values give a coarser resolution; the finest resolution is obtained with fr=1. The default value is 2.
fs (int) – Sample rate of the input audio clips. The default value is 16000.
hop_length (int) – The hop (or stride) size. The default value is 320.
window_size (int) – Same as n_fft in the other Spectrogram classes. The default value is 2049.
fc (int) – Starting frequency. For example, fc=80 means that Z starts at 80Hz. The default value is 80.
tc (float) – Inverse of the ending frequency. For example, tc=1/8000 means that Z ends at 8000Hz. The default value is 0.001, so Z ends at 1000Hz.
g (list) – Coefficients for the non-linear activation function. len(g) should equal the number of activation layers, and each element of g is the activation coefficient of one layer. The default value is [0.24, 0.6, 1].
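device (str) – The device on which this layer is initialized. The default value is ‘cpu’.
NumPerOct (int) – Number of bins per octave on the log-frequency axis. The default value is 48.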
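The relationship between these parameters can be illustrated with a small standalone sketch. It does not call nnAudio. The helper names `cfp_axis_summary` and `power_activation` are hypothetical, and two assumptions are made based on the referenced MelodyExt.py: the output bins are spaced geometrically from fc with NumPerOct bins per octave, and each activation layer clips negative values to zero before raising them to the power given in g (the edge-bin cutoff used there is omitted).

```python
def cfp_axis_summary(fs=16000, hop_length=320, fc=80, tc=0.001, num_per_oct=48):
    """Derive frame and frequency-axis quantities from the constructor arguments."""
    stop_freq = 1.0 / tc           # tc is the inverse of the ending frequency
    hop_seconds = hop_length / fs  # time between consecutive frames

    # ASSUMPTION: centre frequencies are spaced geometrically from fc,
    # num_per_oct bins per octave, and stop below the ending frequency.
    center_freqs = []
    i = 0
    while True:
        f = fc * 2.0 ** (i / num_per_oct)
        if f >= stop_freq:
            break
        center_freqs.append(f)
        i += 1
    return stop_freq, hop_seconds, center_freqs


def power_activation(values, g):
    """ASSUMPTION: one activation layer = clip negatives to zero, then raise to the power g."""
    return [max(v, 0.0) ** g for v in values]


stop_freq, hop_seconds, center_freqs = cfp_axis_summary()
print("ending frequency (Hz):", stop_freq)
print("hop duration (s):", hop_seconds)
print("number of log-frequency bins:", len(center_freqs))
print("activation with g=0.5:", power_activation([-1.0, 0.0, 4.0], 0.5))
```

Under these assumptions, lowering tc raises the ending frequency and therefore adds bins at the top of the axis, while NumPerOct controls how finely each octave is divided.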

Returns

Z (torch.Tensor) – The combined frequency and period feature. It is equivalent to tfrLF * tfrLQ.
tfrL0 (torch.Tensor) – STFT output.
tfrLF (torch.Tensor) – Frequency feature.
tfrLQ (torch.Tensor) – Period feature.
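The statement that Z is equivalent to tfrLF * tfrLQ describes an element-wise product, so a bin keeps a non-zero value only where both the frequency feature and the period feature are non-zero. The sketch below shows this with plain Python lists and made-up toy values; `combine_features` is a hypothetical helper, not part of nnAudio.

```python
def combine_features(tfr_lf, tfr_lq):
    """Element-wise product of two equally shaped 2-D lists."""
    return [
        [f * q for f, q in zip(row_f, row_q)]
        for row_f, row_q in zip(tfr_lf, tfr_lq)
    ]


# Toy values chosen for illustration only; real outputs are torch tensors.
tfr_lf = [[0.0, 2.0], [3.0, 1.0]]
tfr_lq = [[5.0, 0.5], [0.0, 4.0]]

z = combine_features(tfr_lf, tfr_lq)
print(z)  # [[0.0, 1.0], [0.0, 4.0]]
```

With the tensors returned by the layer, the same operation is simply `tfrLF * tfrLQ`.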

Examples

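>>> from nnAudio import Spectrogram
>>> spec_layer = Spectrogram.Combined_Frequency_Periodicity()
>>> # x is a torch tensor holding the input audio, sampled at fs
>>> Z, tfrL0, tfrLF, tfrLQ = spec_layer(x)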

Methods

`__init__`	Initializes internal Module state, shared by both nn.Module and ScriptModule.
`create_logfreq_matrix`
`forward`
`nonlinear_func`