nnAudio.Spectrogram.Combined_Frequency_Periodicity
- class nnAudio.Spectrogram.Combined_Frequency_Periodicity(fr=2, fs=16000, hop_length=320, window_size=2049, fc=80, tc=0.001, g=[0.24, 0.6, 1], NumPerOct=48)
- Bases: torch.nn.modules.module.Module
- Vectorized version of the code in https://github.com/leo-so/VocalMelodyExtPatchCNN/blob/master/MelodyExt.py. This feature is described in ‘Combining Spectral and Temporal Representations for Multipitch Estimation of Polyphonic Music’ (https://ieeexplore.ieee.org/document/7118691).
- Parameters
- fr (int) – Frequency resolution. The higher the number, the lower the resolution. The maximum frequency resolution occurs when fr=1. The default value is 2.
- fs (int) – Sample rate of the input audio clips. The default value is 16000.
- hop_length (int) – The hop (or stride) size. The default value is 320.
- window_size (int) – The same as n_fft in other Spectrogram classes. The default value is 2049.
- fc (int) – Starting frequency. For example, fc=80 means that Z starts at 80Hz. The default value is 80.
- tc (float) – Inverse of the ending frequency. For example, tc=1/8000 means that Z ends at 8000Hz. The default value in the class signature is 0.001.
- g (list) – Coefficients for the non-linear activation function. len(g) should be the number of activation layers. Each element in g is an activation coefficient, for example [0.24, 0.6, 1].
- device (str) – The device on which to initialize this layer. The default value is ‘cpu’.
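As a rough illustration of how these parameters relate to physical units, the sketch below converts the defaults from the class signature into a frame period and an analysis range. The helper name `describe_parameters` is hypothetical and not part of nnAudio. It only restates the arithmetic implied by the descriptions above, assuming the hop is measured in samples and that `tc` is the inverse of the ending frequency.

```python
import math


def describe_parameters(fs=16000, hop_length=320, fc=80, tc=0.001):
    """Hypothetical helper: restate the layer parameters in physical units."""
    frame_period_s = hop_length / fs        # seconds between successive frames
    frames_per_second = fs / hop_length     # frames produced per second of audio
    start_hz = float(fc)                    # Z starts at fc
    end_hz = 1.0 / tc                       # tc is the inverse of the ending frequency
    octaves = math.log2(end_hz / start_hz)  # width of the analysed range in octaves
    return {
        "frame_period_s": frame_period_s,
        "frames_per_second": frames_per_second,
        "start_hz": start_hz,
        "end_hz": end_hz,
        "octaves": octaves,
    }


info = describe_parameters()
for name, value in info.items():
    print(f"{name}: {value:.4f}")
```

With the signature defaults this corresponds to one frame every 20 ms and a range that starts at 80Hz and ends at 1000Hz. Passing `tc=1/8000` moves the upper end to 8000Hz.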
 
- Returns
- Z (torch.tensor) – The Combined Frequency and Period Feature. It is equivalent to tfrLF * tfrLQ.
- tfrL0 (torch.tensor) – STFT output 
- tfrLF (torch.tensor) – Frequency Feature 
- tfrLQ (torch.tensor) – Period Feature 
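The relation between Z, tfrLF and tfrLQ can be pictured as an elementwise product. The following is a minimal sketch under stated assumptions: it uses plain nested lists instead of tensors, the toy values are invented for illustration, and the function name `combine_features` is hypothetical rather than part of nnAudio. It assumes that `*` acts elementwise on two arrays of equal shape, as it does for torch tensors.

```python
def combine_features(freq_feature, period_feature):
    """Hypothetical sketch: elementwise product of two equally shaped 2-D lists."""
    return [
        [f * q for f, q in zip(freq_row, period_row)]
        for freq_row, period_row in zip(freq_feature, period_feature)
    ]


# Toy 2x2 features (rows and columns stand in for bins and frames).
toy_tfrLF = [[1.0, 0.0], [0.5, 2.0]]
toy_tfrLQ = [[0.0, 3.0], [2.0, 0.5]]

toy_Z = combine_features(toy_tfrLF, toy_tfrLQ)
print(toy_Z)
```

In this toy case an entry of the product is nonzero only where both input features are nonzero, which is the sense in which the combined feature keeps what the frequency and period representations agree on.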
 
- Examples
>>> from nnAudio import Spectrogram
>>> spec_layer = Spectrogram.Combined_Frequency_Periodicity()
>>> Z, tfrL0, tfrLF, tfrLQ = spec_layer(x)
- Methods
- __init__ – Initializes internal Module state, shared by both nn.Module and ScriptModule.
- create_logfreq_matrix
- forward
- nonlinear_func