neural_source_filter package
- class neural_source_filter.NSFHifiganGenerator(in_channels: int, out_channels: int, resblock_type: str, resblock_dilation_sizes: Sequence[Sequence[int]], resblock_kernel_sizes: Sequence[int], upsample_kernel_sizes: Sequence[int], upsample_initial_channel: int, upsample_factors: Sequence[int], inference_padding: int = 5, cond_channels: int = 0, conv_pre_weight_norm: bool = True, conv_post_weight_norm: bool = True, conv_post_bias: bool = True, sine_gen: SineGen = SineGen())
Bases: HifiganGenerator
- forward(x: torch.Tensor, g: torch.Tensor | None = None, f0: torch.Tensor = tensor([])) -> torch.Tensor
- Args:
x (Tensor): feature input tensor.
g (Tensor): global conditioning input tensor.
f0 (Tensor): fundamental frequency input tensor.
- Returns:
Tensor: output waveform.
- Shapes:
x: [B, C, T]
g: [B, C_g, T]
f0: [B, 1, T] or [B, T]
- training: bool
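The shapes above use the same T for x and f0, which suggests one F0 value per input frame, while a sine source needs one value per output sample. The sketch below repeats each frame value to reach the sample rate. It assumes that the total upsampling ratio is the product of upsample_factors, as is usual for HiFi-GAN-style generators. The helper name upsample_f0 and the nearest-neighbour repetition are illustrative choices, not part of this package.

```python
import math
from typing import List, Sequence


def upsample_f0(f0_frames: Sequence[float], upsample_factors: Sequence[int]) -> List[float]:
    """Repeat each frame-level F0 value so there is one value per output sample.

    Hypothetical helper: assumes the sample-to-frame ratio equals the
    product of upsample_factors.
    """
    hop = math.prod(upsample_factors)  # samples generated per input frame
    return [f for f in f0_frames for _ in range(hop)]


# Two frames (unvoiced, then 100 Hz) with an assumed total ratio of 2 * 2 = 4.
f0_samples = upsample_f0([0.0, 100.0], upsample_factors=[2, 2])
```

Repetition keeps unvoiced frames at exactly 0, which matters if the voiced/unvoiced decision is taken by thresholding F0; interpolating across a voiced/unvoiced boundary would create spurious low F0 values.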
- class neural_source_filter.SineGen(samp_rate: int, harmonic_num: int = 0, sine_amp: float = 0.1, noise_std: float = 0.003, voiced_threshold: float = 0, flag_for_pulse: bool = False)
Bases: Module
- forward(f0: Tensor) -> tuple[Tensor, Tensor, Tensor]
sine_tensor, uv = forward(f0)
input F0: tensor(batchsize=1, length, dim=1); F0 for unvoiced steps should be 0
output sine_tensor: tensor(batchsize=1, length, dim)
output uv: tensor(batchsize=1, length, 1)
The signature declares a third returned tensor, which this description does not name.
- training: bool
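As a rough illustration of what the SineGen parameters control, the sketch below generates a sine excitation for a single utterance in plain Python. It is not this package's implementation: the function name sine_source, the seed argument, the choice of harmonic_num + 1 output dimensions, and the sine_amp / 3 noise level on unvoiced samples are assumptions made for the example, and the flag_for_pulse option is ignored.

```python
import math
import random
from typing import List, Sequence, Tuple


def sine_source(
    f0: Sequence[float],
    samp_rate: int,
    harmonic_num: int = 0,
    sine_amp: float = 0.1,
    noise_std: float = 0.003,
    voiced_threshold: float = 0.0,
    seed: int = 0,
) -> Tuple[List[List[float]], List[float]]:
    """Return (sine, uv) for one utterance.

    f0:   per-sample F0 in Hz, 0 for unvoiced samples.
    sine: one row per sample with harmonic_num + 1 values (assumed layout).
    uv:   1.0 for voiced samples, 0.0 for unvoiced samples.
    """
    rng = random.Random(seed)
    n_dim = harmonic_num + 1
    phase = [0.0] * n_dim  # running phase in cycles, one entry per harmonic
    sine: List[List[float]] = []
    uv: List[float] = []
    for f in f0:
        voiced = 1.0 if f > voiced_threshold else 0.0
        # Assumed noise rule: small noise when voiced, sine_amp / 3 when unvoiced.
        noise_amp = voiced * noise_std + (1.0 - voiced) * sine_amp / 3.0
        row = []
        for h in range(n_dim):
            # Accumulate phase so the frequency can change from sample to sample.
            phase[h] = (phase[h] + f * (h + 1) / samp_rate) % 1.0
            tone = sine_amp * math.sin(2.0 * math.pi * phase[h])
            row.append(tone * voiced + noise_amp * rng.gauss(0.0, 1.0))
        sine.append(row)
        uv.append(voiced)
    return sine, uv


# 10 unvoiced samples followed by 160 samples at 100 Hz, with two extra harmonics.
sine, uv = sine_source([0.0] * 10 + [100.0] * 160, samp_rate=16000, harmonic_num=2)
```

Accumulating phase, rather than evaluating sin(2π f t) directly, keeps the waveform continuous when F0 varies over time.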
- class neural_source_filter.SourceModuleHnNSF(sine_gen: SineGen, activation: Module = Tanh())
Bases: Module
SourceModule for hn-nsf
SourceModule(sampling_rate, harmonic_num=0, sine_amp=0.1, add_noise_std=0.003, voiced_threshold=0)
sampling_rate: sampling rate in Hz
harmonic_num: number of harmonics above F0 (default: 0)
sine_amp: amplitude of sine source signal (default: 0.1)
add_noise_std: std of additive Gaussian noise (default: 0.003); note that the amplitude of noise in unvoiced regions is decided by sine_amp
voiced_threshold: threshold to set U/V given F0 (default: 0)
Sine_source, noise_source = SourceModuleHnNSF(F0_sampled)
F0_sampled (batchsize, length, 1)
Sine_source (batchsize, length, 1)
noise_source (batchsize, length, 1)
uv (batchsize, length, 1)
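The description above says the module turns a multi-harmonic sine tensor into a single-channel Sine_source plus a noise_source. One way to picture that step is a weighted sum over harmonics followed by the activation (Tanh by default). The sketch below shows this for one utterance in plain Python. The function name merge_harmonics, the explicit weights and bias arguments (standing in for a learned linear layer), and the sine_amp / 3 noise scale are assumptions for illustration, not this package's code.

```python
import math
import random
from typing import List, Sequence, Tuple


def merge_harmonics(
    sine: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float = 0.0,
    sine_amp: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Collapse per-harmonic sine rows into one excitation value per sample.

    Returns (sine_source, noise_source), each with one value per sample.
    """
    rng = random.Random(seed)
    sine_source = []
    for row in sine:
        if len(row) != len(weights):
            raise ValueError("each sine row needs one value per weight")
        # Weighted sum over harmonics, squashed into (-1, 1) by tanh.
        sine_source.append(math.tanh(sum(w * s for w, s in zip(weights, row)) + bias))
    # Assumed noise branch: Gaussian noise whose scale depends only on sine_amp.
    noise_source = [rng.gauss(0.0, 1.0) * sine_amp / 3.0 for _ in sine]
    return sine_source, noise_source


# Two samples with three harmonic channels each, merged with equal weights.
src, noise = merge_harmonics([[0.1, 0.0, -0.1], [0.05, 0.05, 0.05]], weights=[1.0, 1.0, 1.0])
```

In a trained model the weights would be learned, so the network can decide how much each harmonic contributes to the excitation.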