Announce

PukiWiki contents have been moved into SONOTS Plugin (20070703)

Project: Source-Filter Separation of Speech Signal (Small)

Table of Contents

Introduction

Deconvolution of speech signal into source (vocal codes, white noise) and filter (oral cavity, coloring, envelope) component.

spSeparationCepstrumDemo.png

First Edition: April 2008. Last Modified: April 2008
Tag: Scientific SoundProcessing Deconvolution Matlab

Source-Filter Separation via Cepstral Method

Cepstrum

Please use codes and refer report for math equations at Pitch.

Source-Filter Separation

% NAME
%   spSeparationCepstrum - Source-Filter Separation via Cepstrum
% SYNOPSIS
%   [source, filter, c, y] = 
%     spSeparationCepstrum(x, fs, ncoef, window, show)
% DESCRIPTION
%   Source-Filter Separation via the Cepstrum. 
%   This deconvolve speech signal into source (vocal codes, white noise)
%   and filter (oral cavity, coloring, envelope) component
% INPUTS
%   x      (vector) of size Nx1 which contains one frame signal
%   fs     (scalar) the sampling rate
%   [ncoef] 
%          (scalar) the number of coefficients of cepstrum to treat 
%           as low-time parts. The default uses
%             ncoef = 2 + fs / 1000;
%           as a rule of thumb. 
%   [window]
%          (string) the window function such as 'rectwin', 'hamming'. 
%           The default is 'rectwin' (or no window). 
%   [show] (boolean) plot or not. The default is 0. 
% OUTPUTS
%   source (vector) of size Nx1 which containts log magnitude (not dB) 
%           frequency response of high-time cepstrum parts that is 
%           source (vocal tract, white noise) component
%   filter (vector) of size Nx1 which constains log magnitude (not dB) 
%           frequency response of low-time cepstrum parts that is 
%           filter (oral cavity, coloring, envelope) component
%   [c]    (vector) of size Nx1 which contains cepstrum coefficients
%   [y]    (vector) of size Nx1 which contains frequency response of input
% AUTHOR
%   Naotoshi Seo, April 2008
function [source, filter, c, y] = spSeparationCepstrum(x, fs, ncoef, window, show)
 %% initialization
 N = length(x);
 x = x(:);
 if ~exist('show', 'var') || isempty(show)
     show = 0;
 end
 if ~exist('window', 'var') || isempty(window)
     window = 'rectwin';
 end
 if ~exist('ncoef', 'var') || isempty(ncoef)
     ncoef = 2 + round(fs / 1000);
 end

 %% get cepstrum
 [c, y] = spCepstrum(x, window);

 %% get high-time spectrum response
 highc = [zeros(ncoef,1); c(ncoef:end)];
 source = fft(highc); % recover high-time parts to log mag freq response

 %% get low-time spectrum response
 lowc = [c(1:ncoef); zeros(length(c)-ncoef,1)];
 filter = fft(lowc); % recover low-time parts to log mag freq response

 %% plot
 if show
     figure;
     ms1=fs/1000; % 1ms. maximum speech Fx at 1000Hz
     ms20=fs/50;  % 20ms. minimum speech Fx at 50Hz
     
     % plot waveform
     t=(0:N-1)/fs;        % times of sampling instants
     subplot(3,2,1);
     plot(t,x);
     xlabel('Time (s)');
     ylabel('Amplitude');
     title('(a) Waveform');
     
     % plot log spectrum
     f = (0:N/2-1)*fs/N; % rest half is mirror for real signal
     db = 20*log(abs(y(1:length(f))));
     subplot(3,2,2);
     plot(f,db);
     title('(b) Spectrum');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     ylim([-100 100]);
     
     % plot cepstrum between 1ms (=1000Hz) and 20ms (=50Hz)
     q=(ms1:ms20)/fs;
     subplot(3,2,3);
     plot(q,abs(c(ms1:ms20)));
     title('(c) Cepstrum');
     xlabel('Quefrency (s)');
     ylabel('Amplitude');

     % plot low-time spectrum response
     subplot(3,2,4);
     plot(f,20*real(filter(1:length(f))));
     title('(d) Low-time Spectrum (Filter)');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     ylim([-100 100]);

     % plot high-time spectrum response
     subplot(3,2,6);
     plot(f,20*real(source(1:length(f))));
     title('(e) High-time Spectrum (Source)');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     ylim([-100 100]);
 end
end
function spSeparationCepstrumDemo
 x1 = wgn(1000, 1, 1); fs = 16000;
 c = spSeparationCepstrum(x1, fs, 20, 'hamming', 'plot');

 [x2, fs] = wavread('bee.wav'); x2 = x2(1001:2000);
 c = spSeparationCepstrum(x2, fs, 20, 'hamming', 'plot');

 x3 = awgn(x2,50);
 c = spSeparationCepstrum(x3, fs, 20, 'hamming', 'plot');
end

spSeparationCepstrumDemo.png

Source-Filter Separation via LPC Method

LPC

Please refer codes and report for math equations at Pitch.

Source-Filter Separation

% NAME
%   spSeparationLpc - Source-Filter Separation via LPC
% SYNOPSIS
%   [source, filter, a P e] = spSeparationLpc(x, fs, ncoef, show)
% DESCRIPTION
%   Source-Filter Separation via LPC. 
%   This deconvolve speech signal into source (vocal codes, white noise)
%   and filter (oral cavity, coloring, envelope) component
% INPUTS
%   x      (vector) of size Nx1 which contains signal
%   fs     (scalar) the sampling frequency
%   [ncoef]
%          (scalar) the number of coefficients of LPC. The default uses
%             ncoef = 2 + fs / 1000;
%           as a rule of thumb. 
%   [show] (boolean) plot or not. The default is 0.
% OUTPUTS
%   a      (vector) of size ncoefx1 which contains LPC coefficients
%   P      (scalar) variance (power) of the prediction error
%   e      (vector) of size Nx1 which contains residual error signals
% AUTHOR
%   Naotoshi Seo, April 2008
% USES
%   lpc.m (Signal Processing toolbox)
function [source, filter, a, P, e] = spSeparationLpc(x, fs, ncoef, show)
 %% Initialization
 if ~exist('show', 'var') || isempty(show)
     show = 0;
 end
 if ~exist('ncoef', 'var')
     ncoef = [];
 end

 %% get Linear prediction
 [a P e] = spLpc(x, fs, ncoef);

 %% get filter (envelope) component
 [filter, f]=freqz(1,a,512,fs);

 %% get source component
 source = fft(e); 
 source = source(1:ceil(length(source)/2)); % rest half is just mirror of real signal

 if show
     figure;
     % plot waveform
     subplot(3,2,1);
     t=(0:length(x)-1)/fs;        % times of sampling instants
     plot(t,x);
     title('Waveform');
     xlabel('Time (s)');
     ylabel('Amplitude');
     xlim([t(1) t(end)]);
     
     % plot poles
     subplot(3,2,3);
     zplane(0, a);
     title('Poles of LPC filter');
     
     % plot frequency reponse of original
     subplot(3,2,2);
     y = fft(x); original = y(1:ceil(length(y)/2));
     N = length(original);
     f = (1:N)/N*(fs/2);
     plot(f, 20*log10(abs(original)+eps));
     title('Spectrum');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     xlim([f(1) f(end)]);

     % plot frequency response of LPC (filter part)
     subplot(3,2,4);
     N = length(filter);
     f = (1:N)/N*(fs/2);
     plot(f,20*log10(abs(filter)+eps));
     title('Spectrum of Filter (LPC) Part');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     xlim([f(1) f(end)]);

     % plot frequency reponse of residual (source part)
     subplot(3,2,6);
     N = length(source);
     f = (1:N)/N*(fs/2);
     plot(f, 20*log10(abs(source)+eps));
     title('Spectrum of Source (Residual) Part');
     xlabel('Frequency (Hz)');
     ylabel('Magnitude (dB)');
     xlim([f(1) f(end)]);
 end
end
function spSeparationLpcDemo
 x1 = wgn(1000, 1, 1); fs = 16000;
 c = spSeparationLpc(x1, fs, 20, 'plot');

 [x2, fs] = wavread('bee.wav'); x2 = x2(1001:2000);
 c = spSeparationLpc(x2, fs, 20, 'plot');

 x3 = awgn(x2,50);
 c = spSeparationLpc(x3, fs, 20, 'plot');
end

spSeparationLpcDemo.png

Reference

  • [1] Xuedong Huang, Alex Acero, Hsiao-wuen hon, "Spoken Language Processing: A guide to Theory, Algorithm, and System Development," Prentice Hall, January 2001