i m supposed to implement testing code for speech recognition..n getting error in line 154 as matrix dimensions must agree and 102 as index exceed matrix dimensions.....

1 view (last 30 days)
clc;
clear all;
close all;
THRESHOLD=0.7;
Fs = 10000;
fprintf('say a sentence immediately after hitting enter: ');
input('');
y= wavrecord(1 * 10000, 10000, 'double'); % Record and store the uttered speech
t=(0:(1*10000)-1)*1/(1*10000);
subplot(5,1,1);
plot(y);
r=fft(y);
d=abs(r);
subplot(5,1,2);
plot(d);
z=floor(Fs/100);
w=floor(Fs/5);%according to formula, 1600 sample needed for 8 khz
%----------
%calculation of mean and std
h=[];
for i=1:w
h=[h y(i)];
end
meanVal=mean(h);
sDev=std(h);
%----------
%identify voiced or not for each value
for i=1:length(y)
if(abs(y(i)-meanVal)/sDev > THRESHOLD)
voiced(i)=1;
else
voiced(i)=0;
end
end
% identify voiced or not for each frame
%discard insufficient samples of last frame
usefulSamples=length(y)-mod(length(y),z);
frameCount=usefulSamples/z;
voicedFrameCount=0;
for i=1:frameCount
cVoiced=0;
cUnVoiced=0;
for j=i*z-z+1:1:(i*z)
if(voiced(j)==1)
cVoiced=(cVoiced+1);
else
cUnVoiced=cUnVoiced+1;
end
end
%mark frame for voiced/unvoiced
if(cVoiced>cUnVoiced)
voicedFrameCount=voicedFrameCount+1;
voicedUnvoiced(i)=1;
else
voicedUnvoiced(i)=0;
end
end
k=[];
%-----
for i=1:frameCount
if(voicedUnvoiced(i)==1)
for j=i*z-z+1:1:(i*z)
k= [k y(j)];
end
end
end
%---display plot and play both sounds
subplot(5,1,3);
plot(k);
g=fft(k);
a=hamming(4000);% Hamming window to smooth the speech signal
b= [a ;zeros(6000,1)];
f = (1:10000);
mel(f) = 2595 * log(1 + f / 700); % Linear to Mel frequency scale conversion
tri = triang(100);
win1 = [tri ; zeros(9900,1)]; % Defining overlapping triangular windows for
win2 = [zeros(50,1) ; tri ; zeros(9850,1)]; % frequency domain analysis
win3 = [zeros(100,1) ; tri ; zeros(9800,1)];
win4 = [zeros(150,1) ; tri ; zeros(9750,1)];
win5 = [zeros(200,1) ; tri ; zeros(9700,1)];
win6 = [zeros(250,1) ; tri ; zeros(9650,1)];
win7 = [zeros(300,1) ; tri ; zeros(9600,1)];
win8 = [zeros(350,1) ; tri ; zeros(9550,1)];
win9 = [zeros(400,1) ; tri ; zeros(9500,1)];
win10 = [zeros(450,1) ; tri ; zeros(9450,1)];
win11 = [zeros(500,1) ; tri ; zeros(9400,1)];
win12 = [zeros(550,1) ; tri ; zeros(9350,1)];
win13 = [zeros(600,1) ; tri ; zeros(9300,1)];
win14 = [zeros(650,1) ; tri ; zeros(9250,1)];
win15 = [zeros(700,1) ; tri ; zeros(9200,1)];
win16 = [zeros(750,1) ; tri ; zeros(9150,1)];
win17 = [zeros(800,1) ; tri ; zeros(9100,1)];
win18 = [zeros(850,1) ; tri ; zeros(9050,1)];
win19 = [zeros(900,1) ; tri ; zeros(9000,1)];
win20 = [zeros(950,1) ; tri ; zeros(8950,1)];
ny = abs(g(floor(mel(f)))); % Mel warping
ny = ny / max(ny);
ny1 = ny * win1;
ny2 = ny * win2;
ny3 = ny * win3;
ny4 = ny * win4;
ny5 = ny * win5;
ny6 = ny * win6;
ny7 = ny * win7;
ny8 = ny * win8;
ny9 = ny * win9;
ny10 = ny * win10;
ny11 = ny * win11;
ny12 = ny *win12;
ny13 = ny * win13;
ny14 = ny * win14;
ny15 = ny * win15;
ny16 = ny * win16;
ny17 = ny * win17;
ny18 = ny * win18;
ny19 = ny * win19;
ny20 = ny * win20;
sy1 = sum(ny1 ^ 2); % Determine the energy of the signal within each window
sy2 = sum(ny2 ^ 2); % by summing square of the magnitude of the spectrum
sy3 = sum(ny3 ^ 2);
sy4 = sum(ny4 ^ 2);
sy5 = sum(ny5 ^ 2);
sy6 = sum(ny6 ^ 2);
sy7 = sum(ny7 ^ 2);
sy8 = sum(ny8 ^ 2);
sy9 = sum(ny9 ^ 2);
sy10 = sum(ny10 ^ 2);
sy11 = sum(ny11 ^ 2);
sy12 = sum(ny12 ^ 2);
sy13 = sum(ny13 ^ 2);
sy14 = sum(ny14 ^ 2);
sy15 = sum(ny15 ^ 2);
sy16 = sum(ny16 ^ 2);
sy17 = sum(ny17 ^ 2);
sy18 = sum(ny18 ^ 2);
sy19 = sum(ny19 ^ 2);
sy20 = sum(ny20 ^ 2);
sy = [sy1; sy2; sy3; sy4; sy5; sy6; sy7; sy8; sy9; sy10; sy11; sy12; sy13; sy14;
sy15; sy16; sy17; sy18; sy19; sy20];
by = log(sy);
dy = dct(by); % Determine DCT of Log of the spectrum energies
subplot(5,1,4);
plot(dy);
fid = fopen('sample.dat','r');
dx = fread(fid,20, 'real*8'); % Obtain the feature vector for the password
fclose(fid); % evaluated in the training phase
dx=dx';
MSE=(sum((dx - dy) ^ 2)) / 20; % Determine the Mean squared error
if MSE<1
fprintf('\n\nACCESS GRANTED\n\n');
Grant=wavread('Grant.wav'); % “Access Granted”
wavplay(Grant);
else
fprintf('\n\nACCESS DENIED\n\n');
Deny=wavread('Deny.wav'); % “Access Denied” is output in case of a failure
wavplay(Deny);
end;

Answers (1)

Walter Roberson
Walter Roberson on 21 Feb 2012
Without calculating it through, I see no inherent reason to expect that
floor(2595 * log(1 + (1:10000) / 700))
will always be in the range 1:length(g) as g = fft(k) and k is built up conditionally based upon an unusual windowing function applied for "unvoiced" signals.

Categories

Find more on Simulation, Tuning, and Visualization in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!