Huffman Dictionary, error with index exceed when i give specific probabilities.

5 views (last 30 days)
When i change my probabilities numbers i receive an error.
Also if i change some of my probabilities im not getting the error.
Index exceeds the number of array elements. Index must not exceed 1.
Error in huffmandict (line 17)
min1 = sorted_p(2);
The probabilties are the frequencies of each english letter.
Input Data:
probabilities=[0.0698 0.0128 0.0238 0.0364 0.1086 ...
0.0190 0.0172 0.0521 0.0595 0.0013 ...
0.0066 0.0344 0.0206 0.0577 0.0642 ...
0.0165 0.0008 0.0512 0.0541 0.0774 ...
0.0236 0.0084 0.0202 0.0013 0.0169 0.0006 0.1453];
alphabet = {'a' 'b' 'c' 'd' 'e' 'f' ...
'g' 'h' 'i' 'j' 'k' 'l' ...
'm' 'n' 'o' 'p' 'q' 'r' ...
's' 't' 'u' 'v' 'w' 'x' 'y' 'z' '-'};
dict = huffmandict(alphabet, probabilities);
My huffman dictionary script
function dict = huffmandict(symbols, prob)
%Probability Length
probability_length = length(prob);
for i = 1:probability_length
code{i} = '';
symbol{i} = i;
end
while(prob ~= 1)
% Sorting probabilities
[~,sorted_p] = sort(prob);
% Indexes to be merged
min = sorted_p(1);
min1 = sorted_p(2);
min_set = symbol{min};
min1_set = symbol{min1};
% Get the sum of the probabilities
new_possibility = prob(min) + prob(min1);
merged_set = [min_set, min1_set];
% Probability updates
symbol(sorted_p(1:2)) = '';
symbol = [symbol merged_set];
prob(sorted_p(1:2)) = '';
prob = [prob new_possibility];
% Assigning 0 and 1 to symbols
for i = 1:length(min_set)
code{min_set(i)} = strcat('1',code{min_set(i)});
end
for i = 1:length(min1_set)
code{min1_set(i)} = strcat('0',code{min1_set(i)});
end
% Constructing the dictionary
dict = cell(probability_length,2);
for i=1:probability_length
dict{i,1} = symbols{i}(1);
dict{i,2} = code{i};
end
end
end

Accepted Answer

David Goodmanson
David Goodmanson on 6 Jan 2022
Edited: David Goodmanson on 6 Jan 2022
Hi Rafael,
sum(probabilities)
ans = 1.0003
SInce the probabilities don't add up to 1, the code is going to have problems. I arbitrarily reduced the last probability by .0003, so now the last line is
0.0236 0.0084 0.0202 0.0013 0.0169 0.0006 0.1450]
Now
sum(probabilities)
ans = 1
and it works. However, you are using a check that reads
while(prob ~= 1)
and depends on the sum equaling 1 exactly. When adding floating point numbers this will only happen by luck. For example, suppose you change the last line to
0.0239 0.0081 0.0202 0.0013 0.0169 0.0006 0.1450]
This has increased the first element by .0003 and decreased the second element by .0003. The sum is still 1, but
sum(probabilities)-1
ans = 2.2204e-16
so the exact 'while' check fails. That check needs to be changed to something like
while abs(prob-1)>tol
where tol is some suitable tolerance such as 1e-10.
  4 Comments
Rafael Nicolaou
Rafael Nicolaou on 11 Jan 2022
Edited: Rafael Nicolaou on 11 Jan 2022
Hello sorry for the delay got tested postive for covid.
After changing the probabilities anything above 1 + 1e-8 is not working.
If i do anything smaller than 1e-9 works perfect.
The problem is my probabilities are standar and i cant change them.
David Goodmanson
David Goodmanson on 11 Jan 2022
Edited: David Goodmanson on 12 Jan 2022
Hi Rafael, If you have a list of probabilities and indend to use every one of them even if they don't add up to exactly 1, you could temporarily rescale them so that they do add up to 1, within something on the order of 10^-16. That doesn't change their relative sizes so the huffman code is unaffected.
probabilitiesnew = probabilities/sum(probabilities)
Then a tolerance of 1e-10 should work.

Sign in to comment.

More Answers (1)

Cris LaPierre
Cris LaPierre on 6 Jan 2022
The error is occuring the 27th time the while loop runs. prob only has a single value at this point, so sorted_p(2) does not exist.
A=5;
% works
A(1)
ans = 5
% Doesn't work
A(2)
Index exceeds the number of array elements. Index must not exceed 1.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!