Precision of numbers problem (I think)

4 views (last 30 days)
Phillip
Phillip on 1 Oct 2021
Commented: Phillip on 1 Oct 2021
I have a matrix with 4 columns. I'd like to do something very simple, which is average all numbers in Column 4 given specific numbers in Column 2 but I end up with NaNs for 2 of the 7 means with the code below.
I have a sample "new_data.mat" attached.
This is the Matlab script I'm trying to get to work (rounding alone has not solved the problem):
clear all; clc; close all
alldir = dir(['SS*']);
P = pwd;
for d = 1:size(alldir,1)
PathName = [P, '\', alldir(d).name, '\Behaviour\'];
cd(PathName)
FileName = dir('ID*_PRIM_beh_all.mat');
new_data = cell2mat(struct2cell(load(FileName.name))); % example in my dropbox linked above
phys_diffs = 0:.15:.90;
for k = 1:7
step = phys_diffs(k);
avgdiffRT(k) = mean(new_data(new_data(:, 2) == step, 4));
end
alldiffRT(d, :) = avgdiffRT;
end
If I use the function "round" and take the loop out, i.e. average each of the 7 steps "manually" it works:
rounded = round(new_data(:, 2), 2);
avgRTphysdiff1 = mean(new_data(rounded == 0, 4));
avgRTphysdiff2 = mean(new_data(rounded == .15, 4));
etc
Does somebody have an idea how to get the loop to work?
  1 Comment
Stephen23
Stephen23 on 1 Oct 2021
Edited: Stephen23 on 1 Oct 2021
"Does somebody have an idea how to get the loop to work?"
Do NOT use ROUND. It is not a robust approach, and should be avoided.
Nor should you compare for exact equivalence of floating point numbers, e.g. using EQ or ISMEMBER.
The simple, robust, recommended approach is to compare the absolute difference against a tolerance:
tol = 1e-5; % pick the tolerance to suit your needs.
abs(A-B)<tol

Sign in to comment.

Answers (1)

Konrad
Konrad on 1 Oct 2021
Hi Phillip,
yes, it has to do with precision.
ismember(phys_diffs,new_data(:,2))
% ans =
% 1×7 logical array
% 1 0 1 1 1 1 0
This shows that the 2nd and the last value in phy_diffs (.15 and .9) do not appear in new_data(:,2). (And the mean of nothing is NaN)
But values very close to these two numbers do appear. So what you could do is to round new_data(:,2), e.g. to two digits:
idx = round(new_data(:, 2),2) == step;
avgdiffRT(k) = mean(new_data(idx, 4));
Regards, Konrad
  3 Comments
John D'Errico
John D'Errico on 1 Oct 2021
Edited: John D'Errico on 1 Oct 2021
Note that rounding a number to 2 significant digits does NOT produce an exact value.
X = 0.15;
Is X exactly 0.15? Of course not. MATLAB CANNOT represent that number, which can be thought of in fractional form as 15/100, or 3/20 in lowest terms.
sprintf('%0.55f',X)
ans = '0.1499999999999999944488848768742172978818416595458984375'
Will rounding to two digits help?
Y = round(X,2)
Y = 0.1500
sprintf('%0.55f',Y)
ans = '0.1499999999999999944488848768742172978818416595458984375'
As you can see, the result is unchanged. This is because no matter how hard you try, MATLAB cannot store many such fractions in their exact form. This happens for the same reason why you cannot represent 2/3 as a decimal. To do so exactly would require storing infinitely many decimal digits, since when represented in decimal form, we have:
2/3 = 0.666666666666666666666666666666666666666...
But why is that a problem with the fraction 3/20? Or even 1/10?
The problem arises because computers use binary storage methods to represent numbers. So if we look at how 3/20 is stored, we would see this expansion
0.0010011001100110011001100110011001100110011001100110011...
In that expansion, the ones represent negative powers of 2.
P = [-3 -6 -7 -10 -11 -14 -15 -18 -19 -22 -23 -26 -27 -30 -31 -34 -35 -38 -39 -42 -43 -46 -47 -50 -51 -54 -55];
format long g
sum(2.^P)
ans =
0.15
sum(2.^P) - X
ans =
0
As you can see, that expansion does repreduce 3/20, as close as MATLAB can come. But to be exactly 3/20, the expansion needs to continue for an infinite number of terms. So 3/20 is a number without a finite binary expansion, just as 2/3 cannot be represented in a finite number of decimal digits. The pattern of 0's and 1's will repeat forever, and since MATLAB can use only a finite number of binary bits in the mantissa (52 of them) it can never represent most such fractions.
That is not to say ALL such fractions. Any (within limits) simple power of 2 will be exactly represented. That is, MATLAB will represent some fractions like 1/8, or 3/16 exactly. But these are either simple powers of 2, or they can be written as a sum of negative powers of 2.
sprintf('%0.55f',3/16)
ans = '0.1875000000000000000000000000000000000000000000000000000'
No problems there, as long as the denominator when written in lowest terms as a power of 2 (again within limits) MATLAB will succeed. For example, 2^-70 = 1/2^70.
sprintf('%0.150f',2^-70)
ans = '0.000000000000000000000847032947254300339068322500679641962051391601562500000000000000000000000000000000000000000000000000000000000000000000000000000000'
And in that number, there are now infinitely many trailing ZEROS in decimal form. In a binary form, we could think of that number like this
0.00000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000...
With only a single binary bit turned on.
Phillip
Phillip on 1 Oct 2021
Thank you very much for taking the time to write such a detailed explanation. Very useful!!!

Sign in to comment.

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Products


Release

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!