precision of double variables
Show older comments
The code below outputs 1 and 1.1 (and not 0.9):
for ii = -1.1:0.1:1.1
if ii >= 0.9
disp(ii)
end
end
If the first line is replaced as the following:
for ii = -1.2:0.1:1.2
the output becomes 0.9, 1, 1.1 and 1.2.
The difference betwwen the actual value of the variable ii and the expected value is epsilon=2.2204e-16.
Does anyone have any idea why replacing 1.1 by 1.2 causes the behavior?
Accepted Answer
More Answers (2)
David Goodmanson
on 17 Jun 2022
Edited: David Goodmanson
on 17 Jun 2022
Hi Arda,
you pretty much answered the question by mentioning precision. Most floating point numbers are not represented exactly in memory. That includes most rational numbers such as .9. So you can't always expect the floating point values in the for loop to exactly agree with a value established in some other way. In the following example, you would at least expect the two would-be values of .9 to agree with each other:
% save the values of ii using concatenation (not recommended but the list is small here)
iivals = [];
for ii = -1.1:0.1:1.1
iivals = [iivals ii];
end
iivals(21) - .9 % 21st value should be .9
ans = -1.1102e-16
a = (-1.1:0.1:1.1);
a(21) - .9
ans = 1.1102e-16
but they don't, and neither equals Matlab's best approximation to .9. The for loop value of ii is too small, so .9 doesn't get displayed. (Doing the 1.2 case will show that the value of ii is just over the line so .9 does get displayed).
To see how these values are stored in memory (Matlab uses the IEEE754 standard) you can use format hex:
format hex
iivals(21)
a(21)
.9
ans = 3feccccccccccccc % too small
ans = 3fecccccccccccce % too large
ans = 3feccccccccccccd
The fix is pretty simple: don't use floating point values in an indexing situation. For example, something like
for ii = -11:11
x = ii/10 % use this value to calculate stuff
if ii >= 9 % use ii for equality-type checks
end
avoids a lot of problems.
20 Comments
Les Beckham
on 17 Jun 2022
Good answer.
I find it odd that, in the context of a for loop, the number is slightly smaller than 0.9 but, in the context of a vector assignment (your a vector above), the number is slightly bigger than 0.9. Strange!
Can anyone explain that?
Stephen23
on 18 Jun 2022
"Can anyone explain that?"
- The vector is created explicitly at once, probably by multiplying the step by the indices or similar.
- The FOR loop is optimized so that it does not create the entire vector in memory, but generates the next value on each iteration, i.e. uses repeated addition.
But then what explains this behavior:
iivals = [];
for ii = [-1.1:0.1:1.1] % added brackets
iivals = [iivals ii];
end
a = [-1.1:0.1:1.1];
isequal(a,iivals)
For this use case, the values input to the for statement should be a valarray, which to my understanding would be evaluated prior to execution of the for loop. Yet its evaluation is different for the for loop and assignment?
Walter Roberson
on 21 Jun 2022
Interesting, Paul.
You are right that we would normally expect that syntax to precalculate the array and then iterate over the values. However, it is already known that the right hand side of the = is analyzed and : expressions without [] are converted into iteration (you can prove this by using an upper bound of inf)
We can speculate that possibly the [] around the colon expression is noticed and converted to iteration. There might be other cases. For
for i = (-1.1:0.1:1.1)
would be a good candidate for conversion.
Paul
on 21 Jun 2022
That comment only explains the behavior if we a priori assume that the parser treats the square brackets as superfluous. Insofar as that behavior is undocumented, AFAICT, it might be considered a surprise, especially when the use of square brackets here could actually reflect a specific intent, i.e., evaluate values exactly as if creating the same for assignment, for which there could be reasons to do so.
Stephen23
on 21 Jun 2022
"only explains the behavior if we a priori assume that the parser treats the square brackets as superfluous"
Not at all: observing the behavior is sufficient to demonstrate that different algorithms are at play. Anything supplementary to that is merely amusing speculation (which will have zero effect on my code).
"...especially when the use of square brackets here could actually reflect a specific intent, i.e., evaluate values exactly as if creating the same for assignment, for which there could be reasons to do so."
Well, if this is turning into a discussion of what is documented or not, then your "specific intent" is definitely not documented anywhere, you just made that up. The actual purposes of square brackets is documented here:
Nope, there is absolutely nothing there about "evaluate values exactly as if creating the same for assignment". In your example nothing is being concatenated together (i.e. the documented purpose of square brackets), therefore the syntax parser and JIT engine are completely within their intended purposes to optimize such superfluous things away. I would certainly hope that they do.
So far no big surprises.
Paul
on 21 Jun 2022
In accordance with the doc pages, I coded the for statement that way with intent, namely the first use of square brackets on the linked doc page, which is "array construction," and the third use of for, here. In other words, the intent of the code, as documented, is to construct the valArray input to the for loop and have the index variable loop over the columns. Yet no array was constructed. So the parser ignored the brackets and changed the implementation of the for loop from the third use, which was the intent, to the second, which wasn't, which yields a different result than expected based on the intent.
Experimentally, for with =(colon) or =[colon] or =colon all compare the same, but storing into an array beforehand is different.
iivals0 = [];
for ii = (-1.1:0.1:1.1)
iivals0 = [iivals0 ii];
end
iivals1 = [];
for ii = [-1.1:0.1:1.1] % added brackets
iivals1 = [iivals1 ii];
end
iivals2 = [];
for ii = -1.1:0.1:1.1 % added brackets
iivals2 = [iivals2 ii];
end
avalues = -1.1:0.1:1.1; %precompute 1
iivals3 = [];
for ii = avalues
iivals3 = [iivals3 ii];
end
avalues = [-1.1:0.1:1.1]; %precompute 2
iivals4 = [];
for ii = avalues
iivals4 = [iivals4 ii];
end
a = [-1.1:0.1:1.1];
parts = {a, iivals0, iivals1, iivals2, iivals3, iivals4};
for J = 1 : length(parts)
for K = 1 : length(parts)
compare(J, K) = isequal(parts{J}, parts{K});
end
end
N = {'[:] bulk', '(:)', '[:]', ': loop', 'array', '[array]'};
compares = array2table(compare, 'VariableNames', N, 'RowNames', N)
Paul
on 22 Jun 2022
Are precompute1 and precompute2 any different? Should the latter be for ii = [avalues] ?
I'm not surprised (anymore) that =(colon), =[:] and =colon all evaluate the same, but think that =[colon] and and =(colon) should evalutate the same as precompute1.
Walter Roberson
on 22 Jun 2022
precompute 1 vs precompute 2 is testing the hypothesis that perhaps when you have variable = colon operator as an assignment statement, that perhaps the result is different than if you had variable = [colon operator]... the hypothesis that perhaps colon operator works differently for a syntactical direct colon assignment than if it is inside []
Testing the same syntaxes in a for loop is a slightly different question.
I agree that I would expect for=[colon] to work differently than for=colon.
However, by this point my expectation of for=(colon) is the same as if the () were not present. The syntax of for permits extra (). For example
for (variable=colon)
is treated the same as if the () were not there.
Paul
on 22 Jun 2022
Ah, now I see the difference between precompute1 and 2. If those two yielded didfferent results, it would be extremely troubling.
Now I'm surprised that
for (variable = colon)
doesn't throw an error. Maybe I shouldn't be suprised about anything anymore.
"In accordance with the doc pages, I coded the for statement that way with intent, namely the first use of square brackets on the linked doc page, which is "array construction,""
No, the colon operator already constructs an array. Your syntax cannot construct an array which already exists.
Whether that array values explicitly exist in memory or not is of academic interest only: the array has been constructed by the colon operator insomuch as MATLAB is able to say "here is an array which I will iterate over". Whether it stores all of the values at once or generates them on the fly is of absolutely no relevance to me, the user. The MATLAB documentation does not (never has, never will) tell us such details, in exactly the same way that they will never tell us in how a million other features are internally implemented and can be optimized and changed over different versions.
"Yet no array was constructed. So the parser ignored the brackets ..."
No. Your and Walter Roberson's incorrect understanding of "array construction" is not supported by the documentation. Nowhere in the MATLAB documentation does it state that using square brackets et al will explicitly create all elements in memory. "Array construction" is just concatenation by another name, which, just to emphasize again, your various example does not do and hence is optimized away. As they should be.
"Experimentally, for with =(colon) or =[colon] or =colon all compare the same, but storing into an array beforehand is different."
Thank you for taking the time to confirm what I expect and wrote.
So far absolutely no surprises.
"I agree that I would expect for=[colon] to work differently than for=colon."
Yet that is entirely inconsistent with the meaning of square brackets (concatenation), the optimization of FOR loops, the reality of all of your own tests, and the MATLAB documentation.
Paul
on 22 Jun 2022
No, the colon operator already constructs an array
Apparently the colon operator is not constructing an array when used with for. In fact, the Special Characters documentaion calls out "Vector creation" and "For loop iteration" as different uses of colon. Colon certainly is not constructing an array with for in the same way that colon contsructs an array on the rhs of an assignment statement. I agree that whether or not the array is preallocated is not really relevant. What is relevant is that the exact same [colon] expression returns different results, or perhaps more accurately has a different behavior, when used in different contexts.
"Array construction" is just concatenation by another name,
Why does the Special Characters documentation show "Array construction" and "Array concatenation" as two different uses of Square Bracket if they are really the same thing?
"Apparently the colon operator is not constructing an array when used with for."
It seems that we disagree on the definition of "construct" an array, which for me is something like "defines an array in such as way that its class, size, and element values can be accessed (e.g. by indexing)." Note that this definition does not require the elements to explicitly exist in the computer's memory (it does even refer to memory), but the array certainly has been constructed because its elements can be referred to by the user. Whether MATLAB stores them all in a long list in the computer's memory or magically makes them appear when I want them is to me, the user, a completely irrelevant implementation detail. The array has been constructed: it has a class, a size, and elements that can be retrieved.
This definition is also consistent with the fact that no computer has ever stored an array, just representations of them. Arrays are (much like triangles and numbers) abstract concepts that can only be written/stored using a representation of them. But there is nothing inherently special about one representation or another. Consider a sparse array: once the user has defined it, would you state that it has been "constructed" ? Most of the sparse array's elements are not in memory, yet we can get its class, size, and value of any element (including the zeros, which are not stored explicitly). So... is it constructed? Extend that argument to its logical conclusion: is there a canonical form for storing arrays in computer memory? No, there are infinite possible formats that we could store arrays in memory. Do these formats require storing elements explicitly? No, an infinite subset of those infinite formats could implicitly store any number of their values in an infinite number of ways (sparse are one example of this). Is such an array "constructed"? Yes, when we can access its class, size, and element values.
In contrast if we follow your argument to its logical conclusion we end up with the situation that only some of those infinite array formats meet your (so far unwritten) definition of "constructed" whereas other do not, even though the user can equally access the class, size, and element values of all of them (and sees no practical difference between them). As a simple MATLAB user, I see little added value in such a definition.
"Why does the Special Characters documentation show "Array construction" and "Array concatenation" as two different uses of Square Bracket if they are really the same thing?"
In one word: users. Users are not going to search for or understand "concatenate lots of scalar arrays" when they want to create a vector with some numbers. Would you? I certainly wouldn't have, when I was first learning MATLAB. It is reasonable that users will search for / understand "array construction", but in practice it is simply the concatenation of scalar arrays (all numbers in MATLAB are scalar arrays, there is no special scalar class). From a functional perspective there is absolutely no difference between "array construction" and concatenation of scalar arrays.
Note that if you claim these terms are fundamentally different, then in what precise way are these four operations actually different?:
[1,ones(1,3)]
[1,ones(1,2)]
[1,ones(1,1)]
[1,1]
Remember, scalars are not a special class, they are perfectly ordinary 1x1(x1x1...) arrays.
So, still absolutely no surprises here.
Walter Roberson
on 22 Jun 2022
There are three documented syntaxes for "for"
- initVal:endVal — Increment the index variable from initVal to endVal by 1, and repeat execution of statements until index is greater than endVal.
- initVal:step:endVal — Increment index by the value step on each iteration, or decrements index when step is negative.
- valArray — Create a column vector, index, from subsequent columns of array valArray on each iteration. For example, on the first iteration, index = valArray(:,1). The loop executes a maximum of n times, where n is the number of columns of valArray, given by numel(valArray(1,:)). The input valArray can be of any MATLAB® data type, including a character vector, cell array, or struct.
The forms with the colon operator directly after the = are defined in terms of incrementing a variable, rather than in terms of evaluating the colon operator and using the resulting array as the value array.
When you use
for ii = [-1.1:0.1:1.1]
then that does not match either of the first two syntaxes, and so must be a match for the third syntax -- which involves first evaluating the right hand side and then indexing into the result.
Therefore, according to the documentation
for ii = [-1.1:0.1:1.1]
should be the same as
iivals = [-1.1:0.1:1.1];
for ii = iivals
But, experimentally, it is not the same. Experimentally for ii = [-1.1:0.1:1.1] is the same as for ii = -1.1:0.1:1.1 which uses the increment strategy.
There is no documented special case for ii = [-1.1:0.1:1.1] as being the same as for ii = -1.1:0.1:1.1
@Paul is completely right to expect that the for ii = [-1.1:0.1:1.1] case will evaluate [-1.1:0.1:1.1] into a temporary internal location and then iterate over its columns, as that is what is documented as happening except for the initVal:endVal and initVal:step:endVal cases.
"then that does not match either of the first two syntaxes, and so must be a match for the third syntax... "
Irrelevant. Either:
- the MATLAB engine removes your superfluous fluff (unlike what you state, square brackets et al are not documented to actually create an explicit array in memory), in which case the 1st and 2nd syntaxes are perfectly relevant.
- the definition I already gave (which you accepted by not disagreeing with it or providing your own definition) does not require the explicit listing of all values in memory. So the array is constructed, just as I defined, and MATLAB iterates over it, following the 3rd syntax.
So every syntax is consistent with my explanation already given.
"@Paul is completely right to expect that the for ii = [-1.1:0.1:1.1] case will evaluate [-1.1:0.1:1.1] into a temporary internal location"
Where is that supported by the documentation? Answer: nowhere. Nowere in the documentation does it talk about "temporary internal locations" or anything about the implementation. Never has, never will.
Question: why are you trying to convince me of something that disagrees with your own test results ? To be honest I am struggling to follow your argument:
- provide test evidence of A
- claim ~A
It is certainly an innovative strategy.
Paul
on 23 Jun 2022
Hi @Stephen23
First, I think we all understand how for operates in practice for the case under discussion (for ii = [colon]). We all understand @Walter Roberson's test results, and nobody is trying to use those results to demonstrate that for works any differently than it actually does.
Second, speaking for myself, my mental model of how I think for should work in this case is that the parser sees I'm asking for the third documented syntax, evaluates the valarray on RHS of the = sign, and then loops index over the columns of the valarray. It is appealing, at least to me, to think of this process as actually constructing an array in a termporary internal location to understand the algorithmic flow. But I certainly realize that such construction doesn't have to happen and I don't care if it does, as long as the end result is the same as if such an array was actually constructed.
What I do care about is if Matlab is self-consistent in its syntax and with the documentation. From a user perspective, I don't think it is (surely, for could be improved to mention this issue). Others may disagree.
Are you aware of any cases besides for where [colon] (or (colon) for that matter) in an expression is not functionally equivalent to how [colon] is evaluated on the RHS of an assignment statement?
Hello Paul,
Good points, thank you for your comment.
I think what you term the "mental model" is highly relevant to this topic. Really all code (and by extension, all explanations/mental models of code) are just abstractions of what actually happens deep inside the computer. Thus we must consider this paradigm:
Levels of abstraction suit different purposes (and also incite endless flame wars about whose language rulz)... but none the less, every model must necessarily simplify, which means they are wrong.. But some models are useful, because they make it easier for us understand or do something (example: any map).
MATLAB suits a particular type of work, it is not a low-level language. Not only that, TMW have made it quite clear that they will not give details of the low-level implementation or optimization, with (IMHO) good reason (for the kind of language it is):
All kinds of things can and regularly do change in the optimization. As you wrote, "...I don't care if it does, as long as the end result is the same..."
So... what is the meaning of "same" ?
Over the years I have seen minor tweaks to various functions which changed the output bits just slightly, i.e. the usual binary floating point noise. And then some users would complain, because algorithm B delivered different bits to algorithm A, which they previously used and loved. Did TMW do something wrong? Are the outputs the "same" ? Inasmuch as both algorithms are numerically equally good (or equally bad), returning the same values but slightly different numerically meaningless floating point noise, ... are the values the "same"? From a scientific point of view, they are indistinguishable (error propogation is not just related to numeric computing, but is a well-defined topic that is relevant to all measurements and calculations). Or, in George Box's terms, the difference is not "importantly wrong".
That is my mental model of MATLAB: as a user, my code should not rely on a specific algorithm that delivers specific bits (because these are numerically meaningless noise quadrillions of times smaller than the precision of my data) that TMW have already explicitly stated that they will not document. I find that model useful.
"Are you aware of any cases besides for where [colon] (or (colon) for that matter) in an expression is not functionally equivalent to how [colon] is evaluated on the RHS of an assignment statement?"
They are already functionally equivalent.
I guess you are actually asking about the implentation: I can't think of any other example... but then according to my mental model there is also no reason to assume that RHS assignment necessarily requires an explicit array in memory either: perhaps this year's release does, but that is no reason to assume that next year's relase will not do something else. As long as they are functionally the same, I don't care.
Walter Roberson
on 24 Jun 2022
The third form of for is defined as doing indexing. It would be against the documentation to use any kind of incremental computation for the third form.
Welcome to the world of numerics with limited precision.
These are the expected effects. You observe the value mentioned in the frequently asked question:
If you start at -1.1, you see the deviation at 1.0:
for ii = -1.1:0.1:1.1
if ii >= 1.0
disp(ii)
end
end
If you start at -1.2 you see it at 0.9, so this is simply shifted by 0.1 .
Categories
Find more on Creating and Concatenating Matrices in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!