Why is assignment with RedefinesParen so slow

I want to make a class that wraps around an array, so I tried matlab.mixin.indexing.RedefinesParen to allow interfacing with the object as if it were an array. I'm finding that assigning to the wrapped array is typically 40-80x slower than regular assignment to an array. Here is a simple test class:
classdef TestCustomParen < matlab.mixin.indexing.RedefinesParen
    properties
        X
    end
    methods
        function obj = cat(dim,varargin)
            obj.X = cat(dim,varargin{:});
        end
        function sz = size(obj)
            sz = size(obj.X);
        end
    end
    methods(Static)
        function obj = empty
            obj = TestCustomParen;
        end
    end
    methods(Access=protected)
        function x = parenReference(obj, indexOp)
            x = obj.X(indexOp.Indices{1});
        end
        function obj = parenAssign(obj,indexOp,varargin)
            obj.X(indexOp.Indices{1}) = varargin{1};
        end
        function n = parenListLength(obj,indexOp,ctx)
            n = listLength(obj.X,indexOp(3:end),ctx);
        end
        function obj = parenDelete(obj,indexOp)
            obj.X.(indexOp) = [];
        end
    end
end
I'm not too concerned with what all the implementations are doing; I'm just focusing on parenAssign here.
And the timing script:
x = TestCustomParen;
tic
for ind = 1:1e4
    x(ind) = rand;
end
toc
y = [];
tic
for ind = 1:1e4
    y(ind) = rand;
end
toc
And the result:
Elapsed time is 0.085126 seconds.
Elapsed time is 0.001309 seconds.
(It seems I can't actually run this code in the post)
I also tried a version of this where the arrays are both pre-allocated, and in that case the difference becomes even more dramatic (more than 100x slowdown).
Is this slowdown just due to the overhead of accessing indexOp and opening cells? Is it possible to get this sort of behavior in a performant way? The documentation says overriding subsref is no longer recommended, but would that be faster?

 Accepted Answer

Here is the proper way to time the operations. The difference isn't that big.
x = TestCustomParen;
x.X=rand(500);
y=x.X;
a=rand;
timeit(@()assig(x,a))\timeit(@()assig(y,a)),
ans = 0.8566
timeit(@()Assig(x,a))\timeit(@()Assig(y,a)),
ans = 0.9793
function assig(z,a)
    z(1)=a;
end
function Assig(z,a)
    z(:)=a;
end

12 Comments

Can you elaborate? This does not seem to be just a different timing method, but timing something else altogether. Instead of assigning a random value in a loop to different indices, you're assigning a constant value to a fixed set of indices.
Here is something that looks more like what you have but timing a loop over indices as well. It's not too surprising that assigning to a constant set of indices allows for some optimization, but my use case is random access/assignment to a wrapped array.
n = 1e4;
x = TestCustomParen;
x.X = zeros(1,n);
y = x.X;
timeit(@() assig(x,n)) / timeit(@() assig(y,n))
ans = 48.6528
timeit(@() assig2(x)) / timeit(@() assig2(y))
ans = 1.5216
function assig(z,n)
    for ind = 1:n
        z(ind) = 1;
    end
end
function assig2(z)
    z(1) = 1;
end
Looks like something might actually have been improved in 23b, since this is showing only a ~100x slowdown, while on my 23a laptop I'm seeing ~300x to ~1000x.
Your question seems to be: will parenAssign allow you to achieve the same element-by-element assignment speed in a for-loop as a built-in numeric vector y? The answer seems to be no, probably because you are still repeatedly invoking an M-function, which will never be as fast as JIT-accelerated built-in code. Although, I invite @James Lebak to comment on that.
However, if you vectorize the assignment, the issue mostly goes away. Also, the revised demo below does seem to confirm that parenAssign overloading is significantly faster than subsasgn overloading when the vector data isn't too large.
for n=logspace(0,6,7)
    x = TestCustomParen; %uses overloaded parenAssign
    x.X=rand(1,n);
    obj = myclass(x.X); %uses overloaded subsasgn
    y=x.X;
    I=randperm(n);
    T=timeit(@()assig(y,I));
    n,
    parenTime = timeit(@()assig(x,I))/T,
    subsasgnTime=timeit(@()assig(obj,I))/T,
    disp ' '
end
n = 1
parenTime = 1.8960
subsasgnTime = 5.2336
n = 10
parenTime = 1.5231
subsasgnTime = 4.1442
n = 100
parenTime = 3.7847
subsasgnTime = 10.9212
n = 1000
parenTime = 1.1000
subsasgnTime = 2.6711
n = 10000
parenTime = 1.1730
subsasgnTime = 1.7853
n = 100000
parenTime = 1.0259
subsasgnTime = 1.1305
n = 1000000
parenTime = 1.0050
subsasgnTime = 1.0131
function assig(z,I)
    z(I) = 1;
end
Thanks for checking the subsasgn method for me.
I wouldn't expect to be able to get the same speed as using built in array assignment, I would be happy if it was just a few times slower, but being 50x slower makes it unusable for my application.
Maybe they'll come up with a lighter-weight interface in the future to support custom reference/assign methods without needing varargin, two cell references, and an additional dot reference. For example, I can't just do the following (it errors with "Unable to use a value of type matlab.indexing.IndexingOperation as an index."):
function obj = parenAssign(obj,indexOp,varargin)
    obj.X(indexOp) = varargin{1};
end
Even though an IndexingOperation can apparently be used as an index of the object, as is done here. So maybe they're working on it.
I would be happy if it was just a few times slower,
It is, when you vectorize.
but being 50x slower makes it unusable for my application.
Well, it seems like you can avoid that in R2023b if you access obj.X directly, as in the test below. Note that this doesn't call your parenAssign method at all. The test also shows how much slower it is to do the same thing with subsasgn overloading.
n=1e4;
x = TestCustomParen;
x.X=rand(1,n);
y=x.X;
obj = myclass(x.X); %uses overloaded subsasgn
T=timeit(@()assigNum(y));
parenTime=timeit(@()assig(x))/T
parenTime = 6.0145
subsasgnTime = timeit(@()assig(obj))/T,
subsasgnTime = 1.1246e+04
function assig(z)
    for ind = 1:numel(z.X)
        z.X(ind) = 1;
    end
end
function assigNum(z)
    for ind = 1:numel(z)
        z(ind) = 1;
    end
end
As Matt says, overloaded indexing of any kind is always going to be slower than built-in indexing. RedefinesParen is designed to be faster than subsasgn, but it has to call a MATLAB method which built-in indexing doesn't have to do.
One thing you could try is to use indexing operation forwarding instead of using the elements directly. Instead of writing
x = obj.X(indexOp.Indices{1});
It should be faster to execute
x = obj.X.(indexOp(1));
That forwards whatever indexing is in the first element of the indexing operation to X. In this case that'll do paren-indexing on X.
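Applied to the class from the question, both protected methods could use this forwarding syntax. This is a sketch of just those two methods (they would replace the originals inside the classdef above); it assumes, as the forwarding syntax implies, that indexOp(1) carries the full paren-indexing operation, so it also handles multi-dimensional subscripts that the original indexOp.Indices{1} version did not:

```matlab
% Sketch: parenReference/parenAssign rewritten with indexing-operation
% forwarding. obj.X.(indexOp(1)) forwards the original paren-indexing
% (all subscripts, any number of dimensions) straight to the wrapped
% array, instead of unpacking indexOp.Indices{1} manually.
function x = parenReference(obj, indexOp)
    x = obj.X.(indexOp(1));
end

function obj = parenAssign(obj, indexOp, varargin)
    obj.X.(indexOp(1)) = varargin{1};
end
```

As the comments later in the thread show, whether this is faster than the manual Indices{1} form depends on the release; the semantics should be the same for the simple one-subscript case.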
Another note is that I don't advise returning x directly like you're doing in your example, because then the parenReference method is returning a different class than the class it was called on. This isn't prohibited, but it is usually better to return another instance of the same class.
The documentation for forwarding is here:
And you can see the use of forwarding in the map class example here:
One thing you could try is to use indexing operation forwarding instead of using the elements directly.
That does seem to help substantially in R2023b.
n=1e4;
x = TestCustomParen; %uses overloaded parenAssign
x.X=ones(1,n);
obj = myclass(x.X); %uses overloaded subsasgn
y=x.X;
v=num2cell(rand(1,n));
T=timeit(@()assig(y,v));
parenTime = timeit(@()assig(x,v))/T,
parenTime = 4.6619
subsasgnTime=timeit(@()assig(obj,v))/T,
subsasgnTime = 224.2956
function assig(z,v)
    for i=1:numel(v)
        z(i) = v{i};
    end
end
Thanks for the references, I've only skimmed them but I'll look in more detail. I don't immediately see a better way to do the parenAssign method with this forwarding. The parenAssign example does
[obj.(indexOp(2:end))] = varargin{:};
which I did not quite understand. Does it only work when the value being assigned is of the same class as obj? If you see how the example in my OP can be modified to run faster, please let me know. I feel like it's a pretty simple use case: I want a wrapper for an array that can be operated on like an array but does some other stuff under the hood. Like:
wrappedArray=TestCustomParen;
>> wrappedArray(4)=1;
>> wrappedArray(4)
ans =
1
It would not be the end of the world if I had to do something like this, if it were much faster:
wrappedArray=TestCustomParen;
>> wrappedArray(4)=TestCustomParen(1); % Assign a wrapped scalar
>> wrappedArray(4)
ans =
1
"it is usually better to return another instance of the same class"
I'm a little confused by this statement though, would you also return another instance of the wrapper class that just wraps a scalar double? Like:
wrappedArray=TestCustomParen;
>> wrappedArray(4)=TestCustomParen(1); % Assign a wrapped scalar
>> wrappedArray(4)
ans =
TestCustomParen with properties:
X: 1
In fact, the parenReference method in that mapping class example seems to just return map values, whatever their type, so maybe that's not a big deal.
If you see how the example in my OP can be modified to run faster please let me know.
I have already done so in my last comment.
Nice, so I can change
obj.X(indexOp(1).Indices{1}) = varargin{1};
to
obj.X.(indexOp(1)) = varargin{1};
which certainly looks nicer, but I'm not seeing a noticeable difference in performance. Is this something that might have been improved in 23b?
Matt J on 2 Oct 2023 (edited):
Is this something that might have been improved in 23b?
Probably. I don't see nearly as much speed-up on my local Matlab installation (R2021b).
Cool, I'll hit accept on the answer. Thanks both for the info. @James Lebak, I'd be grateful if you could indicate whether there have been any significant performance improvements to this stuff in 23b. The above tests seem to indicate that going from 23a to 23b and modifying my code from
obj.X(indexOp(1).Indices{1}) = varargin{1};
to
obj.X.(indexOp(1)) = varargin{1};
results in a more than 20x speedup.
In 23b we made significant improvements to dot-indexing performance. On my machine I see your original code running about 2x faster in 23b versus 23a, and my suspicion is that those improvements are probably the cause.

Release: R2023a
Asked by AB on 2 Oct 2023; last comment on 3 Oct 2023