Speeding up the manipulation of large matrices

Hello there,
I am trying to manipulate a matrix with over 30 million rows (specifically, delete rows that meet a certain condition and sort rows). But the system almost crashes, and disk usage stays at nearly 100% during execution. I know there must be something I can do to make it run faster, but since I'm fairly new to MATLAB, I wonder if anyone knows how I can optimize my code.
I use a one-liner similar to the one "the cyclist" suggested in one of the questions here; it sets rows that meet a certain condition (identical values within a row) to zero.
temp=sort(temp,2);
temp(any(diff(temp,[],2)==0,2),:)=0;
This part of the code takes the most time (usually more than 300 to 400 seconds), and my laptop is completely unusable in the meantime.
Is there a way to run this code faster, ideally without my PC freezing completely, or is there another way of writing it?
Thank you.
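For readers following along, here is a minimal sketch of what the two lines above do, run on a tiny made-up matrix (the values are purely illustrative and not from the real data):

```matlab
% Small stand-in for the large matrix; values are invented for illustration
temp = [3 1 2;
        4 4 1;
        5 2 5;
        9 8 7];

temp = sort(temp, 2);                      % sort each row in ascending order
hasDup = any(diff(temp, [], 2) == 0, 2);   % true where a sorted row has two equal neighbours
temp(hasDup, :) = 0;                       % set every such row to zero
disp(temp)
```

Sorting each row first puts equal values next to each other, so a zero in the row-wise difference marks a row that contains a repeated value.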
  6 Comments
per isakson on 16 Mar 2019
Edited: per isakson on 17 Mar 2019
Make sure there isn't a spurious function named sort that shadows MATLAB's sort. Running
which sort -all
will produce a long list:
>> which sort -all
built-in (C:\Program Files\MATLAB\R2018b\toolbox\matlab\datafun\sort)
built-in (C:\Program Files\MATLAB\R2018b\toolbox\matlab\datafun\@cell\sort) % Shadowed cell method
built-in (C:\Program Files\MATLAB\R2018b\toolbox\matlab\datafun\@char\sort) % Shadowed char method
...
Yes, memory might be a problem on your system. (I have 32 GB of RAM.)
>> 30e6*24*8/1e9
ans =
5.7600
Swapping might explain your problem.
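As a rough sketch of where the memory goes, the estimate above can be extended to the intermediates that the two lines from the question create. The 24 columns are taken from the calculation above. The breakdown assumes that temp is stored as double and that diff and the comparison each produce a full intermediate array.

```matlab
nRows = 30e6;                                % row count from the question
nCols = 24;                                  % column count used in the estimate above
matrixGB  = nRows * nCols * 8 / 1e9;         % temp as double: 8 bytes per element
diffGB    = nRows * (nCols - 1) * 8 / 1e9;   % diff(temp,[],2): one column fewer, same class
logicalGB = nRows * (nCols - 1) * 1 / 1e9;   % result of == 0 is logical: 1 byte per element
fprintf('matrix %.2f GB, diff %.2f GB, mask %.2f GB\n', matrixGB, diffGB, logicalGB);
```

Under these assumptions the intermediates are about as large as the matrix itself, so a machine that can barely hold temp would likely start swapping while the expression is evaluated.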
Hamed Moghaddam on 16 Mar 2019
Edited: Hamed Moghaddam on 16 Mar 2019
Is there a way I can alter my code to make it less RAM-intensive? Maybe split the matrix into several pieces and execute the same function on each?
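A hedged sketch of that idea, limited to the two lines from the question: sort(temp,2) and diff(temp,[],2) treat each row independently, so they can be applied to one block of rows at a time. The block size and the small random stand-in matrix are assumptions for illustration. This only caps the size of the intermediates; it does not shrink temp itself, and it would not carry over to a global sort across rows such as sortrows.

```matlab
temp = randi(5, 1000, 3);    % small stand-in for the real matrix (illustrative data)
blockSize = 256;             % hypothetical block size; tune to the available RAM
nRows = size(temp, 1);

for first = 1:blockSize:nRows
    last = min(first + blockSize - 1, nRows);        % last row of this block
    block = sort(temp(first:last, :), 2);            % row-wise sort of the block only
    block(any(diff(block, [], 2) == 0, 2), :) = 0;   % zero rows with repeated values
    temp(first:last, :) = block;                     % write the block back in place
end
```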


Accepted Answer

per isakson on 17 Mar 2019
Your code, "diff(temp,[],2)==0", made me think that temp holds whole numbers. If so, you could convert temp to an appropriate integer type. See cast (cast variable to a different data type) and intmax (largest value of a specified integer type).
>> z1 = randi( 5, 1e6,3 );
>> z2 = cast( z1, 'uint8' );
>> whos z*
  Name         Size             Bytes  Class     Attributes

  z1      1000000x3          24000000  double
  z2      1000000x3           3000000  uint8
If there is a free memory slot in your computer, you might want to add RAM.
Regarding "split the matrix into several pieces": piecewise sorting with MATLAB is, in my opinion, not an option for you.
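Since integer conversion in MATLAB rounds non-integer values and saturates values outside the target range, it may be worth checking the data before casting. The following is a minimal sketch under the assumption that the values are whole numbers; the variable targetClass and the error message are illustrative choices, not part of the original suggestion.

```matlab
z1 = randi(5, 1e6, 3);                  % same kind of example data as above
targetClass = 'uint8';                  % candidate integer type (assumption)

lo = double(intmin(targetClass));       % smallest value the type can hold
hi = double(intmax(targetClass));       % largest value the type can hold
isWhole = all(z1(:) == round(z1(:)));   % true if every entry is a whole number

if isWhole && min(z1(:)) >= lo && max(z1(:)) <= hi
    z2 = cast(z1, targetClass);         % lossless: every value fits the type
else
    error('Values do not fit in %s without loss.', targetClass);
end
```

If the check fails for uint8, a wider type such as int16 or uint16 can be tried in the same way.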
  1 Comment
Hamed Moghaddam on 17 Mar 2019
Edited: Hamed Moghaddam on 17 Mar 2019
Casting worked just fine. I changed the matrix to int8 and the amount of consumed RAM almost halved.
Thank you so much for your help!
