Running Matlab in Parallel on local machine or another suggestion
I have MATLAB code that reads two sets of text files, AAA_1.txt ... AAA_2000.txt and BBB_1.txt ... BBB_2000.txt. The files are very small (100 lines each), but there are thousands of them. The loop reads AAA_1.txt and BBB_1.txt and performs some operations on them. Each step takes a couple of seconds, but because of the large number of files the total runtime is very long. I have a 9th-generation Core i7, 64 GB of memory, and a graphics card with 6 GB. Is there any way I can use more of my memory and GPU to run the program faster? What would be the best way to speed it up?
4 Comments
Voss
on 28 Jan 2022
It's likely that your code can be rewritten to be more efficient. Reading a 100-line file probably shouldn't take 2 seconds; it should take more like 2 milliseconds.
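One way to check where the time goes is to time the read on its own. The sketch below is only an illustration: it writes a made-up stand-in file (9 header lines plus 100 data rows, an assumed layout) to a temporary location and times a single dlmread call with the same arguments as the posted code. With a real input file you would skip the file-creation part.

```matlab
% Create a small stand-in file: 9 header lines followed by 100 data rows.
% The file name and its contents are hypothetical, for illustration only.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for h = 1:9
    fprintf(fid, 'header line %d\n', h);
end
% Each column of the 5x100 matrix becomes one text row: id, type, x, y, z.
fprintf(fid, '%d %d %.4f %.4f %.4f\n', [1:100; ones(1,100); rand(3,100)]);
fclose(fid);

% Time only the read, using the same call pattern as the posted code.
tic
data = dlmread(fname, ' ', 9, 0);
tRead = toc;
fprintf('Read %d rows in %.4f s\n', size(data, 1), tRead);

delete(fname);   % clean up the temporary file
```

If the read alone turns out to be fast, the time is being spent elsewhere in the loop rather than in file access.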
Answers (2)
Voss
on 30 Jan 2022
You've got 2 nested for loops there, the outer one looping over all the F.* files and the inner one looping over all the Nareplicate.* files. So with 5 files of each type, the inner loop will run 25 times, performing the same set of operations 5 times on each pair of files. With 1000 files your inner loop would execute 1 million times (doing the same thing on each pair of files 1000 times), so fixing that should significantly speed things up, I would expect.
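To make those iteration counts concrete, here is a tiny sketch that only counts how often the loop body runs. The variable names are made up for the illustration and no files are read.

```matlab
nFiles = 5;   % number of files of each type, as in the 5-pair example

% Nested loops: the body runs once for every (outer, inner) combination.
nestedCount = 0;
for a = 1:nFiles
    for b = 1:nFiles
        nestedCount = nestedCount + 1;
    end
end

% Single loop: the body runs once per pair of files.
singleCount = 0;
for a = 1:nFiles
    singleCount = singleCount + 1;
end

fprintf('nested: %d iterations, single: %d iterations\n', nestedCount, singleCount);
```

The nested count grows as nFiles^2, which is where the 1 million iterations for 1000 files comes from.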
The current two-loop implementation with those 5 pairs of files takes ~1s:
clc
clear all
tic
S = dir('F.*');
for k = 1:numel(S)
    S(k).name;
    T = dir('Nareplicate.*');
    for k = 1:numel(T)
        T(k).name;
        Nadump = dlmread(T(k).name, ' ', 9, 0);
        Fdump = dlmread(S(k).name, ' ', 9, 0);
        L1 = length(Nadump);
        L2 = length(Fdump);
        for i=1:L1
            for j=1:L2
                X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                X(i) = X(i)/10;
                Y(j,i) = (X(i));
                %X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
                %X(i) = X(i)/10;
                %Y(j,i) = (X(i));
            end
        end
        %S = zeros(L2, L1);
        %for j = 1:L2
        %S(j,:) = sort(Y(j,:));
        %end
        %S1= S(:,1);
        Y1 = Y';
        Y2 = sort(Y1);
        S1= sort ((Y2(1,:))');
        %Find indices to elements in first column of A that satisfy the equalit
        ind1 = S1(:,1) < .28;
        ind2 = S1(:,1) < .55;
        ind3 = S1(:,1) < .78;
        %ind4 = S(:,1) > .79;
        %Use the logical indices to index into A to return required sub-matrices
        A1 = S1(ind1,:);
        A2 = S1(ind2,:);
        A3 = S1(ind3,:);
        %A4 = S(ind4,:);
        Q1(k,:) = [length(A1), (length(A2)-length(A1)), length(A3)-length(A2), 125-length(A3)];
    end
end
W= sum(Q1)/(length(T));
W1 = W/125;
toc
bar(diag(W1),'stacked', 'BarWidth', 1)
dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)
Using one loop takes ~0.3s:
clc
clear all
tic
S = dir('F.*');
T = dir('Nareplicate.*');
for k = 1:numel(S)
    S(k).name;
    T(k).name;
    Nadump = dlmread(T(k).name, ' ', 9, 0);
    Fdump = dlmread(S(k).name, ' ', 9, 0);
    L1 = length(Nadump);
    L2 = length(Fdump);
    for i=1:L1
        for j=1:L2
            X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
            X(i) = X(i)/10;
            Y(j,i) = (X(i));
            %X(i)= sqrt((Fdump(j,3)-Nadump(i,3))^2 + (Fdump(j,4)-Nadump(i,4))^2 + (Fdump(j,5)-Nadump(i,5))^2);
            %X(i) = X(i)/10;
            %Y(j,i) = (X(i));
        end
    end
    %S = zeros(L2, L1);
    %for j = 1:L2
    %S(j,:) = sort(Y(j,:));
    %end
    %S1= S(:,1);
    Y1 = Y';
    Y2 = sort(Y1);
    S1= sort ((Y2(1,:))');
    %Find indices to elements in first column of A that satisfy the equalit
    ind1 = S1(:,1) < .28;
    ind2 = S1(:,1) < .55;
    ind3 = S1(:,1) < .78;
    %ind4 = S(:,1) > .79;
    %Use the logical indices to index into A to return required sub-matrices
    A1 = S1(ind1,:);
    A2 = S1(ind2,:);
    A3 = S1(ind3,:);
    %A4 = S(ind4,:);
    Q1(k,:) = [length(A1), (length(A2)-length(A1)), length(A3)-length(A2), 125-length(A3)];
end
W= sum(Q1)/(length(T));
W1 = W/125;
toc
bar(diag(W1),'stacked', 'BarWidth', 1)
dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)
Simplifying the computation cuts the time down again by almost half. If anything about what I did here is unclear, you can set a breakpoint, inspect the variables, and convince yourself that it does the same thing it used to do, and/or come back here and post a comment and I'll explain it:
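clc
clear all
tic
S = dir('F.*');
T = dir('Nareplicate.*');
N = numel(S);
Q1 = zeros(N,4);   % preallocate one row of counts per pair of files
for k = 1:N
    % Transpose Nadump so that each column of the file becomes a row vector
    Nadump = dlmread(T(k).name, ' ', 9, 0).';
    Fdump = dlmread(S(k).name, ' ', 9, 0);
    L1 = size(Nadump,2);   % number of rows read from the Nareplicate file
    L2 = size(Fdump,1);    % number of rows read from the F file
    Y = zeros(L2,L1,3);
    % Differences for every (F row, Na row) pair, one page per column 3, 4, 5
    % (column minus row relies on implicit expansion)
    for m = [3 4 5]
        Y(:,:,m-2) = Fdump(:,m)-Nadump(m,:);
    end
    % Scaled distance from each F row to its nearest Na row
    S1 = min(sqrt(sum(Y.^2,3))/10,[],2);
    % Cumulative counts below each distance threshold
    N1 = nnz(S1 < 0.28);
    N2 = nnz(S1 < 0.55);
    N3 = nnz(S1 < 0.78);
    % Counts per distance bin; the last bin holds everything at or above 0.78
    Q1(k,:) = [N1 N2-N1 N3-N2 L2-N3];
end
W1 = sum(Q1,1)/N/125;
toc
bar(diag(W1),'stacked', 'BarWidth', 1)
dlmwrite('Ion-Pair-Stat.txt',Q1,'delimiter','\t','precision',3)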
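If you would rather check the equivalence without a breakpoint, here is a minimal sketch that runs the loop-based distance computation and the vectorized one on the same random matrices and compares the results. The matrix sizes and contents are made up; only columns 3 to 5 matter, as in the code above. It assumes a release with implicit expansion (R2016b or later).

```matlab
rng(0);                      % fixed seed so the check is repeatable
Fdump  = rand(6, 5) * 10;    % made-up stand-in for the contents of one F.* file
Nadump = rand(4, 5) * 10;    % made-up stand-in for one Nareplicate.* file
L1 = size(Nadump, 1);
L2 = size(Fdump, 1);

% Loop version: scaled distance from every F row j to every Na row i.
Y = zeros(L2, L1);
for i = 1:L1
    for j = 1:L2
        Y(j,i) = sqrt(sum((Fdump(j,3:5) - Nadump(i,3:5)).^2)) / 10;
    end
end
% Same steps as the original code: sort each column of Y.', keep the first
% row (the nearest Na for each F row), then sort those minima.
Y2 = sort(Y.');
S1loop = sort(Y2(1,:).');

% Vectorized version, following the simplified code.
Nat = Nadump.';
D = zeros(L2, L1, 3);
for m = [3 4 5]
    D(:,:,m-2) = Fdump(:,m) - Nat(m,:);
end
S1vec = min(sqrt(sum(D.^2, 3)) / 10, [], 2);

% The vectorized result is unsorted, so sort it before comparing.
maxDiff = max(abs(sort(S1vec) - S1loop));
fprintf('largest difference between the two versions: %g\n', maxDiff);
```

The sort is dropped in the simplified code because only counts below each threshold are needed, and counting does not depend on order.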
If you take any or all of those changes, I bet you will see significant improvement in the speed of your code when you run it on the real (1000s of files) case.
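On the parallel part of the question: once each pair of files is handled independently in a single loop, that loop is a natural candidate for parfor. The following is only a sketch under some assumptions: it needs Parallel Computing Toolbox to actually run in parallel (without it, parfor should simply run as an ordinary loop), and it uses made-up random matrices in cell arrays (Fall, Naall) in place of the dlmread calls so that it runs without any files. For work this small, the overhead of starting a pool may outweigh the gain, so it is worth timing both versions.

```matlab
N = 8;                        % hypothetical number of file pairs
rng(1);
Fall  = cell(N, 1);           % stand-ins for the contents of the F.* files
Naall = cell(N, 1);           % stand-ins for the Nareplicate.* files
for k = 1:N
    Fall{k}  = rand(125, 5) * 10;
    Naall{k} = rand(125, 5) * 10;
end

Q1 = zeros(N, 4);
parfor k = 1:N
    % With real files these two lines would be the dlmread calls.
    Fdump  = Fall{k};
    Nadump = Naall{k};
    L2 = size(Fdump, 1);

    % Pairwise coordinate differences via implicit expansion (R2016b+).
    dx = Fdump(:,3) - Nadump(:,3).';
    dy = Fdump(:,4) - Nadump(:,4).';
    dz = Fdump(:,5) - Nadump(:,5).';
    S1 = min(sqrt(dx.^2 + dy.^2 + dz.^2) / 10, [], 2);

    N1 = nnz(S1 < 0.28);
    N2 = nnz(S1 < 0.55);
    N3 = nnz(S1 < 0.78);
    Q1(k,:) = [N1 N2-N1 N3-N2 L2-N3];   % sliced output: one row per iteration
end
W1 = sum(Q1, 1) / N / 125;
```

Each iteration writes only its own row of Q1 and reads only its own pair of inputs, which is what lets the iterations run in any order.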
Walter Roberson
on 28 Jan 2022
Are the files already stored on an SSD?
If not then are they split between two (or more) hard drives? Preferably on different controllers?
Generally speaking, the peak performance for hard drives is typically two reading processes per drive, one (sometimes two) drives per controller.
I have been testing some Samsung BAR+ USB 3.1 flash drives (make sure you get the 128 GB or larger version; the smaller ones are slower). On a very new M1 MacBook, I am reading about 304 megabits/second from them; on my 2013 iMac with an external USB3 hub, I am reading about 386 megabits/second from them. Write speed is only on the order of 60 megabits/second, but the read speed is very nice.
A little over a year ago, I connected an external Thunderbolt drive bay to my 2013 iMac; with it and WD Red drives or HG Star drives, I am able to write about 225 megabits/second and read about 285 megabits/second. Reading speed is not as good as with those new flash drives... on the other hand, I am running it through a Thunderbolt 2 <-> Thunderbolt 3 converter, and would likely get a significant performance improvement if I were to switch it to my newer iMac.
My Samsung EVO (SSD) drive in the same external enclosure is giving me write speeds of about 490 megabits/second and read speeds of about 530 megabits/second.
... The point being that paying attention to what kind of drives you have and how they are connected can help gain a significant performance improvement.
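To get a rough idea of what your own storage delivers for many small files, you could time a plain read of every file. This is a sketch with made-up file names and contents: it writes 50 small text files to a temporary folder and reads them back with fileread. Files that were just written are probably served from the operating system's cache, so on real data you would point the dir call at the existing files instead.

```matlab
% Write some small stand-in files to a temporary folder (hypothetical names).
tmpDir = tempname;
mkdir(tmpDir);
nFiles = 50;
for k = 1:nFiles
    fid = fopen(fullfile(tmpDir, sprintf('AAA_%d.txt', k)), 'w');
    fprintf(fid, '%.6f %.6f %.6f\n', rand(3, 100));   % 100 lines per file
    fclose(fid);
end

% Time reading every file back as raw text.
listing = dir(fullfile(tmpDir, 'AAA_*.txt'));
totalBytes = 0;
tic
for k = 1:numel(listing)
    txt = fileread(fullfile(tmpDir, listing(k).name));
    totalBytes = totalBytes + numel(txt);   % one char per byte for ASCII text
end
elapsed = toc;
fprintf('%d files, %d bytes in %.4f s (%.2f ms per file)\n', ...
    numel(listing), totalBytes, elapsed, 1000 * elapsed / numel(listing));

rmdir(tmpDir, 's');   % remove the temporary folder and its files
```

With files this small, the per-file time is likely to matter more than the raw transfer rate.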
If you are using a USB 2 drive and you have a USB 3 controller, pick up a quality SSD or a quality flash drive. The Samsung BAR+ 128 GB drives cost me only $C30 each.