How to find the most frequently repeated time interval?

Matlab warrior! In the data travels, there is a column with start and end time. Each interval has certain value or zero. I have to find the most repeated time (hot spot) among those travels which is not equal to zero. For ex, among two, 1st interval "06:30"-"07:30", and second "06:00 - 07:00", then hotspot would be "06:30-07:00". I have tried smth like:
load_dumb = zeros(24*60+1,1);
for i = 1:length(end_timedumb)
[~, ~, ~, H, MN, ~] = datevec(st_timesumb(i));
T_Stdumb = H*60+MN+1;
[~, ~, ~, H, MN, ~] = datevec(end_timedumb(i));
T_Enddumb = H*60+MN+1;
if isnan(T_Enddumb)
continue
else
minscale_DCH1(T_Stdumb:T_Enddumb,1) = minscale(T_Stdumb:T_Enddumb,1) + 1;
load_dumb(T_Stdumb:T_Enddumb,1) = load_dumb(T_Stdumb:T_Enddumb,1) + dumb_chargingpower(i);
end
end
TimeM = 1:length(minscale);
TimeH = TimeM/60;
figure
plot(TimeH,load_dumb)
but that gives me only the overall peak that values accumulated among intervals. However, I need to find specific interval that is the most common.

2 Comments

Can you attach some sample data?
Hey, thanks for the thread. I just did, there are several time durations, it's up to you which one to choose.

Sign in to comment.

 Accepted Answer

First I read in data as a string array
opts = detectImportOptions(filename,'TextType','string');
opts.SelectedVariableNames = {'Var2','Var3'}; % alternatively {'Var4','Var5'}, or {'Var6','Var7'}
D = readmatrix(filename,opts);
Convert each string to a duration in terms of minutes from 00:00
D = minutes(duration(D,'InputFormat','hh:mm'));
D(D(:,2)==0,2)=1440; % set 00:00 in the second column to 1440 minutes
To find the hotspot, I create a logical array, M, which has the same number of rows as D, and 1440 columns (one for each minute of the day). Values are set to true if they fall within the range:
M = false(size(D,1),24*60);
for c = 1:size(D,1)
M(c,D(c,1)+1:D(c,2))=true;
end
You can then sum up each column to find the frequency with which a given minute appeared in the time intervals.
figure(), plot(duration(0,0,0):minutes(1):duration(23,59,0),sum(M))
One thing to note, if the range from your data begins and ends on the same minute then that minute is included using the code as it is.

6 Comments

Also, if you want the best hotspot with a specific duration, for example 30 minutes, you can also apply a moving average as follows:
hotspotWindow = 30 % minutes
freqWindow = movmean(sum(M),[0 hotspotWindow-2],'EndPoints','fill');
[val,idx] = max(freqWindow);
figure(), plot(duration(0,1,0):minutes(1):duration(24,0,0),freqWindow)
hold on, plot(duration(0,idx,0),val,'or')
range = [duration(0,idx,0) duration(0,idx+hotspotWindow,0)] %non-inclusive
That an amazing, does the trick, thank you so much. However, I have two things to ask haha. I apologize for unclear explanation, but what I meant by when value equals zero, I meant not duration but different value that each duration has (in data its J column for example). Which defines if it's charging or not.
Second, what do you suggest if I want to find 4 maximum hot spot ranges (let's say 1 hour duration), not only one.
If it sounds like different question to you, I can make a different thread, you have already helped me a lot. Thanks
Do you want to exclude rows containing 00:00? I'm not sure what you mean.
For finding the four best hotspots do you want the four hotspots that maximises the sum of occurences, or to select best hotspot then the next from what remains and so on? The latter is straightforward enough the former might be worth another question.
No, i don't want to exclude rows containing 00:00, I mean for each duration there is a value that contains charging power. Which means if I want to include that third column, value changes. For example, in duration from 06:00 to 06:30 charging power can be zero. In that case, I don't want that duration to be counted as occurance.
Four highest hotspot that maximises the sum of occurances, correct.
No values in column J were zero, did you mean column I? That would be done as follows:
opts.SelectedVariableNames = {'Var9'}; % column I in excel
isOff = readmatrix(filename,opts)==0; % rows equal to 0
D(isOff,:) = []; % delete rows corresponding to a 0 in I.
As for finding the four hotspots which maximise occurrences, it is a little more involved, i suggest opening a new question on it.
@Turlough Hughes, thank you. I will make another question, you made my life much easier )

Sign in to comment.

More Answers (0)

Products

Release

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!