Finding Number of Events with Values Consectively Greater Than a Threshold

I have temperature data called tmax (2650x1 double) that is basically just temperatures every single day in a row. Im trying to find the number of events in which the temperature was 90 degrees or above for 2 or more consecutive days. I've tried a few things with no luck, please help me with how to go about this. Thank you in advance for the help!

 Accepted Answer

Hi Claire,
I would use the findpeaks function, a very useful function I like to use for finding peaks of particular width without going through a loop. I showcase how you might be interested in using this function below, with a sample array A. I convert it to a binary sequence that is either larger than or smaller than the threshold of 90 degrees, and then I find the location of the start of the increased temperature, and the extent of this increased temperature.
% Generating fake temperature data
A = floor(100*rand(2650,1));
% Creating a binary array of when the value is greater than 90 degrees F
B = ones(size(A));
B(A < 90) = 0;
% finding the number of times the temperature is larger than 90 degrees F
% at least 2 times in a row
findpeaks(B, 'MinPeakWidth', 2)
[pks,locs,w,p] = findpeaks(B, 'MinPeakWidth', 2);
% pks provides the peak values, the locs provides the peak locations, the w
% provides the width of the peak, and the p provides the prominence of the
% peaks. w should provide some useful information to you.
Hope this code helps, you learned the function, and it proves useful for your application!
Michael

6 Comments

Thank you Michael, I have never used the findpeaks function this was a great solution!
My pleasure! The solutions by Image Analyst and Walter below are also great solutions!
Claire: I'd like you to reconsider my approach. It requires the Image Processing Toolbox rather than the Signal Processing Toolbox (where findpeaks() is), but it will give you the correct answer, unlike findpeaks() unless you put in additional code to fix where it didn't find peaks at the start or end of your signal. With my code, it finds the correct number immediately on the first pass. Here's a little demo to prove that my code correctly found the 5 heatwaves while findpeaks() did not. Scroll down to see the code and the plots of the two methods:
% Clean up
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
imtool close all; % Close all imtool figures.
clear; % Erase all existing variables.
workspace; % Make sure the workspace panel is showing.
fontSize = 14;
% Generate 2650 temperatures at random, and plot them in a bar chart.
numPoints = 2650;
dailyTemperatures = [...
93.01 97.35 84.12 86.508 84.195 86.038 81.075 83.326 84.542 89.415 81.751 84.713 91.139 82.987 88.783 83.681 85.674 87.902 88.204 85.248 80.282 84.929 93.084 94.996 ...
86.96 90.1 84.28 94.217 95.693 87.538 84.79 83.012 91.955 87.647 89.053 90.558 85.558 83.294 85.688 80.359 82.743 86.484 84.795 89.066 89.983 86.72 86.788 87.182 86.791 ...
85.733 84.09 89.741 89.577 83.387 82.546 83.006 82.561 86.643 86.309 83.658 90.006 80.955 84.811 95.8 96.496 82.703 82.811 91.197 87.806 83.365 95 98];
%----------------------------------------------------------------------------------------------
% Image Analyst's code below.
subplot(2, 1, 1);
bar(dailyTemperatures);
ylim([80, 100]);
grid on;
yline(90, 'Color', 'r', 'LineWidth', 2); % Display the threshold.
% Measure lengths of stretches above 90:
props = regionprops(dailyTemperatures > 90, 'Area')
% Count how events are 2 days or longer
allLengths = [props.Area]
numEvents = sum(allLengths >= 2);
fprintf('The longest heatwave lasted %d days.\n', max(allLengths));
fprintf('Image Analyst code correctly finds there are %d heatwaves of more than 90 Degrees for 2 days or longer.\n', numEvents);
caption = sprintf('%d heatwaves found', numEvents);
title(caption, 'FontSize', 15);
%----------------------------------------------------------------------------------------------
% Incorrect code below.
% Generating fake temperature data
% A = floor(100*rand(2650,1));
subplot(2, 1, 2);
A = dailyTemperatures;
% Creating a binary array of when the value is greater than 90 degrees F
B = ones(size(A));
B(A < 90) = 0;
% finding the number of times the temperature is larger than 90 degrees F
% at least 2 times in a row
findpeaks(B, 'MinPeakWidth', 2) % Makes plot appear.
[pks,locs,w,p] = findpeaks(B, 'MinPeakWidth', 2);
fprintf('Incorrect code finds %d heatwaves.\n', length(locs));
caption = sprintf('%d heatwaves found', length(locs));
title(caption, 'FontSize', 15);
% pks provides the peak values, the locs provides the peak locations, the w
% provides the width of the peak, and the p provides the prominence of the
% peaks. w should provide some useful information to you.
%----------------------------------------------------------------------------------------------
So that's why I thought my answer should have been the accepted one. If you don't have the Image Processing Toolbox, then we'll have to add code to the findpeaks() code to handle cases where the beginning and ending of the signal may contain a heatwave. Let us know if that's the case.
Nice Point Image Analyst! You are right that my code only works for heatwaves that are not at the edges of the timeseries. I believe that my answer may have been accepted simply because I had respondend quite early to the question, provided relatively working code, albeit with edge problems that I had not considered initially, and it looks as though your response may have simply been after Claire had logged in to accept her answer based on the timestamps I see of the thread. Hopefully this change can be made, but thank you for your very robust solution! Of course, as you mention, there are ways to work around this edge case challenge with a some more preprocessing.
The stfind() solution I presented does not need any additional toolboxes, and takes leading / trailing patterns into account.
Yes, it was very clever. Many people don't know the trick of using strfind() on numerical arrays. It's not just for character arrays.

Sign in to comment.

More Answers (2)

Hint:
T = [0 0 0 0 1 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 1]
strfind(T, [0 1 1])

2 Comments

I've somehow never used the strfind function, did some more looking into it and tried this out and it worked perfect, thank you!
tmax01=tmax01';
idx=tmax01>90;
ii1=strfind([0 idx 0],[0 1]);
ii2=strfind([0 idx 0],[1 0])-1;
ii=(ii2-ii1+1)>=2;
out=arrayfun(@(x,y) tmax01(x:y),ii1(ii),ii2(ii),'un',0);
celldisp(out)
Using [0 idx is correct when you are looking for the beginning of patterns: it avoids problems if the part being searched begins with an appropriate pattern. Well done for discovering that.
Using idx 0] is not needed when searching for the beginning of a pattern: the extra 0 you are putting on can never match the 1 of your [0 1] pattern, and if the data to be searched happens to end in [0 1] then you would want that to be found when using [0 1] as the pattern. Adding the 0 does not do anything for you compared to using strfind properly.
Using idx 0] is, however, proper if you are looking for the end of a pattern.
ii=(ii2-ii1+1)>=2;
You do not need that. Consider
idx = tmax01(:).' > 90; %force row
N = 2; %number to find
ii1 = strfind([0 idx], [0 ones(1,N)]);
then ii1 will only match the beginning of locations that have a minimum of N 1's in a row. And for your purposes here, knowing the number of events, length(ii1) is all you need.
It is common to want to know the start and end of patterns, though. Instead of subtracting and comparing to a threshold, you can work more directly:
idx = tmax01(:).' > 90; %force row
N = 2; %number to find
ii1 = strfind([0 idx], [0, ones(1,N)]);
ii2 = strfind([idx 0], [ones(1,N), 0]) + N - 1;
Then you already know that ii1 and ii2 only match if the appropriate length was found, and ii1 and ii2 are direct indices to the beginning and end:
out = arrayfun(@(x,y) tmax0(x:y), ii1, ii2, 'uniform', 0)

Sign in to comment.

Yet another way is to use regionprops() to measure all the stretches of days above 90:
% Generate 2650 random temperatures, and plot them in a bar chart.
numPoints = 2650;
dailyTemperatures = 80 + 12 * rand(numPoints,1);
bar(dailyTemperatures);
ylim([80, 100]);
grid on;
yline(90, 'Color', 'r', 'LineWidth', 2); % Display the threshold.
% Measure lengths of stretches above 90:
props = regionprops(dailyTemperatures > 90, 'Area')
allLengths = [props.Area]
% Count how events are 2 days or longer.
numEvents = sum(allLengths >= 2);
fprintf('There are %d heatwaves of more than 90 Degrees for 2 days or longer.\n', numEvents);
Doing it this way would also let you see the longest heat wave:
fprintf('The longest heatwave lasted %d days.\n', max(allLengths));

Categories

Find more on Read, Write, and Modify Image in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!