Finding Steepest Linear Portion of Data
16 views (last 30 days)
Show older comments
Jonathan Acheson
on 1 Sep 2015
Commented: Star Strider
on 5 Nov 2024
I've been trying to use fitlm as a way of finding the steepest linear portion of a graph but cannot get it to work correctly. Perhaps there is a better way I could be finding this out?
Essentially I have data (stress and strain) which looks like the following:

I'm trying to use Matlab to automatically find the steepest linear portion of the graph. This is what I have been trying so far using the fitlm function.
R=0;
for x = 100:size(strain)
fitlm(strain(10:x),stress(10:x));
if ans.Rsquared.Ordinary > R
R=ans.Rsquared.Ordinary;
index=x;
end
end
I can get the code to go along the data and find out where the r-squared value seems to fit best and identify a linear region of the data, but how can I get Matlab to identify the steepest linear region?
This is the output of plotting from the start of the data to the index point:

1 Comment
Oliver Ramos
on 17 Dec 2020
Hi Jonathan,
I have the same problem, I need to find the linear portion from my data points (quadratic) by finding the best R^2 based on a fixed window (number of data that forms a line). May I know how did you plot the resulting best linear portion from the code?
R=0;
for x = 100:size(strain)
fitlm(strain(10:x),stress(10:x));
if ans.Rsquared.Ordinary > R
R=ans.Rsquared.Ordinary;
index=x;
end
end
Accepted Answer
Star Strider
on 1 Sep 2015
You have a non-linear model, so fitting linear segments is likely not going to be easy. The best way to fit your data is to use a model of the process that created your data (if you have an analytical version of it) and then a non-linear regression, such as fitnlm or nlinfit. Then take the analytic derivative of the fitted function.
If you’re looking for linear segments in your data and don’t want to fit it, I would start by using the gradient function to calculate the numerical derivative of your data (since they appear reasonably noise-free). This should produce a series of ‘steps’ that are flat in the linear sections. Choose the one that meets your criteria in terms of length and ‘flatness’, and use it. One of the histogram functions could be helpful here, since you can define the ranges of the bins. This will eliminate a lot of tedious threshold programming. (I don’t have your data, so I can’t write code to analyse it.)
5 Comments
Abdulaziz Abutunis
on 4 Oct 2016
Edited: Abdulaziz Abutunis
on 5 Oct 2016
Hi Star Strider, I have the same problem, I have a set of non-linear data that has a linear segment close to the lift end of the curve. I wonder how would you suggest to use the gradient function to locate the best range of data that represents the linear segment and thin find the slop and the interception point. The data are as follow and the linear segment always crosses y coordinate means that the linear segment should have negative and positive elements from x array and usually between -4 and 5 but sometimes this range slightly increases or decreases.
X=[ -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10];
y=[ -0.2757 -0.3914 -0.4219 -0.3808 -0.3122 -0.1272 0.0800 0.1745 0.2545 0.3270 0.3932 0.4539 0.5091 0.5590 0.6040 0.6450];
Thanks in advance Aziz
Star Strider
on 5 Nov 2024
I cannot detect any parts that are actually linear. If that were so, there would be a regiion where the derivative (calculated by gradient) was compleetely flat, and the second derivative would then be zero. That could easily be detected. There is no such region here, however one portion could be considered ‘Nearly Linear’.
Try this —
X=[ -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10];
y=[ -0.2757 -0.3914 -0.4219 -0.3808 -0.3122 -0.1272 0.0800 0.1745 0.2545 0.3270 0.3932 0.4539 0.5091 0.5590 0.6040 0.6450];
dydX = gradient(y,X)
d2ydX2 = gradient(dydX,X)
islinear = d2ydX2 == 0
smallest_abs_d2ydX2 = min(abs(d2ydX2))
isnearlylinear = abs(d2ydX2) <= 5*smallest_abs_d2ydX2
figure
plot(X, y, DisplayName="Original Data")
hold on
plot(X, dydX, DisplayName="dy/dX")
plot(X, d2ydX2, DisplayName="d^2y/dX^2")
plot(X(isnearlylinear), y(isnearlylinear), ':r', LineWidth=1.5, DisplayName="‘Nearly Linear’")
hold off
grid
legend("Location","best")
That is how I would test it for ‘linearity’. Note that there is a strict mathematical definition of Linearity, and this acttually does not conform to that definition.
.
More Answers (1)
Image Analyst
on 1 Sep 2015
Steepest over what range of x? From one element to the next? Over 5 or 10 elements? Can you attach numerical data?
How about just scanning the data and fitting a line over the range, and keeping track of which location has the highest slope? Something like (untested):
maxSlope = -inf;
windowWidth = 5; % Whatever...
maxk = 1;
for k = 1 : length(x) - windowWidth
% Get a subset of the data that is just within the window.
windowedx = x(k:k+windowWidth-1);
windowedy = y(k:k+windowWidth-1);
% Fit a line to that chunk of data.
coefficients = polyfit(windowedx, windowedy, 1);
% Get the slope.
slope = coefficients(1);
% See if this slope is steeper than any prior slope.
if slope > maxSlope
% It is steeper. Keep track of this one.
maxSlope = slope
maxk = k;
end
end
2 Comments
Pietro Picerno
on 5 Nov 2024
Edited: Pietro Picerno
on 5 Nov 2024
very smart! Just tested. It works fine. I'd have preferred as an output an array of indices containing the less steep region, but I assume that maxk is the beginning of such a region. Thanks @Image Analyst
Image Analyst
on 5 Nov 2024
@Pietro Picerno for the least steep sections, you can do the obvious modifications:
minSlope = inf;
windowWidth = 5; % Whatever...
minSlopeIndex = 1;
for k = 1 : length(x) - windowWidth
% Get a subset of the data that is just within the window.
windowedx = x(k:k+windowWidth-1);
windowedy = y(k:k+windowWidth-1);
% Fit a line to that chunk of data.
coefficients = polyfit(windowedx, windowedy, 1);
% Get the slope.
slope = abs(coefficients(1)); % Use abs so it doesn't matter if it's a + or - slope.
% See if this slope is flatter than any prior slope.
if slope < minSlope
% It is steeper. Keep track of this one.
minSlope = slope
minSlopeIndex = k;
end
end
Not sure what you mean by an array. There is only one flattest portion, unless you have multiple sections where the slope is the same, like a slope value of 0. You can do that also if that's the case by doing something like this:
minSlope = inf;
windowWidth = 5; % Whatever...
minSlopeIndex = 1;
% First scan all the data to collect slopes everywhere.
% Need to do this first so we can determine the overall globa minimum slope.
for k = 1 : length(x) - windowWidth
% Get a subset of the data that is just within the window.
windowedx = x(k:k+windowWidth-1);
windowedy = y(k:k+windowWidth-1);
% Fit a line to that chunk of data.
coefficients = polyfit(windowedx, windowedy, 1);
% Get the slope.
thisSlope = abs(coefficients(1)); % Use abs so it doesn't matter if it's a + or - slope.
% Store the kth slope
allSlopes(k) = thisSlope;
end
% From the vector of all the slopes, find the minimum slope
minSlope = min(allSlopes)
% Now find the indexes where the slopes occur.
% Use find() because there may be more than one location.
minSlopeIndexes = find(allSlopes == minSlope)
If you want, you can examine allSlopes to determine all the slopes. You can sort them if you want.
See Also
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
