Generating a range of curves that fit inside a set of fixed limits

Hi,
I have a dataset (upper and lower limits for a given X coordinate) where I want to generate a batch of curves that fit inside the dataset, to explore the worst case possible representations.
Dataset is:-
X Value , Lower Limit , Upper Limit
0.0 14.96 15.94
0.2 14.98 15.91
0.4 13.47 15.94
0.6 10.90 13.66
0.8 5.60 12.38
1.0 0.00 7.84
1.2 0.00 2.35
1.4 0.00 0.51
How would I go about it?
Many thanks, Mark.
  2 Comments
Adam Danz on 24 Jan 2021
Edited: Adam Danz on 24 Jan 2021
What set of curves?
What defines the y coordinate (assuming 2d)? Are the lower/upper limits limiting y?
Mark Ingram on 24 Jan 2021
Hi Adam,
The Upper and Lower limits are shown versus the X values in the graph.
I would like to run a script that generates many curves (similar in shape to the average curve shown) that fit inside the upper and lower limits. I am trying to generate a set of curves with steep and shallow inflections that together cover the whole area between the upper and lower limits. I would also like to explore bias towards the lower or the upper limit. I am just trying to find a way to represent all (or the most obvious) possibilities.
Thanks,


Answers (2)

Image Analyst on 24 Jan 2021
Here's an example. Adapt as needed:
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fprintf('Beginning to run %s.m ...\n', mfilename);
% X Value , Lower Limit , Upper Limit
x = [...
0.0 14.96 15.94
0.2 14.98 15.91
0.4 13.47 15.94
0.6 10.90 13.66
0.8 5.60 12.38
1.0 0.00 7.84
1.2 0.00 2.35
1.4 0.00 0.51 ]
[rows, columns] = size(x);
% Get colors for all the lines.
cmap = jet(rows);
legendStrings = cell(rows, 1); % Preallocate the legend labels filled in by the loop below.
for row = 1 : rows
% Make 500 points between the left and right.
thisXAxis = linspace(x(row, 2), x(row, 3), 500);
% Assume the first column is the period, since we have no idea what it should be.
% Note: the period is 0 in the first row, so y is NaN there and that curve is not drawn.
period = x(row, 1);
% Get a random amplitude
amplitude = rand;
% Get y
y = amplitude * cos(2 * pi * thisXAxis / period);
plot(thisXAxis, y, '-', 'Color', cmap(row, :), 'LineWidth', 4);
grid on;
hold on;
% Set up legend
legendStrings{row} = sprintf('Curve %d', row);
end
fontSize= 20;
xlabel('X', 'FontSize', fontSize);
ylabel('Y', 'FontSize', fontSize);
legend(legendStrings, 'Location', 'north');
g = gcf;
g.WindowState = 'maximized';
fprintf('Done running %s.m.\n', mfilename);
  1 Comment
Mark Ingram on 24 Jan 2021
Thanks for the example. Appreciate the quick turn around. I have provided some more information if that helps.



Adam Danz on 24 Jan 2021
Edited: Adam Danz on 24 Jan 2021
1. A sophisticated way would be to fit the curve, perhaps to a sigmoid or logistic function, and then adjust the fit parameters to produce a set of smooth curves that encompass the range of y values. That will likely require a lot of fine-tuning. If you're interested in slope and its variation, this would be the way to go.
2. Another suggestion is to bootstrap whatever generated the data in the first place. You can resample, with replacement, from the distribution that gave you the bounds many times to generate many curves from the raw, resampled data. The slope can be computed from each bootstrap iteration to produce an approximately normal distribution of slopes, from which you can compute the mean and std.
3. A lower-level solution is to linearly space y values within each range,
T = array2table([
0.0 14.96 15.94
0.2 14.98 15.91
0.4 13.47 15.94
0.6 10.90 13.66
0.8 5.60 12.38
1.0 0.00 7.84
1.2 0.00 2.35
1.4 0.00 0.51 ], ...
'VariableNames',{'x','lower','upper'});
% Add means since they weren't provided
T.y = mean([T.lower, T.upper],2)
T = 8x4 table
     x     lower    upper      y   
    ___    _____    _____    ______
      0    14.96    15.94     15.45
    0.2    14.98    15.91    15.445
    0.4    13.47    15.94    14.705
    0.6     10.9    13.66     12.28
    0.8      5.6    12.38      8.99
      1        0     7.84      3.92
    1.2        0     2.35     1.175
    1.4        0     0.51     0.255
% Plot mean curve
errorbar(T.x, T.y, T.y-T.lower, T.upper-T.y, 'LineWidth', 3);
% Add various lines within the bounds
hold on
nLines = 10; % <--- number of lines
yvals = cell2mat(arrayfun(@(i){linspace(T.lower(i), T.upper(i), nLines)}, 1:height(T))');
xvals = repmat(T.x, 1,nLines);
h = plot(xvals, yvals);
4. But the variation doesn't have to be only vertical. If you want noisy curves that are anywhere within the bounds you can generate random y values within each bound,
% Plot mean curve
figure()
errorbar(T.x, T.y, T.y-T.lower, T.upper-T.y, 'LineWidth', 3);
% Add various lines within the bounds
hold on
boundRange = T.upper - T.lower; % width of each interval (same result as range(...,2), without needing the Statistics Toolbox)
rng('default') % for reproducibility
for i = 1:50 % <--- number of lines
randYVals = rand(1,numel(T.x)).*boundRange' + T.lower';
plot(T.x, randYVals)
end
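Suggestion 1 can be sketched roughly as follows. This is only an illustration under my own assumptions, not a fitted model: the decreasing logistic form A./(1+exp(k*(x-x0))) and the parameter grids (A_vals, k_vals, x0_vals) are hypothetical choices that you would need to tune. It tries every parameter combination and keeps those whose curve respects the limits at the 8 tabulated x values (the curve is not checked between them).

```matlab
% Supplier limits from the question
x     = [0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4]';
lower = [14.96 14.98 13.47 10.90 5.60 0.00 0.00 0.00]';
upper = [15.94 15.91 15.94 13.66 12.38 7.84 2.35 0.51]';

% Hypothetical parameter grids (assumed ranges, adjust for your data)
A_vals  = linspace(15, 15.9, 4);     % plateau height
k_vals  = linspace(4, 14, 21);       % steepness of the drop
x0_vals = linspace(0.6, 0.95, 15);   % location of the inflection

% Decreasing logistic curve (assumed functional form)
logistic = @(x, A, k, x0) A ./ (1 + exp(k .* (x - x0)));

kept = zeros(0, 3);   % rows of [A, k, x0] that respect every limit
for A = A_vals
    for k = k_vals
        for x0 = x0_vals
            y = logistic(x, A, k, x0);
            if all(y >= lower & y <= upper)
                kept(end+1, :) = [A, k, x0]; %#ok<AGROW>
            end
        end
    end
end

% Plot the limits and every accepted curve on a finer x grid
xFine = linspace(min(x), max(x), 200)';
figure()
plot(x, lower, 'k--', x, upper, 'k--', 'LineWidth', 2)
hold on
for i = 1:size(kept, 1)
    plot(xFine, logistic(xFine, kept(i,1), kept(i,2), kept(i,3)))
end
```

The rows of kept with the smallest and largest k are the shallowest and steepest accepted curves, which may be the cases of most interest for a worst-case check.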
  5 Comments
Mark Ingram on 25 Jan 2021
You're right I've not been clear on the "why".
This curve is a characteristic that can vary between these limits (the limits are what the supplier says they will respect on delivery). I want to check that our compensation calibrations for the delivered unit will accommodate any "reasonable" curve that falls inside these boundaries.
I think the bootstrapping technique is the way forward; however, I have no experience with it and this is the first time I have come across it.
Adam Danz on 25 Jan 2021
Edited: Adam Danz on 25 Jan 2021
Thanks for the description. If those intervals were supplied by the supplier and you don't have access to the underlying data that were used to compute the intervals, then the bootstrap method as I described it isn't available.
If the curve is something you generate and has some variation each time you generate it, you could store all of the curves you generated and bootstrap those data.
Brief description of bootstrapping
The main idea behind bootstrapping is this: your curve is based on 8 coordinates. Let's say that curve is generated 20 times; now you have a 20x8 matrix of y-values (assuming the x values do not change, but it's not a problem if they do). Assuming the x-values are independent, step 1 is to resample those 20x8 values with replacement to generate 1000 curves based on the same data. So column 1 is sampled 1000 times randomly, the same with column 2, and so on. Now you have a 1000x8 matrix containing resampled data from the original 20x8 matrix, which gives you 1000 curves. You can measure the bias in each of the 1000 curves to get 1000 bias values. Thanks to the central limit theorem, that distribution will be approximately normal provided that you have enough bootstrap iterations (in this case, 1000). From that distribution of bias measurements, you can compute the mean and the 95% confidence interval (the 2.5 and 97.5 percentiles). That gives you a statistically grounded estimate of bias and error from the original 20x8 sample.
Alternatively, you could measure the bias from the original 20 curves and then bootstrap those values. That starts with a 20x1 (or 1x20) vector of biases from the 20 original curves; you then randomly sample it 1000 times to get 1000x1 biases and compute the mean and CIs. That might be the better approach since your x's aren't independent from the y's.
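Here is a minimal sketch of that second approach (bootstrapping the mean of the per-curve bias values). Two things in it are assumptions for illustration only: the 20 curves in Y are synthetic placeholders drawn uniformly inside the limits, and the bias measure biasFcn (mean fractional position between the limits, 0 = on the lower limit, 1 = on the upper limit) is a hypothetical definition. Replace both with your own measured curves and your own bias metric.

```matlab
rng('default')            % for reproducibility
nCurves = 20;             % number of original curves
nBoot   = 1000;           % number of bootstrap iterations

lower = [14.96 14.98 13.47 10.90 5.60 0.00 0.00 0.00];
upper = [15.94 15.91 15.94 13.66 12.38 7.84 2.35 0.51];

% PLACEHOLDER DATA: 20 synthetic curves inside the limits (20x8).
% Replace Y with the curves you actually measured.
Y = lower + rand(nCurves, numel(lower)) .* (upper - lower);

% Hypothetical bias measure: mean fractional position within the limits
biasFcn = @(Y) mean((Y - lower) ./ (upper - lower), 2);
bias = biasFcn(Y);                        % 20x1, one value per curve

% Resample the 20 bias values with replacement, nBoot times
idx = randi(nCurves, nCurves, nBoot);     % each column is one bootstrap sample
bootMeans = mean(bias(idx), 1);           % 1 x nBoot bootstrap means

% Approximate 2.5th and 97.5th percentiles without any toolbox
sorted = sort(bootMeans);
ci = sorted([round(0.025 * nBoot), round(0.975 * nBoot)]);

fprintf('Mean bias %.3f, approx. 95%% CI [%.3f, %.3f]\n', ...
    mean(bootMeans), ci(1), ci(2));
```

If you have the Statistics and Machine Learning Toolbox, bootstrp and prctile can replace the manual resampling and percentile steps.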
Problems with my suggestions #3 and #4
The problem with my suggestion #3 is that the curves vary mainly vertically with very little variation otherwise and that may not be how the actual data vary.
The problem with my suggestion #4 is that variation is too wild and likely generates curves that wouldn't actually occur naturally (I'm guessing; I have no idea what process bore the data).
So basing estimates on unnatural curves won't help you understand how the real data may vary.
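One possible middle ground between #3 and #4, offered only as a sketch and not as something validated against the real process, is to let each curve sit at a fraction of the way between the limits and have that fraction drift linearly from a start value f0 to an end value f1 along x. The fraction grid fracs is a hypothetical choice. Curves with f0 = f1 explore bias toward one limit, while f0 = 1, f1 = 0 and f0 = 0, f1 = 1 cross from one limit to the other and so change how steep the drop is.

```matlab
x     = [0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4]';
lower = [14.96 14.98 13.47 10.90 5.60 0.00 0.00 0.00]';
upper = [15.94 15.91 15.94 13.66 12.38 7.84 2.35 0.51]';

fracs = 0:0.25:1;                        % hypothetical start/end fractions
[f0, f1] = meshgrid(fracs, fracs);       % every start/end combination (25 curves)

s = (x - x(1)) / (x(end) - x(1));        % 0 at the first x, 1 at the last x
F = f0(:)' .* (1 - s) + f1(:)' .* s;     % 8x25: fraction of the interval at each x
Y = lower + F .* (upper - lower);        % 8x25: one curve per column

% Plot the limits and the family of curves
figure()
plot(x, lower, 'k--', x, upper, 'k--', 'LineWidth', 2)
hold on
plot(x, Y)
```

Because every fraction in F lies between 0 and 1, each curve stays inside the limits at the tabulated x values by construction, so no rejection step is needed.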

