How to perform nonlinear regression across multiple datasets

Apologies in advance, as I am new to MATLAB.
I am trying to fit a model to multiple data sets at once using nonlinear regression. I have found similar examples, but I am unable to modify them to suit my needs.
The model contains 3 unknown parameters that must be tuned to give the best model fit across 4 data sets at once. However, the model also contains 1 known parameter, which is different for each of the 4 data sets.
Model to fit:
  ΔRon/Ron = A1*log(1 + t/tau) + A2*log(1 + t/(γ*tau))
  • ΔRon/Ron are the data set y values
  • t are the data set x values
  • A1, A2, γ are unknown parameters (common to all data sets) which must be found
  • tau is a known parameter which differs across the data sets
I have attached an m-file with relevant data and information. If somebody could provide guidance or a commented solution, I would be very grateful. Thanks.

8 Comments

Hi, Jack, how about the result below:
Root of Mean Square Error (RMSE): 2.95053109543145
Sum of Squared Residual: 1392.90139921726
Correlation Coef. (R): 0.990249551962065
R-Square: 0.98059417516107
Parameter    Best Estimate
---------    -------------
A1           0.0408605886292576
A2           10.6063188291707
gamma        1.93086466662657E23
The overall results are not so good.
[Figures: fitted curves for Data-1, Data-2, Data-3 and Data-4]
Hi @Alex Sha, yes, a solution like this is what I am looking for. Indeed, the fits are not great across all data sets, but this may be due to errors in the tau values I have provided. Would you be able to share the code that you used to obtain this solution, please?
Also, would you be able to share code that returns tau values (unique to each data set) that allow a better model fit, for a hypothetical situation where tau is not initially given? If this is not too much trouble. Thanks a lot.
Hi, if tau is treated as an unknown parameter for each dataset, the result is:
Root of Mean Square Error (RMSE): 0.14007204526557
Sum of Squared Residual: 3.13922845838077
Correlation Coef. (R): 0.997402065114036
R-Square: 0.994810879493743
Parameter             Best Estimate
------------------    -------------
A1                    0.280360966183305
A2                    13.4523349717593
gamma                 37464.657062153
tau (for dataset1)    2.4789258139085
tau (for dataset2)    4.89126227167148
tau (for dataset3)    68.7724316646379
tau (for dataset4)    0.660921515735491
[Figures: fitted curves for Dataset1, Dataset2, Dataset3 and Dataset4]
The above results were obtained with 1stOpt, a package other than MATLAB. The code looks like this:
VarParameter Tau>0;
Variable x,y;
Function y=A1*log(1+x/tau)+A2*log(1+x/(gamma1*tau));
DataFile "CodeSheet1[A2:B41]";
DataFile "CodeSheet1[C2:D41]";
DataFile "CodeSheet1[E2:F41]";
DataFile "CodeSheet1[G2:H41]";
The data files are stored as follows:
[Image: spreadsheet layout of the four datasets]
Could you please provide a link to download the 1stOpt software package? I was searching for it earlier but was unable to locate it. Is the software available for free? Thanks.
1stOpt is not free; it is commercial software: www.7d-soft.com/en
@Alex Sha, can you recommend a free or more affordable software package, please?
MATLAB should be OK, but you will need to do more of the work yourself.
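As a rough idea of what that extra work could look like in MATLAB, here is a minimal sketch that treats tau as one additional unknown per dataset. It is not the 1stOpt method used above. Everything numeric in it is made up for illustration (the simulated data, `qTrue`, the starting guess `q0`), and fitting the logarithms of gamma and tau to keep them positive is an assumption of this sketch, not something taken from the thread. It uses `nlinfit` from the Statistics and Machine Learning Toolbox.

```matlab
% Simulated stand-in for the four datasets (all numbers here are made up).
rng(1);                                   % reproducible noise
t    = linspace(1, 1000, 40)';            % 40 x values per dataset
T    = [t; t; t; t];                      % stacked x values
dsid = repelem((1:4)', 40);               % dataset id (1..4) for every row
X    = [T dsid];

% Parameter vector q = [A1, A2, log(gamma), log(tau1), ..., log(tau4)].
% Working with logs keeps gamma and every tau positive during the fit.
col    = @(v) v(:);                                   % force a column vector
tauRow = @(q, X) exp(col(q(3 + X(:,2))));             % tau for each row
model  = @(q, X) q(1)*log(1 + X(:,1)./tauRow(q, X)) ...
               + q(2)*log(1 + X(:,1)./(exp(q(3))*tauRow(q, X)));

% Simulate noisy observations from hypothetical "true" values.
qTrue = [0.5, 2, log(10), log([50 100 200 400])];
Y     = model(qTrue, X) + 0.01*randn(size(T));

% Fit all seven parameters at once from a rough starting guess.
q0   = [1, 1, log(5), log(150)*ones(1, 4)];
qHat = nlinfit(X, Y, model, q0);

% Convert back to the original parameters.
A1       = qHat(1);
A2       = qHat(2);
gammaHat = exp(qHat(3));
tauHat   = exp(qHat(4:7));
```

One caveat with this model when tau is free: swapping A1 with A2 while replacing each tau by gamma*tau and gamma by 1/gamma gives exactly the same curve, so two equivalent parameter sets exist and the fit may land on either one.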


 Accepted Answer

Hi Jack,
The following post on MATLAB Answers discusses a similar case:
In that question, there were 2 unknown shared parameters and 1 parameter that differed between datasets but was also unknown. In this question we have 3 unknown shared parameters and 1 known parameter whose value differs for each dataset. So I modified that example to illustrate this case:
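function sharedparams
% Synthetic example: four datasets, three shared unknowns, one known offset each.
t = (0:10)';
T = [t; t; t; t];                 % stacked x values
Y = 3 + [exp(-t/2); 2*exp(-t/2); 3*exp(-t/2); 4*exp(-t/2)] + randn(44,1)/10;
dsid = [ones(11,1); 2*ones(11,1); 3*ones(11,1); 4*ones(11,1)];  % dataset id per row
gscatter(T,Y,dsid)
X = [T dsid];                     % predictors: [x value, dataset id]
A3 = [-5; 1; 3; 4];               % known parameter, one value per dataset
b = nlinfit(X,Y,@subfun,ones(1,3))
% Note: A0 and A1 enter the model only as a sum, so only b(1)+b(2) is determined.
line(t,b(1)+b(2)+b(3)*t+A3(1),'color','r');
line(t,b(1)+b(2)+b(3)*t+A3(2),'color','g');
line(t,b(1)+b(2)+b(3)*t+A3(3),'color','b');
line(t,b(1)+b(2)+b(3)*t+A3(4),'color','c');

function yfit = subfun(param,X)
T = X(:,1);          % time
dsid = X(:,2);       % dataset id
A0 = param(1);
A1 = param(2);
A2 = param(3);
A3 = [-5; 1; 3; 4];  % known parameter
yfit = A0 + A1 + A2*T + A3(dsid);  % A3(dsid) picks the offset for each row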

7 Comments

Thanks for this. Unfortunately, I am still having trouble applying it to my problem. What code should I use in the main m-file to call these functions? (I am unsure what values need to be passed to param and X in the subfun function.)
Hi Jack,
Refer to the nlinfit documentation. In nlinfit, we pass a function handle to the nonlinear regression model function, which must take two arguments, a coefficient vector and an array X, in that order. The coefficient vector (param in our case) holds the parameters we want to estimate; the last argument of the nlinfit call supplies their initial values. X should be the data set. Here X is an array containing the data and dsid, which determines which value of the known parameter to use. So in your case, define X like this:
T = [d1 ;d2; d3;d4];% d1 , d2 ,d3 ,d4 are your data sets
dsid = [ones(11,1); 2*ones(11,1); 3*ones(11,1);4*ones(11,1)];
X = [T dsid];
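To make the role of dsid concrete, here is a small sketch with made-up numbers (the dataset lengths `n` and the values in `tauKnown` are hypothetical). The second column of X carries a dataset id for every row, and indexing the known-parameter vector with that column expands one value per dataset into one value per observation.

```matlab
% Hypothetical example: three datasets of different lengths.
n        = [3 2 4];                    % number of points in each dataset
tauKnown = [10; 20; 30];               % one known value per dataset (made up)

% Dataset id for every stacked row: 1 1 1 2 2 3 3 3 3
dsid = repelem((1:numel(n))', n(:));

% Stacked x values (here simply 1..n(k) for dataset k)
T = [(1:n(1))'; (1:n(2))'; (1:n(3))'];
X = [T dsid];                          % what nlinfit hands to the model function

% Inside the model function, this picks the right known value for each row
tauPerRow = tauKnown(X(:,2));
disp([X tauPerRow])
```

Because `tauPerRow` has the same number of rows as the stacked data, it can be combined with `X(:,1)` using elementwise operators.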
Unfortunately, I am still getting errors.
Error using nlinfit (line 219)
MODELFUN must be a function that returns a vector of fitted values the same size as Y (160-by-1). The model function you provided returned a result that was 160-by-4.
One common reason for a size mismatch is using matrix operators (*, /, ^) in your function instead of the corresponding elementwise operators (.*, ./, .^).
Error in sharedparams (line 18)
b = nlinfit(X,Y,@subfun,ones(1,4))
Error in regression (line 17)
sharedparams
See attached for m-files and data. Here's what I've done so far.
I presume I should create a main m-file to call the functions sharedparams and subfun (which are each contained in separate function files).
MAIN .m file:
clc
clear all
close all
d1 = xlsread('data.xlsx', 'Sheet1', 'G4:G43'); % y data 1
d2 = xlsread('data.xlsx', 'Sheet1', 'M4:M43'); % y data 1
d3 = xlsread('data.xlsx', 'Sheet1', 'AG4:AG43'); % y data 1
d4 = xlsread('data.xlsx', 'Sheet1', 'AM4:AM43'); % y data 1
T = [d1 ;d2; d3;d4];% d1 , d2 ,d3 ,d4 are your data sets
dsid = [ones(40,1); 2*ones(40,1); 3*ones(40,1);4*ones(40,1)]; % 40 data points in each data set (hence, ones(40,1)
X = [T dsid];
%%%% FUNCTION CALL TO sharedparams
sharedparams
param = [0.01, 1, 900]; % intial estimates of A1, A2 and tau
%%%% Function call to subfun
yfit = subfun(param,X)
sharedparams function:
function sharedparams
t = xlsread('data.xlsx', 'Sheet1', 'F4:F43'); % x data
y1 = xlsread('data.xlsx', 'Sheet1', 'G4:G43'); % y data 1
y2 = xlsread('data.xlsx', 'Sheet1', 'M4:M43'); % y data 1
y3 = xlsread('data.xlsx', 'Sheet1', 'AG4:AG43'); % y data 1
y4 = xlsread('data.xlsx', 'Sheet1', 'AM4:AM43'); % y data 1
% t = (0:10)';
T = [t; t; t;t];
Y = [y1; y2; y3; y4];
% Y = 3 + [exp(-t/2); 2*exp(-t/2); 3*exp(-t/2);4*exp(-t/2)] + randn(44,1)/10
dsid = [ones(40,1); 2*ones(40,1); 3*ones(40,1);4*ones(40,1)];
gscatter(T,Y,dsid)
X = [T dsid];
tau = [63085.05; 1525.3; 1601.8; 62465.28756];
b = nlinfit(X,Y,@subfun,ones(1,4))
line(t,b(1)+b(2)+b(3)*t+tau(1),'color','r');
line(t,b(1)+b(2)+b(3)*t+tau(2),'color','g');
line(t,b(1)+b(2)+b(3)*t+tau(3),'color','b');
line(t,b(1)+b(2)+b(3)*t+tau(4),'color','c');
subfun function:
function yfit = subfun(param,X)
T = X(:,1); % time
dsid = X(:,2); % dataset id
A1 = param(1);
A2 = param(2);
gamma = param(3);
tau = [63085.05; 1525.3; 1601.8; 62465.28756]; %known paramter
% yfit = A0 + A1+ A2*T + A3(dsid);
yfit = A1 * log(1 + T/tau) + A2 * log(1 + (T/tau)*(1/gamma))
Hi Jack,
I see a couple of mistakes here. First, you don't need to create a separate .m file; you can use the code I provided without wrapping it in a function (there is no need for sharedparams). Also, you don't need to call subfun yourself, because nlinfit calls it.
Second, you should pass ones(1,3) instead of ones(1,4), as there are only 3 parameters to be estimated.
Finally, please make sure you understand why we are using dsid: it decides which value of tau is chosen for each data point, so tau should be replaced by tau(dsid). Before moving forward, please try to understand what I did in my code.
Thanks
Deepak
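The points above can be put together in a single script. The following is a sketch only: the data are simulated rather than read from data.xlsx, and the values of `tau`, `pTrue` and the starting guess `p0` are made up for illustration. The model is written as an anonymous function, so no separate function file is needed, and `nlinfit` is the only thing that calls it during the fit.

```matlab
% Simulated stand-in for the spreadsheet data (all numbers are made up).
rng(0);                                   % reproducible noise
t    = linspace(1, 1000, 40)';            % 40 x values per dataset
tau  = [50; 100; 200; 400];               % known parameter, one per dataset
T    = [t; t; t; t];                      % stacked x values
dsid = repelem((1:4)', 40);               % dataset id for every row
X    = [T dsid];

% Shared parameters p = [A1 A2 gamma]; tau is selected per row via the id.
model = @(p, X) p(1)*log(1 + X(:,1)./tau(X(:,2))) ...
              + p(2)*log(1 + X(:,1)./(p(3)*tau(X(:,2))));

% Simulate noisy observations from hypothetical "true" parameters.
pTrue = [0.5 2 10];
Y     = model(pTrue, X) + 0.01*randn(size(T));

% Three unknowns, so three starting values.
p0 = [1 1 5];
b  = nlinfit(X, Y, model, p0);

% Overlay the fitted curve for each dataset on the data.
gscatter(T, Y, dsid); hold on
for k = 1:4
    Xk = [t, k*ones(size(t))];            % x values tagged with dataset k
    plot(t, model(b, Xk), 'k-');
end
hold off
```

The starting value for gamma is deliberately not 1: at gamma = 1 the two logarithmic terms coincide, which makes A1 and A2 indistinguishable at the first iteration.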
Hi @Deepak Meena, thank you for all your help so far.
I have corrected all of the mistakes that you pointed out. I also now understand the purpose of dsid (it's like an ID for each dataset, identifying it as 1, 2, 3, or 4). I have gone through every line of my code and your code, running each line individually, so I understand its purpose.
Unfortunately, I am still getting an error (something to do with a mismatch between the dimensions of Y and the output of the model function I provided). See the error below. I don't see where I'm going wrong, as I have followed your methodology exactly and have debugged each line without finding a solution. If you could point out my mistake, I would really appreciate it. Thanks.
ERROR:
Error using nlinfit (line 219)
MODELFUN must be a function that returns a vector of fitted values the same size as Y (160-by-1). The model function you provided returned a result that was 160-by-160.
One common reason for a size mismatch is using matrix operators (*, /, ^) in your function instead of the corresponding elementwise operators (.*, ./, .^).
Error in untitled (line 18)
b = nlinfit(X,Y,@subfun,ones(1,3))
CODE:
clc
clear all
close all
t = xlsread('data.xlsx', 'Sheet1', 'F4:F43'); % x data
y1 = xlsread('data.xlsx', 'Sheet1', 'G4:G43'); % y data 1
y2 = xlsread('data.xlsx', 'Sheet1', 'M4:M43'); % y data 1
y3 = xlsread('data.xlsx', 'Sheet1', 'AG4:AG43'); % y data 1
y4 = xlsread('data.xlsx', 'Sheet1', 'AM4:AM43'); % y data 1
T = [t; t; t; t] % vector of x data sets
Y = [y1; y2; y3; y4] % vector of y data sets
dsid = [ones(40,1); 2*ones(40,1); 3*ones(40,1);4*ones(40,1)] % ones(40,1) is a 40 x 1 array of 1's
gscatter(T,Y,dsid)
X = [T dsid] %%%%%%%%
tau = [63085.05; 1525.3; 1601.8; 62465.3]
b = nlinfit(X,Y,@subfun,ones(1,3))
line(t,b(1) * log(1 + t/tau(1)) + b(2) * log(1 + (t/tau(1))*(1/b(3))), 'color', 'r');
line(t,b(1) * log(1 + t/tau(2)) + b(2) * log(1 + (t/tau(2))*(1/b(3))), 'color', 'g');
line(t,b(1) * log(1 + t/tau(3)) + b(2) * log(1 + (t/tau(3))*(1/b(3))), 'color', 'b');
line(t,b(1) * log(1 + t/tau(4)) + b(2) * log(1 + (t/tau(4))*(1/b(3))), 'color', 'c');
function yfit = subfun(param,X)
T = X(:,1) % time
dsid = X(:,2); % dataset id
A1 = param(1);
A2 = param(2);
gamma = param(3);
tau = [63085.05; 1525.3; 1601.8; 62465.28756]; %known paramter
yfit = A1 * log(1 + T/tau(dsid)) + A2 * log(1 + (T/tau(dsid))*(1/gamma));
end
You have:
yfit = A1 * log(1 + T/tau(dsid)) + A2 * log(1 + (T/tau(dsid))*(1/gamma));
You should have:
yfit = A1 * log(1 + T./tau(dsid)) + A2 * log(1 + (T./tau(dsid))*(1/gamma));
You want element-by-element division, not vector division in the sense of a matrix operation.
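The difference is easy to see from the sizes alone. The short sketch below uses small made-up vectors rather than the data from the thread, and reproduces the shape of both errors reported above: dividing by an unindexed 4-element tau with `/` gives one column per tau value, and dividing by tau(dsid) with `/` gives a square matrix.

```matlab
T    = (1:6)';                  % 6-by-1 column of x values (made up)
tau4 = [2; 4; 8; 16];           % 4 known values, not yet indexed by dsid
tau  = [2; 4; 8; 2; 4; 8];      % 6-by-1 column, same length as T

A = T / tau4;                   % matrix right division: 6-by-4
B = T / tau;                    % matrix right division: 6-by-6
C = T ./ tau;                   % elementwise division:  6-by-1

disp(size(A))
disp(size(B))
disp(size(C))
```

Only the elementwise result has the same size as the stacked Y vector, which is what nlinfit requires from the model function.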


More Answers (1)

  1. (20%) Nonlinear Regression.
The data presented below follows the nonlinear functional relationship 𝑦 = 𝑥/(𝑎 + 𝑏𝑥), where 𝑎 and 𝑏 are the nonlinear model parameters.
Use MATLAB to complete the following:
Problem 2.1. (5%) Linearize the dataset and perform linear regression on the linearized dataset. Display the slope and intercept/offset of the linear regression model, as well as its coefficient of determination.
Problem 2.2. (5%) Provide a plot that overlays the linearized dataset (i.e., linearized 𝑦 data vs linearized 𝑥 data) and the linear regression model obtained in problem 2.1. Display the linear regression model and its coefficient of determination in a legend. Include appropriate axes labels and axes grid lines.
Problem 2.3. (5%) Determine the values of the nonlinear model parameters 𝑎 and 𝑏 using the slope and intercept/offset of the linear regression model obtained in problem 2.1. Display the values of 𝑎 and 𝑏.
Problem 2.4. (5%) Provide a plot that overlays the original dataset and the nonlinear regression model obtained in problem 2.3. Display the nonlinear regression model and its coefficient of determination in a legend. Include appropriate axes labels and axes grid lines.
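For orientation, the linearization this problem refers to follows from inverting the model: 1/y = a*(1/x) + b, so a straight-line fit of 1/y against 1/x has slope a and intercept b. The sketch below illustrates only that step. The dataset from the post is not shown, so the data here are made up (generated without noise from a = 2 and b = 0.5), and the variable names are hypothetical.

```matlab
% Made-up data from y = x/(a + b*x) with a = 2, b = 0.5 (not the posted dataset).
x = (1:10)';
y = x ./ (2 + 0.5*x);

% Linearize: 1/y = a*(1/x) + b
u = 1 ./ x;
v = 1 ./ y;

% Straight-line fit: c(1) is the slope, c(2) is the intercept
c = polyfit(u, v, 1);
a = c(1);
b = c(2);

% Coefficient of determination of the linear fit
vfit = polyval(c, u);
R2   = 1 - sum((v - vfit).^2) / sum((v - mean(v)).^2);

fprintf('a = %.4f, b = %.4f, R^2 = %.4f\n', a, b, R2);
```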

Release

R2020b

Asked:

on 19 Feb 2021

Answered:

on 1 Jul 2025
