Parameter Estimation for a System of Differential Equations

Hello. I am new to matlab and I have a set of kinetic equations that need to be solve. The reaction has 1 reactant in which it will decompose into 3 other product over time. I have the experimental results that shows the degradation of reactant and the concentration of products over a period of time, as below. The differential equations I need to solve are also as below.
Reaction Kinetics:
d[A]/dt = -k1[A]-k2[A]
d[B]/dt = k1[A]-k3[B]-k4[B]
d[C]/dt = k2[A]+k3[B]
d[D]/dt = k4[B]
Experimental result
Time (min) / Conc (g/L) [A] [B] [C] [D]
0 1.000 0 0 0
20 0.8998 0.0539 0.0039 0.0338
30 0.8566 0.0563 0.00499 0.0496
60 0.7797 0.0812 0.00715 0.0968
90 0.7068 0.0918 0.00937 0.1412
120 0.6397 0.0989 0.01028 0.1867
I have browse through various code and the one solved StarStrider for Igor Moura shows the result that I wish to achieve where plotted is the graph of the concentrations of reactant and products over time as well as solving the reaction rate constants "ki". However I have modified the code according to my equations and the graph came out weird and the results are not fitted well into with the experimental results. Can someone help me with this? Attach below is the code I have tried out.
function Igor_Moura
% 2016 12 03
% NOTES:
%
% 1. The ‘theta’ (parameter) argument has to be first in your
% ‘kinetics’ funciton,
% 2. You need to return ALL the values from ‘DifEq’ since you are fitting
% all the values
function C=kinetics(k,t)
c0=[1;0;0;0];
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-k(1).*c(1)-k(2).*c(1);
dcdt(2)= k(1).*c(1)-k(3).*c(2)-k(4).*c(2);
dcdt(3)= k(2).*c(1)+k(3).*c(2);
dcdt(4)= k(4).*c(2);
dC=dcdt;
end
C=Cv;
end
t=[20
30
60
90
120];
c=[0.91257 0.02086 0.00939 0.00725
0.88232 0.02531 0.01664 0.00897
0.83324 0.02882 0.03927 0.01195
0.76289 0.03137 0.06834 0.01514
0.70234 0.03380 0.10542 0.01679 ];
k0=[1;1;1;1];
[k,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,k0,t,c);
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(k)
fprintf(1, '\t\k(%d) = %8.5f\n', k1, k(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(k, tv);
figure(1)
plot(t, c, 'p')
hold on
hlp = plot(tv, Cfit);
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, 'C_1(t)', 'C_2(t)', 'C_3(t)', 'C_4(t)', 'Location','N')
end

7 Comments

According to your differential equations the total amount of C should be conserved. In fact c(1)+c(2)+c(3)+c(4)=1.
However, that isn't true of the data to which you are fitting the equations. Some material is lost with time. The amount that's lost is larger than some of the values remaining!
It suggests there should be another term, c(5), somewhere.
At t=0, the total concentration is 1. There could be measuremnt inaccuracies. We know nothing about the system being described other than what was posted. There could be an elimination pathway, however we have no idea what it is or from what compartment. It does not appear to be part of the current model.
By t = 120 the discrepancy is almost as large as the total concentration in B, C and D combined, suggesting another pathway somewhere (or a lousy measurement system!).
I seriously doubt that we’re seeing the whole system.
According to the model the sum of the concentrations is 1 for all time, not just at t = 0. To see this, add the four rates. They sum to zero, which means the sum of the concentrations is constant. (In fact it's not difficult to solve the equatons analytically). So, yes, I agree, we're not seeing the whole system.
Sorry for the late reply. Indeed the experimental result do not show the whole system as the experiment was of a decomposition reaction that forms multiple product and side product. The products were analyze using mass spectrometry and only significant products concentration was taken into consideration.
Daphne Deidre Wong — Thank you!

Sign in to comment.

 Accepted Answer

Using this version of ‘kinetics’ (that also estimates the initial conditions for ‘DifEq’),
function C=kinetics(k,t)
c0=k(5:8);
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-k(1).*c(1)-k(2).*c(1);
dcdt(2)= k(1).*c(1)-k(3).*c(2)-k(4).*c(2);
dcdt(3)= k(2).*c(1)+k(3).*c(2);
dcdt(4)= k(4).*c(2);
dC=dcdt;
end
C=Cv;
end
I got a good fit with these parameters (note that the first 4 are ‘k(1)’ through ‘k4’ and the last 4 are the initial conditions):
Rate Constants:
Theta(1) = 0.00180
Theta(2) = 0.00049
Theta(3) = 0.02622
Theta(4) = 0.01094
Theta(5) = 0.89995
Theta(6) = 0.00011
Theta(7) = 0.00213
Theta(8) = 0.00436
producing this plot:
I used the ga (genetic algorithm) function to find them. I am cleaning up the code, and will post the entire code file in a few minutes.
EDIT — (22 Sep 2020 at 14:36)
The complete (cleaned) code:
function Daphne_Deidre_Wong_GA
% 2020 09 22
% NOTES:
%
% 1. The ‘k’ (parameter) argument has to be first in your
% ‘kinetics’ function,
% 2. You need to return ALL the values from ‘DifEq’ since you are fitting
% all the values
function C=kinetics(k,t)
c0=k(5:8);
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-k(1).*c(1)-k(2).*c(1);
dcdt(2)= k(1).*c(1)-k(3).*c(2)-k(4).*c(2);
dcdt(3)= k(2).*c(1)+k(3).*c(2);
dcdt(4)= k(4).*c(2);
dC=dcdt;
end
C=Cv;
end
format long
t=[ 20
30
60
90
120];
c=[0.91257 0.02086 0.00939 0.00725
0.88232 0.02531 0.01664 0.00897
0.83324 0.02882 0.03927 0.01195
0.76289 0.03137 0.06834 0.01514
0.70234 0.03380 0.10542 0.01679];
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 8;
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-5, 'MaxGenerations',2E3, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts)
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime(num2str(GA_Time,'%.6f'),'InputFormat','ss.SSSSSS', 'Format','HH:mm:ss.SSS')
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, 'C_1(t)', 'C_2(t)', 'C_3(t)', 'C_4(t)', 'Location','E')
format short eng
end
.

30 Comments

Thank you star strider you've been such a great help!
May I ask the k5-k8 are initial conditions of what?
My pleasure!
They are the intial conditions for the ‘DifEq’ differential equations.
If you want to use your original code, just use ‘Theta(1)’ through ‘Theta(4)’ for ‘k0’. Those should work, although the results will be slightly different than the ga estimates for those parameters.
If my Answer helped you solve your problem, please Accept it!
.
How do I extract the concentrations data calculated against the different time from the solved code?
They are calculated as:
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
and then plotted.
You can also use the original ‘t’ vector:
Cfit = kinetics(theta, t);
to get the values at those times. It will closely resemble the original ‘c’ matrix.
Sorry if my question were confusing. I don’t know how to extract the datas from the plot figure. I have tried to do it with the xlsxwrite() command but it didn’t went thru.
No worries!
Since you are new to MATLAB, a project this complicated would be daunting.
There is no need to extract the values from the plot figure (although that is possible) since my code calculates everything. It plots the time vector ‘tv’ and the calculated model fit ‘Cfit’, in order to provide a smoother set of concentration curves than would be provided by using only the times with associated with the data. Use those, or use your original time vector ‘t’ to get the concentrations at only those times.
Hi star strider. I was thinking if I were to include this base code into my thesis would there be any copyright issue? If i were to modify the code and add another reaction to the kinetics, which parameter should I be aware?
Hi, Daphne!
The only suggestion I have with respect to copyright is parts c. and d. from the Content (section 2) of MATLAB Central Terms of Use.
If you have questions beyond that, use the Contact Us telephone handset icon in the upper right corner of this page. I suppose this would be a Technical Support problem. Another option is to get legal assistance from your university. Copyright law (and most other aspects of the law) are very far from my areas of expertise.
With respect to adding another reaction, since I have no idea what the system is or what the parameters or compartments are, I cannot determine that. My modeling of such systems is mostly with respect to endocrine and metabilic systems, where the pathways are known and the only unknowns are some of the concentrations (in a partially ‘uncontrollable’ and ‘unobservable’ system), and most of the parameters. I may be able to help if I understand more about the system you are modeling. This assumes that I have sufficient knowledge of it to understand how to include the additional reaction.
Hi Star Strider, I have tried to alter the code by adding another reaction but as expected there's an error mainly due to the vector length. Is it okay for you to help me have a look into it in discreet? perhaps if I could personal message u with the file exchange?
I discourage personal messages, because that deprives others here of following these discussions.
You can possibly add another reaction without having to fit data to it. For example if you want to add an elimination reaction (and its associated parameter), you only need to add it to a specific compartment to estimate the parameter for it. (You have to know from what compartment it is being eliminated.)
You have 6 data points (including time 0), so adding another () parameter to estimate will not cause your model to fail by estimating more parameters than you have data. It is not possible to add another differential equation however, since the number of compartments you can model is fixed by the number of columns in your data.
Hello star strider. I have tried adding another 5th parameter, k5 as you have said. The graph result does not show a nice fit to the data points which is understandable. I actually do have the data point for the species number 5 as well but due to the vector length constraint I'm unable to fit it. Was wondering if i can alter the vector length. Yes i do understand where you're coming from with the open discussion, but this is my thesis and im supposedly not suppose to use a publicly discussed code. I know you have no obligations to help me at all but due to the time constraint and my supervisor not being helpful at all, I have reached out to this portal for help. I deeply appreciate your generousity so far.
It is my pleasure to help you with this.
I understand about your desire for confidentiality. However, this is already public.
There is no constraint on adding more data, another parameter, and and an additional differential equation to the system. If you have more data for species 5, add it as the fiftth column of ‘c’, and add the differential equation for it as ‘dcdt(5)’ in ‘DifEq’. Also change the ‘c0’ assignment to add the additional initial condition. Add ‘k(5)’ wherever it is appropriate, and increase ‘Parms’ by 1. Change the legend arguments in the plot call. That should be everything you need to do to change the code to accommodate the new data and parameter. The code will adapt to those changes.
I have made the changes accordingly and its able to run without the vector error but the errow below keeps showing up and I have tried varying the time string but it does not work?
Copying that assignment from the code I posted earlier, it should be:
DT_GA_Time = datetime(num2str(GA_Time,'%.6f'),'InputFormat','ss.SSSSSS', 'Format','HH:mm:ss.SSS')
I have no idea wher all the additional ‘S’ came from. I did not write them. .
Sorry. I made alteration to those trying if it make any changes. but even with the original code the datetime error appears.
That was not designed to accommodate more than 99 seconds.
Use this instead:
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
This will accommodate essentially everything in ‘GA_Time’ that etime can calculate, up to 86399.999999, before it overflows into the next day. It will not throw an error even then, however the results may appear strange.
Oh yes the results are very odd. Is there a way to minimally increment the time limit to obtain one that can accomodate the datas?
I do not understand what you ware asking.
It might be possible to interpolate the data to a time vector with finer resolution (more points in the same time limits), however this introduces ‘new data’ that are not part of the original data. I would definitely not do that.
If the results are strange, examine your model to be certain it is appropriate for your data.
Sorry for the confusion. I just altered the code as you have suggested above but faced with the complication with that time constraint. Is it okay for u to have a one last look at my code? The differential equations of kinetics really did not have a huge difference as to the one suggested above.
If you want to post your code, I will of course look at it and do my best to answer whatever questions you have about it.
I do not understand about ‘time constraint’ however. Please explain that in a bit more detail.
Currently the code that i have modify is shown below. However, the problem I have with it is the datetime error.
function Daphne_Deidre_Wong_GA_V1
% 2020 09 22
% NOTES:
%
% 1. The ‘k’ (parameter) argument has to be first in your
% ‘kinetics’ function,
% 2. You need to return ALL the values from ‘DifEq’ since you are fitting
% all the values
function C=kinetics(k,t)
c0=k(5:9);
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(5,1);
dcdt(1)=k(1).*c(2)-k(2).*c(1)-k(3).*c(1)-k(4).*c(1);
dcdt(2)= k(2).*c(1)-k(1).*c(2)-k(5).*c(2)-k(6).*c(2);
dcdt(3)= k(3).*c(1)+k(5).*c(2);
dcdt(4)= k(6).*c(2);
dcdt(5)= k(4).*c(1);
dC=dcdt;
end
C=Cv;
end
format long
t=[ 20
30
60
90
120];
c=[0.89978 0.05385 0.00387 0.03384 0.00499
0.85655 0.05634 0.00499 0.0496 0.00665
0.77968 0.08119 0.00715 0.09682 0.00929
0.70684 0.09177 0.00937 0.14118 0.0114
0.63967 0.09893 0.01028 0.18666 0.01277];
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 9;
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-5, 'MaxGenerations',2E3, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts)
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime(num2str(GA_Time,'%.6f'),'InputFormat','ss.SSSSSS', 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, 'C_1(t)', 'C_2(t)', 'C_3(t)', 'C_4(t)', 'C_5(t)', 'Location','E')
format short eng
end
I fixed the datetime problem 5 days ago in my previous Comment!
Change ‘DT_GA_Time’ to this:
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
and the problem ino longer exists.
Hello Star Strider,
I am sorry to bother you again. I was thinking if you can help me take a look at the code below as the curve have strayed quite a lot from the data and I do not know how to fix it. I am so sorry for the inconveniences, but I really need this help.
function Gluc_200
% 2020 09 22
% NOTES:
%
% 1. The ‘k’ (parameter) argument has to be first in your
% ‘kinetics’ function,
% 2. You need to return ALL the values from ‘DifEq’ since you are fitting
% all the values
function C=kinetics(k,t)
c0=k(5:8);
[T,Cv]=ode45(@DifEq,t,c0);
%
function dC=DifEq(t,c)
dcdt=zeros(4,1);
dcdt(1)=-k(1).*c(1)-k(2).*c(1)-k(3).*c(1);
dcdt(2)= k(1).*c(1)-k(4).*c(2);
dcdt(3)= k(2).*c(1)+k(4).*c(2);
dcdt(4)= k(3).*c(1);
dC=dcdt;
end
C=Cv;
end
format long
t=[ 20
40
60
90
100];
c=[0.59511 0.12077 0.01160 0.01548
0.39150 0.08768 0.01284 0.01784
0.28869 0.05067 0.01251 0.01633
0.19520 0.03158 0.01158 0.01317
0.15019 0.02249 0.00883 0.01141];
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 8;
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-5, 'MaxGenerations',2E3, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts)
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime(num2str(GA_Time,'%.6f'),'InputFormat','ss.SSSSSS', 'Format','HH:mm:ss.SSS')
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(theta)
fprintf(1, '\t\tTheta(%d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, 'C_1(t)', 'C_2(t)', 'C_3(t)', 'C_4(t)', 'Location','E')
format short eng
end
I worked with this and I can’t get better results that you’re getting. I would try several runs with the ga function and see if it can do better.
Otherwise, your model and your data may not be an appropriate match. (I am not certain what you are doing.) If you want to share your model (preferably a PDF version of it) and some details about your data, perhaps I can come up with something that will provide a better fit.
Attached below is the summary of the work done so far. Thank you in advance.
Sorry to bother you, The initial assignment of c0 can only be equal to the number of column vectors. If I want to estimate more parameter k, how can I look at it?
The code estimates the kinetic parameters in the ‘k’ vector, and additionally the initial conditions as ‘c0’. Those must match the parameters in the system you are modeling.
Thank you for your answer. Your program is very helpful to me.
sorry to bother you again, I successfully made graphs using your program, but how should I separate the graphs (Make a picture for each substance)?
Use subplot, tiledlayout, or stackedplot. (I believe that exhausts the options.)
A Vote would be appreciated!

Sign in to comment.

More Answers (1)

You could also consider using SimBiology which provides parameter estimation capabilities for ODE systems.
Here is a short video tutorial.

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!