# How do I create a mathematical model from given data using MATLAB?

Akinyemi on 8 Feb 2023
Commented: Walter Roberson on 8 Feb 2023
Given population data, we are required to predict the future population. How do I create a mathematical model from the data using MATLAB?
##### 2 Comments
Walter Roberson on 8 Feb 2023
I see that you edited your question to refer to population data. Unfortunately, the fact that it has to do with population does not change anything about my answer showing that You Can't Do That. Not unless you need the model to be strictly integer valued, in which case different approaches need to be taken for the proof.

Walter Roberson on 8 Feb 2023
It can be shown by constructive proof that, given a finite set of distinct input coordinate lists of finite values with finite representation and a corresponding list of finite output values with finite representation, there exists a multinomial that exactly fits the data, provided that the calculations can be carried out with sufficient precision. In particular, in the case of a single independent variable, there exist deterministic methods to calculate a corresponding polynomial.
Therefore in theory you could fit a polynomial to the data and the polynomial would be the model.
In practice, when you carry out the calculations in bounded precision such as double precision, the results can only approximate the coefficients, and the error gets large fairly quickly as the degree increases. Degree 7 is the outer limit of usability in most cases.
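As a rough illustration of both points, the sketch below fits a polynomial of degree `numel(t) - 1` through a handful of made-up data points and then looks at how the conditioning of the underlying Vandermonde system changes with the number of points. The data values and variable names are assumptions chosen for the example, not anything from the question.

```matlab
% Hypothetical data: 5 samples of some unknown process
t = [0 1 2 3 4];
y = [2.0 2.9 4.1 6.2 8.8];

% A polynomial of degree numel(t)-1 passes through every data point
p = polyfit(t, y, numel(t) - 1);
maxResidual = max(abs(polyval(p, t) - y));    % rounding-level error only

% The Vandermonde system behind the fit becomes ill-conditioned
% as the number of points (and so the degree) grows
condLow  = cond(vander(linspace(0, 1, 5)));   % degree 4
condHigh = cond(vander(linspace(0, 1, 15)));  % degree 14

fprintf('max residual at data points: %g\n', maxResidual);
fprintf('condition number, 5 points: %g, 15 points: %g\n', condLow, condHigh);
```

A large condition number means small rounding errors in the data or arithmetic can produce large errors in the computed coefficients, which is why high-degree fits stop being usable in double precision.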
Now suppose you find the distances between adjacent input coordinates, and find a number F that divides all of the distances exactly. This is always possible if the coordinates are specified to finite precision (the coordinates are not repeating decimals or irrationals such as square roots). For example, if the input coordinates are all between 1 and 2 and they are double precision, then the worst-case common divisor of the distances is 2^-52. Now you can express every input coordinate as the first coordinate plus an integer multiple of F.
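Now create a sine term that is zero at every input coordinate: sin(pi*(t - t1)/F), where t1 is the first input coordinate. If every coordinate is itself an integer multiple of F, the shift can be dropped, giving sin(pi*t/F).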
Now take that sine wave and add it to the polynomial you constructed earlier. By construction, the sine adds zero at every one of the given input points, and therefore at those points the sum gives the same output as the polynomial. You now have two different models that predict exactly the same values at all input points. This proves that the model created from the polynomial cannot be a unique model that fits all of the data. By multiplying the frequency of the sine wave by any integer, you can construct any number of different models that "explain" the data equally well.
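The construction can be checked numerically. The following is a minimal sketch with made-up inputs whose gaps are all multiples of `F = 0.5`; the data and the names `modelA` and `modelB` are assumptions for illustration only.

```matlab
% Hypothetical inputs with finite binary representation
t = [0 0.5 1.5 2 3.5];
y = [2.0 2.4 3.9 4.1 7.3];

F = 0.5;                        % divides every gap between inputs
k = round((t - t(1)) / F);      % each input is t(1) + k*F with integer k

% Model A: the interpolating polynomial
p = polyfit(t, y, numel(t) - 1);
modelA = @(x) polyval(p, x);

% Model B: the same polynomial plus a sine that vanishes at every input
modelB = @(x) polyval(p, x) + sin(pi * (x - t(1)) / F);

gapAtData  = max(abs(modelA(t) - modelB(t)));    % rounding-level only
gapBetween = abs(modelA(0.25) - modelB(0.25));   % sin(pi/2), about 1

fprintf('largest difference at the data points: %g\n', gapAtData);
fprintf('difference halfway between two points: %g\n', gapBetween);
```

Both models reproduce the data to within rounding error, yet they disagree by about 1 at `x = 0.25`, so the data alone cannot tell them apart.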
With there being an infinite number of different models that all explain the data "exactly", the probability that any particular one of them is the "right" model is 1/infinity, which is zero.
Therefore there is no way, given only inputs and corresponding outputs, to calculate "the" corresponding model.
The best you can do is to examine a finite list of different model forms with undetermined coefficients, fit coefficients, calculate error, and select the form that gives the least error.
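A minimal sketch of that procedure, using only base MATLAB functions, might look like the following. The population figures are invented for the example, the three candidate forms are an arbitrary short list, and the exponential form is fitted by a straight-line fit to `log(y)`, which is a convenient shortcut rather than a true least-squares fit in the original units.

```matlab
% Hypothetical population data
% (t in decades since the first census, y in millions)
t = [0 1 2 3 4 5];
y = [10.0 12.1 14.8 17.9 21.7 26.4];

% Fit the undetermined coefficients of each candidate form
pLin  = polyfit(t, y, 1);        % y = a*t + b
pQuad = polyfit(t, y, 2);        % y = a*t^2 + b*t + c
pExp  = polyfit(t, log(y), 1);   % log(y) = r*t + log(y0), so y = y0*exp(r*t)

names  = {'linear', 'quadratic', 'exponential'};
models = {@(x) polyval(pLin, x), ...
          @(x) polyval(pQuad, x), ...
          @(x) exp(polyval(pExp, x))};

% Sum of squared errors of each form, measured in the original units
sse = cellfun(@(f) sum((f(t) - y).^2), models);
[~, best] = min(sse);

for i = 1:numel(names)
    fprintf('%-12s SSE = %.4g\n', names{i}, sse(i));
end
fprintf('smallest error: %s\n', names{best});

% Extrapolate with the selected form, e.g. two decades past the last census
predicted = models{best}(7);
```

Note that a form with more free coefficients can never do worse on the fitted data than a form it contains (the quadratic always matches or beats the straight line here), so the smallest error on its own does not show that a form is the right one.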
Unfortunately, in practice it is common, when testing that kind of approach with known models, to discover that even fairly small noise can lead to the "wrong" model winning.