Find nonlinear function to optimize parameters
Matthew Blomquist
on 2 Nov 2021
Commented: Matthew Blomquist
on 2 Nov 2021
Hello,
I'm trying to optimize a large dataset that contains 36 predictors and 1 response variable. To optimize, I am using fminsearchbnd, which I found on the MathWorks File Exchange. However, I don't know the best formula/function to use for the optimization (e.g., coefficients, highest order, etc.). I tried using fitlm with linear, squared, and interaction terms between all 36 predictors, but the fitted function isn't great, and the predicted response goes below 0 (which it shouldn't, because it is an RMSE). It should be a nonlinear function, but I don't know of what form.
Is there a function / toolbox I could use to find the formula/function to optimize the predictor variables so that the response variable (RMSE) is minimized?
Thank you in advance!
Accepted Answer
Walter Roberson
on 2 Nov 2021
Is there a function / toolbox I could use to find the formula/function to optimize the predictor variables so that the response variable (RMSE) is minimized?
No.
It can be proven mathematically (and I have personally posted proofs in the past) that any finite set of points of finite precision can be exactly fitted (to within round-off error) by an uncountable infinity of different formulas. If a program were to pick one of the formulas, then the probability that it picked the "right" formula would be 1/infinity, which is 0.
If you do not have a restricted set of possible forms, then there is no possible program that can find the "right" form of the equation.
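To make the non-uniqueness concrete, here is a minimal sketch with five made-up points (the data, the helper names f1, f2, q, and the constant c are all assumptions for illustration). It builds one exact fit with polyfit, then adds a term that vanishes at every sample point, so a second, different formula fits the same data equally well:

```matlab
% Five sample points (hypothetical data, chosen only for illustration).
x = [0 1 2 3 4];
y = [1.0 2.7 5.8 6.6 7.5];

% Formula 1: the degree-4 polynomial that interpolates the five points.
p  = polyfit(x, y, 4);
f1 = @(t) polyval(p, t);

% q(t) is zero at every sample point, so adding any multiple of it to f1
% changes the curve between the points but not at them.
q  = @(t) t .* (t - 1) .* (t - 2) .* (t - 3) .* (t - 4);
c  = 0.5;                          % any real c gives another exact fit
f2 = @(t) f1(t) + c * q(t);

% Both formulas reproduce the data to within round-off ...
rmse = @(f) sqrt(mean((f(x) - y).^2));
fprintf('RMSE f1: %.2e, RMSE f2: %.2e\n', rmse(f1), rmse(f2));

% ... yet they disagree away from the sample points.
fprintf('f1(2.5) = %.4f, f2(2.5) = %.4f\n', f1(2.5), f2(2.5));
```

Since c can be any real number, this one construction alone already yields an uncountable family of formulas, and nothing in the data distinguishes between them.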
Even if you have a restricted set of possible forms, due to round-off error and noise in measurements, it is notoriously true that a form known in advance to be the "wrong" equation can end up with a lower RMSE than the "right" equation.
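A small sketch of that effect, using made-up data: the response is generated from a straight line plus hand-picked "noise" values (all numbers here are assumptions for illustration), then fitted with both a line and a cubic. Because a straight line is a special case of a cubic, the least-squares cubic cannot have a higher RMSE on the same data (up to round-off), even though the line is the form that generated it:

```matlab
% Data generated from a straight line plus fixed, made-up noise values.
x     = (0:9)';
noise = [0.3; -0.2; 0.1; -0.4; 0.2; 0.5; -0.3; 0.1; -0.1; 0.2];
y     = 2*x + 1 + noise;

% Least-squares fits of the "right" form (line) and a "wrong" form (cubic).
pLine  = polyfit(x, y, 1);
pCubic = polyfit(x, y, 3);

rmseLine  = sqrt(mean((polyval(pLine,  x) - y).^2));
rmseCubic = sqrt(mean((polyval(pCubic, x) - y).^2));

% The cubic uses its extra coefficients to chase the noise.
fprintf('RMSE line:  %.4f\n', rmseLine);
fprintf('RMSE cubic: %.4f\n', rmseCubic);
```

A lower RMSE on the fitted data is therefore not evidence that the form is correct.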
More Answers (1)
John D'Errico
on 2 Nov 2021
Edited: John D'Errico
on 2 Nov 2021
NO. Do NOT use fminsearchbnd to try to optimize a problem with 36 parameters. You will be wasting your time and mine, when you next send me a plaintive e-mail asking why it does not work.
fminsearchbnd is an overlay on fminsearch: fminsearch does the work, and the overlay applies the bound constraints. fminsearch is able to optimize problems with perhaps 6-8 parameters. Maybe 10 in a pinch. But 36 unknowns? Give me a break. It won't work. PERIOD.
What we are not told is how many data points you have. Far too often people think they don't need many data points. With too few data points, expect garbage for results no matter what. You say the dataset is large, but is it? Do you have sufficient information to reasonably estimate that many parameters?
Next, we are given no clue if the model is even reasonable for your data. Too often, people try to cram their own favorite model into their data. You can't fit a square peg into a round hole. Well, you can, but either the peg or the hole will suffer.
And, oh, it looks like you have no idea what model to use here, so you are trying to use a multinomial model (a polynomial in multiple dimensions). Expect random garbage for results with that model.
Finally, you need good starting values for a nonlinear model. A 36-dimensional search space is IMMENSE. Provide poor starting values, and expect crapola for a result. But if your model is LINEAR, as it would be if you used fitlm, then there is no reason to even bother with an iterative method like fminsearchbnd. fitlm will give you the optimal answer. It may not be a model that you like, but that is the fault of your data and your choice of model.
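As a rough illustration of why a model that is linear in its coefficients needs no iterative search, the sketch below solves an ordinary least-squares problem in one step with the backslash operator. The predictors, the coefficient values, and the variable names are made up for this example; backslash is used here only to avoid a toolbox dependency, and it is not a claim about how fitlm is implemented internally:

```matlab
% Hypothetical data: 200 observations, 3 predictors, known coefficients.
n = 200;
t = (1:n)';
X = [sin(t), cos(2*t), mod(t, 7)];   % deterministic made-up predictors
betaTrue = [1.5; -2.0; 0.5; 3.0];    % intercept first, then the 3 slopes

A = [ones(n, 1), X];                 % design matrix with intercept column
y = A * betaTrue;                    % noise-free response, for clarity

% One direct least-squares solve: no starting values, no iterations.
betaHat = A \ y;

% Largest coefficient error; should be at round-off level.
disp(max(abs(betaHat - betaTrue)));
```

The same approach covers squared and interaction terms: they only add columns to the design matrix, and the problem stays linear in the unknown coefficients.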