# Best function for fitnlm

Hi everybody,
I am working with crash data that has 14 inputs and 1 output. The output has two categories: 1. injury and 2. fatality.
The second category has few observations and my model is not able to reproduce them, so I oversampled the data. However, I cannot find a good model formula to pass to fitnlm, so the accuracy of my results is very low. Is there a specific toolbox, or a specific method, for working out the best model function or the initial betas for fitnlm?
This is how I set up the model:
fun1 = 'C15 ~ (b2*C2 + b3*C3 + b44*exp(b4*C4) + b5*C5^2 + exp(b8*C8) + b13*C13^2 + b12*C12 + b913*C9*C13 + b7*C7 + b108*C10*C8) / (1 + b90*C9)';
beta01 = zeros(1,12);   % one starting value per coefficient (12 in fun1)
% Assumes INC is a table, so the formula can refer to its variables by name
mdl1 = fitnlm(INC, fun1, beta01)
fit1 = predict(mdl1, INC);
Star Strider on 30 Oct 2020
If you are having problems getting appropriate parameter estimates, use the ga (genetic algorithm) function from the Global Optimization Toolbox, and give it several opportunities to estimate the parameters. It searches the parameter space broadly and depends far less on the starting values than a gradient-based solver does. It may take a few runs to find a good set, and a global optimum is not guaranteed, but if your model accurately represents the process that created your data, it will usually succeed.
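
A minimal sketch of that idea, using synthetic data rather than the crash data from the question. The two-parameter exponential model, the bounds lb and ub, and all variable names are assumptions made for illustration. ga minimizes the sum of squared residuals, and its result is then passed to fitnlm as the initial betas.

```matlab
% Synthetic stand-in data (hypothetical; not the crash data from the question)
rng(0);                                    % reproducible random numbers
x = linspace(0, 2, 200)';
y = 1.5*exp(0.8*x) + 0.1*randn(size(x));   % generated with b1 = 1.5, b2 = 0.8

% Model as a function handle: b is the parameter vector, x the predictor matrix
modelfun = @(b, x) b(1)*exp(b(2)*x(:,1));

% Objective for ga: sum of squared residuals for a candidate parameter vector
sse = @(b) sum((y - modelfun(b, x)).^2);

% Search inside assumed bounds; these are guesses and must suit the real model
nvars = 2;
lb = [-5 -5];
ub = [ 5  5];
opts = optimoptions('ga', 'Display', 'off');
bGA = ga(sse, nvars, [], [], [], [], lb, ub, [], opts);

% Refine the ga estimate with fitnlm, which also reports standard errors
mdl = fitnlm(x, y, modelfun, bGA);
disp(mdl.Coefficients);
```

Because ga is stochastic, repeating the search with different random seeds and keeping the run with the lowest sum of squared residuals is one way to give it the "several opportunities" mentioned above.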

Aditya Patil on 24 Dec 2020
As I understand it, you are having trouble defining the features for this classification problem.
One solution is to use other algorithms, such as deep learning, which handle feature extraction for you. On the downside, such models tend to be less explainable.
Alternatively, you can try feature selection algorithms, which let you reduce the number of inputs. Depending on how important explainability is to you, you can also use decision trees.
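
As a rough illustration of the decision tree route, the sketch below fits a classification tree with fitctree and ranks the inputs with predictorImportance. The data are synthetic and the variable names are hypothetical; only columns 1 and 3 influence the class here, so they would be expected to rank highest.

```matlab
% Synthetic stand-in data (hypothetical; not the crash data from the question)
rng(1);                                   % reproducible random numbers
n = 500;
X = randn(n, 5);                          % five candidate predictors

% Binary response driven by columns 1 and 3 only; class 1 is the rarer class
labels = double(X(:,1) + 0.5*X(:,3) + 0.3*randn(n,1) > 1);

% Fit a classification tree and estimate how much each predictor contributes
tree = fitctree(X, labels);
imp = predictorImportance(tree);          % one importance value per predictor

% Rank predictors from most to least important
[~, order] = sort(imp, 'descend');
disp(order);
```

Predictors with importance close to zero are candidates for removal before fitting a simpler, more explainable model.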