What exactly does the logp output of classify mean?

Hi all!
I am using the classify function, but I obtain positive values in the logp output. If I understand correctly, this is the logarithm of a probability and consequently shouldn't be larger than 0. Is that correct? If so, what could cause these values?
Thank you very much!

4 Comments

What values? You haven't shown any values, or code for that matter.
In my code, I want to classify a query case x_query:
[pred_query_fisher, train_err, pos_query, logp] = classify(x_query, x_train, y_train);
The values I obtain in logp look like this: 5.02668995160540 1.87672925813518 3.89646235539042 2.41511473947102 2.28310465459352 -0.264432538294149 2.07666820225971 5.02949160703484 4.17629990434130 -18.4216669707092 1.58722646584835
Thank you!
My expectation is the same as yours, Maria, but I am not an expert on this. My guess is that you are hitting some numerical instability. Are you able to post the smallest possible self-contained example that will exhibit the phenomenon?
Let's say that our query is:
query = [-0.6824 -0.0764 -0.4608 -0.0770 -0.5227]
Our training data is: train_x =
[-0.6837 -0.0789 -0.5838 -0.0436 -0.6582;
-0.5692 -0.0707 -0.5459 -0.0083 -0.5791;
-0.6475 -0.0597 -0.6075 -0.1157 -0.6768;
-0.7199 -0.0655 -0.5886 -0.1927 -0.6442;
-0.8650 -0.0616 -0.3579 -0.0563 -0.4931;
-0.7285 -0.0545 -0.2680 -0.1328 -0.3348;
-0.7717 -0.0749 -0.6171 -0.1440 -0.7033;
-0.4889 -0.0675 -0.5421 -0.1596 -0.5656;
-0.5019 -0.0822 -0.5932 -0.1313 -0.6452;
-0.5383 -0.0781 -0.6051 0.0638 -0.6635;
-0.8107 -0.0592 -0.5815 -0.2463 -0.6475;
-0.8576 -0.0607 -0.5961 -0.1486 -0.6813;
-0.8214 -0.0753 -0.6193 0.0215 -0.7097;
-0.7035 -0.0489 -0.4232 0.1721 -0.4677;
-0.8102 -0.0533 -0.2051 -0.2215 -0.3409]
and the labels: train_y = [-1; -1; -1; 1; -1; -1; -1; -1; 1; -1; -1; -1; -1; -1; -1]
Then if we do: [pred_query_fisher, train_err, pos_query, logp] = classify(query, train_x, train_y)
logp will be 7.2821
Thanks!


Answers (1)

Quoting from the doc for classify:
[class,err,POSTERIOR,logp] = classify(...) also returns a vector logp containing estimates of the logarithms of the unconditional predictive probability density of the sample observations...
Values of a probability density do not need to be less than one, so the logarithm of a density can be positive.
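As a minimal sketch of why this happens, consider a one-dimensional Gaussian with a small standard deviation. The values of mu and sigma below are assumptions chosen for illustration and are not taken from the question; normal_pdf is a hypothetical helper written in base MATLAB so that no toolbox is needed. The density at the mean is well above 1, its logarithm is positive, and the density still integrates to 1.

```matlab
% Hypothetical 1-D example (not from the thread): a narrow Gaussian
% has a density above 1 near its mean.
mu = 0;
sigma = 0.01;   % assumed spread, chosen only for illustration

% Gaussian density written out by hand (hypothetical helper, base MATLAB only).
normal_pdf = @(x) exp(-(x - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));

peak = normal_pdf(mu);   % 1/(sigma*sqrt(2*pi)), roughly 39.9
log_peak = log(peak);    % roughly 3.69, so positive

% The density still integrates to (approximately) 1 over +/- 10 sigma.
x = linspace(mu - 10*sigma, mu + 10*sigma, 2001);
area = trapz(x, normal_pdf(x));

fprintf('peak density = %.4f, log = %.4f, area = %.6f\n', peak, log_peak, area);
```

The training data in the question has a small spread in several columns, which is consistent with large density values, although that alone does not rule out other causes.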

5 Comments

If this is a probability density function, what use is just one point of it? Thanks
I don't understand "what use is just one point of it". You can evaluate the log of the PDF at as many points as you'd like. The PDF is one of the basic concepts in statistics: http://en.wikipedia.org/wiki/Probability_density_function
Thank you for your answer. I know what a PDF is, but this MATLAB function returns just one number, which is why I thought it was a probability value and not a PDF. I cannot evaluate the PDF at several points, because it returns just one value.
classify returns a vector of logp values, just as the doc says. You get one value because you pass one query point in. Take as many query points as you'd like, concatenate them in a matrix and pass them as the first input to classify. Form query points to sample the space with whatever granularity you choose.
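A minimal sketch of the suggestion above, using the training data and labels posted in the question. The choice of query rows (the original query plus two training rows) and the variable names queries, cls, err and posterior are assumptions for illustration. It requires the Statistics Toolbox function classify. Note that the third output holds posterior probabilities, which lie between 0 and 1, unlike logp.

```matlab
% Training data and labels copied from the question.
train_x = [-0.6837 -0.0789 -0.5838 -0.0436 -0.6582;
           -0.5692 -0.0707 -0.5459 -0.0083 -0.5791;
           -0.6475 -0.0597 -0.6075 -0.1157 -0.6768;
           -0.7199 -0.0655 -0.5886 -0.1927 -0.6442;
           -0.8650 -0.0616 -0.3579 -0.0563 -0.4931;
           -0.7285 -0.0545 -0.2680 -0.1328 -0.3348;
           -0.7717 -0.0749 -0.6171 -0.1440 -0.7033;
           -0.4889 -0.0675 -0.5421 -0.1596 -0.5656;
           -0.5019 -0.0822 -0.5932 -0.1313 -0.6452;
           -0.5383 -0.0781 -0.6051  0.0638 -0.6635;
           -0.8107 -0.0592 -0.5815 -0.2463 -0.6475;
           -0.8576 -0.0607 -0.5961 -0.1486 -0.6813;
           -0.8214 -0.0753 -0.6193  0.0215 -0.7097;
           -0.7035 -0.0489 -0.4232  0.1721 -0.4677;
           -0.8102 -0.0533 -0.2051 -0.2215 -0.3409];
train_y = [-1; -1; -1; 1; -1; -1; -1; -1; 1; -1; -1; -1; -1; -1; -1];

% Stack several query points as rows of one matrix (illustrative choice:
% the original query plus two of the training rows).
queries = [-0.6824 -0.0764 -0.4608 -0.0770 -0.5227;
           train_x(1, :);
           train_x(5, :)];

[cls, err, posterior, logp] = classify(queries, train_x, train_y);

disp(logp);               % one log-density estimate per query row
disp(sum(posterior, 2));  % each row of posterior probabilities sums to 1
```

If the goal is a bounded score for a single query, the posterior output is probably the more useful quantity than logp.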
I just want to classify one query at a time. I guess this output is useless in my case. Thanks


Asked: 28 Apr 2014

Commented: 6 May 2014
