ROC curve for Decision Tree

Nirmal on 25 Jul 2012
I am getting an ROC curve for a decision tree, but I am a bit taken aback by its shape. It is similar to the curve,
If you look at the curve, it has a straight part after hitting the optimal point, joining it to (1,1). I don't know the reason behind that. I thought the curve should be a combination of horizontal and vertical lines, one for each item, but it seems that the results for these items were neither true positives nor false positives.
Thank you.

Answers (2)

Image Analyst on 25 Jul 2012
What function are you using to plot it, and how many points are in the data? Maybe you don't have any x values after 0.2 so it just drew a straight line between your last two points. You probably just don't have any "items" there.
  1 Comment
Nirmal on 25 Jul 2012
Edited: Nirmal on 25 Jul 2012
I am using perfcurve(). I have around 5000 items.
I thought that, since each item can be either a true positive or a false positive, this should in fact lead to either a vertical or a horizontal line for each item. And the X axis was supposed to be the total number of false positives in the prediction, isn't it? I don't see why there wouldn't be enough X items.
It's not just that portion: from (0,0) towards the optimal point, the curve is also mostly a straight line. Sorry, I couldn't upload my image for now.
I am starting to think that it may be because of the probability with which the DT makes its prediction: if it doesn't exceed the threshold, maybe the item is classified as neither a true positive nor a false positive.
I hope you understand what I mean.



Ilya on 26 Jul 2012
The empirical ROC curve is computed from a finite set of points, without smoothing. The curve takes a step along either the sensitivity axis or the specificity axis when the next adjacent score belongs to an observation of the positive class or of the negative class, but not both. If one observation of the positive class and one observation of the negative class have equal scores, the direction of the step at that score is undefined: you could choose a vertical step or a horizontal step. This is a matter of convention. The convention adopted in perfcurve is to update both sensitivity and specificity at the same time, which produces a diagonal segment. Check this out:
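labels = [0 0 0 1 1]';        % true classes: three negatives, two positives
scores = [0 0.2 0.5 0 0.5]';  % scores 0 and 0.5 each occur in both classes
[fpr,tpr] = perfcurve(labels,scores,1)  % third argument: 1 is the positive class
fpr =
0
0.3333
0.6667
1.0000
tpr =
0
0.5000
0.5000
1.0000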
Just like in this example, the classifier gives equal lowest scores for the positive and negative classes in your data. That's why you see a straight line to (1,1).
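
To see where the diagonal segments come from, the toy example above can be reproduced by hand in base MATLAB. This is only a minimal sketch, assuming the convention described above (one ROC point per distinct score value, with both rates updated together); it is not perfcurve's actual implementation, and the variable names thresholds, predictedPos and isDiagonal are made up for illustration.

```matlab
% Same toy data as in the example above.
labels = [0 0 0 1 1]';
scores = [0 0.2 0.5 0 0.5]';

% Distinct score values, highest first; each one is a candidate threshold.
thresholds = sort(unique(scores), 'descend');

nPos = sum(labels == 1);
nNeg = sum(labels == 0);

% Start at (0,0), then add one ROC point per distinct threshold.
fpr = zeros(numel(thresholds) + 1, 1);
tpr = zeros(numel(thresholds) + 1, 1);
for k = 1:numel(thresholds)
    predictedPos = scores >= thresholds(k);
    fpr(k + 1) = sum(predictedPos & (labels == 0)) / nNeg;
    tpr(k + 1) = sum(predictedPos & (labels == 1)) / nPos;
end

% A segment is diagonal when both rates change between consecutive points,
% i.e. when a score value is shared by both classes.
isDiagonal = (diff(fpr) > 0) & (diff(tpr) > 0);

disp([fpr tpr]);
disp(isDiagonal');
```

Here the first and last segments are diagonal because the scores 0.5 and 0 are each shared by a positive and a negative observation, while the score 0.2 belongs to a negative observation only and gives a horizontal step. A decision tree typically assigns the same score to every observation that falls into the same leaf, so with about 5000 items there can be far fewer distinct scores than items, and comparing numel(unique(scores)) with numel(scores) on your own data is a quick way to check for this.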
