Feature/Variable selection

Eng Ofetotse on 12 Jul 2017
Answered: Amit Doshi on 17 Jul 2017
Hello there, I have a feature selection problem. I have a dataset with 14 features. I ran an exhaustive search over all 2^P - 1 non-empty feature subsets, which yields 16,383 subsets. I then clustered the generated subsets using the k-means clustering algorithm. Now I am looking for the best subset. The information I have is as follows (as an illustration):
Dataset  Cluster membership  Silhouette value  Feature/Variables
1        3                   1.00              1
2        4                   1.00              2
3        4                   0.97              3
4        4                   0.80              4
5        3                   0.79              5
6        3                   0.64              [6,7]
7        2                   0.37              [8,9,10,11,12,13,14]
8        4                   0.48              [15,16]
9        2                   0.66              [17,18,19,20,21,22,23,24]
10       4                   1.00              25
11       3                   0.38              [26,27,28,29,30,31,32,33,34]
12       3                   0.77              35
13       3                   0.79              36
14       2                   0.76              37
15       3                   0.78              [1,2]
16       3                   0.94              [1,3]
17       3                   0.93              [1,4]
18       2                   0.73              [1,5]
19       2                   0.64              [1,6,7]
20       4                   0.39              [1,8,9,10,11,12,13,14]
21       2                   0.62              [1,15,16]
22       3                   0.50              [1,17,18,19,20,21,22,23,24]
23       4                   0.67              [1,25]
24       2                   0.53              [1,26,27,28,29,30,31,32,33,34]
25       4                   0.86              [1,35]
26       3                   0.86              [1,36]
Any ideas would be appreciated. Thank you.
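
As a quick sanity check on the subset count quoted in the question, the non-empty subsets of 14 features can be enumerated directly. This is only a minimal Python sketch: the helper name `nonempty_subsets` is made up for illustration and does not come from the original post.

```python
from itertools import combinations


def nonempty_subsets(n_features):
    """Yield every non-empty subset of the feature indices 1..n_features."""
    features = range(1, n_features + 1)
    for size in range(1, n_features + 1):
        # All subsets with exactly `size` features.
        yield from combinations(features, size)


P = 14  # number of features stated in the question
subset_count = sum(1 for _ in nonempty_subsets(P))
print(subset_count)  # 16383, i.e. 2**14 - 1
```

The count doubles with every extra feature, which is why an exhaustive search stops being practical well before a few dozen features.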

Accepted Answer

Amit Doshi on 17 Jul 2017
Hello Eng,
You can use the 'sequentialfs' function in the Statistics and Machine Learning Toolbox in MATLAB to perform sequential feature selection and reduce the dimensionality of the data.
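
To illustrate the general idea behind sequential (forward) feature selection, here is a rough Python sketch. It is not MATLAB's `sequentialfs` implementation, and it assumes a score where higher is better (for example a mean silhouette value). The names `sequential_forward_selection` and `toy_criterion` are hypothetical, and the toy scores are invented purely so the example runs.

```python
def sequential_forward_selection(n_features, criterion, max_features=None):
    """Greedy forward selection.

    Starting from an empty set, repeatedly add the single feature that
    gives the highest `criterion` value, and stop as soon as no addition
    improves on the best score so far. `criterion` takes a list of
    feature indices and returns a number (higher is better).
    """
    selected = []
    remaining = list(range(n_features))
    best_score = float("-inf")
    limit = n_features if max_features is None else max_features
    history = []

    while remaining and len(selected) < limit:
        # Score every candidate set obtained by adding one more feature.
        scored = [(criterion(selected + [f]), f) for f in remaining]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break  # no candidate improves the criterion
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
        history.append((tuple(selected), score))

    return selected, history


def toy_criterion(subset):
    """Made-up score: features 0 and 2 help, every feature costs 0.05."""
    useful = {0: 0.5, 2: 0.3}
    return sum(useful.get(f, 0.0) for f in subset) - 0.05 * len(subset)


selected, history = sequential_forward_selection(4, toy_criterion)
print(selected)  # [0, 2]
```

With P features, a greedy search like this evaluates at most P(P+1)/2 subsets (105 for P = 14) instead of 2^P - 1, at the cost of possibly missing the globally best subset. In practice the criterion would be replaced by whatever quality measure you trust for your clustering.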
Refer to the links below to learn more about feature selection in MATLAB:
