increaseB

Increase reference data sets

    Description

    updatedEvaluation = increaseB(evaluation,numsets) returns the gap criterion clustering evaluation object updatedEvaluation, which is computed from the reference data sets already used by the gap criterion clustering evaluation object evaluation plus the number of additional reference data sets specified by numsets.

    Examples

    Create a gap clustering evaluation object using evalclusters. Then, use increaseB to increase the number of reference data sets used to compute the gap criterion values.

    Load the fisheriris data set. The data contains length and width measurements from the sepals and petals of three species of iris flowers.

    load fisheriris

    Cluster the flower measurement data using kmeans, and use the gap criterion to evaluate proposed solutions for 1 to 5 clusters. Use 50 reference data sets.

    rng("default") % For reproducibility
    evaluation = evalclusters(meas,"kmeans","gap","KList",1:5,"B",50)
    evaluation = 
      GapEvaluation with properties:
    
        NumObservations: 150
             InspectedK: [1 2 3 4 5]
        CriterionValues: [0.0870 0.5822 0.8766 1.0007 1.0465]
               OptimalK: 4
    
    
    

    The clustering evaluation object evaluation contains data on each proposed clustering solution. The returned results indicate that the optimal number of clusters is four.

    The value of the B property of evaluation shows 50 reference data sets.

    evaluation.B
    ans = 
    50
    

    Increase the number of reference data sets by 100, for a total of 150 sets.

    evaluation = increaseB(evaluation,100)
    evaluation = 
      GapEvaluation with properties:
    
        NumObservations: 150
             InspectedK: [1 2 3 4 5]
        CriterionValues: [0.0794 0.5850 0.8738 1.0034 1.0508]
               OptimalK: 5
    
    
    

    The returned results now indicate that the optimal number of clusters is five.

    The value of the B property of evaluation now shows 150 reference data sets.

    evaluation.B
    ans = 
    150
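
Because each call to increaseB adds to the reference data sets already recorded in the object, the sets can also be added in stages. The following is a minimal sketch, assuming the same fisheriris workflow as in the example; the variable names batchSize, numBatches, and totals and the batch size of 25 are illustrative choices, not part of the original example.

```matlab
load fisheriris
rng("default") % For reproducibility
evaluation = evalclusters(meas,"kmeans","gap","KList",1:5,"B",50);

batchSize = 25;   % illustrative choice
numBatches = 2;   % illustrative choice
totals = zeros(1,numBatches);
for i = 1:numBatches
    % Each call adds batchSize reference data sets to the current total
    evaluation = increaseB(evaluation,batchSize);
    totals(i) = evaluation.B;   % running total of reference data sets
    fprintf("B = %d, OptimalK = %d\n",evaluation.B,evaluation.OptimalK)
end
```

After the loop, totals holds the value of B after each call (75, then 100), which makes it easy to see whether OptimalK has settled before adding more sets.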
    

    Input Arguments

    evaluation: Clustering evaluation data, specified as a GapEvaluation object. Create a clustering evaluation object by using evalclusters.

    numsets: Number of additional reference data sets, specified as a positive integer scalar.

    Data Types: single | double

    Output Arguments

    updatedEvaluation: Updated clustering evaluation data, returned as a GapEvaluation object. updatedEvaluation contains evaluation data obtained using the reference data sets from the evaluation object and the number of additional reference data sets specified by numsets.

    The increaseB function updates the B property of the evaluation object to reflect the increase in the number of reference data sets used to compute the gap criterion values. The function also updates the CriterionValues property with gap criterion values computed using the total number of reference data sets. If the software finds a new optimal number of clusters and optimal clustering solution when using the total number of reference data sets, then increaseB updates the OptimalK and OptimalY properties. The function also updates the LogW, ExpectedLogW, StdLogW, and SE properties.
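
One way to see these property updates is to record a few property values before calling increaseB and compare them with the values in the returned object. This is a minimal sketch under that assumption; the variable names oldB, oldSE, and oldK are illustrative, and the numeric results depend on the random number generator state.

```matlab
load fisheriris
rng("default") % For reproducibility
evaluation = evalclusters(meas,"kmeans","gap","KList",1:5,"B",50);

% Record selected property values before adding reference data sets
oldB = evaluation.B;
oldSE = evaluation.SE;
oldK = evaluation.OptimalK;

updatedEvaluation = increaseB(evaluation,100);

% B grows by numsets
fprintf("B: %d -> %d\n",oldB,updatedEvaluation.B)
% SE has one value per inspected number of clusters;
% display the old and new values side by side
disp([oldSE(:) updatedEvaluation.SE(:)])
% OptimalK changes only if the new criterion values select a different solution
fprintf("OptimalK: %d -> %d\n",oldK,updatedEvaluation.OptimalK)
```

The same pattern applies to the CriterionValues, LogW, ExpectedLogW, and StdLogW properties.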

    Version History

    Introduced in R2014a