SMOTE (Synthetic Minority Over-Sampling Technique)

version 1.0.0.0 (6.38 KB) by Manohar
25 Downloads

Updated 29 Oct 2012

The SMOTE (Synthetic Minority Over-Sampling Technique) function takes the feature vectors as a matrix of dimension (r,n) and the target class as a vector of dimension (r,1) as input.
It returns the final_features matrix of dimension (r',n) and the target class vector of dimension (r',1) as output, where r' is the number of samples after resampling.
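
To make the input and output shapes concrete, here is a minimal, self-contained sketch of the SMOTE idea from the paper cited below. It is not the code of this submission: the toy data and the names features, labels, minorityLabel, k and final_labels are assumptions chosen for illustration, and the neighbour search uses plain distance computations instead of the nearestneighbour helper.

```matlab
% Toy data, assumed for illustration: r = 10 samples, n = 2 features.
features = [0.0 0.1; 0.2 0.0; 0.1 0.3; 0.3 0.2; ...               % minority rows
            2.0 2.1; 2.2 1.9; 1.8 2.3; 2.1 2.2; 2.4 2.0; 1.9 1.8]; % majority rows
labels   = [1; 1; 1; 1; 0; 0; 0; 0; 0; 0];   % target class, size (r,1)
minorityLabel = 1;   % class to over-sample (hypothetical name)
k = 3;               % true neighbours considered per minority sample

P = features(labels == minorityLabel, :);   % minority samples only
m = size(P, 1);

% One synthetic sample per minority sample, placed at a random position
% on the segment towards one randomly chosen nearest neighbour.
synthetic = zeros(m, size(P, 2));
for i = 1:m
    d = sum(bsxfun(@minus, P, P(i, :)).^2, 2);  % squared distances to P(i,:)
    [~, order] = sort(d);                       % order(1) is i itself (distance 0)
    nbrs = order(2:k+1);                        % the k nearest other samples
    j    = nbrs(randi(k));
    synthetic(i, :) = P(i, :) + rand * (P(j, :) - P(i, :));
end

final_features = [features; synthetic];                 % size (r',n)
final_labels   = [labels; minorityLabel * ones(m, 1)];  % size (r',1)
```

With this toy input, r = 10 and r' = 14. The sketch only adds samples; it has no counterpart to any removal step the submission may perform.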

Implementation based on:
N. Chawla, K. Bowyer, L. Hall, and W. Kegelmeyer. SMOTE: Synthetic minority over-sampling technique. arXiv preprint arXiv:1106.1813, 2011.

Cite As

Manohar (2021). SMOTE (Synthetic Minority Over-Sampling Technique) (https://www.mathworks.com/matlabcentral/fileexchange/38830-smote-synthetic-minority-over-sampling-technique), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (19)

Giovani Chianese

Could someone help me? I have this error...

Error using nearestneighbour>parseinputs (line 316)
Number of Neighbours must be an integer, and smaller than the no. of points in X

michio

Federico Orsini

Ananya Dutta

excellent work. Thank you.

Stylianos Filippou

Keshav Chandak

Error: "Not enough input arguments". Can someone please help me with the variables we need to pass to this function?

Abinash Pujahari

ugur erkan

How do we run this, and how do we pass in the test data?

Dovey

Kumuda Roy

Thanks, working, but requires modification for my application.

Can this be used for multi-class classification?

Ivan Galindo

Could someone give me an example of how to use these functions? I'm trying to use them, but I don't know which parameters I have to pass.

Dylan Brewer

The SMOTE algorithm should choose a random difference between the two minority samples.

Fever

Chuan Pham Minh

Thanks all!

Akhilesh Gotmare

I feel your implementation should be used with the following considerations in mind, with respect to the technique described in the paper:

1. The number of nearest neighbors defaults to 5 in the paper. Hence the argument to the SMOTE function should be given as 6, since the neighbor search appears to count the sample itself.

2. The percentage of over-sampling to be performed is a parameter of the algorithm (100%, 200%, 300%, 400% or 500%). The amount of SMOTE is assumed to be an integral multiple of 100. Hence how many of the 5 available neighbors are chosen for synthesizing new samples depends on the amount of over-sampling desired. For instance, for an over-sampling of 200%, one has to choose 2 of the 5 neighbors at random. This would require some change in the implementation.

3. The parameter 'th', which determines where on the line joining the two minority samples the new minority sample lies, is fixed at 0.3. The paper says the synthetic sample should be placed randomly between the two. The following change in line 24 solves this problem.
Instead of:
new_P=(1-th).*P(i,:) + th.*P(index,:);
write:
new_P=P(i,:)+((P(index,:)-P(i,:))*rand);
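
Points 2 and 3 can be sketched together as follows. This is an illustration under stated assumptions, not the submission's code: the toy matrix P, the percentage N, the counter row and the helper variables perSample and pick are hypothetical names, and the neighbours are found with plain distance computations rather than nearestneighbour. Picking N/100 distinct neighbours follows the reading given in point 2.

```matlab
% Hypothetical inputs for illustration.
P = [0.0 0.1; 0.2 0.0; 0.1 0.3; 0.3 0.2; 0.4 0.4; 0.5 0.1; 0.2 0.5]; % minority samples
N = 200;   % amount of over-sampling in percent, a multiple of 100
k = 5;     % nearest neighbours considered per sample

perSample = N / 100;                 % synthetic samples per minority sample
[m, n]    = size(P);
synthetic = zeros(m * perSample, n);
row = 0;
for i = 1:m
    d = sum(bsxfun(@minus, P, P(i, :)).^2, 2);  % squared distances to P(i,:)
    [~, order] = sort(d);
    nbrs = order(2:k+1);                        % skip P(i,:) itself
    pick = nbrs(randperm(k));                   % shuffle the k neighbours
    for t = 1:perSample                         % use N/100 of them
        index = pick(t);
        row   = row + 1;
        % Random point on the segment between the two minority samples.
        synthetic(row, :) = P(i, :) + (P(index, :) - P(i, :)) * rand;
    end
end
```

The sketch assumes that there are more than k minority samples and that N/100 does not exceed k; otherwise the indexing into order or pick fails.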

Akhilesh Gotmare

The code works well. For other users: take care to specify the number of neighbors you want to use to build the synthetic minority class. The default value is set to 4 in the code, and the function nearestneighbour computes the neighbors such that 1 of the 4 is the original minority point and the other 3 are its nearest neighbors. Correct me if I am wrong.
Thanks.
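
The effect described here can be reproduced with a brute-force search over a small point set. The sketch below does not call nearestneighbour, so it only illustrates why a search over the same set would return each point as its own first neighbour; the toy matrix P and the names kArg, selfFirst and trueNbrs are assumptions for illustration.

```matlab
% Toy minority samples, assumed for illustration (all points distinct).
P = [0.0 0.0; 1.0 0.0; 0.0 2.0; 3.0 3.0; 5.0 1.0];
kArg = 4;                         % value passed as "number of neighbours"

m = size(P, 1);
I = zeros(m, kArg);
for i = 1:m
    d = sum(bsxfun(@minus, P, P(i, :)).^2, 2);  % squared distances to P(i,:)
    [~, order] = sort(d);
    I(i, :) = order(1:kArg)';                   % includes the query point
end

selfFirst = isequal(I(:, 1), (1:m)');  % column 1 is each point itself
trueNbrs  = I(:, 2:end);               % kArg - 1 = 3 actual neighbours per point
```

Under this behaviour, asking for k true neighbours means passing k + 1 and dropping the first column. Duplicate points would break the assumption that the first column is always the query point.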

Monther Alhamdoosh

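The code has a bug in the removal step. The last loop should be written as follows:
for j = 1:length(original_mark)
    % columns 2:4 of I hold the three nearest neighbours of sample j
    neighbors = I(j, 2:4);
    % count the neighbours whose class differs from that of sample j
    len = length(find(original_mark(neighbors) ~= original_mark(j,1)));
    if(len >= 2)
        if(original_mark(j,1) == classlabel)
            % sample j is in class classlabel: drop the differing neighbours
            train_incl(neighbors(original_mark(neighbors) ~= original_mark(j,1)),1) = 0;
        else
            % otherwise drop sample j itself
            train_incl(j,1) = 0;
        end
    end
end

The current code keeps changing only the first 3 elements of train_incl.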

Carlos Mera

Hi, when I use the SMOTE function, I get the following error. It is caused by a row of the I matrix being [0 0 0 0] (line 14). Why does this happen?

Attempted to access P(0,:); index must be a positive integer or logical.

Error in SMOTE (line 24)
new_P=(1-th).*P(i,:) + th.*P(index,:);

Thanks and regards, Carlos

MATLAB Release Compatibility
Created with R2010b
Compatible with any release
Platform Compatibility
Windows macOS Linux
