How can I do better pre-processing for machine learning?

Hi! I have a data set of size 1950x22. I need to use machine learning to train a model that predicts the 22nd column from the other columns when it is given new data. To summarize: the output feature (column 22) is the response, the first 21 columns are the predictors used for training, and the output takes three categories: 1, 2, 3. The problem is that after pre-processing, the class counts of the output changed as follows.

Before pre-processing:
1: 1655
2: 295
3: 176

After pre-processing:
1: 1337
2: 135
3: 24
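
To put numbers on how much the balance shifted, here is a small sketch in Python (chosen only for illustration; the variable names are my own). It takes the two sets of counts listed above and prints each category's share of the data before and after pre-processing, plus the fraction of rows that survived.

```python
# Class counts copied from the lists above.
before = {1: 1655, 2: 295, 3: 176}
after = {1: 1337, 2: 135, 3: 24}

def shares(counts):
    """Return each class's fraction of the total row count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

share_before = shares(before)
share_after = shares(after)

# Fraction of each class's rows that survived pre-processing.
kept = {label: after[label] / before[label] for label in before}

for label in before:
    print(
        f"class {label}: share {share_before[label]:.1%} -> "
        f"{share_after[label]:.1%}, rows kept {kept[label]:.1%}"
    )
```

Category 3 loses a much larger fraction of its rows than category 1, which is why I suspect the pre-processing step itself is part of the problem.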
Is there a way to oversample category 3 of the output? Or is there a way to make sure that training in the Classification Learner app uses all the data belonging to category 3 of the output feature?
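
To make clear what I mean by giving category 3 more weight, here is a minimal sketch of random oversampling (duplicating rows of the smaller categories with replacement until every category is as large as the biggest one). It is written in plain Python only as an illustration; the function name `random_oversample` and the toy feature values are my own placeholders, not part of my real pipeline, and I assume something like this would be applied to the training rows only.

```python
import random
from collections import Counter

def random_oversample(features, labels, seed=0):
    """Duplicate rows of the smaller classes (sampling with replacement)
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(features, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(rows) for rows in by_class.values())

    out_features, out_labels = [], []
    for label, rows in by_class.items():
        # Keep every original row, then add random duplicates up to target.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            out_features.append(row)
            out_labels.append(label)
    return out_features, out_labels

# Toy data using the class counts I get after pre-processing.
# The single feature value per row is a placeholder, not real data.
labels = [1] * 1337 + [2] * 135 + [3] * 24
features = [[float(i)] for i in range(len(labels))]

balanced_features, balanced_labels = random_oversample(features, labels)
print(Counter(balanced_labels))
```

The duplicated rows carry no new information, so I am also not sure whether this is better than assigning a higher misclassification cost to category 3.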

