Remove numbers during preprocessing
I would like to remove numbers from my text. I use the following script for preprocessing. How can I remove all numbers?
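% Create co-occurrence network for only class1 and 0 5%
data = dataone.text;
%textdata = data.text;
% Draw a random sample of 100 documents (semicolon suppresses console output)
data = randsample(data,100);
%data=data(1:100,1)
% Tokenize and clean the text, then build a bag-of-words model
documents = preprocessText(data);
bag = bagOfWords(documents);
% Drop words that appear 2 times or fewer
bag1 = removeInfrequentWords(bag,2);
% Word-by-word co-occurrence matrix from the document-by-word counts
counts = bag1.Counts;
cooccurrence = counts.'*counts;
G = graph(cooccurrence,bag1.Vocabulary,'omitselfloops');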
Answers (1)
  Ergin Sezgin
      
 on 30 Sep 2022
        Hello Rachele,
Try using the following code with your string array.
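% Example string array in which one element is purely numeric
words = ["stringOne", "stringTwo", "2022", "stringThree"]
% str2double returns NaN for every element that is not a valid number
doubleArray = str2double(words)
% Logical index of the non-numeric elements
nanIdx = isnan(doubleArray)
% Keep only the non-numeric elements
wordsArray = words(nanIdx)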
Good luck
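The snippet above drops elements that are entirely numeric. If each element is instead a whole sentence, as `dataone.text` in the question appears to be, one possible approach is to strip the digits with `regexprep` before calling `preprocessText`. This is only a sketch: the sample `data` below is made up for illustration, and it assumes the real data is a string array.

```matlab
% Hypothetical sample standing in for dataone.text (assumed to be a string array)
data = ["I have 2 cats and 10 dogs"; "Version 2022 released"];

% Remove every run of digits from each element
cleaned = regexprep(data, "\d+", "");

% Collapse the leftover double spaces and trim the ends
cleaned = strtrim(regexprep(cleaned, "\s+", " "));

% cleaned could then be passed on, e.g. documents = preprocessText(cleaned);
```

Note that this also removes digits embedded in words (for example "class1" becomes "class"), which may or may not be what you want.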
2 Comments
  Ergin Sezgin
      
 on 30 Sep 2022
If the issue is with a char array, it's possible to remove all numbers from it by checking each element, either with an explicit loop or with vectorization. If there are multiple char arrays in a container, the same method should also work with a few additional steps. Could you please share some of the data?
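As a rough sketch of the vectorized version, assuming the text is a char array, `isstrprop` can flag the digit characters so they can be indexed away. The cell array case applies the same step to each element with `cellfun`. The variable names and sample values here are made up for illustration.

```matlab
% Single char array: keep every character that is not a digit
txt = 'abc123 def45';
cleanedTxt = txt(~isstrprop(txt, 'digit'));

% Cell array of char arrays: apply the same mask to each element
c = {'item1', 'v2 beta', '2022'};
cleanedCell = cellfun(@(s) s(~isstrprop(s, 'digit')), c, ...
    'UniformOutput', false);
```

An element made up only of digits, such as '2022', ends up as an empty char array, so it may need to be filtered out afterwards.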