combining text and numeric matrices

My dataset has 6 predictor columns (all ordinal text values, e.g. good, better, best) and 1 response column (ordinal numeric values, e.g. 1, 2, 3). When I try to combine these into 7 columns for a classification study, I get the following error: 'Error using horzcat Dimensions of matrices being concatenated are not consistent.' Any suggestions?

10 Comments

Birdman on 7 Feb 2018
Edited: Birdman on 7 Feb 2018
Can you share your data in a mat file and your code ?
When you talk about ordinal do you mean you are using categorical variables?
Hasnain Ali on 8 Feb 2018
Edited: Hasnain Ali on 8 Feb 2018
The matrix looks like the attached picture. I could convert the categorical variables (which are also ordinal) into numeric values, but assigning arbitrary numbers to the text values seems unjustified to me. At the same time, logistic regression will not classify the matrix in its original form.
It appears you would have to convert the inputs to numeric, but not the response variable.
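One way to make the numeric codes less arbitrary is to declare the order of the text values explicitly. The following is a minimal sketch, assuming the three levels from the question (good < better < best); the variable names and sample values are made up for illustration.

```matlab
% Hypothetical sample of one predictor column (cell array of text values).
ratings = {'good'; 'best'; 'better'; 'good'};

% Declare the order explicitly: good < better < best.
levels = {'good', 'better', 'best'};
c = categorical(ratings, levels, 'Ordinal', true);

% double() returns each value's position in the declared order,
% so the codes follow the ordering instead of being chosen by hand.
codes = double(c);
```

With an ordinal categorical array, comparisons such as c(2) > c(3) also respect the declared order.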
Hasnain Ali on 8 Feb 2018
Edited: Hasnain Ali on 8 Feb 2018
Walter Roberson,
It isn't working. The logistic regression app in MATLAB doesn't even recognize the matrix (XYO) that I want to fit. I did exactly what you told me.
First I converted the predictor text to numeric values (e.g. 1, 2, 3). Then:
XY=[Predictor1 Predictor2 Predictor3];
XY=num2cell(XY);
XYO=[XY Outcome];
If you use the mnrfit() routine then you would not convert XY to cell, and you would pass in the outcomes as the second parameter rather than building a single XYO matrix.
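The suggestion above could look roughly like the sketch below. It is only an illustration on made-up data, and it assumes the Statistics and Machine Learning Toolbox is available. The level names ('low', 'medium', 'high') and all variable names are hypothetical. Because the response here is text, it is mapped to integers 1 to k before the call, which matches the "column vector of scalar integers" form of Y.

```matlab
% Made-up data for illustration only.
rng(0);
n = 200;
X = randi(3, n, 3);                        % numeric predictor codes, kept as a plain matrix

% Hypothetical text outcome with three ordered levels.
levels = {'low', 'medium', 'high'};
score = sum(X, 2) + 2*randn(n, 1);
yInt = 1 + (score > 5) + (score > 7);      % integers 1..3
outcomeText = reshape(levels(yInt), [], 1);

% Map the text outcome to integers 1..k in the declared order.
[found, Y] = ismember(outcomeText, levels);
assert(all(found));

% Predictors first, outcome second; no combined XYO matrix is built.
B = mnrfit(X, Y, 'model', 'ordinal');
```

For the ordinal model with three outcome levels and three predictors, B holds two intercepts followed by three slope coefficients.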
I tried this as well. It still won't accept the response variable as a cell array (text entries). It displays the following error.
Error using mnrfit (line 142) Inputs must be floats, namely single or double.
The R2017b documentation says that the Y may be categorical.
Oh! I have the R2015b version. Walter Roberson, do you know of any other method?
Response values, specified as a column vector or a matrix. Y can be one of the following:
  • An n-by-k matrix, where Y(i,j) is the number of outcomes of the multinomial category j for the predictor combinations given by X(i,:). In this case, the number of observations are made at each predictor combination.
  • An n-by-1 column vector of scalar integers from 1 to k indicating the value of the response for each observation. In this case, all sample sizes are 1.
  • An n-by-1 categorical array indicating the nominal or ordinal value of the response for each observation. In this case, all sample sizes are 1.


Answers (1)

I am going to assume that your predictors matrix is an m-by-6 cell array.
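temp = randi(3,10,6);            % random level index (1..3) for each cell
predictors = cell(10,6);         % empty 10-by-6 cell array
predictors(temp==1) = {'good'};
predictors(temp==2) = {'better'};
predictors(temp==3) = {'best'};
response = randi(3,10,1);        % numeric response column (1..3)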
This results in:
predictors =
{'good' } {'good' } ...
{'better'} {'best' } ...
... ...
and
response =
1
2
...
When you want to combine them, you need to convert your numeric array 'response' into a cell array to match the type of 'predictors':
combined = [predictors, num2cell(response)];
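If a mixed-type container is the goal, a table may be easier to work with than one large cell array. The sketch below rebuilds the same made-up data and converts 'combined' with cell2table; the variable names P1 to P6 and Response are hypothetical.

```matlab
% Same made-up data as above.
temp = randi(3, 10, 6);
predictors = cell(10, 6);
predictors(temp == 1) = {'good'};
predictors(temp == 2) = {'better'};
predictors(temp == 3) = {'best'};
response = randi(3, 10, 1);
combined = [predictors, num2cell(response)];

% Hypothetical column names, one per column of 'combined'.
names = {'P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'Response'};
T = cell2table(combined, 'VariableNames', names);

% The response column comes back as a numeric variable.
y = T.Response;
```

Each table variable keeps its own type, so the numeric response does not have to stay wrapped in cells.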

1 Comment

Hey Kai Domhardt! Thank you. This is helpful.
However, I'm not able to perform logistic regression on the dataset. Can logistic regression be performed on the 'combined' matrix that you've just generated?


Asked: 7 Feb 2018

Commented: 9 Feb 2018
