Hi -- this is kind of tricky. I am using a 2D array (matrix) to store some information. The first n-1 columns hold indexes of variables I'm dealing with -- in other words think of them as names not numbers. The last column contains a count for the number of times I've seen the variables in the preceding n-1 columns. There are many rows with such counts, all with the same number of variables. So if I'm counting single variables it could look like this:

1 3

2 2

3 1

Meaning I've seen variable 1 three times, variable 2 twice, and variable 3 once. If I'm counting two variable combos, it could look like this:

1 2 4

1 3 3

2 4 3

3 4 1

which means I've seen variables 1 and 2 together four times, I've seen variables 1 and 3 together three times, and I've also seen variables 2 and 4 together three times, and finally I've seen variables 3 and 4 together once.

Similar structures can exist for 3, 4, 5 variables, maybe more.

What I need help with is turning these structures into a single vector of variables repeated for every time they've been counted. So for that first example with single variables the vector would contain:

1 1 1 2 2 3

For that second example the vector would contain:

1 1 1 1 2 2 2 2 1 1 1 3 3 3 2 2 2 4 4 4 3 4

These vectors will allow me to do some histogram type analysis, but I'm not sure how to replicate these variables into the new vector based on the counts in that last column. Any help would be appreciated.

PS - for n variables followed by a count column, the ACTUAL data structure I'm using has n extra columns inserted between the variables and the counts (i.e. 2n+1 columns). The information in those columns isn't relevant to the question, but it implies the following. For n=1 variable, the structure has three columns: the first for the variables, the second for the extra information not relevant to this question, and the third for the count. For n=2 variables, the structure has five columns: the first two for the variables, the second two for the extra information not relevant to this question, and the fifth for the count. For n=3 variables it has seven columns -- 3, 3 and 1...

Thanks in advance!

Mike

Mohsin Shah
on 15 May 2019

Daniel Shub
on 11 May 2012

It is not the fastest, and you could preallocate z if you wanted to. It also ignores your PS, but getting rid of the extra columns shouldn't be hard.

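% x is the matrix of variables (first columns) and counts (last column)
z = [];  % output vector; grows on every iteration
for ii = 1:size(x, 1)
    % Stack the variables of row ii once per count (count rows, one column per variable)
    y = repmat(x(ii, 1:(end-1)), x(ii, end), 1);
    % Flatten column by column, so each variable appears as a run of repeats
    y = reshape(y, 1, numel(y));
    % Append this row's repeats to the result
    z = [z, y];
end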
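For comparison, here is a loop-free sketch of the same expansion. It assumes a MATLAB release that provides repelem (R2015a or later, so newer than this answer), and the names vars, counts, v and reps are only illustrative. The matrix x is the two-variable example from the question.

```matlab
% Two-variable example from the question: variables in columns 1-2, count in column 3
x = [1 2 4; 1 3 3; 2 4 3; 3 4 1];

vars   = x(:, 1:end-1);          % variable columns
counts = x(:, end);              % count column
n      = size(vars, 2);          % number of variables per row

v    = reshape(vars.', 1, []);   % variables read row by row: 1 2 1 3 2 4 3 4
reps = repelem(counts.', n);     % each count repeated once per variable: 4 4 3 3 3 3 1 1
z    = repelem(v, reps);         % repeat every variable by its row's count
```

This keeps the ordering of the loop version (all repeats of one variable, then the next variable in the same row), although for histogram-type analysis the order should not matter.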

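To deal with your PS, instead of

x(ii, 1:(end-1))

you want to stop where the variable columns stop. With 2n+1 columns, the variables occupy the first n = (size(x, 2) - 1)/2 columns, so index x(ii, 1:(size(x, 2) - 1)/2) instead.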
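As a rough sketch of both suggestions together (stopping at the variable columns and preallocating z), assuming the 2n+1 column layout described in the PS. The zeros in the extra columns are made-up placeholder data, and nVars, counts and pos are illustrative names.

```matlab
% n = 2 example in the 2n+1 layout: [var1 var2 extra1 extra2 count]
% The extra columns are filled with zeros here purely as placeholders.
x = [1 2 0 0 4; 1 3 0 0 3; 2 4 0 0 3; 3 4 0 0 1];

nVars  = (size(x, 2) - 1) / 2;        % number of variable columns
counts = x(:, end);                   % count column

z   = zeros(1, nVars * sum(counts));  % preallocate the full output
pos = 0;                              % number of entries written so far
for ii = 1:size(x, 1)
    % Repeat the variables of row ii once per count, then flatten column by column
    y = repmat(x(ii, 1:nVars), counts(ii), 1);
    z(pos + (1:numel(y))) = y(:).';
    pos = pos + numel(y);
end
```

Because the output length is known up front (nVars times the sum of the counts), z never has to grow inside the loop.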

Sean de Wolski
on 11 May 2012
