Efficient row comparison on a large dataset

Question

0 votes

I have 2 arrays: array 1 consists of 'a|b|c' rows and array 2 consists of rows containing a single string, for example 'ut'. Both arrays have the same length, 200,000+ rows.

From array 1, I need to filter out rows that have the same third value and the same string in array 2 as another row, but a different first value.

For example:

row 1: 'a|b|c' and 'x'
row 2: 'f|g|c' and 'y'
row 3: 'a|b|c' and 'y'
row 4: 'd|e|c' and 'x'

In this situation I would like to delete row 4, because rows 1 and 4 share 'c' and 'x' but their first values, 'a' and 'd', differ. The other rows do not meet this condition and won't be deleted.

It is possible to write a for loop and compare each row separately; however, this process takes days (I tested).

Any help would be greatly appreciated.


Answer 1

John D'Errico on 3 Nov 2014

0 votes

help unique

This will serve your needs perfectly, and very efficiently.
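A minimal sketch of how unique could be applied to the example from the question, not necessarily what the answer had in mind. The variable names (a1, a2, toDelete) are hypothetical, and the rule is an assumed reading of the question: within each group of rows sharing the same third value and the same array 2 string, keep the first row and flag later rows whose first value differs from it.

```matlab
% Example data from the question (variable names are hypothetical).
a1 = {'a|b|c'; 'f|g|c'; 'a|b|c'; 'd|e|c'};   % array 1: 'first|second|third'
a2 = {'x'; 'y'; 'y'; 'x'};                    % array 2: one string per row

% Split every row of array 1 into its three fields (N-by-3 cell array).
parts = regexp(a1, '\|', 'split');
parts = vertcat(parts{:});

% Replace strings by integer codes so that numeric 'rows' can be used.
[~, ~, firstId] = unique(parts(:, 1));
[~, ~, thirdId] = unique(parts(:, 3));
[~, ~, strId]   = unique(a2);

% One group per distinct (third value, array 2 string) pair.
% iFirst holds the first row of each group, grp the group of each row.
[~, iFirst, grp] = unique([thirdId(:), strId(:)], 'rows', 'first');

% Assumed rule: flag rows whose first value differs from that of the
% first row in the same group.
firstId  = firstId(:);
toDelete = firstId ~= firstId(iFirst(grp));

a1(toDelete) = [];
a2(toDelete) = [];
```

Under this reading, row 3 is flagged as well as row 4, because rows 2 and 3 share 'c' and 'y' but start with 'f' and 'a'. If only row 4 should go, the rule needs to be stated differently. There is no loop over rows, so the cost is dominated by the sorting inside unique.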


Answer 2

Paul on 3 Nov 2014

0 votes

Thank you for your fast answer.

Is it possible to use unique(..) to find the unique combinations of rows? I would then like to find the unique combinations of the third value of array 1 ('a'|'b'|'this') and the value of array 2 on the same row.

When I tried a simple set:

s = {'ut','rtd';'hg','ry';'ut','rtd'}

[r,i,j] = unique(s,'rows')

This gives the unique strings ('hg','rtd','ry','ut'). However, I need the unique row combinations (row 1: 'ut' and 'rtd', row 2: 'hg' and 'ry').

Is this possible?
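The output above suggests that the 'rows' option is not applied when the input is a cell array of strings. One possible workaround, shown here as a sketch rather than a definitive answer, is to convert each column to integer codes first and then call unique with 'rows' on the resulting numeric matrix. The names c1, c2, iu and ju are hypothetical, and the 'stable' option is assumed to be available in the MATLAB release in use.

```matlab
s = {'ut','rtd'; 'hg','ry'; 'ut','rtd'};

% Encode each column of strings as integer codes.
[~, ~, c1] = unique(s(:, 1));
[~, ~, c2] = unique(s(:, 2));

% 'rows' works on the numeric code matrix; 'stable' keeps the order of
% first appearance instead of sorting.
[~, iu, ju] = unique([c1(:), c2(:)], 'rows', 'stable');

r = s(iu, :);   % one representative row per unique combination
```

Here r should contain the rows 'ut','rtd' and 'hg','ry', and ju maps every original row to its combination, so s is recovered as r(ju,:).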
