How to find cell array elements in other arrays

Hi everybody,
I have some cell arrays like this:
source = {'a','b','c','d','e','f'}
bb = {'a','e','g','l','f','h'}
cc = {'e','c','j','k','l','aa'}
dd = {'a','z','x','yy','e','f','a','a'}
ee = {'z','h','e','a','aa','f','e','a'}
I want to search for each element of source in the other arrays. For example, for the first element of source I expect 3, because 'a' occurs in bb, dd, and ee.
Best regards.

7 Comments

What should be done if 'a' occurs in multiple places within any of those? Should the count be only 1 per cell array no matter how many times it occurs in that array, or should the count be 1 per occurrence?
Return only 1 per array. For example, for ff = {a,a,a,b,aa,aaa,bb}, return 1 for the 'a' element.
Thanks for your attention.
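To make the two counting rules concrete, here is a minimal sketch. It assumes a quoted version of the ff example (the quotes are an assumption; the comment above omits them). ismember gives one yes/no answer per array, while counting strcmp matches gives one per occurrence.

```matlab
% Quoted version of the example above (assumption: every entry is a string).
ff = {'a','a','a','b','aa','aaa','bb'};

% One per array: ismember only answers "is it present?", so repeats do not matter.
perArray = double(ismember('a', ff));      % 1

% One per occurrence: strcmp matches exactly, so 'aa' and 'aaa' are not counted.
perOccurrence = nnz(strcmp('a', ff));      % 3
```

The asker wants the first rule, which is why the answers in this thread are built on ismember.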
This would be a much easier task if the data was stored in a single numeric array or character array. Then it becomes a trivial application of bsxfun or something similar.
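As a rough illustration of that idea, the sketch below assumes every label is a single character, so each array fits in a plain char vector. That assumption does not hold for the data in the question, which contains entries such as 'aa' and 'yy', so this only shows the shape of a bsxfun approach.

```matlab
% Assumption: every label is a single character, so a char vector can hold them.
source = 'abcdef';
bb     = 'aeglfh';

% Compare every source character (rows) with every bb character (columns).
hits = bsxfun(@eq, source(:), bb(:).');    % 6-by-6 logical matrix

% A source character is present if its row contains at least one match.
found = any(hits, 2).';                    % [1 0 0 0 1 1]
```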
What class of data are the variables? Numeric, character, cell, ... ?
Are the elements in source guaranteed to be numeric scalars? Are they guaranteed to be strings? Either of those two can be done efficiently; if they can be other data types then the comparisons are more expensive.
If any of the values are permitted to be structures, then do you want the comparison to proceed matching field to field? Or do you want the comparison to fail if there is a structure which has the same fields but in a different order?
In the case of numeric values, if the values exist but in a different numeric class, then do you want the test to succeed or does the numeric class need to be the same? For example, should uint8(42) be considered to match 42.0 ? Should 0 be considered to match logical false? Should 1 be considered to match logical true? Should any non-zero non-nan value be considered to match logical true?
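For reference, the built-in isequal is one comparison that answers these questions in a lenient way: it compares values rather than classes, and it ignores structure field order. The sketch below only illustrates that one policy; whether it is the right policy depends on what the asker needs.

```matlab
% isequal compares values, not classes.
sameValueDifferentClass = isequal(uint8(42), 42.0);   % true
zeroVersusFalse         = isequal(0, false);          % true
oneVersusTrue           = isequal(1, true);           % true
twoVersusTrue           = isequal(2, true);           % false: the values differ

% isequal also ignores the order in which structure fields were created.
s1 = struct('x', 1, 'y', 2);
s2 = struct('y', 2, 'x', 1);
sameFieldsDifferentOrder = isequal(s1, s2);           % true
```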
mhm on 20 Jan 2016
Edited: mhm on 20 Jan 2016
There aren't any numeric values. All of the values are cell arrays of strings. My source array is sorted and the others are not.
It is extremely important to use valid MATLAB syntax in your question to avoid confusion.
c = {a, b, c}
is a cell array which contains three elements: the contents of the variables a, b and c. These could be strings, scalar numbers, matrices, more cell arrays, structures, handles, etc. Everybody read your question as if the cell arrays could contain arbitrary data of any type.
c = {'a', 'b', 'c'}
is a cell array which contains only strings. The answer for that is much simpler.
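A quick way to check which of the two cases applies is iscellstr, which is true only when every cell holds a character array. A minimal sketch follows; the mixed cell array c2 is a hypothetical example, not data from the question.

```matlab
c1 = {'a', 'b', 'c'};           % only strings
c2 = {'a', 42, {'c'}};          % mixed content (hypothetical example)

onlyStrings1 = iscellstr(c1);   % true
onlyStrings2 = iscellstr(c2);   % false
```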
mhm on 20 Jan 2016
Edited: mhm on 20 Jan 2016
Sorry that I made this mistake. You are right, and thanks for pointing it out. The second type is the correct one. I am new to MATLAB. So what is my answer?


Accepted Answer

source = {'a','b','c','d','e','f'};
bb = {'a','e','g', 'l', 'f', 'h'};
cc = {'e','c','j', 'k', 'l','aa'};
dd = {'a','z','x','yy', 'e', 'f','a','a'};
ee = {'z','h','e', 'a','aa', 'f','e','a'};
X = cellfun(@(c)ismember(source,c),{bb,cc,dd,ee},'UniformOutput',false);
Y = sum(vertcat(X{:}),1)
output:
Y =
3 0 1 0 4 3
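To see why this works, it can help to look at the intermediate matrix. The sketch below repeats the code above with one extra variable, M, introduced here only for illustration: each row of M is the ismember result for one search array, and the column sums give the counts.

```matlab
source = {'a','b','c','d','e','f'};
bb = {'a','e','g','l','f','h'};
cc = {'e','c','j','k','l','aa'};
dd = {'a','z','x','yy','e','f','a','a'};
ee = {'z','h','e','a','aa','f','e','a'};

% One 1-by-6 logical vector per search array.
X = cellfun(@(c) ismember(source, c), {bb,cc,dd,ee}, 'UniformOutput', false);

% Stack them: one row per search array, one column per source element.
M = vertcat(X{:});
%   1 0 0 0 1 1   (bb)
%   0 0 1 0 1 0   (cc)
%   1 0 0 0 1 1   (dd)
%   1 0 0 0 1 1   (ee)

% Column sums count how many arrays contain each source element.
Y = sum(M, 1);     % [3 0 1 0 4 3]
```

Because ismember reports presence only, the repeated 'a' in dd and ee still contributes 1 per array.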

3 Comments

That's it!
Thanks for your help and the correct answer, and thanks to all of my friends here for their attention.
"I'd like to use MATLAB facilities and tricks for this problem in order to get better performance and optimized code"
More efficient code is a worthwhile goal. Unfortunately, this is not always achieved by avoiding loops altogether (but it often is). In terms of performance, the above is probably slower than the loop version. In terms of memory, it uses more.
You have the cost of an anonymous function call, which in MATLAB has never been cheap. You also have the cost of creating an output cell array and then converting that cell array to a matrix. The loop version performs neither of these two operations.
The only advantage of the cellfun is better clarity of the intent of the code (apply the same function to all elements of the array).
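Anyone who wants to check the performance claim on their own machine could time both versions with timeit. The following is only a sketch: the function names compare_counting_speed, countWithLoop and countWithCellfun are hypothetical, and results on such tiny inputs will vary between MATLAB releases and machines.

```matlab
function compare_counting_speed()
% Hypothetical benchmark: save as compare_counting_speed.m and run it.
source = {'a','b','c','d','e','f'};
searcharrays = {{'a','e','g','l','f','h'}, ...
                {'e','c','j','k','l','aa'}, ...
                {'a','z','x','yy','e','f','a','a'}, ...
                {'z','h','e','a','aa','f','e','a'}};

% timeit calls each handle repeatedly and returns a typical run time in seconds.
tLoop    = timeit(@() countWithLoop(source, searcharrays));
tCellfun = timeit(@() countWithCellfun(source, searcharrays));
fprintf('loop: %.3g s, cellfun: %.3g s\n', tLoop, tCellfun);
end

function n = countWithLoop(source, searcharrays)
% Loop version: accumulate the ismember results directly.
n = zeros(size(source));
for k = 1:numel(searcharrays)
    n = n + ismember(source, searcharrays{k});
end
end

function n = countWithCellfun(source, searcharrays)
% cellfun version: build a cell array of results, stack it, then sum.
X = cellfun(@(c) ismember(source, c), searcharrays, 'UniformOutput', false);
n = sum(vertcat(X{:}), 1);
end
```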
Dear friend,
Thank you for your comments and guidance.


More Answers (2)

Guillaume on 20 Jan 2016
Edited: Guillaume on 20 Jan 2016
The ismember function is your friend. The search is much easier to perform if you put all your search arrays into one big cell array. Giving individual names to similar variables is never a good idea.
source = {'a', 'b', 'c', 'd', 'e', 'f'}
bb = {'a', 'e', 'g', 'l', 'f', 'h'}
cc = {'e', 'c', 'j', 'k', 'l', 'aa'}
dd = {'a', 'z', 'x', 'yy', 'e', 'f'}
ee = {'z', 'h', 'e', 'a', 'aa', 'f'}
searcharrays = {bb, cc, dd, ee};     % put all search arrays into one cell array
searchcount = zeros(size(source));   % initialise count to 0
for sidx = 1:numel(searcharrays)     % loop over each search array
    % add 1 for every source element found in this search array
    searchcount = searchcount + ismember(source, searcharrays{sidx});
end

4 Comments

mhm on 20 Jan 2016
Edited: mhm on 20 Jan 2016
Thanks for your answer, but...
I think this code does not do what I want, because I expect the result below for these values (source = {'a', 'b', 'c', 'd', 'e', 'f'}):
[3,0,1,0,4,3]
because...
3 -> there is 'a' in bb,dd,ee
0 -> there is 'b' in none of them
1 -> there is 'c' in cc
0 -> there is 'd' in none of them
4 -> there is 'e' in bb,cc,dd,ee
3 -> there is 'f' in bb,dd,ee.
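The breakdown above can be reproduced programmatically. The sketch below is only an illustration: the names cell array and the counts vector are introduced here for printing and checking, and the data are taken from the question.

```matlab
source = {'a','b','c','d','e','f'};
bb = {'a','e','g','l','f','h'};
cc = {'e','c','j','k','l','aa'};
dd = {'a','z','x','yy','e','f','a','a'};
ee = {'z','h','e','a','aa','f','e','a'};

names = {'bb','cc','dd','ee'};          % labels used only for printing
searcharrays = {bb, cc, dd, ee};
counts = zeros(size(source));

for s = 1:numel(source)
    % True for every search array that contains source{s}.
    inArray = cellfun(@(c) ismember(source{s}, c), searcharrays);
    counts(s) = nnz(inArray);
    if any(inArray)
        where = strjoin(names(inArray), ',');
    else
        where = 'none of them';
    end
    fprintf('%d -> ''%s'' is in %s\n', counts(s), source{s}, where);
end
```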
This is very much the same code I was going to propose for the task until I realized I had no idea what the datatypes were. I had the minor difference of initializing the count to a scalar 0 instead of bothering with zeros(size(source))
mhm on 20 Jan 2016
Edited: mhm on 20 Jan 2016
We can use the ismember function with for loops, but I don't like to use loops. I'd like to use MATLAB facilities and tricks for this problem in order to get better performance and optimized code.
How can I combine ismember and cellfun for this problem?
The code gives exactly your desired result. If you look at the content of searchcount, it has the values that you want.
There are usually very good reasons to replace loops by vectorised operations, as this often makes the code more efficient. This is not one of those cases. The loop in this case is more efficient.
@Walter, to me it makes more sense to initialise the searchcount to a vector of the right size (it degenerates better when searcharrays is empty), but indeed for the generic case, it could just be a scalar 0.
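The difference between the two initialisations only shows up in the degenerate case. A minimal sketch, assuming an empty searcharrays so that the loop body never runs (the variable names countVector and countScalar are introduced here for illustration):

```matlab
source = {'a','b','c','d','e','f'};
searcharrays = {};                       % nothing to search in

countVector = zeros(size(source));       % initialised to the right size
countScalar = 0;                         % initialised to a scalar

for sidx = 1:numel(searcharrays)         % the loop body never runs
    countVector = countVector + ismember(source, searcharrays{sidx});
    countScalar = countScalar + ismember(source, searcharrays{sidx});
end

% countVector is still 1-by-6 (one zero per source element),
% while countScalar is a single 0 that no longer lines up with source.
```

With at least one search array, both variables end up as 1-by-6 vectors, because adding a scalar to the ismember result expands it.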



Asked: mhm on 19 Jan 2016

Commented: mhm on 21 Jan 2016
