grouping elements of a column that correspond to specific elements from another column
    4 views (last 30 days)
  
       Show older comments
    
Dear all, I issue the following commands in order to merge 3 excel files
 clear all 
 fname = {'fgg.xlsx', 'sedf.xlsx','sddefff.xlsx'};
 [data,text,a]  = cellfun(@xlsread,fname,'un',0);
[m,n] = cellfun(@size,a);
mm = max(m);
mn = max(n);
out1 = arrayfun(@(x,y,z)[x{:},nan(y,mn-z)],a,m,n,'un',0);
out1 = cat(1,out1{:});
 out1=out1(~cellfun(@(x)all(isnan(x)), out1(:,8)),:);% erase empty cells in the date vector
 So the resulting outcome is out1 which is 
out1={ 
'country' 'area'  'number' 'geographical codes'
'MN'     [      0]   [1.2868]    [             NaN]
'MN'    [      0]    [2.9102]         [      0]    
'MN'    'AER_KL1'   [1.2868]         'B1'   
'MN'    'AER_KL1'  [2.9102]        'B1'  
'MN'     'AER_KL1'  [1.2868]      'B1'
'MN'     'AER_KL1'  [1.2868]     'B1'
'MN'    'AER_KL1'  [3.0740]    'B1'
'MN'     'AER_KL1' [3.0740]           'B2'    
'MN'      'AER_KL1' [3.0740]     'B2'   
'MN'     'AER_KL1'  [3.0740]    'B2'   
'MN'      'AER_KL1' [3.0740]     'B2'   
'MN'   'AER_KL1'  [3.0740]    'B2'   
'MN'    'AER_KL1'  [3.0740]       'B2'   
'MN'   'AER_KL2'  [3.0740]      'B2'   
'MN'     'AER_KL2'  [3.0740]      'B2'   
'MN'   'AER_KL2'    [3.0740]      'B2'   
'MN'   'AER_KL2'   [3.0740]      'B2'   
'MN'  'AER_KL2'    [3.0740]      'B2'   
'MN'   'AER_KL2'  [3.0740]       'B2'   
'MN'  'AER_KL2'   [3.0740]       'B2'   
'MN'  'AER_KL2'   [28.2414]    'B2'   
'MN'   'AER_KL3'   [3.0740]       'B2'   
'MN'     'AER_KL3'  [3.0740]       'B2'   
'MN'   'AER_KL3'   [3.0740]       'B2'   
'MN'   [      0]     [28.2414]       [      0]   
'MN'    [      0]   [3.0740]        [      0]   
'MN'   [      0]    [3.0740]      [      0]  
'MN'   [      0]    [29.6135]  [      0]     
'MS'    [      0]  [29.6135] [      0]   
'MS'      [      0]  [3.0740]        [             NaN]
'MS'      'AER_KL1'  [3.0740]             'KS 3001'  
'MS'      'AER_KL1'   [3.0740]       'KS 3001'  
'MS'       'AER_KL1'   [3.0740]       'KS 3001'  
'MS'      'AER_KL1' [3.0740]        'KS 3001'  
'MS'     'AER_KL1' [3.0740]       'KS 3001'  
'MS'     'AER_KL1'  [29.6135]    'KS 3001'  
'MS'   'AER_KL1'  [28.2414]  'KS 3001'  
'MS'   'AER_KL1'  [3.0740]      'KS 3001'  
'MS'   'AER_KL1'  [3.0740]      'KS 3001'  
'MS'   'AER_KL1' [28.2414]      'KS 3001'  
'MS'  'AER_KL1'  [29.6135]  'KS 3001'  
'MS'    'AER_KL2'  [3.0740]       ' SDIRNR+3000 '       
'MS'  'AER_KL2'    [28.2414]  ' SDIRNR+3000 '       
'MS'  'AER_KL2'    [3.0740]        ' SDIRNR+3000 '       
'MS'    'AER_KL2'  [3.0740]         ' SDIRNR+3000 '       
'MS'    'AER_KL2'   [3.0740]           ' SDIRNR+3000 '       
'MS'    'AER_KL2'   [3.0740]       ' SDIRNR+3000 '       
'MS'     'AER_KL2'    [29.6135]      ' SDIRNR+3000 '       
'MS'  'AER_KL2'    [3.0740]        ' SDIRNR+3000 '       
'MS'     'AER_KL2'    [3.0740]        ' SDIRNR+3000 '       
'MS'    'AER_KL2'  [3.0740]         ' SDIRNR+3000 '
As you can see I have 2 countries (MN and MS -first column) and for each country I have a specific number of areas-second column- (for MN I have 'AER_KL1', 'AER_KL2' and 'AER_KL3' and for MS I have 'AER_KL1' and 'AER_KL2' ). Each area is assigned a geographical code (last column)
I want to select first only MN and then find the geographical codes that correspond to this country. Then select 'MS' and find the geographical codes that correspond to this country and so on…using some loop For example I want to get
 'MN'           'B1'   
 'MN'           'B2'   
'MN'         'B3' 
'MS'            'KS 3001'   
'MS'           'SDIRNR+3000'
I would like to find a code that will produce this simplified matrix
Thanks in advance
PS: In my real data I have 40 countries and the number of areas (or geographical codes) varies across countries
0 Comments
Accepted Answer
  Cedric
      
      
 on 28 Jan 2013
        
      Edited: Cedric
      
      
 on 28 Jan 2013
  
      I would go for something like that:
 >> cName = 'MN' ;
 >> flagCountry = cellfun(@(cntry)strcmp(cntry, cName), out1(:,1)) ;
 >> flagValid   = cellfun(@(code)ischar(code), out1(:,end)) ;
 >> unique(out1(flagCountry&flagValid,end))
2 Comments
  Cedric
      
      
 on 29 Jan 2013
				
      Edited: Cedric
      
      
 on 29 Jan 2013
  
			You just need to loop over unique country/region codes:
 cntryCodes = unique(out1(:,1)) ;
 for ii = 1:numel(cntryCodes)
    flagCountry = cellfun(@(cntry)strcmp(cntry, cntryCodes{ii}), ...
                          out1(:,1)) ;
    flagValid   = cellfun(@(code)ischar(code), out1(:,end)) ;
    regionCodes = unique(out1(flagCountry&flagValid,end)) ;
    for jj = 1:numel(regionCodes)
       fprintf('%s\t%s\n', cntryCodes{ii}, regionCodes{jj}) ;
    end
    fprintf('\n') ;
 end
More Answers (1)
  Sean de Wolski
      
      
 on 28 Jan 2013
        Looks like you want to use unique() with the 'rows' flag and then something else to remove zeros and nans.
0 Comments
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!

