How to delete/retain variables (columns) from a table based on sections of the variables name

1 view (last 30 days)
Hello,
Wondering if anyone knows how to delete/retain variables (columns) from a table based on portions of the variables name?
This is a simple example, real data has more than 30000 variables (columns) with combinations of names and other information.
AA_001_AMER_USA AA_002_AMER_USA AA_001_AMER_CAN BB_001_AMER_USA BB_002_AMER_CAN
How to delete only variables with name section 'USA'
How to delete only variables with name section '001' & 'USA'
How to retain only variables with name section= 'USA'
How to retain only variables with name section= '001' & 'USA'
Thanks!

Accepted Answer

per isakson
per isakson on 12 Jan 2018
Edited: per isakson on 12 Jan 2018
One way
%%Sample table
tbl = array2table( magic(5) );
tbl.Properties.VariableNames = ...
{ 'A_001_AMER_USA' , 'AA_002_AMER_USA' ...
, 'AA_001_AMER_CAN', 'BB_001_AMER_USA' ...
, 'BB_002_AMER_CAN' };
How to retain only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$', 'match' );
T01 = tbl( :, cat(2,cac{:}) );
%%How to retain only variables with name section= '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$', 'match' );
T02 = tbl( :, cat(2,cac{:}) );
How to delete only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$', 'match' );
T03 = tbl;
T03( :, cat(2,cac{:}) ) = [];
%%How to delete only variables with name section '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$', 'match' );
T04 = tbl;
T04( :, cat(2,cac{:}) ) = [];
and a variant, which I guess is faster
How to retain only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$' );
T01 = tbl( :, not( cellfun( @isempty, cac ) ) );
%%How to retain only variables with name section= '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$' );
T02 = tbl( :, not( cellfun( @isempty, cac ) ) );
How to delete only variables with name section 'USA'
cac = regexp( tbl.Properties.VariableNames, '^.+_USA$' );
T03 = tbl( :, cellfun( @isempty, cac ) );
%%How to delete only variables with name section '001' & 'USA'
cac = regexp( tbl.Properties.VariableNames, '^[^_]+_001_[^_]+_USA$' );
T04 = tbl( :, cellfun( @isempty, cac ) );

More Answers (0)

Categories

Find more on Cell Arrays in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!