How to identify strings that can be converted to float numbers in a cell array?
You are now following this question
- You will see updates in your followed content feed.
- You may receive emails, depending on your communication preferences.
An Error Occurred
Unable to complete the action because of changes made to the page. Reload the page to see its updated state.
Show older comments
I can convert a string to double using str2double, but my goal is to identify such cells. Attached is an example file.
A = readcell('test6.xlsx');
As you can see, some cells on the 2nd column meet this criteria.
Many thanks.
1 Comment
Stephen23
on 11 Mar 2025
The simplest approach ist to just call STR2DOUBLE.
Accepted Answer
Les Beckham
on 11 Mar 2025
I would suggest using readtable instead of readcell
A = readcell('test6.xlsx')
A = 9x6 cell array
{'Cruise ID'} {'Salinity_flag'} {'Longitude_DEC'} {'Temp' } {'Salinity'} {'EXPO' }
{'ID12' } {[ 2]} {[ -25.0100]} {[ 27.7690]} {[ 35.5800]} {'33RC1991'}
{'ID12' } {[ 2]} {[ -25.0020]} {[ 25.9660]} {[ 35.3670]} {0x0 char }
{'ID12' } {0x0 char } {[ -24.9930]} {[ 25.0250]} {[ 35.8480]} {[ 200105]}
{0x0 char } {0x0 char } {[ -25.0010]} {[<missing>]} {0x0 char } {[ 456]}
{'ID12' } {'2' } {[ -25.0020]} {[ 25.3260]} {[ 36.0880]} {[ 3689]}
{'ID12' } {'2' } {[ -25]} {[ 26.1690]} {[ 35.9120]} {[ 3256778]}
{'ID12' } {'2' } {[ -25]} {[<missing>]} {[ 36.2220]} {'33RC1991'}
{'ID12' } {[ 2]} {[ -25]} {[<missing>]} {[ 36.3660]} {'33RC1991'}
You can define the variable types so Matlab knows which columns (variables) are supposed to be numeric.
opts = detectImportOptions('test6.xlsx');
opts.VariableTypes = { 'char', 'double', 'double', 'double', 'double', 'double' };
T = readtable('test6.xlsx', opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 8x6 table
CruiseID Salinity_flag Longitude_DEC Temp Salinity EXPO
__________ _____________ _____________ ______ ________ __________
{'ID12' } 2 -25.01 27.769 35.58 NaN
{'ID12' } 2 -25.002 25.966 35.367 NaN
{'ID12' } NaN -24.993 25.025 35.848 2.001e+05
{0x0 char} NaN -25.001 NaN NaN 456
{'ID12' } 2 -25.002 25.326 36.088 3689
{'ID12' } 2 -25 26.169 35.912 3.2568e+06
{'ID12' } 2 -25 NaN 36.222 NaN
{'ID12' } 2 -25 NaN 36.366 NaN
And, if you really want a cell array instead of a table:
C = table2cell(T)
C = 8x6 cell array
{'ID12' } {[ 2]} {[-25.0100]} {[27.7690]} {[35.5800]} {[ NaN]}
{'ID12' } {[ 2]} {[-25.0020]} {[25.9660]} {[35.3670]} {[ NaN]}
{'ID12' } {[NaN]} {[-24.9930]} {[25.0250]} {[35.8480]} {[ 200105]}
{0x0 char} {[NaN]} {[-25.0010]} {[ NaN]} {[ NaN]} {[ 456]}
{'ID12' } {[ 2]} {[-25.0020]} {[25.3260]} {[36.0880]} {[ 3689]}
{'ID12' } {[ 2]} {[ -25]} {[26.1690]} {[35.9120]} {[3256778]}
{'ID12' } {[ 2]} {[ -25]} {[ NaN]} {[36.2220]} {[ NaN]}
{'ID12' } {[ 2]} {[ -25]} {[ NaN]} {[36.3660]} {[ NaN]}
9 Comments
Leon
on 11 Mar 2025
Moved: Walter Roberson
on 11 Mar 2025
Many thanks for the proposed solution.
Unfortunately, I have hundreds of these files, and each of them is like a box of chocolate. I have no idea which column should be read as char or double. I have to write my own program to decipher that. That's why I used "readcell".
So far I'm able to identify these:
A = readcell('test6.xlsx');
% Determine the cells with doubles:
map1 = cellfun('isclass', A, 'double');
% Determine the cells with empty strings, like ' ':
map2 = cellfun(@(x)ischar(x) && isempty(deblank(x)),A);
% Determine the cells with missing values:
map3 = ismissing(string(A));
I need a map4 to identify strings that can potentially be converted into a double.
Walter Roberson
on 11 Mar 2025
It is possible to write a regexp pattern that matches floating point numbers, but it is a nuisance to do so. It goes something like
'$\w*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\w*$'
The major nuisance here is that floating point numbers might start with a digit (after optional sign) and that might be followed by an optional period, but floating point numbers might also start directly with a period -- and you need to eliminate the possibility that the floating point number has only a period.
Leon
on 11 Mar 2025
Many thanks. How specifically should I write such a regexp script?
Something like the below? Unfortunately, it does not work for me? The results show all of the cells as 0x0 double.
A = readcell('test6.xlsx');
B = regexp(string(A), '$\w*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\w*$')
A = readcell('test6.xlsx');
mask = cellfun(@(C) ischar(C) && regexp(C, '$\s*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\s*$', 'once'), A);
This includes correction of the two \w to \s
Note that space is not permitted after the optional leading sign. Strictly speaking, the syntax for literals does not permit a space at that point, and "no space" is what is enforced by str2double(). In practice if you were entering numbers into MATLAB, then if the sign is not confused as indicating an operation between adjacent numbers, then the sign would be treated as a unary-minus or unary-plus operation
123 +456 %treated as being plus, giving a scalar result
[123 +456] %treated as a vector of two values with the second inherently positive
+456 %treated as a single literal value that is positive
+ 456 %treated as unary plus applied to 456
Leon
on 11 Mar 2025
Many thanks. Unfortunately, I got the error as below:
Operands to the logical AND (&&) and OR (||) operators must be convertible to logical scalar values. Use the ANY or ALL
functions to reduce operands to logical scalar values.
Error in readExcel_example>@(C)ischar(C)&®exp(C,'$\s*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\s*$','once') (line 10)
mask = cellfun(@(C) ischar(C) && regexp(C, '$\s*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\s*$', 'once'), A);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error in readExcel_example (line 10)
mask = cellfun(@(C) ischar(C) && regexp(C, '$\s*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\s*$', 'once'), A);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Note: correction of leading $ in pattern to be ^
A = readcell('test6.xlsx');
mask = cellfun(@(C) ischar(C) && ~isempty(regexp(C, '^\s*[+-]?(\d+(\.\d*)?|\.\d+)([deDE][+-]?\d*)?\s*$', 'once')), A)
mask = 9x6 logical array
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 0 0 0 0 0
Leon
on 11 Mar 2025
It works. Many thanks.
Note that str2double will give you the same mask. It returns NaN for anything that it can't convert.
A = readcell('test6.xlsx');
mask = ~isnan(str2double(A))
mask = 9x6 logical array
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 0 0 0 0 0
Walter Roberson
on 11 Mar 2025
Yes. You specifically wanted a method that did not use str2double() (for reasons that are not clear to us).
More Answers (1)
Can it be as simple as this?
>> str2double(A)
ans =
NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN
NaN 2 NaN NaN NaN NaN
NaN 2 NaN NaN NaN NaN
NaN 2 NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN
map0 = ~isnan(B)
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 1 0 0 0 0
0 0 0 0 0 0
Categories
Find more on Spreadsheets in Help Center and File Exchange
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!Select a Web Site
Choose a web site to get translated content where available and see local events and offers. Based on your location, we recommend that you select: .
You can also select a web site from the following list
How to Get Best Site Performance
Select the China site (in Chinese or English) for best site performance. Other MathWorks country sites are not optimized for visits from your location.
Americas
- América Latina (Español)
- Canada (English)
- United States (English)
Europe
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)