MATLAB Answers

How to import data from attatched csv file?

3 views (last 30 days)
In the attatched csv file there is a lot of clutter other than data, first 11 lines are not needed, so I used
T=readtable('data.csv','HeaderLines',11);
but in the data also, every data point has its unit GHz attached to it, which I dont want to import into my array, how can I stop that GHz from being imported. What more lines of code needs to be added. For ex: output from above readtable function is :
Var1 Var2
_____________ ______
{'1.000 GHz'} 200
{'1.100 GHz'} 200
{'1.200 GHz'} 200
{'1.300 GHz'} 200
{'1.400 GHz'} 200
{'1.500 GHz'} 200
{'1.600 GHz'} 199.99
{'1.700 GHz'} 199.99
{'1.800 GHz'} 199.99
{'1.900 GHz'} 199.99
{'2.000 GHz'} 199.99
It is taking var1 as string because of the presence of GHz, please someone help me remove that. Also explain how can I name change names of these var1 and var2 by using appropritae argument is readtable function.
I noticed another problem, in the var2 column the data is till 3 decimal places, but imported data is automatically rounded off to 2 decimal place, i don't want it to be rounded off. How to avoid that?
Thankyou in advance.

  2 Comments

Prabhanshu Chandra
Prabhanshu Chandra on 12 Aug 2020
So sorry Chris, I added the question again and forgot to add to this one, now its attached.

Sign in to comment.

Accepted Answer

Cris LaPierre
Cris LaPierre on 12 Aug 2020
Edited: Cris LaPierre on 12 Aug 2020
You could try using the Format name-value pair. You might need to read about it in textscan for the specifics of how it works. This allows you to specify the format of the delimited values in each row.
Space gets treated as a delimiter here. Since there is a space between the number and GHz, it can be read in independent of the number and actually be ignored. Since we don't have your csv, I can't test it for your data, but try somethign like this.
T=readtable('data.csv','HeaderLines',11,"Format",'%f%*s%f')
From the sample data you shared, this is the result I get.
T =
Var1 Var2
____ ______
1 200
1.1 200
1.2 200
1.3 200
1.4 200
1.5 200
1.6 199.99
1.7 199.99
1.8 199.99
1.9 199.99
2 199.99

  5 Comments

Show 2 older comments
Prabhanshu Chandra
Prabhanshu Chandra on 12 Aug 2020
Oh you are right, its only rounding in the display, actual values have been correctly read. Did not know it showed rounded values in display, I am not an extensive MATLAB user.
T.Properties.VariableNames={'newVar1','newVar2'}
This also worked like a charm, thanks a lot Chris, learned some useful things today because of you.
Cris LaPierre
Cris LaPierre on 12 Aug 2020
Glad to help. One more follow up. It's only rounding with you display the entire table. Try displaying just var2 and it will print without rounding.
T.Var2
ans =
199.9980
199.9970
199.9970
199.9960
199.9950
199.9950
199.9940
199.9930
199.9920
199.9920
...

Sign in to comment.

More Answers (1)

Jeremy Hughes
Jeremy Hughes on 12 Aug 2020
In R2020a,
T = readtable('data.csv','TrimNonNumeric',true,'NumHeaderLines',11);
or in earlier releases,
opts = detectImportOptions('data.csv','TrimNonNumeric',true,'NumHeaderLines',11);
T = readtable('data.csv',opts);

  5 Comments

Show 2 older comments
Jeremy Hughes
Jeremy Hughes on 12 Aug 2020
Hmm, the NumHeaderLines thing seems like a bug. Reported.
Prabhanshu Chandra
Prabhanshu Chandra on 12 Aug 2020
Or maybe combined with TrimNonNumeric its acting that way.
Since we have already trimmed non numeric part we dont have to trim header again, we just have to trim numeric header. Thats why by using 1 as argument it works fine, because there only 1 numeric line in header part of my csv file.
Jeremy Hughes
Jeremy Hughes on 12 Aug 2020
Oh, I see, never mind. Not a bug exactly--just a quirk of CSV data. The first row begins with a double quote character.
CSV spec says that should be one field, containing the newlines and all... so there is only one line being interpreted by the parser. The first "row" considering quotes looks like this
"freq\nReferences : ['R']\nDependency : [freq]\nNum. Points : [91]\nMatrix Size : scalar\nType : Real", ...
"R\nReferences : []\nDependency : [freq]\nNum. Points : [91]\nMatrix Size : scalar\nType : Real"
If you leave off the NumHeaderLines it treats this as a row, and stripes away everything but the 91. and I see:
T = readtable('data.csv','TrimNonNumeric',true); head(T)
ans =
8×2 table
Var1 Var2
____ ______
91 91
1 200
1.1 200
1.2 200
1.3 200
1.4 200
1.5 200
1.6 199.99

Sign in to comment.

Products


Release

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!