Import data from text file in awkward format

Hello Everyone,
I'm trying to bulk import data from text files. While the data format is logical and always the same, it is very awkward.
First, there are about 60 lines describing the experimental conditions, then there are columns and rows of data, which is what I would like to import as a matrix. The columns are separated by spaces. The last column has no numbers, it's just a bunch of dots (....)
Then, after about 600 lines, the repeats are listed, i.e. there's a line that says "Curve 2", followed by headers and then the desired data again.
I have tried different ways of using textread and textscan, but without success. Is there any way to import the first 9 numbers of, let's say, lines 60 to 600? Ideally I'd only import the columns V and A, but I'm happy just getting the entire matrix and then copying the parts I really want into a new matrix.
I'll attach screenshots (the data in the red box is what I want) and an example file. It's the output of running CV on a Gamry potentiostat, in case anyone has experience with that.
There will be hundreds of such data files so I really need an automatic way of doing it...
Best wishes,
A.

Accepted Answer

Mostafa on 3 Nov 2016
Edited: Mostafa on 3 Nov 2016
This is neither the fastest, nor the most elegant solution, but it works.
%Read File
fid = fopen('example.txt','rt');
impData = textscan(fid, '%s', 'Delimiter', '\n'); %one cell per line of the file
fclose(fid);
impData = impData{1};
%Remove empty lines
impData(cellfun(@isempty, impData)) = [];
%Split data strings
strData = cellfun(@strsplit, impData, 'UniformOutput', false);
%Return the indices of the lines that contain the keyword CURVE
idxCurve = find(~cellfun(@isempty, strfind(impData, 'CURVE')));
%Save the data in a new variable
%Each curve ends just before the next CURVE line; the last one ends at the end of the file
bounds = [idxCurve(:); length(strData)+1];
for i = 1:length(idxCurve)
    %Dynamic field names avoid eval
    Data.(['Curve' num2str(i)]) = strData(idxCurve(i)+5 : bounds(i+1)-1);
end
%All the data is stored in Data.Curve1, Data.Curve2, ...
%Assume you want all the data from Curve3, column 2
dataOfInterest = cellfun(@(X) X{2}, Data.Curve3, 'UniformOutput', 0);
%Convert the data into doubles (numbers)
dataOfInterest = str2double(dataOfInterest); %returns a numeric column vector
%Use a similar notation for the rest of the data
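Since there will be hundreds of files, the same steps could be wrapped in a function and called in a loop. The following is only a sketch under several assumptions: the files match '*.txt' in the current folder, every curve block starts with a line containing CURVE, the numeric rows begin 5 lines after it (as in the code above), and MATLAB R2016b or later is used so that a script may contain local functions. The names readCurves and firstNine are hypothetical helpers, not part of MATLAB.

```matlab
%Sketch: apply the import steps to every text file in the current folder
files = dir('*.txt');                 %assumed file pattern
allData = cell(numel(files), 1);
for k = 1:numel(files)
    allData{k} = readCurves(files(k).name);
end

function curves = readCurves(fileName)
%readCurves (hypothetical helper) returns one numeric matrix per curve
    fid = fopen(fileName, 'rt');
    lines = textscan(fid, '%s', 'Delimiter', '\n');
    fclose(fid);
    lines = lines{1};
    lines(cellfun(@isempty, lines)) = [];
    %Lines that contain the keyword CURVE mark the start of each block
    idxCurve = find(~cellfun(@isempty, strfind(lines, 'CURVE')));
    bounds = [idxCurve(:); numel(lines)+1];
    curves = cell(numel(idxCurve), 1);
    for i = 1:numel(idxCurve)
        rows = lines(idxCurve(i)+5 : bounds(i+1)-1);
        tokens = cellfun(@strsplit, rows, 'UniformOutput', false);
        numeric = cellfun(@firstNine, tokens, 'UniformOutput', false);
        curves{i} = vertcat(numeric{:});
    end
end

function v = firstNine(t)
%firstNine (hypothetical helper) converts the first 9 tokens of a row;
%missing or non-numeric tokens become NaN
    v = nan(1, 9);
    n = min(9, numel(t));
    v(1:n) = str2double(t(1:n));
end
```

With this layout, allData{k}{3}(:, 2) would be column 2 of the third curve in the k-th file. Which columns hold V and A depends on the header row of the file, so check that against your own data.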
  4 Comments
Alexander Al-Zubeidi on 4 Nov 2016
Thanks! I wasn't sure how to convert it for all cells, only for one at a time. I haven't tried it yet, but your way looks like it would be faster, so I'll go back and change it. But I agree, as long as it works, that's all that really matters!
Anyway, thanks again!
Mostafa on 7 Nov 2016
You're welcome.
Using cellfun or arrayfun is usually more compact than a for loop or a while loop, but it is not always faster, especially with anonymous functions. You can look up tic and toc in the documentation and use them to check the execution speed.
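A quick way to check this on your own machine is to time both variants with tic and toc. The following is only a sketch; the sample strings are made up for the comparison, and the timings will vary with MATLAB version and data size.

```matlab
%Sketch: time a cellfun call against an equivalent for loop
%Made-up sample data: 10000 numbers stored as strings
strs = arrayfun(@(x) sprintf('%.6f', x), rand(1e4, 1), 'UniformOutput', false);

tic
viaCellfun = cellfun(@str2double, strs);
tCellfun = toc;

tic
viaLoop = zeros(size(strs));
for k = 1:numel(strs)
    viaLoop(k) = str2double(strs{k});
end
tLoop = toc;

fprintf('cellfun: %.4f s, for loop: %.4f s\n', tCellfun, tLoop);
```

Run it a few times before drawing conclusions, since the first run often includes one-off overhead.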
