Importing data from a complex text file

Greetings
I'm fairly inexperienced in MATLAB and I've got a rather difficult problem on my hands. I need to import a set of numbers from a text file. A sample of the text file is at this link: https://www.dropbox.com/s/oexzfcdrfyj8n2n/FSB32E15M-L20.txt?dl=0
As you can see, the numbers are arranged in groups. Each group consists of 512 pairs of numbers enclosed in parentheses. I need to import these numbers into MATLAB in the following way: each group of numbers will make two vectors, for example X1 and Y1. The first number in each pair goes into the X1 vector, and the second number goes into the Y1 vector. For example, if the text file says (2029,0),(2034,0),(1998,0)... my vectors will be:
X1 = [2029 2034 1998 ....]
Y1 = [0 0 0 ....]
Since the file has 25 groups of numbers, I should end up with 50 vectors: X1-X25 and Y1-Y25. All other text in the file is to be ignored.
Any suggestions on how to do this will be appreciated :)
Btw, not everything needs to be automated or 100% efficient. I can do some manual work if it means much simpler code.
  2 Comments
Stephen23 on 5 Jun 2015
Edited: Stephen23 on 5 Jun 2015
Please upload the file here using the paperclip button, then click both the Choose file and Attach file buttons.
Not everyone uses Dropbox, and it would be nice to know what format the data file has.
Guillaume on 5 Jun 2015
You can dismiss the prompt(s) to sign in to Dropbox and still access the file. I agree, though, that attaching it here makes it more accessible.


Accepted Answer

Guillaume on 5 Jun 2015
There are many ways you could read that file: fgetl, fscanf, textscan, etc.
The way I'd do it is to read the whole file at once with fileread, extract each section of numbers with regexp (your sections of numbers always lie between the (CYDS) and (CYDE) lines), and finally parse each section with textscan (or another regular expression):
wholecontent = fileread('FSB32E15M-L20.txt');
numericsection = regexp(wholecontent, '(?<=\(CYDS\)).*?(?=\(CYDE\))', 'match');
%The regexp above extracts the shortest run of characters (the |.*?|) that is immediately
%preceded by (CYDS) (the |(?<=\(CYDS\))|) and immediately followed by (CYDE) (the |(?=\(CYDE\))|).
%Now, pass each section to textscan and extract the pairs of numbers into two columns
%(with %d, textscan returns int32 column vectors):
secxy = cellfun(@(section) textscan(section, '(%d,%d),'), numericsection, 'UniformOutput', false);
%Split and combine all xy cell arrays into a cell array of x and a cell array of y:
[x, y] = cellfun(@(xy) deal(xy{:}), secxy, 'UniformOutput', false);
%The next line only works if all sections have the same number of elements:
x = cell2mat(x); y = cell2mat(y);
Note that the output is two matrices: x holds one column of X values per group, and y holds one column of Y values per group. It is never a good idea to create numbered variables (X1, X2, ...). Matrices or cell arrays are always better.
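To illustrate why the matrices are easier to work with than X1-X25 and Y1-Y25, here is a minimal sketch. The small x and y below are hypothetical stand-ins for the real result (which would be 512-by-25 for the file described in the question), and the variable peak is just an illustrative name:

```matlab
% Hypothetical stand-in for the real result: 3 pairs per group, 2 groups.
x = int32([2029 10; 2034 20; 1998 30]);
y = int32([0 1; 0 2; 0 3]);

% What would have been X2 and Y2 is simply column 2 of each matrix.
X2 = x(:, 2);
Y2 = y(:, 2);

% Looping over every group needs no eval and no numbered variable names.
ngroups = size(x, 2);
peak = zeros(1, ngroups);
for k = 1:ngroups
    peak(k) = max(x(:, k));   % largest X value in group k
end
disp(peak)
```

With numbered variables, the same loop would need eval or 25 copies of the same line.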
Note that if cellfun is a problem for you, you can use a loop instead:
wholecontent = fileread('FSB32E15M-L20.txt');
numericsection = regexp(wholecontent, '(?<=\(CYDS\)).*?(?=\(CYDE\))', 'match');
x = cell(1, numel(numericsection));
y = cell(1, numel(numericsection));
for isec = 1:numel(numericsection)
    secxy = textscan(numericsection{isec}, '(%d,%d),');
    [x{isec}, y{isec}] = secxy{:};
end
x = cell2mat(x); y = cell2mat(y);
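For the "another regular expression" option mentioned above, a rough sketch could look like the following. It runs on a made-up sample string instead of the real file, so the header lines and the numbers are assumptions; it also assumes each pair is written as two integers with no spaces inside the parentheses. The second section is deliberately shorter, to show what to do when cell2mat cannot be used:

```matlab
% Made-up sample text with the (CYDS)/(CYDE) markers; not the real file.
sample = sprintf(['header to ignore\n(CYDS)\n(2029,0),(2034,0),\n(1998,0)\n(CYDE)\n', ...
                  'more text\n(CYDS)\n(10,1),(20,2)\n(CYDE)\n']);

sections = regexp(sample, '(?<=\(CYDS\)).*?(?=\(CYDE\))', 'match');
x = cell(1, numel(sections));
y = cell(1, numel(sections));
for k = 1:numel(sections)
    % Each token is a 1-by-2 cell holding the two numbers of one pair as text.
    tok = regexp(sections{k}, '\((-?\d+),(-?\d+)\)', 'tokens');
    pairs = str2double(vertcat(tok{:}));   % N-by-2 double matrix
    x{k} = pairs(:, 1);
    y{k} = pairs(:, 2);
end

% Only concatenate into matrices when every section has the same length;
% otherwise keep the cell arrays and index them as x{k}, y{k}.
lens = cellfun(@numel, x);
if isscalar(unique(lens))
    x = cell2mat(x);
    y = cell2mat(y);
end
```

Unlike textscan with %d, which returns int32, str2double returns doubles. In this sample the two sections have 3 and 2 pairs, so x and y stay as cell arrays.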
  1 Comment
Igor on 5 Jun 2015
Thank you very much. You've been extremely helpful.
