Extract data csv file different sizes?

1 view (last 30 days)
Milagros ARIETTI
Milagros ARIETTI on 28 Feb 2017
Hello! I'm sorry to bother. But I can't seem to be able to make my extraction work in a fast way. I have the attached csv files, where I have information in a lot of different ranges. I have a lot of files so I would love to have it all in one.
I've been using this 2 different codes
%%Initialize variables.
clear all
clc
excelname='R9T';
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 21;
endRow = 532;
%%Format string for each line of text:
formatSpec = '%*s%*s%f%f%f%f%f%f%f%f%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
A = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
%%Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 556;
endRow = 1067;
%%Read columns of data as strings:
formatSpec = '%*s%*s%s%s%s%s%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3,4,5,6]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Create output variable
B = cell2mat(raw);
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans raw col numericData rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me;
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 1089;
endRow = 1600;
%%Read columns of data as strings:
formatSpec = '%*s%*s%s%s%s%s%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3,4,5,6]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Create output variable
C = cell2mat(raw);
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans raw col numericData rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me;
%%export to Excel
xlswrite(excelname,A,1)
xlswrite(excelname,B,2)
xlswrite(excelname,C,3)
clear all
clc
excelname='R9A';
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 535;
endRow = 542;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
A = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
% Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 1070;
endRow = 1075;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
B = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
% Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 1603;
endRow = 1608;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
C = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
%%export to Excel
xlswrite(excelname,A,1)
xlswrite(excelname,B,2)
xlswrite(excelname,C,3)
But I have to do it for each file, and for each different type of curves I want to extract. What I would like is to have:
Data from startRow = 535; endRow = 542; startRow = 1070; endRow = 1075; startRow = 1603; endRow = 1608;
in the same horizontal concatenated sheet. for each 20 files I have.
then for the other trace startRow = 21; endRow = 532; startRow = 556; endRow = 1067; startRow = 1089; endRow = 1600;
in another sheet, file I don't care. The main problems is that the first set of data is a matrix of 8 and the others are 6.
Please could you help me? I am really struggling with this.

Answers (0)

Categories

Find more on Word games in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!