Extract data csv file different sizes?
1 view (last 30 days)
Show older comments
Hello! I'm sorry to bother. But I can't seem to be able to make my extraction work in a fast way. I have the attached csv files, where I have information in a lot of different ranges. I have a lot of files so I would love to have it all in one.
I've been using this 2 different codes
%%Initialize variables.
clear all
clc
excelname='R9T';
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 21;
endRow = 532;
%%Format string for each line of text:
formatSpec = '%*s%*s%f%f%f%f%f%f%f%f%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
A = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
%%Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 556;
endRow = 1067;
%%Read columns of data as strings:
formatSpec = '%*s%*s%s%s%s%s%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3,4,5,6]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Create output variable
B = cell2mat(raw);
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans raw col numericData rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me;
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = ',';
startRow = 1089;
endRow = 1600;
%%Read columns of data as strings:
formatSpec = '%*s%*s%s%s%s%s%s%s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Convert the contents of columns containing numeric strings to numbers.
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3,4,5,6]
% Converts strings in the input cell array to numbers. Replaced non-numeric
% strings with NaN.
rawData = dataArray{col};
for row=1:size(rawData, 1);
% Create a regular expression to detect and remove non-numeric prefixes and
% suffixes.
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
% Detected commas in non-thousand locations.
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
% Convert numeric strings to numbers.
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
%%Create output variable
C = cell2mat(raw);
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans raw col numericData rawData row regexstr result numbers invalidThousandsSeparator thousandsRegExp me;
%%export to Excel
xlswrite(excelname,A,1)
xlswrite(excelname,B,2)
xlswrite(excelname,C,3)
clear all
clc
excelname='R9A';
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 535;
endRow = 542;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
A = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
% Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 1070;
endRow = 1075;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
B = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
% Initialize variables.
filename = 'C:\Users\Milagros\Desktop\Nueva carpeta\Rev-erb for matlab\R9.csv';
delimiter = {',',';'};
startRow = 1603;
endRow = 1608;
%%Format string for each line of text:
formatSpec = '%*s%f%f%f%f%*s%*s%*s%*s%*s%[^\n\r]';
%%Open the text file.
fileID = fopen(filename,'r');
%%Read columns of data according to format string.
dataArray = textscan(fileID, formatSpec, endRow-startRow+1, 'Delimiter', delimiter, 'HeaderLines', startRow-1, 'ReturnOnError', false);
%%Close the text file.
fclose(fileID);
%%Create output variable
C = [dataArray{1:end-1}];
%%Clear temporary variables
clearvars filename delimiter startRow endRow formatSpec fileID dataArray ans;
%%export to Excel
xlswrite(excelname,A,1)
xlswrite(excelname,B,2)
xlswrite(excelname,C,3)
But I have to do it for each file, and for each different type of curves I want to extract. What I would like is to have:
Data from startRow = 535; endRow = 542; startRow = 1070; endRow = 1075; startRow = 1603; endRow = 1608;
in the same horizontal concatenated sheet. for each 20 files I have.
then for the other trace startRow = 21; endRow = 532; startRow = 556; endRow = 1067; startRow = 1089; endRow = 1600;
in another sheet, file I don't care. The main problems is that the first set of data is a matrix of 8 and the others are 6.
Please could you help me? I am really struggling with this.
0 Comments
Answers (0)
See Also
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!