Sift through large amounts of data from csv
How am I supposed to open a CSV file that contains a large amount of data?
I also need to select data at random according to certain options.
Can anyone help me open the CSV file and show me how to select randomly?
Answers (1)
Shadaab Siddiqie
on 17 Jun 2021
From my understanding, you want to import and operate on a large CSV file. If the file is too large for your system to handle in one pass, a workaround is to read it in blocks of, e.g., 500 MB. Here is some code that might help. Note that it assumes the file contains only numeric values and has no header row:
blockSize = 500e6 ;   % Characters per block; large enough that there are not too many blocks.
tailSize  = 100 ;     % Must exceed the character length of any single value.
% - Open file and stop early if that fails (fopen returns -1 on failure).
fId = fopen( 'largeFile.csv', 'r' ) ;
assert( fId ~= -1, 'Could not open largeFile.csv.' ) ;
% - Read first line, convert to double, determine #columns.
%   (Named firstLine to avoid shadowing the built-in LINE function.)
firstLine = fgetl( fId ) ;
data      = sscanf( firstLine, '%f,' ) ;
nCols     = numel( data ) ;
lastBit   = '' ;
while ~feof( fId )
    % - Read and pre-process block: line feeds become separators,
    %   carriage returns are dropped.
    buffer = fread( fId, blockSize, '*char' ) ;
    isLast = length( buffer ) < blockSize ;
    buffer(buffer==10) = ',' ;
    buffer(buffer==13) = '' ;
    % - Prepend the leftover from the previous block.
    if ~isempty( lastBit )
        buffer = [lastBit; buffer] ; %#ok<AGROW>
    end
    % - Truncate after the last ',' and keep the remainder for the next iteration,
    %   so that no value is split across two blocks.
    if ~isLast
        n       = find( buffer(end-tailSize:end)==',', 1, 'last' ) ;
        cutAt   = length( buffer ) - tailSize + n ;
        lastBit = buffer(cutAt:end) ;
        buffer(cutAt:end) = [] ;
    end
    % - Parse.
    data = [data; sscanf( buffer, '%f,' )] ; %#ok<AGROW>
end
% - Close file.
fclose( fId ) ;
% - Reshape data vector -> array with one row per CSV line.
data = reshape( data, nCols, [] )' ;
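The question also asks about selecting randomly by certain options, which the code above does not cover. Once `data` is in memory, one possible approach is to filter the rows with a logical condition and then draw from the matches with `randperm`. This is only a sketch: the small random matrix stands in for the real `data`, and the rule "second column greater than 0.5" and the sample size `nPick` are placeholder assumptions to replace with your own criteria.

```matlab
% Hypothetical stand-in for the matrix produced by the block reader above.
rng( 0 ) ;                            % Fix the seed so the draw is repeatable.
data = [ (1:10)', rand( 10, 2 ) ] ;   % 10 rows, 3 numeric columns.

% Assumed selection rule: keep rows whose second column exceeds 0.5.
candidates = find( data(:,2) > 0.5 ) ;

% Draw up to 3 of the matching rows, without replacement.
nPick  = min( 3, numel( candidates ) ) ;
picked = candidates( randperm( numel( candidates ), nPick ) ) ;
sample = data( picked, : ) ;
```

`randperm( n, k )` returns `k` distinct indices out of `n`, so no row is picked twice. Capping `nPick` with `min` avoids an error when fewer rows match than were requested.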