Issue when importing large csv file
I am attempting to import a large CSV file (~3 million rows) using the function readtable. The data set has two columns: the first is a datetime in UTC, and the second holds values from an accelerometer. However, when it is imported into MATLAB it gets split into 3 columns: a partial date, the hour, and the rest of the time together with a comma and the accelerometer value. Why is MATLAB expanding my data set to 3 columns? I've opened the file in Excel and Notepad++ and it has two columns. Thank you in advance!
4 Comments
Stephen23
on 25 Aug 2023
Please upload a sample data file by clicking the paperclip button. It does not have to be the entire data file, but must contain sufficient data to reproduce the effect that you describe.
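If the full file is too large to attach, one way to produce a small sample is to copy its first few lines into a new file. The following is a minimal sketch, not a definitive recipe: the file names, the line count, and the made-up stand-in rows (written only so the snippet runs on its own) are all assumptions.

```matlab
% Stand-in for the real file so this sketch runs on its own.
% The rows are made up; point srcFile at your own CSV instead.
srcFile = [tempname '.csv'];
fid = fopen(srcFile, 'w');
for k = 1:100
    fprintf(fid, '2023-08-25 12:00:%02d,%.3f\n', mod(k, 60), 0.001 * k);
end
fclose(fid);

dstFile = [tempname '_sample.csv'];   % hypothetical name for the sample file
nLines  = 20;                         % number of lines to copy (assumption)

fin  = fopen(srcFile, 'r');
fout = fopen(dstFile, 'w');
for k = 1:nLines
    txtLine = fgetl(fin);             % fgetl returns -1 at end of file
    if ~ischar(txtLine)
        break
    end
    fprintf(fout, '%s\n', txtLine);
end
fclose(fin);
fclose(fout);
```

The resulting file keeps the original line layout untouched, so it should reproduce the same import behaviour as the full file.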
Yousef
on 25 Aug 2023
Hi
The issue you are experiencing is most likely caused by the way MATLAB interprets the CSV file. When you import a file with the readtable function, MATLAB inspects the file contents to decide whether the first row is a header with column names, which character separates the columns, and what data type each column holds.
If the first column looks like date and time text, MATLAB may convert it to a datetime column. Similarly, if the second column contains numeric values, MATLAB may import it as a numeric column. If this detection guesses wrong, for example by choosing a delimiter other than the comma, a single date and time value can end up split across several columns.
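To see what MATLAB actually guessed for a particular file, detectImportOptions reports the detected delimiter and column types before anything is imported. This is a minimal sketch, assuming a two-column layout like the one described in the question. The sample rows and the timestamp format are made up, since the real file was not posted.

```matlab
% Write a tiny made-up file in the assumed layout: timestamp,value
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '2023-08-25 12:00:01,0.013\n');
fprintf(fid, '2023-08-25 12:00:02,0.021\n');
fprintf(fid, '2023-08-25 12:00:03,-0.004\n');
fclose(fid);

% Inspect what the automatic detection decided
opts = detectImportOptions(fname);
disp(opts.Delimiter)               % delimiter(s) MATLAB detected
disp(opts.VariableTypes)           % type guessed for each column
disp(numel(opts.VariableNames))    % number of columns MATLAB expects

% If the detected delimiter is not the comma, state it explicitly
opts = detectImportOptions(fname, 'Delimiter', ',');
T = readtable(fname, opts);
```

If the first call reports more than two columns on the real file, the delimiter was probably misdetected, which would be consistent with the symptom in the question.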
However, it sounds like your CSV file does not have a header row that defines the column names, and the first row contains both the date and time values, which are being interpreted as separate columns. To resolve this issue, you can try the following options:
1. Add a header row to your CSV file that defines the column names. This will allow MATLAB to correctly interpret the data types of the columns.
2. Use the readtable function with the header option set to none, which will prevent MATLAB from interpreting the first row as a header. For example:
data = readtable('your_file.csv', 'header', 'none');
This will allow you to specify the column names manually, rather than relying on MATLAB to guess them.
3. Use the textread function to read the CSV file directly into a cell array, without interpreting the data types. For example:
data = textread('your_file.csv', '%s', 'delimiter', ',');
This will allow you to access the data as a cell array, where each element of the array contains a single value from the CSV file. You can then use the cellfun function to convert the cell array into a numeric array, if needed.
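Note that the MATLAB documentation lists textread as not recommended and points to textscan instead. Below is a rough sketch of the textscan route, assuming each line holds a timestamp and one numeric value separated by a comma. The sample rows and the 'yyyy-MM-dd HH:mm:ss' timestamp format are assumptions, not taken from the poster's file.

```matlab
% Write a tiny made-up file in the assumed layout: timestamp,value
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '2023-08-25 12:00:01,0.013\n');
fprintf(fid, '2023-08-25 12:00:02,0.021\n');
fprintf(fid, '2023-08-25 12:00:03,-0.004\n');
fclose(fid);

% Read column 1 as text and column 2 as double, splitting only on commas
fid = fopen(fname, 'r');
C = textscan(fid, '%s %f', 'Delimiter', ',');
fclose(fid);

% Convert the text timestamps; InputFormat must match the file (assumed here)
t = datetime(C{1}, 'InputFormat', 'yyyy-MM-dd HH:mm:ss', 'TimeZone', 'UTC');
a = C{2};                          % accelerometer values as a numeric column
```

In datetime format strings, MM means month and mm means minute, so the case of each letter matters.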
Once you have imported the data into MATLAB, you can use the datetime function to convert the date and time values into a single datetime column, if needed. For example:
t = datetime(data(:, 1), 'InputFormat', 'yyyy-MM-dd HH:mm:ss'); % assumes data(:, 1) is a cell array of date and time text
This will convert the date and time values in the first column into a single datetime column, using the format 'yyyy-MM-dd H
Dyuman Joshi
on 25 Aug 2023
The above comment by Yousef feels like a chatbot response.
Walter Roberson
on 25 Aug 2023
readtable() does not have an option named 'header'. It also does not have any option for which 'none' is a special value. For example, if you use 'Sheet', 'none' then that would look for a sheet literally named "none", so 'none' is valid for some options but does not mean anything special to those options.
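For reference, options that readtable does accept for this situation include 'Delimiter' and 'ReadVariableNames'. This is a minimal sketch, assuming a comma-separated file with no header row. The sample rows and the column names Time and Accel are made up for illustration.

```matlab
% Write a tiny made-up file in the assumed layout: timestamp,value
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '2023-08-25 12:00:01,0.013\n');
fprintf(fid, '2023-08-25 12:00:02,0.021\n');
fprintf(fid, '2023-08-25 12:00:03,-0.004\n');
fclose(fid);

% Force the comma as the only delimiter and treat the first row as data
T = readtable(fname, 'Delimiter', ',', 'ReadVariableNames', false);

% Name the columns manually (hypothetical names)
T.Properties.VariableNames = {'Time', 'Accel'};
disp(T.Properties.VariableNames)
```

Stating the delimiter explicitly removes one source of guessing, which is likely what matters when a two-column file shows up as three columns.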