Error in data formatting when creating a matrix of only certain columns from a very large txt file in MATLAB

I have a .txt data file with 8 columns and 4,377,008 lines. The first row includes header titles, and the rest of the rows are numbers in a variety of formats.
For my code, I only want to use the data from columns 1, 2 and 5 in the txt file.
Column 1 is time in s, and the data in this column is positive numbers with up to 2 decimal places. (e.g. 78.95)
Column 2 is a node number: a non-negative whole number with values into the tens of thousands, i.e. anywhere between 0 and 60,000 (e.g. 35 or 36125).
Column 5 is flow rate, which can be positive or negative and ranges from small values with up to 10 decimal places to 2-digit or 3-digit numbers (e.g. -69.0052 or 0.00714967).
I have used a simple readmatrix code:
Q_data = readmatrix("filename.txt");
remove = [3, 4, 6, 7, 8];
Q_data(:, remove) = []
And I have also tried with selectedData, but in both cases the formatting of the data has completely changed... It appears to be standardised to the same number of decimal places, with no numbers higher than 10. For example:
This is what the data in the txt file looks like (only including columns 1, 2 and 5):
82.5, 12, -44.7079
and this is the equivalent row of data in the matrix in MATLAB:
0.0083 0.0012 -0.0045
As you can see, the format of the numbers has changed and I am losing some valuable information that I need to process the data.
Is there a way I can fix this? I would really appreciate any help :) Thank you in advance!
p.s. apologies if the formatting is weird, or I am missing any relevant information - this is my first time asking a question here!
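For readers who only need columns 1, 2 and 5, one possible approach is to select them at import time rather than deleting the others afterwards. The sketch below is an illustration, not the asker's code: it writes a tiny stand-in file with hypothetical header names, made-up rows and an assumed comma delimiter, then reads it with detectImportOptions and readmatrix.

```matlab
% Build a tiny stand-in for the real file.
% ASSUMPTIONS: header names, comma delimiter and data rows are invented for illustration.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'time,node,c3,c4,flow,c6,c7,c8\n');
fprintf(fid, '82.5,12,0,0,-44.7079,0,0,0\n');
fprintf(fid, '78.95,36125,0,0,0.00714967,0,0,0\n');
fclose(fid);

% Detect the file layout (delimiter, header row, column names).
opts = detectImportOptions(fname);

% Keep only columns 1, 2 and 5 by picking their detected names.
opts.SelectedVariableNames = opts.VariableNames([1 2 5]);

% Import just the selected columns as a numeric matrix.
Q_data = readmatrix(fname, opts);

% Remove the temporary stand-in file.
delete(fname);
```

With the real file, fname would simply be "filename.txt" and the block that writes the stand-in file would be dropped.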
  2 Comments
Mitchell Thurston on 7 Dec 2024
This could just be the way the data is displayed: since the second column can go up to the tens of thousands, MATLAB may be printing the matrix in a scaled or scientific notation. What happens when you run this command?
fprintf("%f, %f, %f\n", Q_data(10,:))
Izzy on 22 Dec 2024
This gives one line of the data, with each value in the right format! Unfortunately, when I try it for the whole matrix, it seems to print each whole column in one go, like an array, instead of row by row (i.e. 1, 1, 1, 1, 1, [...], 2, 2, 2, 2, 2, [...], 3, 3, 3, 3, 3, [...] while I need 1, 2, 3; 1, 2, 3; 1, 2, 3, [...]). I have used fprintf("%f, %f, %f\n", Q_data) and fprintf("%f, %f, %f\n", Q_data(:,:)), which both give the result described above. I am not sure whether this is me misunderstanding the command or not, though...
I have checked the data and am happy to see that it is all saved in the correct format, which was my main concern, so I think my question is practically answered. Thank you for your response, and sorry it took me so long to reply!
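A note on the whole-matrix case: fprintf reads its data argument in column-major order, so passing Q_data directly feeds it the first column, then the second, and so on. Passing the transpose should give one matrix row per printed line. A minimal sketch, using two made-up sample rows rather than the real file:

```matlab
% Made-up sample rows standing in for the imported matrix.
Q_data = [82.5 12 -44.7079; 78.95 36125 0.00714967];

% fprintf consumes values column by column, so transpose the matrix
% to make each original row fill one "%f, %f, %f\n" pattern.
fprintf("%f, %f, %f\n", Q_data.');

% sprintf follows the same rule and returns the text instead of printing it.
txt = sprintf("%f, %f, %f\n", Q_data.');
% First printed line: 82.500000, 12.000000, -44.707900
```

For the full 4-million-row matrix, printing a slice such as Q_data(1:10,:).' is probably more practical than printing everything.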


Accepted Answer

Izzy on 22 Dec 2024
This came from a misunderstanding on my part! The data is saved correctly in the matrix, but since it was printing incorrectly, and I was using this as a check method, I thought there was an issue with the data.
Using fprintf("%f, %f, %f\n", Q_data(10,:)) successfully prints 1 row of data in the correct format, if this is useful to anyone in future applications!

More Answers (1)

Walter Roberson on 7 Dec 2024
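You are examining the output by using disp() (or an implied disp(), such as just naming the variable on the command line or leaving off the semicolon at the end of a statement).
The default output format is "format short". For a matrix, "format short" examines the maximum absolute value of all of the data and chooses one common scale factor and number of decimal places for every element based on that maximum. With node numbers in the tens of thousands, the matrix is displayed scaled by 1.0e+04, which is why 82.5 shows up as 0.0083. Only the display is affected; the stored values are unchanged.
You would be better off looking at the data with "format long g" in effect.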
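The effect can be reproduced on a small matrix. This is a sketch with made-up rows chosen to resemble the data described in the question; the exact layout of the printed output may vary between releases.

```matlab
% Made-up rows: small values next to a node number in the tens of thousands.
row = [82.5 12 -44.7079];
x = [row; 78.95 36125 0.00714967];

format short
disp(x)        % displayed with a common scale factor, so 82.5 appears as 0.0083

format long g
disp(x)        % each value displayed at its own magnitude

format short   % switch back to the default numeric display

% The display format never changes what is stored in the variable.
stored_ok = isequal(x(1,:), row);
```

If changing the global display format is undesirable, printing a few rows with fprintf is another way to check the stored values.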

Release

R2023a
