Error in data formatting when creating a matrix of only certain columns from a very large txt file in MATLAB

Question

Izzy on 7 Dec 2024

https://au.mathworks.com/matlabcentral/answers/2171529-error-in-data-formatting-when-creating-a-matrix-of-only-certain-columns-from-a-very-large-txt-file-i

Answered: Izzy on 22 Dec 2024

I have a .txt data file with 8 columns and 4,377,008 lines. The first row includes header titles, and the rest of the rows are numbers in a variety of formats.

For my code, I only want to use the data from columns 1, 2 and 5 in the txt file.

Column 1 is time in s, and the data in this column are positive numbers with up to 2 decimal places (e.g. 78.95).

Column 2 is a node number, which is a whole, positive number with values into the tens of thousands, i.e. anywhere between 0 and 60,000 (e.g. 35 or 36125).

Column 5 is flow rate, and can be positive or negative values, ranging from numbers with up to 10 decimal places to 2-digit or 3-digit numbers (e.g. -69.0052 or 0.00714967).

I have used a simple readmatrix code:

Q_data = readmatrix("filename.txt");

remove = [3, 4, 6, 7, 8];

Q_data(:, remove) = []

I have also tried with selectedData, but in both cases the formatting of the data has completely changed... It appears to be standardised to the same number of decimal places, with no numbers higher than 10. For example:

This is what the data in the txt file looks like (only including columns 1, 2 and 5):

82.5, 12, -44.7079

and this is the equivalent row of data in the matrix in matlab:

0.0083 0.0012 -0.0045

As you can see, the format of the numbers has changed and I am losing some valuable information that I need to process the data.

Is there a way I can fix this? I would really appreciate any help :) Thank you in advance!

p.s. apologies if the formatting is weird, or I am missing any relevant information - this is my first time asking a question here!

2 Comments

Mitchell Thurston on 7 Dec 2024

This could just be the way the data prints out, with the display switching to scientific notation because of the second column, which you said could go up to the tens of thousands. What happens with this command?

fprintf("%f, %f, %f\n", Q_data(10,:))

Izzy on 22 Dec 2024

This gives one line of the data, with each value in the right format! Unfortunately, when I try it for the whole matrix, it seems to print each whole column in one go, like an array, instead of one row after the other (i.e. 1, 1, 1, 1, 1, [...], 2, 2, 2, 2, 2, [...], 3, 3, 3, 3, 3, [...] while I need 1, 2, 3; 1, 2, 3; 1, 2, 3, [...]). I have used fprintf("%f, %f, %f\n", Q_data) and fprintf("%f, %f, %f\n", Q_data(:,:)), which both give the result described above. I am not sure whether this is me misunderstanding the command, though...

I have checked the data and am happy to see that it is all saved in the correct format, which was my main concern, so I think my question is practically answered. Thank you for your response, and sorry it took me so long to reply!
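The column-by-column output described in this comment is expected behaviour: fprintf reuses the format string while reading the array's elements in column-major order. A minimal sketch of one way around it is to pass the transpose, so that each row of the original matrix is consumed as one group of three values. The small matrix below is an illustrative stand-in for Q_data (the values are made up, apart from the example row quoted in the question), and the format specifiers are just one reasonable choice.

```matlab
% Small stand-in for Q_data (time, node, flow rate); values are illustrative.
Q_data = [82.50 12    -44.7079;
          82.50 36125   0.00714967;
          82.51 35    -69.0052];

% fprintf walks through the array column by column, so pass the transpose:
% each column of Q_data.' is one row of Q_data.
fprintf("%.2f, %d, %g\n", Q_data.');

% The same idea captured as text, here for the first two rows only.
rows = 1:2;
txt = sprintf("%.2f, %d, %g\n", Q_data(rows, :).');
```

With the full 4,377,008-row matrix it is probably more practical to print a slice, for example Q_data(1:10, :).', than the whole array.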


Answer 1

Izzy on 22 Dec 2024

https://au.mathworks.com/matlabcentral/answers/2171529-error-in-data-formatting-when-creating-a-matrix-of-only-certain-columns-from-a-very-large-txt-file-i#answer_1556230

This came from a misunderstanding on my part! The data is saved correctly in the matrix, but since it was printing incorrectly, and I was using this as a check method, I thought there was an issue with the data.

Using fprintf("%f, %f, %f\n", Q_data(10,:)) successfully prints 1 row of data in the correct format, if this is useful to anyone in future applications!


Answer 2

Walter Roberson on 7 Dec 2024

https://au.mathworks.com/matlabcentral/answers/2171529-error-in-data-formatting-when-creating-a-matrix-of-only-certain-columns-from-a-very-large-txt-file-i#answer_1554834

You are examining the outputs by using disp() (or implied disp(), such as just naming the variable on the command line.)

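The default output format is "format short". For a matrix, "format short" examines the maximum absolute value of all of the data and determines the number of decimal places based on that overall maximum. In practice it prints a single common scale factor once at the top of the output (for example 1.0e+04 *) and shows every entry relative to it, which is why 82.5 appears as 0.0083 and 12 as 0.0012. With millions of rows, that scale-factor line scrolls out of view almost immediately.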

You would be better off looking at the data with "format long g" in effect.
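As a rough illustration of the difference, the sketch below displays the same small matrix under both formats. The matrix is a made-up stand-in for a few rows of Q_data, and the exact layout of the printed output may vary between releases; the point is only that the stored values do not change.

```matlab
% Stand-in for a few rows of Q_data (time, node, flow rate); illustrative values.
Q_data = [82.50 12    -44.7079;
          82.50 36125   0.00714967];

format short    % default: a common scale factor (such as 1.0e+04 *) for the whole matrix
disp(Q_data)

format long g   % each element shown on its own scale, up to about 15 significant digits
disp(Q_data)

format          % restore the default display format

% The stored values are identical in both cases; only the display differs.
isUnchanged = isequal(Q_data(1, :), [82.5 12 -44.7079]);
```

Ending the assignment lines with a semicolon, as above, also avoids echoing a multi-million-row matrix to the Command Window in the first place.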
