How to extract specific data from a txt file

Hi there!
I have a txt file that contains lines starting with either U4 or U6. I want to filter out all the lines starting with U4 so I'm left with only the ones that start with U6. How would I go about that?
Thanks in advance!

Answers (2)

Star Strider on 23 Jun 2023
Using my code from my earlier Comment, one approach would be:
% type('Niskin-Logger-Sample.txt')
opts = detectImportOptions('Niskin-Logger-Sample.txt', 'Delimiter',{' '});
opts = setvaropts(opts, 'Var3', 'InputFormat','MM/dd/uuuu'); % Guessing The Date Format
T1 = readtable('Niskin-Logger-Sample.txt', 'Delimiter',{' '}); % Note: opts is not passed here, so the import settings are detected again
Warning: The DATETIME data was created using format 'MM/dd/uuuu' but also matched 'dd/MM/uuuu'.
To avoid ambiguity, supply a datetime format using SETVAROPTS, e.g.
opts = setvaropts(opts,varname,'InputFormat','MM/dd/uuuu');
T2 = readtable('Niskin-Logger-Sample.txt', 'HeaderLines',50)
T2 = 8826×4 table
     Var1      Var2     Var3       Var4
    ______    ______    ____    __________
    {'U4'}    5.5718    298     {0×0 char}
    {'U4'}    5.9434    298     {0×0 char}
    {'U4'}    4.8241    298     {0×0 char}
    {'U4'}     4.967    298     {0×0 char}
    {'U4'}    5.0333    298     {0×0 char}
    {'U4'}    4.7708    298     {0×0 char}
    {'U4'}    5.1185    298     {0×0 char}
    {'U4'}     4.837    298     {0×0 char}
    {'U4'}    4.9518    298     {0×0 char}
    {'U4'}    5.0625    298     {0×0 char}
    {'U4'}    4.7761    298     {0×0 char}
    {'U4'}    5.1258    298     {0×0 char}
    {'U4'}    4.8705    298     {0×0 char}
    {'U4'}    4.9459    298     {0×0 char}
    {'U4'}    5.0957    298     {0×0 char}
    {'U4'}    4.7892    298     {0×0 char}
[UVar1,~,ix] = unique(T2.Var1);
Um = accumarray(ix, (1:numel(ix)).', [], @(x){T2(x,:)})
Um = 2×1 cell array
    {8574×4 table}
    { 252×4 table}
U6 = Um{2}
U6 = 252×4 table
     Var1     Var2       Var3          Var4
    ______    ____    __________    __________
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
    {'U6'}    NaN     1.6868e+09    {0×0 char}
U6Var2NotNaN = U6(~isnan(U6.Var2),:)
U6Var2NotNaN = 0×4 empty table
NrVar2NaN = nnz(isnan(U6.Var2))
NrVar2NaN = 252
So ‘Var2’ (whatever it is) has only NaN values for the entire ‘U6’ table.
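If only the 'U6' rows are needed, the grouping step can be skipped in favor of a logical index on Var1. This is a minimal sketch on a small made-up table; the variable T and its values are illustrative stand-ins, not data taken from the Niskin file.

```matlab
% Hypothetical stand-in for T2; the real table would come from readtable.
Var1 = {'U4'; 'U6'; 'U4'; 'U6'};
Var2 = [5.5718; NaN; 5.9434; NaN];
Var3 = [298; 1.6868e9; 298; 1.6868e9];
T = table(Var1, Var2, Var3);

% Logical mask: true where Var1 is exactly 'U6'.
isU6 = strcmp(T.Var1, 'U6');

% Keep only the U6 rows (equivalently, drop the U4 rows).
U6 = T(isU6, :);
disp(U6)
```

The same mask inverted, T(~isU6,:), would return the U4 rows instead.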

Mayur on 23 Jun 2023
Hi Sydney!
I understand that you want to extract specific lines from a .txt file, in this case the lines starting with 'U6'. You can use the following code snippet:
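fid = fopen('Niskin-Logger-Sample.txt', 'r');   % open the file for reading
data = textscan(fid, '%s', 'Delimiter', '\n');  % read each line into a cell array of char vectors
fclose(fid);                                    % release the file handle
u6_lines = data{1}(startsWith(data{1}, 'U6'));  % keep only the lines that begin with 'U6'
disp(u6_lines);                                 % show the filtered lines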
You can read more about textscan here: https://www.mathworks.com/help/matlab/ref/textscan.html
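On R2020b or later, readlines offers a shorter route to the same result, and the filtered lines can then be saved to a new file. The sketch below first writes a small made-up sample file so that it runs on its own; the file names and sample lines are assumptions for illustration, not the actual logger data.

```matlab
% Create a small hypothetical sample file (stand-in for the real logger file).
inFile  = fullfile(tempdir, 'u4u6_sample.txt');
outFile = fullfile(tempdir, 'u6_only.txt');
fid = fopen(inFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', inFile);
fprintf(fid, '%s\n', 'U4 5.5718 298', 'U6 NaN 1686800000', ...
                     'U4 5.9434 298', 'U6 NaN 1686800001');
fclose(fid);

% Read every line into a string array (readlines requires R2020b or later).
lines = readlines(inFile);

% Keep only the lines that begin with 'U6'.
u6_lines = lines(startsWith(lines, "U6"));

% Write the filtered lines to a new text file, one per line.
fid = fopen(outFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', outFile);
fprintf(fid, '%s\n', u6_lines);
fclose(fid);
```

To use this on real data, point inFile at the actual file and drop the block that creates the sample.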

Release

R2021a