Removing numerical data from txt file

Import a txt file and remove all numerical data so that only the text is left. When I import the text file, the text gets surrounded by quotes.

Answers (3)

Sulaymon Eshkabilov on 18 Feb 2023
Edited: Sulaymon Eshkabilov on 18 Feb 2023
Here is one possible solution for this exercise, using regexprep():
Text = readlines('Citation.txt') % Read the data file (text and numbers), one string per line
Text = 27×1 string array
"JOURNAL=Frontiers in Plant Science " "→" "VOLUME=11 " "→" "YEAR=2020 " "→→" "URL=https://www.frontiersin.org/articles/10.3389/fpls.2020.613760 " "→ " "DOI=10.3389/fpls.2020.613760 " "→" "ISSN=1664-462X " "" "ABSTRACT=Excessive nitrogen (N) application is widespread in Southern China. The effects of N fertilization on soil properties and crop physiology are poorly understood in " "tropical red loam soil. We conducted a field experiment to evaluate the effect of nitrogen fertilization rates on physiological attributes (chlorophyll, plant metabolic " "enzymes, soluble matters) on banana leaves, soil properties (soil enzymes, soil organic matter (SOM), soil available nutrients) as well as banana crop yield in a subtropical " "region of southern China. The N rates tested were 0 (N<sub>0</sub>), 145 (N<sub>145</sub>), 248 (N<sub>248</sub>), 352 (N<sub>352</sub>), 414 (N<sub>FT</sub>), and 455 (N<sub>455</sub>) " "g N per plant. The correlations among soil factors, leaf physiological factors and crop yield were evaluated. The results indiated that the high rates of N fertilization (N<sub>FT</sub> " "and N<sub>455</sub>) significantly decreased soil available potassium (K) content, available phosphorus (P) content, glutamine synthetase (GS) activity, and soluble protein and sugar " "contents compared with lower N rates. The N<sub>352</sub> treatment had the highest crop yields compared with higher N rates treatments, followed by the N<sub>455</sub> treatment. However, " "there were no significant differences in crop yields among N fertilization treatments. Factor analysis showed that the N<sub>352</sub> treatment had the highest integrated score for soil " "and leaf physiological factors among all treatments. Moreover, the N<sub>352</sub> treatment was the most effective in improving carbon and nitrogen metabolism in banana. Crop yield was " "significantly and positively linearly correlated with the integrated score (r = 0.823, p < 0.05). 
Path analysis revealed that invertase, SOM and sucrose synthase (SS) had a strong positive " "effect on banana yield. Canonical correspondence analysis (CCA) suggested that available K, invertase, acid phosphatase and available P were the most important factors impacting leaf" " physiological attributes. Cluster analysis demonstrated distinct differences in N application treatment related to variations in soil and leaf factors. This study suggested that excessive " "N fertilization had a negative effect on soil fertility, crop physiology and yield. The lower N rates were more effective in improving crop yield than higher rates of N fertilization. " "The N rate of 352 g N per plant (N<sub>352</sub>) was recommended to reduce excess N input while maintaining the higher yield for local farmers’ banana planting." ""
A = regexprep(Text, '\d+(?:_(?=\d))?', '') % All digits removed (decimal points and signs remain)
A = 27×1 string array
"JOURNAL=Frontiers in Plant Science " "→" "VOLUME= " "→" "YEAR= " "→→" "URL=https://www.frontiersin.org/articles/./fpls.. " "→ " "DOI=./fpls.. " "→" "ISSN=-X " "" "ABSTRACT=Excessive nitrogen (N) application is widespread in Southern China. The effects of N fertilization on soil properties and crop physiology are poorly understood in " "tropical red loam soil. We conducted a field experiment to evaluate the effect of nitrogen fertilization rates on physiological attributes (chlorophyll, plant metabolic " "enzymes, soluble matters) on banana leaves, soil properties (soil enzymes, soil organic matter (SOM), soil available nutrients) as well as banana crop yield in a subtropical " "region of southern China. The N rates tested were (N<sub></sub>), (N<sub></sub>), (N<sub></sub>), (N<sub></sub>), (N<sub>FT</sub>), and (N<sub></sub>) " "g N per plant. The correlations among soil factors, leaf physiological factors and crop yield were evaluated. The results indiated that the high rates of N fertilization (N<sub>FT</sub> " "and N<sub></sub>) significantly decreased soil available potassium (K) content, available phosphorus (P) content, glutamine synthetase (GS) activity, and soluble protein and sugar " "contents compared with lower N rates. The N<sub></sub> treatment had the highest crop yields compared with higher N rates treatments, followed by the N<sub></sub> treatment. However, " "there were no significant differences in crop yields among N fertilization treatments. Factor analysis showed that the N<sub></sub> treatment had the highest integrated score for soil " "and leaf physiological factors among all treatments. Moreover, the N<sub></sub> treatment was the most effective in improving carbon and nitrogen metabolism in banana. Crop yield was " "significantly and positively linearly correlated with the integrated score (r = ., p < .). Path analysis revealed that invertase, SOM and sucrose synthase (SS) had a strong positive " "effect on banana yield. 
Canonical correspondence analysis (CCA) suggested that available K, invertase, acid phosphatase and available P were the most important factors impacting leaf" " physiological attributes. Cluster analysis demonstrated distinct differences in N application treatment related to variations in soil and leaf factors. This study suggested that excessive " "N fertilization had a negative effect on soil fertility, crop physiology and yield. The lower N rates were more effective in improving crop yield than higher rates of N fertilization. " "The N rate of g N per plant (N<sub></sub>) was recommended to reduce excess N input while maintaining the higher yield for local farmers’ banana planting." ""
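To see what the pattern does on its own, here is a minimal sketch on made-up sample strings. The variable names `sample`, `digitsOnly` and `wholeNumbers` are only illustrative. The first call uses the same pattern as above; the second pattern is an assumed alternative, not part of the answer, that also removes a leading sign and the fractional part so that a value such as 0.823 does not leave a stray "." behind.

```matlab
% Made-up sample lines (assumption: not taken from Citation.txt)
sample = ["YEAR=2020"; "r = 0.823"; "Left Eye Pupil"; "-12.5"];

% Same pattern as in the answer: removes runs of digits, plus an
% underscore that is directly followed by another digit
digitsOnly = regexprep(sample, '\d+(?:_(?=\d))?', '');

% Alternative pattern (an assumption): removes an optional sign, the
% integer part, and an optional fractional part in one match
wholeNumbers = regexprep(sample, '[-+]?\d+(?:\.\d+)?', '');

disp(digitsOnly)     % "YEAR=", "r = .", "Left Eye Pupil", "-."
disp(wholeNumbers)   % "YEAR=", "r = ", "Left Eye Pupil", ""
```

Note that the alternative pattern would also strip the numeric parts of identifiers such as a DOI or ISSN, so whether it is preferable depends on the file.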
(2) Another data file:
Atext = readlines('DATA_TEXT.txt')
Atext = 16×1 string array
"ABSTRACT={The effects of N fertilization on soil properties and crop physiology are poorly understood in tropical red loam soil." " We conducted a field experiment to evaluate the effect of nitrogen fertilization rates on physiological attributes (chlorophyll, plant metabolic enzymes, soluble matters) on banana leaves, soil " "properties (soil enzymes, soil organic matter (SOM), soil available nutrients) as well as banana crop yield in a subtropical region of southern China. The N rates tested were 0 (N<sub>0</sub>), " "145 (N<sub>145</sub>), 248 (N<sub>248</sub>), 352 (N<sub>352</sub>), 414 (N<sub>FT</sub>), and 455 (N<sub>455</sub>) g N per plant. The correlations among soil factors, leaf physiological factors " "and crop yield were evaluated. The results indiated that the high rates of N fertilization (N<sub>FT</sub> and N<sub>455</sub>) significantly decreased soil available potassium (K) content, " "available phosphorus (P) content, glutamine synthetase (GS) activity, and soluble protein and sugar contents compared with lower N rates. The N<sub>352</sub> treatment had the highest crop yields " "compared with higher N rates treatments, followed by the N<sub>455</sub> treatment. However, there were no significant differences in crop yields among N fertilization treatments. " "Factor analysis showed that the N<sub>352</sub> treatment had the highest integrated score for soil and leaf physiological factors among all treatments. Moreover, the N<sub>352</sub> " "treatment was the most effective in improving carbon and nitrogen metabolism in banana. Crop yield was significantly and positively linearly correlated with the integrated score (r = 0.823, p < 0.05). " "{1 2 3 4 5 6 7 8 9 10}. 2023. Feb 18, 12:24 am." "Path analysis revealed that invertase, SOM and sucrose synthase (SS) had a strong positive effect on banana yield. 
Canonical correspondence analysis (CCA) suggested that available K, " "invertase, acid phosphatase and available P were the most important factors impacting leaf physiological attributes. Cluster analysis demonstrated distinct differences in N application " "treatment related to variations in soil and leaf factors. This study suggested that excessive N fertilization had a negative effect on soil fertility, crop physiology and yield. " "The lower N rates were more effective in improving crop yield than higher rates of N fertilization. The N rate of 352 g N per plant (N<sub>352</sub>) was recommended to reduce excess" " N input while maintaining the higher yield for local farmers’ banana planting.}" "}"
A = regexprep(Atext, '\d+(?:_(?=\d))?', '')
A = 16×1 string array
"ABSTRACT={The effects of N fertilization on soil properties and crop physiology are poorly understood in tropical red loam soil." " We conducted a field experiment to evaluate the effect of nitrogen fertilization rates on physiological attributes (chlorophyll, plant metabolic enzymes, soluble matters) on banana leaves, soil " "properties (soil enzymes, soil organic matter (SOM), soil available nutrients) as well as banana crop yield in a subtropical region of southern China. The N rates tested were (N<sub></sub>), " " (N<sub></sub>), (N<sub></sub>), (N<sub></sub>), (N<sub>FT</sub>), and (N<sub></sub>) g N per plant. The correlations among soil factors, leaf physiological factors " "and crop yield were evaluated. The results indiated that the high rates of N fertilization (N<sub>FT</sub> and N<sub></sub>) significantly decreased soil available potassium (K) content, " "available phosphorus (P) content, glutamine synthetase (GS) activity, and soluble protein and sugar contents compared with lower N rates. The N<sub></sub> treatment had the highest crop yields " "compared with higher N rates treatments, followed by the N<sub></sub> treatment. However, there were no significant differences in crop yields among N fertilization treatments. " "Factor analysis showed that the N<sub></sub> treatment had the highest integrated score for soil and leaf physiological factors among all treatments. Moreover, the N<sub></sub> " "treatment was the most effective in improving carbon and nitrogen metabolism in banana. Crop yield was significantly and positively linearly correlated with the integrated score (r = ., p < .). " "{ }. . Feb , : am." "Path analysis revealed that invertase, SOM and sucrose synthase (SS) had a strong positive effect on banana yield. Canonical correspondence analysis (CCA) suggested that available K, " "invertase, acid phosphatase and available P were the most important factors impacting leaf physiological attributes. 
Cluster analysis demonstrated distinct differences in N application " "treatment related to variations in soil and leaf factors. This study suggested that excessive N fertilization had a negative effect on soil fertility, crop physiology and yield. " "The lower N rates were more effective in improving crop yield than higher rates of N fertilization. The N rate of g N per plant (N<sub></sub>) was recommended to reduce excess" " N input while maintaining the higher yield for local farmers’ banana planting.}" "}"
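Regarding the quotes mentioned in the question: as far as I can tell, they are only how MATLAB displays string arrays in the Command Window and are not part of the stored text. A small sketch with made-up sample strings (the variable names `txt` and `n` are only illustrative):

```matlab
% Made-up sample strings (assumption: not read from a file)
txt = ["Left Eye Pupil"; "End Trial"];

txt                   % Command Window display shows each element in double quotes
fprintf('%s\n', txt)  % prints the raw text of each element, without quotes

% The quotes are not counted as characters
n = strlength(txt)    % 14 and 9
```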
  3 Comments
Sulaymon Eshkabilov on 20 Feb 2023
Please share your text or data file so that I can give a proper solution or guidance.
Sophia Starzynski on 20 Feb 2023
Here is the zip folder. The data are all in one column. I only need the notes in the files, not the numerical data, and then I want to collect them into a spreadsheet.

Sulaymon Eshkabilov on 20 Feb 2023
Edited: Sulaymon Eshkabilov on 20 Feb 2023
Here is the solution:
unzip('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1300600/CONTROLDS2247_SGI_PPCS_Battery_1142022_1218PM_Splice.zip')
A = readlines('CONTROLDS2247_SGI_PPCS_Battery_1142022_1218PM_Splice.txt'); % One string per line of the file
Bnum = regexp(A, '\d+', 'match'); % Digit runs found on each line (not used below)
Ctxt = regexprep(A, '\d+(?:_(?=\d))?', ''); % Digits removed; only the text remains
% Lines that held only a number are reduced to "-." or "."; remove them
Index1 = (Ctxt == "-.");
Ctxt(Index1) = [];
Index2 = (Ctxt == ".");
Ctxt(Index2) = [] % Only text strings are kept; the leftover number fragments are removed
Ctxt = 10311×1 string array
"degcal_target_LE" "Collected" "// :: PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil" "Right Eye Vertical" "Right Eye Pupil" "End Trial" "degcal_target_LE" "Collected" "// :: PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil" "Right Eye Vertical" "Right Eye Pupil" "End Trial" "degcal_target_LE" "Collected" "// :: PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil"
% Write the cleaned strings to an Excel file (writematrix is recommended over xlswrite)
writematrix(Ctxt, 'OUT.xlsx')
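As a possible alternative to stripping digits and then deleting the "-." and "." leftovers, the numeric lines could be detected on the original text and dropped whole. This is only a sketch on made-up sample lines that imitate the one-column layout described above; the names `isNumber` and `notes` are illustrative, and it assumes every numeric line holds a single plain number.

```matlab
% Made-up sample resembling the file layout (assumption)
A = ["Collected"; "11/4/2022 12:24:23 PM"; "Delta T"; "-0.125"; "3"; "Left Eye Pupil"; "0.5"];

% str2double returns NaN for any element that is not a single number
isNumber = ~isnan(str2double(A));

% Keep only the non-numeric lines; digits inside text lines are untouched
notes = A(~isNumber);
disp(notes)
```

One caveat: str2double also accepts forms such as "1e3" or "i", so a note consisting of exactly such a token would be treated as a number.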
  1 Comment
Sophia Starzynski on 20 Feb 2023
Is there a way to keep the time and date cell that appears after every "Collected"?

Sulaymon Eshkabilov on 20 Feb 2023
Here is a solution that keeps the date and time of each data collection in the output file:
unzip('https://www.mathworks.com/matlabcentral/answers/uploaded_files/1300600/CONTROLDS2247_SGI_PPCS_Battery_1142022_1218PM_Splice.zip')
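A = readlines('CONTROLDS2247_SGI_PPCS_Battery_1142022_1218PM_Splice.txt'); % One string per line of the file
Bnum = regexp(A, '\d+', 'match'); % Digit runs found on each line (not used below)
Ctxt = regexprep(A, '\d+(?:_(?=\d))?', ''); % Digits removed; only the text remains
% Keep the dates: a timestamp line becomes "// :: PM" once its digits are removed
% (this comparison only catches PM times)
Index0 = (Ctxt == "// :: PM");
Ctxt(Index0) = A(Index0); % Restore the original date and time
% Lines that held only a number are reduced to "-." or "."; remove them
Index1 = (Ctxt == "-.");
Ctxt(Index1) = [];
Index2 = (Ctxt == ".");
Ctxt(Index2) = []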
Ctxt = 10311×1 string array
"degcal_target_LE" "Collected" "11/4/2022 12:24:23 PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil" "Right Eye Vertical" "Right Eye Pupil" "End Trial" "degcal_target_LE" "Collected" "11/4/2022 12:24:34 PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil" "Right Eye Vertical" "Right Eye Pupil" "End Trial" "degcal_target_LE" "Collected" "11/4/2022 12:24:48 PM" "Delta T" "Right Eye Horizontal" "Left Eye Horizontal" "Left Eye Vertical" "Left Eye Pupil"
% Store the cleaned data in an Excel file (writematrix is recommended over xlswrite)
writematrix(Ctxt, 'OUT.xlsx')
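The comparison with '// :: PM' only finds afternoon timestamps. If the files can also contain AM times, one option is to detect the timestamp lines on the original text before the digits are removed. The sketch below uses made-up sample lines, and the pattern in `stampPattern` assumes the month/day/year hour:minute:second AM/PM layout seen in the output above; the names `stampPattern` and `isStamp` are illustrative.

```matlab
% Made-up sample lines, including an AM timestamp (assumption)
A = ["Collected"; "11/4/2022 12:24:23 PM"; "Delta T"; "-0.125"; ...
     "11/5/2022 9:03:07 AM"; "End Trial"];

% Assumed layout: month/day/year hour:minute:second AM|PM
stampPattern = '^\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} (AM|PM)$';

% regexp on a cell array returns one start index (or []) per line
isStamp = ~cellfun(@isempty, regexp(cellstr(A), stampPattern, 'once'));

Ctxt = regexprep(A, '\d+(?:_(?=\d))?', '');   % Strip digits as in the answer
Ctxt(isStamp) = A(isStamp);                    % Restore the timestamp lines
Ctxt(Ctxt == "-." | Ctxt == ".") = [];         % Drop leftover number fragments
disp(Ctxt)
```

Building the mask from `A` rather than from the stripped text means it does not depend on what the timestamp looks like after digit removal.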
