Extracting survey data from individual files and combining

1 view (last 30 days)
I have folder containing thousands of individuals files. In each file is text (representing a survey response) of the following format:
2022_1_29_14_29_51_913,0, 0
2022_1_29_14_29_56_761,1, 0
2022_1_29_14_30_4_268,2, 3
2022_1_29_14_30_8_6,4, 1
2022_1_29_14_30_11_87,5, 1
2022_1_29_14_30_16_668,6, 0
2022_1_29_14_30_19_470,7, 1
2022_1_29_14_30_23_586,8, 3
2022_1_29_14_30_32_501,16, 4
Where the 2nd to last number indicates the question number on a survey and the last number is the answer value for that question.
I need to combine all of these files into a single table, where each row represents one file (survey) and each answer (last number) is placed in a column that corresponds to the question number (2nd to last number).
I have no problem reading in the files. I do not know how to access just the last 2 values from each line, and use the 2nd to last number as the location of the last number in a table (with NaN values where there are skipped questions).
Any ideas would be very useful. Thank you.
  2 Comments
David Hill
David Hill on 18 Oct 2022
Attaching a file or two would be helpful. Do all the files have the same number of table columns?
Erik J
Erik J on 18 Oct 2022
There are 16 possible questions, but not all files have all 16 answers because of skip logic in the survey. So, I need to use the 2nd to last number to place the last number in that column, and NaN elsewhere.
I have attached a couple files.

Sign in to comment.

Accepted Answer

Karim
Karim on 18 Oct 2022
Edited: Karim on 19 Oct 2022
I used the text files from your comment to update my answer, to (hopfully) better fit it to what you are looking for.
I was not sure what to do with question no "0", hence i added a column for that one as well. I added comments in the code to explain the idea.
Hope it helps :)
% list the files
MyFiles = ["https://www.mathworks.com/matlabcentral/answers/uploaded_files/1161118/003_003_0_2022_1_29_14_29_40_741_SA.txt";
"https://www.mathworks.com/matlabcentral/answers/uploaded_files/1161123/003_003_0_2022_2_6_12_31_6_722_SA.txt"];
MyFileNames = ["003_003_0_2022_1_29_14_29_40_741_SA";
"003_003_0_2022_2_6_12_31_6_722_SA"];
% allocate a table
sz = [numel(MyFiles) 18]; % 16 questions + survey name
varTypes = ["string",repmat("double",1,sz(2)-1)];
varNames = ["Survey","Q0","Q1","Q2","Q3","Q4","Q5","Q6","Q7","Q8","Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16"];
MyTable = table('Size',sz,'VariableTypes',varTypes,'VariableNames',varNames);
% set all answers tot nan
MyTable{:, 2:end} = NaN;
% have a look at the empty table
MyTable
MyTable = 2×18 table
Survey Q0 Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 _________ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
for i = 1:numel(MyFiles)
% read new file
currTable = readtable(MyFiles(i),'Format','%s %f %f');
% use file name as survey name
MyTable{i, 1 } = MyFileNames(i);
% append the new data into the large table
Q_num = currTable{:,2}+2; % -> determine the column number in the table
Q_val = currTable{:,3}; % -> values for the answers
MyTable{i, Q_num } = Q_val.'; % -> note the transpose to switch from column to row vector
end
% have a look at the final table
MyTable
MyTable = 2×18 table
Survey Q0 Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 _____________________________________ __ __ __ ___ ___ __ __ __ __ ___ ___ ___ ___ ___ ___ ___ ___ "003_003_0_2022_1_29_14_29_40_741_SA" 0 0 3 NaN 1 1 0 1 3 NaN NaN NaN NaN NaN NaN NaN 4 "003_003_0_2022_2_6_12_31_6_722_SA" 0 0 1 1 NaN 1 1 1 3 2 0 4 0 4 2 4 4

More Answers (0)

Categories

Find more on Characters and Strings in Help Center and File Exchange

Tags

Products


Release

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!