How to prepare my data for ANOVA?
Steve Schulz
on 3 Aug 2022
Edited: Scott MacKenzie
on 6 Aug 2022
Hi there,
I'm currently analyzing data from a randomized, double-blind, cross-over trial. Participants received a drug and a placebo, separated by a 7-day wash-out period. I suspect that there is an interaction effect between the drug and the timing of the drug administration. How should I prepare my data for a repeated measures ANOVA?
So far I've extracted the data of interest (the mean value) and arranged it in the following way:
Drug_session1
Drug_session2
Placebo_session1
Placebo_session2
Each of these is a 15x1 double, since there are 30 participants (15 per administration order).
In order to execute the ANOVA I have to concatenate the 4 single 15x1 doubles, but in what way?
Thanks a lot!
5 Comments
Scott MacKenzie
on 4 Aug 2022
@Steve Schulz, a repeated measures ANOVA seems appropriate, since each participant received both the real drug and the placebo. So, your experiment is a 2 x 2 mixed design with 30 participants. There were two independent variables. One was drug which was within-subjects (with levels real and placebo) and the other was group which was between-subjects (with levels A and B). The group levels represent the different order of administering the drug.
Fifteen participants were in each group, which is why group is a between-subjects factor. However, all 30 participants received both the real drug and the placebo, which is why drug is a within-subjects factor.
There was (at least) one dependent variable: cognitive score.
This should be fairly easy to set up using MATLAB's ranova function. Any chance you can post the data, so @Jeff Miller or I can code up a solution for you?
Accepted Answer
Scott MacKenzie
on 4 Aug 2022
Edited: Scott MacKenzie
on 4 Aug 2022
@Steve Schulz, thanks for posting the data. You have organized the data in a slightly peculiar way. You've got session 1 in rows 1 to 15 and session 2 in rows 16 to 30. But, to organize the data for the ANOVA, you want one row per participant, with the repeated measurements across the columns. So, the first step in my answer below is to rearrange the data. See code and comments below.
As seen in the ANOVA table, the effect of Dose Type on Cognitive Score was statistically significant, F(1,28) = 20.575, p = .0001. The effect of Group on Cognitive Score was also statistically significant, F(1,28) = 4.368, p = .0458. This just barely meets the conventional threshold for significance. So, the Group effect was modest, albeit statistically significant. The Dose Type x Group interaction effect was not statistically significant, however, F(1,28) = 0.250, ns.
% the data, as per posted comment (30x2)
M = [ 61 147
67 106
69 139
32 90
56 157
50 144
71 111
146 187
148 187
141 185
123 155
105 135
115 183
88 147
45 112
58 89
56 98
86 114
86 91
85 93
35 26
113 91
156 166
126 110
190 179
100 124
123 165
149 106
118 142
165 155];
% rearrange data to have one row per participant:
% Group A in rows 1-15, Group B in rows 16-30
% Drug results in column 1, placebo results in column 2
M = [M(1:15,1) M(16:30,2); M(16:30,1) M(1:15,2)];
% put the data into a table
T = array2table(M, 'VariableNames', {'Drug', 'Placebo'});
% add group code
T.Group = [repmat('A', 15, 1); repmat('B', 15, 1)];
% display the table
T
% setup the repeated measures model
withinDesign = table([1 2]', 'VariableNames', { 'DoseType' });
withinDesign.DoseType = categorical(withinDesign.DoseType);
rm = fitrm(T, 'Drug-Placebo ~ Group', 'WithinDesign', withinDesign);
% do the anova (suppress output for the moment)
AT = ranova(rm, 'WithinModel', 'DoseType');
% output a conventional ANOVA table
disp(anovaTable(AT, 'CognitiveScore'));
% -------------------------------------------------------------------------
% Function to create a conventional ANOVA table from the overly-complicated
% and confusing ANOVA table created by the ranova function.
function [s] = anovaTable(AT, dvName)
    c = table2cell(AT);
    % remove erroneous entries in F and p columns
    for i=1:size(c,1)
        if c{i,4} == 1
            c(i,4) = {''};
        end
        if c{i,5} == .5
            c(i,5) = {''};
        end
    end
    % use conventional labels in Effect column
    effect = AT.Properties.RowNames;
    for i=1:length(effect)
        tmp = effect{i};
        tmp = erase(tmp, '(Intercept):');
        tmp = strrep(tmp, 'Error', 'Participant');
        effect(i) = {tmp};
    end
    % determine the required width of the table
    fieldWidth1 = max(cellfun('length', effect)); % width of Effect column
    fieldWidth2 = 57; % width for df, SS, MS, F, and p columns
    barDouble = repmat('=', 1, fieldWidth1 + fieldWidth2);
    barSingle = repmat('-', 1, fieldWidth1 + fieldWidth2);
    % re-organize the data
    c = c(2:end,[2 1 3 4 5]);
    c = [num2cell(repmat(fieldWidth1, size(c,1), 1)), effect(2:end), c]';
    % create the ANOVA table
    s = sprintf('ANOVA table for %s\n', dvName);
    s = [s sprintf('%s\n', barDouble)];
    s = [s sprintf('%-*s %4s %11s %14s %9s %9s\n', fieldWidth1, 'Effect', 'df', 'SS', 'MS', 'F', 'p')];
    s = [s sprintf('%s\n', barSingle)];
    s = [s sprintf('%-*s %4d %14.5f %14.5f %10.3f %10.4f\n', c{:})];
    s = [s sprintf('%s\n', barDouble)];
end
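If the scores are still held in the four 15x1 vectors described in the question, the same wide layout can probably be built directly from them. The following is only a sketch. It assumes that Group A received the drug in session 1 and the placebo in session 2 (Group B the reverse), and that row i of Drug_session1 and row i of Placebo_session2 belong to the same participant (likewise for the other pair). The numbers are placeholders, not the trial data, and T2 is a hypothetical name.

```matlab
% Placeholder values only (NOT the trial data); replace with the real vectors.
Drug_session1    = (101:115)';   % Group A, drug in session 1 (assumed)
Placebo_session2 = (201:215)';   % Group A, placebo in session 2 (assumed)
Placebo_session1 = (301:315)';   % Group B, placebo in session 1 (assumed)
Drug_session2    = (401:415)';   % Group B, drug in session 2 (assumed)

% One row per participant, one column per within-subjects level.
% Assumes the row order matches within each group (same participant per row).
Drug    = [Drug_session1; Drug_session2];
Placebo = [Placebo_session2; Placebo_session1];
Group   = categorical([repmat({'A'}, 15, 1); repmat({'B'}, 15, 1)]);

T2 = table(Drug, Placebo, Group);   % hypothetical name for the wide table
disp(head(T2, 3));
```

A table laid out like T2 could then be passed to fitrm with the same 'Drug-Placebo ~ Group' model used for T above.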
6 Comments
Scott MacKenzie
on 6 Aug 2022
Edited: Scott MacKenzie
on 6 Aug 2022
@Steve Schulz, in your first question, you astutely notice an important issue. Since the Group main effect is not significant, how can there be a significant interaction effect of Group with another independent variable? In this case, with p < .0001 for the Dose Type main effect, it is likely that the interaction effect is entirely due to the strong main effect of Dose Type; that is, there is likely no practical significance or real implication in the observed significance of the Dose Type x Group interaction -- or something like that.
A t-test can only be used to compare two conditions; that is, two levels of a single independent variable. The ANOVA procedure expands on this capability in two ways. First, it can be used with an independent variable having > 2 levels. Second, it can be used with > 1 independent variable. In the latter case, you are testing for both the main effects of each independent variable and the interaction effects between the independent variables. That's the case in your study. You could have used t-tests for both main effects (since each factor has only two levels) and ignored the interaction effect. However, this is not a good approach. As more t-tests are used, the likelihood of getting an erroneously significant outcome increases. Using an ANOVA avoids this (since all effects are tested for in a single procedure).
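To put a rough number on the point about repeated t-tests, the chance of at least one false positive across k tests can be sketched as below. This assumes the k tests are independent and each uses the same significance level, which is a simplification; alpha, k and fwer are illustrative names, not anything from the study.

```matlab
alpha = 0.05;                 % per-test significance level (assumed)
k     = 1:3;                  % number of separate tests
fwer  = 1 - (1 - alpha).^k;   % P(at least one false positive), independent tests

for i = 1:numel(k)
    fprintf('%d test(s): %.4f\n', k(i), fwer(i));
end
```

Under these assumptions the rate grows from 0.05 for one test to roughly 0.14 for three.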
I'm not sure I understand your third question. The Group effect in your study was not significant. That simply means that there is no evidence of a difference between Group A and Group B. Put another way, there is no evidence of a difference in cognitive score between the drug-then-placebo group and the placebo-then-drug group.
BTW, I often emphasise with students that the results of statistical procedures like the ANOVA are not the results per se. There is often a misconception about this and it is sometimes apparent in research papers that overly emphasise the results of statistical tests. There is so much effort invested in figuring out how to do statistical tests and what the outcomes mean, that one might be tempted to think that the results of statistical tests are the results. But, that's not the case.
Really, what is needed -- first and foremost -- is to inspect the data collected. The first step is to compute the means of the measurements across all the conditions tested, then look at them and think about what the observations and measurements suggest. Then, to get a good visual sense of the data, plot the results in line charts, bar charts, or whatever. You'll no doubt see some differences in the means and you'll be thinking... Hmmm, the score on such-and-such seems to be quite a bit higher (or lower) under this condition compared to that condition. I wonder if that difference is real or is it just an artefact of the variability in the measures? That's the sort of thinking we all do. And that's when the statistical tests come into play. At the end of the day, the results of the statistical tests only play a supporting role: They allow you to add strength and confidence to concluding statements, which is why the ANOVA results are often presented in parentheses.
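As a minimal sketch of that first inspection step, the cell means for a 2 x 2 layout like the one above could be computed and plotted roughly as follows. The scores here are placeholders (not the trial data), Tdemo and cellMeans are hypothetical names, and the table is assumed to have the same Drug, Placebo and Group columns as T.

```matlab
% Placeholder scores only (NOT the trial data), in the same layout as T.
Drug    = [10 20 30 40 50 60]';
Placebo = [20 30 40 30 40 50]';
Group   = ['A'; 'A'; 'A'; 'B'; 'B'; 'B'];
Tdemo   = table(Drug, Placebo, Group);   % hypothetical name

% Cell means: rows = Group (A, B), columns = Dose Type (Drug, Placebo)
isA = Tdemo.Group == 'A';
cellMeans = [mean(Tdemo.Drug(isA))  mean(Tdemo.Placebo(isA));
             mean(Tdemo.Drug(~isA)) mean(Tdemo.Placebo(~isA))];
disp(cellMeans);

% Line chart with one line per group
plot([1 2], cellMeans', '-o');
xticks([1 2]);
xticklabels({'Drug', 'Placebo'});
xlim([0.5 2.5]);
xlabel('Dose Type');
ylabel('Mean score');
legend({'Group A', 'Group B'}, 'Location', 'best');
```

In a chart like this, roughly parallel lines are what one would expect when there is no Dose Type x Group interaction, while lines that diverge or cross are worth a closer look before turning to the ANOVA table.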
Hope this helps. Good luck.