Chapter 11 Summary: Research Design Explained

Brief Summary of Chapter 11

In Chapter 11, we extend the logic of the two-group experiment to experiments that have more than two groups. The advantages of using more than two groups are:

We can compare several types of treatments at once to find the best type (e.g., chiropractor vs. physician vs. no-treatment)
We can compare several different amounts of a treatment to find the best amount. We might also be able to map the functional relationship between our variables if we
- choose the levels of our treatment variable so that they differ by either a constant amount (1,2,3) or a constant proportion (4, 8, 16)
- use a measure that gives us interval or ratio data
We can improve our construct validity by
- adding control groups
- not basing our conclusions solely on comparing a treatment group to an empty control group
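
The spacing rule for treatment levels mentioned above (levels that differ by a constant amount or by a constant proportion) can be checked before you run the study. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the levels are positive numbers listed in increasing order.

```python
def constant_amount(levels):
    """True if consecutive levels differ by the same amount (e.g., 1, 2, 3)."""
    steps = [b - a for a, b in zip(levels, levels[1:])]
    return len(set(steps)) == 1


def constant_proportion(levels):
    """True if each level is the same multiple of the one before (e.g., 4, 8, 16)."""
    if not all(x > 0 for x in levels):
        return False  # assumption: proportions only make sense for positive levels
    # Cross-multiplying (b * b == a * c) avoids rounding error from division.
    return all(b * b == a * c for a, b, c in zip(levels, levels[1:], levels[2:]))


for levels in [(1, 2, 3), (4, 8, 16), (1, 2, 4, 9)]:
    print(levels, constant_amount(levels), constant_proportion(levels))
```

Levels that fail both checks, such as (1, 2, 4, 9), would make it harder to map the functional relationship.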

To analyze the results of the multiple-group experiment, we rely on the same basic logic as when we analyze a two-group experiment. That is, in both cases we compare:

variability (differences) between the means of our groups
TO
variability (differences) of scores within each group

Why would you compare these two variances? Because you know that the group means could differ for two reasons:

the treatment made them different and/or
non-treatment factors--random error--made them score differently (e.g., the groups were somewhat different to start with, random measurement error caused one group to score higher than another, the testing conditions were, on average, slightly better for one group than another, etc.).

Because the group means could differ by chance alone, you need to determine whether the difference you observed between the group means at the end of the experiment is greater than would be expected from random error alone. To do that, you need to know how much of a difference you can reasonably expect random error (i.e., unsystematic, non-treatment factors) to make.
One key to developing an estimate of the extent to which random error (i.e., non-treatment factors) alone could make the groups different is to look at differences among scores within each group. Differences among scores within each group must be due to non-treatment factors (after all, differences among participants who are in the same treatment group cannot be due to getting different treatments because those participants are all getting the same treatment).
So, you compare the variability between your groups (which is influenced by both random, non-treatment effects and the treatment effect--if there is one)
with
variability within your groups (which is influenced by only one thing--random, non-treatment effects)

If variability between group means is substantially bigger than variability within the groups, the results are statistically significant. But what is "substantially bigger"?
That depends.
To find out whether the variability between means is substantially bigger than the variability within groups, you first need to calculate an F-ratio, which is an index of the variability between the group means divided by an index of the variability within groups. The F-ratio tells you how many times bigger the Mean Square Between (an index of the variability between groups, which is affected by both treatment effects and random error) is than the Mean Square Within (an index of variability within groups, which is affected only by random error).
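
To make the F-ratio concrete, here is a minimal sketch of the calculation for a one-variable, between-subjects experiment. The function name and the scores are invented for illustration; the divisors used to turn each sum of squares into a mean square are the degrees of freedom for the treatment and for the error term.

```python
from statistics import mean


def f_ratio(groups):
    """Return (Mean Square Between, Mean Square Within, F) for a list of groups."""
    k = len(groups)                                # number of groups
    n_total = sum(len(g) for g in groups)          # number of participants
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each score is from its own group's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within


# Made-up scores for three groups of three participants each.
scores = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
ms_b, ms_w, f = f_ratio(scores)
print(round(ms_b, 2), round(ms_w, 2), round(f, 2))  # about 19.0, 1.0, 19.0
```

In this made-up example, the Mean Square Between is about 19 times bigger than the Mean Square Within.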

If your experiment has only one independent variable, to see whether the ratio of between-group variance to within-group variance (the F-ratio) is big enough to be statistically significant at the .05 level, you need to know

how many degrees of freedom you have for the treatment (calculated by subtracting 1 from the number of groups you have), and
how many degrees of freedom you have for the error term (calculated by subtracting the number of groups you have from the number of participants you have).

Once you have the degrees of freedom, you can use a computer or the F table in Appendix E to determine whether your results are statistically significant.
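
The two degrees-of-freedom rules and the table lookup can be sketched as follows. This is only an illustration: the function names and the observed F of 19.0 are hypothetical, and the critical value of 5.14 is the value a standard F table gives for 2 and 6 degrees of freedom at the .05 level. For your own study, look up the critical value that matches your own degrees of freedom.

```python
def degrees_of_freedom(n_groups, n_participants):
    """Return (df for the treatment, df for the error term)."""
    df_treatment = n_groups - 1
    df_error = n_participants - n_groups
    return df_treatment, df_error


def is_significant(f_observed, f_critical):
    """True if the observed F reaches the critical value from the F table."""
    return f_observed >= f_critical


# Hypothetical experiment: 3 groups, 9 participants in all.
df_t, df_e = degrees_of_freedom(n_groups=3, n_participants=9)
print(df_t, df_e)  # 2 6

F_CRITICAL_05 = 5.14  # from a standard F table for (2, 6) df at the .05 level
print(is_significant(f_observed=19.0, f_critical=F_CRITICAL_05))  # True
```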

If you get a significant F, you know that at least two of your groups differ, but you may not know which two. To find out, use a post hoc test, such as the Tukey test (see Appendix E).
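
As a rough illustration of what such a post hoc test does, the sketch below uses a common form of the Tukey test for equal group sizes: any two means that differ by more than q * sqrt(Mean Square Within / n per group) are declared different. The function name and data are hypothetical, the procedure in Appendix E may differ in its details, and the value q = 4.34 is what a standard studentized range table gives for 3 groups and 6 error degrees of freedom at the .05 level.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(means, ms_within, n_per_group, q_critical):
    """Return (HSD, list of (group_i, group_j, difference, differs?))."""
    # Smallest difference between two means that counts as significant.
    hsd = q_critical * sqrt(ms_within / n_per_group)
    comparisons = []
    for (i, m_i), (j, m_j) in combinations(enumerate(means), 2):
        diff = abs(m_i - m_j)
        comparisons.append((i, j, diff, diff > hsd))
    return hsd, comparisons


# Made-up results: three group means, Mean Square Within = 1.0, 3 per group.
hsd, comparisons = tukey_hsd([5, 7, 10], ms_within=1.0, n_per_group=3,
                             q_critical=4.34)  # q taken from a table (assumption)
for i, j, diff, differs in comparisons:
    print(f"group {i} vs. group {j}: difference = {diff}, differs = {differs}")
```

With these made-up numbers, the third group differs from each of the other two, but the first two groups do not differ from each other.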

If you aren't interested in which particular groups differ from each other but rather you want to map the shape of the functional relationship between amount of the treatment variable and score on your measure, then you will want to use a post hoc trend analysis. Realize that if you think you might do a post hoc trend analysis, you will probably need to design your experiment with that goal in mind (see Box 11-2).

If you've properly designed your experiment, you can do a trend analysis by following the steps described on pages 344-345 -- or by having someone (a professor, a stat consultant) help you. However, if you don't design your study properly, nobody will be able to help you!
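
To give a feel for what a trend analysis computes, here is a minimal sketch, not the book's step-by-step procedure. It assumes three treatment levels that are equally spaced, equal group sizes, and a Mean Square Within already taken from the overall analysis; the function name and the numbers are made up. The weights (-1, 0, 1) and (1, -2, 1) are the standard linear and quadratic trend coefficients for three equally spaced levels.

```python
def trend_f(means, coefficients, n_per_group, ms_within):
    """Return the F for one trend (each trend has 1 degree of freedom)."""
    # Weight each group mean by its trend coefficient and add them up.
    estimate = sum(c * m for c, m in zip(coefficients, means))
    ss_trend = n_per_group * estimate ** 2 / sum(c * c for c in coefficients)
    return ss_trend / ms_within


LINEAR = (-1, 0, 1)      # tests for a straight-line relationship
QUADRATIC = (1, -2, 1)   # tests for a single bend (U or inverted U)

# Made-up group means for low, medium, and high amounts of the treatment.
means = [5, 7, 10]
print(trend_f(means, LINEAR, n_per_group=3, ms_within=1.0))     # 37.5
print(trend_f(means, QUADRATIC, n_per_group=3, ms_within=1.0))  # 0.5
```

Each F would then be compared with the critical value for 1 degree of freedom and the error degrees of freedom. If the levels were not equally spaced, these weights would not apply, which is one reason the design has to be planned with the trend analysis in mind.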
