A within-subjects design is an experiment in which the same group of subjects serves in more than one treatment. Note that I’m using the word “treatment”, rather than “group”, to refer to levels of the independent variable. It is generally better to use “treatment”, because the term “group” can be very misleading in a within-subjects design, where the same “group” of people is often in more than one treatment. As an example of a within-subjects design, let’s say that we are interested in the effect of different types of exercise on memory. We decide to use two treatments, aerobic exercise and anaerobic exercise. In the aerobic condition we will have participants run in place for five minutes, after which they will take a memory test. In the anaerobic condition we will have them lift weights for five minutes, after which they will take a different memory test of equivalent difficulty. Since we are using a within-subjects design, we have all participants begin by running in place and taking the first test, after which the same people lift weights and then take the second test. We then compare the memory test scores to answer the question of which type of exercise aids memory the most.
There are two fundamental advantages of the within-subjects design: a) power and b) reduction in error variance associated with individual differences. A fundamental principle of inferential statistics is that, as the number of subjects increases, statistical power increases and the probability of beta error (the probability of not finding an effect when one “truly” exists) decreases. This is why it is generally better to have more subjects, and why, if you look at a significance table such as the t-table, the t value necessary for statistical significance decreases as the number of subjects increases. The reason this is so relevant to the within-subjects design is that, by using a within-subjects design, you have in effect increased the number of “subjects” relative to a between-subjects design. For example, in the exercise experiment, since you have the same subjects in both treatments, you will have twice as many “subjects” as you would have had if you had used a between-subjects design. If ten students sign up for the experiment and you use a between-subjects design with equal-size groups, you will have five subjects in the aerobic condition and five in the anaerobic condition. However, if you use a within-subjects design, you will in effect have ten subjects in each condition. Just as with the terms “groups” vs. “treatments”, it’s better to speak of “observations” than of “subjects”, since the term “subjects” is misleading in a within-subjects design, where the same person may effectively be more than one “subject”.
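The counting argument above, and the point about the t-table, can be sketched in a few lines of Python. This is only an illustration: the function name observations_per_condition is hypothetical, and the critical values are a short excerpt from a standard two-tailed t-table at alpha = .05, keyed by degrees of freedom.

```python
def observations_per_condition(n_subjects, n_conditions, design):
    """Return how many scores each condition receives (hypothetical helper)."""
    if design == "within":
        # Every subject serves in every condition.
        return n_subjects
    if design == "between":
        # Subjects are split into equal-size groups, one group per condition.
        if n_subjects % n_conditions != 0:
            raise ValueError("subjects cannot be split into equal-size groups")
        return n_subjects // n_conditions
    raise ValueError("design must be 'within' or 'between'")


# Excerpt from a standard t-table: two-tailed critical t at alpha = .05,
# keyed by degrees of freedom.
CRITICAL_T_05 = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042, 60: 2.000, 120: 1.980}

if __name__ == "__main__":
    for design in ("between", "within"):
        n = observations_per_condition(10, 2, design)
        print(f"{design}-subjects design: {n} observations per condition")
    for df, t_crit in CRITICAL_T_05.items():
        print(f"df = {df:>3}: critical t = {t_crit:.3f}")
```

With ten students and two treatments, the between-subjects call gives five observations per condition and the within-subjects call gives ten, which is the doubling described above. The table excerpt shows the critical t value shrinking as the degrees of freedom grow.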
The reduction in error variance arises because much of the error variance in a between-subjects design comes from the fact that, even though you randomly assigned subjects to groups, the two groups may differ with regard to important individual difference factors that affect the dependent variable. With within-subjects designs, the conditions are always exactly equivalent with respect to individual difference variables, since the participants are the same in the different conditions. So, in our exercise example above, any factor that may affect performance on the dependent variable (memory), such as sleep the night before, intelligence, or memory skill, will be exactly the same for the two conditions, because the exact same group of people is in both conditions.
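A small simulation can make the error variance argument concrete. The sketch below is not data from any real study: the ability spread, noise level, and treatment effect are all assumed values chosen for illustration, and the function name simulate_scores is hypothetical. Each simulated person has a stable memory ability that shows up in both conditions, so it cancels out when each person’s two scores are subtracted.

```python
import random
import statistics


def simulate_scores(n_subjects=200, ability_sd=10.0, noise_sd=2.0,
                    effect=3.0, seed=0):
    """Simulate memory scores for the same people in two conditions.

    All parameter values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    # Stable individual differences (e.g. intelligence, memory skill).
    ability = [rng.gauss(50.0, ability_sd) for _ in range(n_subjects)]
    # Each score = the person's ability + condition effect + random noise.
    aerobic = [a + effect + rng.gauss(0.0, noise_sd) for a in ability]
    anaerobic = [a + rng.gauss(0.0, noise_sd) for a in ability]
    # Within-subjects comparison: each person's ability cancels out.
    differences = [x - y for x, y in zip(aerobic, anaerobic)]
    return {
        "sd_aerobic_scores": statistics.stdev(aerobic),
        "sd_anaerobic_scores": statistics.stdev(anaerobic),
        "sd_differences": statistics.stdev(differences),
        "mean_difference": statistics.mean(differences),
    }


if __name__ == "__main__":
    for name, value in simulate_scores().items():
        print(f"{name}: {value:.2f}")
```

The spread of raw scores within a condition, which is what a between-subjects comparison has to contend with, is dominated by individual differences. The spread of the difference scores reflects only the noise, so under these assumptions it comes out much smaller.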
There is also a fundamental disadvantage of the within-subjects design, which can be referred to as “carryover effects”. In general, this means that participation in one condition may affect performance in other conditions, thus creating a confounding extraneous variable that varies with the independent variable. Two basic types of carryover effects are practice and fatigue. As you read about the hypothetical exercise and memory experiment, you may well have recognized that one problem with this experiment is that participating in one exercise condition first, followed by the memory test, may inadvertently affect performance in the second condition. First of all, participants may well be more tired after running in place and lifting weights than they are after just running in place, so that they perform worse on the second memory test. If this is the case, they wouldn’t do worse on the second test because aerobic exercise is better for memory than anaerobic exercise. Rather, they would do worse because they were more worn out after exercising for ten minutes in total than after exercising for only five. When one within-subjects treatment negatively affects performance on a later treatment, this is referred to as a fatigue effect. On the other hand, in the exercise experiment the second memory test may be very similar to the first, so that having practiced on the first test, participants perform much better the second time. Again, the difference between the two conditions would not be due to the independent variable (aerobic vs. anaerobic). Rather, it would be due to practice with the test. When a within-subjects treatment positively affects performance on a later treatment, this is referred to as a practice effect.
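The confound described above can also be sketched as a simulation. In this hypothetical setup, every participant does the aerobic condition first and the anaerobic condition second, as in the example, and the two types of exercise are assumed to have no true difference at all. The function name run_fixed_order and all numeric values are assumptions made for illustration. A positive carryover value stands in for a practice effect and a negative one for a fatigue effect.

```python
import random
import statistics


def run_fixed_order(n_subjects=200, carryover=0.0, true_effect=0.0,
                    noise_sd=2.0, seed=1):
    """Return the mean (anaerobic - aerobic) score difference.

    Everyone is tested in the aerobic condition first and the anaerobic
    condition second. `carryover` is added to the second test only.
    All values are illustrative assumptions.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_subjects):
        ability = rng.gauss(50.0, 10.0)
        # First test: aerobic condition, no carryover yet.
        first = ability + rng.gauss(0.0, noise_sd)
        # Second test: anaerobic condition, plus whatever carried over.
        second = ability + true_effect + carryover + rng.gauss(0.0, noise_sd)
        differences.append(second - first)
    return statistics.mean(differences)


if __name__ == "__main__":
    print("no carryover:   ", round(run_fixed_order(carryover=0.0), 2))
    print("practice effect:", round(run_fixed_order(carryover=4.0), 2))
    print("fatigue effect: ", round(run_fixed_order(carryover=-4.0), 2))
```

Even though true_effect is zero in every run, the practice run shows the second condition scoring higher and the fatigue run shows it scoring lower. Looking only at the scores, there is no way to tell such a difference apart from a real effect of the type of exercise.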