For a statistics exam, I wrote an applied question about the regression effect that I quite like. Here it is, with the answer this time.
Q. A lab assistant in charge of measuring how long it takes for rats to run a maze predicts that a rat will usually take less time on its second run through. A colleague of his, however, notes that the rats ought to regress to the mean maze running time, on average. Are these predictions compatible? Why or why not?
A. They are compatible. The lab assistant is predicting overall improvement, and his colleague is predicting the regression effect (if somewhat loosely stated). The key to their agreement is that, on average, we expect a rat's second run to be less extreme relative to the second run's mean and standard deviation than its first run was relative to the first run's mean and standard deviation. If the second run has a lower mean time, we can see general improvement in times without losing the regression effect (which of course is a statistical fact unless |r| = 1, which seems unlikely here). If the first run is assigned to the x-axis and the second run to the y-axis, visualize the scatter of individual rats' run times lying largely below the line y = x.
Note there is no regression fallacy here. Neither person is proposing an explanation for anything, just a guess at what the numbers will look like.
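To see both predictions holding at once, here is a quick simulation sketch. The means, standard deviations, and correlation below are made-up numbers for illustration, not real maze data: second-run times are lower on average (the lab assistant's prediction), yet rats that were unusually slow on the first run come in, in standard units, closer to average on the second (the colleague's prediction).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical numbers: first runs average 60 s, second runs average 50 s
# (overall improvement), with correlation r = 0.6 between a rat's two runs.
n = 10_000
mu1, sd1 = 60.0, 10.0   # first run: mean and SD (seconds)
mu2, sd2 = 50.0, 10.0   # second run: lower mean, so rats improve on average
r = 0.6                 # correlation between first and second run times

cov = np.array([[sd1**2, r * sd1 * sd2],
                [r * sd1 * sd2, sd2**2]])
runs = rng.multivariate_normal([mu1, mu2], cov, size=n)
first, second = runs[:, 0], runs[:, 1]

# Overall improvement: the lab assistant's prediction.
print("mean first run :", first.mean())
print("mean second run:", second.mean())
# Fraction of rats whose second run is faster, i.e. points below y = x
# in the first-vs-second scatter.
print("fraction improving:", (second < first).mean())

# Regression effect: among rats with unusually slow first runs, the
# standardized second-run time is less extreme than the standardized
# first-run time was.
z1 = (first - mu1) / sd1
z2 = (second - mu2) / sd2
slow = z1 > 1.5          # rats more than 1.5 SDs slower than average on run 1
print("avg z-score, run 1 (slow group):", z1[slow].mean())
print("avg z-score, run 2 (slow group):", z2[slow].mean())  # pulled toward 0
```

With these made-up parameters, most points sit below the line y = x, and the slow group's average z-score drops from roughly 1.9 on the first run to roughly r times that on the second, which is the regression effect stated in the answer.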