From the Executive Editor
Critiquing the statistical methods section of a manuscript
In the last editorial, I discussed two fundamental principles of
statistical tests: pigs are randomly allocated to treatment, and
each observation is independent of the rest. This editorial will
cover the analysis of normally distributed outcomes and controlling
for clustering.
Statistical tests are selected on the basis of the expected
distribution of the outcome parameter or the dependent variable.
Think about the height of AASV members. Most people are of average
height, some are very short, others very tall, and the rest are in
between. Height represents normally distributed data and forms a
bell-shaped curve when it is graphed. We describe these data using
the average or mean height and the variation around the mean, often
with standard deviation or standard error. The standard error is
used in statistical tests to determine whether differences between
two groups are larger than we would expect by chance alone. If we
compare the height of AASV members with blue eyes to that of
members with brown eyes, we will likely find that the heights of
the two groups do not differ. However, if we compare the height of
women versus that of men, we will likely conclude that they differ
and that the difference is statistically significant. Although the
height of men is normally distributed and the height of women is
normally distributed, they represent two distinct bell-shaped
curves. In simplified terms, a statistical test compares the average
height of men minus two standard errors with the average height of
women plus two standard errors and determines whether these ranges
overlap.
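
As a minimal sketch of the comparison just described, the following Python computes the mean, standard deviation, and standard error for two groups and checks whether the intervals of the mean plus or minus two standard errors overlap. The height values and the function names are hypothetical, invented only for illustration, and the overlap check is the simplified picture used above rather than a formal test.

```python
import math
import statistics

def summarize(values):
    """Return (mean, standard deviation, standard error) for a sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample SD (n - 1 denominator)
    se = sd / math.sqrt(len(values))     # standard error of the mean
    return mean, sd, se

def intervals_overlap(group_a, group_b, k=2.0):
    """True if the mean +/- k standard errors of the two groups overlap."""
    mean_a, _, se_a = summarize(group_a)
    mean_b, _, se_b = summarize(group_b)
    low_a, high_a = mean_a - k * se_a, mean_a + k * se_a
    low_b, high_b = mean_b - k * se_b, mean_b + k * se_b
    return low_a <= high_b and low_b <= high_a

# Hypothetical heights in cm, invented purely for illustration.
men = [178.0, 182.5, 175.0, 180.0, 185.5, 177.0, 181.0, 179.5]
women = [164.0, 167.5, 161.0, 166.0, 170.5, 163.0, 165.5, 168.0]

print(summarize(men))
print(summarize(women))
print(intervals_overlap(men, women))
```

With these made-up numbers the two intervals do not overlap, which is the situation the text describes for men versus women.
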
Average daily gain (ADG) is a continuous variable that can be
measured to multiple decimal points and, within a group of pigs of
similar ages, tends to be normally distributed. We can use a
Student’s t-test to compare the ADG between two groups. For
example, we might compare the ADG of vaccinated versus unvaccinated
pigs in a finisher barn. The assumptions of the test are that the
observations are independent of one another, the variation in ADG
is the same in vaccinated and unvaccinated pigs, and the data are
normally distributed. However, the only way we can fulfill the
assumption of independence is to put only one pig in each pen. Once
pigs are grouped in pens, the assumption of independence is
violated. Average daily gain may cluster by pen for many reasons:
for example, if the feeder becomes plugged or the pen is drafty.
Anything that might affect the ADG of pigs in a pen makes the pigs
within the pen more similar than pigs from two different pens.
These are not independent observations, and if we treat them as
such, we violate a key assumption of the t-test and of most standard
statistical tests. We control for this clustering by including pen
as a variable in a multiple linear regression.
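
One common way to quantify how strongly pigs within a pen resemble one another is the intraclass correlation (ICC). It is not named in the text above, so treat the following as an illustrative sketch only. It uses the one-way analysis of variance estimator and assumes every pen holds the same number of pigs. The ADG values are hypothetical.

```python
import statistics

def intraclass_correlation(pens):
    """One-way ANOVA estimate of the ICC for equally sized pens."""
    k = len(pens)        # number of pens
    n = len(pens[0])     # pigs per pen (balanced design assumed)
    if any(len(p) != n for p in pens):
        raise ValueError("this sketch assumes the same number of pigs per pen")
    grand_mean = statistics.mean(v for p in pens for v in p)
    pen_means = [statistics.mean(p) for p in pens]
    # Mean square between pens and mean square within pens.
    ms_between = n * sum((m - grand_mean) ** 2 for m in pen_means) / (k - 1)
    ms_within = sum(
        (v - m) ** 2 for p, m in zip(pens, pen_means) for v in p
    ) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Hypothetical ADG (kg/day) for four pigs in each of three pens.
pens = [
    [0.82, 0.85, 0.80, 0.83],
    [0.90, 0.93, 0.91, 0.94],
    [0.74, 0.77, 0.75, 0.78],
]

print(intraclass_correlation(pens))
```

A value near 1 means most of the variation in ADG lies between pens rather than between pigs in the same pen, so the pigs are far from independent. A value near 0 means pen membership explains little.
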
Important assumptions of multiple linear regression are that
the observations are independent, that the errors are normally
distributed, that the variance of the errors does not change in a
systematic manner with the independent variables, and that the
errors sum to zero. Multiple
linear regression allows us to determine whether or not ADG differs
by vaccination status after controlling for other variables that
may affect ADG. We are asking, “Is vaccination status
associated with ADG after we control for pen, initial weight of the
pig, and gender of the pig?”
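
To make the idea of "controlling for" other variables concrete, here is a rough sketch of a least-squares fit of ADG on vaccination status, pen, and initial weight, solved from the normal equations in plain Python. The data, the column layout, and the function names are assumptions made for illustration. Pen enters as a simple 0/1 indicator for two pens, gender could be added as one more indicator column, and the sketch returns coefficients only, without the standard errors a real analysis would report.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical records: (vaccinated 0/1, in pen B 0/1, initial weight kg, ADG kg/day).
records = [
    (1, 0, 24.0, 0.86), (0, 0, 25.5, 0.81), (1, 0, 23.0, 0.84), (0, 0, 26.0, 0.80),
    (1, 1, 24.5, 0.91), (0, 1, 25.0, 0.85), (1, 1, 26.5, 0.93), (0, 1, 23.5, 0.84),
]
rows = [[1.0, v, p, w] for v, p, w, _ in records]   # leading 1.0 is the intercept
y = [adg for *_, adg in records]

coefficients = fit_ols(rows, y)
for name, value in zip(["intercept", "vaccinated", "pen B", "initial weight"], coefficients):
    print(name, round(value, 4))
```

The coefficient labeled "vaccinated" is the estimated difference in ADG between vaccinated and unvaccinated pigs with pen and initial weight held constant, which is the question posed above.
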
Once we have fitted the multiple linear regression model, we must
test its assumptions using a series of standard tests of
the residuals. The residuals are the differences between the
observed ADG and the ADG estimated by the model. If any of the
tests identify a problem, then the data must be reanalyzed using
another statistical technique. For example, the dependent variable
may have to be transformed mathematically so that its distribution
more closely matches a normal distribution. As a critical reader of
the statistical section of the materials and methods, I hope that
you verify that the authors have tested the assumptions of the
models. If they have violated the assumptions of a model, then the
conclusions made on the basis of the model are invalid.
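
The sketch below illustrates two of the ideas in this paragraph under simplifying assumptions: computing residuals as observed minus fitted values, and seeing how a log transformation can make a right-skewed outcome more symmetric. The model here is the simplest possible one (each group fitted by its own mean), skewness stands in for the fuller set of residual tests, and the data are hypothetical.

```python
import math
import statistics

def group_mean_residuals(groups):
    """Residuals from a model that fits each group by its own mean."""
    residuals = []
    for values in groups:
        fitted = statistics.mean(values)
        residuals.extend(v - fitted for v in values)
    return residuals

def skewness(values):
    """Moment-based sample skewness; near 0 for symmetric data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed outcome values, invented for illustration.
control = [1.0, 1.3, 1.7, 2.2, 3.0, 4.6]
treated = [0.8, 1.0, 1.4, 1.8, 2.5, 3.9]

raw = group_mean_residuals([control, treated])
logged = group_mean_residuals(
    [[math.log(v) for v in group] for group in (control, treated)]
)

print(sum(raw))                          # close to zero, up to rounding
print(skewness(raw), skewness(logged))   # skewness before and after the transform
```

With these made-up values the residuals on the log scale are noticeably less skewed than the residuals on the original scale. In a real manuscript the authors would report which residual checks they ran and what they found.
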
How does the analysis differ if we are still interested in
measuring ADG, but the treatment is applied to the pen rather than
to the individual pig? The unit of analysis must be the smallest
unit to which the treatment is applied. Therefore, if we compare
feed additives that are given to whole pens, the pen is the unit of
analysis, and the values compared are the average ADG of the pigs in
each pen.
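
A minimal sketch of a pen-level analysis follows: each pen is reduced to one value, its mean ADG, and the pen means are compared with the equal-variance Student's t statistic. The pen labels, ADG values, and function name are hypothetical. The sketch stops at the t statistic and degrees of freedom; converting these to a P value would need a t-distribution table or a statistics package.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic (equal-variance form) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, df

# Hypothetical ADG (kg/day) per pig, keyed by pen; values invented for illustration.
additive_pens = {
    "pen 1": [0.88, 0.91, 0.86],
    "pen 2": [0.93, 0.90, 0.95],
    "pen 3": [0.87, 0.89, 0.85],
}
control_pens = {
    "pen 4": [0.80, 0.83, 0.79],
    "pen 5": [0.85, 0.82, 0.84],
    "pen 6": [0.78, 0.81, 0.80],
}

# One value per pen: the pen, not the pig, is the unit of analysis.
additive_means = [statistics.mean(p) for p in additive_pens.values()]
control_means = [statistics.mean(p) for p in control_pens.values()]

t_stat, df = pooled_t_statistic(additive_means, control_means)
print(t_stat, df)
```

The degrees of freedom are 4, reflecting six pens, rather than 16, which would be the figure if the eighteen pigs were wrongly treated as independent observations.
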
Mixed models typically refer to multiple linear regression with
a random effect included in the model. Fixed effects, such as
gender and parity, are variables that we can reliably measure in
one study and repeat in another study. Random effects cannot be
repeated. Examples include farm or pen, which represent the
cluster of pigs. If I do a study in Ontario on 10 different farms,
I need to control for the farm effect. However, no one can repeat
the study in Iowa with the identical farm effect measured in
Ontario. By putting farm into the model as a random effect, we are
controlling for the random variation due to farm and then asking,
“After controlling for the random variation due to farm, is
there any variation in our outcome parameter due to the
treatment?” This random farm effect includes measurable and
unmeasurable farm factors that are not in the model.
The key concepts are these: the statistical methods must be
valid for the results to be valid, the statistical models must be
evaluated to determine whether or not the methods are valid, and
finally, the statistical methods must be explained in sufficient
detail for the reader to critique the methods and for another
researcher to repeat the study. In the next editorial, I will
discuss dependent variables that are not normally distributed.
--Cate Dewey, DVM, MSc, PhD