Research Corner: Toward Understanding and Using Statistical Power Analysis

Karyn Holm, PhD, RN, FAAN

Statistical power analysis, known simply as “power analysis,” is used to determine how many subjects will be needed when investigators plan to use parametric statistics to determine significant relationships and/or differences. Power analysis is now an expectation in the design of quantitative nursing studies, making it increasingly clear that investigators who intend to propose, secure funding for, conduct, and publish quantitative nursing research must understand and use power analysis.
Some investigators hesitate to use power analysis because they do not understand it. This need not be the case. When deciding how many subjects a quantitative study requires, the question is this: how many subjects are needed to increase the likelihood (probability) that the results of the proposed statistical analysis are true, and thereby a reflection of reality? Power analysis helps to ensure that there will be enough subjects to provide an adequate amount of data to detect relationships or differences, given a particular alpha, beta, effect size, and variation in the dependent variable.
Let’s begin with a caveat concerning hypothesis testing. Investigators may mistakenly think that research and statistical hypotheses are essentially the same, at times substituting one for the other. This is wrong and should not be done. A research hypothesis is generated from a framework or theory and is stated in a positive way, while a statistical hypothesis, linked to probability and a specific statistical test, is stated as a hypothesis of no difference or of no relationship, hence the term null hypothesis.

Example
A group of investigators is interested in determining differences between two groups. Group one, the experimental group, consists of paraplegic patients who undergo an eight-week upper body exercise program. Group two, the control group, consists of paraplegic patients who do not engage in the exercise program but rather conduct themselves as usual. The dependent variable (outcome variable) will be activities of daily living, which is measured in both groups prior to the intervention. The research hypothesis is stated as follows: Those paraplegic patients who undergo the eight-week upper body exercise program will increase their ability to perform activities of daily living as compared to the control group, who will not improve their ability to perform activities of daily living. The null hypothesis (the hypothesis of no difference) is stated as follows: Activities of daily living in group one (experimental group) will equal activities of daily living in group two (control group), against an alternative hypothesis that activities of daily living in group one (experimental group) will be greater than activities of daily living in group two (control group). The investigators conduct a pilot study, assigning 10 patients to the experimental group and 10 patients to the control group. They collect pilot data and then proceed to determine sample size for their proposed study.

Essential Components of a Power Analysis
To determine sample size for the study in our example, the investigators will need values for alpha, beta, effect size, and variation in the dependent variable. All are interrelated. Remember that alpha is the probability of a type 1 error, defined as falsely rejecting a null hypothesis. By convention, alpha is set at 0.05. Simply stated, there will be five or fewer chances in a hundred of falsely rejecting the null hypothesis, and thus of committing a type 1 error. Now let’s consider beta, defined as the probability of a type 2 error; by convention, beta is set at 0.20. A type 2 error occurs when statistically significant differences are not detected when in fact differences exist. In our example the investigators will use the conventional values for alpha and beta, 0.05 and 0.20 respectively.
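To make the two error rates concrete, here is a minimal simulation sketch in Python. It is not part of the original column. It assumes a hypothetical activities of daily living score with a control mean of 50 and a known standard deviation of 8, 50 subjects per group, and a one-sided z-test. All of these numbers, and the function name `rejection_rate`, are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejection_rate(true_diff, n_per_group, sd=8.0, alpha=0.05,
                   trials=2000, seed=1):
    """Share of simulated studies in which a one-sided z-test rejects the null.

    Assumes normally distributed scores and a known standard deviation,
    which is a simplification chosen for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    se = sd * sqrt(2 / n_per_group)            # standard error of the difference
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(50, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(50 + true_diff, sd) for _ in range(n_per_group)]
        if (mean(treated) - mean(control)) / se > z_crit:
            rejections += 1
    return rejections / trials


# Null hypothesis true (no real difference): every rejection is a type 1 error.
type1_rate = rejection_rate(true_diff=0, n_per_group=50)

# Null hypothesis false (real difference of 4 points): every non-rejection
# is a type 2 error.
type2_rate = 1 - rejection_rate(true_diff=4, n_per_group=50)

print(f"simulated type 1 error rate: {type1_rate:.3f}")
print(f"simulated type 2 error rate: {type2_rate:.3f}")
```

Under these assumptions the first rate should land close to the chosen alpha, while the second depends on the size of the true difference, the standard deviation, and the number of subjects.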
The next consideration is effect size, or how much of a difference in the dependent variable the investigators in our example wish to detect. Some use one of three categories (small, medium, and large) to signify the proposed effect size. Others stress the need to express effect size in the unit of measurement of the dependent variable, in other words the unit of measurement used in the response. Recall that our investigators conducted a pilot study and thus now have an understanding of how much of an effect they can expect to observe in activities of daily living following the exercise intervention in the experimental group. Whichever approach is taken, one should be clear that the smaller the difference one wishes to detect, the more subjects will be needed to detect a significant difference.
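The relationship between the difference to be detected and the number of subjects can be sketched with a common normal-approximation formula for comparing two group means, n per group = 2 × ((z for alpha + z for beta) × SD / difference)². The Python sketch below is an illustration under stated assumptions, not the method of any particular power analysis program: it uses a one-sided test (matching the directional alternative hypothesis in our example), the conventional alpha of 0.05 and beta of 0.20, and a hypothetical standard deviation of 8 score points. The function name `n_per_group` is invented for this example. A calculation based on the t distribution would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(difference, sd, alpha=0.05, beta=0.20):
    """Approximate subjects needed per group for a one-sided,
    two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # critical value for a one-sided test
    z_beta = z.inv_cdf(1 - beta)     # value corresponding to power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Hypothetical standard deviation of 8 points; halving the difference
# to be detected roughly quadruples the required sample size.
for difference in (8, 4, 2):
    print(f"difference of {difference} points: "
          f"{n_per_group(difference, sd=8)} subjects per group")
```

Because the difference appears squared in the denominator, each halving of the difference the investigators wish to detect multiplies the required number of subjects by about four.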
The final consideration is that our investigators must determine the extent of variation in the dependent variable. Sometimes investigators rely on published research reports for an estimate of the standard deviation of the dependent variable. An alternative to using published research is to use pilot study data of one’s own and calculate the standard deviation of the dependent variable from these data. In this instance, our investigators conducted a pilot study, and the data they collected enabled them to determine variation in activities of daily living in both groups of patients. Either published research or pilot study data are acceptable avenues for estimating variation in the dependent variable. Investigators, however, should not overlook the advantages of conducting a pilot study, which extend beyond estimating variation in the dependent variable to understanding and refining data collection and data analysis procedures, thus improving the conduct of a proposed study.
A primary reason for failing to detect statistically significant differences is not having sufficient statistical power, a function of having too few subjects in the proposed study. Recall from your statistics classes that power is defined as one minus beta. Thus as beta (the probability of a type 2 error) increases, by definition power will decrease, and vice versa. A study may also suffer from having too many subjects, the end result of which is that even the smallest effects or differences, of questionable importance, are deemed statistically significant.
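As a rough illustration of these last two points, the Python sketch below estimates a pooled standard deviation from two pilot groups of 10 and then approximates power (one minus beta) for several sample sizes. The pilot scores are invented for illustration and are not data from the study described here. The function names `pooled_sd` and `approx_power`, the one-sided normal approximation, and the 4-point difference are likewise assumptions.

```python
from math import sqrt
from statistics import NormalDist, stdev

# Hypothetical pilot scores (invented for illustration only).
experimental = [62, 70, 55, 68, 74, 59, 66, 71, 63, 60]
control = [58, 61, 52, 65, 57, 49, 63, 60, 55, 54]


def pooled_sd(a, b):
    """Pooled standard deviation of two independent groups."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return sqrt(((len(a) - 1) * var_a + (len(b) - 1) * var_b)
                / (len(a) + len(b) - 2))


def approx_power(n_per_group, difference, sd, alpha=0.05):
    """Approximate power (1 - beta) of a one-sided, two-group
    comparison of means, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    return z.cdf(difference / se - z_alpha)


sd = pooled_sd(experimental, control)
print(f"pooled SD from pilot data: {sd:.2f}")

# Power rises as subjects are added, for a fixed 4-point difference.
for n in (10, 25, 50, 100):
    print(f"n = {n:3d} per group: power = {approx_power(n, 4, sd):.2f}")
```

With too few subjects the approximate power falls well below the conventional 0.80, while with very many subjects power approaches 1 even for differences that may be of little practical importance.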

Conducting a Power Analysis
Power analysis programs are available online at no charge; however, free programs should always be used with caution, as they may contain flaws that are not easily corrected. Using programs from reliable sources is preferred. Statistical power analysis programs are available from established statistical analysis companies that specialize in developing and validating statistical programs and that offer online support to assist investigators in understanding and using power analysis. Specific examples include SPSS Incorporated, SAS, and Minitab.

Conclusion
Power analysis to determine sample size is now an expectation in the design of quantitative nursing studies. The best sample size determinations are those that represent a balance between alpha, beta, and effect size and that further consider the degree of variation in the dependent variable. While published data are often used to estimate effect size and variation in the dependent variable, investigators are urged to conduct pilot studies that will give them estimates of these factors that are unique to their proposed sample.

Karyn Holm, PhD, RN, FAAN, is a Professor of Nursing at DePaul University in Chicago, Illinois.

Matthew Sorenson, PhD, RN, is an Assistant Professor of Nursing at DePaul University and Editor of the Research Corner. He welcomes your questions, comments, and suggestions. Contact Matthew at msorenso@depaul.edu
