Understanding Clinical Trials: Why They Are Done and What Is Learned
Gary Cutter, PhD and Brian W. Waldersen, BS––University of Alabama, Birmingham, Alabama

Introduction
Almost every day we hear about the results of a new trial that has changed the way we think about a treatment, or confirmed what we already know or believe. We learn of clinical trial results so frequently that it is sometimes easy to develop skepticism about them. What follows is a brief synopsis of clinical trials: what they are and what we hope to learn from them. The purpose of this feature article is to point out their necessity, what is required to undertake them, and what potential participants like yourself need to know, especially since there are over 140 clinical trials currently enrolling subjects with multiple sclerosis (MS). A quick reference for many of these concepts can be found in reference 1 at the end of this article.
What are Controls and Why are They Necessary?
The concept of comparing two groups, one getting one treatment and the other getting another or no treatment as a so-called “control,” is central to understanding what intervention works better. Think about when you have a fever. If every time you get a fever, you drink orange juice until you feel better, does that mean that orange juice is what made you better? Late night TV is filled with such coincidences. What we want to know is whether you would beat your fever if you didn’t drink orange juice, but we can’t do that within the same person. Thus, we might take two similar people and give one orange juice and the other water and see who gets better to give us some idea of what works. Obviously, we need more than two people for evaluating a treatment and to build a reliable clinical trial. Investigators must make sure that all variables are the same among subjects between the treatment groups, thus making them comparable (as if it were the same group; one getting the treatment under investigation and the other a comparison or control treatment).
So, you might ask why can’t we take a history of what has happened to a group of individuals, then give them the drug of interest and compare them to their past behavior. This is called using historical controls. Since many diseases are progressive, rather than stable, a snapshot of a disease course at one point in time can introduce an unwanted variable: namely, the ever-changing progression of the disease. Thus, the individuals during the treatment period might actually seem worse when compared to the historical period and we would wrongly conclude that the drug doesn’t work. That is why in clinical trials, we require so-called “contemporary or concurrent controls.” The most effective comparisons involve two or more treatment groups identified in the same way and given their treatments free of biases that could influence the outcomes.
Conceptually, we would like to treat the exact same person with each treatment. We would first give either the experimental or control treatment, then “turn back the clock” so the patient receives the second treatment at exactly the same point in his or her disease. We would also view the results after the same amount of follow-up time. Of course, comparing two treatments in one person at the same time is impossible, so we take a similar group of patients and split them into two groups. We then follow the two groups forward in a manner that mimics the “turning back of the clock” idea.
What is Randomization?
The concept of providing treatments to similar patients, free of bias or beliefs, is a hallmark of clinical trials. Clinicians often have beliefs about the use of treatments given the patient’s specific situation, even though the overall evidence for such benefit is incomplete. Investigators may be correct or they may just have developed a bias for a treatment. When investigators participate in a trial, they have some doubt, even though they may have hunches or so-called biases. This doubt is called equipoise: being unsure about the benefits and risks of one treatment compared to another. For this reason, investigators agree to assign patients to specific treatment groups in such a way that their beliefs do not enter into the treatment assignment. This allows the treatments to be given to the recipients free of bias and enables a fair answer to the equipoise question.
We do this by a process that is similar to tossing a coin: randomization. These trials are known as Randomized Clinical Trials (RCT). By randomizing treatment, investigators avoid the preferential treatment bias. When reading about trials, preference is given to studies that are: prospective (planned in advance); randomized (‘flip of a coin’ assignment to a treatment group); controlled (carefully implemented studies with standardization of procedures and outcome measurements amongst all sites and personnel, along with groups that are comparable); and analyzed according to the group to which the patient was randomized, even if they had to discontinue the treatment (known as intent-to-treat analysis). Why do we do the analysis on all the patients randomized to the treatment even if they had to quit? This avoids bias. Suppose I said pinching your upper lip when you get a nose bleed will stop the nose bleed if you hold it for 60 minutes. If I only evaluate the subjects who held their upper lip for 60 minutes, I would have more successes. (Who is going to sit there holding their lip for 60 minutes if their nose is pouring blood? Only those whose nose bleed stops pretty quickly.) Thus, looking only at those who followed the instructions will give us a higher rate of success than counting all who tried the treatment. This is the “wisdom” on which many testimonial treatments and “old wives’ tales” are based.
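The nose bleed example can be put into numbers. The short Python sketch below is only an illustration: the function name and all of the counts are made up for this purpose and do not come from any real study. It compares the success rate among those who followed the instructions with the success rate among everyone who was assigned the treatment.

```python
def success_rates(completers, completer_successes, dropouts, dropout_successes):
    """Return (completers-only rate, intent-to-treat rate) for one treatment group."""
    completers_only = completer_successes / completers
    everyone = completers + dropouts
    intent_to_treat = (completer_successes + dropout_successes) / everyone
    return completers_only, intent_to_treat

# Hypothetical counts: 100 people are told to pinch their lip for 60 minutes.
# 30 manage the full 60 minutes, and 27 of them count as successes.
# 70 give up early, and only 7 of those count as successes.
completers_only, intent_to_treat = success_rates(30, 27, 70, 7)
print(f"Completers only: {completers_only:.0%}")   # 90%
print(f"Intent to treat: {intent_to_treat:.0%}")   # 34%
```

With these invented numbers, the treatment looks like a 90 percent success if we count only those who held on, but only a 34 percent success when everyone assigned to it is counted.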
Why are Inclusion and Exclusion Criteria Necessary?
Researchers take great care in determining which patients are eligible for a clinical trial. The criteria for this decision are extensive, but include that it is safe to flip a coin for the patient to receive one or another treatment and that no patient receives inappropriate treatments. In fact, the extensive consideration of who should be treated (inclusion criteria) and who should not (exclusion criteria) is often far more scientific than any patient would experience in a one-to-one, treatment-decision situation with his or her own physician. Establishing these criteria is often conducted by a group of scientists and is always reviewed by an ethics board or Institutional Review Board and a Data and Safety Monitoring Committee (an independent group of experts whose job it is to assess the safety of the trial while the study is proceeding).
Why Do We Use “Placebos”?
So far, we have set the stage for a viable clinical trial; we now want to show that a treatment works. The easiest way to show something is effective is to show that it works better than if nothing was done.
Many people know the terms “placebo” or “dummy” treatment: a treatment that looks, acts, tastes or is similar in every way to the comparison treatment except for the active ingredients. There are several reasons for using placebos. First, doing almost anything in medicine seems to have at least a temporary effect, known as the placebo effect. These are real improvements and not simply patients being fooled. We attribute them to the body’s ability to respond to expected improvements. The results of the placebo group enable us to measure how much improvement or lack of deterioration is due to this placebo effect. Subtracting this improvement from the improvement found with the experimental treatment enables us to estimate the actual effectiveness of the drug. We expect some improvement to occur with placebo administration, allowing the clinical trial to ethically continue. We also use placebos to guard against the clinicians’ biases. Often the clinicians are not informed as to which treatment is used. Likewise, if a placebo is used when injectable drugs are studied, we can assess how much of the injection site redness is due to the injection itself and how much more is due to the drug.
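The subtraction described above is simple arithmetic. As a minimal sketch, with a hypothetical function name and invented counts chosen only for illustration:

```python
def improvement_due_to_drug(treated_improved, treated_total,
                            placebo_improved, placebo_total):
    """Percentage-point improvement beyond what the placebo group showed."""
    treated_pct = 100 * treated_improved / treated_total
    placebo_pct = 100 * placebo_improved / placebo_total
    return treated_pct - placebo_pct

# Hypothetical trial: 40 of 100 patients improve on the drug,
# while 25 of 100 improve on the look-alike placebo.
print(improvement_due_to_drug(40, 100, 25, 100))  # 15.0
```

In this made-up case, 25 of the 40 percentage points of improvement would have been seen anyway, and only about 15 can be credited to the active ingredient.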
Trials using placebos are carefully considered and must ethically justify the use of an inactive treatment. On the other hand, using placebos requires fewer subjects and shortens the time frame of the trial, so that fewer patients are exposed to potentially ineffective new drugs, and for a shorter length of time. This is because it is easier to see the difference in results between an active drug and a placebo than between two active treatments. A trial comparing two active treatments often requires 4 times as many patients and thus costs up to 4 times as much. Furthermore, by using a placebo, we get a better idea of just which side effects and serious adverse consequences are due to the active drug rather than to the disease itself.
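Where does a figure like "4 times" come from? One common textbook approximation says the number of patients needed grows with the inverse square of the difference we are trying to detect. The Python sketch below uses that approximation for comparing two group averages. The function name, the 5 percent significance level, the 80 percent power, and the assumption that the gap between two active drugs is half the gap between drug and placebo are all illustrative choices, not figures from any particular trial.

```python
import math
from statistics import NormalDist

def patients_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect a given difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired chance of detecting the effect
    return math.ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)

# Assumed: drug vs. placebo differ by half a standard deviation,
# while two active drugs differ by only a quarter of a standard deviation.
versus_placebo = patients_per_group(difference=0.50, sd=1.0)
versus_active = patients_per_group(difference=0.25, sd=1.0)
print(versus_placebo, versus_active, versus_active / versus_placebo)
```

Under these assumptions, halving the difference to be detected multiplies the required number of patients by about four.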
What Are The Phases of Clinical Trials?
Developing drugs requires careful stepwise increments to ensure that patients are reasonably safe when entering trials. This has led the FDA to require a specific set of stages or phases. The various phases are the preclinical phase, Phase I, Phase II, Phase III, and Phase IV trials.
Pre-clinical trials involve basic laboratory investigation in cells or in animals. If the results are positive, investigators must then identify the formulation for dosing in humans; the drug maker must also apply to the FDA for an investigational new drug application (IND Application), which requests permission to begin trials in humans. The FDA examines the preclinical data and makes a determination (based on many safety parameters) as to whether or not the trial may proceed in humans. Then a Phase I clinical trial is conducted usually in healthy people, with the primary objectives (1) to identify the lowest effective dose that should be given to humans and (2) to assess any toxicity of a new drug in normal, healthy volunteers.
In Phase II Clinical Trials, the objectives are to (1) ensure that the drug provides some degree of effect and (2) ensure safety without too much toxicity in the diseased population. Often there is more than one Phase II study for a drug in development: an initial study to gain one level of knowledge, such as drug dosage; and a second study to refine assessments of safety or outcome of the treatment. Phase II studies are referred to as “proof of concept” studies and are essential in planning larger Phase III Trials.
Phase III studies are required for the final proof of safety and efficacy of a drug in development. These Phase III trials are called definitive or so-called “pivotal” trials. They are often large and generally take years to conduct. The design of Phase III trials usually involves periodic assessments of the treatment responses, along with assessments of side effects and/or toxicity. Phase III trials are warranted if a new treatment shows some promise of effectiveness, possibly with fewer side effects than known drugs. The goal is to establish the effectiveness of the treatment to the FDA with reasonable safety. All drugs have side effects and we need to learn about the severity and risks associated with them. We can then weigh the effectiveness of a drug against its risks.
Since Phase III trials usually require large numbers of patients, obtaining enough study participants usually means using multiple institutions, often in several countries, to recruit the required numbers of patients in a reasonable time frame. The large sample size allows Phase III trials to provide more information about side effects and tolerability of treatments, along with their impact on quality of life and longer term effects of the treatments. Often we monitor for safety in terms of things we know, but sometimes it is the unexpected that our careful monitoring in Phase III trials catches. An example of an unexpected outcome was the appearance of the first cases of progressive multifocal leukoencephalopathy (PML) seen with Tysabri® during the Phase III Trials.
In Phase IV Clinical Trials, which occur after the FDA approves the treatment for prescribing by clinicians, the objectives are to gain additional knowledge regarding treatment and long-term safety data, since treatments are prescribed in practice where the rigor of inclusion and exclusion criteria is often not as carefully followed. These “post-marketing studies” can identify uses that were not specified in the pivotal clinical trials. They can also identify any unexpected outcomes that occur at such low frequencies that they would not appear in the pivotal trials. Examples of this include finding birth defects in the offspring of people taking certain drugs. Too few births occur within the Phase III trial and it is not until the post-marketing phase that a small number is seen. These risks are hard to identify, but often there are theoretical reasons to look for some problems, even though they were not observed during the Phase III Trials.
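A small calculation shows why rare problems tend to surface only after a drug is on the market. If each patient independently has the same small risk of a side effect, the chance of seeing at least one case among a group of patients is 1 minus the chance that nobody is affected. The risk and the patient counts below are hypothetical and chosen only to illustrate the point.

```python
def chance_of_seeing_event(risk_per_patient, patients):
    """Probability that at least one of `patients` people experiences the event."""
    return 1 - (1 - risk_per_patient) ** patients

# Hypothetical side effect that strikes 1 patient in 10,000.
risk = 1 / 10_000
in_trial = chance_of_seeing_event(risk, 1_000)         # a Phase III-sized group
after_approval = chance_of_seeing_event(risk, 100_000)  # post-marketing use
print(f"Chance of at least one case among 1,000 patients:   {in_trial:.1%}")
print(f"Chance of at least one case among 100,000 patients: {after_approval:.1%}")
```

With these assumed numbers, a trial of 1,000 patients has only about a 1 in 10 chance of seeing even a single case, while the event is almost certain to appear once 100,000 people have taken the drug.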
While many clinical trials are conducted at different sites, all use the same protocol, make measurements in the same way and look for the same benefits and risks. This allows results to be combined, so that the greater numbers give increased statistical “power” to demonstrate effectiveness, thus attempting to bring drugs to the market sooner rather than later. Many more drugs start the process than finish it.
What is Masking the Treatment and Who Does It?
Logistically, the running of a clinical trial is extremely complex. Developing a protocol and manual of procedures is necessary to guide the trial. These documents help ensure that all the sites in the trial are working in the same way, using the same definitions and measurements and following the same rules while evaluating the patients. This type of organization, implementation, and monitoring is usually accomplished by a coordinating center consisting of a group of individuals who have the responsibility to ensure a common protocol, provide the treatment assignments before patients are selected, and in the end, provide the collective analyses.
Another vital component of a clinical trial is the concept of masking or blinding. Masking ensures that the type of treatment (level of dose, active vs. placebo) is not revealed to the parties involved in treatment. Trials may be single-blinded, double-blinded, or triple-blinded. Double-blind trials are the most common: both the patient and the clinician are blinded or masked from knowing what a patient is receiving, except that it is one of the treatments from the study. Using masking, we can effectively prevent bias due to how patients are treated or assessed. If the patient knows they are on placebo, it could negate the placebo effect, and the patient may drop from the trial. Why should they continue if they are on the “nothing” drug? With masking it is easier to remain in a trial. It should be noted that some treatments are harmful and being on the placebo may really be the better treatment, but the perception would be “if I am on the placebo, I’m going to do worse.” Masking prevents this incorrect view (remember, equipoise means we don’t really know).
If the physician knows the patient is on a placebo, they too might be biased. They might discount any side effects since they “know” it could not be from the drug. In turn, this would bias the assessment of side effects. One needs to collect data from both the placebo and treated groups in a blinded manner, and then compare the differences in order to ensure an accurate measurement of how much change is due to the active drug.
Triple blind just means that the people who are analyzing the results do not know which group is the real treatment. They use codes, such as A and B, in their assessment of safety and efficacy.
Masking is not always possible. For example, a trial comparing surgery to medicine would clearly not be able to mask the patient from their scar. Other times, a classic response to treatment unmasks the treatment assignment, such as a large change in heart rate on certain drugs or the absence of hot flashes in women given hormone replacement therapy. However, telling people about possible side effects will produce them even in people on a placebo, so if the treatments can be masked, we can get an accurate assessment of how much more a treatment produces over the comparison treatment. In trials where the subjects must actively participate in the treatment, such as a low fat diet or exercise trial, masking is also impossible. In trials such as these, the common approach is to mask the person making evaluations and/or to have an independent observer (who does not know which treatment has been used) assess the outcome. In such situations, we often try to use outcomes that are totally objective, such as pregnancy in a treatment trial aimed at increasing fertility.
How is Safety Monitored?
The results are viewed over time by the coordinating center and by a special advisory group to the trial called, as we noted above, a Data and Safety Monitoring Committee (DSMC). The DSMC members are not involved in the trial and have no financial stake in the trial or the agents used. They are responsible for monitoring the safety and often effectiveness of the treatment throughout the trial duration. This committee sees unblinded data with the charge to recommend stopping a trial if clear evidence of benefit or harm is discovered before the trial is scheduled to end. This is not an easy task because patients do not enter trials on the same day, and thus, the DSMC is always working with partial information. There are carefully crafted statistical rules to prevent erroneous early termination of a study. Despite the shocking headlines when a bad outcome occurs, most trials are safe and in fact, the stopping of trials early for safety reasons (which is usually not presented this way in the media) is a sign that our safety monitoring is working. A trial is stopped because unexpected results have occurred and the DSMC has deemed that patient safety trumps completing the trial as scheduled.
Not all trials, however, end with positive results. The media often laments or touts this as failure, but in the search for effective treatments for diseases, trials that end with negative results are actually successes. Without such failures, many would believe in the benefits of a drug and might use something that costs a great deal and has limited or no value. Keep in mind that a positive expectation for a treatment is not the same as proving it is effective. The FDA has long required that we demonstrate effectiveness, and when a trial fails, most often it means that a treatment has not met the standards necessary to show success. We also want as much information as possible that a treatment is safe before we subject multitudes of patients to the negative effects of treatment.
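Why are special statistical rules needed for stopping a trial early? The simulation below is a rough sketch, not a description of how any real DSMC works. It imagines many trials in which the two treatments are truly identical, and checks the data at five interim looks using the ordinary threshold (a z statistic beyond 1.96) that would be used for a single final analysis. The function names, sample sizes, number of looks, and success rate are all made up for illustration.

```python
import math
import random

def z_statistic(successes_a, successes_b, n_per_group):
    """Two-proportion z statistic for two equal-sized groups."""
    pooled = (successes_a + successes_b) / (2 * n_per_group)
    if pooled in (0.0, 1.0):
        return 0.0
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    return (successes_a - successes_b) / n_per_group / se

def false_alarm_rates(n_trials=2000, n_per_group=250, looks=5, rate=0.3, seed=1):
    """Return (false alarms at any look, false alarms at the final look only)."""
    rng = random.Random(seed)
    step = n_per_group // looks
    any_look = final_look = 0
    for _trial in range(n_trials):
        a = b = 0
        crossed = False
        for look in range(1, looks + 1):
            for _patient in range(step):
                a += rng.random() < rate  # both groups share the same true rate
                b += rng.random() < rate
            z = z_statistic(a, b, look * step)
            crossed = crossed or abs(z) > 1.96
        any_look += crossed
        final_look += abs(z) > 1.96
    return any_look / n_trials, final_look / n_trials

any_look, final_look = false_alarm_rates()
print(f"False alarms when testing at every look: {any_look:.1%}")
print(f"False alarms when testing only at the end: {final_look:.1%}")
```

In this sketch, peeking repeatedly with the ordinary threshold declares a difference far more often than a single final test does, even though no real difference exists. Formal monitoring rules guard against this by demanding stronger evidence at early looks.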
How Long Are Clinical Trials?
Most trials are planned to follow patients for a fixed amount of time, although this varies between trials. The duration of the trial is usually 1-2 years for multiple sclerosis trials, but some, including the current combination therapy trial CombiRx, are scheduled to go at least 3 years with a few patients being followed for up to 6 years. The CombiRx trial completed its recruitment in April 2009 and the results will be available in 3 years. Information from these studies accumulates gradually, and the results cannot be released until the last patient has finished. Most trials continue to their planned termination time. The fact that most trials are not stopped early is in some ways a testament to the planning and prior information used in the various phases of trials and the hurdles put in place to move from one to the next.
Can I Trust What I Read About The Results of a Clinical Trial?
Truthfulness in reporting of trials is critical and expected. The scientific community pressures investigators to make public statements of clinical trial endpoints before trials start. This assures the public and the scientific communities that the findings reported are indeed the ones the trial set out to examine. In these statements, researchers prospectively define hypotheses, clinical objectives, and planned subgroups that may have different benefits or risks.
This information is available at the Web site www.clinicaltrials.gov, which lists most of the ongoing trials. All National Institutes of Health (NIH) trials must be listed on this site before patients can be entered into the trial. This prevents investigators from “data mining,” a term used to describe the act of looking for extra value in a treatment, value that was not expected or planned before the trial. Data mining results may be questionable in their validity, not because someone is dishonest, but because when looking at every possible subgroup of people, one may find a chance result. Think about flipping a fair coin. If you flip a coin 5 times, it is possible to get 5 heads in a row, but rare (about 3 percent of the time). However, if you flip the coin 1,000 times, it is much more likely that somewhere within those 1,000 tosses you can find 5 heads in a row. A single trial of 5 tosses that yields 5 heads is real evidence that the coin does not have an equal chance of heads or tails, whereas declaring that the coin is biased because you found 5 heads together once within a sequence of 1,000 tosses seems like less evidence (and it is). Mandating that all NIH trials be listed on this site also prevents investigators from obscuring the details of unexpected additional results when given in a publication or presentation.
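The coin example can be checked exactly. The Python sketch below (the function name is our own) tracks, toss by toss, the probability that no run of heads of the required length has appeared yet, and returns the chance that such a run shows up somewhere in the sequence.

```python
def prob_run_of_heads(n_tosses, run_length):
    """Probability that a fair coin shows `run_length` heads in a row
    at least once somewhere in `n_tosses` tosses."""
    # state[j] = probability that no full run has occurred so far
    # and the current streak of heads has length j
    state = [0.0] * run_length
    state[0] = 1.0
    for _ in range(n_tosses):
        new_state = [0.0] * run_length
        for streak, p in enumerate(state):
            new_state[0] += p * 0.5               # tails: streak resets
            if streak + 1 < run_length:
                new_state[streak + 1] += p * 0.5  # heads: streak grows
            # otherwise heads completes the run and leaves the "no run" states
        state = new_state
    return 1.0 - sum(state)

print(f"5 heads in 5 tosses:               {prob_run_of_heads(5, 5):.3f}")
print(f"5 heads in a row within 1,000 tosses: {prob_run_of_heads(1000, 5):.6f}")
```

Five heads in five tosses happens about 3 percent of the time, as stated above, while finding five heads in a row somewhere within 1,000 tosses is a near certainty. That is why a striking result found only by searching through many subgroups deserves much less weight than one that was predicted in advance.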
Conclusion
In summary, many technical components are involved with clinical trials. Everyone is searching for better treatments and wants to see positive results. We want better treatments, faster and free of risks. When reading about successes, one needs to understand that this is just one study and without other studies, we cannot be sure that the risks and benefits are reproducible. This matters because the patients in the trial have already received the treatments, but when you are treated you want to have some confidence that what happened in the study, on average, will happen for you. Scientists are often reluctant to accept the results of a single study, no matter how large or how expensive. Scientists like replication before they begin replicating the treatments on their patients. We expect that a synthesis of results can lead us to the treatments that work.
References
1.) Understanding Clinical Trials: http://www.clinicaltrials.gov/ct2/info/understand