Articles | March 21, 2013

Evaluating evaluations

Practitioners are slammed with information overload. To help cut through the avalanche, a set of questions can be useful for determining whether or not a study is worth investing your time and attention.

Most practitioners are inundated daily with new information describing products or treatment choices. The inflow is familiar to every professional and includes hard-copy journals, e-journals, newsletters, and advertising brochures. Among the possible lines of evidence supporting safety and effectiveness, the presentation of clinical results is always a big plus. However, not every study report should be treated with the same degree of reverence.

After reading about 20 x 10⁴ articles during my career as a clinical scientist, I want to share my criteria for assessing studies. The following list of questions may help readers cope with the daily flood of information.

Category 1: Study Design

Was this a randomized, controlled, double-masked, prospective trial (RCT)?

You can earn 4 credit hours at a prestigious university learning why this is the gold standard among study designs. There are many benefits associated with this approach, but implementation is not universally practical.

The masked RCT design helps reduce the potential for bias that occurs when patients, doctors, and staff learn their assigned treatment conditions prior to or during the evaluation. Randomization, for example, lowers the probability that the worst patients are inadvertently assigned to one treatment group. A prospective, controlled design, addressing a well-defined, simple question, reduces the potential that a confounding variable will affect the study results by impacting one treatment group and not the other. Overall, this design leads to a stronger cause-and-effect determination.
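The random-assignment step can be sketched in a few lines. The following is a minimal illustration of permuted-block randomization, one common way to keep treatment groups balanced. The function name, patient identifiers, block size, and seed are all hypothetical choices made for this sketch, not details from any particular trial.

```python
import random

def permuted_block_assignment(patient_ids, arms=("test", "control"),
                              block_size=4, seed=2013):
    """Assign each patient to an arm using shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    assignments = {}
    block = []
    for pid in patient_ids:
        if not block:
            # Start a new block holding each arm equally often, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
        assignments[pid] = block.pop()
    return assignments

# Hypothetical patient identifiers, for illustration only.
patients = [f"P{i:03d}" for i in range(1, 21)]
allocation = permuted_block_assignment(patients)
counts = {arm: list(allocation.values()).count(arm) for arm in ("test", "control")}
print(counts)  # 20 patients in blocks of 4 gives 10 per arm
```

Because neither the investigator nor the patient chooses the arm, sicker patients have no systematic route into one group. In a real masked trial the allocation list would also be concealed from site staff.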

However, not every study can be conducted as a masked RCT. Studies involving medical devices are particularly challenging. For example, the surgical implantation of an IOL with a unique design creates problems when trying to mask the surgeon. The same is true with unique contact lenses. In the lens care arena, we might want all subjects to use disinfecting solutions dispensed from identical bottles. However, the transfer of the solutions to new bottles can significantly alter the product’s chemistry due to interactions with the plastics.

What was the control treatment?

First, I ask whether the control treatment represents a relevant clinical choice. A head-to-head comparison between two competing treatment approaches can answer key medical and practice management questions. The study might determine that the treatments were basically equivalent or that there were large differences in safety or effectiveness. However, clever study designers seeking the magic p<0.05 level for statistical significance may choose to aim low, using an older treatment approach as the control in order to ensure that the new treatment wins. While this strategy helps ensure FDA approval and market entry, the old treatment may not represent current standards of practice. In addition, establishing clinical equivalence (i.e., not substantially different) is often much easier than establishing superiority.

Even when a contemporary control treatment is planned, the specific treatment selected can impact study outcomes. If the control (e.g., an established, marketed, effective pharmaceutical) is expected to perform equivalently to the test treatment, the expectations of both the patients and the doctors rise. All measures of effectiveness tend to improve when the participants realize that 50% of the patients were assigned a treatment condition recognized as effective. If an ineffective, untested treatment is used as a control (e.g., a placebo), expectations are often driven downward.

Were the right subjects recruited?

Here’s the dilemma. The perfect study is populated with patients who show up for all visits and complete all case report form questions with few errors. Perfect subjects have few non-essential pre-existing conditions that impact treatment efficacy and have a low probability for developing unrelated adverse events. Perfect patients are never extremely old, never too young, and never take any OTC or prescription products except those being investigated. Recruiting perfect study subjects allows for analyses to be completed quickly and makes interpretation easy. Unfortunately, once a drug or device is approved, practitioners cannot control the cases that walk into their waiting rooms. Compared to the study population, real patients often include older and younger subjects and pregnant women, individuals who represent a much broader cross-section of the population. Enrolling more “Main Street” patients often drives up the number of patients targeted for enrollment and increases study costs. However, there is strong support for broader, more realistic inclusion/exclusion criteria that better predict product performance in the marketplace.

Was the number of enrolled patients sufficient?

Numbers matter. Anecdotal reports concerning unique cases are extraordinarily valuable for the progression of science and medicine. This new, initial information is a catalyst for further explorations. However, these seeds should not be mistaken for mature trees. The smaller the study, the smaller the generalization. What is more believable: a 10% advantage in the treatment group in a study involving 10 patients or a 10% advantage in a trial involving 1,000 patients? Common sense (and a good statistical analysis) leads to the same conclusion.
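To see why the larger trial is more believable, consider a rough sketch that puts a 95% confidence interval around the same 10-point advantage at both sample sizes. The success rates (60% versus 50%) and the even split of patients between two arms are assumptions made purely for illustration. The normal approximation used here is crude for very small groups, which is part of the point.

```python
from statistics import NormalDist

def difference_ci(rate_test, rate_control, n_per_arm, confidence=0.95):
    """Normal-approximation confidence interval for a difference in proportions."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = (rate_test * (1 - rate_test) / n_per_arm
          + rate_control * (1 - rate_control) / n_per_arm) ** 0.5
    diff = rate_test - rate_control
    return diff - z * se, diff + z * se

# Assumed rates: 60% success with the test treatment, 50% with the control.
small_low, small_high = difference_ci(0.60, 0.50, n_per_arm=5)    # 10 patients total
large_low, large_high = difference_ci(0.60, 0.50, n_per_arm=500)  # 1,000 patients total

print(f"10 patients:    {small_low:+.2f} to {small_high:+.2f}")
print(f"1,000 patients: {large_low:+.2f} to {large_high:+.2f}")
```

Under these assumptions, the interval for the 10-patient study spans zero by a wide margin, so the data are compatible with the test treatment being better, worse, or no different. The interval for the 1,000-patient trial stays above zero.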

Was the number of sites sufficient?

Having more study sites matters. Once again, the issue is the generalization of study results. Will the solid results observed among patients in Brooklyn apply to patients in Iowa City? Having a wide geographic diversity has numerous benefits. It helps address confounding variables such as seasonal effects, ethnic variability, humidity, and more. A prominent example is the success of extended-wear lenses worn at sea level (Norfolk, VA) as opposed to higher altitudes (Denver, CO).

Was the study duration sufficient?

Sometimes the answer to an important medical question can be obtained very quickly. For example, was a contact lens comfortable upon application? A quick study can address some simple questions. However, the development of adverse reactions and primary packaging failures are two examples of issues where time matters. The weak antimicrobial profile of a lens care product might be identified only after several months of daily usage.

Category 2: Interpretation

Were inferential statistics performed?

We all hate statistics and suspect that we are being manipulated by the clever geeks. But these calculations can provide useful tools that help guide interpretation. A statistically significant p value (p<0.05) means that, if the two treatments truly did not differ in the primary endpoint, a difference at least as large as the one observed would occur by chance less than 5% of the time. Declaring a difference when none exists is called a Type I error. In other words, the results reported might have been a fluke, but the chances for this are small. Statistics should not be the only criterion for measuring success, but they provide strong support.
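The Type I error rate can be made concrete with a small simulation. The sketch below runs many hypothetical two-arm trials in which both treatments share the same true success rate, then counts how often a standard two-proportion z-test still reports p<0.05. The trial size, success rate, number of simulated trials, and seed are assumptions chosen only for illustration.

```python
import random
from statistics import NormalDist

def two_proportion_p_value(successes_a, successes_b, n):
    """Two-sided p-value from a pooled two-proportion z-test (equal arm sizes)."""
    pooled = (successes_a + successes_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:
        return 1.0  # every outcome identical, so there is no evidence of a difference
    z = (successes_a / n - successes_b / n) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(trials=2000, n=200, true_rate=0.5, alpha=0.05, seed=1):
    """Fraction of simulated no-difference trials that reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both arms are drawn from the SAME true success rate.
        a = sum(rng.random() < true_rate for _ in range(n))
        b = sum(rng.random() < true_rate for _ in range(n))
        if two_proportion_p_value(a, b, n) < alpha:
            hits += 1
    return hits / trials

rate = false_positive_rate()
print(f"Share of no-difference trials declared significant: {rate:.3f}")
```

The printed share should land close to 0.05. Roughly one in 20 trials of identical treatments will appear to show a difference, which is one reason a single significant result deserves replication.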

Were the results clinically meaningful?

Is the hunt worth the chase? Given enough enrolled patients, even a small difference in the mean values can be shown to be statistically different. For example, in a study comparing two topical anti-infectives, conjunctivitis was resolved after 1.5 days using Treatment #1 compared to 1.7 days with Treatment #2. While these results might be statistically significant, are the differences meaningful? In managing your practice, is it wise to switch your patients to a new product if the clinical benefit is small, and you have the choice of using a marketed product with an established safety and effectiveness profile?
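A short calculation shows how enrollment alone can turn the 1.5-day versus 1.7-day difference into a statistically significant one. This sketch uses a simple z-test and assumes a standard deviation of 1.0 day in both groups. That figure is hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import NormalDist

def p_value_for_mean_difference(mean_a, mean_b, sd, n_per_arm):
    """Two-sided p-value from a z-test comparing two means (equal SD and arm size)."""
    se = sd * (2 / n_per_arm) ** 0.5
    z = abs(mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(z))

ASSUMED_SD_DAYS = 1.0  # hypothetical spread in days to resolution

p_values = {
    n: p_value_for_mean_difference(1.5, 1.7, ASSUMED_SD_DAYS, n)
    for n in (50, 500, 5000)
}
for n, p in p_values.items():
    print(f"{n:>5} patients per arm: p = {p:.4f}")
```

Under these assumptions, the 0.2-day difference is not significant with 50 patients per arm but is with 500 or more. The clinical benefit is the same in every row. Only the p value changes.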

Were the right questions asked?

You don’t know what you don’t know. Sometimes study designers and investigators guess wrong. They prepare questionnaires that ask the wrong questions or fail to ask the right ones. Patient interviews and spontaneous comments from study coordinators frequently provide key information. The casual conversation between a clinical research associate and study coordinator has uncovered many unexpected issues in the consumer products area, requiring re-designs.

Do the results make sense?

My SAT coach always told me to ask whether my mathematically computed answer made sense before I filled in the bubble. The same holds true here. Sometimes the results of a single study are just plain wrong. Resist changing anything in your practice until it makes common sense, especially in cases where safety is at risk.

Can the results be replicated?

Finally, we come to the most important factor: patience. Assessing whether a new treatment or product is a winner may take time. This is especially true regarding safety. We have all experienced situations in which individual adverse events are initially dismissed as rare anomalies. A noticeable trend might be detectable only after months or years of experience. Tracking results and objectively evaluating your experience over time is very important.

FACTORS YOU SHOULD NOT CONSIDER

This guide to interpreting clinical studies would not be complete without commenting on the factors that should not be considered.

Who sponsored the study?

It may be hard to believe, but drug and medical device companies do not have institutional policies that require researchers to lie and cheat. My experience is that companies are populated with honest people who want to develop safe and effective products. Besides, conspiracies always fall apart, and there is too much to lose. Studies destined for FDA review are often designed to the highest standards, and study sites are intensely controlled to avoid fraud. In contrast, self-funded studies at academic sites and NIH-sponsored studies are often only loosely monitored.

Where was the study published?

Getting published in a nationally recognized journal is difficult and very time consuming. The time from initial submission to publication might be measured in years when you add review time, rejections, and re-writes. There is bias by editors and publishers, and this impacts acceptance rates. A study published in a contemporary electronic journal or published by a manufacturer should be judged based on the same criteria described above.

What was the study location, school, or institution?

Results of a good study, meeting all design and interpretation criteria, should be considered regardless of the source (author or site). The best ideas and the best studies sometimes come from unknown little places. Studies conducted at nationally recognized institutions should be considered on their merits.

FYI

During his long career at Alcon Laboratories, Dr. Stein led clinical teams responsible for the development of many lens care products. He has published more than 30 articles and is currently an independent writer and consultant. Reach him at SummerCreekC@gmail.com.

10 key questions to ask when evaluating clinical studies

STUDY DESIGN:

Was this a randomized, controlled, double-masked, prospective trial (RCT)?

What was the control treatment?

Were the right subjects recruited?

Was the number of enrolled patients sufficient?

Was the number of sites sufficient?

INTERPRETING THE RESULTS

Were inferential statistics performed?

Were the results clinically meaningful?

Were the right questions asked?

Do the results make sense?

Can the results be replicated?
