Why Doctor Questionnaires Often Confuse Patients and Why That’s a Bigger Problem Than It Sounds
Filling out forms before seeing a doctor or therapist has become a routine part of healthcare. Whether it’s a first visit or a follow-up, patients are usually asked to complete a symptom questionnaire designed to help clinicians understand what’s going on. These forms are meant to guide diagnosis, treatment plans, and even long-term research. But a new study suggests something unsettling: many people don’t actually understand these questionnaires in the same way clinicians think they do.
Researchers from the University of Arizona have found that confusion around one of the most widely used mental health screening tools is not just occasional—it’s extremely common. The findings, published in JAMA Psychiatry, raise serious questions about how accurately patient-reported data reflects real experiences.
Understanding the Patient Health Questionnaire and Why It Matters
At the center of the study is the Patient Health Questionnaire, commonly known as the PHQ. Introduced in the 1990s, the PHQ has become a cornerstone of modern healthcare. It is used not only by mental health professionals, but also by primary care physicians, specialists, and researchers.
Different versions of the PHQ exist, including the well-known PHQ-9, which screens for depression. The questionnaire asks patients about symptoms such as sleep changes, appetite changes, difficulty concentrating, restlessness, and low mood. Health systems rely on these responses to triage care, monitor progress, and determine whether further evaluation or treatment is needed. Its use is mandated or strongly recommended by government bodies, including the National Institutes of Health.
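For readers unfamiliar with how the form is scored, here is a minimal sketch in Python of the standard published PHQ-9 scoring convention: each of the nine items is answered on a 0 to 3 frequency scale ("not at all" through "nearly every day") and summed to a 0 to 27 total. The severity bands below are the commonly cited cutoffs; none of these details come from the new study itself.

```python
# Minimal sketch of standard published PHQ-9 scoring
# (not described in the study): nine items, each scored 0-3.

RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}

# Commonly cited severity bands for the 0-27 total score.
SEVERITY_BANDS = [
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]

def phq9_total(responses: list[str]) -> int:
    """Sum the nine item scores; reject malformed input."""
    if len(responses) != 9:
        raise ValueError("PHQ-9 expects exactly 9 item responses")
    return sum(RESPONSE_SCORES[r.lower()] for r in responses)

def severity(total: int) -> str:
    """Map a total score to its severity band."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total out of range 0-27")

# Example: one frequent symptom plus two occasional ones.
answers = ["nearly every day", "not at all", "several days",
           "not at all", "several days", "not at all",
           "not at all", "not at all", "not at all"]
total = phq9_total(answers)
print(total, severity(total))  # 5 mild
```

Notice that the arithmetic itself is trivial; everything hinges on which box the patient ticks, which is exactly where the interpretation problem enters.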
Because the PHQ is so deeply embedded in clinical care and research, any flaw in how people interpret its questions can have far-reaching consequences.
What the Study Set Out to Examine
The research was led by Zachary Cohen, an assistant professor of psychology at the University of Arizona and head of the Personalized Treatment Lab. Cohen’s interest in the issue goes back more than a decade, to his time in clinical training. He noticed that patients frequently asked for clarification when filling out questionnaires, and clinicians often had no clear guidance to offer beyond telling patients to answer as best they could.
The new study aimed to systematically examine whether people interpret the PHQ instructions consistently—and whether those interpretations align with what the questionnaire is actually designed to measure.
The key issue lies in a specific phrase used in the instructions. Patients are asked to report how often they have been “bothered by” certain symptoms over a given time period. The response options range from “not at all” to “nearly every day.”
On the surface, this may sound straightforward. In practice, it turns out to be anything but.
How the Study Was Conducted
The research team recruited approximately 850 participants and asked them to complete a standard Patient Health Questionnaire. Afterward, participants were presented with a hypothetical scenario to test how they interpreted the instructions.
In this scenario, participants were asked to imagine that they had overslept nearly every day for a week, but that the oversleeping did not bother them at all—perhaps because they were on vacation. They were then asked which response they would select on the PHQ: “not at all” (because they were not bothered) or “nearly every day” (because the symptom occurred frequently).
Participants were also asked whether their original PHQ responses reflected how often they experienced symptoms or how often those symptoms bothered them, and how they would answer the questionnaire in the future.
The Results Showed Widespread Confusion
The findings revealed a striking lack of consistency. Only 328 participants, or about 38%, selected the response that matched the questionnaire’s instructions. Even more telling, just 146 participants, roughly 17%, said they would base their answers on how bothered they felt if they filled out the PHQ again in the future.
In other words, most people were not interpreting the questionnaire the way it was intended. Some focused on symptom frequency, others on emotional distress, and many switched between the two depending on the question or context.
This inconsistency means that two people with the same experiences could produce completely different scores on the PHQ. From a data standpoint, that’s a serious problem.
Why Inconsistent Answers Can Lead to Bad Data
Mental health research and clinical care depend heavily on self-reported symptoms. If those reports are unreliable, treatment decisions may be built on shaky ground.
Cohen points out that this becomes especially problematic when questionnaire data is combined with other forms of measurement, such as digital health tools. For example, wearable devices can track sleep patterns objectively. If a smartwatch shows that many people are oversleeping, but half of them report “not at all” on the PHQ because they are not bothered by it, the data becomes difficult to interpret.
Instead of revealing meaningful patterns, the information starts to look like noise—even when a real issue exists.
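To see how that noise arises, consider a toy simulation (hypothetical numbers, not data from the study): everyone in the group objectively oversleeps nearly every day, but respondents split evenly between the frequency reading and the bothered-by reading of the instructions.

```python
# Hypothetical illustration, not study data: every wearable in this
# simulated group records oversleeping nearly every day, yet the PHQ
# item scores split depending on how each person reads "bothered by".
import random

random.seed(0)
N = 1000  # simulated respondents, all objectively oversleeping

item_scores = []
for _ in range(N):
    if random.random() < 0.5:
        # Frequency reading: the symptom occurred nearly every day -> 3
        item_scores.append(3)
    else:
        # Bothered-by reading: not distressed by it at all -> 0
        item_scores.append(0)

zeros = item_scores.count(0)
mean = sum(item_scores) / N
print(f"{zeros} of {N} report 'not at all' despite universal oversleeping")
print(f"mean item score: {mean:.2f} (a pure split, not a real signal)")
```

The objective behavior is identical across the whole group, yet the item scores pile up at the two extremes, so the questionnaire and the wearable appear to disagree for half the sample.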
Real-World Examples Highlight the Problem
The study also points to modern healthcare trends that make the issue even more relevant. Consider the growing use of GLP-1 weight loss medications, such as Ozempic. Reduced appetite is a common and expected effect of these drugs. However, appetite changes are also listed as potential symptoms of depression on the PHQ.
If a patient reports reduced appetite without being distressed by it, that information should not automatically be interpreted as a sign of depression. Without clearer wording, the questionnaire risks misclassifying normal or intentional changes as mental health symptoms.
Why This Matters for Clinicians and Patients
When clinicians rely on questionnaires that patients interpret differently, the results can affect diagnosis, treatment planning, and follow-up care. The same inconsistency also undermines large-scale research studies that inform clinical guidelines and public health policy.
Cohen emphasizes that it’s not reasonable to expect accurate outcomes when some people answer questions one way and others answer the exact opposite way for the same experience. Consistency is essential if these tools are going to remain useful.
A Straightforward Fix May Be Possible
The researchers stress that this study is only a first step, but they also suggest that the solution may not be complicated. One option is to clarify the wording so that questions clearly ask about symptom frequency alone. Another is to emphasize emotional distress more explicitly when that is the intended focus.
Either approach would require further testing to ensure it improves accuracy, but the idea is simple: ask exactly what you want to measure, and make sure everyone understands it the same way.
Additional Context: Why Patient-Reported Outcomes Are So Important
Patient questionnaires like the PHQ belong to a broader category known as patient-reported outcome measures, or PROMs. These tools are widely used because they are inexpensive, easy to administer, and scalable across healthcare systems.
However, their effectiveness depends entirely on shared understanding between patients and providers. As healthcare increasingly moves toward personalized and digital models, the clarity of these tools becomes even more critical.
This study serves as a reminder that even long-standing, widely accepted instruments should be periodically re-evaluated to ensure they still work as intended.
Research Paper Reference
JAMA Psychiatry (2025). DOI: 10.1001/jamapsychiatry.2025.3796. Available at: https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2842842