AI’s Hidden Challenge in Heart Health Communication
A cardiology nurse in Augusta, Georgia, scrolls through her phone between patients. She’s looking for a clearer way to explain heart failure symptoms to a man with limited literacy and newly diagnosed hypertension. She types his question—“Why does my heart feel tired?”—into a popular AI chatbot.
The response is fast, technically accurate… and almost impossible for him to understand.
As AI becomes a go-to source for patient education, especially in cardiovascular disease (CVD), these moments matter. And a new study from Augusta University helps explain why the gap between access and comprehension may be widening.
Why This Study Matters Now
Cardiovascular disease remains the leading cause of death globally. Clear information saves lives—yet over half of U.S. adults read below a 6th-grade level. Meanwhile, more people than ever use AI tools like ChatGPT or MediSearch for health advice.
But do AI tools adjust their answers to match users’ literacy or education levels? Or are we unintentionally creating new barriers for the very people who most need understandable information?
This study helps answer that question—and the results are a warning signal.
THE PROBLEM
AI Answers Are Often Too Complex for Most Patients
Researchers tested 105 cardiovascular questions—covering heart attacks, heart failure, stroke, blood pressure, and more—rewritten at three education levels:
- Lower secondary (5th–8th grade)
- Higher secondary (high school level)
- College graduate
Then they fed those prompts into three AI tools: ChatGPT Free (GPT-4o mini), ChatGPT Premium (GPT-4o), and MediSearch.
Across all models, one pattern was unmistakable:
As the education level of the question increased, AI answers became significantly harder to read.
Flesch–Kincaid reading-ease scores dropped sharply as question complexity rose, meaning the answers became denser, longer, and more technical.
And here’s the key issue: Even the simplest answers were written at a college reading level or higher.
For a country where most adults read at or below an 8th-grade level, that’s a major accessibility barrier.
THE EVIDENCE
How the Researchers Tested Readability
The team used widely accepted readability metrics (Flesch–Kincaid Ease Score and Grade Level) to measure how simple or difficult each AI-generated answer was.
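The Flesch–Kincaid grade-level formula is simple enough to sketch directly. The version below is a minimal illustration, not the study’s actual tooling; in particular, the syllable counter is a rough vowel-group heuristic, which is an assumption on my part:

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, min 1 per word."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" usually doesn't add a syllable.
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: higher scores mean harder text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

The formula rewards short sentences and short words, which is why jargon-heavy AI answers score at college level even when they are accurate.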
They also compared response similarity using BERT-based cosine similarity—a measure of how similar two pieces of text are in meaning.
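The cosine part of that comparison is straightforward to illustrate. A minimal sketch, with the BERT embedding step assumed rather than shown (real use would pass sentence vectors from an embedding model, not the toy vectors in the test):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction,
    0.0 = unrelated. Applied to text embeddings, it approximates
    how similar two passages are in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```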
What They Found
1. ChatGPT Premium adapted best to user education level. It showed the strongest correlation between the user’s prompt level and the readability of its responses, explaining ~35% of the variation.
2. ChatGPT Free was easier to read than MediSearch. This is important—because ChatGPT Free is the most widely used, especially among people who can’t (or won’t) pay for premium tools.
3. MediSearch produced the most complex responses. On average, its answers scored the lowest on readability and would challenge even highly educated health consumers.
4. AI responses often exceeded safe readability thresholds. NIH and AMA recommend that patient education materials be written at the 6th-grade level. None of the AI tools achieved this.
These findings reinforce something many clinicians and health educators already observe: AI is powerful—but still not people-centered enough.
THE IMPLICATIONS
Why This Matters for Public Health Practice
1. AI may unintentionally widen health literacy gaps.
Patients with lower literacy—who already face higher CVD risk—may struggle the most to understand AI-generated advice.
2. Providers may overestimate AI clarity.
Clinicians see accurate answers, but patients may see jargon.
3. Specialized medical AI tools aren’t always more accessible.
MediSearch, despite its clinical focus, produced the least readable answers.
4. ChatGPT Free may actually be the most public-friendly option.
Its responses were consistently easier to read than those from MediSearch and sometimes even the Premium model.
WHAT THIS MEANS IN PRACTICE
For Local Health Departments
- Audit AI-based educational materials for readability before distributing them.
- Use AI primarily to draft patient materials, then simplify manually.
- Provide clear guidance to staff on verifying accuracy and readability.
For Clinicians and Care Teams
- Encourage patients to bring in AI-generated explanations so gaps can be corrected.
- Use ChatGPT Free or Premium for drafting handouts—but always check the reading level.
- Teach patients to ask AI: “Explain at a 6th-grade level.”
For Community-Based Organizations
- Continue producing plain-language materials; AI should support—not replace—your expertise.
- Test AI-generated content with community members before deploying widely.
For Digital Health Developers
- Bake readability into design.
- Add user-controlled reading-level options.
- Implement continuous monitoring of readability metrics.
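One way such monitoring might be wired in is a simple gate that flags responses above a target grade level. This is a hypothetical interface, not anything from the study; `grade_fn` stands in for whatever readability scorer a team actually uses:

```python
from typing import Callable, Iterable

def readability_gate(text: str, grade_fn: Callable[[str], float],
                     max_grade: float = 6.0) -> tuple[bool, float]:
    """Score one response; True means it meets the target grade level
    (6th grade by default, per NIH/AMA guidance)."""
    grade = grade_fn(text)
    return grade <= max_grade, grade

def flag_responses(responses: Iterable[str],
                   grade_fn: Callable[[str], float],
                   max_grade: float = 6.0) -> list[tuple[str, float]]:
    """Return (response, grade) pairs that exceed the target,
    for logging, alerting, or triggering a simpler rewrite."""
    flagged = []
    for text in responses:
        ok, grade = readability_gate(text, grade_fn, max_grade)
        if not ok:
            flagged.append((text, grade))
    return flagged
```

A gate like this could run on every generated answer, so readability regressions surface as soon as a model update ships.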
THE BARRIERS & WHAT’S NEXT
Key Barriers Identified
- Education level doesn’t equal health literacy. Someone with a college degree may still struggle with medical terminology.
- Readability metrics don’t measure comprehension. Flesch–Kincaid counts syllables—not understanding.
- AI updates change model behavior over time. The study tested models in late 2024; performance may differ now.
- Semantic similarity ≠ clinical correctness. Two answers can be similar in structure but vary in accuracy.
Future Pathways
- Tailoring AI responses based on health literacy—not just education level.
- Integrating clinician review into AI-generated patient materials.
- Including more diverse real-world patient questions in future testing.
- Expanding research to newer models (e.g., DeepSeek, Claude, GPT-5).
- Adding transparent versioning so AI performance can be tracked over time.
QUESTIONS FOR REFLECTION
- How might your agency assess the readability of materials you generate—whether by humans or AI?
- What supports would help clinicians ensure AI-generated explanations are usable by patients with lower literacy?
- Should AI tools be regulated to meet minimum readability standards for health information?