Using Claude Code for a Second Opinion on MRI Scans - What Actually Happened

A developer named Antoine recently published an experiment that caught fire on Hacker News: he fed 266MB of DICOM MRI data from his right shoulder into Claude Code (Opus 4.8) to get a second opinion on his orthopedist's diagnosis.

The result? The AI disagreed with the human doctor. And the ensuing HN discussion - with actual radiologists weighing in - reveals a lot about where AI medical imaging stands today.

What Antoine Did

The setup was straightforward. Antoine had been dealing with right shoulder pain for two to three weeks. His doctor diagnosed a "Grade III (>50%-width) partial-thickness tear at the apical insertion" of the subscapularis tendon - a significant rotator cuff injury that typically leads to aggressive treatment.

Rather than accept this at face value, Antoine:

Exported his full MRI as DICOM files (266MB, hundreds of individual files)
Pointed Claude Code at the data
Let the AI install necessary packages for medical image analysis
Gave minimal clinical context: just "right shoulder pain for 2-3 weeks"

Claude developed a methodical analysis strategy, writing code to process the imaging data and examining it from multiple perspectives.
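
The post does not reproduce the code Claude wrote, so what follows is only a minimal sketch of a plausible first step: taking inventory of the export before analyzing anything. It assumes the files follow the DICOM Part 10 layout (a 128-byte preamble followed by the ASCII marker `DICM`). The names `is_dicom_file` and `inventory_dicom_export` are hypothetical, and a real analysis would lean on a dedicated medical imaging library rather than this standard-library check.

```python
import os


def is_dicom_file(path):
    """Return True if the file carries the 'DICM' marker at byte offset 128."""
    try:
        with open(path, "rb") as handle:
            handle.seek(128)  # skip the 128-byte preamble
            return handle.read(4) == b"DICM"
    except OSError:
        return False


def inventory_dicom_export(root):
    """Walk an export folder and count DICOM files and their total size."""
    count = 0
    total_bytes = 0
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if is_dicom_file(path):
                count += 1
                total_bytes += os.path.getsize(path)
    return {"files": count, "megabytes": round(total_bytes / 1_000_000, 1)}
```

Calling `inventory_dicom_export` on the export folder gives a quick sanity check that the file count and size match what the scanner software claimed to export, before any image is interpreted.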

The Disagreement

Here's where it gets interesting. The human reading found a significant partial tear. Claude Code reported an "intact tendon" - essentially no tear at all.

When Antoine had Claude arbitrate between the two readings (providing both reports plus clinical test results), the AI concluded with "moderate-to-high confidence" that the evidence favored its own reading: "Mild insertional tendinosis; NO discrete partial- or full-thickness tear."

Antoine was left in diagnostic limbo. As he put it, the AI second opinion suggested the human-recommended treatment plan was "premature and more intervention-heavy than the facts seemed to justify." But he also acknowledged uncertainty about fully trusting AI for medical interpretation.

What HN Is Saying

The thread exploded with 368+ comments, and the discussion divided into several camps.

Radiologists pushed back hard. One actual radiologist commented: "I can't really weigh in without seeing the full 3D MRI dataset." They pointed out a critical technical detail - ultrasound (which Antoine had also gotten) isn't great for detecting calcification and will miss small calcifications that would show on X-ray or MRI.

Multiple commenters noted that MRI is a 3D medium, and slicing it incorrectly can miss features entirely: "I would not be at all surprised if one could slice an MRI the wrong way to produce a 2D image that fails to show a feature that exists in the source data."
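
The commenter's point can be illustrated with a toy volume. This is a minimal sketch on a tiny synthetic 3D grid, not real MRI data; `make_volume`, `slice_along`, and `shows_feature` are hypothetical helpers, and the two bright voxels stand in for a small feature.

```python
def make_volume(depth, height, width):
    """Build a depth x height x width grid of zeros (background signal)."""
    return [[[0 for _ in range(width)] for _ in range(height)] for _ in range(depth)]


def slice_along(volume, axis, index):
    """Extract one 2D plane from the volume along axis 0, 1, or 2."""
    if axis == 0:
        return [row[:] for row in volume[index]]
    if axis == 1:
        return [plane[index][:] for plane in volume]
    if axis == 2:
        return [[row[index] for row in plane] for plane in volume]
    raise ValueError("axis must be 0, 1, or 2")


def shows_feature(plane):
    """True if any pixel in the 2D plane is non-zero."""
    return any(value != 0 for row in plane for value in row)


# Synthetic "feature": two bright voxels sitting at depth index 3 only.
volume = make_volume(depth=6, height=5, width=5)
volume[3][2][1] = 1
volume[3][2][2] = 1

missed = slice_along(volume, axis=0, index=2)    # plane next to the feature
caught = slice_along(volume, axis=0, index=3)    # plane through the feature
crossing = slice_along(volume, axis=1, index=2)  # orthogonal plane through it
```

Here `shows_feature(missed)` is false while `shows_feature(caught)` and `shows_feature(crossing)` are true: the feature exists in the source data, but a single 2D view taken one index off contains no trace of it.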

The "Claude is bad at images" camp appeared. Several commenters argued that Claude specifically underperforms on image understanding compared to other frontier models. One wrote: "Claude is the worst FM at image understanding. Prior to gpt-5.4 the only usable models were Gemini and Qwen."

Others countered that Claude handles some image types well, particularly PDF-to-markdown conversion and document understanding - but medical imaging is a different beast.

The sonography vs radiology distinction came up. A cardiac sonographer offered perspective: "Medical imaging is one of those things everyone thinks is simple because they don't know what they don't know. Any comment that doesn't start with 'I'm a radiologist' should be taken with a grain of salt."

The "AI second opinions help catch missed things" camp. Some shared stories of AI helping catch procedural errors or outdated treatment plans. One person described using AI-generated questions to push a GP who was mishandling their mother's care - and it worked.

The "this is a nightmare for doctors" camp. Multiple commenters argued that patients approaching doctors with AI-generated diagnoses creates friction: "Nightmare because users approach LLMs with the false confidence that they're always right, and present LLM outputs as fact to Doctors who have to waste time explaining that it's wrong most of the time."

The Technical Reality

Several important technical points emerged from the discussion:

MRI complexity matters. 2D acquisitions leave gaps between slices (typically about 10% of the slice thickness). 3D acquisitions have no gaps but are slower and more prone to motion artifacts. The voxels in a 3D scan might be 1mm x 1mm x 1mm - which sounds precise until you realize that subtle tears can be smaller than that.
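
The arithmetic behind those two numbers is worth spelling out. A minimal sketch, assuming a 3 mm slice thickness and a 0.5 mm feature purely for illustration (neither figure comes from the thread); the function names are hypothetical.

```python
def unsampled_fraction(slice_thickness_mm, gap_fraction):
    """Fraction of the scanned span that falls in gaps between 2D slices."""
    gap_mm = slice_thickness_mm * gap_fraction
    return gap_mm / (slice_thickness_mm + gap_mm)


def max_voxel_fill(feature_width_mm, voxel_edge_mm):
    """Upper bound on how much of one voxel edge a thin feature can span."""
    return min(1.0, feature_width_mm / voxel_edge_mm)


# Assumed values: 3 mm slices with a 10% gap, and a 0.5 mm feature in a 1 mm voxel.
gap_share = unsampled_fraction(slice_thickness_mm=3.0, gap_fraction=0.10)
fill = max_voxel_fill(feature_width_mm=0.5, voxel_edge_mm=1.0)
```

With a 10% gap, about 9% of the scanned span is never sampled, whatever the slice thickness. And a feature half as wide as a voxel can fill at most half of it along that axis, so its signal is averaged with the surrounding tissue.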

Prompting affects diagnosis. One researcher noted: "Subtle changes in prompts can cause different diagnosis." In other words, the exact wording you use when asking an AI about medical images can meaningfully change the output.
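
One way to make that sensitivity visible is to ask the same question several ways and measure how often the answers agree. The sketch below is an assumption-heavy illustration: `agreement_rate` is a hypothetical helper, and `fake_reader` is a deliberately crude stand-in for a model call, not a description of how any real model behaves.

```python
from collections import Counter


def agreement_rate(read_fn, prompts):
    """Run one study through several prompt wordings and measure agreement.

    read_fn is a hypothetical callable: prompt text in, diagnosis label out.
    Returns the most common label and the share of prompts that produced it.
    """
    labels = [read_fn(prompt) for prompt in prompts]
    top_label, top_count = Counter(labels).most_common(1)[0]
    return top_label, top_count / len(labels)


def fake_reader(prompt):
    """Stand-in for a model call; it flips its answer on one leading word."""
    return "tear" if "tear" in prompt.lower() else "no tear"


prompts = [
    "Describe the subscapularis tendon.",
    "Is the subscapularis tendon intact?",
    "Is there a tear in the subscapularis tendon?",
]
label, rate = agreement_rate(fake_reader, prompts)
```

With this stand-in, two of the three wordings agree on "no tear" and the leading question flips the answer. A low agreement rate on a real reader would be a signal to distrust any single run.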

Modality matters. When a radiology report says something "isn't present," there's always an implicit caveat that the finding isn't present within the context of that specific imaging modality. An ultrasound saying "no calcifications" and an X-ray showing calcifications can both be correct - the ultrasound just can't see small ones.
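
That implicit caveat can be encoded by never letting a negative finding stand on its own. A simplified sketch, assuming each report is reduced to a single yes/no/not-assessed value per modality; `reconcile` is a hypothetical helper and not a clinical rule.

```python
def reconcile(reports):
    """Combine per-modality findings without treating a negative as global.

    reports maps modality name -> True (seen), False (not seen in that
    modality), or None (not assessed).
    """
    if any(value is True for value in reports.values()):
        return "present"
    negatives = sorted(name for name, value in reports.items() if value is False)
    if negatives:
        # A negative is only ever reported together with its modality.
        return "not seen on " + ", ".join(negatives)
    return "not assessed"


combined = reconcile({"ultrasound": False, "x-ray": True})
scoped = reconcile({"ultrasound": False})
```

Here `combined` is "present" and `scoped` is "not seen on ultrasound": the negative result keeps its modality attached instead of collapsing into a flat "absent".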

Why This Matters for Developers

This isn't really a story about whether you should trust AI for medical diagnosis (you shouldn't, not yet, not without human verification). It's a story about the current frontier of multimodal AI and where the edges are.

A few takeaways:

The capability gap is real but narrowing. Two years ago, asking any LLM to analyze raw DICOM files would have been absurd. Now Claude Code can install packages, write analysis code, and produce a structured medical reading. The reading might be wrong, but the workflow exists.

Domain expertise still matters. The radiologists in the thread could immediately identify limitations that a non-specialist wouldn't know to ask about - 2D vs 3D acquisition, slice gaps, modality-specific blind spots. AI doesn't yet surface these caveats reliably.

Second opinions have value, even imperfect ones. Antoine's doctor recommended shockwave therapy for a condition that recent clinical guidelines say doesn't respond to it (rotator cuff tendinopathy without calcification). Even if Claude's diagnosis is wrong, the friction of having a second opinion made Antoine dig deeper.

The probabilistic nature cuts both ways. As one commenter put it: "Not quite. An LLM generates text that would likely follow... A patient in pain with a bone protruding from their shin has a... 'broken leg.' The more training data, the more questions it can answer with a reasonable degree of probability of accuracy."

The counterpoint: "It can be helpful in your understanding the choices made by asking questions and thus in reassurance, but it requires something most people lack: understanding you are likely wrong since you are just collecting information without understanding it."

The Bigger Picture

What's notable about this story isn't that Claude Code can read MRIs (it can, sort of). It's that the experiment is now cheap and accessible enough that a solo developer can run it on a weekend, publish results, and get hundreds of HN comments including feedback from actual radiologists.

That feedback loop - AI output, expert critique, public discussion - is one way capabilities improve. The radiologist comments may well end up as training data for the next iteration of these models, whether directly or through the discourse they generate.

For now, the prudent approach is obvious: AI as a thinking aid, not a replacement for professional judgment. But the gap is closing faster than the medical establishment is adapting.

Antoine ended his post in diagnostic limbo, uncertain whether to trust the AI or the doctor. That uncertainty is probably the healthiest response right now.

Sources

Original article by Antoine
Hacker News discussion (368+ comments)
