LLMS

32 articles


LATEST

CAPA Benchmark: Why Coding Agents Should Learn Your Habits Across Sessions

A new 600-session benchmark shows that coding assistants with access to a user's resolved session history handle ambiguous requests with far fewer clarifying questions: Claude Opus 4.8's first-turn success jumps from 24.3% to 60.3% when history is available.

July 31, 2026 • 7 min read


New • 8 min read

Fable 5 Effort Levels vs Switching Models: When to Dial and When to Change

Effort levels and model choice both trade higher cost for more capability, but they are not interchangeable. Here is when to move the effort dial and when to switch models instead.

Anthropic AI Models Claude Code

New • 7 min read

DeepSeek Pauses Fundraising After Leaked Investor Transcript Reveals Compute Gap

DeepSeek suspended its fundraising round at a $74B valuation after a leaked transcript of founder Liang Wenfeng's investor meeting laid bare the compute gap between Chinese and US AI labs, revealing that he needed 200,000 Huawei 950 chips but received only 16,000.

News Hacker News DeepSeek

8 min read

Mozilla's State of Open Source AI Report: The Gap Is 3%, But Deployment Remains the Real Problem

Mozilla's inaugural report reveals open models now match closed AI on capability, but only 51% reach production. The harness layer and permission model gaps explain why.

News Hacker News AI

7 min read

Detecting LLM Text with Classical ML: TF-IDF Still Works

A developer built an LLM text detector that reaches 85% accuracy using TF-IDF and a linear SVM, with no neural networks required. Here is how it works and what HN thinks about AI detection.

News Hacker News Machine Learning

7 min read

Inkling: Thinking Machines Lab Drops a 975B Open-Weights Model

A new American open-weights frontier model with multimodal capabilities, 1M token context, and competitive benchmarks. Here's what the HN community thinks.

News Hacker News AI Models

6 min read

How to Stop Claude from Saying 'Load-Bearing'

A Hacker News discussion blows up over LLM vocabulary quirks, with developers sharing hooks, filters, and coping mechanisms for repetitive Claude-isms.

News Hacker News Claude

6 min read

Geohot on LLMs: Love the Tech, Hate the Hype

George Hotz publishes a post distinguishing genuine AI progress from manipulative hype narratives. HN's 126-comment thread debates whether he's right about doom-mongering and AGI inevitability.

News Hacker News AI Industry

6 min read

Colibri: Running GLM 5.2 on a 32GB Laptop with Disk Streaming and Expert Offloading

A solo developer built a 1,300-line C inference engine that runs the 744B GLM 5.2 model on consumer hardware by streaming routed experts from disk. Here's how it works.

News Hacker News LLMs

8 min read

AI Tutor Shows 0.71-1.30 SD Effect Size in Dartmouth Statistics Course

A new study from Dartmouth measures the impact of an AI tutoring platform on introductory statistics performance. Full engagement with the system correlated with significant exam score improvements, though selection bias remains a key limitation.

News Hacker News AI

8 min read

Anthropic Discovers J-Space: A Global Workspace Inside Language Models

Anthropic's new research reveals LLMs have an internal 'workspace' for silent reasoning - and it could change how we build safer AI.

AI Research News Hacker News

6 min read

Why Price Per 1M Tokens Is a Misleading Metric for LLM Costs

Comparing LLMs by token pricing alone can lead you to choose worse, more expensive models. Cost per task tells the real story.

AI News Hacker News

7 min read

Vulnerability Reports Are Not Special Anymore

Filippo Valsorda argues that LLMs have ended the era of treating security researchers with kid gloves. When anyone can discover vulnerabilities with an AI, the old coordinated disclosure model breaks down.

News Hacker News Security


Keep exploring LLMs

- LLMs Topic Hub - tools and guides for LLMs from the Developers Digest directory
- Glossary - dive deeper across the Developers Digest knowledge base
- Developers Digest on YouTube - video tutorials covering LLMs and more

