RESEARCH

20 items

15 posts, 1 tool, 4 guides

Blog · Jul 24, 2026

FLUX 3: Black Forest Labs Ships a Unified Multimodal Foundation Model for Image, Video, Audio, and Robotics

Black Forest Labs released FLUX 3, a single multimodal model trained jointly on images, video, and audio that also drives robots on Audi production lines. Here is what it does, how it works, and how to try it.

Research AI Models Multimodal Video Generation Robotics Image Generation

Blog · Jul 23, 2026

Terence Tao Digests the Jacobian Conjecture Counterexample: How Claude Fable 5 Broke an 87-Year-Old Math Problem

Terence Tao published a deep mathematical digestion of the Jacobian conjecture counterexample discovered by Claude Fable 5. Here is what happened, what HN is saying, and what it means for AI-assisted research.

News Hacker News AI Mathematics Fable 5 Claude Anthropic Research

Blog · Jul 12, 2026

Dockerless Verification Is The Next Coding Agent Bottleneck

ByteDance's Dockerless paper asks whether coding-agent patches can be verified without spinning up per-repo environments. The practical answer is not to replace CI, but to use cheaper evidence before CI.

AI Agents AI Coding Developer Workflow CI/CD Research

Blog · Jul 11, 2026

Ghost Font: Text That Humans Can Read But AI Cannot

A new experimental technology encodes messages in video using motion-based steganography, exploiting how AI models process video as individual frames rather than continuous motion.

News Hacker News AI Security Research

Blog · Jul 7, 2026

Harness Engineering and the Path to Self-Improving AI

Lilian Weng argues that self-improving AI won't start with models rewriting their weights; it starts with the harness. Here's what that means for developers building agents.

AI Agents Harness Engineering Self-Improvement Context Engineering Coding Agents Research

Blog · Jul 7, 2026

Ilya Sutskever's 30 Papers: The Reading List That Covers 90% of What Matters

A CS student built 30papers.com to make Ilya's legendary ML reading list more accessible. HN has thoughts on the source, the format, and why compression equals intelligence.

News Hacker News Machine Learning AI Deep Learning Research

Blog · Jul 6, 2026

AI Tutor Shows 0.71-1.30 SD Effect Size in Dartmouth Statistics Course

A new study from Dartmouth measures the impact of an AI tutoring platform on introductory statistics performance. Full engagement with the system correlated with significant exam score improvements, though selection bias remains a key limitation.

News Hacker News AI Education Research LLMs

Blog · Jul 6, 2026

Clean Code Makes AI Agents 34% More Efficient - New Research

A controlled study of 660 Claude Code trials shows clean codebases reduce token usage by 7-8% and file revisitations by 34%, while pass rates stay the same. Traditional maintainability principles still matter in the age of AI coding.

News Hacker News AI Coding Claude Code Research Code Quality

Blog · Jul 6, 2026

Does Code Cleanliness Affect AI Coding Agents?

A new SonarSource study finds clean code doesn't boost agent pass rates, but it cuts token usage by 8% and file revisitations by 34%. Here's what that means for your codebase.

AI Coding Claude Code Research Code Quality

Blog · Jul 5, 2026

Program-as-Weights Turns Prompts Into Local Fuzzy Functions

The Program-as-Weights paper is a useful signal for developers: some LLM calls may move from per-request API prompts into compact local artifacts that behave like reusable fuzzy functions.

AI Coding Local AI LLM Research Developer Workflow

Blog · Jul 2, 2026

Claude Science Developer Guide 2026: AI Workbench for Research

Anthropic's Claude Science combines scientific tools, local code execution, and HPC integration into one AI workbench. Here is how to access it, what it costs, and where it fits alongside Claude Code.

Claude Anthropic Research Scientific Computing Developer Guide

Blog · Jun 22, 2026

Prompt Injection is Role Confusion - New ICML Research Explains Why LLMs Can't Tell Friend from Foe

New research from MIT reveals that LLMs identify speakers by writing style, not by tags, so attackers who sound like the system effectively become the system. The findings explain why prompt injection remains unsolved.

News Hacker News AI Security LLMs Research

Blog · May 23, 2026

Multi-Stream LLMs Hint at the Next Agent Architecture

The Multi-Stream LLMs paper argues that agents are bottlenecked by single chat streams. The practical takeaway is not to rebuild everything today, but to design agent runtimes around separated channels.

AI Agents LLMs Research Developer Workflow Agent Architecture

Blog · May 2, 2026

Refusal Directions Are a Systems Problem

A trending refusal-direction paper is a reminder that model safety cannot be treated as a thin refusal layer. Builders need layered controls around the model.

AI Safety LLMs Agents Developer Tools Research

Tool · Apr 23, 2026

NotebookLM

Google's AI notebook that lets you ground a Gemini chat in your own uploaded sources. Generates summaries, mind maps, and podcast-style audio overviews.

research rag google gemini notes audio

Guide · Apr 23, 2026

Fast Mode - Claude Code

2.5x faster Opus at a higher token cost (research preview).

Guide · Apr 23, 2026

Built-in Subagents - Claude Code

Researcher, auditor, reviewer, and other ready-made subagent types.

Guide · Apr 23, 2026

Subagent Context Isolation - Claude Code

Keep research and exploration from bloating the main conversation.

Blog · Apr 22, 2026

Over-Editing: Why Your AI Coding Agent Rewrites What Isn't Broken

A new study from nrehiew quantifies a problem every Claude Code, Cursor, and Codex user has felt: models making huge diffs for tiny fixes. Here is why it happens, why tests do not catch it, and what to do about it.

AI Coding Claude Code Cursor Codex Code Review Research

Guide · Apr 21, 2026

Chronicle Research Preview Setup Guide

Set up Codex Chronicle on macOS, manage permissions, and understand privacy, security, and troubleshooting.
