OPEN-WEIGHTS

20 posts

Blog - Jul 28, 2026

Anthropic CEO Dario Amodei on open-weights models: the position, the pushback, and what it means for developers

Dario Amodei published Anthropic's stance on open-weights models this week - no total ban, but support for chip export controls, distillation crackdowns, and mandatory safety testing. HN responded with 800+ comments calling it regulatory capture. Here is what the CEO said, what the thread argued, and why the debate matters for every developer deploying AI.

Anthropic Hacker News News Open-weights AI Policy

Blog - Jul 28, 2026

Kimi Linear: An Attention Architecture That Outperforms Full Attention

Moonshot AI's Kimi Linear paper introduces Kimi Delta Attention (KDA), a linear attention module used in a hybrid architecture that beats full attention at all scales - up to 75% less KV cache, up to 6x decoding throughput at 1M context, and open-source checkpoints.

News Hacker News Kimi AI Research LLM Architecture Attention Open Weights

Blog - Jul 27, 2026

Kimi K3 Weights Land on HuggingFace: 2.8T Open Frontier Model You Can Actually Download

Moonshot AI released the full Kimi K3 weights on HuggingFace today - 2.8T parameters, 1M context, native MXFP4 quantization, ~1.63TB download. The HN community reaction, what the license really says, and why this matters for the open-weights AI market.

News Hacker News AI Models Open Weights Kimi Moonshot AI

Blog - Jul 23, 2026

Where to Access Kimi K3: Every Provider and Price Compared (2026)

Compare every verified Kimi K3 access route, including Moonshot, Together, Fireworks, Baseten, Modal, Vercel AI Gateway, Cloudflare, RunPod, SiliconFlow, OpenRouter, and OpenCode Go.

Kimi AI Models AI Coding Pricing Open Weights

Blog - Jul 16, 2026

Kimi K3 Drops: Moonshot's 2.8T Parameter Frontier Model Takes on GPT-5.6 and Fable 5

Moonshot AI releases Kimi K3 with 2.8 trillion parameters, 1M context window, and Delta Attention architecture. Here's what developers need to know about pricing, performance, and where it fits in the frontier model landscape.

News Hacker News AI Models Open Weights Kimi

Blog - Jul 7, 2026

GLM 5.2 and the AI Margin Collapse Thesis

Martin Alderson's argument for why open-weights models like GLM 5.2 will compress frontier lab margins is sparking debate on HN. Here is what the thesis actually says, where HN agrees and disagrees, and why it matters for developers choosing models.

News Hacker News GLM AI Models Pricing Open Weights

Blog - Jun 23, 2026

GLM-5.2 Local Deployment: Running Z.ai's 744B Model on Consumer Hardware

Unsloth's dynamic quantization makes GLM-5.2 runnable on a 256GB Mac or a 24GB GPU with CPU offloading. Here is the hardware math, the quantization tradeoffs, and what the HN community learned from actually running it.

News Hacker News LLMs Open Weights Local AI Quantization

Blog - Jun 20, 2026

Where to Run GLM-5.2 Free and Cheap: Every Provider Compared (2026)

GLM-5.2 ships under an MIT license, so it is hosted everywhere - and a few places run it for free or nearly free right now. Here is every way to access Z.ai's open-weights coding model, from OpenCode Go referral credits and Devin to the cheapest per-token routes on OpenRouter, Fireworks, and DeepInfra, plus local Ollama.

glm z-ai open-weights ai-coding-tools pricing

Blog - Jun 20, 2026

GPT-5.5 Has a 3x Higher Hallucination Rate Than MIT-Licensed GLM-5.2

New benchmark data shows GPT-5.5 hallucinates 86% of the time when it does not know the answer - versus 28% for the open-weights GLM-5.2. The numbers challenge the assumption that bigger models equal more reliable output.

News Hacker News LLMs GPT Benchmarks Open Weights

Blog - Jun 20, 2026

The Router Era: Why Not Owning a Frontier Model Became an Advantage

No single model wins every task anymore, and the companies that never trained one - Factory, Devin, Perplexity, Cursor, OpenCode - are turning that into a moat. This is how model routing works, why open weights and neoclouds make it cheap, and the honest counter-argument.

ai-models model-routing ai-coding-tools open-weights agents

Blog - Jun 18, 2026

Mellum2 Developer Guide: JetBrains' Open-Source Coding Model

JetBrains released Mellum2 on June 2, 2026 - a 12B MoE model with only 2.5B active parameters per token. Here is how to run it locally, when to use it, and where it fits in your AI coding stack.

mellum jetbrains open-weights local-models mcp ai-coding-tools

Blog - Jun 17, 2026

AI Model Routing: Why the Orchestration Layer Is the Next Big Play Next to the Labs

A $500M accidental Claude bill and an open-weights model beating GPT-5.5 at one-sixth the cost point to the same conclusion: the margin is moving to the layer that decides when to use which model for what. Here is how routing and orchestration differ, and how to cut your model spend.

AI Model Routing Model Orchestration Cost Open Weights AI Agents FinOps

Blog - Jun 17, 2026

DeepSeek V4 Economics: The Cost-Quality Frontier for Agentic Coding in 2026

DeepSeek V4 Pro lands a 63.5 on SWE-bench Verified at $0.435/$0.87 per million tokens, and Flash runs agent inner loops for cents. Here is the worked cost math, the Flash-vs-Pro split, and a clear guide on when to route to DeepSeek instead of a frontier model.

deepseek cost-analysis agentic-coding llm-pricing model-routing open-weights

Blog - Jun 17, 2026

GLM-5.2 Cost Math: When Open-Weights Coding Models Actually Save You Money

Z.ai's GLM-5.2 lands as a 753B open-weights coding model that beats GPT-5.5 on SWE-bench Pro for roughly one-sixth the per-token cost. Here is the real cost math, a worked cost-per-task example, and a when-to-use-which decision guide.

pricing ai-models open-weights glm ai-coding-tools

Blog - Jun 17, 2026

GLM-5.2 vs DeepSeek V4 vs Qwen3: The Open-Weights Coding Model Showdown (2026)

A data-rich, source-cited comparison of the open-weights coding models that matter in 2026: GLM-5.2, DeepSeek V4, Qwen3, and the new Kimi K3 frontier entrant. Benchmark table, per-token pricing, context windows, self-host footprint, and a clear pick-X-if decision matrix.

open-weights comparison glm deepseek qwen kimi ai-coding-tools

Blog - Jun 17, 2026

Self-Hosting Open-Weights Models: The Real Break-Even Math

Open weights are free to download, but inference is not free to run. Here is the honest break-even math on when self-hosting GLM-5.2, DeepSeek V4, or Llama beats paying per-token API prices - GPU rental and ownership costs, real throughput, utilization, the crossover in tokens per month, and the hidden ops bill nobody budgets for.

pricing open-weights self-hosting gpu llm-pricing cost-analysis

Blog - Jun 10, 2026

Fable 5 vs DeepSeek V4: The Cost-Quality Gap Measured in Real Tasks

DeepSeek V4-Flash costs $0.28 per million output tokens. Fable 5 costs $50 for the same volume. That 178x gap is real - but so is the quality difference. Here is where it matters and where it does not.

fable-5 deepseek cost-analysis open-weights llm-pricing

Blog - Jun 10, 2026

June 10, 2026: The Day the AI Dev Tool Market Showed Its Whole Hand

Pricing deadlines, infrastructure funding, a banking prompt injection case, and a 4x speed breakthrough - June 10 was one of the densest single days the AI dev tool market has ever produced.

ai-dev-tools roundup claude fable-5 security open-weights postgres

Blog - Jun 10, 2026

What the 'Notes on DeepSeek' Essay Gets Right About Open-Weights Economics

A first-hand visit to DeepSeek HQ reveals something more interesting than benchmark scores: a 300-person company that treats AI as infrastructure, not eschatology - and what that means for API pricing everywhere.

deepseek open-weights ai-economics model-routing developer-tools china-ai

Blog - Apr 29, 2026

Gemma 4: The Open Model Guide for Developers

Gemma 4 ships byte-for-byte open weights from Google DeepMind. How developers deploy it locally, fine-tune it, and ship agents on top of it.

Gemma 4 DeepMind Open Weights Local LLM Fine-tuning
