LLM

21 items

14 posts, 7 tools

Tool · Jun 11, 2026

Open-source LLM engineering platform: tracing, evals, prompt management, and datasets. Self-hostable, OpenTelemetry-native, with 50+ framework integrations.

infrastructure observability llm tracing evals open-source self-hosted

Blog · Jun 10, 2026

The One-Cent Attack: Prompt Injection Through Bank Transfer Memos

Security researchers showed a €0.02 bank transfer could compromise a banking AI assistant. Here is the exact attack chain - and what every developer building agents needs to do differently.

security ai-agents prompt-injection llm banking

Blog · Jun 10, 2026

Mastra: Review and Setup Guide for TypeScript Agent Apps (2026)

A hands-on look at Mastra, the open-source TypeScript framework for building production-ready AI agents and workflows, with verified setup commands, honest tradeoffs, and current pricing.

typescript ai-agents developer-tools open-source llm

Blog · Jun 10, 2026

OpenRouter in 2026: Review, Setup, and When Model Routing Pays

OpenRouter gives you one API key for 300+ models, automatic fallbacks, and intelligent provider routing. Here is what it actually costs, how to set it up in five minutes, and when you should skip it entirely.

ai-tools api model-routing developer-tools llm

Blog · Jun 7, 2026

LLM Routers Compared: LiteLLM vs Portkey vs OpenRouter in 2026

A practical comparison of LLM routing tools - LiteLLM, Portkey, and OpenRouter - covering cost management, fallbacks, caching, and when to use each for production AI applications.

AI Infrastructure LLM Developer Tools Pricing Production

Blog · Apr 29, 2026

KV Caching: A Practical Guide to Optimizing Transformer Inference

How KV caching speeds up LLM inference - the math, the code, the memory tradeoffs, and when it stops helping. Every dev running local models hits this wall.

LLM Inference Optimization Hugging Face Local Models

Blog · Apr 29, 2026

Mercury 2 Developer Guide: Building With a Diffusion LLM in Production

A hands-on developer guide to Mercury 2 from Inception Labs. OpenAI-compatible API, reasoning levels, tool use, structured outputs, and when a diffusion LLM beats an autoregressive one in real apps.

AI LLM Mercury Diffusion Inception Labs API Tutorial

Blog · Apr 28, 2026

Promptlock: Deterministic Prompt Versioning for LLM Apps

Promptlock gives every prompt a 12-char content-addressable id and a diff-able artifact, turning silent prompt drift into a reviewable change.

AI Coding LLM Prompts Tooling Claude

Tool · Apr 23, 2026

Browser Harness

Self-healing browser automation harness that lets LLMs complete any browser task. 5,000+ stars in under a week.

browser automation llm agents self-healing python

Tool · Apr 9, 2026

Ollama

The easiest way to run LLMs locally. One command to pull and run any model. OpenAI-compatible API. 52M+ monthly downloads. Supports GGUF, Safetensors, and custom Modelfiles.

local-ai llm cli open-source self-hosted privacy

Tool · Apr 9, 2026

LM Studio

Desktop app for discovering, downloading, and running local LLMs. Clean chat UI, OpenAI-compatible API server, and automatic GPU detection. MLX engine optimized for Apple Silicon.

local-ai llm desktop gui apple-silicon open-source

Tool · Apr 9, 2026

Jan

Open-source ChatGPT alternative that runs 100% offline. Desktop app with local models, cloud API connections, custom assistants, and MCP integration. AGPLv3 licensed.

local-ai llm desktop open-source privacy offline mcp

Tool · Apr 9, 2026

GPT4All

Private local AI chatbot by Nomic. 250K+ monthly users, 65K GitHub stars. LocalDocs feature lets you chat with your own files. Runs on Windows, macOS, and Linux.

local-ai llm desktop privacy localdocs nomic

Tool · Apr 9, 2026

LocalAI

Open-source OpenAI API replacement. Runs LLMs, vision, voice, image, and video models on any hardware - no GPU required. 35+ backends. Distributed mode for scaling.

local-ai llm open-source self-hosted api-compatible multimodal

Blog · Feb 24, 2026

Mercury 2: The LLM That Doesn't Generate Like an LLM

Inception Labs shipped the first reasoning model built on diffusion instead of autoregressive generation. Over 1,000 tokens per second, competitive benchmarks, and a fundamentally different approach to how AI generates text.

AI LLM Mercury Diffusion Inception Labs

Blog · Nov 2, 2025

Claude Skills: A technical deep dive into Anthropic's new approach to AI context management

A comprehensive look at Claude Skills: modular, persistent task modules that ease AI's memory constraints and enable progressive, composable, code-capable workflows for developers and organizations.

AI Claude LLM Skills

Blog · Aug 8, 2025

GPT-5: OpenAI's Most Capable Model

GPT-5 introduces a fundamentally different approach to inference. Instead of forcing developers to manually configure reasoning parameters, the model operates as a unified system with real-time rou...

OpenAI GPT-5 AI LLM

Blog · Feb 27, 2025

Diffusion Language Models: How Mercury Changed the LLM Speed Game

Inception Labs launched Mercury, the first commercial-grade diffusion large language model. It generates over 1,000 tokens per second on standard Nvidia hardware by replacing autoregressive generation with a coarse-to-fine diffusion process.

Diffusion Models Mercury LLM AI Architecture

Blog · Feb 12, 2025

Unstract: Open-Source AI Document Parsing at Scale

Unstract is an open-source, no-code platform for extracting structured data from PDFs, invoices, scanned documents, and more. Here is how it works, how to set it up, and why automated document processing is becoming essential for organizations drowning in unstructured data.

Document AI Open Source Data Extraction PDF Parsing LLM

Blog · Jan 9, 2025

Microsoft PHI-4: A 14B Parameter Model That Rivals Models 5x Its Size

Microsoft's PHI-4 is an MIT-licensed 14 billion parameter model that matches Llama 3.3 70B and Qwen 2.5 72B on key benchmarks. Here is what makes it special, how to run it locally, and why small language models are increasingly practical for real development work.

Microsoft PHI-4 Open Source AI LLM Ollama Local AI

