3 posts
NVIDIA's Nemotron 3 Super combines a latent mixture-of-experts design with a hybrid Mamba architecture: 120B total parameters, 12B active per token, a 1M-token context window, and up to 4x more experts at the same cost.
NVIDIA's Nemotron Nano 2 VL delivers vision-language capabilities at a fraction of the computational cost. This 12-billion-parameter open-source model processes videos, analyzes documents, and reas...
NVIDIA's Nemotron Nano 9B V2 delivers something rare: a small language model that doesn't trade capability for speed. This 9B parameter model outperforms Qwen 3B across instruction following, math,...

New tutorials, open-source projects, and deep dives on coding agents, delivered weekly.