TL;DR
Crawl any website, generate a podcast script with an LLM, and convert it to audio with ElevenLabs. A full-stack walkthrough from API keys to streaming playback.
Drop a handful of URLs into an input field. Click generate. Get back a voiced podcast that covers the highlights from every link you provided. That is what we are building in this tutorial - a full-stack Next.js app that chains together a web scraper, Groq for LLM inference, and ElevenLabs for text-to-speech.
The architecture is straightforward. The frontend collects URLs and streams progress updates. The backend scrapes each URL in parallel, feeds the combined content into an LLM to write a podcast script, then sends that script to ElevenLabs for audio generation. By the end, you will have a working app that turns any collection of web pages into a listenable podcast.
Three API keys, all available on free tiers: one for your scraping service, one for Groq, and one for ElevenLabs.
Create a .env file in your project root:
SCRAPER_API_KEY=your_scraper_key
ELEVENLABS_API_KEY=your_elevenlabs_key
GROQ_API_KEY=your_groq_key
Initialize a Next.js project and install the dependencies:
npx create-next-app@latest podcast-engine
cd podcast-engine
npm install openai elevenlabs uuid
The OpenAI SDK works with Groq because Groq exposes an OpenAI-compatible endpoint: point the SDK at Groq's base URL and the rest of the client API works unchanged.
Create app/api/generate-podcast/route.ts. This is where all three services get wired together.
import { NextRequest } from "next/server";
import OpenAI from "openai";
import { ElevenLabsClient } from "elevenlabs";
import { v4 as uuidv4 } from "uuid";
import { mkdirSync, writeFileSync } from "fs";
import path from "path";
async function scrapeUrl(url: string): Promise<string> {
const res = await fetch(url);
const html = await res.text();
// Naive fallback: drop script/style blocks, then strip tags.
// Swap in your preferred scraping solution for clean markdown.
return html
.replace(/<script[\s\S]*?<\/script>/gi, "")
.replace(/<style[\s\S]*?<\/style>/gi, "")
.replace(/<[^>]*>/g, " ")
.replace(/\s+/g, " ")
.trim();
}
const openai = new OpenAI({
apiKey: process.env.GROQ_API_KEY!,
baseURL: "https://api.groq.com/openai/v1",
});
const elevenlabs = new ElevenLabsClient({
apiKey: process.env.ELEVENLABS_API_KEY!,
});
Notice the baseURL on the OpenAI client. That single line is what routes all requests to Groq instead of OpenAI. You can swap this to any OpenAI-compatible provider.
async function createAudioFromText(text: string): Promise<string> {
// Limit characters for free tier (adjust as needed)
const truncatedText = text.substring(0, 800);
const audio = await elevenlabs.generate({
voice: "Rachel",
text: truncatedText,
model_id: "eleven_monolingual_v1",
});
// uuid keeps filenames unique even for requests in the same millisecond
const fileName = `podcast-${uuidv4()}.mp3`;
const dir = path.join(process.cwd(), "public", "podcasts");
// Create the output directory on first run so writeFileSync does not throw
mkdirSync(dir, { recursive: true });
const filePath = path.join(dir, fileName);
const chunks: Buffer[] = [];
for await (const chunk of audio) {
chunks.push(Buffer.from(chunk));
}
writeFileSync(filePath, Buffer.concat(chunks));
return `/podcasts/${fileName}`;
}
The 800-character limit is a safety measure for the free tier. Once you are past testing, remove it or increase it substantially. ElevenLabs charges by character, so keep that in mind.
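If you want the full script voiced rather than an 800-character teaser, one option is to split the script at sentence boundaries and make one text-to-speech request per chunk. A minimal sketch, where the `splitIntoChunks` name and the 2,500-character cap are assumptions rather than part of the tutorial code:

```typescript
// Sketch: split a script into chunks that fit a per-request character cap,
// breaking at sentence boundaries so the voice never cuts off mid-sentence.
// The 2500 default is an assumption; tune it to your ElevenLabs plan.
export function splitIntoChunks(text: string, maxChars = 2500): string[] {
  // Grab runs of text ending in sentence punctuation (plus trailing spaces)
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length + sentence.length > maxChars && current) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```

You would then loop over the chunks, call the TTS client once per chunk, and concatenate the resulting buffers before writing the file.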
For production, you would swap writeFileSync for an upload to S3, Tigris, or any blob storage. The local file approach works fine for development.
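As a sketch of that swap, here is a dependency-free version that PUTs the MP3 to a presigned URL; S3, Tigris, and R2 all support presigned uploads. The `podcastKey` helper and the existence of an endpoint that mints the presigned URL are assumptions:

```typescript
// Sketch: production storage swap. Assumes your backend can mint a presigned
// PUT URL (S3, Tigris, and R2 all support this); no storage SDK required.
export function podcastKey(userId: string): string {
  // Namespacing by user ID keeps one user's files from clobbering another's
  return `podcasts/${userId}/${Date.now()}.mp3`;
}

export async function uploadPodcast(presignedUrl: string, mp3: Buffer): Promise<void> {
  const res = await fetch(presignedUrl, {
    method: "PUT",
    headers: { "Content-Type": "audio/mpeg" },
    body: new Uint8Array(mp3),
  });
  if (!res.ok) throw new Error(`Upload failed: ${res.status}`);
}
```

Store the resulting object key (or public URL) alongside the user or session record instead of returning a `/podcasts/...` path.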
This is where the pipeline comes together. We use a ReadableStream to send progress updates back to the frontend as each step completes:
export async function POST(req: NextRequest) {
const { urls } = await req.json();
const stream = new ReadableStream({
async start(controller) {
const send = (type: string, data: any) => {
controller.enqueue(
new TextEncoder().encode(
`data: ${JSON.stringify({ type, data })}\n\n`
)
);
};
try {
// Step 1: Scrape all URLs in parallel
send("status", "Scraping websites...");
const scrapeResults = await Promise.all(
urls.map(async (url: string) => {
const markdown = await scrapeUrl(url);
return markdown;
})
);
const combinedContent = scrapeResults.join("\n\n---\n\n");
if (!combinedContent.trim()) {
send("error", "No content could be scraped from the provided URLs.");
controller.close();
return;
}
// Step 2: Generate podcast script with LLM
send("status", "Compiling stories...");
const today = new Date().toLocaleDateString("en-US", {
weekday: "long",
year: "numeric",
month: "long",
day: "numeric",
});
const completion = await openai.chat.completions.create({
model: "llama-3.2-90b-vision-preview",
messages: [
{
role: "system",
content:
"You are a witty tech podcaster. Create a 5-minute script covering the top 5-10 most interesting stories. Summarize each story in 1-4 sentences, keeping the tone funny and entertaining. Aim for a mix of humor.",
},
{
role: "user",
content: `Create a hilarious and informative 5-minute podcast for ${today}. Here is the source material:\n\n${combinedContent}`,
},
],
stream: true,
});
send("status", "Crafting commentary...");
let fullScript = "";
for await (const chunk of completion) {
const text = chunk.choices[0]?.delta?.content || "";
fullScript += text;
send("script", text);
}
// Step 3: Generate audio from the script
send("status", "Generating audio...");
const audioPath = await createAudioFromText(fullScript);
send("complete", { audioPath });
} catch (error) {
send("error", String(error));
}
controller.close();
},
});
return new Response(stream, {
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
},
});
}
The three-step pipeline runs sequentially because each step depends on the previous one. But within step 1, all URLs are scraped in parallel using Promise.all. This matters when you have five or ten URLs - parallel scraping is significantly faster than sequential.
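Promise.all has one sharp edge, though: if a single URL throws, the whole batch rejects. A hedged variant using Promise.allSettled keeps whatever succeeded (`scrapeAll` and the injected `scrape` parameter are illustrative names, not part of the route above):

```typescript
// Sketch: scrape in parallel but tolerate individual failures, so one dead
// link does not abort the whole batch. The scrape function is injected so
// this helper stays independent of any particular scraping service.
export async function scrapeAll(
  urls: string[],
  scrape: (url: string) => Promise<string>
): Promise<string[]> {
  const results = await Promise.allSettled(urls.map(scrape));
  return results
    .filter((r): r is PromiseFulfilledResult<string> => r.status === "fulfilled")
    .map((r) => r.value);
}
```

With this in place, the "No content could be scraped" branch only fires when every URL fails, not when any one of them does.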
One thing to watch: the combined markdown from all scraped pages needs to fit within the LLM's context window. Llama 3.2 90B on Groq supports 128K tokens, which is generous. But if you are feeding in dozens of long articles, you may need to truncate or summarize individual pages before combining them.
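One way to enforce that budget is a rough character-based guard; roughly 4 characters per token is a common heuristic for English text. A sketch, where `fitToContext` and the reserved-token figure are assumptions:

```typescript
// Sketch: rough context-budget guard using the ~4 chars/token heuristic.
// reservedTokens leaves room for the system prompt and the generated script.
export function fitToContext(
  pages: string[],
  maxTokens = 128_000,
  reservedTokens = 8_000
): string {
  const budgetChars = (maxTokens - reservedTokens) * 4;
  const perPage = Math.floor(budgetChars / pages.length);
  // Truncate each page evenly rather than dropping the last pages entirely
  return pages.map((p) => p.slice(0, perPage)).join("\n\n---\n\n");
}
```

Even truncation is a design choice: it guarantees every URL contributes something to the script, at the cost of clipping long articles.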
The frontend needs to handle URL input, progress streaming, script display, and audio playback.
"use client";
import { useState, useRef } from "react";
interface StepStatus {
label: string;
complete: boolean;
}
export default function PodcastEngine() {
const [urls, setUrls] = useState<string[]>([]);
const [newUrl, setNewUrl] = useState("");
const [loading, setLoading] = useState(false);
const [script, setScript] = useState("");
const [audioSrc, setAudioSrc] = useState("");
const [statuses, setStatuses] = useState<StepStatus[]>([]);
const audioRef = useRef<HTMLAudioElement>(null);
function isValidUrl(str: string): boolean {
try {
new URL(str);
return true;
} catch {
return false;
}
}
function addUrl() {
if (isValidUrl(newUrl) && !urls.includes(newUrl)) {
setUrls((prev) => [...prev, newUrl]);
setNewUrl("");
}
}
function removeUrl(url: string) {
setUrls((prev) => prev.filter((u) => u !== url));
}
async function generate() {
setLoading(true);
setScript("");
setAudioSrc("");
setStatuses([]);
const response = await fetch("/api/generate-podcast", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ urls }),
});
const reader = response.body?.getReader();
if (!reader) {
setLoading(false);
return;
}
const decoder = new TextDecoder();
// Buffer partial chunks: a single SSE frame can be split across reads
let buffer = "";
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split("\n");
buffer = lines.pop() ?? "";
for (const line of lines.filter((l) => l.startsWith("data: "))) {
const { type, data } = JSON.parse(line.slice(6));
if (type === "status") {
setStatuses((prev) => [
...prev.map((s) => ({ ...s, complete: true })),
{ label: data, complete: false },
]);
} else if (type === "script") {
setScript((prev) => prev + data);
} else if (type === "complete") {
setAudioSrc(data.audioPath);
setStatuses((prev) =>
prev.map((s) => ({ ...s, complete: true }))
);
} else if (type === "error") {
setScript((prev) => prev + "\n\n[Error] " + data);
}
}
}
setLoading(false);
}
return (
<div className="max-w-4xl mx-auto p-8">
<h1 className="text-3xl font-bold mb-6">Podcast Engine</h1>
<div className="flex gap-2 mb-4">
<input
value={newUrl}
onChange={(e) => setNewUrl(e.target.value)}
placeholder="https://example.com/article"
className="flex-1 border rounded px-3 py-2"
onKeyDown={(e) => e.key === "Enter" && addUrl()}
/>
<button onClick={addUrl} className="px-4 py-2 bg-black text-white rounded">
Add
</button>
</div>
{urls.map((url) => (
<div key={url} className="flex items-center gap-2 mb-2">
<span className="text-sm truncate flex-1">{url}</span>
<button onClick={() => removeUrl(url)} className="text-red-500">
Remove
</button>
</div>
))}
<button
onClick={generate}
disabled={loading || urls.length === 0}
className="mt-4 px-6 py-3 bg-black text-white rounded disabled:opacity-50"
>
{loading ? "Generating..." : "Generate Podcast"}
</button>
{statuses.length > 0 && (
<div className="mt-6 space-y-2">
{statuses.map((s, i) => (
<div key={i} className="flex items-center gap-2">
<span>{s.complete ? "Done" : "Working..."}</span>
<span>{s.label}</span>
</div>
))}
</div>
)}
{script && (
<pre className="mt-6 p-4 bg-gray-50 rounded whitespace-pre-wrap text-sm">
{script}
</pre>
)}
{audioSrc && (
<audio ref={audioRef} controls src={audioSrc} className="mt-6 w-full" />
)}
</div>
);
}
The streaming consumption pattern here is reusable. You read chunks from the response body, split on newlines, parse the JSON payload, and dispatch to the correct state handler. This same approach works for any backend that returns Server-Sent Events.
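Extracted into a standalone helper, the pattern might look like this (`SseParser` is an illustrative name; note that it also buffers a frame that arrives split across two chunks):

```typescript
// Sketch: the SSE-consumption pattern from the component, extracted into a
// reusable parser. Feed it raw decoded chunks; it returns complete parsed
// frames and holds any partial trailing line until the next chunk arrives.
export class SseParser<T = unknown> {
  private buffer = "";

  feed(chunk: string): T[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line
    return lines
      .filter((l) => l.startsWith("data: "))
      .map((l) => JSON.parse(l.slice(6)) as T);
  }
}
```

The component's read loop then reduces to `for (const event of parser.feed(decoder.decode(value, { stream: true }))) { ... }`, which is easy to reuse against any SSE backend.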
The data flow is linear: URLs go to the scraper, the scraped text goes to the LLM, the script goes to ElevenLabs, and the audio file comes back to the player.
Each service does one thing well. The scraper handles the messy work of parsing web pages into LLM-friendly markdown. Groq runs fast inference on open models. ElevenLabs produces natural-sounding speech. Your code is just the orchestration layer.
The system prompt is where you control the podcast personality. The default asks for "witty" and "funny", but swapping a single string gives you a serious news brief, a sarcastic recap, or a kid-friendly explainer.
The voice selection in ElevenLabs also shapes the output dramatically. The Rachel voice works well for news-style delivery, but ElevenLabs offers dozens of alternatives. You can even clone a voice if you have sample audio.
For a real deployment, a few things need to change: swap writeFileSync for S3, Tigris, or Vercel Blob, and associate each generated file with a user ID or session.

This project demonstrates a pattern that shows up constantly in AI application development: chain multiple specialized APIs together, stream progress to the user, and let the LLM handle the creative work. The scraper turns messy web pages into clean text. The LLM turns clean text into structured content. ElevenLabs turns text into audio. Your application code is mostly plumbing.
The same architecture works for building AI newsletter generators, automated meeting summarizers, voice briefings, or any workflow where you need to go from raw data to polished, consumable output. Swap the services, change the prompts, and you have a different product.