One easy way to build voice agents and connect them to Twilio is the Pipecat ope...

ldenoue · 2025-12-25T03:57:03 1766635023

The problem with Pipecat and LiveKit (the two major stacks for building voice AI) is deployment at scale.

That’s why I created a stack that runs entirely on Cloudflare Workers and Durable Objects, in JavaScript.

Providers like AssemblyAI and Deepgram now integrate VAD (voice activity detection) into their realtime APIs, so our voice AI only needs networking (no CPU anymore).

nextworddev · 2025-12-25T04:16:23 1766636183

let me get this straight, you are storing convo threads / context in DOs?

e.g. Deepgram (STT) via websocket -> DO -> LLM API -> TTS?

ldenoue · 2025-12-27T01:26:03 1766798763

Yes, DOs let you handle long-lived WebSocket connections. I think this is unique to Cloudflare. AWS and Google Cloud don't seem to offer this (statefulness, basically).
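To make the STT -> DO -> LLM -> TTS flow concrete, here is a rough sketch of the per-call state such an object could keep in memory. Everything in it is an assumption for illustration: `CallSession`, `Providers`, and `Turn` are hypothetical names, the provider hooks stand in for real STT/LLM/TTS clients, and none of the Cloudflare wiring (WebSocket upgrade, Durable Object class) is shown.

```typescript
// Hypothetical per-call state that one Durable Object instance could hold.
type Role = "user" | "assistant";

interface Turn {
  role: Role;
  text: string;
}

// Assumed provider hooks; real LLM and TTS clients would sit behind these.
interface Providers {
  llm(history: Turn[]): Promise<string>;
  tts(text: string): Promise<void>;
}

class CallSession {
  private history: Turn[] = [];
  private providers: Providers;

  constructor(providers: Providers) {
    this.providers = providers;
  }

  // Call this for each final transcript that arrives from the STT websocket.
  async onTranscript(text: string): Promise<string> {
    this.history.push({ role: "user", text });
    // Pass a copy so the provider cannot mutate the stored context.
    const reply = await this.providers.llm([...this.history]);
    this.history.push({ role: "assistant", text: reply });
    await this.providers.tts(reply);
    return reply;
  }

  // Read-only view of the conversation so far.
  turns(): readonly Turn[] {
    return this.history;
  }
}
```

Because the object lives as long as the call does, the conversation context stays in memory next to the open sockets instead of being reloaded on every request.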

Same with TTS: some providers, like Deepgram and ElevenLabs, let you stream the LLM text (or per-sentence chunks) over their WebSocket API, which makes your voice AI bot really low latency.
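The per-sentence chunking can be sketched roughly like this. `SentenceChunker` is a hypothetical helper, not part of any provider SDK, and the boundary rule (`.`, `!`, or `?` followed by whitespace) is a deliberately naive assumption that will also split on abbreviations like "Dr. Smith".

```typescript
// Buffers streamed LLM text and emits complete sentences as they appear.
class SentenceChunker {
  private buffer = "";

  // Add a streamed delta; returns any sentences that are now complete.
  push(delta: string): string[] {
    this.buffer += delta;
    const sentences: string[] = [];
    // Naive boundary: terminal punctuation followed by whitespace.
    const boundary = /[.!?]+\s+/;
    let match = boundary.exec(this.buffer);
    while (match !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.buffer.slice(0, end).trim());
      this.buffer = this.buffer.slice(end);
      match = boundary.exec(this.buffer);
    }
    return sentences;
  }

  // Call when the LLM stream ends to emit whatever text is left.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest ? [rest] : [];
  }
}
```

Each string returned by `push` (and the final one from `flush`) would then be sent over the TTS websocket in whatever message format that provider expects, so speech can start before the LLM has finished its full reply.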

nextworddev · 2025-12-25T03:17:43 1766632663

This is good stuff.

In your opinion, how close is Pipecat + OSS to replacing proprietary infra from Vapi, Retell, Sierra, etc?

kwindla · 2025-12-25T15:11:22 1766675482

It depends on what you mean by replacing.

The integrated developer experience is much better on Vapi, etc.

The goal of the Pipecat project is to provide state of the art building blocks if you want to control every part of the multimodal, realtime agent processing flow and tech stack. There are thousands of companies with Pipecat voice agents deployed at scale in production, including some of the world's largest e-commerce, financial services, and healthtech companies. The Smart Turn model benchmarks better than any of the proprietary turn detection models. Companies like Modal have great info about how to build agents with sub-second voice-to-voice latency.[1] Most of the next-generation video avatar companies are building on Pipecat.[2] NVIDIA built the ACE Controller robot operating system on Pipecat.[3]

[1] https://modal.com/blog/low-latency-voice-bot
[2] https://lemonslice.com/
[3] https://github.com/NVIDIA/ace-controller/

nextworddev · 2025-12-25T16:14:31 1766679271

Is there a simple, serverless way to deploy the Pipecat stack without me having to self-host on my own infra?

I just want to provide:
- business logic
- tools
- configuration metadata (e.g. which voice to use)

I don't like Vapi because of 1) the extensive GUI-driven experience and 2) the cost.

ldenoue · 2025-12-27T01:27:39 1766798859

Check out something like LayerCode (Cloudflare based).

Or Pipecat Cloud / LiveKit Cloud (I think they charge 1 cent per minute?)