OpenAI Realtime API: The Missing Manual

swyx · on Nov 21, 2024

hey folks! I'm fresh off presenting a well-received Realtime API demo (https://x.com/swyx/status/1859607840639549871) at the 3rd OpenAI DevDay. There have been a lot of little roadbumps along the way building with the new API, and I've benefited so much from @kwindla's advice that I figured it'd be worth publishing a guest post with him on everything we learned using this rather strange new audio (and soon video) API, which we think is quite different from standard text/JSON over HTTP.

hope it is helpful to someone out there, enjoy!

mooreds · on Nov 21, 2024

Heya!

I saw there was a mention of content moderation when the author discussed https://github.com/pipecat-ai/pipecat

But when I went to the GitHub repo, I didn't see anything about that.

I'm loosely related to the content moderation space through my employer, so wanted to learn more about that.

kwindla · on Nov 22, 2024

We've helped a number of Pipecat users hook into a variety of content moderation systems or use LLMs as judges.

The most common approach is to use a `ParallelPipeline` to evaluate the output of the LLM at the same time as the TTS inference is running, then to cancel the output and call a function if a moderation condition is triggered.
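
For anyone who wants to see the shape of that, here's a rough sketch in plain `asyncio`. To be clear, this is not Pipecat's `ParallelPipeline` API: `moderate`, `synthesize`, `speak_with_moderation`, and the blocklist are all made-up stand-ins, and a real setup would call an actual moderation service and TTS engine. It only illustrates the idea of running the check alongside TTS and cancelling the output if the check trips.

```python
import asyncio

# Hypothetical stand-in for a real moderation backend.
BLOCKLIST = {"forbidden"}


async def moderate(text: str) -> bool:
    """Return True if the text should be blocked (placeholder logic)."""
    await asyncio.sleep(0.01)  # simulate a moderation API round trip
    return any(word in text.lower() for word in BLOCKLIST)


async def synthesize(text: str) -> bytes:
    """Pretend TTS: slower than moderation, returns fake audio bytes."""
    await asyncio.sleep(0.05)  # simulate TTS inference time
    return text.encode("utf-8")


async def speak_with_moderation(text, on_flagged):
    """Run TTS and moderation concurrently; drop the audio if flagged."""
    tts_task = asyncio.create_task(synthesize(text))
    mod_task = asyncio.create_task(moderate(text))

    if await mod_task:
        # Moderation tripped: cancel the in-flight TTS and call the handler.
        tts_task.cancel()
        try:
            await tts_task
        except asyncio.CancelledError:
            pass
        on_flagged(text)
        return None

    # Moderation passed: TTS has been running the whole time, so just wait.
    return await tts_task
```

Because both tasks start together, the moderation check adds no latency on the happy path as long as it finishes before the TTS does.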

Other people have written custom frame processors to make use of the content moderation scoring in the Google and Azure APIs.

If you're interested in building a Pipecat integration for your employer's tech, happy to support that. Feel free to DM me on Twitter.

mooreds · on Nov 23, 2024

Awesome, thanks! Will pass this along.