OpenAI Realtime API: The Missing Manual (latent.space)
2 points by swyx on Nov 21, 2024 | 4 comments


hey folks! I'm fresh off presenting a well-received Realtime API demo (https://x.com/swyx/status/1859607840639549871) at the third OpenAI DevDay. There have been a lot of little road bumps along the way building with the new API, and I've benefited so much from @kwindla's advice that I figured it'd be worth publishing a guest post with him on everything we learned using this rather strange new audio (and soon video) API, which we think is quite different from standard text/JSON over HTTP.

hope it is helpful to someone out there, enjoy!


Heya!

I saw there was a mention of content moderation when the author discussed https://github.com/pipecat-ai/pipecat

But when I went to the GitHub repo, I didn't see anything about that.

I'm loosely related to the content moderation space through my employer, so wanted to learn more about that.


We've helped a number of Pipecat users hook into a variety of content moderation systems or use LLMs as judges.

The most common approach is to use a `ParallelPipeline` to evaluate the output of the LLM at the same time as the TTS inference is running, then to cancel the output and call a function if a moderation condition is triggered.
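
For anyone who wants to see the shape of that pattern, here is a rough sketch in plain `asyncio`. It does not use Pipecat's actual API: `moderate`, `synthesize`, `speak_with_moderation`, and the `BLOCKLIST` rule are all hypothetical stand-ins, and the sleeps simulate network and inference latency. The idea is only to show moderation running concurrently with TTS, with the output cancelled and a callback invoked if the check trips.

```python
import asyncio

# Hypothetical moderation rule; a real system would call a moderation API
# or an LLM judge instead of matching a word list.
BLOCKLIST = {"forbidden"}


async def moderate(text: str) -> bool:
    """Return True if the text trips the (hypothetical) moderation condition."""
    await asyncio.sleep(0.01)  # stand-in for the moderation call's latency
    return any(word in text.lower() for word in BLOCKLIST)


async def synthesize(text: str, played: list) -> None:
    """Stand-in for TTS inference: 'plays' one word at a time."""
    for word in text.split():
        await asyncio.sleep(0.05)  # stand-in for per-chunk synthesis latency
        played.append(word)


async def speak_with_moderation(text: str, on_flagged) -> list:
    """Run TTS and moderation concurrently; cancel TTS if moderation flags."""
    played = []
    tts_task = asyncio.create_task(synthesize(text, played))
    if await moderate(text):
        # Moderation tripped: stop the audio and hand off to the callback.
        tts_task.cancel()
        try:
            await tts_task
        except asyncio.CancelledError:
            pass
        on_flagged(text)
    else:
        await tts_task
    return played


if __name__ == "__main__":
    flagged = []
    print(asyncio.run(speak_with_moderation("hello there", flagged.append)))
    print(asyncio.run(speak_with_moderation("this is forbidden", flagged.append)))
    print(flagged)
```

In this sketch the moderation check finishes before the first TTS chunk, so a flagged response never reaches the listener. In a real pipeline the check may finish after some audio has already played, so the cancel path would also need to flush whatever is buffered.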

Other people have written custom frame processors to make use of the content moderation scoring in the Google and Azure APIs.

If you're interested in building a Pipecat integration for your employer's tech, happy to support that. Feel free to DM me on Twitter.


Awesome, thanks! Will pass this along.



