What prompt do you use for that?


I just tried "analyze this audio file recording of a meeting and notes along with a transcript labeling all the speakers" (using the language from the parent's comment) and indeed Gemini 3 was significantly better than 2.5 Pro.

Gemini 3 created a great "Executive Summary", identified the speakers' names, and then gave me a second-by-second transcript:

    [00:00] Greg: Hello.
    [00:01] X: You great?
    [00:02] Greg: Hi.
    [00:03] X: I'm X.
    [00:04] Y: I'm Y.
    ...

Super impressive!


Does it deduce everyone's name?


It does! I redacted them, but yes. This was a 3-person call.


I made a simple webpage to grab text from YouTube videos: https://summynews.com Might be useful for this kind of testing? (I want to expand to other sources in the long run.)



