On Thursday, OpenAI released the “system card” for ChatGPT’s new GPT-4o AI model that details model limitations and safety testing procedures. Among other examples, the document reveals that in rare occurrences during testing, the model’s Advanced Voice Mode unintentionally imitated users’ voices without permission. Currently, OpenAI has safeguards in place that prevent this from happening, but the instance reflects the growing complexity of safely designing an AI chatbot that could potentially imitate any voice from a small clip.

Advanced Voice Mode is a feature of ChatGPT that allows users to have spoken conversations with the AI assistant.

In a section of the GPT-4o system card titled “Unauthorized voice generation,” OpenAI details an episode where a noisy input somehow prompted the model to suddenly imitate the user’s voice. “Voice generation can also occur in non-adversarial situations, such as our use of that ability to generate voices for ChatGPT’s advanced voice mode,” OpenAI writes. “During testing, we also observed rare instances where the model would unintentionally generate an output emulating the user’s voice.”

It would certainly be creepy to be talking to a machine and then have it unexpectedly begin talking to you in your own voice. Ordinarily, OpenAI has safeguards to prevent this, which is why the company says this occurrence was rare even before it developed ways to prevent it completely. But the example prompted BuzzFeed data scientist Max Woolf to tweet, “OpenAI just leaked the plot of Black Mirror’s next season.”

  • noughtnaut
    1 month ago

    Keep in mind, your voice sounds quite different to others than it does to you (because of bone conduction within your skull). So, unless you have a same-sex twin, would you even recognise the voice as your own?

    • @Arbiter
      1 month ago

      Have you never heard your own voice?

    • @Bangs42
      1 month ago

      Unless you’ve never heard a recording or seen a video of yourself, I’m going to go with no.

    • Flying Squid
      1 month ago

      As someone who has done a lot of VO professionally, that is something you can train yourself to work around: the limitations of your own skull letting you hear yourself differently than everyone else. For a long time, I had to use headphones, but I got to the point where I understood my vocalizations enough to not need them much of the time.