For those of us still impatiently waiting, what is your experience so far with “Home Assistant Voice Preview Edition”?

I ordered just 2 hours in, but the vendor I used sold out in 21 minutes. I just found out I also missed the restock, so hopefully some time next month.

  • JustEnoughDucks
    12 hours ago

    Does this speaker require Nabu Casa cloud stuff?

    The media player platform is “nabu” and a ton of things are based on that. If nabu isn’t a requirement then maybe I will rebase my own spin on them (using an AV receiver with RCA cables instead of built-in speakers) and see if it improves it in some way!

    • @spitfire
      11 hours ago

      None of them require any cloud specifically, but you have to provide it with STT and TTS engines. You can use a third-party service or run them on your own hardware, but to do it effectively (have it transcribe your voice in a second instead of 20) you need a GPU.

      • JustEnoughDucks
        9 hours ago

        Nah that isn’t really true.

        I run my server on an AMD 2700X, and the voice assistant without GPU acceleration, using a medium-sized model, normally takes under 3s. It doesn’t even spike my CPU usage to a very high level. Just don’t use a Raspberry Pi for it lol.

        I was talking more about the “nabu” platform that they use in the source code. I have never seen that “platform” before and it is not a component. It isn’t listed in the ESPHome documentation at all.

        Normally you cannot use a media player and a speaker component at the same time. You can use voice assistant with a media player, but there seem to be some bugs. If this “nabu” platform does not require cloud integration and fixes those issues, that is huge for the DIY voice assistant satellite building community.

        • @spitfire
          9 hours ago

          That depends heavily on the language you’re using and therefore the model you’re going to settle on. If you’re using English you’re obviously lucky in that regard, but not everyone (including me) is. I’m not sure what you mean by it not being open source, because Micro Wake Word and Voice Assistant are. Check the link I posted; it has the source for both the software and the hardware design.

          • JustEnoughDucks
            8 hours ago

            Yeah, true. I have to use English because the Dutch/Flemish Wyoming pipeline is completely unusable. Every word is completely wrong in the STT pipeline, even simple words.

            I have no idea what you are talking about with “not open source”. Did you reply to the wrong comment?

            • @spitfire
              3 hours ago

              I’m not sure what you meant by the nabu part: that it’s not a component and you can’t find it in the source.