How is it going with “Home Assistant Voice Preview Edition”?

@AA5B · 1 month ago

How is it going with “Home Assistant Voice Preview Edition”?

JustEnoughDucks · 1 month ago

Does this speaker require Nabu Casa cloud stuff?

The media player platform is “nabu”, and a ton of things are based on that. If nabu isn’t a requirement, then maybe I will rebase my own spin on them (using an AV receiver with RCA cables instead of built-in speakers) and see if it improves it in some way!

@spitfire · edit-2 1 month ago

None of them require the cloud specifically, but you have to provide it with STT and TTS engines. You can use a third-party service or run them on your own hardware, but to do it effectively (have it transcribe your voice in a second instead of 20) you need a GPU.

JustEnoughDucks · edit-2 1 month ago

Nah that isn’t really true.

I run my server on an AMD 2700X, and the voice assistant without GPU acceleration and with a medium-sized model normally takes under 3s. It doesn’t even spike my CPU usage to a very high level. Just don’t use a Raspberry Pi for it lol.

I was talking more about the “nabu” platform that they use in the source code. I have never seen that “platform” before and it is not a component. It isn’t listed in the ESPHome documentation at all.

Normally you cannot use a media player and a speaker component at the same time. You can use voice assistant with a media player, but there seem to be some bugs. If this “nabu” platform does not require cloud integration and fixes those issues, that is huge for the DIY voice assistant satellite building community.

@spitfire · 1 month ago

That highly depends on the language you’re using and therefore the model you’re going to settle on. If you’re using English you’re obviously lucky in that regard, but not everyone (including me) is. I’m not sure what you mean by it not being open source, because Micro Wake Word and Voice Assistant are. Check the link I’ve posted; it has both the source for the software and the hardware design.

JustEnoughDucks · edit-2 1 month ago

Yeah, true, I have to use English because the Dutch/Flemish Wyoming pipeline is completely unusable. Every word is completely wrong in the STT pipeline, even simple words.

I have no idea what you are talking about with “not open source”. Did you reply to the wrong comment?

@spitfire · 1 month ago

I’m not sure what you meant by the nabu part, that it’s not a component and you can’t find it in the source.