Voicemail’s definitely not dead.
I can’t find any voicemail services that work the way I want them to though, so I started building my own using Twilio to handle the incoming phone call + ElevenLabs for text-to-speech + AssemblyAI for speech-to-text + Trestle Smart CNAM API for identifying the caller. I’ll open-source the code once it’s ready.
Seems awfully overcomplicated. Why not just use some TwiML verbs like <Say> and <Gather>?
Twilio’s TTS isn’t as good as ElevenLabs’, and their transcription isn’t as good as AssemblyAI’s. AssemblyAI can pull key details out of the message (e.g. people’s names, company names, callback numbers, etc.) and IIRC it’s quite a bit cheaper than Twilio’s transcription. AssemblyAI provide $50 free credit to try their service, which should last me a very long time assuming it doesn’t expire.
Plus now I can put “AI engineer” on my resume, lol. A lot of “AI” is all about gluing other people’s work together, and that’s exactly what I’m doing.
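For anyone comparing the two approaches in this thread, here is a rough sketch of the TwiML side in Python, using only the standard library. It is not the code from the project mentioned above. The function name and all URLs are hypothetical, and it uses `<Record>` to capture the message, which is an assumption on my part. With no greeting audio it falls back to Twilio's built-in `<Say>`. With a URL to pre-rendered audio (for example a greeting generated ahead of time by an external TTS service) it uses `<Play>` instead. Transcription is left to whatever service handles the recording callback.

```python
import xml.etree.ElementTree as ET


def voicemail_twiml(action_url, recording_callback_url,
                    greeting_text=None, greeting_audio_url=None):
    """Build a TwiML document that greets the caller and records a message.

    All URLs are placeholders supplied by the caller of this function.
    """
    response = ET.Element("Response")

    if greeting_audio_url:
        # Pre-rendered greeting audio hosted somewhere Twilio can fetch it.
        ET.SubElement(response, "Play").text = greeting_audio_url
    else:
        # Fall back to Twilio's built-in text-to-speech.
        ET.SubElement(response, "Say").text = (
            greeting_text or "Please leave a message after the beep."
        )

    ET.SubElement(response, "Record", {
        # Where Twilio sends the call once recording ends.
        "action": action_url,
        # Where Twilio reports that the recording file is available;
        # a handler there could pass the audio to a speech-to-text service.
        "recordingStatusCallback": recording_callback_url,
        "maxLength": "120",   # seconds; an arbitrary choice for this sketch
        "playBeep": "true",
    })

    # Only reached if the caller leaves no audio.
    ET.SubElement(response, "Hangup")

    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(voicemail_twiml(
        "https://example.com/voicemail/done",
        "https://example.com/voicemail/recording",
        greeting_audio_url="https://example.com/greeting.mp3",
    ))
```

The only difference between the "plain TwiML" version and the "glued together" version is the greeting verb and what happens behind the recording callback, so swapping services later should not require changing the call flow.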