Are there any good open source text-to-music models, preferably with lyrical abilities?

@[email protected] · 6 months ago

Are there any good open source text-to-music models, preferably with lyrical abilities?

@[email protected] · 6 months ago

The only text-to-audio model I can think of at the moment is Stable Audio Open, which AFAIK is rather underwhelming for your use case, if it can even handle anything more complex than basic sounds, and it doesn’t do lyrics.
It is also under Stability AI’s “new” membership license.

I remember reading about a more recent one, but I can’t find it right now, and I don’t think that one could handle lyrics either.

I suppose the music industry is a lot harder to fight, so not a lot of people want to entangle themselves with it.

@[email protected] · 6 months ago

Interestingly, Jukebox from OpenAI was trained on what appears to be copyrighted music and involved styles and renditions that explicitly referenced specific artists. It’s now four years old though. The demo songs don’t seem to be available anymore on Soundcloud.

There is MusicLM from Google (2023) - no lyrics. Also, AudioCraft from Meta (2023) - also no lyrics as far as I can tell.

@Audalin · 6 months ago

ChatMusician isn’t exactly new and the underlying dataset isn’t particularly diverse, but it’s one of the few models made specifically for classical music.

Are there any others, by the way?

@[email protected] · 6 months ago

Maybe it would be possible to use a regular text-to-voice model and then apply something similar to autotune to the output.
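
To make the autotune idea a bit more concrete, here is a rough sketch of just the pitch-snapping arithmetic, not a working autotune. It assumes you already have a detected pitch in Hz for a chunk of the synthesized voice (pitch detection and the actual audio resampling are not shown), and it assumes standard 12-tone equal temperament with A4 = 440 Hz. The function names are made up for illustration.

```python
import math

A4_HZ = 440.0  # assumed reference tuning (concert pitch)


def snap_to_semitone(freq_hz: float) -> float:
    """Return the equal-tempered pitch closest to freq_hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Distance from A4 in (fractional) semitones.
    semitones_from_a4 = 12 * math.log2(freq_hz / A4_HZ)
    # Round to the nearest whole semitone and convert back to Hz.
    return A4_HZ * 2 ** (round(semitones_from_a4) / 12)


def correction_ratio(freq_hz: float) -> float:
    """Factor by which the pitch would need to be shifted to land on a note."""
    return snap_to_semitone(freq_hz) / freq_hz


if __name__ == "__main__":
    for detected in (260.0, 445.0, 330.0):
        print(detected, snap_to_semitone(detected), correction_ratio(detected))
```

A real pipeline would feed the correction ratio into a pitch shifter, and for singing you would snap to the notes of a target melody rather than to the nearest semitone.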