The Azure AI Speech Personal Voice feature has been upgraded to a new zero-shot TTS model called DragonV2.1Neural. Because it is a zero-shot model, voices can be created from minimal data: just a few seconds of a voice sample is enough for it to synthesize speech in over 100 languages. The new model promises “more natural sounding and expressive voice” with “improved pronunciation accuracy and greater controllability.” The previous DragonV1 model had pronunciation challenges, especially with...

Read the full article at Neowin