Microsoft researchers are working on a text-to-speech (TTS) model that can mimic a person’s voice – complete with emotion and intonation – from a mere three-second audio sample of that voice. The technology – called VALL-E and outlined in a 15-page research paper released this month on the arXiv research site – is a significant step forward for Microsoft.