Generating chit-chat including laughs, yawns, ‘ums,’ and other nonverbal cues from raw audio
This is one of the topics I'm testing around NeuroVoice. I've recorded samples of my voice with different intonations.

Any given conversation is chock-full of nonverbal signals — intonation, emotional expression, pauses, accent, rhythm — all of which matter to human interaction. But today's AI systems fail to capture these rich, expressive signals because they learn only from written text, which captures what we say but not how we say it.
via Deep Learning Weekly: read the source article