Generating chit-chat including laughs, yawns, ‘ums,’ and other nonverbal cues from raw audio

Jean-Philippe Encausse

April 11, 2022

🧠 Artificial Intelligence, Technology Watch

This is one of the topics I am testing around NeuroVoice. I recorded samples of my voice with different intonations.

In any given conversation, people exchange a wealth of nonverbal signals, such as intonation, emotional expression, pauses, accents, and rhythm, all of which are important to human interaction. But today’s AI systems fail to capture these rich, expressive signals because they learn only from written text, which captures what we say but not how we say it.

via Deep Learning Weekly: read the source article

