Max-Planck-Institut für Informatik

MPI-I-2007-4-003

A nonlinear viseme model for triphone-based speech synthesis

Bargmann, Robert and Blanz, Volker and Seidel, Hans-Peter

MPI-I-2007-4-003. June 2007, 28 pages. | Status: available - back from printing

Abstract in LaTeX format:
This paper presents a representation of visemes that defines a measure
of similarity between different visemes, and a system of viseme
categories. The representation is derived from a statistical data
analysis of feature points on 3D scans, using Locally Linear
Embedding (LLE). The similarity measure determines which available
visemes and triphones to use when synthesizing 3D face animation for a
novel audio file. From a corpus of dynamically recorded 3D mouth
articulation data, our system finds the best-suited sequence
of triphones over which to interpolate, while reusing the
coarticulation information to obtain correct mouth movements over
time. Thanks to the similarity measure, the system can work with
relatively small triphone databases and still find the most appropriate
candidates. Given the selected sequence of database triphones, we
morph along the successive triphones to produce the final
articulation animation.
In an entirely data-driven approach, our automated procedure for
defining viseme categories reproduces the groups of related visemes
that are defined in the phonetics literature.
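
The selection step summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the embedding coordinates, the viseme labels, the per-position cost, and the function names below are all assumptions made for illustration. It only shows how a distance in an embedding space could rank database triphones against a target triphone, and it ignores coarticulation and sequence-level optimization.

```python
import math

# Hypothetical 2-D embedding coordinates for a few visemes. The values are
# made up; they stand in for coordinates that an LLE analysis might produce.
EMBEDDING = {
    "p": (0.0, 0.0),
    "b": (0.1, 0.0),
    "m": (0.0, 0.1),
    "f": (1.0, 0.0),
    "v": (1.1, 0.0),
    "a": (0.0, 2.0),
    "o": (0.5, 2.0),
}


def viseme_distance(u, v):
    """Euclidean distance between two visemes in the embedding space."""
    return math.dist(EMBEDDING[u], EMBEDDING[v])


def triphone_cost(target, candidate):
    """Sum of per-position viseme distances (an assumed cost function)."""
    return sum(viseme_distance(t, c) for t, c in zip(target, candidate))


def best_triphone(target, database):
    """Return the database triphone with the lowest cost for the target."""
    return min(database, key=lambda cand: triphone_cost(target, cand))


# A tiny hypothetical triphone database and a target that is not in it.
DATABASE = [("p", "a", "f"), ("m", "o", "v"), ("f", "a", "p")]
target = ("b", "a", "v")
choice = best_triphone(target, DATABASE)
print(choice)
```

With these made-up coordinates, the target ("b", "a", "v") is closest to the stored triphone ("p", "a", "f"), because "b" lies near "p" and "v" lies near "f" in the assumed embedding. This is the sense in which a similarity measure lets a small database cover triphones it does not contain.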


To download this research report, please select the type of document that best fits your needs.
Attachment Size(s):
MPI-I-2007-4-003.ps  31192 KBytes
Please note: If you don't have a viewer for PostScript on your platform, try installing Ghostscript and GhostView.
URL to this document: http://domino.mpi-inf.mpg.de/internet/reports.nsf/NumberView/2007-4-003
BibTeX:
@TECHREPORT{BargmannBlanzSeidel2007,
  AUTHOR = {Bargmann, Robert and Blanz, Volker and Seidel, Hans-Peter},
  TITLE = {A nonlinear viseme model for triphone-based speech synthesis},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2007-4-003},
  MONTH = {June},
  YEAR = {2007},
  ISSN = {0946-011X},
}