About Visual Voice

This project created a series of composed stage works of increasing complexity that used newly developed, gesture-controlled, computer-based speech and singing synthesis methods. The interfaces are known as Digital Ventriloquized Actors, or DiVAs. Creating these works required modifying existing formant-based synthesis engines and constructing a new adaptive, gesture-controlled, 3D articulatory audio/visual speech and song synthesizer and its attendant interface.

For the stage works, performers controlled speech and song synthesis, multichannel sound diffusion, and synthesized, sampled, and processed sound. Because the DiVAs are controlled entirely by hand gestures, the performers could also sing and talk with their own voices in tandem with the DiVA's, thereby performing duets with themselves. These staged works were developed in phases, incorporating research results as they emerged to take advantage of improvements in vocal quality, ease of use, and the interface.

As part of our scientific investigation, each performer provided data for studying the mechanisms of speech production and the acquisition and honing of the skills needed to produce analogous gesture-based speech and song. The performers’ acquisition of gesture-based speaking skills was monitored and evaluated analytically and perceptually throughout the project, allowing us to study how each person changed and developed into a sophisticated user of the technology.

Through our highly interdependent artistic and scientific processes we were able to:

  1. develop and combine state-of-the-art synthesis of speech and singing, using physical gestures to mediate between the two, allowing a single user to perform simultaneously with a synthetic and a real voice, and thereby explore different personalities and voice types;
  2. create multiple art works that engage the audience in a new way, as they experience the depth of character and emotion possible when using this new technology to deliver and assist in the development of artistic ideas;
  3. create new methods, useful in practical settings, for adapting human gesture control of sound and visuals;
  4. develop a portable gesture-controlled speech and song synthesis system;
  5. expand the palette of expression for our most intimate performing instrument – the human voice – using gesture, another rich means of communicating human emotion and ideas;
  6. add a new gestural language to vocal performance;
  7. allow performers to gesturally control sound diffusion;
  8. develop and refine work on articulatory and visible speech synthesis;
  9. produce a unique group of performers who can talk and sing both visually and sonically with hand gestures, providing a rare opportunity for our empirical study of language phenomena using perceptual and physiological experiments.