Hints on Remaining Tasks

In order to implement expressions as described in the FutureWork Outline, the following tasks remain:

i) Split the Accent window into phoneme & vizeme modes, allowing the user to create accents for both phonemes & expressions
ii) Implement a mapping function from performance glove & tracker data to expression probabilities
iii) Add an inlet for expressions to ArtisynthManager, and add functionality so that it considers expressions as well as phonemes when mapping to PC vectors.

Some hints on how this should be implemented, and the principal considerations, are provided below.

Accent Window

This interface should be extended similarly to the Training Window (see Progress to Date). The principal differences are as follows:

i) Where before there were only phonemes, now there are both phonemes & expressions
ii) Where before there was only a single dictionary, now there is one dictionary for phonemes, and possibly multiple dictionaries for expressions

One must then consider where these differences must be reflected when modifying the code. The following known areas must be changed:

i) The file structure of each user's Accent folder (under Profiles/[profileName]/Accent/) should be split into Phoneme and Vizeme subfolders, and the Vizeme Accent should be split further into one subfolder per dictionary file
ii) The user must be required to choose a mode before attempting to create an accent, and when changing modes, the interface should check for unsaved changes
iii) An additional menu is required in Vizeme mode enabling the user to select a dictionary, since for vizeme mode there may exist multiple dictionaries containing different sets of expressions
iv) The Python scripts used when creating an accent must be modified to reflect the extended file structure
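As a starting point for item i), the split folder layout could be created by a small helper like the following. This is only a sketch: the function name and the dictionary names passed to it are illustrative, not part of the existing code; only the Profiles/[profileName]/Accent/ path and the Phoneme/Vizeme split come from the description above.

```python
import os

def create_accent_folders(profile_name, dictionary_names, root="Profiles"):
    """Create the split Accent folder structure for one user profile.

    Phoneme accents live in a single subfolder; Vizeme accents get one
    subfolder per expression dictionary, as described above.
    """
    accent_root = os.path.join(root, profile_name, "Accent")
    os.makedirs(os.path.join(accent_root, "Phoneme"), exist_ok=True)
    for name in dictionary_names:
        os.makedirs(os.path.join(accent_root, "Vizeme", name), exist_ok=True)
    return accent_root
```

The existing accent-creation scripts would then read from and write to these subfolders instead of the old flat Accent folder.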

Other considerations may also arise. After carefully enumerating the possible differences, the best way to discover unaccounted-for changes is to exercise the interface and note any errors in usage or file storage.

Mapping Function

This function will be almost directly analogous to the vowelMapping and consonantMapping subpatchers of voiceMapping (a subpatcher in the Perform window). It should be implemented as a subpatcher of voiceMapping named expressionMapping. It will take glove & tracker data and expression accent weights as inputs, and use a Radial Basis Function (RBF) to determine expression probabilities (for details of the calculation, see "An Adaptive Gesture-to-Speech Interface for DIVA" by Allison Lenters under "Publications" in the left-hand menu of this wiki, or explore the vowelMapping patcher in the DIVA code). As mentioned in the outline, arm rotation will likely be used to determine the expression. The only additional difficulty this presents is that arm rotation is some combination of yaw, pitch and roll values, so an appropriate transformation must be devised to recover the arm rotation from those values.
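The RBF step might look like the following sketch, which assumes a generic Gaussian basis function with one centre per expression (the accent weights supplying the centres and widths). The exact form DIVA uses is given in Lenters' paper and the vowelMapping patcher; the function and parameter names here are hypothetical, and the yaw/pitch/roll-to-arm-rotation transform is assumed to have been applied already.

```python
import math

def expression_probabilities(arm_rotation, centers, widths):
    """Gaussian RBF sketch: one basis function per expression.

    centers: trained arm-rotation value for each expression (from accent data)
    widths:  spread of each basis function
    Returns probabilities normalized to sum to 1.
    """
    activations = [
        math.exp(-((arm_rotation - c) ** 2) / (2.0 * w ** 2))
        for c, w in zip(centers, widths)
    ]
    total = sum(activations)
    return [a / total for a in activations]
```

An arm rotation nearest a given expression's trained centre thus yields the highest probability for that expression, with smooth falloff between expressions.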

Data flow for the mapping function, then, will look something like this:

Raw Input Data --> Inlet of expressionMapping --> Select Yaw, Pitch & Roll Data --> Transform to arm rotation --> Use Accent Weights and RBF calculation --> Expression Probabilities --> Outlet of expressionMapping --> Inlet of Artisynth Manager

Artisynth Manager

The Artisynth Manager must be able to consider expressions as well as phonemes. There are two relevant classes here, ArtisynthManager itself and CalcPCs, with the following responsibilities:

i) ArtisynthManager is in charge of receiving data through the inlets, calling CalcPCs' methods, and sending data out through the socket.
ii) CalcPCs loads vectors from the map file and vizeme files, and, receiving data values from ArtisynthManager, performs the phoneme --> PC vector calculation, returning these values to ArtisynthManager.
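The phoneme --> PC vector calculation in CalcPCs amounts to a weighted combination of the loaded vectors. A minimal sketch of that idea, assuming one weight per map-file entry (the actual Java method may structure this differently; the names below are illustrative):

```python
def calculate_pcs(weights, pc_vectors):
    """Sketch of the weighted combination calculatePCs performs:
    each map-file entry contributes its PC vector scaled by its weight."""
    n = len(pc_vectors[0])
    result = [0.0] * n
    for w, vec in zip(weights, pc_vectors):
        for i, v in enumerate(vec):
            result[i] += w * v
    return result
```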

Currently, map files already support multiple expressions, and these are parsed properly by CalcPCs. What is necessary, then, is to receive expression probabilities and combine these with phoneme probabilities, sending the results in the correct order to CalcPCs. For example, if the entries Ahappy, Aneutral, AAhappy, AAneutral appear in that order in the map file, then ArtisynthManager must send (prob. A) * (prob. happy), (prob. A) * (prob. neutral), (prob. AA) * (prob. happy), (prob. AA) * (prob. neutral) as weights to CalcPCs' calculatePCs method, in that order. This will require:

i) An additional inlet in Artisynth Manager to receive expression probabilities and store them in a float array
ii) A multiplication of each phoneme probability by each expression probability, in the order given by the map file as described above
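The ordering rule above (phoneme varies slowest, expression fastest, matching the Ahappy, Aneutral, AAhappy, AAneutral example) can be sketched as a nested product. Function and argument names are illustrative:

```python
def combine_probabilities(phoneme_probs, expression_probs):
    """Combine probabilities in map-file order: for each phoneme (outer loop),
    multiply by each expression probability (inner loop)."""
    return [p * e for p in phoneme_probs for e in expression_probs]
```

Since each list of probabilities sums to 1, the combined weights also sum to 1, which keeps the weighted PC combination well scaled.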

The map file already contains entries for different expressions, and the parser (located in the loadMapping method of the CalcPCs object) already loads each entry in the order in which it appears. The calculatePCs method takes an array of weight values, one for each map file entry, and applies them to the vectors described by those entries.