Standard and pre-standard representation formats in the SEMAINE system

In view of future interoperability and reuse of components, the SEMAINE API aims to use standard representation formats where that seems possible and reasonable. For example, results of analysis components can be represented using EMMA (Extensible Multi-Modal Annotation), a World Wide Web Consortium (W3C) Recommendation. Input to a speech synthesiser can be represented using SSML (Speech Synthesis Markup Language), also a W3C Recommendation. Several other relevant representation formats are not yet standardised, but are in the process of being specified. These include the Emotion Markup Language (EmotionML), used for representing emotions and related states in a broad range of contexts, and the Behaviour Markup Language (BML), which describes the behaviour to be shown by an Embodied Conversational Agent (ECA). Furthermore, a Functional Markup Language (FML) is under discussion, intended to represent the planned actions of an ECA on the level of functions and meanings. By implementing draft versions of these specifications, the SEMAINE API can provide hands-on input to the standardisation process, which may contribute to better standard formats.
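As an illustration of what a standard format looks like in practice, the following is a minimal sketch of building an SSML document as input for a speech synthesiser. It uses only the Python standard library and is not taken from the SEMAINE API; the function name `build_ssml` and the default language tag are assumptions made for this example. The namespace URI and the `version` and `xml:lang` attributes follow the W3C SSML Recommendation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML Recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
# Predefined namespace for xml:lang and similar attributes.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialise SSML elements without a prefix (as the default namespace).
ET.register_namespace("", SSML_NS)


def build_ssml(text, lang="en-GB"):
    """Return a minimal SSML document containing `text` (hypothetical helper)."""
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    speak.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello, nice to meet you."))
```

A component receiving such a message can parse it with any namespace-aware XML parser; it only needs to know the SSML vocabulary, not the component that produced the message.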

On the other hand, it seems difficult to define a standard format for representing the concepts inherent in a given application's logic. To be generic, such an endeavour would ultimately require an ontology of the world. The current SEMAINE system does not aim at any sophisticated reasoning over domain knowledge. It therefore uses a simple custom format named SemaineML to represent those pieces of information that are required in the system but cannot be adequately represented in an existing or emerging standard format. It is conceivable that other applications built on top of the SEMAINE API may want to use a more sophisticated representation, such as the Resource Description Framework (RDF), to represent domain knowledge, in which case the API could be extended accordingly.

Whereas all of the aforementioned representation formats are based on the Extensible Markup Language (XML), a number of data types are naturally represented in other formats. This is particularly the case for data close to the input and output components. At the input end, low-level analyses of human behaviour are often represented as feature vectors. At the output end, the input to a player component is likely to include binary audio data or player-specific rendering directives.
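The difference between a string and a binary feature vector can be sketched as follows. This is an illustration only: the layouts shown here (one "name value" pair per line for the text form, little-endian 32-bit floats for the binary form) and the feature names are assumptions made for this example, not the wire format defined by the SEMAINE API.

```python
import struct


def encode_text(names, values):
    """Text form: one 'name value' pair per line (assumed layout)."""
    return "\n".join(f"{name} {value!r}" for name, value in zip(names, values))


def decode_text(payload):
    """Parse the text form back into a name -> value mapping."""
    pairs = (line.split(" ", 1) for line in payload.splitlines())
    return {name: float(value) for name, value in pairs}


def encode_binary(values):
    """Binary form: little-endian 32-bit floats, no names (assumed layout)."""
    return struct.pack(f"<{len(values)}f", *values)


def decode_binary(payload):
    """Unpack the binary form; the receiver must know the feature order."""
    count = len(payload) // 4  # 4 bytes per 32-bit float
    return list(struct.unpack(f"<{count}f", payload))


if __name__ == "__main__":
    names = ["f0", "energy"]   # hypothetical feature names
    values = [120.0, 0.25]
    print(encode_text(names, values))
    print(encode_binary(values).hex())
```

The text form is self-describing and easy to inspect, while the binary form is more compact but requires sender and receiver to agree on the feature order beforehand.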

The following table gives an overview of the representation formats currently supported in the SEMAINE API. The row headings link to pages describing the respective representation format.

Type of data                 Representation format             Standardisation status
---------------------------  --------------------------------  ----------------------
Low-level input features     string or binary feature vectors  ad hoc
Analysis results             EMMA                              W3C Recommendation
Emotions and related states  EmotionML                         W3C Working Draft
Domain knowledge             SemaineML                         ad hoc
Speech synthesis input       SSML                              W3C Recommendation
Functional action plan       FML                               very preliminary
Behavioural action plan      BML                               draft specification
Low-level output data        binary audio, player commands     player-dependent
Last modified on 10/18/10 14:48:10