
Configuring the SEMAINE system and its components

Many aspects of the SEMAINE system can be influenced through configuration files. To modify the system behaviour, it suffices to change the relevant values in the respective configuration file and to restart the system. In particular, it is not necessary to recompile the source code for these changes to take effect.

System manager and java component runner

The java part of the SEMAINE system is highly configurable. Whereas in C++, different executables provide a hard-coded bundle of components, the presence of reflection in Java makes it possible to define the configuration of a given java process in a configuration file, and to load the actual components at runtime.

Therefore, a single entry point exists for all Java components in the SEMAINE API: the eu.semaine.system.ComponentRunner.

ComponentRunner

The simplest way to start the component runner is through a simple script file, such as the start scripts semaine-speech2face.sh and semaine-speech2face.bat. In addition to the memory available to the java process and the configuration settings for ActiveMQ, the ComponentRunner main method requires the user to provide as a command line parameter the configuration file to use. The syntax of that file is described in the following.

Java config file

The key place for the configuration of the Java subsystem is a system config file, such as SEMAINE-3.1/java/config/speech2face.config.

The java config file lists the components to be loaded, and can contain any system properties that may be accessed by the java components. In particular, it points to additional config files.

The following configuration settings can be set directly in the main java configuration file.

  • semaine.components lists the components to be loaded. It is usually a multi-line entry with one component class listed per line. The following is the entry from speech2face.config:
semaine.components = \
    |eu.semaine.components.meta.SystemManager| \
    |eu.semaine.components.dialogue.interpreters.EmotionInterpreter| \
    |eu.semaine.components.dialogue.interpreters.TurnTakingInterpreter| \
    |eu.semaine.components.dialogue.interpreters.UtteranceInterpreter| \
    |eu.semaine.components.dialogue.interpreters.NonVerbalInterpreter| \
    |eu.semaine.components.dialogue.interpreters.AgentMentalStateInterpreter| \
    |eu.semaine.components.dialogue.actionproposers.UtteranceActionProposer| \
    |eu.semaine.components.dialogue.test.TestGui| \
    |eu.semaine.components.mary.SpeechPreprocessor| \
    |eu.semaine.components.mary.SpeechBMLRealiser| \
    |eu.semaine.components.mary.QueuingSpeechPreprocessor| \
    |eu.semaine.components.mary.QueuingSpeechBMLRealiser| \
    |eu.semaine.components.control.ParticipantControl| \
    |eu.semaine.components.MessageLogComponent($semaine.messagelog.topic, $semaine.messagelog.messageselector)| \
    |eu.semaine.components.emotion.EmotionFusion| \
    |eu.semaine.components.nonverbal.NonverbalFusion| \
    |eu.semaine.components.testing.StateLogger| \
    |eu.semaine.components.testing.AgentBehaviourObserver| \
    |eu.semaine.components.dialogue.interpreters.UserPresenceInterpreter| \
    

A minimalistic semaine.components entry is found in semaine-message-logger-only.config:

semaine.components = \
    |eu.semaine.components.meta.SystemManager| \
    |eu.semaine.components.MessageLogComponent($semaine.messagelog.topic, $semaine.messagelog.messageselector)| \
	

The meaning of the lines is the following:

    |eu.semaine.components.meta.SystemManager| \

This line instantiates an object of the class eu.semaine.components.meta.SystemManager using the default constructor with no arguments.

The following line instantiates eu.semaine.components.MessageLogComponent with a constructor taking two string arguments:

    |eu.semaine.components.MessageLogComponent($semaine.messagelog.topic, $semaine.messagelog.messageselector)| \

It would be possible to provide literal string values inline as parameters to the constructor. Here, however, the values are prefixed by the special character $, which indicates that the actual values are to be read from property entries in the file. This makes it possible to change the system configuration by commenting / uncommenting entries in the config file:

# Show messages in all topics:
semaine.messagelog.topic = semaine.data.> \
	semaine.callback.>

# Show only dialog state messages:
#semaine.messagelog.topic = semaine.data.state.dialog

# Show all messages, i.e. periodic and event-based ones:
#semaine.messagelog.messageselector =
# Show only event-based messages:
semaine.messagelog.messageselector = event IS NOT NULL

The mechanism for instantiating components with a flexible number of string parameters in the constructor is generic. It is used here with the MessageLogComponent.
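
As a rough illustration of how such an entry could be processed, the following Java sketch parses a components list, resolves $-prefixed arguments against the property entries, and instantiates each class by reflection through a constructor taking the matching number of String parameters. It is not the actual ComponentRunner code: the class name ComponentSpecSketch, the keys demo.components, demo.dir and demo.file, and the use of JDK classes as stand-in components are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the actual ComponentRunner implementation.
final class ComponentSpecSketch {
    // One |...| entry: a class name, optionally followed by (arg, arg, ...).
    private static final Pattern ENTRY =
            Pattern.compile("\\|\\s*([\\w.]+)\\s*(?:\\(([^)]*)\\))?\\s*\\|");

    // Parses config text in java.util.Properties syntax (a backslash continues a line).
    static Properties load(String configText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(configText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // "$key" is looked up in the properties; anything else is taken as a literal value.
    static String resolve(String arg, Properties props) {
        String a = arg.trim();
        if (!a.startsWith("$")) {
            return a;
        }
        String value = props.getProperty(a.substring(1));
        if (value == null) {
            throw new IllegalArgumentException("No property entry for " + a);
        }
        return value.trim();
    }

    // Instantiates every entry through a public constructor that takes only Strings.
    static List<Object> instantiateAll(Properties props, String listKey) {
        List<Object> components = new ArrayList<>();
        Matcher m = ENTRY.matcher(props.getProperty(listKey, ""));
        while (m.find()) {
            String argList = m.group(2) == null ? "" : m.group(2).trim();
            String[] raw = argList.isEmpty() ? new String[0] : argList.split(",");
            Class<?>[] types = new Class<?>[raw.length];
            Object[] args = new Object[raw.length];
            for (int i = 0; i < raw.length; i++) {
                types[i] = String.class;
                args[i] = resolve(raw[i], props);
            }
            try {
                components.add(Class.forName(m.group(1)).getConstructor(types).newInstance(args));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + m.group(1), e);
            }
        }
        return components;
    }

    public static void main(String[] args) {
        // Stand-in components from the JDK: one no-argument and one two-String constructor.
        Properties props = load(
                "demo.components = \\\n"
              + "    |java.lang.StringBuilder| \\\n"
              + "    |java.io.File($demo.dir, $demo.file)|\n"
              + "demo.dir = config\n"
              + "demo.file = speech2face.config\n");
        for (Object component : instantiateAll(props, "demo.components")) {
            System.out.println(component.getClass().getName());
        }
    }
}
```

Because the constructor is looked up by the number of String arguments, the same mechanism covers both the no-argument SystemManager style entry and the two-argument MessageLogComponent style entry.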

  • semaine.messagelog.topic is a configuration setting for the message log component to indicate the list of topics or topic hierarchies for which messages should be sent to the log mechanism.
  • semaine.messagelog.messageselector is a configuration setting for the message log component to filter messages before sending them to the log mechanism. The syntax is the JMS message selector syntax.
  • semaine.systemmanager.gui is a boolean property which determines whether the system manager will show a system monitor GUI or not.
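
To illustrate how a topic list such as the semaine.messagelog.topic value above selects messages, the following Java sketch checks a topic name against a whitespace-separated list of patterns, following the ActiveMQ destination wildcard conventions ('.' separates names, '*' matches exactly one name, '>' matches everything from that point on). The class and method names are hypothetical, treating '>' as also matching zero remaining names is an assumption of this sketch, and in the running system the matching is done by the middleware rather than by code like this.

```java
// Illustrative sketch only; the real matching is performed by the message broker.
final class TopicFilterSketch {
    // Returns true if the topic name is covered by a single wildcard pattern.
    static boolean matches(String pattern, String topic) {
        String[] p = pattern.split("\\.");
        String[] t = topic.split("\\.");
        for (int i = 0; i < p.length; i++) {
            if (p[i].equals(">")) {
                return true; // everything from here on is accepted
            }
            if (i >= t.length) {
                return false; // topic is shorter than the pattern
            }
            if (!p[i].equals("*") && !p[i].equals(t[i])) {
                return false; // literal name differs
            }
        }
        return p.length == t.length;
    }

    // Returns true if any pattern in a whitespace-separated list covers the topic.
    static boolean anyMatch(String patternList, String topic) {
        for (String pattern : patternList.trim().split("\\s+")) {
            if (matches(pattern, topic)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String configured = "semaine.data.> semaine.callback.>";
        System.out.println(anyMatch(configured, "semaine.data.state.dialog"));
        System.out.println(anyMatch("semaine.data.state.dialog", "semaine.data.state.dialog"));
    }
}
```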

If the system monitor GUI is shown, it will display a message flow graph. As the system becomes more complex, this graph becomes increasingly difficult to read. To simplify the graph, the following settings can be used to hide unnecessary detail.

  • semaine.systemmanager.hide.components provides a list of component names which should not be shown in the message flow graph. The names must match what the component's getName() method returns.
  • semaine.systemmanager.hide.topics provides the same functionality for Topics: any topic listed here will not be shown in the message flow graph.
  • semaine.systemmanager.gui.topics_to_ignore_when_sorting can be used to exclude certain topics when computing the layout of the components in the message flow graph.

The java components use log4j for logging. log4j can be extensively configured using a specific configuration file.

  • semaine.log provides the location of the log4j configuration file to use. If semaine.log is not set, the file log4j.properties in the same directory as the main java config file is used. For the config files shipping with SEMAINE, this is SEMAINE-3.1/java/config/log4j.properties. The system property log4j.logger.semaine can be used to override the log output specification of the semaine log messages; for example,
java -Dlog4j.logger.semaine=DEBUG,stderr ...

will log semaine messages of level DEBUG or higher to standard error. This is useful for temporarily starting the system with a log setting different from the default without having to change the log4j.properties configuration file.

Embedded ActiveMQ

  • semaine.use.embedded.broker is a boolean property which determines whether the java process will start an embedded ActiveMQ broker. If this is set to true, the ActiveMQ message-oriented middleware will be part of the java process; if it is set to false, an external ActiveMQ server must be started. See also ConfiguringActiveMQ.

User presence interpreter

A number of settings for the user presence interpreter component can be used to tweak how the presence or absence of a user is computed. For voice activity, face detection, and system utterances, thresholds can be set in milliseconds; only when times exceed these thresholds will the state of user presence change. The following are the values currently used in the speech2face.config file:

# For the user presence interpreter, set the thresholds (in milliseconds)
# which need to be exceeded before a certain event impacts user presence:
semaine.UserPresence.threshold.voiceAppeared = 1000
semaine.UserPresence.threshold.voiceDisappeared = 20000
semaine.UserPresence.threshold.faceAppeared = 0
semaine.UserPresence.threshold.faceDisappeared = 3000
semaine.UserPresence.threshold.systemStoppedSpeaking = 10000
semaine.UserPresence.threshold.externalUserPresence = 60000

These values mean that voice activity needs to go on for at least one second before a user is deemed present, whereas face presence triggers user presence immediately; if the face disappears for more than three seconds, or the voice disappears for more than 20 seconds, the user is deemed absent. User absence decisions are suspended while the system is speaking and for 10 seconds after the system has finished speaking. If an external source determines user presence (such as a GUI control), this value is respected for 60 seconds.
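
The threshold idea can be sketched in Java as follows: an observation only affects user presence once it has lasted at least as long as the configured threshold. This is a minimal sketch and not the code of the user presence interpreter; the class and method names are hypothetical, and the use of "at least" (rather than "strictly more than") at the boundary is an assumption.

```java
import java.util.Properties;

// Illustrative sketch only; not the UserPresenceInterpreter implementation.
final class PresenceThresholdSketch {
    static final String PREFIX = "semaine.UserPresence.threshold.";

    // Reads the threshold (in milliseconds) configured for the given event name.
    static long thresholdMs(Properties config, String event) {
        String value = config.getProperty(PREFIX + event);
        if (value == null) {
            throw new IllegalArgumentException("No threshold configured for " + event);
        }
        return Long.parseLong(value.trim());
    }

    // An event takes effect once it has been observed for at least the threshold.
    static boolean takesEffect(Properties config, String event, long observedMs) {
        return observedMs >= thresholdMs(config, event);
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(PREFIX + "voiceAppeared", "1000");
        config.setProperty(PREFIX + "faceAppeared", "0");
        config.setProperty(PREFIX + "faceDisappeared", "3000");

        System.out.println(takesEffect(config, "voiceAppeared", 400));    // half a second of voice
        System.out.println(takesEffect(config, "faceAppeared", 0));       // face just detected
        System.out.println(takesEffect(config, "faceDisappeared", 3500)); // face gone for 3.5 s
    }
}
```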

Pointers to other config files

Several config file entries identify specific configuration files to be used in different contexts:

  • semaine.character-config points to the character config file (see below)
  • semaine.stateinfo-config points to the state information config file (see below)
  • semaine.DM-config points to the dialogue manager config file (see below).

Character config file

The character config file (e.g., SEMAINE-3.1/java/config/character-config.xml) contains the definition of the characters' properties, including the TTS voices they can use, their emotional predispositions, and their propensity to take the turn.

In a future version this file may also refer to the facial models that should be used for the visual appearance of the character, as well as any other character properties.

State info config file

The stateinfo config file (e.g., SEMAINE-3.1/java/config/stateinfo.config) is the backbone of communicating state information between the system components. It defines the short names of all information items (anything the system knows about the current state of the user, the agent, the dialog, and the context), and defines how they are encoded and decoded in XML for communicating state within the system. For more details, see StateInfo.

In order to use a new information item in the code, it is sufficient to add it to the stateinfo.config file and make sure all producers and consumers of this information use the revised stateinfo.config file.

C++ components that use state information currently expect stateinfo.config to be in the same folder as the executable binary.

Dialog manager config file

The dialog manager config file (e.g., SEMAINE-3.1/java/config/DM.config) identifies the dialogue templates that are used to drive the verbal behaviour of the agents. By adding or removing template files from the entry “template_files”, the user can change the dialog strategies used.

For example, depending on the intended steps when changing from one character to another, exactly one of the following three template files should be included in the templates list. After finishing a dialog with one of the SAL characters:

  • /eu/semaine/components/dialogue/data/templates/CharChangeModeratorEval.xml will bring up a moderator character asking evaluation questions about the user's perception of the quality of the interaction, and then introduce the next character;
  • /eu/semaine/components/dialogue/data/templates/CharChangeModerator.xml will bring up the moderator character who will directly introduce the next character;
  • /eu/semaine/components/dialogue/data/templates/CharChange.xml will directly change from one character to the next without showing the moderator as intermediary.

The paths shown are interpreted as classpath locations, i.e. the respective files are expected to be in one of the jar files loaded when starting the system or as substructures in a directory that is included in the classpath when starting the system.
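
A quick way to check whether such a path can be resolved from the classpath is a lookup with Class.getResource, as in the following Java sketch. The class name ClasspathTemplateCheck is hypothetical, and whether the dialogue manager uses this exact call to load its templates is an assumption; the sketch only shows what "interpreted as a classpath location" means in practice.

```java
import java.net.URL;

// Illustrative sketch only; checks whether a classpath location can be resolved.
final class ClasspathTemplateCheck {
    // Returns the URL the absolute classpath path resolves to, or null if it is not found.
    static URL locate(String classpathPath) {
        return ClasspathTemplateCheck.class.getResource(classpathPath);
    }

    public static void main(String[] args) {
        String path = "/eu/semaine/components/dialogue/data/templates/CharChange.xml";
        URL url = locate(path);
        // Found only if a jar or directory containing the file is on the classpath.
        System.out.println(url != null ? "found: " + url : "not on classpath: " + path);
    }
}
```

Run with the same classpath as the system start script, a null result indicates that the template file is in neither a loaded jar file nor a classpath directory.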

Listener behaviour

An xml file is used to configure the listener behaviour system. It is located in SEMAINE-3.0/Greta/listener/ASconfig.xml. It contains entries such as:

   <character name="Poppy" mimicry="0.5" backchannel="0.5" noutterance="1">
       <respondTo head="true" face="true" acoustic="true"/>
   </character>

For each character it defines the probability of generating mimicry, response backchannel (based on the agent's mental state) and utterances.

In order to switch off mimicry, set the mimicry attribute to 0 and the backchannel to 1. Conversely, to generate only mimicry behaviour, set the mimicry attribute to 1 and the backchannel to 0. Keep them at 0.5 for a similar quantity of mimicry and response backchannels.

The utterance attribute (noutterance in the example above) allows you to block the utterances coming from the dialogue manager. It is not a mandatory attribute; if it is absent, it is automatically set to 1 (all sentences are let through).

The agent's responsiveness to head, face or acoustic signals is determined by the attributes of the <respondTo> element. Setting an attribute to "true" means that the listener intent planner generates signals for that modality (head, face, acoustic). This tag can be used without looking into the rules.
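
The following Java sketch reads these settings from a single character element using the JDK's DOM parser, applying the default of 1 when the utterance attribute is missing. It is only an illustration: the class and method names are hypothetical, the attribute is assumed to be the noutterance attribute from the example above, and the element is parsed on its own because the overall structure of ASconfig.xml is not described here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Illustrative sketch only; not the listener intent planner's own parsing code.
final class ListenerConfigSketch {
    // Parses an XML string and returns its root element.
    static Element parseCharacter(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse character element", e);
        }
    }

    // Reads a numeric attribute, falling back to a default when it is absent.
    static double attr(Element element, String name, double fallback) {
        return element.hasAttribute(name) ? Double.parseDouble(element.getAttribute(name)) : fallback;
    }

    // True if the respondTo child element enables the given modality.
    static boolean respondsTo(Element character, String modality) {
        Element respondTo = (Element) character.getElementsByTagName("respondTo").item(0);
        return respondTo != null && Boolean.parseBoolean(respondTo.getAttribute(modality));
    }

    public static void main(String[] args) {
        Element poppy = parseCharacter(
                "<character name=\"Poppy\" mimicry=\"0.5\" backchannel=\"0.5\">"
              + "<respondTo head=\"true\" face=\"true\" acoustic=\"true\"/>"
              + "</character>");
        System.out.println("mimicry     = " + attr(poppy, "mimicry", 0.0));
        System.out.println("backchannel = " + attr(poppy, "backchannel", 0.0));
        // The attribute is absent in this input, so the default of 1 applies.
        System.out.println("noutterance = " + attr(poppy, "noutterance", 1.0));
        System.out.println("face        = " + respondsTo(poppy, "face"));
    }
}
```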

The individual rules that trigger and determine listener behaviour are quite straightforward. They describe the signals the agent reacts to and how. They are located in SEMAINE-3.0/Greta/listener/rulesfile.xml. For example:

   <rule name="trigger-AU12">
       <usersignals>
           <usersignal id="1" name="AU12" modality="face"/>
       </usersignals>
       <backchannels probability="1.0" priority="2">
           <mimicry probability="0.6">
               <mimicry_signal name="mouth=smile" modality="face"/>
           </mimicry>            
           <response_reactive probability="0.4"/>
       </backchannels>
   </rule>

This rule is triggered when AU12 is detected. It can generate a mimicry signal (a smile on the face modality) or a response backchannel. The probabilities in the rules should be ignored: they are no longer used since the action selection has been implemented.

So, in order to prevent an agent from responding to a signal, it suffices to delete the associated rule.

Speech input component configurations

The technical details of the opensmile system are determined by a config file such as SEMAINE-3.0/Opensmile/conf/opensmileSemaine3a.conf or c++/src/tum/auxiliary/conf/opensmileSemaine3a.conf (in the source release or the SVN trunk version). The actual config file used is included in the call to Opensmile's SEMAINExtract executable, e.g. in SEMAINE-3.0/Opensmile/semaine-openSMILE-win5-run.bat (for Windows XP systems) and SEMAINE-3.0/Opensmile/semaine-openSMILE-win6-run.bat (for Windows Vista and above) or in bin/semaine-openSMILE-run.bat (in the source release or the SVN trunk version).

The top-level configuration file includes several other configuration files, which cover individual sub-tasks. Comments in the file explain which includes need to be enabled in order to run which configuration. The most important examples are illustrated here:

  • Voice activity detector

Two voice activity detectors exist. A simple detector which uses a fixed signal energy threshold can be enabled via the configuration file opensmileSemaineVADsimple.conf. The threshold for the RMS frame energy must be adjusted to your setup in the file opensmileSemaineVADsimple.conf by changing the “threshold = xxxx” option in the section [turn:cTurnDetector]. Typical values range from 0.001 to 0.1.
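
To give a feeling for what a fixed RMS energy threshold does, the following Java sketch computes the root mean square of one frame of samples and compares it with a threshold. It is not openSMILE code: the class name is hypothetical, and the assumption that samples are scaled to the range -1 to 1 (which would be consistent with the typical threshold values quoted above) is made only for this example.

```java
// Illustrative sketch only; not the openSMILE turn detector.
final class EnergyVadSketch {
    // Root mean square of the samples in one frame (0 for an empty frame).
    static double rms(double[] frame) {
        if (frame.length == 0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (double sample : frame) {
            sumOfSquares += sample * sample;
        }
        return Math.sqrt(sumOfSquares / frame.length);
    }

    // A frame counts as voice activity when its RMS energy exceeds the fixed threshold.
    static boolean isVoiced(double[] frame, double threshold) {
        return rms(frame) > threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.01; // within the typical range of 0.001 to 0.1
        double[] quiet = {0.001, -0.002, 0.001, -0.001};
        double[] loud = {0.2, -0.3, 0.25, -0.2};
        System.out.println("quiet frame voiced: " + isVoiced(quiet, threshold));
        System.out.println("loud frame voiced:  " + isVoiced(loud, threshold));
    }
}
```

A threshold that is too low lets background noise count as speech, whereas one that is too high misses quiet speakers, which is why the value has to be tuned to the microphone and room.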

The advanced, self-adapting voice activity detector is configured via opensmileSemaineVAD2.conf. Details are found in the D2b report. If the agent voice from the speakers causes problems (feedback) after the system has been running for a certain time, uncomment the line “alwaysRejectAgent = 1” (remove the “;”) in the section [vad:cRnnVad] in opensmileSemaineVAD2.conf.

  • Speech recognition

By default, the single-stream, HMM-only recogniser is enabled. On faster systems one may want to try the multi-stream LSTM/HMM hybrid architecture. This can be enabled by uncommenting the line

;\{opensmileSemaineASRms.conf}

in opensmileSemaine3a.conf. Be sure to disable the single stream recogniser by commenting out the line which includes opensmileSemaineASR.conf.

To disable the speech recognition (words and non-linguistics), comment out both opensmileSemaineASR.conf and opensmileSemaineASRms.conf.

  • Emotion recognition

The emotion recognition is split into three configuration files: feature extraction (e.g. conf_B/opensmileSemaineEmoftAc.conf for feature Set B; see D3c for details on the feature sets), dimensional emotion recognition (e.g. conf_B/opensmileSemaineEmoBling.conf for acoustic and linguistic features or conf_B/opensmileSemaineEmoBsel.conf for acoustic features only), and detection of the user's level of interest (e.g. conf_B/opensmileSemaineIntB.conf). To disable either the dimensional affect recognition or the interest recognition, comment out the corresponding line. If both are disabled, the line including the emotion feature extraction configuration should also be commented out.

Note: The feature extraction, dimensional affect recognition and interest detection configuration files from different feature sets (A, B, and C) must not be mixed.

Further, the SEMAINExtract executable supports the command-line options -noWords, -noNonverbals, and -noInterest to selectively disable the semaine components and their functionality as they appear in the GUI. Note that with these options the information is still computed and displayed in the openSMILE debug output; it is just not sent to the semaine system while the component is disabled. Conversely, if the extraction of certain information is disabled in the opensmileSemaine3a.conf file, that information will not be sent, even if the sender component is enabled and thus still visible.

Video input component configurations

The video input components, when installed, can be configured using C:\Program Files\iBUG\Semaine Video Components\videoConfig.cfg. Among other things, this config file determines whether a USB or Firewire camera is used.

Below all options are listed in the format of the config file, with their default settings:

# -- Visualisation: ON / OFF.
visualisation=ON

This determines whether a visualisation of the face detection, 2D-head motion, eye detection, face normalisation, and detected facial points is shown. N.B. If you set smallvisualisation below, not all this information will be displayed because of the lack of screen real estate.

# -- Set cameraNr if you use OpenCV. Default is 0. Only useful if you have multiple cameras attached to your machine
cameranr=0

# -- The directory with all video models
modeldir=C:\Program Files\iBUG\Semaine Video Components\models\
# -- Size of visualisation. ON= small, OFF = big
smallvisualisation=ON
# -- Grabbertype: set to 0 for OpenCV, 1 for cmu1394
grabbertype=0 

N.B. You should test for yourself whether your camera works with OpenCV. The cmu1394 driver supports most firewire cameras.

# -- turning on the nod-shake analysis module
nodshake=ON
# -- turning on the 5D emotion analysis module from head gestures 
nodshakedimaff=ON
# -- turning on the head tilt analysis module 
tilt=ON
# -- turning on the LBP-based AU detection module   
laud=ON
# -- turning on the face registeration module
useFaceRegistration=ON
# -- turning on the search for profile face module
searchProfileFace=ON
# -- turning on the user presence module 
useUserPresence=OFF

N.B. This userPresence module is superseded by a java component. It's only here for backwards compatibility.

# -- Parameter used by nod/shake detector (window-width)
nodshakewindowSize=20
# -- Turn facial point tracking on/off
facialPointTracking=OFF
# -- Set data directory for facial point tracker
trackdir=c:\temp

N.B. Unless you have an exceptionally powerful machine, tracking will not work: the frame rate will be too low, and the appearance changes between frames will therefore be too large.
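
If the settings listed above need to be inspected from a tool or script, the file format is simple enough to read by hand. The following Java sketch treats lines starting with # as comments and splits the remaining lines at the first = sign; the class and method names are hypothetical, and this is not the parser used by the video components. Note that java.util.Properties is deliberately not used here, because it interprets backslashes as escape characters and would therefore corrupt Windows paths such as the modeldir value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; not the parser used by the video components.
final class VideoConfigSketch {
    // Collects key=value pairs, skipping blank lines and # comments.
    static Map<String, String> parse(String text) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                continue; // not a key=value line
            }
            // No escape processing, so backslashes in paths are kept as they are.
            settings.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return settings;
    }

    // Interprets the ON / OFF convention used for the module switches.
    static boolean isOn(Map<String, String> settings, String key) {
        return "ON".equalsIgnoreCase(settings.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> settings = parse(
                "# -- turning on the nod-shake analysis module\n"
              + "nodshake=ON\n"
              + "# -- Turn facial point tracking on/off\n"
              + "facialPointTracking=OFF\n"
              + "trackdir=c:\\temp\n");
        System.out.println("nodshake enabled: " + isOn(settings, "nodshake"));
        System.out.println("tracking enabled: " + isOn(settings, "facialPointTracking"));
        System.out.println("trackdir: " + settings.get("trackdir"));
    }
}
```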

MARY TTS configuration

The folder SEMAINE-3.1/MARY contains a full installation of the MARY TTS text-to-speech framework, release 4.2.0. Voices that ship with the release are the four voices for the SAL characters and a generic US English voice for the moderator character Greta.

It is possible to use the mary-component-installer to install and uninstall languages and voices into this instance of MARY TTS:

SEMAINE-3.1\MARY\bin\mary-component-installer.bat (windows)

SEMAINE-3.1/MARY/bin/mary-component-installer (linux/mac)

It is advisable to only install the voices that are needed, because more voices mean a higher memory footprint and longer startup times. In order to use the new voices, the java config file needs to point to a properly configured character config file. The SVN repository contains as an example a multilingual character config file, at tags/3.1.0/java/config/character-config-multilingual.xml.

The speech synthesis components are configured using a number of configuration files in SEMAINE-3.1/MARY/conf. The most important config file is marybase.config; its most important setting is “cache = false”. In order to switch on TTS caching to speed up the system, change this to “cache = true”.

History

This document draws upon and expands SEMAINE project deliverable D1d: Final SAL system

Last modified on 12/16/10 16:56:51