SEMAINE-2.0: The first full implementation of an autonomous Sensitive Artificial Listener system
Released 22 December 2009.
The aim of the SEMAINE project is to build a Sensitive Artificial Listener (SAL) – a multimodal dialogue system with the social interaction skills needed for a sustained conversation with a human user. SEMAINE-2.0 is the first full implementation of a SAL. This video illustrates the concept.
The open source parts of the system can be downloaded from the SEMAINE SourceForge project page. Two packages are required:
- SEMAINE-2.0-windows (~100 MB) includes binary versions of the Greta agent components, the Opensmile speech analysis components, and the message-oriented middleware ActiveMQ.
- SEMAINE-2.0-java (~650 MB) includes the System manager component, the dialogue components, and the speech synthesizer MARY TTS. The file is so large because the speech synthesizer comes with four high-quality TTS voices which need a lot of space.
To run the SEMAINE-2.0 system, both the windows and the java package are required. They can run together on a fast machine (tested on a laptop with a 2.53 GHz Core2Duo CPU with 4 GB RAM), or you can set up SEMAINE-2.0 as a distributed system.
- SEMAINE-2.0-source is an optional package which can be used to compile the system for Linux or Mac OS X, or to rebuild the windows components from source.
The video analysis components are distributed as closed-source freeware from http://www.doc.ic.ac.uk/~maja/ (search for "SEMAINE Visual Components" on that page). Watch out for the camera driver requirements if you are using a Firewire camera. If they are installed in their default location, the SEMAINE-2.0-windows start.bat script will detect them and try to run them. Since they are computationally heavy, you may need an additional computer to run them.
The SEMAINE-2.0 system works without the video analysis components, but it will then pick up less information from the user.
Running the system
In its simplest form, the system can be run on a single (fast) Windows Vista machine by installing all system components on the same computer as described above. The system is then run by starting the start.bat batch file:
This will start ActiveMQ, wait until it is started, and then start all other installed components. If the system does not start correctly, double-check that you have unpacked both the windows and the java components in the same folder, and that you have met all the requirements.
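A common cause of startup failure is that the two packages do not sit in the same parent folder. The following POSIX-shell sketch checks that layout; the folder names are assumptions based on the package names above, and on Windows the same check can of course be done in Explorer or a batch file:

```shell
#!/bin/sh
# Illustrative sanity check: the windows and the java packages must be
# unpacked into the same parent folder. Folder names are assumed from
# the package names and may differ in your installation.
check_dir() {
    if [ -d "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1 - unpack the corresponding package here"
    fi
}

check_dir SEMAINE-2.0-windows
check_dir SEMAINE-2.0-java
```

Run it from the folder where you unpacked the archives; any "missing" line points at the package that still needs to be unpacked there.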
To stop all components of the system, call the corresponding stop script.
The system with the windows and java open source components runs acceptably on a 2.53 GHz Core2Duo with 4 GB RAM. When the video analysis components are added on the same machine, the system still runs but becomes very slow. It is therefore recommended to run SEMAINE-2.0 as a distributed system across several computers.
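Since all components communicate through the ActiveMQ message-oriented middleware, a distributed setup amounts to running one broker and pointing every machine's components at it. The sketch below illustrates the idea for two machines; the jms.url property name and the run-java-components.sh script name are illustrative assumptions, not the documented interface, so check the SEMAINE documentation for the actual configuration:

```shell
# Machine A (fast Windows machine): runs ActiveMQ plus the Greta and
# Opensmile components (and optionally the video analysis components).
start.bat

# Machine B: runs the Java components (system manager, dialogue, MARY TTS),
# pointing them at machine A's broker. 61616 is ActiveMQ's default port;
# the "jms.url" property and the script name are assumptions for illustration.
JAVA_OPTS="-Djms.url=tcp://machine-a:61616" ./run-java-components.sh
```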
The SEMAINE API for Java and C++, the SEMAINE dialogue components (in Java), and the speech synthesizer MARY TTS are distributed under the GNU Lesser General Public License (LGPL), version 3. The speech synthesis voices for the SAL agents are distributed under the Creative Commons ShareAlike - No Derivatives license.
The 3D agent animation software Greta and the speech analysis software Opensmile are distributed under the GNU General Public License (GPL).
The separately installable SEMAINE Video components for camera image analysis come as a freeware binary.
With some effort, the open source components can be rebuilt from source using the SEMAINE-2.0-source package.
Detailed documentation of the SEMAINE API is available in a number of documents:
- an upcoming journal article, "The SEMAINE API: Towards a standards-based framework for building emotion-oriented systems" (to appear in Advances in Human-Computer Interaction);
- the deliverable report D1c First full-scale SAL system.pdf;
- section 3 of the deliverable report D1b: First integrated system;
- the Javadoc of the SEMAINE API.
There is a public SEMAINE-users mailing list at https://lists.sourceforge.net/lists/listinfo/semaine-users. Feel free to ask questions there.
Detailed information on the system and the underlying software architecture can be found in the following set of public project deliverable reports:
- D1c First full-scale SAL system.pdf
- D2a Improved face and voice feature extraction.pdf
- D3a Human conversational signals analyser.pdf
- D3b Human affect analyser.pdf
- D4a Updated demo of the Dialogue Manager.pdf
- D5a SAL multimodal generation component.pdf
Furthermore, information about the data collected in the project can be found in the following report: