SEMAINE-1.0 speech to speech dialog system

Pre-condition for all demonstrator configurations

Start an ActiveMQ server.

If you intend to run the demonstrator distributed across several machines, see the page "Running a distributed system".
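
Before starting any SEMAINE component, it can help to confirm that a broker is actually listening. The following is a minimal sketch, not part of the SEMAINE distribution. It assumes the broker runs on localhost and uses port 61616, the default OpenWire port of a stock ActiveMQ configuration; the function name broker_reachable and the variables ACTIVEMQ_HOST and ACTIVEMQ_PORT are hypothetical names chosen for this example. It relies on bash's /dev/tcp redirection, so it needs bash rather than a plain POSIX shell.

```shell
#!/usr/bin/env bash
# Sketch: check that something accepts TCP connections on the broker port.
# Assumptions (not from the SEMAINE docs): broker on localhost, port 61616.

broker_reachable() {
    local host="${1:-localhost}"
    local port="${2:-61616}"
    # bash's /dev/tcp pseudo-device attempts a TCP connection. The subshell
    # closes the descriptor again when it exits, and its exit status tells
    # us whether the connection succeeded.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

if broker_reachable "${ACTIVEMQ_HOST:-localhost}" "${ACTIVEMQ_PORT:-61616}"; then
    echo "ActiveMQ port is reachable"
else
    echo "No broker reachable; start the ActiveMQ server first" >&2
fi
```

A successful connection only shows that the port is open, not that the listener is ActiveMQ, so treat this as a quick sanity check.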

Text input, speech output

The simplest configuration is a pure Java system in which the user types input into a GUI window and hears the system's response as speech output.

This configuration requires SEMAINE-1.0-java to be installed.

The system can be started as follows.

  • on Linux/Mac/Unix:
    SEMAINE-1.0/bin/semaine-speech2speech.sh
    
  • on Windows:
    SEMAINE-1.0\bin\semaine-speech2speech.bat 
    

Speech input, speech output

In this configuration, the user speaks to the system through a microphone, and the system response is produced via speech output.

This configuration requires both SEMAINE-1.0-java and SEMAINE-1.0-linux to be installed.

The system can be started as follows.

1. Start the Java component

  • on Linux/Mac/Unix:
    SEMAINE-1.0/bin/semaine-speech2speech.sh
    

2. Start the Linux components

Make sure you have compiled the Linux code before starting any of the components below.

At a minimum, start the SMILE component, which performs feature extraction, voice activity detection, and emotion/interest recognition. Start it as:

SEMAINE-1.0/bin/run_components/start_component_tum.smile

It is also advisable to start the Automatic Speech Recognition (ASR) component, so that the system has a chance of understanding what the user says. Note that the quality of ASR output is extremely limited at this stage because the training data is very preliminary. Start it as:

SEMAINE-1.0/bin/run_components/start_component_tum.asr
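
The two start commands above can be wrapped in one small launcher. This is a sketch under stated assumptions, not a script shipped with SEMAINE: it assumes the start scripts can be invoked from any working directory (this page does not confirm that), and SEMAINE_HOME, COMPONENT_DIR and start_component are hypothetical names introduced for the example.

```shell
#!/usr/bin/env bash
# Sketch: launch the SMILE and ASR components from a single shell.
# Assumption: SEMAINE_HOME (hypothetical) points at the SEMAINE-1.0 directory.

SEMAINE_HOME="${SEMAINE_HOME:-SEMAINE-1.0}"
COMPONENT_DIR="${SEMAINE_HOME}/bin/run_components"

start_component() {
    local script="${COMPONENT_DIR}/$1"
    # Refuse to continue if the start script is missing or not executable.
    if [ ! -x "$script" ]; then
        echo "Cannot run ${script}; check the installation and build" >&2
        return 1
    fi
    "$script" &                      # run the component in the background
    echo "started $1 (pid $!)"
}

start_component start_component_tum.smile
start_component start_component_tum.asr
wait                                 # block until the components exit
```

Because both components run in the background of the same shell, their messages are interleaved in one terminal. If you need to watch the SMILE output on its own, starting each component in a separate shell, as described above, remains the simpler option.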

Testing microphone level

The SMILE component only works properly if the microphone recording level is set to a reasonable value. To test this, watch the shell from which you started the SMILE component. When you start talking, the SMILE component should print a message saying "detected turn start"; when you stop talking, it should print a message saying "detected turn end". If no "turn start" messages appear, increase the recording volume; if no "turn end" messages appear, decrease the recording volume.
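
The check described above can also be run against captured output instead of watching the shell. The sketch below assumes only that the SMILE messages contain the literal phrases "detected turn start" and "detected turn end"; the rest of the log format is not specified on this page. The function name check_turn_messages is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarise turn detection messages found in SMILE output.
# Assumption: messages contain the literal phrases quoted on this page.

check_turn_messages() {
    # Reads component output on stdin, counts both message types, and
    # prints the volume adjustment suggested by the rule described above.
    awk '
        /detected turn start/ { starts++ }
        /detected turn end/   { ends++ }
        END {
            printf "turn starts: %d, turn ends: %d\n", starts, ends
            if (starts == 0)
                print "hint: no turn start seen, try increasing the recording volume"
            else if (ends == 0)
                print "hint: no turn end seen, try decreasing the recording volume"
            else
                print "hint: recording level looks reasonable"
        }'
}
```

One way to use it, assuming a hypothetical log file smile.log: start the component with `SEMAINE-1.0/bin/run_components/start_component_tum.smile 2>&1 | tee smile.log`, speak a few test utterances, and then run `check_turn_messages < smile.log` from a second shell in which the function is defined. The `2>&1` is there because this page does not say whether the messages go to standard output or standard error.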

Last modified on 01/05/09 15:30:58