Introduction and Overview
A new application can rely on a standard interface both to implement a speech dialog and to output speech. GENIVI application cores can likewise rely on standard interfaces to speech and voice recognition systems.
The goals are to identify requirements for a unified interface for speech / voice components in the system, to define a vendor-agnostic abstraction layer, to integrate voice recognizer and TTS engines, and to identify standards for resources (such as phonetic alphabets).
IVI Speech Subprojects
- Speech Output Service (GENIVI Abstract Component -> Compliance on Interface Level)
- Speech Input Service (GENIVI Placeholder Component -> Compliance on Requirement Level)
- Speech Dialog Service (GENIVI Placeholder Component -> Compliance on Requirement Level)
Reasons for a GENIVI Speech API
Benefits of GENIVI Interfaces between Applications and Speech:
- Reduced effort to exchange application cores (e.g. navigation)
- Reduced number of PRs in a project due to a well-defined and robust interface
Benefits of a GENIVI API for the "Dialog Step":
- Reduces the effort to move from one dialog framework to another
- Provides a good basis for OSS dialog framework implementations
- Basis for App Development
The IVI Speech project plans to use the GENIVI project's Git repository to publish the following content:
- Speech Output Service Proof Of Concept Source code
- VMware image with Speech Output Service PoC integration Demo
Proof Of Concept (PoC)
In this chapter you will find the "Howto" for the GENIVI Speech Output Service PoC.
The aim of this PoC is to show the maturity of the Speech Output API. The Speech Output Service API is available in the /pub folder of the Speech Output Service PoC delivery:
- ISpeechOutputService.h (interface declaration)
- ISpeechOutputServiceCommon.h (data type declaration)
This software uses the GNU gcc toolchain as its build tool and the espeak TTS project as its TTS engine. Please make sure that the gcc toolchain is installed correctly.
The Speech Output Service depends on the following system services:
- espeak-1.47.16 (TTS engine)
Please download the espeak-1.47.16 sources from the espeak homepage (http://sourceforge.net/projects/espeak/files/espeak/) and build the espeak shared libraries that the PoC requires to access the TTS engine functionality.
The build instructions for the shared libraries are part of the espeak source package.
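Before building, it may help to confirm that the toolchain is reachable from the shell. The following is a minimal sketch and not part of the PoC delivery: the helper name `missing_tools` is hypothetical, and the tool list (gcc, make) is an assumption based on the build options described in this chapter.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that
# cannot be found on PATH, one name per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed tool list; adjust it to your setup.
missing=$(missing_tools gcc make)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "Toolchain looks complete."
fi
```

If a tool is reported as missing, install it through the package manager before continuing.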
Instructions for building the PoC Project
To build the GENIVI Speech Output Service Proof of Concept you have two options:
- Qt Creator: Open the project file 'GENIVI_SpcOutputSrv_PoC.pro' with Qt Creator and run 'Build Project'
- Make: Run the 'make all' command in a shell within the project root directory
The Speech Output Service PoC is pre-integrated into a VMware image.
Both can be downloaded from the GENIVI IVI Speech Git repository.
After you have opened the VMware image, you will find the Qt Creator link on the desktop. To open the project, load the Qt Creator .pro file.
Running the test
You can run the Proof of Concept code by pushing the "run" button in Qt Creator. The test covers the following use cases:
- Use-case 1: Simple prompt.
- Use-case 2: Abort a prompt.
- Use-case 3: Queue buffer overflow discards the prompts.
- Use-case 4: Invalid prompt. Too long messages are ignored.
If you want to set up the VM build environment on your own, please follow the instructions below.
VMware Player 7.0.0
VMware image LUbuntu 12.04
Open then Start the VM using file
Configure Virtual machine
Import files into the VM
Share a common folder between Host & Guest.
Follow the menu “Player -> Manage -> Virtual Machine Settings…”, then add a shared folder as follows:
Update the VM tools
The VM will prompt you to update its tooling to enable advanced features (like drag&drop).
If you accept, allow the Tools installation at the bottom of the window:
Install the VMware Tools by following these steps in a terminal:
sudo mkdir /mnt/cdrom
sudo mount /dev/cdrom /mnt/cdrom
rm -fr /tmp/vmware-tools-distrib
tar zxpf /mnt/cdrom/VMwareTools-9.9.0-2304977.tar.gz --> “vmware-tools-distrib” is created
cd vmware-tools-distrib --> previous version will be uninstalled first
Configure Linux distribution
Note: all passwords are generally set to "password"
For more help about Lubuntu: https://help.ubuntu.com/community/Lubuntu/Setup
HINT: to switch mouse focus back to Host use “CTRL+ALT”
Most administration tasks need the superuser privileges provided by sudo on the command line. To avoid entering the password repeatedly, you can add the user as an automatic “sudoer”.
$ sudo visudo
At the bottom of the file, add the following line:
user ALL=(ALL) NOPASSWD: ALL
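As an alternative to editing the main sudoers file, the same rule can be kept in a separate drop-in file and syntax-checked before it is installed. This is a minimal sketch under assumptions: the helper name `write_nopasswd_rule` and the file name `90-poc-nopasswd` are hypothetical, the account name `user` is taken from the line above, and it is assumed that your sudoers configuration includes /etc/sudoers.d. The script itself only writes to a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: write a NOPASSWD rule for the given user ($1)
# into the given file ($2) with the restrictive mode sudo expects.
write_nopasswd_rule() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1" > "$2"
    chmod 0440 "$2"
}

# Demo in a scratch directory; nothing on the system is changed.
workdir=$(mktemp -d)
rule_file="$workdir/90-poc-nopasswd"
write_nopasswd_rule user "$rule_file"
cat "$rule_file"

# On the real system one would validate the file before installing it:
#   sudo visudo -c -f "$rule_file"
#   sudo install -m 0440 -o root -g root "$rule_file" /etc/sudoers.d/
```

Checking the syntax first matters because a broken sudoers rule can lock the user out of sudo entirely.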
Update current packages (optional)
This step is not required for the PoC, but for your convenience you might first want to update the packages of the default installation.
Note: the VM must be restarted after the update.
Change the keyboard layout if needed (the default is a US keyboard):
$ sudo dpkg-reconfigure keyboard-configuration
Or change it manually (https://help.ubuntu.com/community/Lubuntu/Keyboard#Keyboard_Mapping), e.g. to use a German layout:
$ sudo leafpad /etc/xdg/lxsession/Lubuntu/autostart
Append the following line:
@setxkbmap -layout "de"
Note: the latter method will also change the flag at the bottom of the desktop.
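The manual edit can also be scripted so that running it twice does not add the line twice. The following is a minimal sketch: the helper name `append_once` is hypothetical, and the demo works on a scratch file rather than on /etc/xdg/lxsession/Lubuntu/autostart, which needs root privileges to modify.

```shell
#!/bin/sh
# Hypothetical helper: append a line ($1) to a file ($2) unless an
# identical line is already present.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo against a scratch file standing in for the autostart file.
autostart=$(mktemp)
append_once '@setxkbmap -layout "de"' "$autostart"
append_once '@setxkbmap -layout "de"' "$autostart"   # second call changes nothing
cat "$autostart"
```

The `-x` and `-F` options make grep match the whole line literally, so the quotes in the layout line need no escaping.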
Third-party packages
In a terminal you can start the Synaptic package manager (apt-get is also an option):
$ sudo synaptic
In Synaptic, search for and install the following packages:
Note: Do not install espeak or espeak-dbg, to avoid conflicts with the version used in the PoC.
You should install the downloaded PoC source files into the following folder:
For the sake of simplicity, the next configuration steps refer to this folder, explicitly or implicitly, by default.
Qt Creator IDE Environment
Setup for debugging
If you want to use the debug mode in Qt Creator, you are likely to encounter the following error:
A countermeasure is to use an alternative terminal as follows.
First install the xterm terminal:
$ sudo apt-get install xterm
Then in the IDE follow the menu “Tools -> Options…” and edit the default terminal as follows:
Replace the command “x-terminal-emulator -e” with “xterm -sb -rightbar -e”.
Debugging requires ptrace support in the system. To enable it, edit the following file:
$ sudo vi /etc/sysctl.d/10-ptrace.conf
After changing the value of kernel.yama.ptrace_scope from 1 to 0, you need to reboot the VM to apply the change.
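The edit can be expressed as a one-line substitution. This is a minimal sketch, assuming the file contains a line of the form `kernel.yama.ptrace_scope = 1`, as it does on stock Ubuntu installations; the helper name `relax_ptrace_scope` is hypothetical, and the demo operates on a scratch copy instead of /etc/sysctl.d/10-ptrace.conf.

```shell
#!/bin/sh
# Hypothetical helper: set kernel.yama.ptrace_scope from 1 to 0 in the
# sysctl configuration file given as $1. Other lines are left untouched.
relax_ptrace_scope() {
    tmp=$(mktemp)
    sed 's/^\(kernel\.yama\.ptrace_scope\)[[:space:]]*=[[:space:]]*1[[:space:]]*$/\1 = 0/' "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demo on a scratch file standing in for /etc/sysctl.d/10-ptrace.conf.
conf=$(mktemp)
printf '%s\n' '# scratch copy' 'kernel.yama.ptrace_scope = 1' > "$conf"
relax_ptrace_scope "$conf"
grep ptrace_scope "$conf"

# On the real system the running kernel can usually be switched right away:
#   sudo sysctl -w kernel.yama.ptrace_scope=0
```

The `sysctl -w` call only affects the running kernel, so the file edit is still needed for the setting to survive a reboot.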
espeak open source TTS
The TTS software espeak is available as a package in Lubuntu. The packaged version is 1.1.46; we decided to use the newer version 1.1.47 in the PoC. For some reason we were not able to make the PoC work with the default 1.1.46 package. Nevertheless, for information, the following picture shows where the system would install the relevant libraries; we reuse the same location. It is therefore not possible to have both versions installed at the same time.
One benefit of reusing the default system location is that other tools such as Qt Creator do not need to be reconfigured. The libespeak location, as installed by the default package and used by the PoC, appears in the IDE in the LD_LIBRARY_PATH environment variable as follows:
Some plumbing is now needed to link the espeak libraries and data delivered with the PoC into your system. As root, go to /usr/lib/i386-linux-gnu and create links from the system location to the custom espeak:
root@lubuntu:/usr/lib/i386-linux-gnu# ln -s
root@lubuntu:/usr/lib/i386-linux-gnu# ln -s
In the user home directory, check whether the lib folder contains the following links:
drwxrwxr-x 2 user user 4096 Dec 10 08:25 ./
drwxrwxr-x 9 user user 4096 Dec 10 08:29 ../
-rwxrwxrwx 1 user user 459564 Dec 9 07:31 libespeak.a*
lrwxrwxrwx 1 user user 19 Dec 10 07:53 libespeak.so -> libespeak.so.1.1.47*
lrwxrwxrwx 1 user user 19 Dec 10 07:53 libespeak.so.1 -> libespeak.so.1.1.47*
-rwxrwxrwx 1 user user 345479 Dec 10 05:52 libespeak.so.1.1.47*
If the links do not exist (e.g. because the import from a Windows host broke them), add them manually as follows:
$ ln -s libespeak.so.1.1.47 libespeak.so.1
$ ln -s libespeak.so.1.1.47 libespeak.so
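The check and the repair of the two links can be combined in a small script. The following is a minimal sketch: the helper name `link_espeak_libs` is hypothetical, and the demo runs in a scratch directory with an empty stand-in for the real library file.

```shell
#!/bin/sh
# Hypothetical helper: (re)create the two libespeak links inside the
# directory given as $1. Fails if the versioned library is not there.
link_espeak_libs() {
    dir=$1
    real=libespeak.so.1.1.47
    [ -f "$dir/$real" ] || { echo "missing $dir/$real" >&2; return 1; }
    ( cd "$dir" && ln -sf "$real" libespeak.so.1 && ln -sf "$real" libespeak.so )
}

# Demo in a scratch directory with an empty stand-in library file.
libdir=$(mktemp -d)
: > "$libdir/libespeak.so.1.1.47"
link_espeak_libs "$libdir"
readlink "$libdir/libespeak.so.1"
readlink "$libdir/libespeak.so"
```

Using `ln -sf` replaces a broken or stale link instead of failing, which is convenient after an import that damaged the links.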
Additionally, you need to install the espeak data. Note that the data from v1.1.46 (available as a Lubuntu package) are not compatible with v1.1.47 (the result sounds like a sampling rate difference).
As root, copy the data from your delivery location into “/usr/share/”:
- cp -r espeak-data /usr/share/
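Because data from the two espeak versions must not be mixed, it can be worth verifying the copy afterwards. The following is a minimal sketch: the helper name `copy_espeak_data` is hypothetical, and the demo uses scratch directories in place of the delivery location and /usr/share.

```shell
#!/bin/sh
# Hypothetical helper: copy the data directory $1 into the target
# directory $2, then compare source and copy recursively.
# Returns non-zero if the copy fails or the two trees differ.
copy_espeak_data() {
    src=$1
    dest=$2
    cp -r "$src" "$dest"/ && diff -r "$src" "$dest/$(basename "$src")" >/dev/null
}

# Demo with scratch directories standing in for the real locations.
work=$(mktemp -d)
mkdir -p "$work/delivery/espeak-data" "$work/usr/share"
echo "stand-in" > "$work/delivery/espeak-data/sample"
copy_espeak_data "$work/delivery/espeak-data" "$work/usr/share" && echo "copy verified"
```

The comparison also fails when the target already held files that are not in the delivery, which is how leftovers from an older data set would show up.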
Now everything should be in place to start the project in the IDE and listen to the tests.