How are speech announcements (from Festival?) implemented?

archived · March 29, 2005, 12:20:37 PM

Hi,

I wonder whether Pluto already has a speech announcement feature. I guess Festival could be, and is, used for that. I'm interested in a general description, and I have a few questions:

- Does Festival produce a wav that is transferred to the media clients?
- Can you set clients to mute/unmute?
- Are there different priorities for speech announcements ("House is on fire" would play even on muted clients :-) )?

Thanks in advance,

regards,

Rob.

archived · March 29, 2005, 01:28:49 PM

Hi Rob,

Festival is one of those things, like motion, for which one of our programmers put together a module, but it hasn't been tested or used since we're focusing on the media sections. So there are no docs on it, and it's unlikely that it works. But it is there.

If you add the device "Text To Speech" to a computer (any computer, probably the core, though), it should install the device and all the software, including festival, automatically at reboot.

In our source code tree, you'll find the directory Text_To_Speech, which is the project created by DCEGen. In the .cpp file, you'll see:

   /** @brief COMMAND: #253 - Send Audio To Device */
   /** Will convert the text to an audio file, and send it to the device with the "Play Media" Command. */
      /** @param #9 Text */
         /** What to say */
      /** @param #103 PK_Device_List */
         /** A comma delimited list of the devices to send it to */

void Text_To_Speech::CMD_Send_Audio_To_Device(string sText,string sPK_Device_List,string &sCMD_Result,Message *pMessage)

What that is supposed to mean is that you can send the Text To Speech device the command "Send Audio To Device", with the parameter Text = "I want you to say this" and the parameter PK_Device_List = "1,2,3" (assuming 1, 2 and 3 are devices that know how to play audio, like Xine, Orbiter, etc.). The Text To Speech device will then convert the text to a wav file and send a "Play_Sound" command to all those devices (1, 2 and 3) with the wav file as an attachment.

Of course the devices 1,2 and 3 have to implement the Play_Sound command for it to work.

However, looking at the code, I see this:

void Text_To_Speech::CMD_Send_Audio_To_Device(string sText,string sPK_Device_List,string &sCMD_Result,Message *pMessage)
{
   int Size;
   char *pBuffer = NULL;//CreateWAV(sText,Size);
   DCE::CMD_Play_Sound_DL CMD_Play_Sound_DL(m_dwPK_Device,sPK_Device_List,pBuffer,Size,"WAV");
   SendCommand(CMD_Play_Sound_DL);

So clearly it hasn't been implemented since CreateWAV is commented out.

However, I know from v1 that the text to speech module is really trivial. Festival does all the work, and we have only one command, this one, which just calls Festival and then fires a message. It's probably only a few lines of code. If the "CreateWAV" function were implemented, it would probably work as is.

archived · March 29, 2005, 01:41:45 PM

Quote from: "aaron.b"
If the "CreateWAV" function were implemented, it probably would work as is.

Hi,

For a start, we could just use the simple script provided with Festival, text2wave, which writes the speech to a wave file.

Is Festival using the default voices? I guess they are not so pretty to hear, but there are some better alternatives: one is the additional voices from Festival, and another, which I used for quite some time, is the HTS voices:

http://hts.ics.nitech.ac.jp/download.html

You can find demos at:
http://www.festvox.org/voicedemos.html

I can try adding things to the festival-pluto implementation. I'm just not sure how to attach a wav file to a DCE message.

HTH,

regards,

Rob.

archived · March 29, 2005, 01:43:41 PM

Hi,

A few more quick questions:
- How do media clients respond to a wav message: do they pause, lower the volume, ...?
- Can mixing be done (music at lower volume while the wav is played)?
- Are there any settings to mute/unmute clients for wav messages?

Regards,

Rob.

archived · March 29, 2005, 01:48:09 PM

BTW, I forgot. If you want to test it, you can use the message send utility like this:

/usr/pluto/bin/MessageSend localhost 0 [device id] 1 253 9 "say this" 103 "[other device id]"

Where [device id] is the device id of your Text To Speech device, and [other device id] is the media player where you want to hear it. If you tail -f /var/log/pluto/DCERouter.newlog you will see the CMD_Send_Audio_To_Device (253) going to the TTS device, and if the TTS device is working, you should then see a CMD_Play_Sound going back to the [other device id].

archived · March 29, 2005, 02:05:37 PM

OK, some fast replies.

Here's a code snippet (simple, no error checking, just to show the syntax):

// this is the auto-generated stub by DCEGen
void Text_To_Speech::CMD_Send_Audio_To_Device(string sText,string sPK_Device_List,string &sCMD_Result,Message *pMessage)
{
   // Create the wav file. text2wave reads its text from a file or from stdin,
   // not from the command line, so pipe the text in and name the output with -o.
   // Note: sText is not shell-escaped here.
   system( ("echo \"" + sText + "\" | text2wave -o /tmp/wave").c_str() );

   // Use our FileUtils widgets to read the file into a buffer
   size_t Size;
   char *pBuffer=FileUtils::ReadFileIntoBuffer("/tmp/wave",&Size);

   // Create the Play_Sound command and send it
   DCE::CMD_Play_Sound_DL CMD_Play_Sound_DL(m_dwPK_Device,sPK_Device_List,pBuffer,Size,"WAV");
   SendCommand(CMD_Play_Sound_DL);
}

That's it. Of course it needs error checking, unique file names, etc., but, theoretically, that code snippet is all that would be needed.
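To make the "error checking, unique file names" part a bit more concrete, here is a minimal sketch of what a CreateWAV helper might look like. It is only an illustration: ShellQuote, BuildText2WaveCommand and this CreateWAV signature are made-up names, not existing Pluto code, and it assumes a POSIX system with text2wave on the PATH.

```cpp
#include <cstdlib>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <unistd.h>   // close, unlink (POSIX)

// Wrap text in single quotes for /bin/sh, escaping embedded single quotes,
// so the spoken text cannot be interpreted as shell syntax.
std::string ShellQuote(const std::string &sText)
{
    std::string sResult = "'";
    for (char c : sText)
    {
        if (c == '\'')
            sResult += "'\\''";   // close quote, literal ', reopen quote
        else
            sResult += c;
    }
    return sResult + "'";
}

// Build the pipeline: text2wave reads the text from stdin and writes to -o.
std::string BuildText2WaveCommand(const std::string &sText, const std::string &sWavePath)
{
    return "printf '%s\\n' " + ShellQuote(sText) + " | text2wave -o " + ShellQuote(sWavePath);
}

// Hypothetical helper: returns true and fills 'wav' with the file contents on success.
bool CreateWAV(const std::string &sText, std::vector<char> &wav)
{
    char szPath[] = "/tmp/tts_XXXXXX";
    int fd = mkstemp(szPath);            // unique file name per request
    if (fd < 0)
        return false;
    close(fd);

    bool bOK = system(BuildText2WaveCommand(sText, szPath).c_str()) == 0;
    if (bOK)
    {
        std::ifstream in(szPath, std::ios::binary);
        wav.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
        bOK = !wav.empty();              // an empty file means Festival produced nothing
    }
    unlink(szPath);                      // always clean up the temporary file
    return bOK;
}
```

The command stub would then only send Play_Sound when CreateWAV returns true, passing the vector's data pointer and size as the attachment.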

As far as Festival settings go, I don't think we've even started to mess with them. At the moment, the media clients just play, so the behavior is normally to blend the audio, which admittedly isn't that good. However, if you wanted to pause the media devices first, you could send a CMD_Pause before the Play_Sound, and then an un-pause afterwards. The Play Sound command is not finalized, and nobody is using it, so feel free to add to it if you want. For example, you could go into device templates, add a bool data parameter "Pause Media", and rerun DCEGen. That will recreate the command stub so it includes a bool bPause_Media parameter, and, if the parameter is true, you could pause before and un-pause after.
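As a rough sketch of the pause / play / un-pause ordering, the following uses a stand-in struct instead of real DCE commands. PlannedCommand and PlanAnnouncement are hypothetical names for illustration only, and "Un-pause" is a placeholder label rather than an actual command name.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DCE command; real commands are generated by DCEGen.
struct PlannedCommand
{
    std::string sName;   // placeholder label, e.g. "Pause" or "Play_Sound"
    int iPK_Device;      // destination device
};

// Returns the commands in the order they would be sent to one media device.
std::vector<PlannedCommand> PlanAnnouncement(int iPK_Device, bool bPause_Media)
{
    std::vector<PlannedCommand> plan;
    if (bPause_Media)
        plan.push_back({"Pause", iPK_Device});      // stop the current media first
    plan.push_back({"Play_Sound", iPK_Device});     // the announcement itself
    if (bPause_Media)
        plan.push_back({"Un-pause", iPK_Device});   // resume once the sound has finished
    return plan;
}
```

In a real device the un-pause would have to wait until the announcement has finished playing, which this sketch does not model.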

Similarly, in v2, we haven't finalized or started using the "Play Sound" command yet. So, again, we could add more parameters to it like volume level and so on.

I think the only module that issues commands for text to speech is the Security Plugin, which does a countdown on the orbiters when you arm your security system. In that case, it's just a verbal message on the touch-screen tablets, which aren't used for playing other media anyway. However, even if you change and add parameters to the command, it won't break the existing Security Plugin. DCE is not "sensitive", so you really can't break it, and if some modules are compiled against an older template and others against a new one, it won't hurt anything.

archived · March 29, 2005, 03:36:41 PM

Hi,

Thanks for the info. I'll look into it. I'd like to suggest a few more things that may be reasonable to add now, while things are still "fresh".

In the MisterHouse world, speech announcements have different priorities.

For example:
"House is on fire" is urgent (it plays on everything regardless of settings; it even powers up devices if needed).
"I'm going to sleep now" is important (it plays on all active media clients, regardless of settings).
"Weather will be fine" is normal (it plays only on active clients that are unmuted).

A client can have the following states:
- "unmute" - listen to all announcements
- "mute" - don't listen to normal announcements
- "mute & accumulate" - messages are queued and played after the client unmutes
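The priorities and client states above could be modelled roughly like this. It is only a sketch of the policy as I described it; all type and function names are invented for illustration and are not taken from MisterHouse or Pluto.

```cpp
#include <deque>
#include <string>

enum class Priority { Normal, Important, Urgent };
enum class ClientState { Unmute, Mute, MuteAccumulate };
enum class Action { Play, Drop, Queue, PowerOnAndPlay };

struct Client
{
    ClientState state = ClientState::Unmute;
    bool bActive = true;                  // powered on and able to play
    std::deque<std::string> pending;      // announcements held while "mute & accumulate"
};

// Decide what to do with one announcement for one client.
Action Decide(const Client &c, Priority p)
{
    if (p == Priority::Urgent)            // plays everywhere, powering up if needed
        return c.bActive ? Action::Play : Action::PowerOnAndPlay;
    if (!c.bActive)                       // important/normal only reach active clients
        return Action::Drop;
    if (p == Priority::Important)         // ignores the mute setting
        return Action::Play;
    switch (c.state)                      // normal priority honours the client state
    {
        case ClientState::Unmute:         return Action::Play;
        case ClientState::Mute:           return Action::Drop;
        case ClientState::MuteAccumulate: return Action::Queue;
    }
    return Action::Drop;
}

// Apply the decision, queueing the text when the client is accumulating.
Action Announce(Client &c, Priority p, const std::string &sText)
{
    Action a = Decide(c, p);
    if (a == Action::Queue)
        c.pending.push_back(sText);
    return a;
}

// Unmute the client and hand back everything that was queued, oldest first.
std::deque<std::string> Unmute(Client &c)
{
    c.state = ClientState::Unmute;
    std::deque<std::string> backlog;
    backlog.swap(c.pending);
    return backlog;
}
```

The caller would play whatever Unmute returns, in order, right after the client unmutes.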

There is another feature I particularly like in my setup: a relaxation program. It plays relaxing music in the background and every now and then mixes in prerecorded thoughts (each member of the family records their own positive thoughts). After some time (30, 60 or 90 minutes) the music slowly fades and we are bzzzzzzzzzzzzzzzzzzzzzzzzzzzzz..... sleeping :-) ...

HTH,

regards,

Rob.

archived · March 29, 2005, 03:52:13 PM

I needed that little bit of humor right now. MisterHouse feeds you pleasant thoughts. Is it MisterHouse, or BigBrother? Be careful, if the government finds out about MisterHouse, they'll order one installed in everybody's home so they can program it to feed you suggestions throughout your sleep. :lol:

None of that prioritization has been implemented. However, the functionality in the framework is essentially there to do what you described.

All devices have 2 persistent, user-defined (as in undefined) text fields: "State" and "Status". They are stored in the database, so they survive reboots too. The use of these fields, being undefined, can be anything.

The Lighting plugin has a message interceptor that grabs any message going to any device that is in the category light:

RegisterMsgInterceptor(( MessageInterceptorFn )( &Lighting_Plugin::LightingCommand ), 0, 0, 0, DEVICECATEGORY_Lighting_Device_CONST, MESSAGETYPE_COMMAND, 0 );

And in the interceptor:
bool Lighting_Plugin::LightingCommand( class Socket *pSocket, class Message *pMessage, class DeviceData_Base *pDeviceFrom, class DeviceData_Base *pDeviceTo )

if( pMessage->m_dwID==COMMAND_Generic_On_CONST )
pDevice_RouterTo->m_sState_set("ON");
else if( pMessage->m_dwID==COMMAND_Generic_Off_CONST )
pDevice_RouterTo->m_sState_set("OFF");

So it sets the state of any device that is a light to "ON" or "OFF". It uses this state so that if you send a "Toggle Light" command to a light, the plugin can intercept it, determine what its last state was, and respond accordingly.

Similarly security plugin intercepts several messages and sets the Status of any security devices (motion det, smoke, window, etc) to various values indicating if it's tripped or not, and sets the State to various settings indicating if it's armed, bypassed, etc.

For media devices, neither the State nor the Status field is used. Media Plugin does intercept all messages to media devices, so it already has the interceptor and can store states such as "unmute", "mute", etc. And in Pluto admin, under "Automation" "Device Status", you can manipulate states/status.

So, the framework makes what you're describing pretty easy. Firstly, media plugin would set the state/status flag of all the media devices under its control. Second, media plugin would register a message interceptor for the "Play Sound" messages. A message interceptor means that the router will hand it an incoming message before sending the message off. It takes 1 line of code to do this. Third, we would add a "Priority" command parameter to the "Play Sound" command. Fourth, the interceptor would then return 'false' (i.e. abort) whenever it determines the message should be suppressed based on the priority and the device state.

So, what you're describing is probably 20 lines of code, max, and doesn't require anything 'new'.
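To illustrate the idea without pulling in the real DCE headers, here is a self-contained mock of the interceptor pattern. MockRouter, MockMessage, MockDevice and COMMAND_PLAY_SOUND are invented for this sketch; in the real code the registration would go through RegisterMsgInterceptor as shown for the Lighting plugin, and the state would come from the device's State field.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Placeholder command id for the mock, not a real DCE command number.
const int COMMAND_PLAY_SOUND = 1000;

struct MockMessage
{
    int iPK_Device_To;   // destination device
    int iCommand;        // command id
    int iPriority;       // assumed encoding: 0 = normal, 1 = important, 2 = urgent
};

struct MockDevice
{
    std::string sState;  // free-form text, like the "State" field, e.g. "mute"
};

class MockRouter
{
public:
    // An interceptor returns false to abort delivery of the message.
    using Interceptor = std::function<bool(const MockMessage &, const MockDevice &)>;

    std::map<int, MockDevice> devices;
    std::vector<MockMessage> delivered;

    void RegisterInterceptor(Interceptor fn) { m_interceptors.push_back(fn); }

    // Hand the message to every interceptor before "sending" it.
    bool Deliver(const MockMessage &msg)
    {
        const MockDevice &dev = devices[msg.iPK_Device_To];
        for (const auto &fn : m_interceptors)
            if (!fn(msg, dev))
                return false;            // suppressed by an interceptor
        delivered.push_back(msg);
        return true;
    }

private:
    std::vector<Interceptor> m_interceptors;
};

// Suppress normal-priority Play Sound messages going to muted devices.
bool PlaySoundInterceptor(const MockMessage &msg, const MockDevice &dev)
{
    if (msg.iCommand != COMMAND_PLAY_SOUND)
        return true;                     // not our command, let it through
    if (msg.iPriority > 0)
        return true;                     // important and urgent ignore mute
    return dev.sState != "mute";
}
```

The interceptor itself is only a handful of lines, which is roughly in line with the estimate above; the rest of the sketch is scaffolding that the framework already provides.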

archived · March 29, 2005, 04:19:09 PM

Hi,

Thanks for the info. I'm glad things are so well thought out.

I'll wait for new releases, and maybe take a peek into the Festival-pluto device.

Regards,

Rob.

LinuxMCE Forums