The text to speech plugin is fine. It does what it is supposed to do - that is, turn text into speech. Sure, it needs to be upgraded to festival2 with better voices, but that is a separate issue.
Any pre-processing of the text string should happen in the device that calls text to speech. For example, make an "Announcement" device, give it one devicedata for a zipcode (for weather info retrieval) and another for a custom text string. Then pre-define some tags that will be replaced with text, such as:
<current_time>, <current_temp>, <user_logged_in>, <current_date>, <current_day>, etc. In Ruby, using GSD, fetch these pieces of data from the web or from the system, then replace the tags with the fetched values, so that in the end you have a plain text string to pass to text_to_speech. For example, the custom string device data might look like:
"Good morning <user_logged_in>. It is currently <current_time> on <current_day>, and it is <current_temp> degrees outside." etc...
You get the point. Done this way, and with many supported "tags", anyone could have a custom announcement just by changing the device data.
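The tag replacement described above could be sketched in Ruby something like the following. This is only an illustration of the idea, not real GSD or LinuxMCE code: the helper methods, their names, and the stubbed return values are all hypothetical stand-ins for the actual web/system lookups a real device would perform.

```ruby
# Hypothetical stand-ins for the real data sources. In an actual
# Announcement device, these would query a weather service (by zipcode)
# and the system for the logged-in user.
def current_temp(zipcode)
  72          # stub value; a real device would fetch this from the web
end

def logged_in_user
  'Alice'     # stub value; a real device would ask the system
end

# Build the final announcement by replacing every supported tag in the
# user's template (the custom text string devicedata) with live data.
def build_announcement(template, zipcode)
  now = Time.now
  tags = {
    '<current_time>'   => now.strftime('%H:%M'),
    '<current_date>'   => now.strftime('%B %d'),
    '<current_day>'    => now.strftime('%A'),
    '<current_temp>'   => current_temp(zipcode).to_s,
    '<user_logged_in>' => logged_in_user
  }
  # Substitute each tag in turn; the result is a plain text string
  # ready to hand to text_to_speech.
  tags.reduce(template) { |text, (tag, value)| text.gsub(tag, value) }
end

puts build_announcement(
  "Good morning <user_logged_in>. It is <current_temp> degrees outside.",
  '90210')
```

Adding support for a new tag would then just mean adding one entry to the hash; the text_to_speech plugin itself never changes.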
Basically, building the text string for text_to_speech should happen in the device that needs to use it. There is no need to clutter the text_to_speech plugin with code that doesn't belong there.