Likewise. I've been very slowly writing this myself for a couple of years now (in between jEmplode and RMML). Here are the pieces I'm using (each with differing degrees of completeness):
AT&T Natural Voices for synth,
Cloudgarden (JSAPI on top of MSAPI) + Microsoft Speech SDK for recognition,
a RedRat for IR transmission,
a Zaurus for remote control,
a Creative Modem Blaster voice modem (Cloudgarden recently added support for the full JavaSound API for input/output, so I think I can finally use my modem's voice capability)
One of the biggest problems I have run into is mic'ing. Ultimately I think the route I'm going to go is that the Zaurus will be the controller, so you carry it around the house with you. It will have an external mic on it and will transmit audio wirelessly to the server, which will then do the voice recognition.
The other problem I have is location identification. Since the Zaurus has a crappy IR transmitter, the approach I've been taking is that when you hit a button on the Zaurus "remote", it sends a request to the server, which then issues the remote command. But without IR on the client you lose locality of control: if you hit TV Power, I need to know whether you're in the bedroom or the loft to know which command to send and which transmitter to talk to. I've been playing with plotting my wireless signal strengths at various points in my house and seeing if the server can identify where I am based on the link/level numbers plus knowledge of the previous values (for example, I can't jump from the bedroom into the basement instantaneously, so it can use that fact to rule out unlikely candidates).
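The location idea above could be sketched roughly like this. It is only an illustration in Python, not the actual server code: the room names come from the post, but the fingerprint numbers, the adjacency table, the transmitter names, and the function names are all made up for the example. It picks the room whose recorded link/level reading is closest to the current one, and only considers rooms reachable from the previous room.

```python
import math

# Hypothetical fingerprints: an average (link, level) reading recorded in each room.
FINGERPRINTS = {
    "bedroom": (70.0, -50.0),
    "loft": (55.0, -62.0),
    "basement": (30.0, -80.0),
}

# Hypothetical floor plan: rooms you could plausibly be in on the next reading.
ADJACENT = {
    "bedroom": {"bedroom", "loft"},
    "loft": {"loft", "bedroom", "basement"},
    "basement": {"basement", "loft"},
}

# Hypothetical mapping from room to the IR transmitter that covers it.
TRANSMITTERS = {"bedroom": "redrat-bedroom", "loft": "redrat-loft"}


def estimate_room(reading, previous_room=None):
    """Return the room whose fingerprint is nearest to the reading.

    If the previous room is known, rooms that cannot be reached from it
    in one step are removed from the candidates first.
    """
    candidates = FINGERPRINTS.keys()
    if previous_room is not None:
        candidates = ADJACENT[previous_room]
    return min(candidates, key=lambda room: math.dist(reading, FINGERPRINTS[room]))


def transmitter_for(reading, previous_room=None):
    """Pick the transmitter for a button press, or None if the room has none."""
    room = estimate_room(reading, previous_room)
    return TRANSMITTERS.get(room)
```

With these made-up numbers, a weak reading like `(32.0, -78.0)` looks like the basement on its own, but if the last known room was the bedroom the basement is ruled out and the loft wins instead. A real version would probably want several readings per room and some smoothing, since signal strength is noisy.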
Oh well. This is my dream anyway.
ms