This was tried by Pluto, as early as 2005, with the earliest versions of the Version 2 codebase.
CMU Sphinx II was used as the test engine, and if you look in the source tree, you'll find code related to this effort.
I have experience with this type of work, so what I'm going to tell you won't be easy to swallow...
Short answer: it works maybe 30% of the time, and maybe 60% of the time with a highly optimized setup.
Making it work better would require a microphone close to the speaker, such as a headset...
HOWEVER,
given the large amount of codec processing that happens when using something like a Bluetooth headset, the resulting waveform will not be accurately matched by the hidden Markov models (HMMs) currently shipped with Sphinx. A new acoustic model would have to be trained (difficult), alongside a domain-specific corpus (easy compared to the former, though user interface issues must be considered).
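The corpus side has to start with the vocabulary the recognizer must cover. Here's a minimal sketch of that first step; the sentences, and the `build_vocab` helper, are invented placeholders for illustration, not anything from the Pluto tree. A real setup would feed these counts into a language-model toolkit to build an n-gram model.

```python
from collections import Counter

# Hypothetical domain sentences; a real corpus would come from actual
# user commands for the system being voice-controlled.
domain_sentences = [
    "turn on the kitchen lights",
    "turn off the kitchen lights",
    "what is the temperature in the living room",
    "play music in the bedroom",
]

def build_vocab(sentences):
    """Count word frequencies across the corpus. The resulting closed
    vocabulary defines what the acoustic and language models must cover."""
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts

vocab = build_vocab(domain_sentences)
print(sorted(vocab))  # the closed vocabulary for this (toy) domain
```

A small closed vocabulary like this is exactly why the domain-specific corpus is the easy half of the job: the hard half is retraining the acoustic model to match the codec-mangled audio.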
This is possible. But it's going to take some adventurous hackers to do it. Come on guys, which of you will take the challenge?
-Thom