Topic: Sphinx2 as speech recognition for Asterisk

« on: July 16, 2005, 02:21:42 pm »
Hi,

I'm posting this in case someone finds it useful. I found an excellent article on connecting Sphinx2 to Asterisk:

http://turnkey-solution.com/asterisk-sphinx.html

I followed its instructions with some slight changes:

1. I used Sphinx 2.0.5 (the current version in Debian Sarge). The package is missing only one small file, /usr/include/sphinx2/ad_conf.h:
Code:
/* ====================================================================
 * Copyright (c) 1999-2001 Carnegie Mellon University.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * This work was supported in part by funding from the Defense Advanced
 * Research Projects Agency and the National Science Foundation of the
 * United States of America, and the CMU Sphinx Speech Consortium.
 *
 * THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
 * ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
 * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
 * NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * ====================================================================
 *
 */
#ifndef _AD_CONF_H_
#define _AD_CONF_H_
/* OS dependent part of ad.h, properly set by configure
*/
#if !defined(AD_BACKEND_OSS)
#define AD_BACKEND_OSS
#endif
 
#endif


2. I used the newer SPX Perl wrapper from CPAN (version 0.07). It needs only one modification in SPX.c: the build reports too many arguments to one function, and you just have to remove the argument 'c'.

3. I downloaded the alternative acoustic models as instructed.

4. It kind of works with this AGI script:

Code:
#!/usr/bin/perl
#use strict;
use Asterisk::AGI;

$|=1;

my $AGI = new Asterisk::AGI;


#my %input = $AGI->ReadParse();


# Setup some variables
my %AGI1; my $tests = 0; my $fail = 0; my $pass = 0;

while(<STDIN>) {
chomp;
last unless length($_);
if (/^agi_(\w+)\:\s+(.*)$/) {
$AGI1{$1} = $2;
}
}

print STDERR "AGI Environment Dump:\n";
foreach my $i (sort keys %AGI1) {
print STDERR " -- $i = $AGI1{$i}\n";
}



sub asr {
use IO::Socket;
use FileHandle;
use IPC::Open2;
my $file = shift or return undef;
my $host = shift || 'localhost';
my $port = shift || '1069';
my $fh;

my $remote =  IO::Socket::INET->new(
Proto    => "tcp",
PeerAddr => "$host",
PeerPort => "$port",
) or return undef;

#Idea here being that you can pass a reference to an existing file handle... not yet implemented, just pass a filename.
if (ref $file) {
   # Assign to the outer $fh; a second "my" here would shadow it and leave it undefined.
   $fh = $file;
} else {
   open($fh, '<', $file) || return undef;
}

# Capture the extension directly so a failed match cannot leave a stale $1 behind.
my ($type) = $file =~ /(gsm|wav)$/;
if (!defined $type) {
   warn "Unknown file type ($file)";
   return undef;
}
#print   STDERR "Reading ($file)\n";
#print  STDERR "FTYPE: $type\n";
#$pid = open2(*SOXIN, *SOXOUT, "sox -t $type - -s -r 16000 -w -t wav - 2>/dev/null") || warn ("Could not open2.\n");
# open2 raises an exception on failure rather than returning false, so "|| warn" would never fire.
my $pid = open2(*SOXIN, *SOXOUT, "sox -t $type - -s -r 8000 -w -t wav - 2>/dev/null");

#print  STDERR "After open2\n";

binmode $fh;
binmode SOXIN;
binmode SOXOUT;
binmode $remote;

#print  STDERR "into while 1\n";
while (defined(my $b = read $fh, my($buf), 4096)) {
#   print  STDERR "B: $b\n";
   last if $b == 0;
   $count += $b;
#   print  STDERR "Count: $count\n";
   print SOXOUT $buf;
}
close SOXOUT;

#print  STDERR "into while 2\n";
$count = 0;
my $sox = undef;
while (defined(my $b = read SOXIN, my($buf), 4096)) {
   last if $b == 0;
   $count += $b;
   $sox .= $buf;
}

#print  STDERR  "Sending to server: " . length($sox) . " bytes\n";

print $remote length($sox) . "\n";
print $remote "$sox";
close SOXIN;

#print  STDERR  "DEBUG: Waiting for result.\n";
   
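$count=0;
# Start from an empty result on every call; otherwise retries append to the previous answer.
my $result = '';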
while (defined(my $b = read $remote, my($buf), 4096)) {
   last if $b == 0;
   $count += $b;
   $result .= $buf;
}

close $fh;
close $remote;

return "$result";
}



sub confirm {
      my $tries = 0;
      while ($tries <= 3) {
         $tries++;
         $AGI->stream_file("custom/pozdravljeni",'""');

         $AGI->stream_file("beep",'""');
         $AGI->record_file("/tmp/$$", 'gsm', '0',3000);
         $AGI->stream_file("beep",'""');

         my $vresponse = asr("/tmp/$$.gsm");
         $AGI->verbose("CONFIRM: $vresponse");

         next if $vresponse !~ /YES|NO|ACCEPT|CANCEL/i;

         $gotresp = 1;

         if ($vresponse =~ /NO|CANCEL/i) {
            sleep 1;
            $AGI->stream_file("custom/razveljavljen",'""');
            return undef;
         } elsif ($vresponse =~ /YES|ACCEPT/i) {
            sleep 1;
            $AGI->stream_file("custom/dodan",'""');
            return undef;
         } else {
            $tries++;  
#            return 1;
         }
      }

      if (! $gotresp) {
         sleep 1;
         $AGI->stream_file("invalid",'""');
         return undef;
      }
}




print STDERR "1.  Testing yes/no...";
&confirm;

print STDERR "================== Complete ======================\n";
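
For anyone writing their own client or server: the asr() sub above sends the converted audio as a decimal byte count, a newline, and then the raw WAV bytes, and afterwards reads the reply until the server closes the connection. Below is a minimal sketch of just that framing, with no sockets involved. The names frame_request and parse_request are made up for illustration, and parse_request is only my assumption of what the server side has to do; I have not checked it against the real server from the article.

```perl
use strict;
use warnings;

# Build the request exactly as asr() writes it to the socket:
# "<length>\n" followed by the audio data.
# length() counts bytes here as long as the data was read in binmode.
sub frame_request {
    my ($audio) = @_;
    return length($audio) . "\n" . $audio;
}

# Hypothetical inverse (an assumption, not taken from the real server):
# split off the length line and return that many bytes of payload,
# or nothing if the message is malformed or truncated.
sub parse_request {
    my ($wire) = @_;
    my ($len, $rest) = split /\n/, $wire, 2;
    return unless defined $rest && $len =~ /^\d+$/;
    return unless length($rest) >= $len;
    return substr($rest, 0, $len);
}

my $wire = frame_request("RIFFdata");
print "header says ", (split /\n/, $wire, 2)[0], " bytes\n";
```

The split uses a limit of 2 so that newline bytes inside the audio itself are not treated as separators.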



5. My installation notes follow (there could be some errors in them):

Code:

apt-get install sphinx2-bin
apt-get install libsphinx2-dev
apt-get install libsphinx2g0
apt-get install sphinx2-hmm-6k
apt-get install sphinx2-language



perl -MCPAN -e "install Speech::Recognizer::SPX"
wget http://search.cpan.org/CPAN/authors/id/D/DJ/DJHD/Speech-Recognizer-SPX-0.07.tar.gz
tar zxvf Speech-Recognizer-SPX-0.07.tar.gz
cd Speech-Recognizer-SPX-0.07

cp ../ad_conf.h /usr/include/sphinx2/
mv ./Audio/SPX.c ./Audio/SPX.c.orig

cp SPX.c ./Audio

perl Makefile.PL
make install
cd ..
 
wget http://turnkey-solution.com/confirm.tgz
tar -xvzf confirm.tgz
cp -R ./confirm /usr/share/sphinx2/model/lm/

wget http://www.speech.cs.cmu.edu/sphinx/models/hmm/communicator-2000-11-17-2.tgz
tar -xvzf communicator-2000-11-17-2.tgz
cd communicator-2000-11-17-2
mv  -f sphinx_2_format communicator
mv  -f communicator /usr/share/sphinx2/model/hmm/
cd ..

# create custom language models with the online tool at:
# http://www.speech.cs.cmu.edu/tools/lmtool.html




cp -f /var/lib/asterisk/sounds/251ivrrecording.wav ./test.wav



It kind of works. Maybe some more testers could try it and see whether it's useful. Of course Sphinx4 is a newer engine and should be better, but maybe this little system is a good test bed.

HTH,

regards.

Rob.

Re: Sphinx2 as speech recognition for Asterisk
« Reply #1 on: January 11, 2006, 03:50:36 am »