Voice Command Recognition

Although we have considered integrating with a third-party voice vendor, we have not released such an integration to date. There are three reasons. First, most vendors' technology is network based, and we would like voice capability even when the device is not connected to a network. Second, because we are not a Google Approved Device, we do not have access to some of the Android voice services provided by Google. Finally, most voice licensing models are per Application, not per platform. We would rather find a platform-based solution that can be used by all Applications on the device.

Android provides standard APIs related to speech, but they appear to depend on having some Google components installed. Although these APIs are not included in the Google Mobile Services (GMS) packages, they appear to have a dependency on them. Also, most of the services require a network connection to work.

Embedded Voice:

There are many other options for speech recognition. An Application Developer can license embedded technology directly from a vendor (for example, Nuance) and pay a per-App-instance license fee for the speech recognition library. We did some voice integration into our platform using Nuance libraries and built some evaluation Applications, but we have not released this to our customers yet.

These were the components we evaluated:

  • Nuance Mobile Toolkit SDK for Android – V1.6
  • VoCon Hybrid Base (which includes VoCon Wake-up Word) – V4.6
  • Vocalizer Expressive – V1.3
  • Nuance Voice Security Library (“NVSL”) – V4.5


Overall, we thought voice command recognition worked pretty well. We developed 3 different Apps for evaluation. We also integrated command recognition into our baseline launcher, so a user can use voice to start Applications and trigger some other functions. If you are interested in this, please inquire and we can discuss releasing this functionality.

Here’s a brief description of this integration: When the service is turned on, it starts in wake-up mode, and you need to say “hi ODG” or “hi Reticle” to wake it up. Saying “Thank you ODG” puts it back into wake-up mode. Apart from those phrases, here is the vocabulary list, which could easily be expanded.

“Yes” | “No” | “Back” | “Return” | “Enter” | “Return Home” | “Menu” | “Control Panel” | “Sleep” | “Exit” | “Forward” | “Backward” | “Wake up” | “Record video” | “Stop recording” | “Hang up” | “End call” | “Long press” | “Voice recognition off” | “Volume up” | “Volume down” | “Volume mute” | “Select” | “Home” | “Home screen” | “Start recording” | “Take photo” | “Airplane mode on” | “Airplane mode off” | “Go up”

“Call” | “Dial”

“Record video” | “Stop recording” | “Take photo”
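The wake-up behavior described above can be sketched as a small state machine. This is only an illustration in plain Java, not the code from our launcher integration: the class name VoiceCommandGate, its methods, and the reduced command set are hypothetical, and it assumes the recognizer delivers each recognized phrase as a string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper: gates recognized phrases behind a wake-up phrase. */
final class VoiceCommandGate {
    private static final Set<String> WAKE_PHRASES =
            new HashSet<>(Arrays.asList("hi odg", "hi reticle"));
    private static final String SLEEP_PHRASE = "thank you odg";

    // Small subset of the vocabulary list, for illustration only.
    private final Set<String> commands = new HashSet<>(Arrays.asList(
            "yes", "no", "back", "home", "menu",
            "volume up", "volume down", "take photo"));

    // The service starts in wake-up mode, i.e. not yet awake.
    private boolean awake = false;

    boolean isAwake() {
        return awake;
    }

    /**
     * Returns the command to execute, or empty if the phrase only
     * changed the state or was ignored.
     */
    Optional<String> onPhrase(String recognized) {
        String phrase = recognized.trim().toLowerCase(Locale.ROOT);
        if (!awake) {
            // In wake-up mode only the wake phrases have any effect.
            if (WAKE_PHRASES.contains(phrase)) {
                awake = true;
            }
            return Optional.empty();
        }
        if (SLEEP_PHRASE.equals(phrase)) {
            awake = false;
            return Optional.empty();
        }
        return commands.contains(phrase) ? Optional.of(phrase) : Optional.empty();
    }
}
```

With this sketch, a phrase such as “Take photo” is ignored until “hi ODG” has been heard, and is ignored again after “Thank you ODG”.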


Open Source Voice:

There is a pretty good open source option from Carnegie Mellon: http://cmusphinx.sourceforge.net/wiki/tutorialandroid

Their site: PocketSphinx, CMU – http://www.speech.cs.cmu.edu/pocketsphinx

We used it for a while about 2 years ago. One of our customers has used it recently, and that App works very well, although its vocabulary is very small.

 

pocketsphinx configuration:

The SpeechListener.java class contains the initialization code. It does not programmatically select a mic, so the default one is probably what gets selected. Below is example recognizer initialization code:

 

// Assumes: import static edu.cmu.pocketsphinx.SpeechRecognizerSetup.defaultSetup;
recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))

        // To disable logging of raw audio, comment out this call
        // (the logs take a lot of space on the device)
        .setRawLogDir(assetsDir)

        // Threshold to tune for the keyphrase, to balance between
        // false alarms and misses
        .setKeywordThreshold(1e-20f)

        // Use context-independent phonetic search; context-dependent
        // is too slow for mobile
        .setBoolean("-allphone_ci", true)

        .getRecognizer();
recognizer.addListener(this);

 

 

Low Microphone Input Gain:

Multiple partners are using PocketSphinx successfully. Some of their settings are somewhat mysterious, but the voice recognition works well. Make sure you are picking the audio source correctly. See this reference:

Microphones-R-7-Smartglasses

These are the audio sources we know work. Choose based on whether you want to record the voice of the glasses wearer or the voice of the person they are talking to (or the audio for a video recording). We are told that CAMCORDER definitely works. There was a report that the audio source VOICE_COMMUNICATION did not work. VOICE_COMMUNICATION is supposed to fall back to DEFAULT if it is not supported, and DEFAULT is MIC; that is the correct API behavior. We believe this is fixed in ReticleOS v3.5.17 and later, where VOICE_COMMUNICATION now switches to DEFAULT.
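If an App needs to run on releases where a given source may not open, one option is to try the preferred sources in order and take the first that works. The sketch below shows only that selection logic in plain Java. The class AudioSourcePicker, the Source enum, and the canOpen probe are hypothetical names; the enum values simply mirror the names of the android.media.MediaRecorder.AudioSource constants. On a device, the probe could for example construct an android.media.AudioRecord for the source and check that getState() returns AudioRecord.STATE_INITIALIZED.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper: picks the first audio source that can actually be opened. */
final class AudioSourcePicker {
    /** Names mirror android.media.MediaRecorder.AudioSource constants. */
    enum Source { VOICE_COMMUNICATION, CAMCORDER, DEFAULT }

    /**
     * Tries each preferred source in order and returns the first one
     * the probe accepts. Falls back to DEFAULT when none can be opened.
     */
    static Source pick(List<Source> preferred, Predicate<Source> canOpen) {
        for (Source s : preferred) {
            if (canOpen.test(s)) {
                return s;
            }
        }
        return Source.DEFAULT;
    }
}
```

For example, passing the list VOICE_COMMUNICATION, CAMCORDER with a probe that rejects VOICE_COMMUNICATION yields CAMCORDER.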

Also note that the default mic input gain can be set pretty low. Most Apps either enable automatic gain control, or they do a calibration and adjust. Here's a reference: http://developer.android.com/reference/android/media/audiofx/AutomaticGainControl.html

We have a report from one partner that automatic gain control does NOT work. Note that an App can adjust the input gain digitally itself. With a voice recorder that does not allow the mic input gain to be adjusted, the recording is very quiet. [Note: In release 3.5.19, the default input gain was raised to a good value for both DEFAULT and CAMCORDER.]
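As a rough illustration of adjusting the input gain digitally, the sketch below measures the peak of a short calibration buffer, derives a gain from it, and scales 16-bit PCM samples with clamping. It is plain Java under stated assumptions: the audio is 16-bit signed PCM in a short[] (as an App might read from the mic), and the class PcmGain, its method names, and the target and cap values are hypothetical choices, not platform settings.

```java
/** Hypothetical helper for boosting quiet 16-bit PCM input in software. */
final class PcmGain {
    /** Returns the largest absolute sample value in the buffer (0..32768). */
    static int peak(short[] samples) {
        int max = 0;
        for (short s : samples) {
            max = Math.max(max, Math.abs((int) s));
        }
        return max;
    }

    /** Gain that would scale the observed peak to targetPeak, capped at maxGain. */
    static float gainForTarget(short[] calibration, int targetPeak, float maxGain) {
        int p = peak(calibration);
        if (p == 0) {
            return 1.0f; // pure silence: leave the signal unchanged
        }
        return Math.min(maxGain, (float) targetPeak / p);
    }

    /** Multiplies each sample by gain in place, clamping to the 16-bit range. */
    static void applyGain(short[] samples, float gain) {
        for (int i = 0; i < samples.length; i++) {
            int scaled = Math.round(samples[i] * gain);
            samples[i] = (short) Math.max(Short.MIN_VALUE,
                    Math.min(Short.MAX_VALUE, scaled));
        }
    }
}
```

The cap on the gain matters because a calibration buffer recorded during near silence would otherwise produce a very large multiplier that mostly amplifies noise. Clamping avoids integer wrap-around when a loud sample is scaled past the 16-bit range.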

 

Others:
Here’s a list of other voice vendors we’ve looked at or have heard about, listed here to show the range of support an App can have available to it.  

  1. Speak With Me: http://www.speakwithme.com/
  2. Sensory Inc.
  3. Brighten.ai - check out John Burkey's articles on LinkedIn Pulse
  4. Keen Research
  5. SRI has the core technology used in SIRI – http://www.speechatsri.com/products/sdk.shtml
  6. SoundHound Inc.: https://www.houndify.com/
  7. Verbio
  8. Media Insight
    • Personalize Speech recognition
  9. Rubidium – demo shows only wakeup word
  10. List of Multiple Voice Recognition Embedded/Cloud Based/…

 

Comments

  • Saideep Annadatha

    Could you post on the samples developed for the Command recognition?

  • Paul Matz

    Hi Saideep,

    We are in the process of adding platform support for voice, hoping to have it included in our next R-7 ReticleOS release, so stand by. Until then, there should be a bunch of samples in the CMU pocketsphinx open source you can look at.

    -PMatz
