
Voice Command Recognition


Updated as of June 2018.

Although we have been considering an integration with a third-party voice vendor for some time, nothing has been released to date. There are three reasons. First, most vendors' technology is network based, and we want voice capability even when the device is not connected to a network. Second, because we are not a Google Approved Device, we do not have access to some of the Android voice services provided by Google. Finally, most voice licensing models are per Application, not per platform. We would rather find a platform-based solution that can be used by all Applications on the device.

There are these standard Android APIs related to speech:

but they appear to depend on having some Google components installed. So, although the references mentioned here are not included in the Google Mobile Services (GMS) packages, they appear to have a dependency on them. Also, most services require a network connection to work.

Embedded Voice:

There are many other options for speech recognition. An Application Developer can directly license embedded technology from a vendor (for example, Nuance) and pay a per-App-instance license fee for the speech recognition library. We did some voice integration into our platform using Nuance libraries and built some evaluation Applications, but have not released this to our customers yet. We did this work originally in 2014, then updated and retested it in 2016.

These were the components we evaluated:

  • Nuance Mobile Toolkit SDK for Android – V1.6
  • VoCon Hybrid Base (which includes VoCon Wake-up Word) – V4.6
  • Vocalizer Expressive – V1.3
  • Nuance Voice Security Library (“NVSL”) – V4.5

Overall, we thought voice command recognition worked pretty well. We developed 3 different Apps for evaluation. We also integrated command recognition into our baseline launcher so a user can use voice to start Applications and perform some other functions. Customers who are interested in this should inquire, and we can talk about releasing this functionality.

Here’s a brief description of this integration: When the service is turned on, it starts in wake-up mode, and you need to say “hi ODG” or “hi Reticle” to wake it up. Saying “Thank you ODG” puts it back into wake-up mode. Other than those phrases, here is the vocabulary list, which could easily be expanded.

“Yes” | “No” | “Back” | “Return” | “Enter” | “Return Home” | “Menu” | “Control Panel” | “Sleep” | “Exit” | “Forward” | “Backward” | “Wake up” | “Record video” | “Stop recording” | “Hang up” | “End call” | “Long press” | “Voice recognition off” | “Volume up” | “Volume down” | “Volume mute” | “Select” | “Home” | “Home screen” | “Start recording” | “Take photo” | “Airplane mode on” | “Airplane mode off” | “Go up”

“Call” | “Dial”

“Record video” | “Stop recording” | “Take photo”
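The wake-up behavior described above can be sketched as a small state machine. This is only an illustration in plain Java, not the code from our Nuance-based integration: the class name VoiceCommandRouter and its methods are hypothetical, and it assumes the recognizer hands over each recognized phrase as a string.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

/** Hypothetical router: gates a fixed command vocabulary behind a wake-up phrase. */
final class VoiceCommandRouter {
    private static final Set<String> WAKE_PHRASES =
            new HashSet<>(Arrays.asList("hi odg", "hi reticle"));
    private static final String SLEEP_PHRASE = "thank you odg";

    private final Set<String> vocabulary = new HashSet<>();
    private boolean awake = false; // the service starts in wake-up mode

    VoiceCommandRouter(String... commands) {
        for (String command : commands) {
            vocabulary.add(normalize(command));
        }
    }

    private static String normalize(String phrase) {
        return phrase.trim().toLowerCase(Locale.ROOT);
    }

    boolean isAwake() {
        return awake;
    }

    /** Returns the matched command, or empty if the phrase was consumed or ignored. */
    Optional<String> onPhrase(String phrase) {
        String p = normalize(phrase);
        if (!awake) {
            // In wake-up mode, only a wake phrase has any effect.
            if (WAKE_PHRASES.contains(p)) {
                awake = true;
            }
            return Optional.empty();
        }
        if (SLEEP_PHRASE.equals(p)) {
            awake = false; // back to wake-up mode
            return Optional.empty();
        }
        return vocabulary.contains(p) ? Optional.of(p) : Optional.empty();
    }
}
```

Expanding the vocabulary then amounts to passing more phrases to the constructor; phrases outside the vocabulary are ignored.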

Open Source Voice:

There is a pretty good open source option from Carnegie Mellon: http://cmusphinx.sourceforge.net/wiki/tutorialandroid

Their site: PocketSphinx, CMU – http://www.speech.cs.cmu.edu/pocketsphinx

We used it for a while about 3-4 years ago. One of our customers has been using it recently, and that App works very well, although its vocabulary is very small.


pocketsphinx configuration:

The SpeechListener.java class contains the initialization code. It does not programmatically select a mic (the default one is probably selected). Below is example recognizer initialization code:


recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))

        // To disable logging of raw audio comment out this call (takes
        // a lot of space on the device)

        // Threshold to tune for keyphrase to balance between false
        // alarms and misses

        // Use context-independent phonetic search, context-dependent is
        // too slow for mobile
        .setBoolean("-allphone_ci", true)
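For a small command vocabulary, PocketSphinx can also take a keyword list file, where each line is a phrase followed by its detection threshold between slashes. The sketch below only generates the text of such a file in plain Java. The class name KeywordListBuilder is hypothetical, and the thresholds shown in the usage note are placeholders that would need tuning on the device to balance false alarms against misses. It also assumes every word in each phrase exists in the dictionary being used.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that formats phrases as keyword-list lines. */
final class KeywordListBuilder {
    /**
     * Builds one "phrase /threshold/" line per entry, in map iteration order.
     * Thresholds are passed as strings (for example "1e-20") so they are
     * written exactly as given.
     */
    static String build(Map<String, String> phraseToThreshold) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : phraseToThreshold.entrySet()) {
            sb.append(entry.getKey().trim().toLowerCase(Locale.ROOT))
              .append(" /")
              .append(entry.getValue())
              .append("/\n");
        }
        return sb.toString();
    }
}
```

For example, a LinkedHashMap with "Volume up" mapped to "1e-20" and "Take photo" mapped to "1e-25" produces the two lines "volume up /1e-20/" and "take photo /1e-25/".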






Low Microphone Input Gain:

Multiple partners are using PocketSphinx successfully. Some of their settings are somewhat mysterious, but the voice recognition works well. Make sure you are picking the audio source correctly. See this reference:


These are the audio sources we know work. Choose based on whether you want to record the voice of the glasses wearer or the person they are talking to (or the subject of a video recording). We are told that CAMCORDER definitely works. There was a report that the audio source VOICE_COMMUNICATION does not work. VOICE_COMMUNICATION is supposed to fall back to DEFAULT when it is not supported, and DEFAULT is MIC; that is the correct API behavior. We believe this is fixed in ReticleOS v3.5.17 and later, where VOICE_COMMUNICATION now switches to DEFAULT.
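An App that does not want to rely on the platform fallback can try its preferred sources in order. The following is a minimal sketch in plain Java, with hypothetical names (AudioSourcePicker, pick, canOpen). The integer constants mirror the values of android.media.MediaRecorder.AudioSource. On a device, the canOpen check might construct an AudioRecord with the given source and test whether getState() reports STATE_INITIALIZED; that part is an assumption and is not shown here.

```java
import java.util.function.IntPredicate;

/** Hypothetical helper: picks the first audio source that can be opened. */
final class AudioSourcePicker {
    // Values mirror android.media.MediaRecorder.AudioSource constants.
    static final int DEFAULT = 0;
    static final int MIC = 1;
    static final int CAMCORDER = 5;
    static final int VOICE_COMMUNICATION = 7;

    /**
     * Returns the first source in preferred order that canOpen accepts,
     * or DEFAULT if none of them can be opened.
     */
    static int pick(int[] preferred, IntPredicate canOpen) {
        for (int source : preferred) {
            if (canOpen.test(source)) {
                return source;
            }
        }
        return DEFAULT;
    }
}
```

For example, pick(new int[] {VOICE_COMMUNICATION, CAMCORDER}, canOpen) returns CAMCORDER on a build where VOICE_COMMUNICATION cannot be opened but CAMCORDER can.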

Also note that the default mic input gain can be set pretty low. Most Apps either enable automatic gain control or do a calibration and adjust. Here's a reference: http://developer.android.com/reference/android/media/audiofx/AutomaticGainControl.html

We have a report from one partner that automatic gain control does NOT work. Note that an App can adjust the input gain digitally itself. With a voice recorder that does not allow the mic input gain to be adjusted, the recording is very quiet. [Note: In release 3.5.19, the default input gain was raised to a good value for both DEFAULT and CAMCORDER.]
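As an illustration of adjusting the gain digitally, the sketch below scales 16-bit PCM samples and clamps them so loud samples do not wrap around. It is plain Java with hypothetical names (DigitalGain, apply, gainForPeak), and it assumes the App already has the recorded audio as a short[] buffer. The simple peak-based calibration is one possible approach, not the method any particular partner uses.

```java
/** Hypothetical helper for software gain on 16-bit PCM audio. */
final class DigitalGain {
    /** Multiplies each sample in place by gain, clamping to the 16-bit range. */
    static void apply(short[] pcm, float gain) {
        for (int i = 0; i < pcm.length; i++) {
            int scaled = Math.round(pcm[i] * gain);
            int clamped = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            pcm[i] = (short) clamped;
        }
    }

    /**
     * Calibration step: returns the gain that would bring the loudest
     * sample in the buffer to targetPeak, or 1.0 for a silent buffer.
     */
    static float gainForPeak(short[] pcm, int targetPeak) {
        int peak = 0;
        for (short sample : pcm) {
            peak = Math.max(peak, Math.abs((int) sample));
        }
        return peak == 0 ? 1.0f : (float) targetPeak / peak;
    }
}
```

A calibration pass could call gainForPeak on a short sample of speech and then apply the resulting gain to later buffers. Note that digital gain also amplifies background noise, so a modest target peak is usually safer than driving samples to full scale.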


Here’s a list of other voice vendors we’ve looked at or have heard about, listed here to show the range of support an App can have available to it.  

  1. Speak With Me: http://www.speakwithme.com/
  2. Sensory Inc.
  3. Brighten.ai - Near to releasing Android Support;  check out John Burkey's articles on LinkedIn Pulse:
  4. Keen Research - http://keenresearch.com/   (embedded/on-device Voice Recognition SDK Solution for both Android and iOS)
  5. SRI has the core technology used in SIRI – http://www.speechatsri.com/products/sdk.shtml
  6. SoundHound Inc.: https://www.houndify.com/
  7. Verbio
  8. Media Insight
    • Personalize Speech recognition
  9. Rubidium – demo shows only wakeup word
  10. List of Multiple Voice Recognition Embedded/Cloud Based/…




  • Saideep Annadatha

    Could you post on the samples developed for the Command recognition?

  • Paul Matz

    Hi Saideep,

    We are in the process of adding platform support for voice, hoping to have it included in our next R-7 ReticleOS release, so stand by. Until then, there should be a bunch of samples in the CMU pocketsphinx open source you can look at.


  • Other than CMU pocketsphinx, is there any other paid or open source SDK for voice recognition?

  • Saideep Annadatha

    Microsoft Bing APIs are working well with the ODG R-7 glasses. The only caveat is that they require internet connectivity for the recognition and speech-to-text conversion.

  • Paul Matz

    Hey Saideep,

    Thanks for the details. Can you provide a link to the API set you are using? Happy to update the article.


    As mentioned in the article, the SDK we are most familiar with is Nuance. We've used a variety of their SDKs, both the fully embedded ones and the Hybrid client, which can run embedded without a network connection or connect to Nuance's server for faster, more accurate recognition.

  • Saideep Annadatha

    This is the sample provided by Microsoft which I used to test :
    https://github.com/Azure-Samples/Cognitive-Speech-STT-Android . Complete details are provided in the link
