Speech Recognition on the iPhone 5
In addition to Siri, the "personal assistant" that has received much fanfare since its release on the iPhone 4S, the iPhone 5 and subsequent versions of the iPhone and iPad include the same "keyboard dictation" that is found on the iPhone 4S and third-generation iPad. Both Siri and keyboard dictation (now simply called "dictation" in the Apple iOS 6 User Guide) use cloud-based speech recognition.
How does it work?
Neither Apple nor Nuance is talking about it, but it's pretty darn clear that Apple dictation is based upon the cloud-based speech recognition software pioneered by Nuance Communications and likely licensed to Apple. Why do I say this? Back when Siri was a stand-alone app, before it was purchased by Apple, it was clearly labeled with the "speech recognition by Dragon" motto. Further, Nuance has created several stand-alone speech recognition applications for the iPhone which ... well ... work exactly like the speech recognition built into the iPad and iPhone 5. But who cares where it came from? Good move on Apple's part to integrate the world's best cloud-based speech recognition into the iPad and new iPhone 5.
But back to the "how does it work" part. When dictation is activated and you speak, your words are captured, digitized and compressed into a wave file that is sent SOMEWHERE, where it is processed using speech recognition software and converted to text. The text is then sent back to your device via the internet, whereupon it appears magically on your screen. It's pretty amazing.
It is important to understand a few things about all speech recognition:
- The quality of the end product is dependent upon the quality of the signal it receives.
- Speech recognition depends upon statistical models which know which words tend to occur with each other, so it is very context based. In other words, the decision it makes on a specific word is dependent upon the words it believes it heard before and after the word it is deciding upon.
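The context dependence described in the second point can be illustrated with a toy sketch. This is not how Apple's or Nuance's recognizer is implemented; the word-pair counts, the `score` function and the `pick_word` function below are all made up for illustration. The sketch simply chooses between two sound-alike words by looking at how often each one occurs next to its neighbors.

```python
# Toy word-pair counts. These numbers are invented for illustration only;
# a real recognizer learns its statistics from very large amounts of text.
PAIR_COUNTS = {
    ("over", "there"): 90,
    ("over", "their"): 5,
    ("there", "is"): 100,
    ("their", "is"): 1,
    ("their", "house"): 80,
    ("there", "house"): 1,
}


def score(prev_word, candidate, next_word):
    """Score a candidate by how well it fits the words before and after it."""
    # Add 1 to each count so an unseen pair does not zero out the score.
    left = PAIR_COUNTS.get((prev_word, candidate), 0) + 1
    right = PAIR_COUNTS.get((candidate, next_word), 0) + 1
    return left * right


def pick_word(prev_word, candidates, next_word):
    """Return the sound-alike candidate that best fits its context."""
    return max(candidates, key=lambda word: score(prev_word, word, next_word))


print(pick_word("over", ["their", "there"], "is"))     # context favors "there"
print(pick_word("to", ["their", "there"], "house"))    # context favors "their"
```

With only a single word to go on, the sketch has no neighbors to consult, which is one way to see why dictating whole phrases tends to work better than dictating isolated words.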
Given how speech recognition works, you can optimize your accuracy when using the integrated speech recognition in the iPhone 5 or third-generation iPad by doing the following:
- Speaking clearly (enunciate carefully)
- Avoiding extraneous noise (this has the effect of contaminating your otherwise clear speech)
- Speaking in phrases or sentences (this may require thinking ahead before you initiate dictation)
Practically speaking, if you are using the iPhone 5 speech recognition for casual use, we suggest that you use it without extraneous equipment and optimize your accuracy by 1) speaking near the microphone (bottom of the phone) and 2) avoiding loud extraneous noise. If you try dictating from two distance ranges (a few inches from the microphone vs. a couple of feet), both with and without external noise, you will very quickly see the beneficial effect of staying close and avoiding noise.
For the Highest Accuracy
If integrated speech recognition on the iPhone 4S, iPhone 5, or third-generation iPad is essential to you, then we suggest you make the leap to using an alternative to the on-board microphone on these devices. Why? In a nutshell, while there is nothing wrong with the on-board mic, it has neither the accuracy nor the external noise rejection characteristics of a basic or high-quality headset microphone. By the way, this same principle holds if you are using other audio applications with your iPhone or, for that matter, with other smartphones and portable devices.
To understand the noise rejection issue with the on-board microphone, listen to the following two recordings, made with the on-board mic and a quality headset microphone, both with and without contaminating noise. You will immediately hear how much external noise the iPhone 5 microphone picks up, and how much this is reduced with a good headset microphone.
- Recording using iPhone 5 Mic (coming)
- Recording using the Andrea NC-181 (coming)
While the Apple Camera Connection Kit was a means of connecting a USB microphone to an iPhone or iPad, this is currently not an option with the new "Lightning" connector. Whether it will be an option in the future depends upon exactly what the eight pins on the new connector carry. At this point there are only two ways to interface a microphone with an iPhone 5: via the audio jack at the bottom of the phone, or via Bluetooth. For a number of reasons we suggest using Bluetooth only for telephone uses and not for serious audio applications, including speech recognition.
In order to use a traditional headset microphone with an iPhone or iPad, it is necessary to connect it with an adapter which splits the single jack into its respective mic-in and stereo audio-out functions. A sample adapter which we have had manufactured for us is shown below.
Making it Happen
If you choose to go beyond the built-in microphone, we suggest the following:
- Pick up an iPhone/iPad headset adapter (we sell one for $14.95; there are other sources)
- Choose a microphone suited to your purposes
For most applications, any of the microphones which we have listed on our iPad Speech Recognition page will work with the iPhone 5.
If you will be working in a particularly noisy environment, we suggest you consider one of our microphones with the strongest external noise rejection, such as the Sennheiser ME3 or Audio Technica 8HEmW (both are microphone only and don't include speakers). Other strongly performing microphones are the UmeVoice theBoom mics, including the "O", "C", "V4" and "Quiet".
Voice Commands and Punctuation
When using speech recognition with your iPad or iPhone 5, keep in mind that you need to speak all of your punctuation. This is the list of commands from the iPhone iOS 6 User Guide:
- quote … end quote or open quote/close quote
- new paragraph
- cap—to capitalize the next word
- caps on … caps off—to capitalize the first character of each word
- all caps—to make the next word all uppercase
- all caps on … all caps off—to make the enclosed words all uppercase
- no caps on … no caps off—to make the enclosed words all lowercase
- no space on … no space off—to run a series of words together
- smiley—to insert :-)
- frowny—to insert :-(
- winky—to insert ;-)
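To make the effect of these commands concrete, here is a minimal sketch of how a list of spoken words could be turned into formatted text. It is only an illustration of the behavior listed above, not Apple's implementation: the names `apply_commands`, `match_command` and `style_word` are hypothetical, quotation commands and spoken punctuation are left out, and a real dictation system would also have to decide when a word such as "cap" is meant literally.

```python
EMOTICONS = {"smiley": ":-)", "frowny": ":-(", "winky": ";-)"}
COMMANDS = {
    "new paragraph", "cap", "all caps",
    "caps on", "caps off", "all caps on", "all caps off",
    "no caps on", "no caps off", "no space on", "no space off",
}


def match_command(words, i):
    """Return the longest command phrase starting at words[i], or None."""
    for n in (3, 2, 1):
        chunk = words[i:i + n]
        phrase = " ".join(chunk).lower()
        if len(chunk) == n and phrase in COMMANDS:
            return phrase
    return None


def style_word(word, style):
    """Apply a capitalization style to a single word."""
    if style in ("cap", "caps"):
        return word[:1].upper() + word[1:]
    if style == "all caps":
        return word.upper()
    if style == "no caps":
        return word.lower()
    return word


def apply_commands(spoken):
    """Convert a string of spoken words into text, honoring the commands."""
    words = spoken.split()
    text = ""
    case_mode = None   # style set by "caps on", "all caps on" or "no caps on"
    next_case = None   # style for the next word only ("cap" or "all caps")
    no_space = False   # True between "no space on" and "no space off"
    glue = False       # True when the next word attaches to the previous one
    i = 0
    while i < len(words):
        phrase = match_command(words, i)
        if phrase is None:
            # An ordinary word (or an emoticon name): style it and append it.
            word = EMOTICONS.get(words[i].lower(), words[i])
            word = style_word(word, next_case or case_mode)
            next_case = None
            if text and not text.endswith("\n") and not glue:
                text += " "
            text += word
            glue = no_space
            i += 1
            continue
        if phrase == "new paragraph":
            text += "\n\n"
        elif phrase in ("cap", "all caps"):
            next_case = phrase
        elif phrase == "no space on":
            no_space = True
        elif phrase == "no space off":
            no_space = glue = False
        elif phrase.endswith(" on"):
            case_mode = phrase[:-3]   # "caps", "all caps" or "no caps"
        else:
            case_mode = None          # "caps off", "all caps off", "no caps off"
        i += len(phrase.split())
    return text


print(apply_commands("caps on the quick fox caps off jumps"))
```

In this sketch, longer command phrases are checked first, so "all caps on" is not mistaken for the one-word "all caps" command followed by the word "on".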
Here is a somewhat lengthier list of punctuation and commands for use with Apple's cloud-based voice recognition, taken from our Speech Recognition Comes to the iPad white paper:
- Dash (or hyphen)
- Open quote/begin quote
- Close quote/end quote
- Open parenthesis/left parenthesis
- Close parenthesis/right parenthesis
- New paragraph (or Next Paragraph)
- All caps on
- All caps off
- No caps on
- No caps off
- No space on
- No space off
For More Information: