High-performance speech recognition technology on smartphones

Applying speech recognition technology to desktop computers seemed like a good idea. For most people, however, speech recognition could not replace the keyboard and mouse. Now voice technology is being used in a brand-new environment: mobile phones. Its use on mobile phones is pushing the technology in new directions that it never reached in desktop applications.

Speech recognition first appeared in the 1950s as an early-stage technology, largely a curiosity. IBM, which marks its 100th anniversary this year, created an experimental speech recognition system called "Shoebox" in the early 1960s. The device solved spoken arithmetic problems: it could recognize 16 spoken words and answer simple mathematical questions such as "3 + 4 = ?"

DragonDictate, which Dragon Systems launched for DOS computers in the early 1980s, may have been the first speech recognition application. It could recognize only discrete words, so users had to speak one word at a time. Over time, the application evolved into a product called "Dragon NaturallySpeaking" (currently in version 11 and owned by Nuance Communications Corporation), which can transcribe speech delivered at a normal conversational pace.

Two constraints have held back speech recognition on desktop computers. First, to achieve high accuracy, the application must be trained to recognize the user's speech characteristics. Both the native voice-to-text technology in the Windows Vista and Windows 7 operating systems and third-party products such as Dragon NaturallySpeaking still require a training period before they can be used.

The second constraint is the popularity of the keyboard. Most people are used to typing rather than speaking, so voice control faces the same adoption barrier as the Dvorak keyboard layout. When the plain old QWERTY keyboard is available and works well, why learn to use Dvorak?

Microsoft's TellMe team is the group responsible for developing speech recognition technology for multimedia environments. Abhi Rele, senior product manager on the TellMe team, pointed out that in a desktop environment users already have convenient ways to communicate with the machine, such as the keyboard and mouse. As a result, voice input there appeals mainly to enthusiasts.

Wider adoption of voice-controlled computing requires two things: better, more convenient applications and a setting where voice is the natural mode of input. The mobile phone is that long-awaited setting.

Matt Revis, vice president of product management and marketing at Nuance, explained the difference between the desktop and mobile environments this way: the desktop is a stationary environment in which the user's attention is entirely on the computer. Desktop voice technology therefore mainly supports tasks such as office applications, web browsing and communications. On the mobile side, voice is used more to support different lifestyles: professionals on the move, outdoor leisure activities, hands-free calling, and so on.

Gartner analyst Tuong Nguyen agrees: voice makes more sense in a mobile environment. From a usability perspective, he said, voice recognition is more valuable on a handheld device because it adds a user-friendly, convenient input method.

Nguyen added that the value of voice recognition becomes clear when you compare speaking a simple statement with flipping through many menus or typing on a small on-screen keyboard. As the use of touch-screen devices without a physical keyboard grows, voice recognition will be used to enhance data input and output. Voice recognition also helps meet hands-free needs, including those imposed by law.

On mobile devices

Because mobile devices generally have only a fraction of the storage and processing power of desktop computers, it took some time for voice processing to appear on mobile phones, even in basic form.

The Springer Handbook of Speech Processing describes the state of mobile phones in the early 2000s. Despite the limitations of the time, a phone could, once programmed, recognize spoken digits for dialing and, to some extent, the names of people. The main problem was memory, so most phones could recognize only 10 numbers or names at a time. The authors point out another problem: the feature was seldom used, probably because phone manufacturers marketed it poorly.

As the memory and processing power of mobile phones have increased, so have the recognition capabilities of ordinary phones. In 2005, Samsung Electronics released the $99 SCH-p-207 mobile phone, which added voice-to-text dictation to voice dialing. With memory now reaching hundreds of megabytes and storage reaching several gigabytes, the current generation of smartphones is rarely constrained in this way.

Another key improvement is network speed. The rising tide of faster wireless networks has lifted many boats, including the latest generation of voice processing technology. Faster networks make it possible to move voice processing tasks off the handset and across the network to remote servers.

Amir Mane, product manager for Google Voice Search, explained how a faster network helps the Google Voice application. Because all the heavy processing is handled by Google's servers over the network, he said, the computing power required of the handheld device is reduced.

Current applications

Mobile phone voice recognition today is not limited to voice dialing. Voice dialing, a form of voice activation, was the first speech recognition feature to appear on mobile phones. Even many low-end phones now offer it, although it performs somewhat worse on uncommon names in the phone book.

Gartner's Nguyen pointed out that the newer generation of voice functions is more open-ended. The user no longer has to program specific voice commands for specific functions; the application recognizes the speech and performs the appropriate action. Higher-end, more powerful devices make these applications more feasible. In other words, users can not only say "Call 888-555-1212" to dial a phone number, they can also say "Call Mom" or "Call My Mom."
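As a rough illustration of the difference between dictated digits and an open-ended request such as "Call Mom", the sketch below maps an already-transcribed phrase to a number to dial. It is only a minimal sketch under stated assumptions, not how any particular phone implements the feature: the function name `parse_call_command`, the `CONTACTS` phone book and its entries are all hypothetical, and the speech-to-text step is assumed to have happened already.

```python
import re

# Hypothetical phone book (illustrative entries only), numbers stored as digits.
CONTACTS = {"mom": "8885551212", "office": "8885553434"}


def parse_call_command(transcript, contacts=CONTACTS):
    """Return the digits to dial for a transcribed 'call ...' phrase, or None."""
    text = transcript.strip().lower()
    match = re.fullmatch(r"call\s+(.+)", text)
    if not match:
        return None  # not a call command at all

    target = match.group(1).strip()

    # Case 1: the speaker dictated the number itself ("Call 888-555-1212").
    digits = re.sub(r"[\s\-().]", "", target)
    if digits.isdigit() and len(digits) >= 7:
        return digits

    # Case 2: the speaker named a contact ("Call Mom", "Call My Mom").
    name = re.sub(r"^my\s+", "", target)
    return contacts.get(name)  # None if the name is not in the phone book
```

With this hypothetical phone book, "Call 888-555-1212", "Call Mom" and "Call My Mom" all resolve to the same number, while a name that is missing from the phone book simply returns nothing.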

Google Voice Search has fewer restrictions than earlier voice recognition technology because all the heavy lifting is done by web servers. This makes voice-driven applications such as Google Voice Search more feasible. For example, if you say "Tron: Legacy movie times", you will see a web page of listings for your area code or location. The application not only recognizes the meaning of the phrase, it also combines information from your phone (your current location) with information from the web (showtimes).

The app also knows English well and can tell some similar-sounding vocabulary apart without training. If I say "Motley Crue", it even uses the band's unique spelling in the search, although it omits the diacritical marks. Search for "Motley's Crew" and you will get a comedy instead.

That said, the limitations of Google's speech recognition become apparent the further you stray from mainstream English. Foreign names, for instance, are handled poorly. Another problem for speech recognition applications is environmental noise, which generally affects mobile users more than desktop users. Nuance's Revis said that accuracy is a problem in noisy outdoor environments.

Dictation has made great strides since Samsung's phone launched in 2005. Dragon Dictation for the iPhone, powered by Dragon NaturallySpeaking technology, lets users dictate everything from memos and emails to Twitter updates. The Dragon software for email provides similar functionality for BlackBerry devices.

For Android phones, Nuance offers FlexT9, which combines Dragon dictation with three touch-screen input methods. There is also the Handcent SMS application, which integrates Android's native voice recognition so that you can send text messages by voice.

Text-to-text translation has been available for many years (for example, through the well-known Babel Fish website). Simultaneous speech translation is not yet available, but software is moving in that direction. For example, the Jibbigo software for iPhone can translate words, phrases and reasonably simple sentences, allowing two parties to take turns speaking.
