Voice Recognition
Simply speaking
"Commonly
used applications like Word involve typing that is essential though
tedious. However, speech recognition technology can make life a lot
easier, as it does away with the chores of dictation and data
entry."
Thanks to the personal computer, tedious tasks such as maintaining
records and the medical history of patients were automated, and they
were no longer considered time-consuming and monotonous. Back in the
dark ages, records were maintained by painstakingly writing out each
and every detail and then filing it alphabetically. With the PC, all
the user does is enter the data in the requisite fields and maintain
the details in digital format. But soon even this seemingly easy
method will lose its shine as users grow more accustomed to
automation and set out in search of simpler ways of doing things.
The dawn of speech recognition technology has given users the option
to dictate text instead of typing it, making the task of data entry
even easier. With swift strides being made in this area, don’t be
surprised if very soon simply thinking initiates data entry!
Working of the speech recognition engine
Thanks to the immense demand for speech recognition packages from all
over the world, there are at least a dozen packages available in the
market today. Developers have gone a step further and begun
customizing packages to recognize even the dialects within a country.
Wonder why they would need to do something like that?
Speech recognition technology works on the basis of a built-in
dictionary, along the lines of a digital version of a standard
dictionary such as Webster’s. The user needs to train the software so
that the speech recognition engine can understand the user’s accent
and manner of pronunciation. For this process, the developer provides
a standard body of text that is already mapped into the application.
When the user reads this text aloud to the software, the engine maps
each word to the way the user has pronounced it. The more the user
trains the software, the higher the level of accuracy the application
yields.
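
The training idea described above can be illustrated with a toy
sketch. It is not how any of the packages discussed here is actually
implemented: the three-word dictionary, the phoneme spellings, and the
names BUILT_IN and UserProfile are all assumptions made up for the
example, and matching is done with a simple sequence-similarity score
from Python's standard library rather than a real acoustic model.

```python
from difflib import SequenceMatcher

# Hypothetical built-in dictionary: word -> reference pronunciation,
# written as space-separated phoneme symbols (illustrative only).
BUILT_IN = {
    "data": "d ey t ah",
    "entry": "eh n t r iy",
    "record": "r eh k er d",
}


class UserProfile:
    """Stores how one particular user pronounces the dictionary words."""

    def __init__(self):
        self.variants = {}  # word -> pronunciations heard during training

    def train(self, word, heard):
        # During training the text is known, so the engine can attach
        # the sounds it heard to the word the user was asked to read.
        self.variants.setdefault(word, []).append(heard)

    def recognize(self, heard):
        # Compare the input against the reference pronunciation and any
        # trained variants, and return the best word with its score.
        best_word, best_score = None, 0.0
        for word, reference in BUILT_IN.items():
            for candidate in [reference] + self.variants.get(word, []):
                score = SequenceMatcher(
                    None, heard.split(), candidate.split()
                ).ratio()
                if score > best_score:
                    best_word, best_score = word, score
        return best_word, best_score


profile = UserProfile()
before = profile.recognize("d aa t aa")   # untrained: partial match only
profile.train("data", "d aa t aa")        # user reads the training text
after = profile.recognize("d aa t aa")    # trained: exact match
print(before, after)
```

In this sketch the match score for the user's own pronunciation rises
once that pronunciation has been recorded, which mirrors the point
that more training yields better accuracy.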
So which one should you buy?
This could be a tough decision, as there are many options available
in the market today. However, there are a couple of aspects to keep
in mind while selecting a speech recognition application. The first,
as mentioned earlier, is customization of the package for Asian
accent recognition. This will give you a greater level of accuracy
with less training. The second is hardware related and concerns the
input device, the microphone. Many speech recognition applications
are marketed with their own specialized microphone, and these are a
better choice than the ones that do not include a microphone.
Although a speech recognition package would work with any
microphone, the one packaged with the software has been designed to
cut out surrounding sound, thereby increasing the sensitivity and
accuracy of the application. Does this mean that microphones are
interchangeable among speech recognition packages? No, that may not
be possible, as each microphone has been designed and tested to work
with just that particular speech engine and therefore would not
provide the desired result with other packages.
Dragon NaturallySpeaking 3.0 has been one of the most popular
packages in the market. Ever since its launch, this package has been
able to attain an average accuracy of about 91 percent, and most
users can expect to get 87 to 95 percent accuracy after completing
the general training. Its New User Wizard is thorough and easy to
follow. You can dictate directly into most applications, and the
program integrates with Microsoft Word and Corel WordPerfect. It also
includes some useful shortcuts that streamline dictation.
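
The article does not say how these accuracy percentages were
measured. One common way to score a dictation engine, shown here
purely as an illustrative sketch, is to compare the recognized text
with the text that was actually spoken and count word-level
substitutions, insertions, and deletions. The function names and the
sample sentences below are assumptions made up for the example.

```python
def word_errors(reference, hypothesis):
    """Return (errors, reference_length) using word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # word dropped by engine
                           cur[j - 1] + 1,       # extra word inserted
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Fraction of reference words recognized correctly (1 - error rate)."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference text must contain at least one word")
    return 1.0 - errors / total


spoken = "please enter the patient record now"
recognized = "please enter a patient record now"
accuracy = word_accuracy(spoken, recognized)
print(f"{accuracy:.0%}")  # one wrong word out of six -> prints 83%
```

By this measure, an accuracy of 91 percent corresponds to roughly
nine word errors in every hundred words dictated.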
The latest continuous-speech recognition offering from IBM, ViaVoice
98, includes a smarter speech recognition engine, modeless operation,
and extensive command and control capabilities. However, it falls
short on accuracy. Intelligently designed and easy to learn, ViaVoice
features a 64,000-word base vocabulary that can be expanded to
128,000 words. Once you complete the setup, a wizard provides a short
tour, helps you configure the microphone and speakers, and walks you
through the Quick Training module. To enroll, you simply read from
your choice of several texts and then let the software process your
voice information. The system supports multiple users and multiple
enrolments per user to accommodate different vocabularies, acoustical
environments, or both.
Targeted at home and SOHO users, Philips FreeSpeech 98 is a limited
speech recognition solution. The setup is straightforward. The
program walks you through an audio tuning process and some initial
training. To set up your voice profile, you must read into the
microphone for about 15 minutes. The system spends another 15 minutes
processing your input. A green light on FreeSpeech 98’s toolbar lets
you know when you’re in a program that can accept voice input.
Voice Direct Professional from IMSI gives the user the option to
dictate directly into virtually any Windows application, including
Excel, Word, WordPerfect, Lotus 1-2-3, PowerPoint, and even Internet
Explorer. Use voice commands to dictate, edit, and control
application functionality, everything from data entry to formatting
to printing, without even touching the keyboard. Voice Direct
Professional’s proprietary Mouse Grid technology enables hands-free
cursor control with pinpoint accuracy. It even includes a
high-quality noise-canceling microphone that eliminates background
noise so you can be sure your message gets through. Voice Direct
includes a 120,000-word total vocabulary for maximum accuracy and
minimum keyboarding time, and you can even build macros for
repetitive tasks using a BASIC-like scripting language.
What can you use it for?
There are various functions that can be performed using speech
recognition software. In fact, the options range right from inputting
data such as text and numbers to logging into your machine and
shutting it down. Speech recognition software not only allows the
user to input text and numeric data into documents and forms but also
allows adding punctuation and layout such as tabs, commas, paragraph
changes, and full stops. Basic Windows functions that are generally
performed using the mouse or the keyboard can also be controlled.
Functions like opening and closing applications and saving and
deleting files can be done using voice commands. Most speech
recognition packages have a read-back function built into the engine
that allows the system to read dictated files back to the user. The
user can use this feature to have documents read back, including
those that have just been typed in. This feature even gives users the
facility to have their e-mails read back to them.
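
As a rough illustration of how spoken punctuation such as commas,
full stops, tabs, and paragraph changes might be turned into text,
the sketch below maps spoken names to symbols. It is a simplified
assumption rather than the behavior of any package named above: it
presumes the engine has already split the speech into phrases, with
each punctuation name arriving as a phrase of its own, and the table
SPOKEN_PUNCTUATION and the function render_dictation are hypothetical.

```python
# Hypothetical mapping from spoken names to the characters they insert.
SPOKEN_PUNCTUATION = {
    "comma": ",",
    "full stop": ".",
    "new paragraph": "\n\n",
    "tab": "\t",
}


def render_dictation(phrases):
    """Join recognized phrases, replacing spoken punctuation with symbols."""
    out = ""
    for phrase in phrases:
        symbol = SPOKEN_PUNCTUATION.get(phrase.lower())
        if symbol is None:
            # Ordinary dictated words: append them followed by a space.
            out += phrase + " "
        else:
            # Punctuation attaches directly to the preceding word.
            out = out.rstrip(" ") + symbol
            if symbol in {",", "."}:
                out += " "
    return out.rstrip(" ")


text = render_dictation([
    "Dear Sir", "comma", "new paragraph",
    "please send the records", "full stop",
])
print(text)
```

A real package would also need a way to dictate the literal word
"comma", which this sketch does not attempt to handle.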
Copyright © 2002 Dr. Subrahmanyam Karuturi