
ASC Proceedings of the 40th Annual Conference
Brigham Young University - Provo, Utah
April 8 - 10, 2004        

Construction Applications Using Voice Recognition

 
Gregg R. Corley and William C. Ligon
Clemson University
Clemson, South Carolina

 

This paper is a study of current voice recognition tools available for the construction industry. Two voice recognition computer applications are discussed and compared. One application is tested using common word-processing and spreadsheet applications and then tested using a construction-specific scheduling and project management application.

 

Key Words: Computer Applications, Information Technology, Voice Recognition, Project Management

 

Introduction

Voice recognition technology first came into popular usage in the construction industry in the early 1990s. The software was first used in CAD design, material takeoffs, and inspection. Over the past ten years the software has developed into a state-of-the-art technology. Voice recognition technology has the potential to change the way contractors operate by improving their productivity while remaining cost effective. The software is compatible with Microsoft Word and Excel, as well as with construction-related software such as Primavera Project Planner. In addition, it is easy to install, set up, and use.

Voice recognition software allows the user to enter data into a computer program without touching a keyboard. The user simply speaks into a hands-free headset microphone and, with spoken words and phrases, performs complex tasks. Voice recognition technology allows users to create custom voice commands to create documents, insert boilerplate text, launch applications, create and complete forms, and automate complicated tasks (ScanSoft, Inc., 2003, March). The software is extremely useful for entering data in computers, instructing remote-controlled equipment, and recording instrument readings, while leaving the user free to perform other tasks instead of being tied to the keyboard for data entry (Stukhart & Berry, 1992).

Voice recognition technology is now being used by many large and small businesses, both in and out of the construction industry, to compose letters, memos, and e-mail messages. Many are even using it to enter data into complex forms and spreadsheets. Users can also search Internet Web sites, access information, and navigate Web pages by speaking URLs and links (ScanSoft, Inc., 2003, November 10). Mobile data entry is also possible over wireless systems, and there is now voice recognition software that allows e-mail users to speak their e-mail messages into their personal digital assistants (PDAs) and send them to recipients as voice messages. It also permits users to hear received e-mail as spoken messages and to navigate through their entire e-mail resources via a comprehensive range of voice commands (Domain Dynamics, 2003, August 14).

Comparison of Voice Recognition Software

A variety of voice recognition products are available. Based on a CNet.com search, the two most popular products are Dragon NaturallySpeaking 7 (DNS7) and IBM ViaVoice 10.0. Both products allow the user to perform complex tasks using spoken words and phrases, and both cost around $200. Table 1 shows a comparison of these two software packages (CNet.com search, 2003, November 11 and PCWorld.com search, 2003, November 12).

Table 1

Comparison of DNS7 and IBM voice recognition software.

SYSTEM REQUIREMENTS

Dragon NaturallySpeaking 7 – Preferred Edition
· Pentium III 500 MHz processor (or equivalent)
· 128 MB RAM
· 300 MB free hard disk space (700 MB for full installation)
· Microsoft Windows XP, ME, 98SE, 2000, or NT 4.0 (with SP-6 or greater)
· Microsoft Internet Explorer 5.0 or higher
· CD-ROM
· Speakers
· Microphone
· Sound card

IBM ViaVoice 10.0 – Pro USB Edition
· Pentium II 600 MHz processor (or equivalent)
· 192 MB RAM
· 510 MB free hard disk space
· Microsoft Windows XP, ME, 98SE, 2000, or NT 4.0 (with SP-6 or greater)
· Microsoft Internet Explorer 6.0
· CD-ROM
· Speakers
· Microphone
· Sound card

PRICE

· Dragon NaturallySpeaking 7 – Preferred Edition: $200
· IBM ViaVoice 10.0 – Pro USB Edition: $190

WHAT'S INCLUDED

Dragon NaturallySpeaking 7 – Preferred Edition
· Noise-canceling headset microphone
· User's Guide
· Command Reference Card

IBM ViaVoice 10.0 – Pro USB Edition
· Noise-canceling stereo USB headset microphone
· User's Guide
· Command Reference Card

KEY BENEFITS

Dragon NaturallySpeaking 7 – Preferred Edition
· Dictation into most Microsoft Windows-based applications
· Control menus and dialog boxes in most Microsoft Windows-based applications by voice
· Format and edit by voice
· Mouse control by voice
· Use automatic punctuation control
· Issue Web browser commands by voice
· Use with handheld digital recorder
· Dictation into Pocket PC

IBM ViaVoice 10.0 – Pro USB Edition
· Dictation, editing, formatting and correction into ViaVoice SpeakPad software
· Dictation, editing, formatting and correction into Microsoft Office applications
· Direct dictation into Internet applications
· Launch URLs by voice
· Issue Web browser commands by voice
· Mouse control by voice

There are six editions of DNS7: the Essentials Edition, Standard Edition, Preferred Edition, Medical Edition, Legal Edition, and Professional Edition. The Preferred Edition is suited for home and small office users. IBM likewise offers a number of voice recognition software editions, including the ViaVoice for Windows Pro USB Edition.

Financial constraints prevented the testing of both products, so, after careful consideration, the authors chose the DNS7 Preferred Edition for this study. The software was installed on a laptop computer with the following specifications:

· Dell Inspiron 8200 Pentium 4, 1.8 GHz notebook computer
· 384 MB SDRAM, 266 MHz memory
· 32 MB DDR 4X AGP NVIDIA NV17 3D video graphics board
· 30 GB hard drive
· Fixed 24X internal CD-RW/DVD combination drive
· Windows XP Home operating system
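As a rough illustration, the laptop above can be checked against the minimum requirements listed in Table 1. The sketch below is not part of the study: the `Specs` class and `shortfalls` function are hypothetical names, the free disk space figure is an assumed value (the paper reports only the 30 GB drive capacity), and comparing clock speeds across processor generations is only an approximation.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Specs:
    cpu_mhz: int
    ram_mb: int
    free_disk_mb: int


# Minimums taken from Table 1 (DNS7 uses the 700 MB full-installation figure).
DNS7_PREFERRED = Specs(cpu_mhz=500, ram_mb=128, free_disk_mb=700)
VIAVOICE_PRO_USB = Specs(cpu_mhz=600, ram_mb=192, free_disk_mb=510)

# Test laptop: 1.8 GHz and 384 MB come from the paper.
# ASSUMPTION: 10,000 MB free on the 30 GB drive; the paper does not say.
TEST_LAPTOP = Specs(cpu_mhz=1800, ram_mb=384, free_disk_mb=10_000)


def shortfalls(machine: Specs, minimum: Specs) -> list[str]:
    """Return the names of the fields where the machine is below the minimum."""
    return [
        f.name
        for f in fields(Specs)
        if getattr(machine, f.name) < getattr(minimum, f.name)
    ]


if __name__ == "__main__":
    for label, minimum in [("DNS7 Preferred", DNS7_PREFERRED),
                           ("ViaVoice Pro USB", VIAVOICE_PRO_USB)]:
        missing = shortfalls(TEST_LAPTOP, minimum)
        print(label, "OK" if not missing else f"falls short on: {missing}")
```

Under these assumptions the laptop clears the listed minimums for both packages, which is consistent with the authors' statement that cost, not hardware, drove the choice of product.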

The Preferred Edition comes on a single CD-ROM, and the instructions included with the package are easy to read and follow. Once the CD is inserted, the installation starts automatically, and the user simply has to follow the on-screen instructions. Installing the software took the authors approximately 15 minutes.

The first step in using the software is creating and training a new user. DNS7 uses a speech model to adapt to the user’s voice during training (ScanSoft, Inc., 2003, March). The New User Wizard explains how to position the microphone. While this may seem like a simple task, it is extremely important: if the microphone is not positioned properly, DNS7 may not detect all words or phrases spoken, which could lead to mistakes. Training a new user consists of training the computer to recognize the user’s voice. More than one person can use the program on the same computer as long as each person has created an individual user profile. Each person who wants to use the program needs to create a new set of user speech files and train DNS7 to understand his or her voice (ScanSoft, Inc., 2003, March). The user speech files contain all of the information that DNS7 gathers about the user: pronunciation, vocabulary, how often certain words are used, and preferences, such as whether to insert one or two spaces after a period (full stop) (ScanSoft, Inc., 2003, March). The Preferred Edition allows the user to dictate in different dialects and even different languages. During setup and training, the user reads short paragraphs aloud so that the program can begin to recognize the user’s speech patterns. In tests conducted by the authors, it took just under 15 minutes for the software to recognize the user’s voice effectively.

Using DNS7 with Microsoft Word and Excel

The first tested application of the DNS7 software was composing a business letter in Microsoft Word (see the Appendix for the letter created). DNS7 comes with a Command Reference Card containing a quick reference to common commands a user might need to write this type of letter. DNS7 automatically inserts punctuation if the user desires, or the user can override this feature and add or change the punctuation as needed. Dictating the sample business letter took approximately ten minutes, which is probably about the same amount of time it would have taken to type it using a keyboard. As with most computer applications, the more it is used, the more comfortable the user becomes, so that dictation soon takes less time than typing. The only problem encountered when dictating was that DNS7 did not always select the word or phrase that was spoken. Sometimes the software misspells a word or types something that only sounds like what was said (see Figure 1). To correct this, the user says into the microphone “select punctuation” and then “choose 1” or “choose 2” to select the suggested spelling. The user can also use the keyboard to correct a mistake quickly, and the software remembers the correction the next time the same word is spoken.

 

Figure 1: Screenshot of DNS7 used in Microsoft Word to select and correct a word.

One thing the user should remember is that yelling into the microphone does not help. In fact, DNS7 works better when the user speaks full sentences and speaks in his or her natural voice (ScanSoft, Inc., 2003, March). Using DNS7 in word-processing applications may at first be a little frustrating, and it may be hard not to use the keyboard because users are so accustomed to typing. However, after some practice with the voice commands, it becomes apparent that dictating text is much faster than typing. A relatively fast typist who can type 50 words per minute will produce a 900-word document in 18 minutes. Using DNS7, a person dictating 140 to 160 words per minute can produce the same 900-word document in about 6 minutes, one-third of the time (ScanSoft, Inc., 2003, March).
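The arithmetic behind this comparison can be reproduced with a short sketch. The function name `minutes_to_produce` is hypothetical, and the calculation counts raw input time only: it ignores time spent correcting misrecognized words, which the sample business letter suggests can be significant for a new user.

```python
def minutes_to_produce(words: int, words_per_minute: float) -> float:
    """Raw input time in minutes, with no allowance for corrections."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return words / words_per_minute


if __name__ == "__main__":
    document_words = 900
    typing = minutes_to_produce(document_words, 50)
    dictating_slow = minutes_to_produce(document_words, 140)
    dictating_fast = minutes_to_produce(document_words, 160)
    print(f"typing at 50 wpm:      {typing:.1f} min")
    print(f"dictating at 140 wpm:  {dictating_slow:.1f} min")
    print(f"dictating at 160 wpm:  {dictating_fast:.1f} min")
```

At the midpoint of the quoted range, 150 words per minute, the 900-word document takes 6 minutes, which is where the one-third figure comes from.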

The software was next tested in Microsoft Excel. DNS7 is simple to use in Excel, allowing the user to enter data into cells much more quickly than it could be typed. DNS7 also allows the user to move multiple rows and columns with a simple voice command instead of the multiple menu-selection and mouse-movement steps associated with the traditional Microsoft Windows interface (ScanSoft, Inc., 2003, November 10). In addition, the user can enter formulas by simply speaking a command, eliminating the need to type multiple formulas into individual cells.

Using DNS7 with Primavera Project Planner

The final test of the DNS7 software was with Primavera Project Planner (P3), a construction scheduling and project management application. Because P3 is a Windows-based application, the authors assumed that it would be compatible with the voice recognition software, and testing quickly confirmed that DNS7 does in fact work well with P3. Nothing too complex was tried; only simple activities and their durations were entered into the program. The activities and their durations were entered one after the other much faster than they could have been typed. It is extremely easy to navigate through the columns and rows by using simple voice commands such as “move left” and “move down.” Once in the appropriate row and column, the user speaks the data to be entered (e.g., “final paint”) and then says “press enter” to complete the data entry. Using DNS7 with P3 proved to be much more time efficient than using a conventional keyboard. Figure 2 shows a screenshot of the DNS7 sample commands list available in P3.

 

Figure 2: Screenshot showing the DNS7 sample commands available in P3.
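The data-entry sequence described in this section can be sketched as a list of spoken utterances. This is an illustration only, not a DNS7 or P3 interface: the function `dictation_script` is hypothetical, the “move right” command is assumed by analogy with “move left,” the duration column is assumed to sit immediately to the right of the activity column, and the cursor is assumed to stay in place after “press enter.” The duration value in the example is made up.

```python
from typing import Iterable, Iterator, Tuple


def dictation_script(activities: Iterable[Tuple[str, int]]) -> Iterator[str]:
    """Yield, in order, what the user would say to enter each activity."""
    for name, duration_days in activities:
        yield name                 # speak the activity description
        yield "press enter"        # complete the entry (command from the paper)
        yield "move right"         # ASSUMED command: go to the duration column
        yield str(duration_days)   # speak the duration
        yield "press enter"
        yield "move left"          # back to the description column
        yield "move down"          # next row


if __name__ == "__main__":
    # "final paint" appears in the paper; the 3-day duration is hypothetical.
    for utterance in dictation_script([("final paint", 3)]):
        print(utterance)
```

Seven short utterances per activity illustrate why the authors found entry by voice quick for simple schedules; the real command set may differ from the one assumed here.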

Conclusion

Voice recognition technology is something the construction industry needs to investigate seriously. Not only is it time efficient, but it could also prove to be cost effective by eliminating much of the frustration non-typists encounter with any software program. This technology makes it possible for anyone who does not have keyboard skills to take advantage of computer applications. In construction, many project superintendents have the desire to take advantage of computer technology but lack the time to learn to type. Voice recognition could help quickly solve this problem.

The use of DNS7, IBM, and other similar applications is only the beginning of what this technology is capable of. Anyone who has basic computer knowledge can cost-effectively install and operate DNS7. The cost of this software is negligible when compared to the possible future time-savings; however, changing user habits can be difficult.

References

CNet.com search. (2003, November 11). Voice recognition. [WWW document]. URL http://cnet.search.com/

Domain Dynamics. (2003, August 14). Voice-activated email becomes a reality. M2 Presswire.

PCWorld.com search. (2003, November 12). Voice recognition. [WWW document]. URL http://search.pcworld.com/

ScanSoft, Inc. (2003, March). Dragon NaturallySpeaking version 7 user's guide. Peabody, MA: ScanSoft, Inc., Worldwide Headquarters.

ScanSoft, Inc. (2003, November 10). Feature comparison matrix [WWW document]. URL http://www.scansoft.com/naturallyspeaking/matrix/

Stukhart, G., & Berry, W. B. (1992). Evaluation of voice recognition technology. A report to the Construction Industry Institute, the University of Texas at Austin, under the guidance of the Electronic Data Management Task Force, Austin, TX: Bureau of Engineering Research, University of Texas at Austin, SD-76.

 

Appendix

 

Business Letter Created Using DNS7 and Microsoft Word

 

 

123 Road Drive

Anywhere, ST

October 8, 2003

 

 

Palmetto Bridge Constructors:

 

As a senior graduating from Clemson University, I am seeking employment in the construction industry. It has come to my attention through the Internet that there are some available positions on the Ravenel Bridge Project.

 

Bridge construction has always been extremely fascinating to me. To be able to participate in the construction of North America’s longest cable-stayed bridge would be extremely exciting. My academic background in civil engineering and construction science makes me an excellent candidate for employment. I feel that my multifaceted knowledge in both engineering and construction would be valuable to your company.

 

If you will refer to the enclosed resume you will find my qualifications, training and experience.

 

I would really like to try to set up a personal interview with you. You will find that I am extremely flexible with meeting times and places. Please feel free to contact me by phone at (864) 654-5555 or by e-mail at wligion@clemson.edu. Thank you for your consideration.

 

Sincerely yours,

 

 

 

Will Ligion

 

Enclosure: Resume