javaboutique

Tutorial: The Java Speech API: A Primer on Speech Applications:

ReaderDemo Application

The ReaderDemo application is a simple speech synthesizer that demonstrates the features of JSAPI. The application presents five modes of operation:
  • Read (no JSML): Reads out some plain text with no formatting.
  • Read (JSML): Reads out text formatted with JSML. In this case, the text is read in a more natural way.
  • Features Test (JSML): This mode exercises most of the features of JSML, including volume, speech rate, and pitch.
  • Test (no JSML): In this mode, you can enter some plain text and see how the engine generates speech. Enter additional spaces, new lines, periods, commas, and check the differences they make in speech generation.
  • Test (JSML): In this mode, you can enter JSML-formatted text and test how the output varies.
The application demonstrates, among other things, the use of speech events and event listeners.

To run the demo applications, use the commands java speechdemo.TellTime and java speechdemo.ReaderDemo.

You can also run the demos using FreeTTS. To do so, put the speech.properties file (located in the FreeTTS package) in either the lib folder of your 'java.home' (the location of your JRE) or your 'user.home' (usually "c:\documents and settings\username" on Windows 2000 and XP machines). To run the above demos in FreeTTS, use the TellTimeFTTS and ReaderDemoFTTS Java classes. Unfortunately, FreeTTS does not render JSML; it reads all the JSML as plain text.
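To check where the file should go on a particular machine, the two locations described above can be printed with a few lines of Java. This is a minimal sketch that uses only the standard library; the class name SpeechPropertiesLocator and the helper candidates are hypothetical names chosen for illustration and are not part of JSAPI or FreeTTS.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JSAPI or FreeTTS): lists the two places
// the text says speech.properties may be placed, and reports which exist.
class SpeechPropertiesLocator {

    // Builds the candidate paths: <java.home>/lib/speech.properties
    // and <user.home>/speech.properties.
    static List<File> candidates(String javaHome, String userHome) {
        List<File> files = new ArrayList<>();
        files.add(new File(new File(javaHome, "lib"), "speech.properties"));
        files.add(new File(userHome, "speech.properties"));
        return files;
    }

    public static void main(String[] args) {
        String javaHome = System.getProperty("java.home");
        String userHome = System.getProperty("user.home");
        for (File f : candidates(javaHome, userHome)) {
            String status = f.isFile() ? "found:   " : "missing: ";
            System.out.println(status + f.getPath());
        }
    }
}
```

Running the class before starting the FreeTTS demos shows whether the file is present in either location.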

Speech Engines

Speech technology is implemented as software, hardware, or a combination of both. When implemented in software, speech applications never interact directly with the audio components of a computer. Instead, the speech engine abstracts them from the underlying hardware. This keeps speech applications from being tied to a specific hardware implementation and makes them portable across platforms.

The speech engine is the heart of any speech-based application. It handles speech input or speech output and converts it to or from a standard format that speech applications can process to produce the desired results. Speech synthesizers, speech recognizers, speaker verification systems, and speaker identification systems are examples of speech engines.

Each of the systems below is a specialized speech engine that does some predefined processing with the speech input or output:

  • Speech Synthesis: This engine handles conversion of textual input to synthetic spoken output. This is often referred to as "text-to-speech" (TTS) conversion.
  • Speech Recognition: This engine performs the conversion of spoken input to digital output, such as text. Note that speech recognition does not mean understanding speech.
  • Speaker Identification: This engine identifies a person by the sound of their voice. Identification is performed by comparing the person's voice against existing voices in a voice database.
  • Speaker Verification: This engine authenticates a person based on their voice.
The basic functionality of speech engines is abstracted in the classes and interfaces of the javax.speech package. Functionality specific to a type of speech engine is added through additional packages. The Java Speech API provides built-in support for the synthesizer and recognizer speech engines; the classes that support them are defined in the javax.speech.synthesis and javax.speech.recognition packages.

Speech Synthesis

A speech synthesizer converts text to speech. The synthesizer accepts text input in two formats. The first is plain text (which is self-explanatory) and the second uses the Java Speech Markup Language (JSML).
  • The Plain Text Format: When input is fed to the synthesizer as plain text, the speech rendered is machine-like, without any emotion attached to it. Moreover, plain text synthesis does not recognize common notations such as dates and times. For instance, 12:15PM is read as "twelve colon fifteen PM", which is not how a user would expect to hear the text. To make the synthesized speech sound more natural, the speech API supports annotations embedded in the text to be spoken. These annotations are supported through JSML.
  • The JSML Format: JSML is an XML-based markup language that allows you to embed structural and presentational information within the text. Synthesizers interpret these annotations and convert the text to speech appropriately. For instance, the annotation <SAYAS class="time">12:15PM</SAYAS> in JSML makes sure the synthesizer reads the time as "twelve fifteen PM".
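The difference between the two formats can be illustrated by turning a plain-text sentence into JSML-annotated text. The sketch below uses only the Java standard library and does not call JSAPI. The class JsmlTimeMarkup, the method markTimes, and the regular expression are assumptions made for illustration. The code produces only the annotated fragment, assumes 12-hour times with an AM/PM suffix, and does not escape XML special characters in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JSAPI): wraps 12-hour clock times found
// in plain text in the SAYAS annotation shown in the text above.
class JsmlTimeMarkup {

    // Matches times such as 12:15PM, 9:05 am, or 01:30PM (illustrative pattern).
    private static final Pattern TIME = Pattern.compile(
            "\\b(1[0-2]|0?[1-9]):[0-5][0-9]\\s?(?:AM|PM)\\b",
            Pattern.CASE_INSENSITIVE);

    static String markTimes(String plainText) {
        Matcher m = TIME.matcher(plainText);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String tagged = "<SAYAS class=\"time\">" + m.group() + "</SAYAS>";
            // quoteReplacement keeps '$' and '\' in the match from being
            // treated as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(tagged));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: The meeting starts at <SAYAS class="time">12:15PM</SAYAS>.
        System.out.println(markTimes("The meeting starts at 12:15PM."));
    }
}
```

The annotated string could then be passed to a JSML-capable synthesizer, while the original string would be spoken as plain text. Note that an engine which does not render JSML would read the tags aloud.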
