
How to Use the Speech to Text (STT) Service in a WP7 Client Application

Disclaimer: This document is provided as-is. Information and views expressed in this document, including URL and other Internet Web site references, may change without notice. You bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may copy and use this document for your internal, reference purposes. © 2011 Microsoft Corporation. All rights reserved. Microsoft, Visual Basic, Visual Studio, and Windows are trademarks of the Microsoft group of companies. All other trademarks are property of their respective owners.

Introduction
This document describes the steps a developer follows to invoke the Hawaii STT API from a Windows Phone 7 (WP7) application. To highlight specific points, this article references the SpeechRecognitionTestClient sample application provided as part of the Hawaii Cloud Services SDK for WP7 1.0.7. You can download the SDK here. The source code for this sample is in the download location, in the folder Cloud Services SDK for WP7\1.0.7\Samples\SpeechRecognitionTestClient.

The Speech Recognition Client Library


The simplest way to communicate with the Hawaii STT service is to use the Speech Recognition Client Library. This library is included in the SDK as source code and provides a set of simple APIs that allow a client WP7 application to communicate with the Hawaii STT service. The source code for this library is under the SDK download folder at:

\Cloud Services SDK for WP7\Clients\SpeechRecognitionClientLibrary

As an example, you can look at the Visual Studio solution that implements the SpeechRecognitionTestClient application. As shown in the following screenshot, the client library is included in this solution as a class library project.

Cloud Services SDK for WP7 1.0.7

Page 1

Create an Application Using the SpeechRecognitionClientLibrary


When writing an application using the SpeechRecognitionClientLibrary, follow these steps.

1. Include the SpeechRecognitionClientLibrary and HawaiiBaseClientProxy projects in your Visual Studio solution (SpeechRecognitionClientLibrary depends on HawaiiBaseClientProxy). Alternatively, you can build those libraries separately and reference the resulting DLLs from your Visual Studio solution.


2. In your client application, at the point where you want to initiate the STT process you need to create an instance of the Speech Recognition Client library class:
SpeechRecognitionClient service =
    new SpeechRecognitionClient("stt.hawaii-services.net", clientId);

The first parameter of the SpeechRecognitionClient constructor specifies the URI of the Hawaii STT service, "stt.hawaii-services.net". The second parameter, clientId, is a Guid that should uniquely identify the client. In the sample SpeechRecognitionTestClient application, this part is implemented in the MainPage.xaml.cs file.

3. As an optional step, you can query the server for the list of available grammars. The user can then select a particular grammar from this list to use as the context in which the speech to text process runs. To obtain the list of available grammars, call service.GetSpeechGrmmarsAsync(). You must also provide a callback method, which the client library calls when the asynchronous call to GetSpeechGrmmarsAsync completes.

service.SpeechGrammarsReceived += this.OnSpeechGrammarsReceived;
service.GetSpeechGrmmarsAsync();

The Speech Recognition client library calls OnSpeechGrammarsReceived when the asynchronous service call completes. Note, however, that the library makes this call on a worker thread. In Silverlight you can access UI elements only on the main UI thread. Because OnSpeechGrammarsReceived will most likely set UI elements, directly or indirectly, you must make sure that it executes on the main UI thread. One simple solution is to attach to the service.SpeechGrammarsReceived event a handler that invokes OnSpeechGrammarsReceived via Dispatcher.BeginInvoke. Using Dispatcher.BeginInvoke ensures that OnSpeechGrammarsReceived is executed on the main UI thread. The following code illustrates this process:
service.SpeechGrammarsReceived += (s, e) =>
{
    // This section defines the body of what is known
    // as an anonymous method.
    // This anonymous method is the event handler method
    // for the service.SpeechGrammarsReceived event.
    // Using Dispatcher.BeginInvoke ensures that
    // OnSpeechGrammarsReceived is invoked on the main UI thread.
    this.Dispatcher.BeginInvoke(() => OnSpeechGrammarsReceived(s, e));
};


...
private void OnSpeechGrammarsReceived(
    object sender, SpeechGrammarsReceivedEventArgs e)
{
    ...
}

The syntax

(s, e) => { statement ;}

that you see in the code is a simple example of a lambda expression. It can be confusing when seen for the first time but is a simple way to write an inline delegate. Think of the content inside the curly brackets as the content of a method. This is called an anonymous method since it does not have a declaration in which you provide a name for it. It is equivalent to the following code:

service.SpeechGrammarsReceived += OnSpeechGrammarsReceivedDispatcher;
...
private void OnSpeechGrammarsReceivedDispatcher(
    object sender, SpeechGrammarsReceivedEventArgs e)
{
    this.Dispatcher.BeginInvoke(() => OnSpeechGrammarsReceived(sender, e));
}

private void OnSpeechGrammarsReceived(
    object sender, SpeechGrammarsReceivedEventArgs e)
{
    ...
}

4. You also need to implement the event handler that is called when the asynchronous GetSpeechGrmmarsAsync method completes. This means implementing the body of the previously mentioned OnSpeechGrammarsReceived method. Inside the completion event handler you must do the following:
a. Check whether the call completed successfully or with an error.
b. On successful completion, do the appropriate processing. This can be as simple as showing a list of all available grammars.
c. In the case of an error, handle the error. This could be as simple as displaying an error message.

The following code illustrates the OnSpeechGrammarsReceived method implementation.

private void OnSpeechGrammarsReceived(
    object sender, SpeechGrammarsReceivedEventArgs e)
{
    if (!e.IsErrored)
    {
        // Use the response from the service. In our case the useful
        // data is in e.Grammars.
        // Each item is a string containing the name of a grammar.
    }
    else
    {
        // Display the error state.
    }
}

5. Next, you can create another instance of SpeechRecognitionClient to trigger an asynchronous call that does the actual speech to text processing.
// The grammar parameter is optional. The default is "Dictation" grammar.
SpeechRecognitionClient service = new SpeechRecognitionClient(
    "stt.hawaii-services.net", clientId, grammar);
service.SpeechRecognitionCompleted += (s, e) =>
    this.Dispatcher.BeginInvoke(() => OnSpeechRecognitionCompleted(s, e));
service.RecognizeSpeechAsync(audioBuffer);

...
private void OnSpeechRecognitionCompleted(
    object sender, SpeechRecognitionCompletedEventArgs e)
{
    ...
}

The audioBuffer parameter is a byte array containing the PCM audio data you want to process. In the SpeechRecognitionTestClient sample this is the audio data returned by Microphone.GetData. When this statement is executed, a call to the Hawaii STT service is made. Because the call is performed asynchronously, RecognizeSpeechAsync returns immediately, and the client application continues to execute in parallel with the asynchronous service call. At some point that call completes and the completion handler is invoked.

6. You also need to implement the event handler that is called when the asynchronous Hawaii STT service call completes. This means implementing the body of the previously mentioned OnSpeechRecognitionCompleted method. Inside the completion event handler you must do the following:
a. Check whether the call completed successfully or with an error.
b. On successful completion, do the appropriate processing. This can be as simple as updating a list with the text options provided by the speech to text translation.
c. In the case of an error, handle the error. This could be as simple as displaying an error message.

The following code illustrates the OnSpeechRecognitionCompleted method implementation.

private void OnSpeechRecognitionCompleted(
    object sender, SpeechRecognitionCompletedEventArgs e)
{
    if (!e.IsErrored)
    {
        // Use the response from the service. In this case the relevant
        // data is in e.RecognitionResults.
        // Each item is a string that represents one possible text translation.
    }
    else
    {
        // Display the error state.
    }
}

7. On successful completion, the result of the STT process is provided by the e.RecognitionResults property.

// Gets or sets the list of recognized texts.
List<string> SpeechRecognitionCompletedEventArgs.RecognitionResults

RecognitionResults is a list of 10 strings, each representing a possible text string for the speech identified by the STT service. The strings are listed in descending order of recognition confidence, so the first string has the highest confidence level.

For more information on the classes and properties, see the Cloud Services SDK for WP7 help file, Cloud Services SDK for WP7.chm, at ..\Microsoft Research\Cloud Services SDK for WP7\1.0.7\Documentation or wherever you downloaded the SDK. The following class diagram shows the STT service results.
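
Returning to the RecognitionResults list described in step 7, here is a minimal sketch of how an application might pick the text to display. It relies only on the ordering stated above (highest confidence first). The class and method names, RecognitionResultHelper and GetBestResult, are hypothetical and are not part of the SDK.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Hawaii SDK.
public static class RecognitionResultHelper
{
    // Returns the first candidate, which is the highest-confidence one
    // if the list is ordered by descending confidence as described above.
    // Returns null when there is no candidate to show.
    public static string GetBestResult(IList<string> recognitionResults)
    {
        if (recognitionResults == null || recognitionResults.Count == 0)
        {
            return null;
        }

        return recognitionResults[0];
    }
}
```

On the success path of OnSpeechRecognitionCompleted, a call such as RecognitionResultHelper.GetBestResult(e.RecognitionResults) could supply the default text, while the remaining entries are offered to the user as alternatives.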

Audio Tips and Guidelines


Use the following tips and guidelines for the STT service.

Limit speech input to a maximum of 10 seconds. The Speech to Text service supports up to 10 seconds of speech. Audio streams longer than this result in the error "Null/Invalid response object from server".

You may experience lower-quality results from the Speech to Text service with the Dell Venue.
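
One way to respect the 10-second limit is to estimate the duration of the audio buffer before sending it. The sketch below is illustrative only: AudioLengthGuard and its members are hypothetical names, and it assumes the buffer holds raw 16-bit, single-channel PCM samples with no file header. The sample rate is passed in by the caller (on the phone it would typically come from the recording device, for example the microphone's SampleRate property).

```csharp
using System;

// Hypothetical helper for illustration; not part of the Hawaii SDK.
public static class AudioLengthGuard
{
    // Maximum speech length stated in the guidelines above.
    public const double MaxSeconds = 10.0;

    // Assumption: 16-bit samples (2 bytes each), one channel.
    private const int BytesPerSample = 2;

    // Estimates the duration, in seconds, of a raw PCM buffer.
    public static double GetDurationSeconds(int byteCount, int sampleRate)
    {
        if (byteCount < 0)
        {
            throw new ArgumentOutOfRangeException("byteCount");
        }

        if (sampleRate <= 0)
        {
            throw new ArgumentOutOfRangeException("sampleRate");
        }

        return byteCount / ((double)sampleRate * BytesPerSample);
    }

    // True when the buffer is no longer than the 10-second limit.
    public static bool IsWithinLimit(int byteCount, int sampleRate)
    {
        return GetDurationSeconds(byteCount, sampleRate) <= MaxSeconds;
    }
}
```

An application could call AudioLengthGuard.IsWithinLimit(audioBuffer.Length, sampleRate) before RecognizeSpeechAsync and ask the user to record again, or trim the buffer, when the check fails.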

Conclusion
Your client application can now call the Hawaii STT service and your event handler will do the appropriate processing when the asynchronous Hawaii STT service call completes.

