Invoking the web API of IBM Watson’s Speech to Text service from .NET

This post is meant to help if you need to call IBM Watson web services from an application written in .NET. The problem is that, at the time of writing, IBM does not provide a .NET development toolkit for its services; toolkits exist only for Java and Node. Without a toolkit, accessing IBM Watson’s web API becomes slightly more difficult.

In the following example I will access the Speech to Text service using .NET’s HttpClient class in order to transcribe an audio file.

Before making the first call, you need to register for the service and obtain a username and password. Next, create and configure an instance of HttpClient:

using (var client = new HttpClient())
{
    // Ask for a JSON response.
    client.DefaultRequestHeaders.Accept.Clear();
    client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    // HTTP Basic authentication: base64-encoded "username:password".
    client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
        "Basic",
        Convert.ToBase64String(
            Encoding.ASCII.GetBytes(
                "yourusername:yourpassword")));

    // The remaining steps go here, while the client is still in scope.
}

Now it’s time to prepare the content to be sent to IBM Watson. The content must be an audio stream in one of the supported formats (see the documentation for details). Here I use WAV, because I found that compressed audio formats lead to more errors in the resulting transcripts:

var content = new StreamContent(new FileStream(@"D:\youraudiofile.wav", FileMode.Open));
content.Headers.ContentType = new MediaTypeHeaderValue("audio/wav");

Finally, the client is called with this content as input:

var response = client.PostAsync(
    "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true", content).Result;

continuous is just one of many parameters that can be passed along with the content; the complete list can be found in the documentation. This one tells Watson not to split the text into pieces at pauses in the audio.
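When more than one parameter is needed, concatenating the query string by hand gets error-prone. Here is a minimal sketch of a helper that builds the request URL from a dictionary; the class name RecognizeUrl and the method Build are my own illustrative names, not part of any IBM library, and continuous is the only parameter taken from this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any IBM toolkit.
static class RecognizeUrl
{
    // Endpoint used in this post.
    const string Endpoint = "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize";

    // Appends URL-escaped query parameters to the endpoint.
    public static string Build(IDictionary<string, string> parameters)
    {
        if (parameters.Count == 0)
            return Endpoint;

        var query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
        return Endpoint + "?" + query;
    }
}
```

Calling RecognizeUrl.Build(new Dictionary<string, string> { { "continuous", "true" } }) yields the same URL as the one passed to PostAsync above. Any further parameter names should be checked against the documentation before use.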

To inspect the JSON response:

if (response.IsSuccessStatusCode)
{
    var res = response.Content.ReadAsStringAsync().Result;
    Console.WriteLine(res);
}
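If you want the transcript text rather than the raw JSON, the response can be parsed. The sketch below assumes the response has a top-level results array whose entries each contain an alternatives array with a transcript field; verify that shape against the documentation and against your own output. It also assumes System.Text.Json is available (built into .NET Core 3.0 and later, a NuGet package otherwise). TranscriptReader is a name I made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper; the JSON shape it expects is an assumption.
static class TranscriptReader
{
    // Collects the first alternative's transcript from every result entry.
    public static List<string> ReadTranscripts(string json)
    {
        var transcripts = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // No "results" property: nothing to collect.
            if (!doc.RootElement.TryGetProperty("results", out JsonElement results))
                return transcripts;

            foreach (JsonElement result in results.EnumerateArray())
            {
                if (result.TryGetProperty("alternatives", out JsonElement alternatives)
                    && alternatives.GetArrayLength() > 0
                    && alternatives[0].TryGetProperty("transcript", out JsonElement transcript))
                {
                    transcripts.Add(transcript.GetString());
                }
            }
        }
        return transcripts;
    }
}
```

With the res string from the snippet above, string.Join(" ", TranscriptReader.ReadTranscripts(res)) would give the whole transcript as one line.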

Note that I basically ignore the asynchronous nature of HttpClient here by reading .Result right away, which blocks the calling thread until the request completes. The call could easily be made asynchronously, but I prefer the simplicity of the synchronous version and accept the blocking in exchange.
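For comparison, here is a rough sketch of what the asynchronous variant might look like. It uses the same endpoint and headers as the rest of this post; the class WatsonAsync and its two methods are hypothetical names of mine. Unlike the synchronous version, it throws when the status code indicates failure and includes the response body in the message, since that body usually explains the error.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

// Hypothetical wrapper around the call shown in this post.
static class WatsonAsync
{
    const string Url =
        "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true";

    // Builds the Basic authentication header from the service credentials.
    public static AuthenticationHeaderValue BuildAuthHeader(string username, string password)
    {
        byte[] raw = Encoding.ASCII.GetBytes(username + ":" + password);
        return new AuthenticationHeaderValue("Basic", Convert.ToBase64String(raw));
    }

    // Posts the WAV file and returns the JSON response body without blocking the caller.
    public static async Task<string> RecognizeAsync(string username, string password, string wavPath)
    {
        using (var client = new HttpClient())
        using (var audio = File.OpenRead(wavPath))
        using (var content = new StreamContent(audio))
        {
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
            client.DefaultRequestHeaders.Authorization = BuildAuthHeader(username, password);
            content.Headers.ContentType = new MediaTypeHeaderValue("audio/wav");

            HttpResponseMessage response = await client.PostAsync(Url, content);
            string body = await response.Content.ReadAsStringAsync();
            if (!response.IsSuccessStatusCode)
                throw new HttpRequestException("Watson returned " + (int)response.StatusCode + ": " + body);
            return body;
        }
    }
}
```

From an async method (for example an async Main, which needs C# 7.1 or later), it would be called as var json = await WatsonAsync.RecognizeAsync("yourusername", "yourpassword", @"D:\youraudiofile.wav");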

Anyway, here is the complete code that calls Watson and prints the output (don’t forget to replace the username, password and file name with your own):

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;


class Program
{
    static void Main()
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Accept.Clear();
            client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
                "Basic",
                Convert.ToBase64String(
                    Encoding.ASCII.GetBytes(
                        "yourusername:yourpassword")));

            var content = new StreamContent(new FileStream(@"D:\youraudiofile.wav", FileMode.Open));
            content.Headers.ContentType = new MediaTypeHeaderValue("audio/wav");

            var response = client.PostAsync("https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true", content).Result;
            if (response.IsSuccessStatusCode)
            {
                var res = response.Content.ReadAsStringAsync().Result;
                Console.WriteLine(res);
            }
        }
    }
}

8 thoughts on “Invoking the web API of IBM Watson’s Speech to Text service from .NET”

  1. Good Afternoon NECHAI.
    I loved the post, it was very useful.
    Did you happen to have an example to generate an audio from the text?

    Grateful

  2. Hi!
    Congrats on your article, it’s very clear (a thing a little hard to find nowadays).
    I need to implement this, but in an ASP.NET web application.
    Did you do something similar?

    Thanks!

    1. Hi, Rodrigo!

      No, I did not implement it in ASP.NET web app, only as a console app (it was all I needed for my project). However, the implementation should be exactly the same. It is just C# code which should be able to execute in the context of ASP.NET.

  3. I am getting the error below:

    {StatusCode: 400, ReasonPhrase: ‘Bad Request’, Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
    {
    X-Backside-Transport: FAIL FAIL
    Connection: Keep-Alive
    Transfer-Encoding: chunked
    Session-Name: AGWNJBRLJJPKXEXK-en-US_BroadbandModel
    X-Content-Type-Options: nosniff
    X-XSS-Protection: 1
    X-Global-Transaction-ID: 1556985792
    Strict-Transport-Security: max-age=31536000;
    X-DP-Watson-Tran-ID: stream-dp01-1556985792
    Date: Fri, 18 Aug 2017 07:10:57 GMT
    Set-Cookie: Watson-DPAT=QUNu58HMKa94Kvr67Ci9ZvLPrvBWEegnEXvgWId2VFHWRzHX5hZ6yeflt%2FZy4SzOh2p9uV3%2BmXg66qFmhBkrc0uIwxv%2BQvW0qG1Psxul1oGOyVqj5CxZqPg7dSXz1KO5nuT3x7072jD%2BMJU%2BxOHfq%2B4cdkj8jnU46l9Jxb7TYFTdj6QusnTCYbefPgCpcWQj68mV3CkHT%2FIDwLg8Vywh%2BmYn3ced4bNDzrmRJpARuoMLjVoRLpdWy6sO1Ha4JMlcPWGTJfI6Jf17JjW7VjlgRmnNJKpWljCMl%2B3V9tGwmcFPlhT1beFcHzFjX2VLvWfq6glFjI04n8x4ZeE4hsdp1ZKsP0yIId6eAotOVsZko0S4WGMtpmYu2XGEA%2FAahcx8jI3Pn%2FrpTO1JjBTdb7oci2vJ5%2Fucfk7qf7x3PIXk%2BCG5A2ZOuT%2FoLRy%2BqKSUegetraIMgpG4trt%2B7lj6j42Jr5U%2BZgL47hNjOklUftfVr7DBzG9tlombO2h5opjtRiu0oWpWu0%2BOTpmb5zmTkRDVcaMXP3pkEi1BY5fNL80zdxqZy6xh5K8aPfuKTHdYUfgg2Wm3SY5FtX9Gg%2F2i%2FQoXXj4HdNTiKK0wvULsHaoH0o4Vz90nhQkjjkkSQB6W3DcciG0sJoVjsnw%2B014NGprwTQsRl0cjTmVM2MWhMHVSYnDzyBeMiMiQvGMA8DOvwLO2nTze4xxXLmjmVGTVXv0ZQcVjSXZFmOJFENAESquYhNfw1Jzj%2FJQyQdFhL0h1qpem0jRSZoX9MmNt7XH8mSEPkIO385Io6c0USThsoPDiX1XRUY4HBarMJOPB6mB%2BQlnwLHHYzbR8TrqgqTzTnWFSDMp3qQOzMSNxOthpxRs%2FsOcV44df%2FOVRfKtmQllFEwVn%2BKxpiXeFFvur8s6XcogE4dawPpmJCeEDQRtGHso3m62j%2BzCT3FZlRqD4f5SzNro5; path=/speech-to-text/api; secure; HttpOnly
    Server: –
    Via: 1.1 fafcdae
    Via: 1.1 d9166cb
    Via: HTTP/1.1 df8742c
    Content-Type: application/json
    }}

  4. Hi,
    I have implemented this in my VB.NET Windows application, but my file is an MP3 about 4 minutes long, and some time into the request an error appears and the operation stops.
    Do you have any idea about that?
