---
copyright:
lastupdated: "2025-02-21"
keywords: text to speech,IBM cloud,getting started,tutorial,synthesize audio,speech synthesis
subcollection: text-to-speech
content-type: tutorial
account-plan: lite
completion-time: 10m
---
{{site.data.keyword.attribute-definition-list}}
{: #gettingStarted} {: toc-content-type="tutorial"} {: toc-completion-time="10m"}
The {{site.data.keyword.texttospeechfull}} service converts written text to natural-sounding speech to provide speech-synthesis capabilities for applications. This `curl`-based tutorial can help you get started quickly with the service. The examples show you how to call the service's `POST` and `GET /v1/synthesize` methods to request an audio stream.
{: shortdesc}
The tutorial uses the `curl` command-line utility to demonstrate REST API calls. For more information about `curl`, see Using curl with Watson examples.
{: note}
[IBM Cloud]{: tag-ibm-cloud} Watch the following video for a visual summary of getting started with the {{site.data.keyword.texttospeechshort}} service.
{: video output="iframe" data-script="none" id="watsonmediaplayer" width="560" height="315" scrolling="no" allowfullscreen webkitallowfullscreen mozAllowFullScreen frameborder="0" style="border: 0 none transparent;"}
{: #getting-started-before-you-begin}
{: #getting-started-before-you-begin-cloud}
[IBM Cloud]{: tag-ibm-cloud}
- Create an instance of the service: {: hide-dashboard}
    - Go to the {{site.data.keyword.texttospeechshort}}{: external} page in the {{site.data.keyword.cloud_notm}} catalog.
    - Sign up for a free {{site.data.keyword.cloud_notm}} account or log in.
    - Read and agree to the terms of the license agreement.
    - Click Create.
- Copy the credentials to authenticate to your service instance: {: hide-dashboard}
    - View the Manage page for the service instance:
        - If you are on the Getting started page for your service instance, click the Manage entry in the list of topics.
        - If you are on the Resource list page, expand the AI / Machine Learning grouping in the Name column, and click the name of your service instance.
    - On the Manage page, click Show Credentials in the Credentials box.
    - Copy the `API Key` and `URL` values for the service instance.
This tutorial uses an API key to authenticate. In production, use an IAM token. For more information, see Authenticating to IBM Cloud.
{: tip}
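The tip above recommends an IAM token for production use. The following is a minimal sketch of how an API key might be exchanged for a token with `curl`. It assumes the public IBM Cloud IAM endpoint `https://iam.cloud.ibm.com/identity/token`. The function names `request_iam_token` and `extract_access_token` are illustrative and are not part of any IBM tool.

```sh
#!/bin/sh
# Sketch only: exchange an API key for an IAM access token.
# IAM_URL is assumed to be the public IBM Cloud IAM token endpoint.
IAM_URL="https://iam.cloud.ibm.com/identity/token"

# Print the JSON token response for the API key passed as $1.
request_iam_token() {
  curl -s -X POST "$IAM_URL" \
    --header "Content-Type: application/x-www-form-urlencoded" \
    --data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
    --data-urlencode "apikey=$1"
}

# Read a token response on stdin and print the value of "access_token".
# Assumes the key and its value appear on the same line of the response.
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

With these helpers, a token could be captured with `TOKEN=$(request_iam_token "{apikey}" | extract_access_token)`. In the requests in this tutorial, `-u "apikey:{apikey}"` would then be replaced with `--header "Authorization: Bearer $TOKEN"`.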
{: #getting-started-before-you-begin-icpd}
[IBM Cloud Pak for Data]{: tag-cp4d}
The {{site.data.keyword.texttospeechshort}} service must be installed and configured before beginning this tutorial. For more information, see Watson Speech services on Cloud Pak for Data{: external}.
- Create an instance of the service by using the web client, the API, or the command-line interface. For more information about creating a service instance on {{site.data.keyword.icp4dfull_notm}}, see Creating a service instance for Watson Speech services{: external}.
- Follow the instructions in Creating a Watson Speech services instance to obtain a Bearer token for the instance. This tutorial uses a Bearer token to authenticate to the service.
{: #getting-started-synthesize-english} {: step}
The following command uses the `POST /v1/synthesize` method to synthesize US English input to audio. The request uses the voice `en-US_MichaelV3Voice`. It produces audio in the WAV format.
You can use a browser or other tools to play the audio files that are produced by the examples in this tutorial. For more information, see Playing an audio file. {: tip}
- Issue the following command to synthesize the string "hello world". The request produces a WAV file that is named `hello_world.wav`.

[IBM Cloud]{: tag-ibm-cloud}

- Replace `{apikey}` and `{url}` with your API key and URL. {: hide-dashboard}
curl -X POST -u "apikey:{apikey}" \
--header "Content-Type: application/json" \
--header "Accept: audio/wav" \
--data "{\"text\":\"hello world\"}" \
--output hello_world.wav \
"{url}/v1/synthesize?voice=en-US_MichaelV3Voice"
{: pre}
[IBM Cloud Pak for Data]{: tag-cp4d} [IBM Software Hub]{: tag-teal}
- Replace `{token}` and `{url}` with the access token and URL for your service instance.
curl -X POST \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: audio/wav" \
--data "{\"text\":\"hello world\"}" \
--output hello_world.wav \
"{url}/v1/synthesize?voice=en-US_MichaelV3Voice"
{: pre}
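The request body in the preceding commands is hand-escaped JSON, which becomes error-prone when the input text itself contains quotation marks or backslashes. The following sketch shows one way to build the body from plain text. The helper names `json_escape` and `synthesize_body` are hypothetical, and the escaping covers only backslashes and double quotation marks.

```sh
#!/bin/sh
# Escape backslashes and double quotes so that text is safe inside a JSON string.
# Limitation: newlines and other control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON request body for POST /v1/synthesize from plain text in $1.
synthesize_body() {
  printf '{"text":"%s"}' "$(json_escape "$1")"
}
```

The result can be passed to `curl` in place of the literal body, for example `--data "$(synthesize_body 'hello world')"`.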
{: #getting-started-different-audio} {: step}
The following command again uses the `POST /v1/synthesize` method to synthesize the same US English input to audio. But this request uses the voice `en-US_AllisonV3Voice` and explicitly requests audio in the default Ogg format.
- Issue the following command to synthesize the string "hello world" but with a different voice. The request produces an Ogg file that is named `hello_world.ogg`.

[IBM Cloud]{: tag-ibm-cloud}

- Replace `{apikey}` and `{url}` with your API key and URL. {: hide-dashboard}
curl -X POST -u "apikey:{apikey}" \
--header "Content-Type: application/json" \
--header "Accept: audio/ogg" \
--data "{\"text\":\"hello world\"}" \
--output hello_world.ogg \
"{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
{: pre}
[IBM Cloud Pak for Data]{: tag-cp4d} [IBM Software Hub]{: tag-teal}
- Replace `{token}` and `{url}` with the access token and URL for your service instance.
curl -X POST \
--header "Authorization: Bearer {token}" \
--header "Content-Type: application/json" \
--header "Accept: audio/ogg" \
--data "{\"text\":\"hello world\"}" \
--output hello_world.ogg \
"{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
{: pre}
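Because `curl` writes whatever the service returns to the `--output` file, a failed request can leave an error message in a file that has an audio extension. The following sketch checks the first four bytes of a file to tell WAV and Ogg audio apart from anything else. The function name `audio_kind` is hypothetical. The check relies on the standard `RIFF` and `OggS` signatures of the two container formats.

```sh
#!/bin/sh
# Report the container format of the file in $1 from its first four bytes.
# "RIFF" marks a WAV file and "OggS" marks an Ogg file. Anything else,
# such as an error response saved by --output, is reported as unknown.
audio_kind() {
  magic=$(head -c 4 "$1" 2>/dev/null)
  case "$magic" in
    RIFF) echo "wav" ;;
    OggS) echo "ogg" ;;
    *)    echo "unknown" ;;
  esac
}
```

For example, `audio_kind hello_world.ogg` prints `ogg` for a valid Ogg file. A result of `unknown` suggests opening the file as text to read the response that the service returned.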
{: #getting-started-synthesize-spanish} {: step}
The following command uses the `GET /v1/synthesize` method to synthesize Spanish input to an audio file. The `GET` method includes three query parameters: `accept` to specify the audio format, `text` to specify the input text for the audio, and `voice` to specify a Spanish voice. Because `accept` and `text` are passed as query parameters, their values must be URL-encoded (for example, `audio/wav` becomes `audio%2Fwav`).
- Issue the following command to synthesize the string "hola mundo" and produce a WAV file that is named `hola_mundo.wav`.

[IBM Cloud]{: tag-ibm-cloud}

- Replace `{apikey}` and `{url}` with your API key and URL. {: hide-dashboard}
curl -X GET -u "apikey:{apikey}" \
--output hola_mundo.wav \
"{url}/v1/synthesize?accept=audio%2Fwav&text=hola%20mundo&voice=es-ES_EnriqueV3Voice"
{: pre}
[IBM Cloud Pak for Data]{: tag-cp4d} [IBM Software Hub]{: tag-teal}
- Replace `{token}` and `{url}` with the access token and URL for your service instance.
curl -X GET \
--header "Authorization: Bearer {token}" \
--output hola_mundo.wav \
"{url}/v1/synthesize?accept=audio%2Fwav&text=hola%20mundo&voice=es-ES_EnriqueV3Voice"
{: pre}
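The query strings in the preceding commands are encoded by hand. For other input text, the encoding can be scripted. The following sketch percent-encodes a value for use in a query string. The function name `urlencode` is hypothetical, and the sketch assumes ASCII input. Accented characters would need byte-wise UTF-8 encoding, which the sketch does not attempt.

```sh
#!/bin/sh
# Percent-encode the string in $1 for use as a query parameter value.
# Unreserved characters (letters, digits, ".", "_", "~", "-") pass through;
# every other character is written as %XX. Assumes ASCII input only.
urlencode() {
  s=$1
  out=""
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case "$c" in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}
```

Here, `urlencode 'hola mundo'` yields `hola%20mundo` and `urlencode 'audio/wav'` yields `audio%2Fwav`, which match the values in the commands above. As an alternative, `curl` can perform the encoding itself when the parameters are passed with `--get` and one `--data-urlencode "name=value"` option per parameter.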
{: #getting-started-next-steps}
- To try an example application that accepts text and generates speech with different voices, see the {{site.data.keyword.texttospeechshort}} demo{: external}.
- For more information about the service's interfaces and features, see Service features.
- For more information about all methods of the service's interfaces, see the API & SDK reference{: external}.