Google Cloud
The Google Cloud integration allows you to use Google Cloud Platform services, such as text-to-speech and speech-to-text, in Home Assistant.
Configuration
To add the Google Cloud service to your Home Assistant instance, use this My button:
Manual configuration steps
If the above My button doesn’t work, you can also perform the following steps manually:
- Browse to your Home Assistant instance.
- In the bottom right corner, select the Add Integration button.
- From the list, select Google Cloud.
- Follow the instructions on screen to complete the setup.
Obtaining service account file
- Visit Cloud Resource Manager.
- Click the CREATE PROJECT button at the top.
- Specify a convenient Project name and click the CREATE button.
- Make sure that billing is enabled for your Google Cloud Platform project.
- Enable the Cloud API you need, either through its direct link or through the APIs library, by selecting your project from the dropdown list and clicking the Continue button.
- Set up authentication:
  - Visit the Service accounts page for your project.
  - From the toolbar above the Service account list, select Create service account.
  - In the Service account name field, enter any name.
  - If you are requesting a text-to-speech API key, don't select a value from the Role list. No role is required to access this service.
  - Click Create. If a note appears warning that this service account has no role, you can ignore it.
  - Return to the Service account list page and click the service account you just created to see its details.
  - Choose the Keys tab within the details view for this service account.
  - In the Add Key dropdown, select Create New Key.
  - Specify a JSON key type and click Create.
  - A [serviceaccountname].json file will download to your browser.
  - Upload this file when asked in the integration setup.
Google Cloud text-to-speech
Pricing
The Cloud text-to-speech API is priced monthly based on the number of characters sent to the service to be synthesized into audio. For up-to-date pricing, see the Cloud Text-to-Speech pricing page.
Text-to-speech configuration
The settings below can be configured in the options of the integration and in the options parameter of the tts.speak service.
Configuration Variables
- gender: Default gender of the voice, for example male. See Google's list of supported languages, genders, and voices.
- voice: Default voice name, for example en-US-Wavenet-F. See Google's list of supported languages, genders, and voices. If set, this overrides the language and gender parameters.
- encoding: Default audio encoder. Supported encodings are ogg_opus, mp3, and linear16.
- speed: Default rate/speed of the voice, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice, 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed.
- pitch: Default pitch of the voice, in the range [-20.0, 20.0]. 20 means an increase of 20 semitones from the original pitch, and -20 means a decrease of 20 semitones.
- gain: Default volume gain (in dB) of the voice, in the range [-96.0, 16.0]. If unset, or set to 0.0 dB, audio plays at the normal native signal amplitude. A value of -6.0 dB plays at approximately half the native amplitude, and +6.0 dB plays at approximately twice the native amplitude. We strongly recommend not exceeding +10 dB, as there is usually no effective increase in loudness beyond that.
- profiles: A list of identifiers that select 'audio effects' profiles to apply to the synthesized speech. Effects are applied on top of each other in the order they are given. See Google's list of supported profile IDs.
- text_type: Default text type. Supported text types are text and ssml. For details on what SSML is and how to use it, see Google's SSML documentation.
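As a worked illustration of the volume gain figures above: assuming the standard decibel convention for signal amplitude (an assumption on our part, since this page only gives the approximate "half" and "twice" figures), a gain of g dB scales the amplitude by a factor of 10^(g/20):

```latex
\[
\frac{A_{\text{out}}}{A_{\text{in}}} = 10^{g/20},
\qquad
10^{-6/20} \approx 0.50,
\qquad
10^{6/20} \approx 2.00,
\qquad
10^{10/20} \approx 3.16
\]
```

Under that convention, every additional 6 dB roughly doubles the amplitude, and every 6 dB removed roughly halves it.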
Action speak
The tts.speak action is the modern way to use Google Cloud TTS. Add the speak action, select the entity for your Google Cloud TTS, select the media player entity or group to send the TTS audio to, and enter the message to speak.
For more options about speak, see the Speak section on the main TTS building block page.
A tts.speak service call can look like:
service: tts.speak
target:
  entity_id: tts.google_cloud
data:
  cache: true
  media_player_entity_id: media_player.living_room_display
  message: this is a test
  language: en-US
  options:
    gender: male
    voice: en-US-Wavenet-F
    encoding: linear16
    speed: 0.9
    pitch: -2.5
    gain: -5.0
    text_type: ssml
    profiles:
      - telephony-class-application
      - wearable-class-device
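When text_type is set to ssml, the message would normally carry SSML markup rather than plain text. The following is a minimal sketch, not a definitive recipe: it assumes the same tts.google_cloud entity and media player as the call above (both are placeholders for your own entities) and uses only the standard SSML speak and break elements.

```yaml
service: tts.speak
target:
  entity_id: tts.google_cloud  # placeholder: use your own Google Cloud TTS entity
data:
  media_player_entity_id: media_player.living_room_display  # placeholder media player
  # Single quotes keep the double-quoted SSML attribute intact in YAML.
  message: '<speak>The front door is open.<break time="500ms"/>Please close it.</speak>'
  options:
    text_type: ssml
```

Only text_type is overridden here, so the remaining options fall back to the defaults configured in the integration.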
Action say (legacy)
The tts.google_cloud_say action can be used when configuring the legacy google_cloud text-to-speech platform in configuration.yaml. We recommend that new users instead set up the integration in the UI and use the tts.speak action with the corresponding Google Cloud text-to-speech entity as target. If you are an existing user of tts.google_cloud_say, you can still use it, but don't remove the legacy google_cloud text-to-speech platform from configuration.yaml. If you remove it, you will have to manually migrate to tts.speak.
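To illustrate what a manual migration might involve, here is a hedged before-and-after sketch of a single call, written as two entries in an action sequence. The data layout of the legacy call (entity_id naming the media player) is an assumption based on how legacy say actions are typically structured and is not stated on this page; the entity IDs are placeholders.

```yaml
# Legacy call: only works while the google_cloud platform is still
# present in configuration.yaml. Field layout is an assumption.
- service: tts.google_cloud_say
  data:
    entity_id: media_player.living_room_display  # placeholder media player
    message: this is a test

# Possible tts.speak equivalent after migrating to the UI integration.
- service: tts.speak
  target:
    entity_id: tts.google_cloud  # placeholder: your Google Cloud TTS entity
  data:
    media_player_entity_id: media_player.living_room_display
    message: this is a test
```

The main difference is that tts.speak targets the TTS entity, and the media player moves into media_player_entity_id.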
Google Cloud speech-to-text
Pricing
Speech-to-text is priced based on the amount of audio successfully processed by the service each month, measured in increments of one second. For up-to-date pricing, see the Cloud Speech-to-Text pricing page.
Speech-to-text configuration
Configuration Variables
The transcription model to use, chosen from Google's list of transcription models. The default is latest_short because this is the recommended one. If you get the error 400 Invalid recognition 'config': The requested model is currently not supported for language : <language code>, try changing this to the legacy command_and_search.