Speaking with a Webpage - Streaming Speech Transcripts

50 minutes 5 Credits

GSP125

Google Cloud Self-Paced Labs

Overview

The Google Cloud Speech streaming API lets developers turn spoken language into text in real time. Combined with JavaScript's Web Audio API and WebSockets, it allows a Java servlet to accept streamed speech from a webpage and return text transcripts of it, so any web page can use the spoken word as an additional user interface.

This lab is split into multiple sections, each of which introduces a component of the final web application.

The webapp you'll create takes audio from the client's microphone and streams it to a Java servlet. The servlet passes the audio data to the Cloud Speech API, which streams transcriptions of any speech it detects back to the servlet. The servlet then passes the transcription results to the client, which displays them on the page.

[Figure: application architecture (arch.png)]

To accomplish this, you'll need to create several components:

  • A Java servlet to serve the static HTML, Javascript, and CSS for the web page.
  • The JavaScript, HTML, and CSS to connect the webpage to the user's microphone, extract the raw bytes, and stream them to the servlet through a WebSocket.
  • A servlet WebSocket handler to stream the sound bytes it receives from the client to the Cloud Speech API, and to stream the transcription results from the Cloud Speech API back to the client.
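
As a rough illustration of the "extract the raw bytes" step in the second component, the sketch below converts Web Audio samples into bytes that can be sent over a socket. It rests on assumptions that are not stated in this lab: that the audio arrives as floating-point samples in the range [-1, 1] (as the Web Audio API provides them), and that the servlet expects 16-bit signed little-endian PCM. The function names `floatTo16BitPcm` and `sendAudioChunk` are hypothetical and are not taken from the sample solution.

```javascript
// Hypothetical helper (not from the sample solution): convert Web Audio
// samples, assumed to be floats in [-1, 1], into 16-bit signed
// little-endian PCM bytes.
function floatTo16BitPcm(samples) {
  const buffer = new ArrayBuffer(samples.length * 2); // 2 bytes per sample
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp so that out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    const value = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true); // true = little-endian
  }
  return buffer;
}

// Hypothetical helper: send one chunk of samples as a binary message.
// `socket` is any object with a send(ArrayBuffer) method, such as an
// open browser WebSocket.
function sendAudioChunk(socket, samples) {
  socket.send(floatTo16BitPcm(samples));
}
```

In a browser, `samples` would typically be a `Float32Array` obtained from the microphone through the Web Audio API, and `socket` would be a `WebSocket` connected to the servlet. The encoding the servlet and the Cloud Speech API are actually configured for should be confirmed against the lab's sample solution.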

What you'll do

  • Create a virtual machine (VM)
  • Start an HTTP Java servlet
  • Capture audio on a webpage
  • Transcribe voice to text

Prerequisites

This lab assumes familiarity with:

  • The Java programming language.
  • Java servlets (specifically, the Jetty servlet container). While other servlet containers can be used, the sample solution uses Jetty, making solutions using other containers harder to verify against.
  • The JavaScript programming language. Code for webpages is written almost exclusively in JavaScript, and a lab about a webpage would be hard-pressed to avoid using it.
  • The Linux command line. Much of the lab will take place at a Linux command prompt, and familiarity with some common tools and a text editor for that environment will make things easier.
  • The Maven project management tool. While any Java project management tool can be used in principle, the sample solution uses Maven, making solutions using other tools harder to verify against.

Score

Each of the following steps is worth 20 points, for a total of 100:

  • Create a virtual machine (zone: us-central1-f)
  • Install necessary software on the VM instance
  • Create a firewall rule to allow TCP traffic on port 8443 in the default network
  • Run the sample solution (hello-https)
  • Run the sample solution to capture audio on a webpage