Troubleshooting and Solving Data Join Pitfalls





1 hour, 5 credits


Google Cloud Self-Paced Labs


BigQuery is Google's fully managed, NoOps, low-cost analytics database. With BigQuery, you can query terabytes of data without managing any infrastructure or needing a database administrator. BigQuery uses SQL and can take advantage of the pay-as-you-go model, so you can focus on analyzing data to find meaningful insights.

Joining data tables can provide meaningful insight into your dataset. However, when you join your data, there are common pitfalls that can corrupt your results. This lab focuses on avoiding those pitfalls. Types of joins:

  • Cross join: combines each row of the first dataset with each row of the second dataset, so every combination appears in the output.
  • Inner join: requires that key values exist in both tables for the records to appear in the results table. Records appear in the joined result only if there are matches in both tables for the key values.
  • Left join: each row in the left table appears in the results, regardless of whether there are matches in the right table.
  • Right join: the reverse of a left join. Each row in the right table appears in the results, regardless of whether there are matches in the left table.
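The four join types above can be compared on a pair of tiny tables. The following is a minimal sketch, not part of the lab: it assumes Python's built-in sqlite3 module as a local stand-in for BigQuery, and the `products` and `inventory` tables and their rows are hypothetical. The right join is written as a left join with the tables swapped, because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

# Hypothetical toy tables (not the lab's ecommerce dataset).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT, name TEXT);
    CREATE TABLE inventory (sku TEXT, stock INTEGER);
    INSERT INTO products VALUES ('A1', 'Mug'), ('B2', 'Pen'), ('C3', 'Cap');
    INSERT INTO inventory VALUES ('A1', 10), ('B2', 0), ('D4', 7);
""")

# Cross join: every product row paired with every inventory row (3 x 3).
cross_rows = conn.execute(
    "SELECT p.sku, i.sku FROM products AS p CROSS JOIN inventory AS i"
).fetchall()

# Inner join: only SKUs present in both tables.
inner_rows = conn.execute(
    "SELECT p.sku, i.stock FROM products AS p "
    "INNER JOIN inventory AS i ON p.sku = i.sku ORDER BY p.sku"
).fetchall()

# Left join: every product, with NULL (None) stock where nothing matches.
left_rows = conn.execute(
    "SELECT p.sku, i.stock FROM products AS p "
    "LEFT JOIN inventory AS i ON p.sku = i.sku ORDER BY p.sku"
).fetchall()

# Right join, emulated by swapping the tables in a left join:
# every inventory row, with NULL (None) name where nothing matches.
right_rows = conn.execute(
    "SELECT i.sku, p.name FROM inventory AS i "
    "LEFT JOIN products AS p ON p.sku = i.sku ORDER BY i.sku"
).fetchall()

print(len(cross_rows), inner_rows, left_rows, right_rows)
```

Comparing row counts across the four results is a quick way to see which join type a query is effectively performing.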

For more information about joins, see Join Page.

The dataset you'll use is an ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery. You have a copy of that dataset for this lab and will explore the available fields and rows for insights.

For syntax information to help you follow and update the queries, see Standard SQL Query Syntax.

What you'll do

In this lab, you perform these tasks:

  • Use BigQuery to explore a dataset

  • Troubleshoot duplicate rows in a dataset

  • Create joins between data tables

  • Understand each join type
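As an illustration of how duplicate rows can arise from a join, the sketch below shows a join key that is not unique on one side inflating an aggregate. It is an assumption-laden example rather than the lab's own queries: it uses Python's built-in sqlite3 module instead of BigQuery, and the `sales` and `catalog` tables and their contents are hypothetical.

```python
import sqlite3

# Hypothetical toy tables (not the lab's ecommerce dataset).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sku TEXT, qty INTEGER);
    CREATE TABLE catalog (sku TEXT, name TEXT);
    INSERT INTO sales VALUES ('A1', 2), ('B2', 5);
    -- 'A1' appears twice, so sku is not a unique key in catalog
    INSERT INTO catalog VALUES ('A1', 'Mug'), ('A1', 'Mug (blue)'), ('B2', 'Pen');
""")

# Check the candidate key: any sku listed here occurs more than once.
duplicate_keys = conn.execute(
    "SELECT sku, COUNT(*) FROM catalog GROUP BY sku HAVING COUNT(*) > 1"
).fetchall()

# Total quantity from the sales table alone.
true_total = conn.execute("SELECT SUM(qty) FROM sales").fetchone()[0]

# Same total after joining: the 'A1' sale matches two catalog rows,
# so its quantity is counted twice.
joined_total = conn.execute(
    "SELECT SUM(s.qty) FROM sales AS s "
    "INNER JOIN catalog AS c ON s.sku = c.sku"
).fetchone()[0]

print(duplicate_keys, true_total, joined_total)
```

Running a uniqueness check like the `GROUP BY ... HAVING COUNT(*) > 1` query before joining is one way to catch this kind of problem early.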
