Monday, December 23, 2019

Getting Started With BigQuery–A Cloud Storage

Any analytics project needs a place to store all the necessary data. In this article we’re going to talk about Google BigQuery, a cloud data warehouse that handles this task well.

Besides keeping data in one place, you can also use BigQuery to analyze the collected data with SQL queries. To do that properly and accurately, you need to structure your data correctly. How do you do that? Read on.

Datasets: What they are and how to create one

After you create a project in GCP, you need to add at least one dataset in BigQuery before you can store any data.

What is a dataset? It’s a top-level container that keeps your data organized in tables and views. It also lets you control access to that data.

Open your project in GCP and go to the BigQuery tab. There, choose “Create Dataset.” Enter a name for the dataset and set the table expiration, that is, how long tables in it are kept.

If you want tables with data to be deleted automatically, specify when exactly. If there’s no such need, leave the default Never option.
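If you would rather script this step than click through the interface, the same settings map onto the body of a BigQuery REST API datasets.insert request. The sketch below only builds that body as JSON and sends nothing to Google. The project name, dataset name, location, and 30-day expiration are placeholder assumptions, and dataset_body is a hypothetical helper, not part of any Google library.

```python
import json

MS_PER_DAY = 24 * 60 * 60 * 1000


def dataset_body(project_id, dataset_id, expiration_days=None, location="US"):
    """Build a request body for the BigQuery datasets.insert REST method.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    body = {
        "datasetReference": {"projectId": project_id, "datasetId": dataset_id},
        "location": location,
    }
    if expiration_days is not None:
        # The REST API encodes this int64 field as a string of milliseconds.
        body["defaultTableExpirationMs"] = str(expiration_days * MS_PER_DAY)
    return body


# Placeholder names; tables in this dataset would expire after 30 days.
body = dataset_body("project_name", "dataset_name", expiration_days=30)
print(json.dumps(body, indent=2))
```

Leaving expiration_days unset omits the field, which corresponds to tables that are never deleted automatically.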

Adding a table for uploading data

Now that you have a dataset, you can add a table to hold the data. A table consists of rows, and each row consists of columns, also called fields. You can create a table in different ways:

  • Create an empty table and set up a data schema for it by yourself
  • Create a table using the result of a previously calculated SQL query
  • Upload a file from your computer
  • Create a table that refers to an external source (Cloud Bigtable, Cloud Storage, or Google Drive) instead of loading or streaming the data

Let’s take a closer look at the first option.

Here’s what you need to do.

1. Pick the dataset to which you want to add the table, then click Create Table.

2. In the Source field, choose Empty table. For the table type, select Native table, and give the table a name using Latin characters.

3. Specify the table schema. Each column has two required properties (name and data type) and two optional ones (mode and description). Fill in all of these fields carefully so that you can work with the data properly later.

For an empty table in BigQuery, you need to set the schema manually: either click the “Add field” button or enter the table schema as a JSON array using the Edit as text switch.
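As a rough illustration of the JSON form, the sketch below builds a schema array with the four properties described above (name, type, mode, description) and prints it so that it can be pasted into the Edit as text box. The column names, types, and descriptions are invented for illustration, and check_schema is a hypothetical sanity check, not a BigQuery function.

```python
import json

# Example columns only; replace them with the fields of your own table.
schema = [
    {"name": "date", "type": "DATE", "mode": "REQUIRED", "description": "Order date"},
    {"name": "order_id", "type": "STRING", "mode": "REQUIRED", "description": "Order identifier"},
    {"name": "order_type", "type": "STRING", "mode": "NULLABLE", "description": "Type of order"},
    {"name": "product_id", "type": "INTEGER", "mode": "NULLABLE", "description": "Product identifier"},
]

ALLOWED_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def check_schema(fields):
    """Raise ValueError if a field lacks a name or type, or has an unknown mode."""
    for field in fields:
        if not field.get("name") or not field.get("type"):
            raise ValueError(f"name and type are required: {field}")
        # Mode is optional; a missing mode is treated here as NULLABLE.
        if field.get("mode", "NULLABLE") not in ALLOWED_MODES:
            raise ValueError(f"unknown mode: {field}")
    return True


check_schema(schema)
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```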

Please note: when loading files, BigQuery can automatically change the name of a column to make it compatible with its own SQL syntax. That’s why you should name columns and tables in English, using Latin characters.

Making modifications to the table schema

As mentioned above, BigQuery might automatically change something in a table’s schema. There’s no need to panic: you can change tables manually.

For example, you can use a SQL query to select all the columns in a table and rename one or more of them. You can then overwrite the existing table with the query result or write it to a new table:

#legacySQL
SELECT
  date,
  order_id,
  order___________ AS order_type, -- new field name
  product_id
FROM
  [project_name:dataset_name.owoxbi_sessions_20190314]

#standardSQL
SELECT
  * EXCEPT (orotp, ddat),
  orotp AS order_id,
  ddat AS date
FROM `project_name.dataset_name.owoxbi_sessions_20190314`
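If you need to rename columns in many tables, writing such queries by hand gets tedious. Below is a minimal sketch of a helper that generates the standard SQL variant from a mapping of old names to new ones. The function build_rename_query is hypothetical, and it assumes the table and column names are trusted, simple identifiers, since it does no escaping.

```python
def build_rename_query(table, renames):
    """Return a standard SQL query that renames columns via SELECT * EXCEPT.

    `renames` maps old column names to new ones. Names are inserted as-is,
    so only pass identifiers you trust (assumption of this sketch).
    """
    if not renames:
        raise ValueError("renames must contain at least one column")
    old_names = ", ".join(renames)
    aliases = ",\n  ".join(f"{old} AS {new}" for old, new in renames.items())
    return (
        "#standardSQL\n"
        "SELECT\n"
        f"  * EXCEPT ({old_names}),\n"
        f"  {aliases}\n"
        f"FROM `{table}`"
    )


query = build_rename_query(
    "project_name.dataset_name.owoxbi_sessions_20190314",
    {"orotp": "order_id", "ddat": "date"},
)
print(query)
```

The printed text can be pasted into the BigQuery query editor; the helper itself never contacts BigQuery.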

You can also use a SQL query to change data types: select all the data from a table and cast the relevant columns to a different type. For example:

#standardSQL
SELECT
  CAST(order_id AS STRING) AS order_id,
  CAST(date AS TIMESTAMP) AS date
FROM `project_name.dataset_name.owoxbi_sessions_20190314`

In addition to the changes above, you can also change a column’s mode, as described in the help documentation.

Use the SELECT * EXCEPT query to remove a column (or columns), then write the query results to the old table or create a new one. One more example of such a query:

#standardSQL
SELECT
  * EXCEPT (order_id)
FROM `project_name.dataset_name.owoxbi_sessions_20190314`
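One way to write a query result back over the old table without setting a destination table in the interface is a standard SQL CREATE OR REPLACE TABLE ... AS SELECT statement. This technique is not described in the article and is offered only as a sketch. The helper overwrite_with is hypothetical and simply assembles the statement text.

```python
def overwrite_with(table, select_sql):
    """Wrap a SELECT in CREATE OR REPLACE TABLE so its result replaces `table`.

    Only builds the statement text. Running it is destructive: the old
    contents of `table` are replaced by the query result.
    """
    return f"CREATE OR REPLACE TABLE `{table}` AS\n{select_sql}"


table = "project_name.dataset_name.owoxbi_sessions_20190314"
# Same column-dropping query as in the article, without the #standardSQL prefix.
drop_column = f"SELECT\n  * EXCEPT (order_id)\nFROM `{table}`"
statement = overwrite_with(table, drop_column)
print(statement)
```

To create a new table instead of overwriting the old one, pass a different table name as the first argument.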

You can find more information in the Google Cloud Platform help documentation.

Exporting and importing data stored in BigQuery

There are also several ways to export and import your data. Let’s start with the first option: the Google BigQuery interface.

Importing data

  • Open your dataset
  • Click Create Table, and select the data source: Cloud Storage, your computer, Google Drive, or Cloud Bigtable
  • Specify the path to the file, its format, and the name of the table
  • Click Create Table

Ta-da, a table will appear in your dataset.

Exporting data

  • Create a report through the system interface:
    • Open the desired table with data
    • Click Export.

You’ll see two options: view and save the report in Google Data Studio, or export the data to Google Cloud Storage by specifying where to save it and in what format.

Export and import data using an add-on from OWOX BI

The second option for exporting and importing data is the free OWOX BI BigQuery Reports add-on. With it you can quickly transfer data from Google BigQuery to Google Sheets and vice versa without any CSV files.

Imagine this situation: you want to transfer offline data to BigQuery for a ROPO report. You can do this by following these steps:

  1. Install the add-on in your browser
  2. Open your data file in Google Sheets
  3. In the Add-ons tab, choose OWOX BI BigQuery Reports → Upload data to BigQuery
  4. Select the necessary project and dataset in BigQuery and give the table a name
  5. Pick the fields whose values you want to load
  6. Start the upload

If you want to export data from BigQuery to Google Sheets, follow these steps:

  1. Open Google Sheets
  2. In the menu, pick OWOX BI BigQuery Reports → Add a new report
  3. Enter your project in Google BigQuery
  4. Select Add new query and insert your SQL query to pull or calculate the necessary data
  5. Give the query a convenient name
  6. Click the Save & Run button

Sometimes you need an up-to-date report on a daily or weekly basis. For this, you can schedule automatic updates:

  1. Choose OWOX BI BigQuery Reports → Schedule report in the Google Sheets menu.
  2. Set the time and frequency
  3. Click Save

For more detailed instructions on getting started with BigQuery, follow this link: https://www.owox.com/blog/use-cases/bigquery-schema/



from Feedster https://www.feedster.com/technology/getting-started-with-bigquery-a-cloud-storage/
