
Google BigQuery

Introduction

This guide takes you through setting up Google BigQuery for use with Cyclr, as well as installing a Google BigQuery Connector.

You will need a Google Cloud Platform account.

Setup & Authentication

Overview

The Google BigQuery Connector supports two Authentication Types, which are explained in this guide:

  • OAuth 2.0 Authorization Code - requires user interaction to install the Connector.

  • Service Account - impersonates a domain account and requires less user interaction.

Remote Setup in Google Cloud Console - performed by Cyclr Partner

Depending on the Authentication Type you choose, the setup is slightly different after following the initial “common” setup:

Common Setup - Enabling the Google BigQuery API

Both Authentication Types require the following setup to be performed.

To access the Google BigQuery API endpoints, you need to enable the Google BigQuery API within the project in your workspace. Google's documentation on how to do this can be found here.

Then skip to the appropriate section below:

OAuth 2.0 Authorization Code Setup

To authenticate the Connector using the “OAuth 2.0 Authorization Code” Authentication Type, you need to create a web application inside your Google Cloud Platform account to get your OAuth 2.0 credentials:

  1. From the Google Cloud Platform Credentials page, select CREATE CREDENTIALS > OAuth client ID. Create the OAuth client with the following settings:

    • Application type: Web application

    • Authorised redirect URIs: A URL containing your Cyclr Service Domain. This can be found under Settings > General Settings > Service Domain in your Cyclr console. For example, https://{ServiceDomain}/connector/callback

  2. Make note of the Client ID and Client secret.

For more information, see Google's guide on how to create client credentials.
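When the Connector is installed, Cyclr drives Google’s standard “Authorization Code” flow for you: the user is sent to Google’s consent screen, then redirected back to the callback URL above with a code that Cyclr exchanges for tokens. As an illustrative sketch only (the endpoint and parameters are Google’s published OAuth 2.0 web-server flow; the client ID and domain values are placeholders), the authorization URL looks like this:

```python
from urllib.parse import urlencode

def build_auth_url(client_id: str, service_domain: str) -> str:
    """Build the Google OAuth 2.0 authorization URL a user is sent to
    when installing the Connector (illustrative sketch)."""
    params = {
        "client_id": client_id,
        # Must exactly match the "Authorised redirect URIs" value
        # entered in the Google Cloud Console.
        "redirect_uri": f"https://{service_domain}/connector/callback",
        "response_type": "code",
        "scope": "https://www.googleapis.com/auth/bigquery",
        "access_type": "offline",  # request a refresh token
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)

print(build_auth_url("my-client-id.apps.googleusercontent.com", "example.cyclr.com"))
```

You never need to build this URL yourself; it only shows where the Client ID and redirect URI from the steps above end up.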

Service Account Setup

To authenticate the Connector using the “Service Account” Authentication Type, you need to do the following:

  1. Enable the Google BigQuery API for a project within your workspace

  2. Create a service account

  3. Enable domain-wide delegation for the service account

Creating a Service Account

A service account needs to be created to allow admin-level access to Google BigQuery within your workspace. Google's documentation on how to do this can be found here. When creating the service account, set up the fields below as follows:

  • Grant this service account access to project: Set the service account role to Owner.

  • Grant users access to this service account: Set the Service account admins role field to any users that will be administering the service account.

Make note of the OAuth 2 client ID and Email for your service account. You will need these to enable domain-wide delegation and to authorize your account with Cyclr. You can find this on the service accounts page.

When adding the service account key, set the Key type to JSON, then open the downloaded .json key file and make note of the private_key field. You need this to authorize your account with Cyclr.
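The downloaded key file is plain JSON, so you can also pull these values out programmatically. A minimal sketch, using the standard field names Google writes into service account key files (the values below are invented placeholders, not a real key):

```python
import json

# Illustrative contents of a downloaded service account key file;
# a real file contains a full PEM private key and your own project values.
key_file_contents = """
{
  "type": "service_account",
  "client_email": "my-service-account@my-project.iam.gserviceaccount.com",
  "client_id": "123456789012345678901",
  "private_key": "-----BEGIN PRIVATE KEY-----\\n...\\n-----END PRIVATE KEY-----\\n"
}
"""

key = json.loads(key_file_contents)

# The values Cyclr needs when installing with Service Account authentication:
print(key["client_email"])  # Service Account Email
print(key["client_id"])     # OAuth 2 client ID, for domain-wide delegation
print(key["private_key"])   # Service Account Private Key
```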

Enabling Domain-wide delegation for your Service Account

Domain-wide delegation needs to be enabled for your service account to allow it to access user data on behalf of users in your workspace. You need to be a "super administrator" of the workspace to configure this. Google's documentation on how to do this can be found here. When enabling domain-wide delegation, set up the fields below as follows:

  • Client ID: Set to the OAuth 2 client ID from the previous section.

  • OAuth scopes (comma-delimited): Add the following scope:

    • https://www.googleapis.com/auth/bigquery for BigQuery access.
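Behind the scenes, these values feed Google’s service-account JWT flow: a JWT naming the service account, the impersonated administrator, and the BigQuery scope is signed with the private key and exchanged for an access token. A sketch of the claim set only (the claim names and token endpoint are Google’s documented server-to-server OAuth 2.0 flow; signing and the token exchange are omitted, and the email values are placeholders):

```python
import time

def build_delegation_claims(service_account_email: str, admin_email: str) -> dict:
    """Build the JWT claim set for Google's service account flow with
    domain-wide delegation (sketch; the JWT must then be signed with the
    service account's private key and POSTed to the token endpoint)."""
    now = int(time.time())
    return {
        "iss": service_account_email,  # the service account
        "sub": admin_email,            # the workspace user being impersonated
        "scope": "https://www.googleapis.com/auth/bigquery",
        "aud": "https://oauth2.googleapis.com/token",
        "iat": now,
        "exp": now + 3600,             # maximum lifetime: one hour
    }
```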

Partner Setup in Cyclr Console

Having performed the appropriate setup described above, go to your Cyclr Partner Console:

  1. Go to Connectors > Application Connector Library.

  2. Use the search box to locate the Google BigQuery Connector entry.

  3. Select the Pencil button.

  4. Select the Settings tab.

  5. Enter the below values:

    • Client ID: The default client ID to use. Only used with “OAuth 2.0 Authorization Code” Authentication. Leave blank if using “Service Account”.

    • Client Secret: The default client secret to use. Only used with “OAuth 2.0 Authorization Code” Authentication. Leave blank if using “Service Account”.

    • Scopes: The scopes you want to use. Cyclr uses a default scope of https://www.googleapis.com/auth/bigquery if you don't set this value.

  6. Select Save Changes.

If you leave these values blank, they must be provided each time the Connector is installed.

Cyclr Connector Installation

Depending on the Authentication Type you’ve chosen, the installation of a Google BigQuery Connector is slightly different, as explained below.

You’ll also be able to provide a Project ID and a Dataset ID, but neither are required unless you wish to use the Methods in the Connector’s Tables Category. See the Obtaining Project ID and Dataset ID Values section below for details.

OAuth 2.0 Authorization Code Installation

You can provide the following values:

  • Project ID: The project ID of the project to use. Required to use methods in the Connector’s Tables method category.

  • Dataset ID: The dataset ID of the dataset to use. Required to use methods in the Connector’s Tables method category.

Service Account Installation

You can provide the following values:

  • Service Account Email: The email address of the service account.

  • Service Account Administrator Email: The email address of the administrator of the service account.

  • Service Account Private Key: The private key of the service account.

  • Project ID: The project ID of the project to use. Required to use methods in the Connector’s Tables method category.

  • Dataset ID: The dataset ID of the dataset to use. Required to use methods in the Connector’s Tables method category.

Additional Information

Obtaining Project ID and Dataset ID Values

You will need a Project ID and a Dataset ID to use Methods in the Connector’s Tables Category.

Project ID

From the Google Cloud Platform dashboard:

  1. Select the project name from the drop-down menu in the navigation bar.

  2. Make a note of the project ID listed under the ID heading. As an example, that might be:
    myproject-389110

You can also find the project ID if you install the Google BigQuery Connector without a project ID or dataset ID set, then use the Utilities > Projects > List Projects Method.

After you do this, you can re-authenticate the Connector and set the Project ID on it.

Dataset ID

From the Google BigQuery dashboard:

  1. In the Explorer pane, select the arrow to the left of the project ID to expand it.

  2. Select the arrow to the left of each dataset to view the tables within it and make a note of the dataset ID.

You can also find the dataset ID if you install the Google BigQuery Connector without a project ID or dataset ID set, then use the Utilities > Datasets > List Datasets Method.

After you do this, you can re-authenticate the Connector and set the Dataset ID on it.
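Both of these Utility Methods wrap simple BigQuery REST API calls; the paths below are Google’s published v2 endpoints. A sketch of the request URLs involved (you would also need an OAuth access token to actually call them, which Cyclr handles for you):

```python
BASE = "https://bigquery.googleapis.com/bigquery/v2"

def list_projects_url() -> str:
    """The endpoint behind Utilities > Projects > List Projects."""
    return f"{BASE}/projects"

def list_datasets_url(project_id: str) -> str:
    """The endpoint behind Utilities > Datasets > List Datasets."""
    return f"{BASE}/projects/{project_id}/datasets"

print(list_projects_url())
print(list_datasets_url("myproject-389110"))
```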

Create a Table Custom Object Category

You can create a Custom Object Category to access each table within a project and dataset as its own method category. Create multiple Custom Object Categories to access multiple tables under a single connector installation.

You need to install the Google BigQuery connector with a Project ID and Dataset ID for table custom objects to function.

To create a table Custom Object, from the Edit Connector page of the Google BigQuery connector:

  1. Under the Methods & Fields heading, select the Tables methods category.

  2. Select the pink Copy this Category to create a Custom Category Object icon.

  3. Select the Select object dropdown.

  4. Select the table you want to create a Custom Object Category for.

  5. Select Copy.

The methods within the new method category you created now target the selected table.

Create a Cycle to request a large set of data

Use the List New Table Data Incrementally or List Partial Table Data method to incrementally list data in a Cycle. You can use Generic Webhook Connector methods to allow the cycle to send a request to itself to continually run. This approach is useful for very large sets of data.

When you use the List Partial Table Data method, you need to track the last index of the table data returned in the current request and send it in the next request. To do this, add request fields for the Generic Webhook Connector then map the last index of the table data returned as additional steps, as explained below.
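This index tracking mirrors the startIndex paging of BigQuery’s tabledata.list endpoint: each request returns one page of rows, and the next request must resume where the previous one stopped. A sketch of the bookkeeping, under the assumption that the Connector pages with startIndex/maxResults (the URL is Google’s real tabledata.list path; the project, dataset, and table names are illustrative):

```python
BASE = "https://bigquery.googleapis.com/bigquery/v2"

def table_data_url(project_id: str, dataset_id: str, table_id: str,
                   start_index: int, max_results: int) -> str:
    """tabledata.list request for one page of rows, starting at start_index."""
    return (f"{BASE}/projects/{project_id}/datasets/{dataset_id}"
            f"/tables/{table_id}/data"
            f"?startIndex={start_index}&maxResults={max_results}")

def next_start_index(start_index: int, rows_returned: int) -> int:
    """The 'Last Index' value the webhook must carry into the next request."""
    return start_index + rows_returned

# The first request starts at 0; if it returns 10,000 rows, the webhook
# passes lastIndex=10000 so the next request resumes from row 10,000.
print(table_data_url("myproject-389110", "my_dataset", "my_table", 0, 10000))
```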

Prerequisites

  • Install the Google BigQuery connector.

  • Create a Table Custom Object that targets the required table.

  • Install the Generic Webhook Connector

    • To install the Generic Webhook Connector, navigate to the Utility Connectors page:

      • From the console, select Templates > Template Connectors > + Install New Utility.

      • From an account, select Connectors > + Install New Utility.

    • Select Install under the Generic Webhook Connector.

    • Update the Name field to a recognisable name.

    • Select Next.

Add request fields for the Generic Webhook Connector

You only need to do this step for the List Partial Table Data method.

To allow the next request to use the last index of the table data returned in the current request, you must add request fields to the Generic Webhook connector that track the index:

Navigate to the Edit Connector page for the Generic Webhook Connector:

  • From the console, select Templates > Template Connectors. Under the Installed Utility Connectors heading, select the Edit Connector icon next to the Generic Webhook Connector.

  • From an account, select Connectors. Under the Installed Utility Connectors heading, select the Edit Connector icon next to the Generic Webhook Connector.

Add a request field to the POST method

  1. Under the Methods and Fields heading, select HTTP Methods.

  2. Select POST.

  3. Under the Request Fields heading, select the Add Field icon.

  4. Set the Field Location to lastIndex.

  5. Set the Display Name to Last Index.

  6. Set the Data Type to Integer.

  7. Select Create.

Add a request field to the Webhook method

  1. Under the Methods and Fields heading, select Webhooks.

  2. Select Webhook.

  3. Under the Request Fields heading, select the Add Field icon.

  4. Set the Field Location to lastIndex.

  5. Set the Display Name to Last Index.

  6. Set the Data Type to Integer.

  7. Select Create.

Create a Cycle

From the console:

  1. Select Templates > Template Library.

  2. Select Create New Template.

  3. Enter a template name.

  4. Select Create.

From an account:

  1. Select Cycles.

  2. Select Design New Cycle.

  3. Enter a Cycle name.

  4. Select Create.

Add methods to the Cycle

Add the following methods to the Cycle:

  • Generic Webhook > HTTP Methods > POST

  • Generic Webhook > Webhooks > Webhook

  • Google BigQuery > Tables > List New Table Data Incrementally or List Partial Table Data

  • Tools > Delay: You can use the Delay tool to set the time between requests once all current table data has been retrieved.

Connect the methods

Connect the methods as follows:

In this example, add any additional Cycle methods between the true exit of the List Partial Table Data method and the POST method.

Configure a Cycle

Set up the POST-to-webhook loop

The POST method must target the Webhook method to allow the Cycle to send a request to itself to continually run:

  1. Select the Step setup of the Webhook method.

  2. Copy the webhook URL and close the window.

  3. Select the Step setup of the POST method.

  4. Select Select... next to the URL field.

  5. Select Type a Value.

  6. Enter the webhook URL into the text box and close the window.

Map the last index of the table data returned

This step is only required for the List Partial Table Data method.

  1. Select the Step setup of the List Partial Table Data method.

  2. Select Ignore next to the Start Index field.

  3. Select Webhook.

  4. Select Nothing Selected.

  5. Select Last Index. This is the field added earlier in this process.

  6. Close the window.

  7. Select the Step setup icon of the POST method.

  8. Select Ignore next to the Last Index field. This is the field added earlier in this process.

  9. Select the List Partial Table Data method.

  10. Select Nothing Selected.

  11. Select Last Row Index.

  12. Close the window.

Configure the Rows Per Execution setting

The rows per execution setting determines how many table rows are returned per Cycle execution. You need to set the rows per execution manually; a suitable value depends on the width of your table data. For example, a starting point of 10,000 might be useful.

  1. Select the Step setup of the List Partial Table Data or List New Table Data Incrementally methods.

  2. Select Ignore next to the Rows Per Execution field.

  3. Select Type a Value.

  4. Enter the rows per execution into the text box.

  5. Close the window.
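As a rough sanity check on the value you choose, the number of Cycle executions needed to retrieve a whole table is the row count divided by the rows per execution, rounded up:

```python
import math

def executions_needed(total_rows: int, rows_per_execution: int) -> int:
    """Approximate number of Cycle executions to retrieve all table data."""
    return math.ceil(total_rows / rows_per_execution)

# e.g. a 1,250,000-row table at 10,000 rows per execution:
print(executions_needed(1_250_000, 10_000))  # 125 executions
```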

Configure the delay time

Use the delay step to set the time between cycle executions once the cycle retrieves all of the table data. When new table data is found, the cycle continues to run until it retrieves all table data and then reverts back to the delay.

For example, a delay duration of 12 hours makes the Cycle check for new table data once every 12 hours once it retrieves all of the table data.

  1. Select the Step setup of the Delay method.

  2. Set the duration and close the window.

Give dataset access to another user

You can give other users access to a dataset. For more information, see Google BigQuery's documentation on IAM access control.

Add access for another user from the Google BigQuery console:

  1. From the Explorer pane, navigate and select the dataset.

  2. From the dataset navigation menu bar, select Sharing > Permissions.

  3. Select Add Principal.

  4. Under the Add principals heading, in the New principals field, enter users, groups, domains, or service accounts to give access to.

  5. Under the Assign roles heading, select the Select a role dropdown box to assign roles.

  6. Select Save.

Insert table data delay

Because Google BigQuery needs to process and stream the data from Google’s internal servers to the table before you can access it, there might be a delay between inserting table data and being able to access it.

If the insert call doesn't return an error, the data appears in your table within 24 hours.
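Inserts like this go through BigQuery’s streaming tabledata.insertAll endpoint, which buffers rows before they become queryable; that buffering is the source of the delay. A sketch of the request shape (the endpoint path and the rows/json payload keys are Google’s published v2 API; the row values are illustrative):

```python
import json

BASE = "https://bigquery.googleapis.com/bigquery/v2"

def insert_all_url(project_id: str, dataset_id: str, table_id: str) -> str:
    """The streaming insert endpoint for a table."""
    return (f"{BASE}/projects/{project_id}/datasets/{dataset_id}"
            f"/tables/{table_id}/insertAll")

def insert_all_payload(rows: list) -> dict:
    """Wrap rows in the tabledata.insertAll request body shape."""
    return {"rows": [{"json": row} for row in rows]}

payload = insert_all_payload([{"name": "Ada", "score": 42}])
print(json.dumps(payload))
```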
