Customerlabs CDP Documentation
A fast, highly scalable, cost-effective, and fully managed cloud data warehouse for analytics, with built-in machine learning. Create your own data marketing lake with BigQuery by storing all your customer event data in one place. CustomerLabs can send the event data into BigQuery with a single click without having to write any code.
To sync your data with BigQuery, you have to follow the following three steps:
- Create a project if not already created and enable BigQuery API for the selected project.
- Add IAM reference ID as a BigQuery user in your Google Cloud account.
- Enable Billing
- Copy the project ID and paste it in CustomerLabs
Create a Project and Enable BigQuery API
1. Log in to your Google Cloud Platform, select a project, or create one if you haven’t created one.
2. Copy the project ID of the new project/relevant project that you want to connect.
3. Check if the BigQuery API is enabled for the chosen project.
BigQuery is automatically enabled in new projects. To activate BigQuery in an existing project, enable the BigQuery API.
Add IAM reference ID
1. Open your Google Cloud platform and navigate to IAM & Admin → IAM → The project you want to connect → Under Permissions, View by principals tab → Grant access
2. In the Add Principals section → paste the Service account ID ([email protected]) under New principals text box.
3. Assign the following roles and click on Save.
- BigQuery Data Owner
- BigQuery Job User
Enable Billing
- If you haven’t enabled billing, navigate to Billing and linked billing account.
- Make sure, your billing is active.
Note: Billing should be enabled for the project to be active.
Configure the project ID in the CustomerLabs
1. Login to your CustomerLabs account.
2. Navigate to “Destinations” and select BigQuery
3. You will see a pop-up screen, click on Enable and then go to Configuration settings
4. Now, paste your Google project ID which you had copied from your cloud account → Save Changes
You have successfully connected CustomerLabs with BigQuery for sending your data.
Schema
Table: events_data | ||||
field_name | field_type | field_mode | Description | Samples |
screen_size (generated field based on IP address) | STRING | NULLABLE | Screen size of the user’s device | 1920 x 1080, 1536 x 864 |
platform (generated field based on IP address) | STRING | NULLABLE | The OS specification of the user’s device | Windows 10, Mac OS X10 |
country (generated field based on IP address) | STRING | NULLABLE | Country of the user | India, United States |
city (generated field based on IP address) | STRING | NULLABLE | City of the user | Chennai, Saint Paul |
state (generated field based on IP address) | STRING | NULLABLE | State of the user | Tamil Nadu, Minnesota |
postal_code (generated field based on IP address) | STRING | NULLABLE | Postal code of the user | 500044 |
continent (generated field based on IP address) | STRING | NULLABLE | Continent of the user | Asia, North America |
visitor_type | STRING | NULLABLE | The value will be a New user or a Returning user | New, Returning |
session_id | STRING | NULLABLE | The session_id of the event | CL-5bh3y984u23uefhf |
source | STRING | NULLABLE | The value will be ‘direct’ or ‘google search’ | direct, google search |
source_type | STRING | NULLABLE | The value will be ‘organic’ or ‘unknown’ | organic, unknown |
page_title | STRING | NULLABLE | The title of the page visited by the user | Festive Sale|Buy Now| XXXX |
page_url | STRING | NULLABLE | The URL of the page visited by the user | www.sample.com |
device | STRING | NULLABLE | The value will be ‘mobile device’ or ‘desktop’ | mobile device, desktop |
ip_address | STRING | NULLABLE | IP Address of the user | 123.123.123.123 |
location (generated field based on IP address) | STRING | NULLABLE | latitude and longitude | 37.7510 , -97.8220 |
timezone (generated field based on IP address) | STRING | NULLABLE | Timezone of the user | Asia/Kolkata, America/Chicago |
user_traits | RECORD | REPEATED | first_name, last_name, email, and phone_number are accurate fields, explicitly collected from the users and displayed as key-value pairs. Fields such as location, city, state, country, and timezone are derived from ip_address. | user_traits |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
group_traits | RECORD | REPEATED | Future use | NA |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
cl_utm_params | RECORD | REPEATED | utm_source, utm_medium will be categorised as cl_utm_params and displayed as key-value pairs | cl_utm_params |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
cl_other_params | RECORD | REPEATED | Query parameters from the URL will be fetched and stored as key-value pairs. Since UTM parameters are managed by cl_utm_params, they won’t be included in the cl_other_params field. | cl_other_params |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
event_attributes | RECORD | REPEATED | Event attributes provide additional context or details about an event such as source, status etc., and will be categorised as event_attributes displayed as key-value pairs with type. | event_attributes |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
type | STRING | NULLABLE | ||
products | RECORD | REPEATED | This field is available only for e-commerce. Standard events provided by CustomerLabs are:cl_added_to_cartcl_added_to_wishlistcl_category_viewedcl_checkout_madecl_product_clickedcl_product_viewedcl_purchasedcl_removed_from_cartThe products with properties such as product_name, product_price etc., involved in a particular event will be displayed as key-value pairs with type. | product_traits |
product_traits | RECORD | REPEATED | ||
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
type | STRING | NULLABLE | ||
external_ids | RECORD | REPEATED | Other user identification parameters used by different platforms are stored as External IDs. These can be appended to a user which helps unify the user details in CustomerLabs and identify the user across multiple platforms based on the IDs collected.Apart from fetching standard user identifiers such as email & phone number, the system also supports the collection of External User IDs The following external IDs identify_by_email, identify_by_phone, audiencelab_id, google_analytics__session_id, google_analytics__client_id, facebook___fbp, facebook___fbc, customerlabs_user_id and other external IDs will be displayed as key-value pairs with type | external_ids |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
type | STRING | NULLABLE | ||
group_external_ids | RECORD | REPEATED | Future use | NA |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
type | STRING | NULLABLE | ||
additional_info | RECORD | REPEATED | Details about the user’s first_touch (first interaction), device, session, platform, and other information will be displayed as key-value pairs. | additional_info |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
segments | RECORD | REPEATED | Future use | NA |
segment_id | STRING | NULLABLE | ||
segment_name | STRING | NULLABLE | ||
added_at | TIMESTAMP | NULLABLE | ||
Table: users_data | ||||
field_name | field_type | field_mode | Description | Samples |
user_id | STRING | NULLABLE | The userID given to a user by CustomerLabs | cl3967trkvb12gh35mm802-999k-42k9-6bvd-nj9034msgfr7 |
traits | RECORD | REPEATED | first_name, last_name, email, and phone_number are accurate fields, explicitly collected from the users and displayed as key-value pairs. Fields such as location, city, state, country, and timezone are derived from ip_address. | user_traits |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
external_ids | RECORD | REPEATED | Other user identification parameters used by different platforms are stored as External IDs. These can be appended to a user which helps unify the user details in CustomerLabs and identify the user across multiple platforms based on the IDs collected.Apart from fetching standard user identifiers such as email & phone number, the system also supports the collection of External User IDs The following external IDs identify_by_email, identify_by_phone, audiencelab_id, google_analytics__session_id, google_analytics__client_id, facebook___fbp, facebook___fbc, customerlabs_user_id and other external IDs will be displayed as key-value pairs with type. | external_ids |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
type | STRING | NULLABLE | ||
additional_info | RECORD | REPEATED | Details about the user’s first_touch (first interaction), device, session, platform, and other information will be displayed as key-value pairs. | additional_info |
key | STRING | NULLABLE | ||
value | STRING | NULLABLE | ||
segments | RECORD | REPEATED | Future use | NA |
segment_id | STRING | NULLABLE | ||
segment_name | STRING | NULLABLE | ||
added_at | STRING | NULLABLE | ||
inserted_at | TIMESTAMP | NULLABLE | Time the user data is ingested in the DB. | 2024-06-07T14:30:03.433Z |
Table: anonymous_users_data | ||||
field_name | field_type | field_mode | Description | Samples |
user_id | STRING | NULLABLE | The userID given to a user by CustomerLabs | cl3967trkvb12gh35mm802-999k-42k9-6bvd-nj9034msgfr7 |
anonymous_user_id | STRING | NULLABLE | The ID is given to an anonymous user by CustomerLabs. This user will be converted as a known user when the user shares credentials in other sessions. The obtained information is stitched with the user ID for user unification. | cl9090tjkdew789854fgg-0hkl-k0w9-6bvd-klbh234598gf |
inserted_at | TIMESTAMP | NULLABLE | Time the user data is ingested in the DB. | 2024-06-07T14:30:03.433Z |
Sample data
user_traits
[
{
"key": "email",
"value": "[email protected]"
},
{
"key": "first_name",
"value": "test"
},
{
"key": "last_name",
"value": "last"
}
]
cl_utm_params
[
{
"key": "utm_source",
"value": "[email protected]"
},
{
"key": "utm_medium",
"value": "test"
},
{
"key": "utm_campaign",
"value": "last"
}
]
cl_other_params
[
{
"key": "gad_source",
"value": "2"
},
{
"key": "gclid",
"value": "Cxfgcgyugyhuivdshue89732r89uy298r2y37uhr38wyhdixh8he38rh8d32ywhf2u289uerdjiw9"
},
{
"key": "user-agent",
"value": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Mobile Safari/537.36"
}
]
event_attributes
[
{
"key": "event_from",
"value": "website"
},
{
"key": "event_source",
"value": "js_helper"
},
{
"key": "event_status",
"value": "active"
}
]
product_traits
[
[ //Product 1
{
"key": "product_id",
"value": "1",
"type": "number"
},
{
"key": "product_name",
"value": "test",
"type": "string"
},
{
"key": "product_price",
"value": "100",
"type": "number"
}
],
[ //Product 2
{
"key": "product_id",
"value": "2",
"type": "number"
},
{
"key": "product_name",
"value": "test",
"type": "string"
},
{
"key": "product_price",
"value": "100",
"type": "number"
}
]
]
external_ids
[
{
"key": "client_id",
"value": "js8628376773",
"type": "google_analytics"
},
{
"key": "_fbc",
"value": "js786486283752523",
"type": "facebook"
}
]
additional_info
[
{
"key": "platform",
"value": "Android 6"
},
{
"key": "browser",
"value": "Googlebot 2"
},
{
"key": "browser_language",
"value": "en-US"
},
{
"key": "screen_size",
"value": "412 x 732"
}
]
Sample Query
Sample 1 : Query list of known and anonymous user ids who performed an event.
SELECT user_id, count(user_id) FROM `[Dataset Id].events_data` WHERE action = "pageview" GROUP BY user_id
Sample 2 : Query with identity resolution (merging anonymous and known user data) using `[Dataset Id].anonymous_users_data`
WITH
anonymous_user_list AS (
SELECT user_id, anonymous_user_id FROM (
SELECT user_id, anonymous_user_id, ROW_NUMBER() OVER( PARTITION BY anonymous_user_id ORDER BY inserted_at DESC ) AS top
FROM anonymous_users
) WHERE top = 1
)
SELECT user_id, count(user_id) FROM (
SELECT IFNULL(aul.user_id, t0.user_id) AS user_id FROM `[Dataset Id].events_data` t0
LEFT JOIN anonymous_user_list AS aul ON aul.anonymous_user_id = t0.user_id
WHERE action = "pageview"
) GROUP BY user_id
Sample 3 : Query to get user information based on the event
WITH
anonymous_user_list AS (
SELECT user_id, anonymous_user_id FROM (
SELECT user_id, anonymous_user_id, ROW_NUMBER() OVER( PARTITION BY anonymous_user_id ORDER BY inserted_at DESC ) AS top
FROM `[Dataset Id].anonymous_users_data`
) WHERE top = 1
),
event_users_list AS (
SELECT user_id FROM (
SELECT IFNULL(aul.user_id, t0.user_id) AS user_id FROM `[Dataset Id].events_data` t0
LEFT JOIN anonymous_user_list AS aul ON aul.anonymous_user_id = t0.user_id
WHERE action = "Added to cart"
) GROUP BY user_id
),
users_master AS (
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER( PARTITION BY user_id ORDER BY inserted_at DESC ) AS top FROM `[Dataset Id].users_data`
) WHERE top = 1
)
SELECT * FROM event_users_list AS eul
INNER JOIN users_master AS um ON eul.user_id = um.user_id
Notes:
Salesforce to BigQuery Data Integration: You can send data from your Salesforce CRM to BigQuery through CustomerLabs CDP seamlessly.
Hubspot to BigQuery Data Integration: You can sync your first party data in Hubspot to BigQuery through CustomerLabs first-party data CDP without much effort.