---
title: GCS
slug: /connectors/storage/gcs
---

{% connectorDetailsHeader
name="GCS"
stage="PROD"
platform="OpenMetadata"
availableFeatures=["Metadata", "Structured Containers"]
unavailableFeatures=["Unstructured Containers"]
/ %}

This page contains the setup guide and reference information for the GCS connector.

Configure and schedule GCS metadata workflows from the OpenMetadata UI:

- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)

{% partial file="/v1.5/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/storage/gcs/yaml"} /%}

## Requirements

The connector needs the following permissions in GCP:

### GCS Permissions

For every bucket that you want to ingest, grant the following permissions:
- `storage.buckets.get`
- `storage.buckets.list`
- `storage.objects.get`
- `storage.objects.list`

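
The permission list above can be checked mechanically before you run ingestion. The following is a minimal sketch and not part of OpenMetadata: the helper name `missing_permissions` and the `granted` input are assumptions made for illustration, and how you obtain the granted permissions (for example, from your IAM tooling) is left to you.

```python
# Hypothetical helper: not part of OpenMetadata or any Google library.
# The four permissions are copied from the list in this section.
REQUIRED_PERMISSIONS = frozenset({
    "storage.buckets.get",
    "storage.buckets.list",
    "storage.objects.get",
    "storage.objects.list",
})


def missing_permissions(granted):
    """Return the required GCS permissions absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Example input: a principal that can read objects but cannot list buckets.
    granted = ["storage.buckets.get", "storage.objects.get", "storage.objects.list"]
    print(missing_permissions(granted))
```

An empty result means every permission named in this section is present in the set you passed in.
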
### OpenMetadata Manifest

In other connectors, metadata extraction happens automatically. For GCS, the connector can extract high-level
metadata from buckets, but to understand their internal structure it needs users to provide an `openmetadata.json`
file at the bucket root.

`Supported File Formats: [ "csv", "tsv", "avro", "parquet", "json", "json.gz", "json.zip" ]`

You can learn more about this [here](/connectors/storage). Keep reading for an example of the shape of the manifest file.

{% partial file="/v1.5/connectors/storage/manifest.md" /%}

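
As a complement to the manifest described above, the sketch below builds the text of an `openmetadata.json` file in Python and rejects formats outside the supported list. The field names `entries`, `dataPath`, `structureFormat`, and `isPartitioned` are assumptions here, since this page does not spell them out; check them against the manifest reference before relying on them. The helper names `make_entry` and `build_manifest` and the sample paths are hypothetical.

```python
import json

# Formats copied from the "Supported File Formats" line in this section.
SUPPORTED_FORMATS = ("csv", "tsv", "avro", "parquet", "json", "json.gz", "json.zip")


def make_entry(data_path, structure_format, is_partitioned=False):
    """Build one manifest entry (field names are assumed, see the lead-in)."""
    if structure_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {structure_format!r}")
    return {
        "dataPath": data_path,
        "structureFormat": structure_format,
        "isPartitioned": is_partitioned,
    }


def build_manifest(entries):
    """Serialize the entries into the JSON text for openmetadata.json."""
    return json.dumps({"entries": list(entries)}, indent=2)


if __name__ == "__main__":
    # Hypothetical paths, relative to the bucket root.
    print(build_manifest([
        make_entry("transactions", "csv"),
        make_entry("events/daily", "parquet", is_partitioned=True),
    ]))
```

The printed text would then be saved as `openmetadata.json` and uploaded to the bucket root.
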
## Metadata Ingestion

{% stepsContainer %}

{% step srNumber=1 %}

{% stepDescription title="1. Visit the Services Page" %}

The first step is ingesting the metadata from your sources. Under
Settings, you will find Services. A service links an external source system to
OpenMetadata. Once a service is created, it can be used to configure
metadata, usage, and profiler workflows.

To visit the Services page, select Services from the Settings menu.

{% /stepDescription %}

{% stepVisualInfo %}

{% image
src="/images/v1.5/connectors/visit-services-page.png"
alt="Visit Services Page"
caption="Find Dashboard option on left panel of the settings page" /%}

{% /stepVisualInfo %}

{% /step %}

{% step srNumber=2 %}

{% stepDescription title="2. Create a New Service" %}

Click on the 'Add New Service' button to start the Service creation.

{% /stepDescription %}

{% stepVisualInfo %}

{% image
src="/images/v1.5/connectors/create-new-service.png"
alt="Create a new service"
caption="Add a new Service from the Storage Services page" /%}

{% /stepVisualInfo %}

{% /step %}

{% step srNumber=3 %}

{% stepDescription title="3. Select the Service Type" %}

Select GCS as the service type and click Next.

{% /stepDescription %}

{% stepVisualInfo %}

{% image
src="/images/v1.5/connectors/GCS/select-service.png"
alt="Select Service"
caption="Select your service from the list" /%}

{% /stepVisualInfo %}

{% /step %}

{% step srNumber=4 %}

{% stepDescription title="4. Name and Describe your Service" %}

Provide a name and description for your service.

#### Service Name

OpenMetadata uniquely identifies services by their Service Name. Provide
a name that distinguishes your deployment from other services, including
the other Storage services that you might be ingesting metadata
from.

{% /stepDescription %}

{% stepVisualInfo %}

{% image
src="/images/v1.5/connectors/GCS/add-new-service.png"
alt="Add New Service"
caption="Provide a Name and description for your Service" /%}

{% /stepVisualInfo %}

{% /step %}

{% step srNumber=5 %}

{% stepDescription title="5. Configure the Service Connection" %}

In this step, we will configure the connection settings required for
this connector. Please follow the instructions below to ensure that
you've configured the connector to read from your GCS service as
desired.

{% /stepDescription %}

{% stepVisualInfo %}

{% image
src="/images/v1.5/connectors/GCS/service-connection.png"
alt="Configure service connection"
caption="Configure the service connection by filling the form" /%}

{% /stepVisualInfo %}

{% /step %}

{% extraContent parentTagName="stepsContainer" %}

#### Connection Details


{% /extraContent %}

{% partial file="/v1.5/connectors/test-connection.md" /%}

{% partial file="/v1.5/connectors/storage/configure-ingestion.md" /%}

{% partial file="/v1.5/connectors/ingestion-schedule-and-deploy.md" /%}

{% /stepsContainer %}

{% partial file="/v1.5/connectors/troubleshooting.md" /%}