Mirror of https://github.com/open-metadata/OpenMetadata.git, synced 2025-08-18 14:06:59 +00:00
parent 3bcef4f58c, commit 8708520c28
@@ -393,6 +393,12 @@ site_menu:
    url: /openmetadata/connectors/pipeline/fivetran/airflow
  - category: OpenMetadata / Connectors / Pipeline / Fivetran / CLI
    url: /openmetadata/connectors/pipeline/fivetran/cli
  - category: OpenMetadata / Connectors / Pipeline / Dagster
    url: /openmetadata/connectors/pipeline/dagster
  - category: OpenMetadata / Connectors / Pipeline / Dagster / Airflow
    url: /openmetadata/connectors/pipeline/dagster/airflow
  - category: OpenMetadata / Connectors / Pipeline / Dagster / CLI
    url: /openmetadata/connectors/pipeline/dagster/cli

  - category: OpenMetadata / Connectors / ML Model
    url: /openmetadata/connectors/ml-model
@@ -54,6 +54,7 @@ OpenMetadata can extract metadata from the following list of connectors:

- [Airflow](/openmetadata/connectors/pipeline/airflow)
- [Glue](/openmetadata/connectors/pipeline/glue)
- [Fivetran](/openmetadata/connectors/pipeline/fivetran)
- [Dagster](/openmetadata/connectors/pipeline/dagster)

## ML Model Services
@@ -0,0 +1,304 @@
---
title: Run Dagster Connector using Airflow SDK
slug: /openmetadata/connectors/pipeline/dagster/airflow
---

# Run Dagster using the Airflow SDK

In this section, we provide guides and references to use the Dagster connector.

Configure and schedule Dagster metadata workflows using the Airflow SDK:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)

## Requirements

<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
</InlineCallout>

To run the ingestion via the UI, you'll need to use the OpenMetadata Ingestion Container, which comes shipped with custom Airflow plugins to handle the workflow deployment.

### Python Requirements

To run the Dagster ingestion, you will need to install:

```bash
pip3 install "openmetadata-ingestion[dagster]"
```

## Metadata Ingestion

All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/pipeline/dagsterConnection.json)
you can find the structure to create a connection to Dagster.

To create and run a Metadata Ingestion workflow, we will follow these steps: create a YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/workflow.json).

### 1. Define the YAML Config

This is a sample config for Dagster:

```yaml
source:
  type: dagster
  serviceName: dagster_source
  serviceConnection:
    config:
      type: Dagster
      hostPort: http://localhost:8080
      numberOfStatus: 10
      dbConnection:
        type: name of database service
        username: db username
        password: db password
        databaseSchema: database name
        hostPort: host and port for database
  sourceConfig:
    config:
      type: PipelineMetadata
      # includeLineage: true
      # pipelineFilterPattern:
      #   includes:
      #     - pipeline1
      #     - pipeline2
      #   excludes:
      #     - pipeline3
      #     - pipeline4
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  loggerLevel: INFO
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
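Before wiring this configuration into a workflow, it can help to confirm that the YAML parses and contains the top-level sections the workflow schema expects. A minimal sketch using PyYAML (the section names come from the sample above; the abbreviated config below is illustrative):

```python
import yaml  # PyYAML, the same parser used later in the ingestion DAG

config = """
source:
  type: dagster
  serviceName: dagster_source
  serviceConnection:
    config:
      type: Dagster
      hostPort: http://localhost:8080
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
"""

workflow_config = yaml.safe_load(config)

# The workflow JSON Schema requires these top-level sections.
for key in ("source", "sink", "workflowConfig"):
    assert key in workflow_config, f"missing section: {key}"

print(workflow_config["source"]["serviceConnection"]["config"]["type"])  # -> Dagster
```

A check like this catches indentation mistakes before the workflow is deployed.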

#### Source Configuration - Service Connection

- **hostPort**: Host and port of the Dagster instance (e.g., `http://localhost:8080`).
- **numberOfStatus**: Number of pipeline run statuses to ingest; the sample above uses `10`.
- **dbConnection**
  - **type**: Name of the Database Service
  - **username**: Database username
  - **password**: Database password
  - **databaseSchema**: Database name
  - **hostPort**: Host and port for the database connection

#### Source Configuration - Source Config

The `sourceConfig` is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/pipelineServiceMetadataPipeline.json):

- `dbServiceName`: Database Service Name for the creation of lineage, if the source supports it.
- `pipelineFilterPattern` and `chartFilterPattern`: Both support regex patterns as includes or excludes. E.g.,

```yaml
pipelineFilterPattern:
  includes:
    - users
    - type_test
```
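These patterns are plain regular expressions matched against pipeline names. Conceptually, the include/exclude logic behaves like the following sketch (illustrative only, not the actual OpenMetadata implementation):

```python
import re

def is_included(name, includes=None, excludes=None):
    """Return True if a pipeline name survives the filter pattern.

    In this sketch, excludes win over includes, and an empty
    includes list means "include everything".
    """
    if excludes and any(re.search(p, name) for p in excludes):
        return False
    if includes:
        return any(re.search(p, name) for p in includes)
    return True

# Using the includes from the YAML example above:
print(is_included("users", includes=["users", "type_test"]))     # True
print(is_included("payments", includes=["users", "type_test"]))  # False
```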

#### Sink Configuration

To send the metadata to OpenMetadata, it needs to be specified as `type: metadata-rest`.

#### Workflow Configuration

The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.

For a simple, local installation using our Docker containers, this looks like:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```

We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/catalog-rest-service/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.

<Collapse title="Configure SSO in the Ingestion Workflows">

### Auth0 SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### Azure SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: azure
    securityConfig:
      clientSecret: '{your_client_secret}'
      authority: '{your_authority_url}'
      clientId: '{your_client_id}'
      scopes:
        - your_scopes
```

### Custom OIDC SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### Google SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: google
    securityConfig:
      secretKey: '{path-to-json-creds}'
```

### Okta SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: okta
    securityConfig:
      clientId: "{CLIENT_ID - SPA APP}"
      orgURL: "{ISSUER_URL}/v1/token"
      privateKey: "{public/private keypair}"
      email: "{email}"
      scopes:
        - token
```

### Amazon Cognito SSO

The ingestion can be configured by [Enabling JWT Tokens](https://docs.open-metadata.org/deployment/security/enable-jwt-tokens).

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### OneLogin SSO

OneLogin uses the Custom OIDC configuration for the ingestion:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### KeyCloak SSO

KeyCloak uses the Custom OIDC configuration for the ingestion:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

</Collapse>

### 2. Prepare the Ingestion DAG

Create a Python file in your Airflow DAGs directory with the following contents:

```python
import yaml
from datetime import timedelta

from airflow import DAG
from airflow.utils.dates import days_ago

try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    # Fallback for older Airflow versions
    from airflow.operators.python_operator import PythonOperator

from metadata.ingestion.api.workflow import Workflow

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60),
}

config = """
<your YAML configuration>
"""


def metadata_ingestion_workflow():
    workflow_config = yaml.safe_load(config)
    workflow = Workflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    workflow.print_status()
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval="*/5 * * * *",
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```

Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will be able to extract metadata from different sources.
@@ -0,0 +1,256 @@
---
title: Run Dagster Connector using the CLI
slug: /openmetadata/connectors/pipeline/dagster/cli
---

# Run Dagster using the metadata CLI

In this section, we provide guides and references to use the Dagster connector.

Configure and schedule Dagster metadata workflows using the metadata CLI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)

## Requirements

<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
</InlineCallout>

To run the ingestion via the UI, you'll need to use the OpenMetadata Ingestion Container, which comes shipped with custom Airflow plugins to handle the workflow deployment.

### Python Requirements

To run the Dagster ingestion, you will need to install:

```bash
pip3 install "openmetadata-ingestion[dagster]"
```

## Metadata Ingestion

All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/pipeline/dagsterConnection.json)
you can find the structure to create a connection to Dagster.

To create and run a Metadata Ingestion workflow, we will follow these steps: create a YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/workflow.json).

### 1. Define the YAML Config

This is a sample config for Dagster:

```yaml
source:
  type: dagster
  serviceName: dagster_source
  serviceConnection:
    config:
      type: Dagster
      hostPort: http://localhost:8080
      numberOfStatus: 10
      dbConnection:
        type: name of database service
        username: db username
        password: db password
        databaseSchema: database name
        hostPort: host and port for database
  sourceConfig:
    config:
      type: PipelineMetadata
      # includeLineage: true
      # pipelineFilterPattern:
      #   includes:
      #     - pipeline1
      #     - pipeline2
      #   excludes:
      #     - pipeline3
      #     - pipeline4
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  loggerLevel: INFO
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```

#### Source Configuration - Service Connection

- **hostPort**: Host and port of the Dagster instance (e.g., `http://localhost:8080`).
- **numberOfStatus**: Number of pipeline run statuses to ingest; the sample above uses `10`.
- **dbConnection**
  - **type**: Name of the Database Service
  - **username**: Database username
  - **password**: Database password
  - **databaseSchema**: Database name
  - **hostPort**: Host and port for the database connection

#### Source Configuration - Source Config

The `sourceConfig` is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/pipelineServiceMetadataPipeline.json):

- `dbServiceName`: Database Service Name for the creation of lineage, if the source supports it.
- `pipelineFilterPattern` and `chartFilterPattern`: Both support regex patterns as includes or excludes. E.g.,

```yaml
pipelineFilterPattern:
  includes:
    - users
    - type_test
```

#### Sink Configuration

To send the metadata to OpenMetadata, it needs to be specified as `type: metadata-rest`.

#### Workflow Configuration

The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.

For a simple, local installation using our Docker containers, this looks like:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```

We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/catalog-rest-service/src/main/resources/json/schema/security/client).
You can find the different implementations of the ingestion below.

<Collapse title="Configure SSO in the Ingestion Workflows">

### Auth0 SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### Azure SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: azure
    securityConfig:
      clientSecret: '{your_client_secret}'
      authority: '{your_authority_url}'
      clientId: '{your_client_id}'
      scopes:
        - your_scopes
```

### Custom OIDC SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### Google SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: google
    securityConfig:
      secretKey: '{path-to-json-creds}'
```

### Okta SSO

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: okta
    securityConfig:
      clientId: "{CLIENT_ID - SPA APP}"
      orgURL: "{ISSUER_URL}/v1/token"
      privateKey: "{public/private keypair}"
      email: "{email}"
      scopes:
        - token
```

### Amazon Cognito SSO

The ingestion can be configured by [Enabling JWT Tokens](https://docs.open-metadata.org/deployment/security/enable-jwt-tokens).

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### OneLogin SSO

OneLogin uses the Custom OIDC configuration for the ingestion:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

### KeyCloak SSO

KeyCloak uses the Custom OIDC configuration for the ingestion:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```

</Collapse>

### 2. Run with the CLI

First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:

```bash
metadata ingest -c <path-to-yaml>
```

Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will be able to extract metadata from different sources.
@@ -0,0 +1,201 @@
---
title: Dagster
slug: /openmetadata/connectors/pipeline/dagster
---

# Dagster

In this section, we provide guides and references to use the Dagster connector.

Configure and schedule Dagster metadata and profiler workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)

If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, you can check the following docs to connect using the Airflow SDK or the CLI.

<TileContainer>
  <Tile
    icon="air"
    title="Ingest with Airflow"
    text="Configure the ingestion using Airflow SDK"
    link="/openmetadata/connectors/pipeline/dagster/airflow"
    size="half"
  />
  <Tile
    icon="account_tree"
    title="Ingest with the CLI"
    text="Run a one-time ingestion using the metadata CLI"
    link="/openmetadata/connectors/pipeline/dagster/cli"
    size="half"
  />
</TileContainer>

## Requirements

<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
</InlineCallout>

To run the ingestion via the UI, you'll need to use the OpenMetadata Ingestion Container, which comes shipped with custom Airflow plugins to handle the workflow deployment.

## Metadata Ingestion

### 1. Visit the Services Page

The first step is ingesting the metadata from your sources. Under Settings, you will find a Services page; a Service connects an external source system to OpenMetadata. Once a service is created, it can be used to configure metadata, usage, and profiler workflows.

To visit the Services page, select Services from the Settings menu.

<Image
  src="/images/openmetadata/connectors/visit-services.png"
  alt="Visit Services Page"
  caption="Find Services under the Settings menu"
/>

### 2. Create a New Service

Click on the Add New Service button to start the Service creation.

<Image
  src="/images/openmetadata/connectors/create-service.png"
  alt="Create a new service"
  caption="Add a new Service from the Services page"
/>

### 3. Select the Service Type

Select Dagster as the service type and click Next.

<div className="w-100 flex justify-center">
  <Image
    src="/images/openmetadata/connectors/dagster/select-service.png"
    alt="Select Service"
    caption="Select your service from the list"
  />
</div>

### 4. Name and Describe your Service

Provide a name and description for your service as illustrated below.

#### Service Name

OpenMetadata uniquely identifies services by their Service Name. Provide a name that distinguishes your deployment from other services, including the other Dagster services that you might be ingesting metadata from.

<div className="w-100 flex justify-center">
  <Image
    src="/images/openmetadata/connectors/dagster/add-new-service.png"
    alt="Add New Service"
    caption="Provide a Name and description for your Service"
  />
</div>

### 5. Configure the Service Connection

In this step, we will configure the connection settings required for this connector. Please follow the instructions below to ensure that you've configured the connector to read from your Dagster service as desired.

<div className="w-100 flex justify-center">
  <Image
    src="/images/openmetadata/connectors/dagster/service-connection.png"
    alt="Configure service connection"
    caption="Configure the service connection by filling the form"
  />
</div>

Once the credentials have been added, click on `Test Connection` and save the changes.

<div className="w-100 flex justify-center">
  <Image
    src="/images/openmetadata/connectors/test-connection.png"
    alt="Test Connection"
    caption="Test the connection and save the Service"
  />
</div>

#### Connection Options

- **Dagster API Key**: API key used to authenticate with the Dagster instance.
- **Dagster API Secret**: API secret used to authenticate with the Dagster instance.

### 6. Configure Metadata Ingestion

In this step, we will configure the metadata ingestion pipeline. Please follow the instructions below.

<Image
  src="/images/openmetadata/connectors/configure-metadata-ingestion-pipeline.png"
  alt="Configure Metadata Ingestion"
  caption="Configure Metadata Ingestion Page"
/>

#### Metadata Ingestion Options

- **Name**: The name of the ingestion pipeline; you can customize it or use the generated name.
- **Pipeline Filter Pattern (Optional)**: Use pipeline filter patterns to control whether or not to include pipelines as part of metadata ingestion.
  - **Include**: Explicitly include pipelines by adding a list of comma-separated regular expressions to the Include field. OpenMetadata will include all pipelines with names matching one or more of the supplied regular expressions. All other pipelines will be excluded.
  - **Exclude**: Explicitly exclude pipelines by adding a list of comma-separated regular expressions to the Exclude field. OpenMetadata will exclude all pipelines with names matching one or more of the supplied regular expressions. All other pipelines will be included.
- **Include lineage (toggle)**: Set the Include lineage toggle to control whether or not to include lineage between pipelines and data sources as part of metadata ingestion.
- **Enable Debug Log (toggle)**: Set the Enable Debug Log toggle to set the default log level to debug; these logs can be viewed later in Airflow.

### 7. Schedule the Ingestion and Deploy

Scheduling can be set up at an hourly, daily, or weekly cadence. The timezone is in UTC. Select a Start Date to schedule the ingestion. Adding an End Date is optional.
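Under the hood, Airflow schedules are expressed as standard cron strings; the UI cadences correspond roughly to the following expressions (an illustrative mapping, all in UTC, not taken from the OpenMetadata source):

```python
# Illustrative mapping from UI cadence to standard cron expressions
CADENCE_TO_CRON = {
    "hourly": "0 * * * *",   # top of every hour
    "daily": "0 0 * * *",    # midnight UTC every day
    "weekly": "0 0 * * 0",   # midnight UTC every Sunday
}

print(CADENCE_TO_CRON["daily"])  # -> 0 0 * * *
```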

Review your configuration settings. If they match what you intended, click Deploy to create the service and schedule metadata ingestion.

If something doesn't look right, click the Back button to return to the appropriate step and change the settings as needed.

<Image
  src="/images/openmetadata/connectors/schedule.png"
  alt="Schedule the Workflow"
  caption="Schedule the Ingestion Pipeline and Deploy"
/>

After configuring the workflow, you can click on Deploy to create the pipeline.

### 8. View the Ingestion Pipeline

Once the workflow has been successfully deployed, you can view the Ingestion Pipeline running from the Service Page.

<Image
  src="/images/openmetadata/connectors/view-ingestion-pipeline.png"
  alt="View Ingestion Pipeline"
  caption="View the Ingestion Pipeline from the Service Page"
/>

### 9. Workflow Deployment Error

If there were any errors during the workflow deployment process, the Ingestion Pipeline Entity will still be created, but no workflow will be present in the Ingestion container.

You can then edit the Ingestion Pipeline and Deploy it again.

<Image
  src="/images/openmetadata/connectors/workflow-deployment-error.png"
  alt="Workflow Deployment Error"
  caption="Edit and Deploy the Ingestion Pipeline"
/>

From the Connection tab, you can also edit the Service if needed.
@@ -9,3 +9,4 @@ slug: /openmetadata/connectors/pipeline

- [Airflow](/openmetadata/connectors/pipeline/airflow)
- [Glue](/openmetadata/connectors/pipeline/glue)
- [Fivetran](/openmetadata/connectors/pipeline/fivetran)
- [Dagster](/openmetadata/connectors/pipeline/dagster)
Binary files not shown (three new images added: 83 KiB, 381 KiB, 238 KiB).