Mirror of https://github.com/open-metadata/OpenMetadata.git (synced 2025-12-26 15:10:05 +00:00)
Add Redpanda documentation (#7205)
Parent: 6f196a5c12
Commit: bba908e33f
@ -364,6 +364,12 @@ site_menu:
|
||||
url: /openmetadata/connectors/messaging/kafka/airflow
|
||||
- category: OpenMetadata / Connectors / Messaging / Kafka / CLI
|
||||
url: /openmetadata/connectors/messaging/kafka/cli
|
||||
- category: OpenMetadata / Connectors / Messaging / Redpanda
|
||||
url: /openmetadata/connectors/messaging/redpanda
|
||||
- category: OpenMetadata / Connectors / Messaging / Redpanda / Airflow
|
||||
url: /openmetadata/connectors/messaging/redpanda/airflow
|
||||
- category: OpenMetadata / Connectors / Messaging / Redpanda / CLI
|
||||
url: /openmetadata/connectors/messaging/redpanda/cli
|
||||
|
||||
- category: OpenMetadata / Connectors / Pipeline
|
||||
url: /openmetadata/connectors/pipeline
|
||||
|
||||
@ -47,6 +47,7 @@ OpenMetadata can extract metadata from the following list of connectors:
|
||||
## Messaging Services
|
||||
|
||||
- [Kafka](/openmetadata/connectors/messaging/kafka)
|
||||
- [Redpanda](/openmetadata/connectors/messaging/redpanda)
|
||||
|
||||
## Pipeline Services
|
||||
|
||||
|
||||
@ -5,4 +5,6 @@ slug: /openmetadata/connectors/messaging
|
||||
|
||||
# Messaging Services
|
||||
|
||||
- [Kafka](/openmetadata/connectors/messaging/kafka)
|
||||
- [Kafka](/openmetadata/connectors/messaging/kafka)
|
||||
|
||||
- [Redpanda](/openmetadata/connectors/messaging/redpanda)
|
||||
|
||||
@ -140,7 +140,7 @@ In this step we will configure the metadata ingestion pipeline,
|
||||
Please follow the instructions below
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/configure-metadata-ingestion-messaging.png"
|
||||
src="/images/openmetadata/connectors/kafka/configure-metadata-ingestion-messaging.png"
|
||||
alt="Configure Metadata Ingestion"
|
||||
caption="Configure Metadata Ingestion Page"
|
||||
/>
|
||||
|
||||
@ -0,0 +1,292 @@
|
||||
---
|
||||
title: Run Redpanda Connector using Airflow SDK
|
||||
slug: /openmetadata/connectors/messaging/redpanda/airflow
|
||||
---
|
||||
|
||||
# Run Redpanda using the Airflow SDK
|
||||
|
||||
In this section, we provide guides and references to use the Redpanda connector.
|
||||
|
||||
Configure and schedule Redpanda metadata workflows using the Airflow SDK:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
|
||||
## Requirements
|
||||
|
||||
<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
|
||||
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
|
||||
</InlineCallout>
|
||||
|
||||
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with
|
||||
custom Airflow plugins to handle the workflow deployment.
|
||||
|
||||
### Python Requirements
|
||||
|
||||
To run the Redpanda ingestion, you will need to install:
|
||||
|
||||
```bash
|
||||
pip3 install "openmetadata-ingestion[redpanda]"
|
||||
```
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
All connectors are defined as JSON Schemas.
|
||||
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/messaging/redpandaConnection.json)
|
||||
you can find the structure to create a connection to Redpanda.
|
||||
|
||||
In order to create and run a Metadata Ingestion workflow, we will follow
|
||||
the steps to create a YAML configuration able to connect to the source,
|
||||
process the Entities if needed, and reach the OpenMetadata server.
|
||||
|
||||
The workflow is modeled around the following
|
||||
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/workflow.json)
|
||||
|
||||
### 1. Define the YAML Config
|
||||
|
||||
This is a sample config for Redpanda:
|
||||
|
||||
```yaml
source:
  type: redpanda
  serviceName: local_redpanda
  serviceConnection:
    config:
      type: Redpanda
      bootstrapServers: localhost:9092
      schemaRegistryURL: http://localhost:8081  # Needs to be a URI
      consumerConfig: {}
      schemaRegistryConfig: {}
  sourceConfig:
    config:
      topicFilterPattern:
        excludes:
          - _confluent.*
        # includes:
        #   - topic1
      generateSampleData: true
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  # loggerLevel: DEBUG  # DEBUG, INFO, WARN or ERROR
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
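As a rough illustration of how the sample config is organized, the sketch below mirrors its three top-level sections as a Python dictionary and reports any that are absent. The helper `missing_sections` and the constant `EXPECTED_SECTIONS` are hypothetical names for this example only. Which sections are strictly required is defined by the workflow JSON Schema, not by this sketch.

```python
# Hypothetical pre-flight check; not part of openmetadata-ingestion.
# These are the sections used by the sample config, in the same order.
EXPECTED_SECTIONS = ("source", "sink", "workflowConfig")


def missing_sections(config: dict) -> list:
    """Return the expected top-level sections absent from a workflow config."""
    return [section for section in EXPECTED_SECTIONS if section not in config]


# Same structure as the sample YAML above, written as a Python dict.
sample_config = {
    "source": {
        "type": "redpanda",
        "serviceName": "local_redpanda",
        "serviceConnection": {
            "config": {
                "type": "Redpanda",
                "bootstrapServers": "localhost:9092",
                "schemaRegistryURL": "http://localhost:8081",
            }
        },
        "sourceConfig": {"config": {"generateSampleData": True}},
    },
    "sink": {"type": "metadata-rest", "config": {}},
    "workflowConfig": {
        "openMetadataServerConfig": {
            "hostPort": "http://localhost:8585/api",
            "authProvider": "no-auth",
        }
    },
}

print(missing_sections(sample_config))  # []
```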
|
||||
|
||||
#### Source Configuration - Service Connection
|
||||
|
||||
- **bootstrapServers**: Redpanda bootstrap servers, given as a comma-separated list, e.g. `host1:9092,host2:9092`.
- **schemaRegistryURL**: Redpanda Schema Registry URL, in URI format.
- **consumerConfig**: Redpanda consumer configuration.
- **schemaRegistryConfig**: Redpanda Schema Registry configuration.
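To make the comma-separated `host:port` format concrete, here is a minimal sketch that splits such a string into host and port pairs. The helper `parse_bootstrap_servers` is hypothetical and only illustrates the expected format; it is not how the connector processes the value.

```python
def parse_bootstrap_servers(value: str) -> list:
    """Split 'host1:9092,host2:9092' into a list of (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas or stray whitespace
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers


print(parse_bootstrap_servers("host1:9092,host2:9092"))
# [('host1', 9092), ('host2', 9092)]
```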
|
||||
|
||||
#### Source Configuration - Source Config
|
||||
|
||||
The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/messagingServiceMetadataPipeline.json):
|
||||
|
||||
- `generateSampleData`: Option to turn on/off generating sample data during metadata extraction.
|
||||
- `topicFilterPattern`: Regular expressions used to include or exclude topics. For example:
|
||||
|
||||
```yaml
topicFilterPattern:
  includes:
    - users
    - type_test
```
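The effect of include and exclude patterns can be sketched in a few lines of Python. This is an illustration, not the connector's implementation: the function `is_topic_ingested` is hypothetical, and both the use of `re.match` (which anchors at the start of the name) and applying excludes after includes are assumptions of this sketch.

```python
import re


def is_topic_ingested(name: str, includes=None, excludes=None) -> bool:
    """Decide whether a topic passes the include/exclude regex lists.

    Assumption of this sketch: patterns are matched from the start of the
    name with re.match, and excludes are applied after includes.
    """
    if includes and not any(re.match(pattern, name) for pattern in includes):
        return False
    if excludes and any(re.match(pattern, name) for pattern in excludes):
        return False
    return True


topics = ["users", "type_test", "_confluent.metrics", "orders"]
print([t for t in topics if is_topic_ingested(t, includes=["users", "type_test"])])
# ['users', 'type_test']
print([t for t in topics if is_topic_ingested(t, excludes=["_confluent.*"])])
# ['users', 'type_test', 'orders']
```

Because `re.match` only anchors at the start, the pattern `users` would also accept a topic named `users_v2` in this sketch; append `$` to a pattern to require an exact match.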
|
||||
|
||||
#### Sink Configuration
|
||||
|
||||
To send the metadata to OpenMetadata, it needs to be specified as `type: metadata-rest`.
|
||||
|
||||
#### Workflow Configuration
|
||||
|
||||
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
|
||||
|
||||
For a simple, local installation using our docker containers, this looks like:
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
|
||||
|
||||
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/catalog-rest-service/src/main/resources/json/schema/security/client).
|
||||
You can find the ingestion configuration for each provider below.
|
||||
|
||||
<Collapse title="Configure SSO in the Ingestion Workflows">
|
||||
|
||||
### Auth0 SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### Azure SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: azure
    securityConfig:
      clientSecret: '{your_client_secret}'
      authority: '{your_authority_url}'
      clientId: '{your_client_id}'
      scopes:
        - your_scopes
```
|
||||
|
||||
### Custom OIDC SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### Google SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: google
    securityConfig:
      secretKey: '{path-to-json-creds}'
```
|
||||
|
||||
### Okta SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: okta
    securityConfig:
      clientId: "{CLIENT_ID - SPA APP}"
      orgURL: "{ISSUER_URL}/v1/token"
      privateKey: "{public/private keypair}"
      email: "{email}"
      scopes:
        - token
```
|
||||
|
||||
### Amazon Cognito SSO
|
||||
|
||||
The ingestion can be configured by [Enabling JWT Tokens](https://docs.open-metadata.org/deployment/security/enable-jwt-tokens)
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### OneLogin SSO
|
||||
|
||||
OneLogin uses Custom OIDC for the ingestion.
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### KeyCloak SSO
|
||||
|
||||
KeyCloak uses Custom OIDC for the ingestion.
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
</Collapse>
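Since each provider expects a different set of `securityConfig` keys, a small lookup table can help catch omissions before a workflow is deployed. The sketch below is hypothetical: the field sets are copied from the YAML examples above, the names `SECURITY_CONFIG_FIELDS` and `missing_security_fields` are made up for illustration, and the authoritative definitions remain the security client JSON Schemas linked earlier.

```python
# Fields taken from the YAML examples above, keyed by authProvider value.
SECURITY_CONFIG_FIELDS = {
    "no-auth": set(),
    "auth0": {"clientId", "secretKey", "domain"},
    "azure": {"clientSecret", "authority", "clientId", "scopes"},
    "custom-oidc": {"clientId", "secretKey", "domain"},
    "google": {"secretKey"},
    "okta": {"clientId", "orgURL", "privateKey", "email", "scopes"},
}


def missing_security_fields(server_config: dict) -> set:
    """Return the example fields missing from an openMetadataServerConfig dict."""
    provider = server_config["authProvider"]
    expected = SECURITY_CONFIG_FIELDS.get(provider)
    if expected is None:
        raise ValueError(f"provider not covered by this sketch: {provider!r}")
    provided = set(server_config.get("securityConfig", {}))
    return expected - provided


server_config = {
    "hostPort": "http://localhost:8585/api",
    "authProvider": "google",
    "securityConfig": {},
}
print(missing_security_fields(server_config))  # {'secretKey'}
```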
|
||||
|
||||
### 2. Prepare the Ingestion DAG
|
||||
|
||||
Create a Python file in your Airflow DAGs directory with the following contents:
|
||||
|
||||
```python
import pathlib
import yaml
from datetime import timedelta
from airflow import DAG

# PythonOperator moved between Airflow versions; support both import paths.
try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.config.common import load_config_file
from metadata.ingestion.api.workflow import Workflow
from airflow.utils.dates import days_ago

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

# Paste the YAML configuration from step 1 here.
config = """
<your YAML configuration>
"""

def metadata_ingestion_workflow():
    # Parse the YAML, then build and run the ingestion workflow.
    workflow_config = yaml.safe_load(config)
    workflow = Workflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    workflow.print_status()
    workflow.stop()

with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',  # every 5 minutes
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```
|
||||
|
||||
Note that this recipe is the same for every connector. By updating only the YAML configuration, you can
extract metadata from different sources.
|
||||
@ -0,0 +1,245 @@
|
||||
---
|
||||
title: Run Redpanda Connector using the CLI
|
||||
slug: /openmetadata/connectors/messaging/redpanda/cli
|
||||
---
|
||||
|
||||
# Run Redpanda using the metadata CLI
|
||||
|
||||
In this section, we provide guides and references to use the Redpanda connector.
|
||||
|
||||
Configure and run Redpanda metadata workflows using the metadata CLI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
|
||||
## Requirements
|
||||
|
||||
<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
|
||||
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
|
||||
</InlineCallout>
|
||||
|
||||
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with
|
||||
custom Airflow plugins to handle the workflow deployment.
|
||||
|
||||
### Python Requirements
|
||||
|
||||
To run the Redpanda ingestion, you will need to install:
|
||||
|
||||
```bash
|
||||
pip3 install "openmetadata-ingestion[redpanda]"
|
||||
```
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
All connectors are defined as JSON Schemas.
|
||||
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/messaging/redpandaConnection.json)
|
||||
you can find the structure to create a connection to Redpanda.
|
||||
|
||||
In order to create and run a Metadata Ingestion workflow, we will follow
|
||||
the steps to create a YAML configuration able to connect to the source,
|
||||
process the Entities if needed, and reach the OpenMetadata server.
|
||||
|
||||
The workflow is modeled around the following
|
||||
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/workflow.json)
|
||||
|
||||
### 1. Define the YAML Config
|
||||
|
||||
This is a sample config for Redpanda:
|
||||
|
||||
```yaml
source:
  type: redpanda
  serviceName: local_redpanda
  serviceConnection:
    config:
      type: Redpanda
      bootstrapServers: localhost:9092
      schemaRegistryURL: http://localhost:8081  # Needs to be a URI
      consumerConfig: {}
      schemaRegistryConfig: {}
  sourceConfig:
    config:
      topicFilterPattern:
        excludes:
          - _confluent.*
        # includes:
        #   - topic1
      generateSampleData: true
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  # loggerLevel: DEBUG  # DEBUG, INFO, WARN or ERROR
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
|
||||
|
||||
#### Source Configuration - Service Connection
|
||||
|
||||
- **bootstrapServers**: Redpanda bootstrap servers, given as a comma-separated list, e.g. `host1:9092,host2:9092`.
- **schemaRegistryURL**: Redpanda Schema Registry URL, in URI format.
- **consumerConfig**: Redpanda consumer configuration.
- **schemaRegistryConfig**: Redpanda Schema Registry configuration.
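The "URI format" requirement for `schemaRegistryURL` means a bare `localhost:8081` is not enough; the scheme must be present. A minimal sketch of such a check, using only the standard library, could look like this. The helper `is_valid_registry_url` is hypothetical, and restricting the scheme to `http` or `https` is an assumption of this example rather than something the schema states.

```python
from urllib.parse import urlparse


def is_valid_registry_url(url: str) -> bool:
    """Check that the value has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(is_valid_registry_url("http://localhost:8081"))  # True
print(is_valid_registry_url("localhost:8081"))         # False
```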
|
||||
|
||||
#### Source Configuration - Source Config
|
||||
|
||||
The sourceConfig is defined [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/metadataIngestion/messagingServiceMetadataPipeline.json):
|
||||
|
||||
- `generateSampleData`: Option to turn on/off generating sample data during metadata extraction.
|
||||
- `topicFilterPattern`: Regular expressions used to include or exclude topics. For example:
|
||||
|
||||
```yaml
topicFilterPattern:
  includes:
    - users
    - type_test
```
|
||||
|
||||
#### Sink Configuration
|
||||
|
||||
To send the metadata to OpenMetadata, it needs to be specified as `type: metadata-rest`.
|
||||
|
||||
#### Workflow Configuration
|
||||
|
||||
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your OpenMetadata installation.
|
||||
|
||||
For a simple, local installation using our docker containers, this looks like:
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
|
||||
|
||||
We support different security providers. You can find their definitions [here](https://github.com/open-metadata/OpenMetadata/tree/main/catalog-rest-service/src/main/resources/json/schema/security/client).
|
||||
You can find the ingestion configuration for each provider below.
|
||||
|
||||
<Collapse title="Configure SSO in the Ingestion Workflows">
|
||||
|
||||
### Auth0 SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### Azure SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: azure
    securityConfig:
      clientSecret: '{your_client_secret}'
      authority: '{your_authority_url}'
      clientId: '{your_client_id}'
      scopes:
        - your_scopes
```
|
||||
|
||||
### Custom OIDC SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### Google SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: google
    securityConfig:
      secretKey: '{path-to-json-creds}'
```
|
||||
|
||||
### Okta SSO
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: okta
    securityConfig:
      clientId: "{CLIENT_ID - SPA APP}"
      orgURL: "{ISSUER_URL}/v1/token"
      privateKey: "{public/private keypair}"
      email: "{email}"
      scopes:
        - token
```
|
||||
|
||||
### Amazon Cognito SSO
|
||||
|
||||
The ingestion can be configured by [Enabling JWT Tokens](https://docs.open-metadata.org/deployment/security/enable-jwt-tokens)
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: auth0
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### OneLogin SSO
|
||||
|
||||
OneLogin uses Custom OIDC for the ingestion.
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
### KeyCloak SSO
|
||||
|
||||
KeyCloak uses Custom OIDC for the ingestion.
|
||||
|
||||
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: 'http://localhost:8585/api'
    authProvider: custom-oidc
    securityConfig:
      clientId: '{your_client_id}'
      secretKey: '{your_client_secret}'
      domain: '{your_domain}'
```
|
||||
|
||||
</Collapse>
|
||||
|
||||
### 2. Run with the CLI
|
||||
|
||||
First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
|
||||
|
||||
```bash
|
||||
metadata ingest -c <path-to-yaml>
|
||||
```
|
||||
|
||||
Note that this recipe is the same for every connector. By updating only the YAML configuration,
you can extract metadata from different sources.
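If the CLI needs to be triggered from a script rather than a shell, the same command can be wrapped with `subprocess`. This is a minimal sketch under the assumption that the `metadata` executable is on the `PATH`; the helper names `build_ingest_command` and `run_ingestion` are hypothetical, and only the `metadata ingest -c` invocation shown above is taken from the docs.

```python
import subprocess
from pathlib import Path


def build_ingest_command(config_path) -> list:
    """Build the argument list for `metadata ingest -c <path-to-yaml>`."""
    return ["metadata", "ingest", "-c", str(Path(config_path))]


def run_ingestion(config_path, runner=subprocess.run):
    """Run the CLI; with subprocess.run, a non-zero exit raises CalledProcessError."""
    return runner(build_ingest_command(config_path), check=True)
```

Calling `run_ingestion("redpanda.yaml")` would then run the ingestion for a config saved under that (illustrative) file name. The `runner` parameter exists only so the call can be replaced in tests.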
|
||||
@ -0,0 +1,203 @@
|
||||
---
|
||||
title: Redpanda
|
||||
slug: /openmetadata/connectors/messaging/redpanda
|
||||
---
|
||||
|
||||
# Redpanda
|
||||
|
||||
In this section, we provide guides and references to use the Redpanda connector.
|
||||
|
||||
Configure and schedule Redpanda metadata and profiler workflows from the OpenMetadata UI:
|
||||
- [Requirements](#requirements)
|
||||
- [Metadata Ingestion](#metadata-ingestion)
|
||||
|
||||
If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, then you can check
|
||||
the following docs to connect using Airflow SDK or with the CLI.
|
||||
|
||||
<TileContainer>
|
||||
<Tile
|
||||
icon="air"
|
||||
title="Ingest with Airflow"
|
||||
text="Configure the ingestion using Airflow SDK"
|
||||
link="/openmetadata/connectors/messaging/redpanda/airflow"
|
||||
size="half"
|
||||
/>
|
||||
<Tile
|
||||
icon="account_tree"
|
||||
title="Ingest with the CLI"
|
||||
text="Run a one-time ingestion using the metadata CLI"
|
||||
link="/openmetadata/connectors/messaging/redpanda/cli"
|
||||
size="half"
|
||||
/>
|
||||
</TileContainer>
|
||||
|
||||
## Requirements
|
||||
|
||||
<InlineCallout color="violet-70" icon="description" bold="OpenMetadata 0.12 or later" href="/deployment">
|
||||
To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
|
||||
</InlineCallout>
|
||||
|
||||
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with
|
||||
custom Airflow plugins to handle the workflow deployment.
|
||||
|
||||
## Metadata Ingestion
|
||||
|
||||
### 1. Visit the Services Page
|
||||
|
||||
The first step is ingesting the metadata from your sources. Under
Settings, you will find Services, where you can link an external source
system to OpenMetadata. Once a service is created, it can be used to
configure metadata, usage, and profiler workflows.
|
||||
|
||||
To visit the Services page, select Services from the Settings menu.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/visit-services.png"
|
||||
alt="Visit Services Page"
|
||||
caption="Find Services under the Settings menu"
|
||||
/>
|
||||
|
||||
### 2. Create a New Service
|
||||
|
||||
Click on the Add New Service button to start the Service creation.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/create-service.png"
|
||||
alt="Create a new service"
|
||||
caption="Add a new Service from the Services page"
|
||||
/>
|
||||
|
||||
### 3. Select the Service Type
|
||||
|
||||
Select Redpanda as the service type and click Next.
|
||||
|
||||
<div className="w-100 flex justify-center">
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/redpanda/select-service.png"
|
||||
alt="Select Service"
|
||||
caption="Select your service from the list"
|
||||
/>
|
||||
</div>
|
||||
|
||||
### 4. Name and Describe your Service
|
||||
|
||||
Provide a name and description for your service as illustrated below.
|
||||
|
||||
#### Service Name
|
||||
|
||||
OpenMetadata uniquely identifies services by their Service Name. Provide
a name that distinguishes your deployment from other services, including
the other Redpanda services that you might be ingesting metadata
from.
|
||||
|
||||
|
||||
<div className="w-100 flex justify-center">
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/redpanda/add-new-service.png"
|
||||
alt="Add New Service"
|
||||
caption="Provide a Name and description for your Service"
|
||||
/>
|
||||
</div>
|
||||
|
||||
|
||||
### 5. Configure the Service Connection
|
||||
|
||||
In this step, we will configure the connection settings required for
|
||||
this connector. Please follow the instructions below to ensure that
|
||||
you've configured the connector to read from your Redpanda service as
|
||||
desired.
|
||||
|
||||
<div className="w-100 flex justify-center">
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/redpanda/service-connection.png"
|
||||
alt="Configure service connection"
|
||||
caption="Configure the service connection by filling the form"
|
||||
/>
|
||||
</div>
|
||||
|
||||
|
||||
Once the credentials have been added, click on `Test Connection` and Save
|
||||
the changes.
|
||||
|
||||
<div className="w-100 flex justify-center">
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/test-connection.png"
|
||||
alt="Test Connection"
|
||||
caption="Test the connection and save the Service"
|
||||
/>
|
||||
</div>
|
||||
|
||||
#### Connection Options
|
||||
|
||||
- **Bootstrap Servers**: Redpanda bootstrap servers, given as a comma-separated list, e.g. `host1:9092,host2:9092`.
- **Schema Registry URL**: Redpanda Schema Registry URL, in URI format.
- **Consumer Config**: Redpanda consumer configuration.
- **Schema Registry Config**: Redpanda Schema Registry configuration.
|
||||
|
||||
### 6. Configure Metadata Ingestion
|
||||
|
||||
In this step, we will configure the metadata ingestion pipeline.
Please follow the instructions below.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/redpanda/configure-metadata-ingestion-messaging.png"
|
||||
alt="Configure Metadata Ingestion"
|
||||
caption="Configure Metadata Ingestion Page"
|
||||
/>
|
||||
|
||||
#### Metadata Ingestion Options
|
||||
|
||||
- **Name**: The name of the ingestion pipeline. You can customize the name or use the generated one.
- **Topic Filter Pattern (Optional)**: Use topic filter patterns to control whether to include topics as part of metadata ingestion.
    - **Include**: Explicitly include topics by adding a list of comma-separated regular expressions to the Include field. OpenMetadata will include all topics with names matching one or more of the supplied regular expressions. All other topics will be excluded.
    - **Exclude**: Explicitly exclude topics by adding a list of comma-separated regular expressions to the Exclude field. OpenMetadata will exclude all topics with names matching one or more of the supplied regular expressions. All other topics will be included.
- **Ingest Sample Data (toggle)**: Enable to ingest sample data from the topics.
- **Enable Debug Log (toggle)**: Enable to set the default log level to debug. These logs can be viewed later in Airflow.
|
||||
|
||||
### 7. Schedule the Ingestion and Deploy
|
||||
|
||||
Scheduling can be set up at an hourly, daily, or weekly cadence. The
|
||||
timezone is in UTC. Select a Start Date to schedule for ingestion. It is
|
||||
optional to add an End Date.
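For readers who later move from the UI to a hand-written DAG, the three cadences can be expressed as standard five-field cron expressions. The mapping below is a sketch using common cron presets; the exact expressions the UI generates are not stated here, and `CADENCE_TO_CRON` and `cron_for` are hypothetical names.

```python
# Standard five-field cron expressions: minute hour day-of-month month day-of-week.
CADENCE_TO_CRON = {
    "hourly": "0 * * * *",  # minute 0 of every hour
    "daily": "0 0 * * *",   # 00:00 every day
    "weekly": "0 0 * * 0",  # 00:00 every Sunday
}


def cron_for(cadence: str) -> str:
    """Return a cron expression for a named cadence."""
    try:
        return CADENCE_TO_CRON[cadence]
    except KeyError:
        raise ValueError(f"unknown cadence: {cadence!r}") from None


print(cron_for("daily"))  # 0 0 * * *
```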
|
||||
|
||||
Review your configuration settings. If they match what you intended,
|
||||
click Deploy to create the service and schedule metadata ingestion.
|
||||
|
||||
If something doesn't look right, click the Back button to return to the
|
||||
appropriate step and change the settings as needed.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/schedule.png"
|
||||
alt="Schedule the Workflow"
|
||||
caption="Schedule the Ingestion Pipeline and Deploy"
|
||||
/>
|
||||
|
||||
After configuring the workflow, you can click on Deploy to create the
|
||||
pipeline.
|
||||
|
||||
### 8. View the Ingestion Pipeline
|
||||
|
||||
Once the workflow has been successfully deployed, you can view the
|
||||
Ingestion Pipeline running from the Service Page.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/view-ingestion-pipeline.png"
|
||||
alt="View Ingestion Pipeline"
|
||||
caption="View the Ingestion Pipeline from the Service Page"
|
||||
/>
|
||||
|
||||
### 9. Workflow Deployment Error
|
||||
|
||||
If there were any errors during the workflow deployment process, the
|
||||
Ingestion Pipeline Entity will still be created, but no workflow will be
|
||||
present in the Ingestion container.
|
||||
|
||||
You can then edit the Ingestion Pipeline and Deploy it again.
|
||||
|
||||
<Image
|
||||
src="/images/openmetadata/connectors/workflow-deployment-error.png"
|
||||
alt="Workflow Deployment Error"
|
||||
caption="Edit and Deploy the Ingestion Pipeline"
|
||||
/>
|
||||
|
||||
From the Connection tab, you can also Edit the Service if needed.
|
||||
|
Before Width: | Height: | Size: 355 KiB After Width: | Height: | Size: 355 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 456 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 602 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 325 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 543 KiB |