mirror of https://github.com/open-metadata/OpenMetadata.git
synced 2025-12-26 15:10:05 +00:00
Add amundsen docs (#6369)
parent aac64e161b
commit ca58c47933
@ -337,6 +337,11 @@ site_menu:
  - category: OpenMetadata / Connectors / Pipeline / Glue / CLI
    url: /openmetadata/connectors/pipeline/glue/cli

  - category: OpenMetadata / Connectors / Metadata
    url: /openmetadata/connectors/metadata
  - category: OpenMetadata / Connectors / Metadata / Amundsen
    url: /openmetadata/connectors/metadata/amundsen

  - category: OpenMetadata / Ingestion
    url: /openmetadata/ingestion
  - category: OpenMetadata / Ingestion / Workflows

@ -8,6 +8,7 @@ slug: /openmetadata/connectors
OpenMetadata can extract metadata from the following list of connectors:

## Database Services

- [Athena](/openmetadata/connectors/database/athena)
- [AzureSQL](/openmetadata/connectors/database/azuresql)
- [BigQuery](/openmetadata/connectors/database/bigquery)
@ -33,6 +34,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Vertica](/openmetadata/connectors/database/vertica)

## Dashboard Services

- [Looker](/openmetadata/connectors/dashboard/looker)
- [Metabase](/openmetadata/connectors/dashboard/metabase)
- [PowerBI](/openmetadata/connectors/dashboard/powerbi)
@ -41,6 +43,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Tableau](/openmetadata/connectors/dashboard/tableau)

## Messaging Services

- [Kafka](/openmetadata/connectors/messaging/kafka)

## Pipeline Services
@ -48,3 +51,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Airbyte](/openmetadata/connectors/pipeline/airbyte)
- [Airflow](/openmetadata/connectors/pipeline/airflow)
- [Glue](/openmetadata/connectors/pipeline/glue)

## Metadata Services

- [Amundsen](/openmetadata/connectors/metadata/amundsen)
@ -0,0 +1,156 @@
---
title: Amundsen
slug: /openmetadata/connectors/metadata/amundsen
---

# Amundsen

On this page, you will learn how to use the `metadata` CLI to run a one-time ingestion.

<Requirements />

## Python requirements

To run the Amundsen ingestion, you will need to install:

```commandline
pip3 install "openmetadata-ingestion[amundsen]"
```

Make sure you are running openmetadata-ingestion version 0.10.2 or above.

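One way to confirm the version requirement above is a small Python check. This is a minimal sketch, not part of the OpenMetadata tooling: the helper names (`parse_version`, `meets_minimum`, `installed_ingestion_version`) are hypothetical, and the comparison only looks at the leading numeric components, so pre-release suffixes are ignored.

```python
from importlib import metadata

# Minimum version stated in this guide.
MIN_VERSION = (0, 10, 2)


def parse_version(text):
    """Turn '0.10.2' into (0, 10, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum=MIN_VERSION):
    """True when the installed version string is at least the minimum."""
    return parse_version(installed) >= minimum


def installed_ingestion_version():
    """Return the installed openmetadata-ingestion version, or None if absent."""
    try:
        return metadata.version("openmetadata-ingestion")
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_ingestion_version()` and passing a non-`None` result to `meets_minimum` tells you whether the environment satisfies the requirement before you attempt an ingestion.
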
## Create Database Services

You need to create database services before ingesting the metadata from Amundsen. In the example below, we have 5 tables
from 3 data sources, i.e., `hive`, `dynamo` and `delta`, so in OpenMetadata we have to create database services with the
same names as the sources.

<Image src="/images/openmetadata/connectors/amundsen/create-db-service.png" alt="db-service" caption="Amundsen dashboard"/>

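To work out which database services are needed, you can collect the distinct source names from your Amundsen tables. The sketch below assumes table keys of the form `<source>://<cluster>.<schema>/<table>`; that key format, the sample keys, and the helper name `required_service_names` are assumptions made for illustration and are not taken from this guide.

```python
def required_service_names(table_keys):
    """Collect the distinct source names that prefix each table key."""
    names = set()
    for key in table_keys:
        source, separator, _rest = key.partition("://")
        if not separator or not source:
            raise ValueError(f"unexpected table key: {key!r}")
        names.add(source)
    return sorted(names)


# Hypothetical keys: 5 tables spread over 3 data sources.
SAMPLE_KEYS = [
    "hive://gold.test_schema/test_table",
    "hive://gold.test_schema/test_table2",
    "dynamo://gold.test_schema/test_table3",
    "delta://gold.test_schema/test_table4",
    "delta://gold.test_schema/test_table5",
]

print(required_service_names(SAMPLE_KEYS))  # ['delta', 'dynamo', 'hive']
```

Each name in the result corresponds to one database service to create in OpenMetadata.
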
To create a database service, follow these steps:

### 1. Visit the Services Page

The first step is ingesting the metadata from your sources. Under Settings, you will find a Services page where you can
link an external source system to OpenMetadata. Once a service is created, it can be used to configure metadata, usage,
and profiler workflows. To visit the Services page, select Services from the Settings menu.

<Image src="/images/openmetadata/connectors/amundsen/create-service-1.png" alt="db-service" caption="Navigate to Settings >> Services"/>

### 2. Create a New Service

Click on the Add New Service button to start the Service creation.

<Image src="/images/openmetadata/connectors/amundsen/create-service-2.png" alt="db-service" caption="Add a New Service from the Database Services Page"/>

### 3. Select the Service Type

Select each service type that is present in Amundsen and create the services one by one. In this example, we
need to create services for Hive, DynamoDB and Delta Lake. Possible service names are `athena`, `bigquery`, `db2`, `druid`, `delta`,
`salesforce`, `oracle`, `glue`, `snowflake` or `hive`.

<Image src="/images/openmetadata/connectors/amundsen/create-service-3.png" alt="db-service"/>


<Image src="/images/openmetadata/connectors/amundsen/create-service-4.png" alt="db-service"/>

Note: Adding ingestion in this step is optional, because we will fetch the metadata from Amundsen. After creating all
the database services, the `my service` page looks like the one below, and we are ready to start with the Amundsen ingestion via the CLI.

<Image src="/images/openmetadata/connectors/amundsen/create-service-5.png" alt="db-service"/>

## Metadata Ingestion

All connectors are now defined as JSON Schemas. [Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json)
you can find the structure to create a connection to Amundsen.

In order to create and run a Metadata Ingestion workflow, we will follow the steps to create a
YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json).

### 1. Define the YAML Config

This is a sample config for Amundsen:

```yaml
source:
  type: amundsen
  serviceName: local_amundsen
  serviceConnection:
    config:
      type: Amundsen
      username: <username>
      password: <password>
      hostPort: bolt://localhost:7687
      maxConnectionLifeTime: <time in secs.>
      validateSSL: <true or false>
      encrypted: <true or false>
      modelClass: <modelclass>
  sourceConfig:
    config:
      enableDataProfiler: false
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```

### Source Configuration - Service Connection

You can find all the definitions and types for the `serviceConnection` [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json).

- `username`: The username of your Amundsen user. The specified user should be authorized to read all databases you want to include in the metadata ingestion workflow.
- `password`: The password for your Amundsen user.
- `hostPort`: Host and port of the Amundsen Neo4j connection.
- `maxConnectionLifeTime` (optional): Maximum connection lifetime, in seconds, for the Amundsen Neo4j connection.
- `validateSSL` (optional): Enable SSL validation for the Amundsen Neo4j connection.
- `encrypted` (optional): Enable encryption for the Amundsen Neo4j connection.
- `modelClass` (optional): Model class for the Amundsen Neo4j connection.

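Before running a workflow, it can help to check the `serviceConnection` values against the field list above. The following is a minimal sketch under stated assumptions: it treats every field not marked optional in that list as required, and the helper `check_connection_config` is hypothetical rather than part of `openmetadata-ingestion`. The linked JSON Schema remains the authoritative definition.

```python
# Fields from the list above; those not marked optional are treated as required.
REQUIRED_FIELDS = ("username", "password", "hostPort")
OPTIONAL_FIELDS = ("maxConnectionLifeTime", "validateSSL", "encrypted", "modelClass")


def check_connection_config(config):
    """Return a list of problems found in the config dict (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not config.get(field):
            problems.append(f"missing required field: {field}")
    # "type" appears in the sample config alongside the connection fields.
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS) | {"type"}
    for field in sorted(set(config) - known):
        problems.append(f"unknown field: {field}")
    return problems


# Placeholder values only; replace them with your own credentials.
sample = {
    "type": "Amundsen",
    "username": "<username>",
    "password": "<password>",
    "hostPort": "bolt://localhost:7687",
}
print(check_connection_config(sample))  # []
```

A check like this catches misspelled keys such as `hostport` early, since the field names are case sensitive.
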
### Sink Configuration

To send the metadata to OpenMetadata, the sink needs to be specified as `type: metadata-rest`.

### Workflow Configuration

The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your
OpenMetadata installation. For a simple, local installation using our docker containers, this looks like:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```

### OpenMetadata Security Providers

We support different security providers. You can find their definitions here. An example of an `Auth0` configuration would
be the following:

```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: auth0
    securityConfig:
      clientId: <client ID>
      secretKey: <secret key>
      domain: <domain>
```

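If you generate configurations programmatically, the two `workflowConfig` variants above can be produced by one small helper. This is only a sketch: `build_workflow_config` is a hypothetical function, and it simply mirrors the nesting of the YAML examples without validating the values against the OpenMetadata schemas.

```python
import json


def build_workflow_config(host_port, auth_provider="no-auth", security_config=None):
    """Build the workflowConfig section as a nested dict."""
    server = {"hostPort": host_port, "authProvider": auth_provider}
    if security_config is not None:
        # Copy so later changes to the caller's dict do not leak in.
        server["securityConfig"] = dict(security_config)
    return {"workflowConfig": {"openMetadataServerConfig": server}}


# Local installation without authentication.
local = build_workflow_config("http://localhost:8585/api")

# Auth0 variant; the placeholder values must be replaced with real ones.
auth0 = build_workflow_config(
    "http://localhost:8585/api",
    auth_provider="auth0",
    security_config={
        "clientId": "<client ID>",
        "secretKey": "<secret key>",
        "domain": "<domain>",
    },
)

print(json.dumps(auth0, indent=2))
```

The resulting dictionaries have the same shape as the YAML shown above. Writing them out as YAML would need a YAML library, which is not shown here.
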
### 2. Run with the CLI

First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:

```commandline
metadata ingest -c <path-to-yaml>
```

Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will
be able to extract metadata from different sources.
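When the ingestion has to be triggered from a script rather than typed by hand, the same command can be launched from Python. The sketch below is one possible wrapper, assuming the `metadata` executable is on `PATH`; the helper names `build_ingest_command` and `run_ingestion` are hypothetical.

```python
import shutil
import subprocess


def build_ingest_command(config_path, executable="metadata"):
    """Return the argument list for `metadata ingest -c <path-to-yaml>`."""
    return [executable, "ingest", "-c", str(config_path)]


def run_ingestion(config_path):
    """Run the ingestion and return the CLI's exit code."""
    if shutil.which("metadata") is None:
        raise RuntimeError(
            "the `metadata` CLI was not found on PATH; "
            "install openmetadata-ingestion first"
        )
    completed = subprocess.run(build_ingest_command(config_path), check=False)
    return completed.returncode
```

For example, `run_ingestion("amundsen.yaml")` runs the workflow defined in a file of that name, and a nonzero return value indicates that the CLI reported a failure.
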
@ -0,0 +1,8 @@
---
title: Metadata Services
slug: /openmetadata/connectors/metadata
---

# Metadata Services

- [Amundsen](/openmetadata/connectors/metadata/amundsen)
Binary file not shown.
|
After Width: | Height: | Size: 173 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 102 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 176 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 188 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 273 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 324 KiB |