Add amundsen docs (#6369)

Pere Miquel Brull 2022-07-27 06:38:44 +02:00 committed by GitHub
parent aac64e161b
commit ca58c47933
10 changed files with 176 additions and 0 deletions

@ -337,6 +337,11 @@ site_menu:
  - category: OpenMetadata / Connectors / Pipeline / Glue / CLI
    url: /openmetadata/connectors/pipeline/glue/cli
  - category: OpenMetadata / Connectors / Metadata
    url: /openmetadata/connectors/metadata
  - category: OpenMetadata / Connectors / Metadata / Amundsen
    url: /openmetadata/connectors/metadata/amundsen
  - category: OpenMetadata / Ingestion
    url: /openmetadata/ingestion
  - category: OpenMetadata / Ingestion / Workflows

@ -8,6 +8,7 @@ slug: /openmetadata/connectors
OpenMetadata can extract metadata from the following list of connectors:
## Database Services
- [Athena](/openmetadata/connectors/database/athena)
- [AzureSQL](/openmetadata/connectors/database/azuresql)
- [BigQuery](/openmetadata/connectors/database/bigquery)
@ -33,6 +34,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Vertica](/openmetadata/connectors/database/vertica)
## Dashboard Services
- [Looker](/openmetadata/connectors/dashboard/looker)
- [Metabase](/openmetadata/connectors/dashboard/metabase)
- [PowerBI](/openmetadata/connectors/dashboard/powerbi)
@ -41,6 +43,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Tableau](/openmetadata/connectors/dashboard/tableau)
## Messaging Services
- [Kafka](/openmetadata/connectors/messaging/kafka)
## Pipeline Services
@ -48,3 +51,7 @@ OpenMetadata can extract metadata from the following list of connectors:
- [Airbyte](/openmetadata/connectors/pipeline/airbyte)
- [Airflow](/openmetadata/connectors/pipeline/airflow)
- [Glue](/openmetadata/connectors/pipeline/glue)
## Metadata Services
- [Amundsen](/openmetadata/connectors/metadata/amundsen)

@ -0,0 +1,156 @@
---
title: Amundsen
slug: /openmetadata/connectors/metadata/amundsen
---
# Amundsen
On this page, you will learn how to use the `metadata` CLI to run a one-time ingestion.
<Requirements />
## Python requirements
To run the Amundsen ingestion, you will need to install:
```commandline
pip3 install "openmetadata-ingestion[amundsen]"
```
Make sure you are running openmetadata-ingestion version 0.10.2 or above.
## Create Database Services
You need to create database services before ingesting metadata from Amundsen. The example below has 5 tables
from 3 data sources (`hive`, `dynamo`, and `delta`), so in OpenMetadata we have to create one database service per
source, each with the same name as its source.
<Image src="/images/openmetadata/connectors/amundsen/create-db-service.png" alt="db-service" caption="Amundsen dashboard"/>
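One way to work out which database services are needed is to list the distinct source names among the Amundsen table keys. The sketch below assumes the keys follow the `database://cluster.schema/table` layout commonly used by Amundsen; the sample keys and the helper name `required_service_names` are hypothetical and only mirror the 5-tables, 3-sources example above.

```python
def required_service_names(table_keys):
    """Return the sorted, distinct source names found in Amundsen table keys.

    Assumes each key looks like 'database://cluster.schema/table'.
    """
    names = set()
    for key in table_keys:
        source, sep, _rest = key.partition("://")
        if not sep or not source:
            raise ValueError(f"unexpected table key: {key!r}")
        names.add(source)
    return sorted(names)


# Hypothetical keys: 5 tables spread over 3 sources.
sample_keys = [
    "hive://gold.test_schema/test_table",
    "hive://gold.test_schema/test_view",
    "dynamo://gold.test_schema/test_table2",
    "delta://gold.test_schema/delta_test_table",
    "delta://gold.test_schema2/delta_test_table2",
]
print(required_service_names(sample_keys))
```

Each name returned here would correspond to one database service to create in OpenMetadata.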
To create a database service, follow these steps:
### 1. Visit the Services Page
The first step is ingesting the metadata from your sources. Under Settings, you will find Services. A service links an
external source system to OpenMetadata. Once a service is created, it can be used to configure metadata, usage, and
profiler workflows. To visit the Services page, select Services from the Settings menu.
<Image src="/images/openmetadata/connectors/amundsen/create-service-1.png" alt="db-service" caption="Navigate to Settings >> Services"/>
### 2. Create a New Service
Click on the Add New Service button to start the Service creation.
<Image src="/images/openmetadata/connectors/amundsen/create-service-2.png" alt="db-service" caption="Add a New Service from the Database Services Page"/>
### 3. Select the Service Type
Select each service type that is present in Amundsen and create the services one by one. In this example, we
need to create services for Hive, DynamoDB, and Delta Lake. Possible service names are `athena`, `bigquery`, `db2`, `druid`, `delta`,
`salesforce`, `oracle`, `glue`, `snowflake` or `hive`.
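Before creating services, it can help to check the source names against the list above. This is a minimal sketch: the set is copied from this page and may not be exhaustive, and `split_supported` plus the sample input are hypothetical.

```python
# Service names listed on this page; treat the set as illustrative.
SUPPORTED_SERVICE_NAMES = {
    "athena", "bigquery", "db2", "druid", "delta",
    "salesforce", "oracle", "glue", "snowflake", "hive",
}


def split_supported(names):
    """Split names into (listed, not_listed), preserving input order."""
    listed = [n for n in names if n in SUPPORTED_SERVICE_NAMES]
    not_listed = [n for n in names if n not in SUPPORTED_SERVICE_NAMES]
    return listed, not_listed


# Hypothetical input: two listed names and one that is not on the list.
listed, not_listed = split_supported(["hive", "delta", "teradata"])
print("create services for:", listed)
print("not in the list above:", not_listed)
```

The comparison is an exact, case-sensitive match because the service name has to equal the source name.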
<Image src="/images/openmetadata/connectors/amundsen/create-service-3.png" alt="db-service"/>
<Image src="/images/openmetadata/connectors/amundsen/create-service-4.png" alt="db-service"/>
Note: Adding an ingestion in this step is optional, because we will fetch the metadata from Amundsen. After creating all
the database services, the `my service` page looks like the image below, and we are ready to start the Amundsen ingestion via the CLI.
<Image src="/images/openmetadata/connectors/amundsen/create-service-5.png" alt="db-service"/>
## Metadata Ingestion
All connectors are now defined as JSON Schemas. [Here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json)
you can find the structure to create a connection to Amundsen.
In order to create and run a Metadata Ingestion workflow, we will follow the steps to create a
YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.
The workflow is modeled around the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json).
### 1. Define the YAML Config
This is a sample config for Amundsen:
```yaml
source:
  type: amundsen
  serviceName: local_amundsen
  serviceConnection:
    config:
      type: Amundsen
      username: <username>
      password: <password>
      hostPort: bolt://localhost:7687
      maxConnectionLifeTime: <time in secs.>
      validateSSL: <true or false>
      encrypted: <true or false>
      modelClass: <modelclass>
  sourceConfig:
    config:
      enableDataProfiler: false
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
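To keep credentials out of the saved file's source template, the placeholders can be filled in at run time. The following is a rough sketch using only the Python standard library; the variable names `AMUNDSEN_USERNAME`, `AMUNDSEN_PASSWORD`, and `AMUNDSEN_HOST_PORT` and the helper `render_config` are assumptions for illustration, and the template is a trimmed copy of the `source` section only.

```python
from string import Template

# Trimmed-down copy of the source section, with $-placeholders.
CONFIG_TEMPLATE = Template("""\
source:
  type: amundsen
  serviceName: local_amundsen
  serviceConnection:
    config:
      type: Amundsen
      username: $username
      password: $password
      hostPort: $host_port
""")


def render_config(env):
    """Fill the template from a mapping; raises KeyError if a value is missing."""
    return CONFIG_TEMPLATE.substitute(
        username=env["AMUNDSEN_USERNAME"],
        password=env["AMUNDSEN_PASSWORD"],
        host_port=env.get("AMUNDSEN_HOST_PORT", "bolt://localhost:7687"),
    )


# Demo values so the sketch runs as-is.
demo_env = {"AMUNDSEN_USERNAME": "neo4j", "AMUNDSEN_PASSWORD": "example"}
print(render_config(demo_env))
```

Passing `os.environ` instead of `demo_env` would read the values from the environment. Values containing YAML special characters (such as `:` or `#`) would need quoting, which this sketch does not handle.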
### Source Configuration - Service Connection
You can find all the definitions and types for the `serviceConnection` [here](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/entity/services/connections/metadata/amundsenConnection.json).
- `username`: Username of your Amundsen user. The specified user should be authorized to read all databases you want to include in the metadata ingestion workflow.
- `password`: Password for your Amundsen user.
- `hostPort`: Host and port of the Amundsen Neo4j Connection.
- `maxConnectionLifeTime` (Optional): Maximum connection lifetime, in seconds, for the Amundsen Neo4j Connection.
- `validateSSL` (Optional): Enable SSL validation for the Amundsen Neo4j Connection.
- `encrypted` (Optional): Enable encryption for the Amundsen Neo4j Connection.
- `modelClass` (Optional): Model Class for the Amundsen Neo4j Connection.
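The field list above can be turned into a quick pre-flight check. This sketch assumes that the three fields not marked optional (`username`, `password`, `hostPort`) are required; the linked JSON Schema remains the authoritative definition, and `check_connection_config` is a hypothetical helper, not part of `openmetadata-ingestion`.

```python
from urllib.parse import urlparse

# Assumption: fields not marked "Optional" on this page are required.
REQUIRED_KEYS = ("username", "password", "hostPort")
OPTIONAL_KEYS = ("maxConnectionLifeTime", "validateSSL", "encrypted", "modelClass")


def check_connection_config(config):
    """Return a list of problems found in a serviceConnection config dict."""
    problems = [f"missing required key: {k}" for k in REQUIRED_KEYS if k not in config]
    known = set(REQUIRED_KEYS) | set(OPTIONAL_KEYS) | {"type"}
    problems += [f"unknown key: {k}" for k in config if k not in known]
    if "hostPort" in config:
        try:
            parsed = urlparse(str(config["hostPort"]))
            has_host_and_port = bool(parsed.hostname) and parsed.port is not None
        except ValueError:  # raised for a malformed port
            has_host_and_port = False
        if not has_host_and_port:
            problems.append("hostPort should include a host and a port")
    return problems


print(check_connection_config({
    "type": "Amundsen",
    "username": "neo4j",
    "password": "example",
    "hostPort": "bolt://localhost:7687",
}))
```

An empty list means the check found nothing to complain about; it does not verify that the Neo4j instance is reachable.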
### Sink Configuration
To send the metadata to OpenMetadata, the sink needs to be specified as `type: metadata-rest`.
### Workflow Configuration
The main property here is the `openMetadataServerConfig`, where you can define the host and security provider of your
OpenMetadata installation. For a simple, local installation using our docker containers, this looks like:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
```
### OpenMetadata Security Providers
We support different security providers. You can find their definitions here. An example of an `Auth0` configuration would
be the following:
```yaml
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: auth0
    securityConfig:
      clientId: <client ID>
      secretKey: <secret key>
      domain: <domain>
```
### 2. Run with the CLI
First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
```commandline
metadata ingest -c <path-to-yaml>
```
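If the ingestion is triggered from a script rather than typed by hand, the same command can be wrapped with the standard library. This is a sketch only: it assumes the `metadata` executable is on `PATH`, and the helper names `build_ingest_command` and `run_ingestion` are hypothetical.

```python
import subprocess


def build_ingest_command(config_path):
    """Return the argument list for `metadata ingest -c <path>`."""
    return ["metadata", "ingest", "-c", str(config_path)]


def run_ingestion(config_path):
    """Run the CLI and return its exit code; assumes `metadata` is on PATH."""
    completed = subprocess.run(build_ingest_command(config_path), check=False)
    return completed.returncode


# Show the command without running it; the file name is a placeholder.
print(" ".join(build_ingest_command("amundsen.yaml")))
```

Calling `run_ingestion("amundsen.yaml")` would start the workflow and return the exit code of the CLI process.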
Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will
be able to extract metadata from different sources.

@ -0,0 +1,8 @@
---
title: Metadata Services
slug: /openmetadata/connectors/metadata
---
# Metadata Services
- [Amundsen](/openmetadata/connectors/metadata/amundsen)

Binary files not shown (6 images: 173 KiB, 102 KiB, 176 KiB, 188 KiB, 273 KiB, 324 KiB).