mirror of
https://github.com/open-metadata/OpenMetadata.git
synced 2026-01-06 04:26:57 +00:00
Docs - Credentials, links and Roadmap (#6870)
* Managing credentials
* GCS creds info
* Links and roadmap update
* Add lineage in the menu
parent
95b2ac276e
commit
ceb9601c67
@ -392,6 +392,9 @@ site_menu:
|
||||
- category: OpenMetadata / Connectors / Metadata / Amundsen
|
||||
url: /openmetadata/connectors/metadata/amundsen
|
||||
|
||||
- category: OpenMetadata / Connectors / Managing Credentials
|
||||
url: /openmetadata/connectors/credentials
|
||||
|
||||
- category: OpenMetadata / Ingestion
|
||||
url: /openmetadata/ingestion
|
||||
- category: OpenMetadata / Ingestion / Workflows
|
||||
@ -403,13 +406,15 @@ site_menu:
|
||||
url: /openmetadata/ingestion/workflows/metadata/dbt
|
||||
- category: OpenMetadata / Ingestion / Workflows/ Metadata / DBT / Ingest DBT UI
|
||||
url: /openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-ui
|
||||
- category: OpenMetadata / Ingestion / Workflows/ Metadata / DBT / Ingest DBT CLI
|
||||
url: /openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-cli
|
||||
- category: OpenMetadata / Ingestion / Workflows/ Metadata / DBT / Ingest DBT from Workflow Config
|
||||
url: /openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-workflow-config
|
||||
|
||||
- category: OpenMetadata / Ingestion / Workflows / Usage
|
||||
url: /openmetadata/ingestion/workflows/usage
|
||||
- category: OpenMetadata / Ingestion / Workflows / Usage / Usage Workflow Through Query Logs
|
||||
url: /openmetadata/ingestion/workflows/usage/usage-workflow-query-logs
|
||||
- category: OpenMetadata / Ingestion / Workflows / Lineage
|
||||
url: /openmetadata/ingestion/workflows/lineage
|
||||
- category: OpenMetadata / Ingestion / Workflows / Profiler
|
||||
url: /openmetadata/ingestion/workflows/profiler
|
||||
- category: OpenMetadata / Ingestion / Workflows / Profiler / Metrics
|
||||
|
||||
@ -0,0 +1,66 @@
|
||||
---
|
||||
title: Managing Credentials
|
||||
slug: /openmetadata/connectors/credentials
|
||||
---
|
||||
|
||||
# Managing Credentials in the CLI
|
||||
|
||||
When running a Workflow with the CLI or your favourite scheduler, it's safer not to leave the services' credentials
|
||||
in plain sight. For the CLI, the ingestion package can load sensitive information from environment variables.
|
||||
|
||||
For example, if you are using the [Glue](/openmetadata/connectors/database/glue) connector you could specify the
|
||||
AWS configuration as follows in a JSON config file:
|
||||
|
||||
```json
|
||||
[...]
|
||||
"awsConfig": {
|
||||
"awsAccessKeyId": "${AWS_ACCESS_KEY_ID}",
|
||||
"awsSecretAccessKey": "${AWS_SECRET_ACCESS_KEY}",
|
||||
"awsRegion": "${AWS_REGION}",
|
||||
"awsSessionToken": "${AWS_SESSION_TOKEN}"
|
||||
},
|
||||
[...]
|
||||
```
|
||||
|
||||
Or
|
||||
|
||||
```yaml
|
||||
[...]
|
||||
awsConfig:
|
||||
awsAccessKeyId: '${AWS_ACCESS_KEY_ID}'
|
||||
awsSecretAccessKey: '${AWS_SECRET_ACCESS_KEY}'
|
||||
awsRegion: '${AWS_REGION}'
|
||||
awsSessionToken: '${AWS_SESSION_TOKEN}'
|
||||
[...]
|
||||
```
|
||||
|
||||
for a YAML configuration.
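
As a rough illustration of the placeholder mechanism, the sketch below expands `${VAR}` references in a JSON config string with Python's standard `os.path.expandvars`. It is a stand-in under stated assumptions, not the ingestion package's actual loader: the helper name `expand_env_placeholders` and the dummy values are hypothetical.

```python
import json
import os


def expand_env_placeholders(raw_config: str) -> dict:
    """Replace ${VAR} references with environment values, then parse the JSON.

    Hypothetical helper for illustration only; it is not part of the
    OpenMetadata ingestion package.
    """
    # os.path.expandvars leaves unknown variables untouched, so a missing
    # export shows up as a literal "${NAME}" in the parsed result.
    return json.loads(os.path.expandvars(raw_config))


raw = """
{
  "awsConfig": {
    "awsAccessKeyId": "${AWS_ACCESS_KEY_ID}",
    "awsRegion": "${AWS_REGION}"
  }
}
"""

# Dummy values standing in for variables exported in the shell.
os.environ["AWS_ACCESS_KEY_ID"] = "dummy-key-id"
os.environ["AWS_REGION"] = "eu-west-1"

config = expand_env_placeholders(raw)
print(config["awsConfig"]["awsRegion"])
```

Because unset variables are left as literal placeholders in this sketch, scanning the parsed values for a remaining `${` is a cheap way to catch a forgotten export before running a Workflow.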
|
||||
|
||||
# AWS Credentials
|
||||
|
||||
The AWS Credentials are based on the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/security/credentials/awsCredentials.json).
|
||||
Note that the only required field is the `awsRegion`. This configuration is rather flexible to allow installations under AWS
|
||||
that directly use instance roles for permissions to authenticate to whatever service we are pointing to without having to
|
||||
write the credentials down.
|
||||
|
||||
## AWS Vault
|
||||
|
||||
If using [aws-vault](https://github.com/99designs/aws-vault), it gets a bit more involved to run the CLI ingestion as the credentials are not globally available in the terminal.
|
||||
In that case, you could use the following command after setting up the ingestion configuration file:
|
||||
|
||||
```bash
|
||||
aws-vault exec <role> -- $SHELL -c 'metadata ingest -c <path to connector>'
|
||||
```
|
||||
|
||||
# GCS Credentials
|
||||
|
||||
The GCS Credentials are based on the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/catalog-rest-service/src/main/resources/json/schema/security/credentials/gcsCredentials.json).
|
||||
These are the fields that you can export when preparing a Service Account.
|
||||
|
||||
Once the account is created, you can see the fields in the exported JSON file from:
|
||||
|
||||
```
|
||||
IAM & Admin > Service Accounts > Keys
|
||||
```
|
||||
|
||||
You can validate the whole Google service account setup [here](deployment/security/google).
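
Before copying values into a connector configuration, it can help to confirm that the exported key file is complete. The sketch below checks a key for the field names commonly present in a Google service account JSON key. The helper `missing_key_fields` and the sample key are hypothetical, and the field list reflects the exported file, not necessarily the exact property names of the JSON Schema linked above.

```python
import json

# Field names commonly found in a JSON key exported for a Google service account.
EXPECTED_KEY_FIELDS = {
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
    "auth_provider_x509_cert_url",
    "client_x509_cert_url",
}


def missing_key_fields(key_json: str) -> list:
    """Return the expected field names absent from the key, sorted by name.

    Hypothetical helper for illustration only.
    """
    present = set(json.loads(key_json))
    return sorted(EXPECTED_KEY_FIELDS - present)


# Dummy, deliberately incomplete key used only to exercise the check.
sample_key = json.dumps(
    {
        "type": "service_account",
        "project_id": "my-project",
        "client_email": "ingestion@my-project.iam.gserviceaccount.com",
    }
)

for field in missing_key_fields(sample_key):
    print(f"missing field: {field}")
```

In practice the key would be read from the downloaded file instead of an inline string; keep that file out of version control.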
|
||||
@ -5,10 +5,66 @@ slug: /openmetadata/ingestion
|
||||
|
||||
# Metadata Ingestion
|
||||
|
||||
Explain how we have different types of workflows and the metadata
|
||||
that we can ingest automatically:
|
||||
The goal of OpenMetadata is to serve as a centralised platform where users can gather and collaborate
|
||||
around data. This is possible thanks to different workflows that users can deploy and schedule, which will
|
||||
connect to the data sources to extract metadata.
|
||||
|
||||
- e.g., table metadata
|
||||
- DBT
|
||||
- Lineage
|
||||
- Usage
|
||||
The metadata ingested into OpenMetadata can include:
|
||||
- Entity metadata, such as Tables, Dashboards, Topics...
|
||||
- Query usage to rank the most used tables,
|
||||
- Lineage between Entities,
|
||||
- Data Profiles and Quality Tests.
|
||||
|
||||
In this section we will explore the different workflows, how they work and how to use them.
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Ingestion"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/metadata"
|
||||
>
|
||||
Learn more about how to ingest metadata from dozens of connectors.
|
||||
</InlineCallout>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Profiler & Quality Tests"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/profiler"
|
||||
>
|
||||
Get metrics from your Tables and run automated Quality Tests!
|
||||
</InlineCallout>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Usage"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/usage"
|
||||
>
|
||||
To analyze popular entities.
|
||||
</InlineCallout>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Lineage"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/lineage"
|
||||
>
|
||||
To analyze relationships in your data platform.
|
||||
</InlineCallout>
|
||||
|
||||
</InlineCalloutContainer>
|
||||
|
||||
## Metadata Versioning
|
||||
|
||||
One fundamental aspect of Metadata Ingestion is being able to analyze the evolution of your metadata. OpenMetadata
|
||||
supports Metadata Versioning, maintaining the history of changes of all your assets.
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Versioning"
|
||||
icon="360"
|
||||
href="/openmetadata/ingestion/versioning"
|
||||
>
|
||||
Learn how OpenMetadata keeps track of your metadata evolution.
|
||||
</InlineCallout>
|
||||
</InlineCalloutContainer>
|
||||
|
||||
@ -0,0 +1,8 @@
|
||||
---
|
||||
title: Lineage Workflow
|
||||
slug: /openmetadata/ingestion/workflows/lineage
|
||||
---
|
||||
|
||||
# Lineage Workflow
|
||||
|
||||
Introduced in 0.12
|
||||
@ -5,6 +5,27 @@ slug: /openmetadata/ingestion/workflows/metadata/dbt
|
||||
|
||||
# DBT Integration
|
||||
|
||||
You can ingest DBT Metadata either with the UI or by writing your own Workflow configuration:
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="DBT UI ingestion"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-ui"
|
||||
>
|
||||
Configure the DBT ingestion directly in the UI.
|
||||
</InlineCallout>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="DBT CLI ingestion"
|
||||
icon="cable"
|
||||
href="/openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-cli"
|
||||
>
|
||||
Prepare the DBT ingestion with the CLI or your favourite scheduler.
|
||||
</InlineCallout>
|
||||
</InlineCalloutContainer>
|
||||
|
||||
### What is DBT?
|
||||
|
||||
A DBT model provides transformation logic that creates a table from raw data.
|
||||
@ -15,12 +36,12 @@ DBT does the T in [ELT](https://docs.getdbt.com/terms/elt) (Extract, Load, Trans
|
||||
|
||||
For information regarding setting up a DBT project and creating models please refer to the official DBT documentation [here](https://docs.getdbt.com/docs/introduction).
|
||||
|
||||
### DBT Integration in Openmetadata
|
||||
### DBT Integration in OpenMetadata
|
||||
|
||||
OpenMetadata includes an integration for DBT that enables you to see what models are being used to generate tables.
|
||||
|
||||
Openmetadata parses the [manifest](https://docs.getdbt.com/reference/artifacts/manifest-json) and [catalog](https://docs.getdbt.com/reference/artifacts/catalog-json) json files and shows the queries from which the models are being generated.
|
||||
OpenMetadata parses the [manifest](https://docs.getdbt.com/reference/artifacts/manifest-json) and [catalog](https://docs.getdbt.com/reference/artifacts/catalog-json) json files and shows the queries from which the models are being generated.
|
||||
|
||||
Metadata regarding the tables and views generated via DBT is also ingested and can be seen.
|
||||
|
||||

|
||||

|
||||
|
||||
@ -1,9 +1,9 @@
|
||||
---
|
||||
title: DBT Ingestion CLI
|
||||
slug: /openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-cli
|
||||
title: DBT Ingestion from Workflow config
|
||||
slug: /openmetadata/ingestion/workflows/metadata/dbt/ingest-dbt-workflow-config
|
||||
---
|
||||
|
||||
# Add DBT while ingesting from CLI
|
||||
# Add DBT to your Workflow config
|
||||
|
||||
Provide and configure the DBT manifest and catalog file source locations.
|
||||
|
||||
@ -4,3 +4,29 @@ slug: /openmetadata/ingestion/workflows/metadata
|
||||
---
|
||||
|
||||
# Metadata Ingestion Workflow
|
||||
|
||||
The easiest way to extract metadata is to use any of our connectors!
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Metadata Connectors"
|
||||
icon="add_moderator"
|
||||
href="/openmetadata/connectors"
|
||||
>
|
||||
Configure your automated Metadata extraction.
|
||||
</InlineCallout>
|
||||
</InlineCalloutContainer>
|
||||
|
||||
If you want to learn more about how to extract metadata from DBT, we have you covered:
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="DBT Ingestion"
|
||||
icon="add_moderator"
|
||||
href="/openmetadata/ingestion/workflows/metadata/dbt"
|
||||
>
|
||||
Extract Metadata and ingest your DBT models.
|
||||
</InlineCallout>
|
||||
</InlineCalloutContainer>
|
||||
|
||||
@ -13,6 +13,21 @@ This workflow is available ONLY for the following connectors:
|
||||
- [Redshift](/openmetadata/connectors/database/redshift)
|
||||
- [Clickhouse](/openmetadata/connectors/database/clickhouse)
|
||||
|
||||
If your database service is not yet supported, you can use this same workflow by providing a Query Log file!
|
||||
|
||||
Learn how to do so 👇
|
||||
|
||||
<InlineCalloutContainer>
|
||||
<InlineCallout
|
||||
color="violet-70"
|
||||
bold="Usage Workflow through Query Logs"
|
||||
icon="add_moderator"
|
||||
href="/openmetadata/ingestion/workflows/usage/usage-workflow-query-logs"
|
||||
>
|
||||
Configure the usage workflow by providing a Query Log file.
|
||||
</InlineCallout>
|
||||
</InlineCalloutContainer>
|
||||
|
||||
## UI Configuration
|
||||
|
||||
Once the metadata ingestion runs correctly and we are able to explore the service Entities, we can add Query Usage and Entity Lineage information.
|
||||
@ -53,4 +68,4 @@ Set the limit for the query log results to be run at a time.
|
||||
### 3. Schedule and Deploy
|
||||
After clicking Next, you will be redirected to the Scheduling form. This will be the same as the Metadata Ingestion. Select your desired schedule and click on Deploy to find the usage pipeline being added to the Service Ingestions.
|
||||
|
||||
<Image src="/images/openmetadata/ingestion/workflows/usage/scheule-and-deploy.png" alt="schedule-and-deploy" caption="View Service Ingestion pipelines"/>
|
||||
<Image src="/images/openmetadata/ingestion/workflows/usage/scheule-and-deploy.png" alt="schedule-and-deploy" caption="View Service Ingestion pipelines"/>
|
||||
|
||||
@ -15,7 +15,7 @@ or ping us on [Slack](https://slack.open-metadata.org/) If you would like to pri
|
||||
|
||||
You can check the latest release [here](/overview/releases).
|
||||
|
||||
## 0.12.0 Release - Aug 17th, 2022
|
||||
## 0.12.0 Release - Sept 7th, 2022
|
||||
|
||||
<TileContainer>
|
||||
<Tile
|
||||
@ -85,10 +85,9 @@ You can check the latest release [here](/overview/releases).
|
||||
bordercolor="blue-70"
|
||||
>
|
||||
<li>Fivetran</li>
|
||||
<li>Sagemaker</li>
|
||||
<li>Mode</li>
|
||||
<li>Redpanda</li>
|
||||
<li>Prefect</li>
|
||||
<li>Dagster</li>
|
||||
</Tile>
|
||||
<Tile
|
||||
title="ML Features"
|
||||
@ -98,7 +97,7 @@ You can check the latest release [here](/overview/releases).
|
||||
/>
|
||||
</TileContainer>
|
||||
|
||||
## 0.13.0 Release - Sept 28th, 2022
|
||||
## 0.13.0 Release - Oct 12th, 2022
|
||||
|
||||
<TileContainer>
|
||||
<Tile
|
||||
@ -144,12 +143,10 @@ You can check the latest release [here](/overview/releases).
|
||||
background="green-70"
|
||||
bordercolor="green-70"
|
||||
>
|
||||
<li>Qwik</li>
|
||||
<li>DataStudio</li>
|
||||
<li>Trino Usage</li>
|
||||
<li>LookML</li>
|
||||
<li>Dagster</li>
|
||||
<li>One click migration from Amundsen and Atlas.</li>
|
||||
<li>Sagemaker</li>
|
||||
</Tile>
|
||||
<Tile
|
||||
title="Data Quality"
|
||||
@ -158,7 +155,7 @@ You can check the latest release [here](/overview/releases).
|
||||
bordercolor="yellow-70"
|
||||
link="https://github.com/open-metadata/OpenMetadata/issues/4652"
|
||||
>
|
||||
<li>Custom SQL improvements: allow users to validate and run the SQL</li>
|
||||
<li>Complex types</li>
|
||||
<li>Improvements to data profiler metrics</li>
|
||||
<li>Performance improvements to data quality</li>
|
||||
</Tile>
|
||||
@ -179,13 +176,16 @@ You can check the latest release [here](/overview/releases).
|
||||
/>
|
||||
<Tile
|
||||
title="Lineage"
|
||||
text="Support Spark Lineage"
|
||||
text=""
|
||||
background="green-70"
|
||||
bordercolor="green-70"
|
||||
/>
|
||||
>
|
||||
<li>Spark Lineage</li>
|
||||
<li>Connector Lineage improvements</li>
|
||||
</Tile>
|
||||
</TileContainer>
|
||||
|
||||
## 0.14.0 Release - Nov 9th, 2022
|
||||
## 0.14.0 Release - Nov 16th, 2022
|
||||
|
||||
<TileContainer>
|
||||
<Tile
|
||||
@ -233,6 +233,15 @@ You can check the latest release [here](/overview/releases).
|
||||
<li>Microstrategy</li>
|
||||
<li>Custom service integration - Users can integrate with their own service type</li>
|
||||
</Tile>
|
||||
<Tile
|
||||
title="Data Quality"
|
||||
text=""
|
||||
background="purple-70"
|
||||
bordercolor="purple-70"
|
||||
link=""
|
||||
>
|
||||
<li>Custom SQL improvements: allow users to validate and run the SQL</li>
|
||||
</Tile>
|
||||
</TileContainer>
|
||||
|
||||
## 1.0 Release - Dec 15th, 2022
|
||||
|
||||