MINOR - Add notes on external ingestion docs (#14756)

* MINOR - Add notes on external ingestion docs

* GCS Composer
Pere Miquel Brull 2024-01-18 11:07:34 +01:00 committed by GitHub
parent c5c7171036
commit 05038cc99c
14 changed files with 88 additions and 26 deletions


@@ -0,0 +1,12 @@
{% note %}
This page is about running the Ingestion Framework **externally**!
There are mainly 2 ways of running the ingestion:
1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.
If you are looking for how to manage the ingestion process from OpenMetadata, you can follow
this [doc](/deployment/ingestion/openmetadata).
{% /note %}


@@ -0,0 +1,12 @@
{% note %}
This page is about running the Ingestion Framework **externally**!
There are mainly 2 ways of running the ingestion:
1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.
If you are looking for how to manage the ingestion process from OpenMetadata, you can follow
this [doc](/deployment/ingestion/openmetadata).
{% /note %}


@@ -5,8 +5,13 @@ slug: /connectors/pipeline/airflow/gcs-composer
# Extract Metadata from GCS Composer
**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!
## Requirements
This approach was last tested against:
- Composer version 2.5.4
- Airflow version 2.6.3
It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.
There are 2 main approaches we can follow here to extract metadata from GCS Composer. Both involve creating a DAG
directly in your Composer instance, but the requirements and the steps to follow differ slightly.
@@ -27,10 +32,8 @@ In any case, once the requirements are there, preparing the DAG is super straigh
In your environment you will need to install the following packages:
- - `openmetadata-ingestion==x.y.z`, (e.g., `openmetadata-ingestion==0.12.0`).
+ - `openmetadata-ingestion==x.y.z`, (e.g., `openmetadata-ingestion==1.2.4`).
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with Composer's internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
work with GCS Composer internals.
**Note:** Make sure to use the `openmetadata-ingestion` version that matches the server version
you currently have!
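
With those packages in place, the Python Operator approach boils down to a DAG that loads a workflow YAML and runs it.
Below is a minimal sketch assuming the 1.2.x `MetadataWorkflow` API; the MySQL source, service name, credentials, and
JWT token are hypothetical placeholders, not values from these docs.

```python
from datetime import datetime

import yaml
from airflow import DAG
from airflow.operators.python import PythonOperator
from metadata.workflow.metadata import MetadataWorkflow

# Hypothetical workflow recipe: swap in the YAML for your own source.
CONFIG = """
source:
  type: mysql
  serviceName: mysql_demo
  serviceConnection:
    config:
      type: Mysql
      username: openmetadata_user
      authType:
        password: my_password
      hostPort: mysql:3306
  sourceConfig:
    config:
      type: DatabaseMetadata
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://openmetadata:8585/api
    authProvider: openmetadata
    securityConfig:
      jwtToken: "<token>"
"""


def metadata_ingestion_workflow():
    # Parse the YAML recipe and drive the standard workflow lifecycle.
    workflow = MetadataWorkflow.create(yaml.safe_load(CONFIG))
    workflow.execute()
    workflow.raise_from_status()  # fail the Airflow task if the workflow failed
    workflow.print_status()
    workflow.stop()


with DAG(
    "openmetadata_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(
        task_id="ingest",
        python_callable=metadata_ingestion_workflow,
    )
```

The same pattern works for any connector: only the YAML recipe changes.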


@@ -3,6 +3,8 @@ title: Run the ingestion from your Airflow
slug: /deployment/ingestion/airflow
---
{% partial file="/v1.2/deployment/external-ingestion.md" /%}
# Run the ingestion from your Airflow
We can use Airflow in different ways:


@@ -3,10 +3,17 @@ title: Run the ingestion from GCS Composer
slug: /deployment/ingestion/gcs-composer
---
{% partial file="/v1.2/deployment/external-ingestion.md" /%}
# Run the ingestion from GCS Composer
**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!
## Requirements
This approach was last tested against:
- Composer version 2.5.4
- Airflow version 2.6.3
It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.
## Using the Python Operator
@@ -19,8 +26,6 @@ In your environment you will need to install the following packages:
- `openmetadata-ingestion[<plugins>]==x.y.z`.
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with Composer's internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
work with GCS Composer internals.
Where `x.y.z` is the version of the OpenMetadata ingestion package. Note that the version needs to match the server version. If we are using the server at 1.1.0, then the ingestion package also needs to be 1.1.0.
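
To make the version-matching rule concrete, here is a small sanity check you could run from the same environment.
The `/api/v1/system/version` endpoint path and the response shape are assumptions based on a typical OpenMetadata
deployment; verify them against your server.

```python
from importlib.metadata import version

import requests

# Version of the installed client package, e.g. "1.2.4.3".
client_version = version("openmetadata-ingestion")

# Ask the server for its version (endpoint path is an assumption; adjust if needed).
server_version = requests.get(
    "http://openmetadata:8585/api/v1/system/version"
).json()["version"]

# Only major.minor.patch need to line up: a 1.1.0 client should talk to a 1.1.0 server.
assert client_version.split(".")[:3] == server_version.split(".")[:3], (
    client_version,
    server_version,
)
```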


@@ -3,6 +3,8 @@ title: Run the ingestion from GitHub Actions
slug: /deployment/ingestion/github-actions
---
{% partial file="/v1.2/deployment/external-ingestion.md" /%}
# Run the ingestion from GitHub Actions
{% note %}


@@ -13,7 +13,11 @@ for any type of workflow that is supported in the platform: Metadata, Lineage, U
In this guide, we will present the different alternatives to run and manage your ingestion workflows. There are mainly
2 ways of running the ingestion:
1. Internally, by managing the workflows from OpenMetadata.
- 2. Externally, by using any other tool capable or running Python code.
+ 2. Externally, by using any other tool capable of running Python code.
Note that the end result is going to be the same. The only difference is that, when running the workflows
internally, OpenMetadata will dynamically generate the processes that perform the metadata extraction. If you
configure the ingestion externally, you will manage these processes directly on your platform of choice.
### Option 1 - From OpenMetadata
@@ -31,9 +35,10 @@ If you want to learn how to configure your setup to run them from OpenMetadata,
### Option 2 - Externally
If, instead, you want to manage them from any other system, you would need a bit more background:
1. How does the Ingestion Framework work?
2. Ingestion Configuration
Any tool capable of running Python code can be used to configure the metadata extraction from your sources.
In this section, we are going to give you some background on how the Ingestion Framework works, how to configure
the metadata extraction, and some examples of how to host the ingestion in different platforms.
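
As a concrete illustration, here is a minimal sketch of what "running the ingestion from Python code" means,
assuming the 1.2.x `MetadataWorkflow` API and a hypothetical `workflow.yaml` recipe:

```python
import yaml
from metadata.workflow.metadata import MetadataWorkflow


def run(config_path: str) -> None:
    # Load the YAML recipe and drive the standard workflow lifecycle.
    with open(config_path) as config_file:
        workflow_config = yaml.safe_load(config_file)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()  # surface any ingestion failures
    workflow.print_status()
    workflow.stop()


if __name__ == "__main__":
    run("workflow.yaml")  # hypothetical path to your workflow config
```

Every hosting option discussed here (Airflow, GCS Composer, MWAA, GitHub Actions) is ultimately a different way of
scheduling this same snippet.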
### 1. How does the Ingestion Framework work?


@@ -3,6 +3,8 @@ title: Run the ingestion from AWS MWAA
slug: /deployment/ingestion/mwaa
---
{% partial file="/v1.2/deployment/external-ingestion.md" /%}
# Run the ingestion from AWS MWAA
When running ingestion workflows from MWAA, we have three approaches:


@@ -5,8 +5,13 @@ slug: /connectors/pipeline/airflow/gcs-composer
# Extract Metadata from GCS Composer
**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!
## Requirements
This approach was last tested against:
- Composer version 2.5.4
- Airflow version 2.6.3
It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.
There are 2 main approaches we can follow here to extract metadata from GCS Composer. Both involve creating a DAG
directly in your Composer instance, but the requirements and the steps to follow differ slightly.
@@ -27,10 +32,8 @@ In any case, once the requirements are there, preparing the DAG is super straigh
In your environment you will need to install the following packages:
- - `openmetadata-ingestion==x.y.z`, (e.g., `openmetadata-ingestion==0.12.0`).
+ - `openmetadata-ingestion==x.y.z`, (e.g., `openmetadata-ingestion==1.2.4`).
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with Composer's internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
work with GCS Composer internals.
**Note:** Make sure to use the `openmetadata-ingestion` version that matches the server version
you currently have!
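
One alternative to the Python Operator is to isolate the ingestion in its own pod. The sketch below is
assumption-heavy: the `openmetadata/ingestion-base` image, the `config`/`pipelineType` environment variables, and
the provider import path all depend on your Composer and provider versions, so treat it as an outline rather than a
drop-in example.

```python
from datetime import datetime

from airflow import DAG
# Import path varies with the cncf.kubernetes provider version.
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

# Same kind of YAML recipe as in the Python Operator approach (placeholder here).
CONFIG = """
source: ...
sink: ...
workflowConfig: ...
"""

with DAG(
    "openmetadata_ingestion_pod",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    KubernetesPodOperator(
        task_id="ingest",
        name="ingest",
        namespace="default",  # adjust to your cluster's namespace
        # Pick the image tag matching your server version (assumption).
        image="openmetadata/ingestion-base:1.2.4",
        cmds=["python", "main.py"],
        env_vars={"config": CONFIG, "pipelineType": "metadata"},
    )
```

Running the workflow in a dedicated pod keeps the ingestion dependencies out of the Composer environment itself.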


@@ -3,6 +3,8 @@ title: Run the ingestion from your Airflow
slug: /deployment/ingestion/airflow
---
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
# Run the ingestion from your Airflow
We can use Airflow in different ways:


@@ -3,10 +3,17 @@ title: Run the ingestion from GCS Composer
slug: /deployment/ingestion/gcs-composer
---
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
# Run the ingestion from GCS Composer
**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!
## Requirements
This approach was last tested against:
- Composer version 2.5.4
- Airflow version 2.6.3
It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.
## Using the Python Operator
@@ -19,8 +26,6 @@ In your environment you will need to install the following packages:
- `openmetadata-ingestion[<plugins>]==x.y.z`.
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with Composer's internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
work with GCS Composer internals.
Where `x.y.z` is the version of the OpenMetadata ingestion package. Note that the version needs to match the server version. If we are using the server at 1.1.0, then the ingestion package also needs to be 1.1.0.


@@ -3,6 +3,8 @@ title: Run the ingestion from GitHub Actions
slug: /deployment/ingestion/github-actions
---
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
# Run the ingestion from GitHub Actions
{% note %}


@@ -13,7 +13,11 @@ for any type of workflow that is supported in the platform: Metadata, Lineage, U
In this guide, we will present the different alternatives to run and manage your ingestion workflows. There are mainly
2 ways of running the ingestion:
1. Internally, by managing the workflows from OpenMetadata.
- 2. Externally, by using any other tool capable or running Python code.
+ 2. Externally, by using any other tool capable of running Python code.
Note that the end result is going to be the same. The only difference is that, when running the workflows
internally, OpenMetadata will dynamically generate the processes that perform the metadata extraction. If you
configure the ingestion externally, you will manage these processes directly on your platform of choice.
### Option 1 - From OpenMetadata
@@ -31,9 +35,10 @@ If you want to learn how to configure your setup to run them from OpenMetadata,
### Option 2 - Externally
If, instead, you want to manage them from any other system, you would need a bit more background:
1. How does the Ingestion Framework work?
2. Ingestion Configuration
Any tool capable of running Python code can be used to configure the metadata extraction from your sources.
In this section, we are going to give you some background on how the Ingestion Framework works, how to configure
the metadata extraction, and some examples of how to host the ingestion in different platforms.
### 1. How does the Ingestion Framework work?


@@ -3,6 +3,8 @@ title: Run the ingestion from AWS MWAA
slug: /deployment/ingestion/mwaa
---
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
# Run the ingestion from AWS MWAA
When running ingestion workflows from MWAA, we have three approaches: