Mirror of https://github.com/open-metadata/OpenMetadata.git, synced 2025-12-24 14:08:45 +00:00
MINOR - Add notes on external ingestion docs (#14756)

* MINOR - Add notes on external ingestion docs
* GCS Composer

Parent: c5c7171036
Commit: 05038cc99c
@ -0,0 +1,12 @@
{% note %}

This page is about running the Ingestion Framework **externally**!

There are mainly 2 ways of running the ingestion:

1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.

If you are looking for how to manage the ingestion process from OpenMetadata, you can follow
this [doc](/deployment/ingestion/openmetadata).

{% /note %}
@ -0,0 +1,12 @@
{% note %}

This page is about running the Ingestion Framework **externally**!

There are mainly 2 ways of running the ingestion:

1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.

If you are looking for how to manage the ingestion process from OpenMetadata, you can follow
this [doc](/deployment/ingestion/openmetadata).

{% /note %}
@ -5,8 +5,13 @@ slug: /connectors/pipeline/airflow/gcs-composer

# Extract Metadata from GCS Composer

**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!

## Requirements

This approach has been last tested against:

- Composer version 2.5.4
- Airflow version 2.6.3

It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.

There are 2 main approaches we can follow here to extract metadata from GCS. Both of them involve creating a DAG
directly in your Composer instance, but the requirements and the steps to follow are going to be slightly different.
@ -27,10 +32,8 @@ In any case, once the requirements are there, preparing the DAG is super straigh

In your environment you will need to install the following packages:

- `openmetadata-ingestion==x.y.z` (e.g., `openmetadata-ingestion==1.2.4`).
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with the Composer internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
  work with GCS Composer internals.

**Note:** Make sure to use the `openmetadata-ingestion` version that matches the server version
you currently have!
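In Composer, these packages are typically added through the environment's PyPI packages list. As a requirements-style sketch — the `[mysql]` plugin and the exact ingestion version are illustrative assumptions; pick the plugins for your sources and the version matching your server:

```text
openmetadata-ingestion[mysql]==1.2.4   # example plugin and version; match your server
sqlalchemy==1.4.27
flask-appbuilder==3.4.5
```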
@ -3,6 +3,8 @@ title: Run the ingestion from your Airflow
slug: /deployment/ingestion/airflow
---

{% partial file="/v1.2/deployment/external-ingestion.md" /%}

# Run the ingestion from your Airflow

We can use Airflow in different ways:
@ -3,10 +3,17 @@ title: Run the ingestion from GCS Composer
slug: /deployment/ingestion/gcs-composer
---

{% partial file="/v1.2/deployment/external-ingestion.md" /%}

# Run the ingestion from GCS Composer

**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!

## Requirements

This approach has been last tested against:

- Composer version 2.5.4
- Airflow version 2.6.3

It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.

## Using the Python Operator
@ -19,8 +26,6 @@ In your environment you will need to install the following packages:

- `openmetadata-ingestion[<plugins>]==x.y.z`.
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with the Composer internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
  work with GCS Composer internals.

Where `x.y.z` is the version of the OpenMetadata ingestion package. Note that the version needs to match the server version: if we are using the server at 1.1.0, then the ingestion package needs to also be 1.1.0.
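The version-match rule above can be sketched as a small check. This is a hypothetical helper, not part of `openmetadata-ingestion`, assuming versions are plain dotted strings where the ingestion package may carry an extra build digit:

```python
def versions_match(server_version: str, ingestion_version: str) -> bool:
    """Return True when the x.y.z components of both versions agree.

    The ingestion package may carry an extra digit (e.g. 1.2.4.3)
    that does not need to match the server version (1.2.4).
    """
    return server_version.split(".")[:3] == ingestion_version.split(".")[:3]

print(versions_match("1.1.0", "1.1.0"))    # same version: OK
print(versions_match("1.2.4", "1.2.4.3"))  # extra ingestion digit: still OK
print(versions_match("1.1.0", "1.2.4"))    # mismatch: not OK
```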
@ -3,6 +3,8 @@ title: Run the ingestion from GitHub Actions
slug: /deployment/ingestion/github-actions
---

{% partial file="/v1.2/deployment/external-ingestion.md" /%}

# Run the ingestion from GitHub Actions

{% note %}
@ -13,7 +13,11 @@ for any type of workflow that is supported in the platform: Metadata, Lineage, U

In this guide, we will present the different alternatives to run and manage your ingestion workflows. There are mainly
2 ways of running the ingestion:

1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.

Note that the end result is going to be the same. The only difference is that when running the workflows internally,
OpenMetadata will dynamically generate the processes that will perform the metadata extraction. If configuring
the ingestion externally, you will be managing these processes directly on your platform of choice.

### Option 1 - From OpenMetadata
@ -31,9 +35,10 @@ If you want to learn how to configure your setup to run them from OpenMetadata,

### Option 2 - Externally

Any tool capable of running Python code can be used to configure the metadata extraction from your sources.

In this section, we are going to give you some background on how the Ingestion Framework works, how to configure
the metadata extraction, and some examples on how to host the ingestion in different platforms.

### 1. How does the Ingestion Framework work?
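Wherever the workflow runs, the extraction is driven by a YAML (or JSON) workflow configuration with a `source`, a `sink`, and a `workflowConfig` block. As a rough sketch — the MySQL source, service name, and host are placeholder assumptions, not a definitive template:

```yaml
source:
  type: mysql                     # placeholder source type
  serviceName: my_mysql_service   # assumed service name
  serviceConnection:
    config:
      type: Mysql
      hostPort: mysql-host:3306   # placeholder host
  sourceConfig:
    config:
      type: DatabaseMetadata
sink:
  type: metadata-rest             # send the extracted metadata to the OpenMetadata API
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: openmetadata
    # security/auth settings omitted; configure them for your deployment
```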
@ -3,6 +3,8 @@ title: Run the ingestion from AWS MWAA
slug: /deployment/ingestion/mwaa
---

{% partial file="/v1.2/deployment/external-ingestion.md" /%}

# Run the ingestion from AWS MWAA

When running ingestion workflows from MWAA we have three approaches:
@ -5,8 +5,13 @@ slug: /connectors/pipeline/airflow/gcs-composer

# Extract Metadata from GCS Composer

**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5. If you have any issues or questions,
please do not hesitate to reach out!

## Requirements

This approach has been last tested against:

- Composer version 2.5.4
- Airflow version 2.6.3

It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.

There are 2 main approaches we can follow here to extract metadata from GCS. Both of them involve creating a DAG
directly in your Composer instance, but the requirements and the steps to follow are going to be slightly different.
@ -27,10 +32,8 @@ In any case, once the requirements are there, preparing the DAG is super straigh

In your environment you will need to install the following packages:

- `openmetadata-ingestion==x.y.z` (e.g., `openmetadata-ingestion==1.2.4`).
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with the Composer internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
  work with GCS Composer internals.

**Note:** Make sure to use the `openmetadata-ingestion` version that matches the server version
you currently have!
||||
@ -3,6 +3,8 @@ title: Run the ingestion from your Airflow
|
||||
slug: /deployment/ingestion/airflow
|
||||
---
|
||||
|
||||
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
|
||||
|
||||
# Run the ingestion from your Airflow
|
||||
|
||||
We can use Airflow in different ways:
|
||||
|
||||
@ -3,10 +3,17 @@ title: Run the ingestion from GCS Composer
|
||||
slug: /deployment/ingestion/gcs-composer
|
||||
---
|
||||
|
||||
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
|
||||
|
||||
# Run the ingestion from GCS Composer
|
||||
|
||||
**Note:** This approach has been tested against Airflow 2.1.4 & 2.2.5 If you have any issues or questions,
|
||||
please do not hesitate to reach out!
|
||||
## Requirements
|
||||
|
||||
This approach has been last tested against:
|
||||
- Composer version 2.5.4
|
||||
- Airflow version 2.6.3
|
||||
|
||||
It also requires the ingestion package to be at least `openmetadata-ingestion==1.2.4.3`.
|
||||
|
||||
## Using the Python Operator
|
||||
|
||||
@ -19,8 +26,6 @@ In your environment you will need to install the following packages:

- `openmetadata-ingestion[<plugins>]==x.y.z`.
- `sqlalchemy==1.4.27`: This is needed to align the OpenMetadata version with the Composer internal requirements.
- `flask-appbuilder==3.4.5`: Again, this is just an alignment of versions so that `openmetadata-ingestion` can
  work with GCS Composer internals.

Where `x.y.z` is the version of the OpenMetadata ingestion package. Note that the version needs to match the server version: if we are using the server at 1.1.0, then the ingestion package needs to also be 1.1.0.
||||
@ -3,6 +3,8 @@ title: Run the ingestion from GitHub Actions
|
||||
slug: /deployment/ingestion/github-actions
|
||||
---
|
||||
|
||||
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
|
||||
|
||||
# Run the ingestion from GitHub Actions
|
||||
|
||||
{% note %}
|
||||
|
||||
@ -13,7 +13,11 @@ for any type of workflow that is supported in the platform: Metadata, Lineage, U

In this guide, we will present the different alternatives to run and manage your ingestion workflows. There are mainly
2 ways of running the ingestion:

1. Internally, by managing the workflows from OpenMetadata.
2. Externally, by using any other tool capable of running Python code.

Note that the end result is going to be the same. The only difference is that when running the workflows internally,
OpenMetadata will dynamically generate the processes that will perform the metadata extraction. If configuring
the ingestion externally, you will be managing these processes directly on your platform of choice.

### Option 1 - From OpenMetadata
||||
@ -31,9 +35,10 @@ If you want to learn how to configure your setup to run them from OpenMetadata,
|
||||
|
||||
### Option 2 - Externally
|
||||
|
||||
If, instead, you want to manage them from any other system, you would need a bit more background:
|
||||
1. How does the Ingestion Framework work?
|
||||
2. Ingestion Configuration
|
||||
Any tool capable of running Python code can be used to configure the metadata extraction from your sources.
|
||||
|
||||
In this section, we are going to give you some background on how the Ingestion Framework works, how to configure
|
||||
the metadata extraction, and some examples on how to host the ingestion in different platforms.
|
||||
|
||||
### 1. How does the Ingestion Framework work?
|
||||
|
||||
|
||||
@ -3,6 +3,8 @@ title: Run the ingestion from AWS MWAA
|
||||
slug: /deployment/ingestion/mwaa
|
||||
---
|
||||
|
||||
{% partial file="/v1.3/deployment/external-ingestion.md" /%}
|
||||
|
||||
# Run the ingestion from AWS MWAA
|
||||
|
||||
When running ingestion workflows from MWAA we have three approaches:
|
||||
|