Fix Airflow docs (#12009)

Pere Miquel Brull 2023-06-21 08:36:06 +02:00 committed by GitHub
parent 35cca0e178
commit 7f39cc105f


@ -9,25 +9,27 @@ We support different approaches to extracting metadata from Airflow:
2. **Airflow Lineage Backend**: Configured in your Airflow instance. You can read more about the Lineage Backend [here](https://docs.open-metadata.org/connectors/pipeline/airflow/lineage-backend).
3. **Airflow Lineage Operator**: To send metadata directly from your Airflow DAGs. You can read more about the Lineage Operator [here](https://docs.open-metadata.org/connectors/pipeline/airflow/lineage-operator).
You can find further information on the Kafka connector in the [docs](https://docs.open-metadata.org/connectors/pipeline/airflow).
From the OpenMetadata UI, only strategy number 1 is available.
You can find further information on the Airflow connector in the [docs](https://docs.open-metadata.org/connectors/pipeline/airflow).
## Connection Details
$$section
### Host and Port $(id="hostPort")
### Host and Port
Pipeline Service Management URI. This should be specified as a URI string in the format `scheme://hostname:port`. E.g., `http://localhost:8080`, `http://host.docker.internal:8080`.
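
As a rough illustration of the `scheme://hostname:port` format described above, the following sketch checks a value before it is entered in the form. The helper name `parse_host_port` is hypothetical and not part of OpenMetadata; it only relies on the Python standard library.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_host_port(uri: str) -> Tuple[str, str, int]:
    """Split a `scheme://hostname:port` string into (scheme, hostname, port).

    Hypothetical helper for illustration only; not an OpenMetadata API.
    """
    parts = urlsplit(uri)
    # A value such as `localhost:8080` (no scheme) yields no hostname here.
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"expected scheme://hostname:port, got {uri!r}")
    # `parts.port` itself raises ValueError if the port is not a valid number.
    if parts.port is None:
        raise ValueError(f"missing port in {uri!r}")
    return parts.scheme, parts.hostname, parts.port


print(parse_host_port("http://localhost:8080"))
```

A value without a scheme, such as `localhost:8080`, is rejected by this check, which matches the format the field asks for.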
$$
$$section
### Number Of Status $(id="numberOfStatus")
### Number Of Status
Number of past task statuses to read every time the ingestion runs. By default, we will pick up and update the last 10 runs.
$$
$$section
### Metadata Database Connection $(id="connection")
### Metadata Database Connection
Select your underlying database connection. We support the [official](https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html) backends from Airflow.
@ -35,14 +37,76 @@ Note that the **Backend Connection** is only used to extract metadata from a DAG
$$
$$section
### Connection Options $(id="connectionOptions")
Additional connection options to build the URL that can be sent to the service during the connection.
---
$$
## MySQL Connection
$$section
### Connection Arguments $(id="connectionArguments")
Additional connection arguments such as security or protocol configs that can be sent to the service during connection.
If your Airflow is backed by a MySQL database, then you will need to fill in these details:
$$
### Username & Password
Credentials with permissions to connect to the database. Read-only permissions are required.
### Host and Port
Host and port of the MySQL service. This should be specified as a string in the format `hostname:port`. E.g., `localhost:3306`, `host.docker.internal:3306`.
### Database Schema
MySQL schema that contains the Airflow tables.
### SSL CA $(id="sslCA")
Provide the path to the SSL CA file, which must be available locally to the ingestion process.
### SSL Certificate $(id="sslCert")
Provide the path to the SSL client certificate file (`ssl_cert`).
### SSL Key $(id="sslKey")
Provide the path to the SSL key file (`ssl_key`).
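
Since the SSL files must be reachable from the ingestion process, a quick local check can catch a wrong path early. The sketch below is an assumption-laden example: the helper `build_ssl_args` is hypothetical, and the `ssl_ca` key name is assumed by analogy with the `ssl_cert` and `ssl_key` names mentioned above.

```python
from pathlib import Path
from typing import Dict, Optional


def build_ssl_args(
    ssl_ca: Optional[str] = None,
    ssl_cert: Optional[str] = None,
    ssl_key: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the SSL arguments whose files exist on this machine.

    Hypothetical helper; the `ssl_ca` key name is an assumption.
    """
    candidates = {"ssl_ca": ssl_ca, "ssl_cert": ssl_cert, "ssl_key": ssl_key}
    args: Dict[str, str] = {}
    for name, path in candidates.items():
        if path is None:
            continue  # the option was not provided, so there is nothing to check
        if not Path(path).is_file():
            raise FileNotFoundError(f"{name} is not a local file: {path}")
        args[name] = str(Path(path).resolve())
    return args
```

Running this on the same host or container that executes the ingestion is what makes the check meaningful; a path that exists on a laptop may not exist there.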
---
## Postgres Connection
If your Airflow is backed by a Postgres database, then you will need to fill in these details:
### Username & Password
Credentials with permissions to connect to the database. Read-only permissions are required.
### Host and Port
Host and port of the Postgres service. E.g., `localhost:5432` or `host.docker.internal:5432`.
### Database
Postgres database that contains the Airflow tables.
### SSL Mode $(id="sslMode")
SSL mode used to connect to the Postgres database, e.g., `prefer` or `verify-ca`.
You can ignore the rest of the properties, since we won't ingest any database nor policy tags.
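
For the SSL Mode field, the sketch below validates a value against the standard libpq `sslmode` settings. It assumes the field accepts exactly those values, which this page does not state; `normalize_ssl_mode` is a hypothetical helper name.

```python
# Standard libpq sslmode values; assumed (not confirmed here) to be what the field accepts.
LIBPQ_SSL_MODES = ("disable", "allow", "prefer", "require", "verify-ca", "verify-full")


def normalize_ssl_mode(value: str) -> str:
    """Return the lower-cased SSL mode, or raise if it is not a libpq mode."""
    mode = value.strip().lower()
    if mode not in LIBPQ_SSL_MODES:
        raise ValueError(
            f"unknown SSL mode {value!r}; expected one of {', '.join(LIBPQ_SSL_MODES)}"
        )
    return mode
```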
---
## MSSQL Connection
If your Airflow is backed by an MSSQL database, then you will need to fill in these details:
### Username & Password
Credentials with permissions to connect to the database. Read-only permissions are required.
### Host and Port
Host and port of the MSSQL service. E.g., `localhost:1433` or `host.docker.internal:1433`.
### Database
MSSQL database that contains the Airflow tables.
### URI String $(id="uriString")
Connection URI string used to connect to MSSQL. It only works with the `pyodbc` scheme. E.g., `DRIVER={ODBC Driver 17 for SQL Server};SERVER=server_name;DATABASE=db_name;UID=user_name;PWD=password`.
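
An ODBC string like the one above contains characters (`{`, `;`, spaces) that are not URL-safe. One common way to pass such a string to SQLAlchemy is to URL-encode it into the `odbc_connect` query parameter of a `mssql+pyodbc` URL. The sketch below shows that encoding with the standard library only; the function names are hypothetical, and whether the URI String field expects the raw string or the encoded URL is not stated here.

```python
from urllib.parse import quote_plus, unquote_plus

PREFIX = "mssql+pyodbc:///?odbc_connect="


def build_mssql_pyodbc_url(odbc_connection_string: str) -> str:
    """URL-encode a raw ODBC connection string into a mssql+pyodbc URL."""
    return PREFIX + quote_plus(odbc_connection_string)


def extract_odbc_string(url: str) -> str:
    """Inverse of build_mssql_pyodbc_url, handy for checking the encoding."""
    if not url.startswith(PREFIX):
        raise ValueError("not a mssql+pyodbc odbc_connect URL")
    return unquote_plus(url[len(PREFIX):])


raw = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=server_name;DATABASE=db_name;UID=user_name;PWD=password"
)
url = build_mssql_pyodbc_url(raw)
print(url)
```

The placeholder values (`server_name`, `db_name`, `user_name`, `password`) are taken from the example above and need to be replaced with real ones.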