Mirror of https://github.com/datahub-project/datahub.git, synced 2025-06-27 05:03:31 +00:00.
feat(ingest): unbundle airflow plugin emitter dependencies (#7493)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
This commit is contained in:
parent de719663ff
commit cc0772f8d8
@@ -6,6 +6,9 @@ This file documents any backwards-incompatible changes in DataHub and assists pe
 
 ### Breaking Changes
 
 - #7016 Add `add_database_name_to_urn` flag to Oracle source which ensures that Dataset urns have the DB name as a prefix to prevent collisions (e.g. {database}.{schema}.{table}). ONLY breaking if you set this flag to true; otherwise behavior remains the same.
+- The Airflow plugin no longer includes the DataHub Kafka emitter by default. Use `pip install acryl-datahub-airflow-plugin[datahub-kafka]` for Kafka support.
+- The Airflow lineage backend no longer includes the DataHub Kafka emitter by default. Use `pip install acryl-datahub[airflow,datahub-kafka]` for Kafka support.
 
 ### Potential Downtime
@@ -26,6 +26,12 @@ If you're using Airflow 1.x, use the Airflow lineage plugin with acryl-datahub-a
 pip install acryl-datahub-airflow-plugin
 ```
 
+:::note
+
+The [DataHub Rest](../../metadata-ingestion/sink_docs/datahub.md#datahub-rest) emitter is included in the plugin package by default. To use [DataHub Kafka](../../metadata-ingestion/sink_docs/datahub.md#datahub-kafka), install `pip install acryl-datahub-airflow-plugin[datahub-kafka]`.
+
+:::
+
 2. Disable lazy plugin loading in your airflow.cfg.
    On MWAA you should add this config to your [Apache Airflow configuration options](https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-env-variables.html#configuring-2.0-airflow-override).
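Since this commit makes the Kafka emitter an optional extra, code that runs under both install variants may want to probe for it at runtime. A minimal sketch of such a check — my own illustration, not part of this commit or the plugin's API, assuming the `datahub-kafka` extra is backed by the `confluent_kafka` package:

```python
import importlib.util


def kafka_emitter_available() -> bool:
    """Return True if the optional confluent_kafka dependency is importable.

    Illustrative check only: assumes the datahub-kafka extra is backed by
    the confluent-kafka package (imported as confluent_kafka).
    """
    return importlib.util.find_spec("confluent_kafka") is not None


# Fall back to the REST emitter when Kafka support was not installed.
emitter_kind = "kafka" if kafka_emitter_available() else "rest"
print(emitter_kind)
```

This keeps the default install path working without the Kafka dependencies while letting Kafka users opt in via the extra.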
@@ -89,6 +95,8 @@ If you are looking to run Airflow and DataHub using docker locally, follow the g
 
 ```shell
 pip install acryl-datahub[airflow]
+# If you need the Kafka-based emitter/hook:
+pip install acryl-datahub[airflow,datahub-kafka]
 ```
 
 2. You must configure an Airflow hook for Datahub. We support both a Datahub REST hook and a Kafka-based hook, but you only need one.
@@ -125,5 +125,6 @@ setuptools.setup(
     install_requires=list(base_requirements),
     extras_require={
         "dev": list(dev_requirements),
+        "datahub-kafka": f"acryl-datahub[datahub-kafka] == {package_metadata['__version__']}",
     },
 )
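The new extra pins the Kafka-capable core package to the plugin's own version via an f-string. A small standalone sketch of how that pin renders, using a hypothetical version number in place of the real `package_metadata` that setup.py reads from the package:

```python
# Hypothetical stand-in for the package_metadata dict loaded by setup.py.
package_metadata = {"__version__": "0.10.0"}

# Same pattern as the diff above: the extra resolves to a version-pinned
# requirement on the core package, with its own datahub-kafka extra enabled.
extras_require = {
    "datahub-kafka": f"acryl-datahub[datahub-kafka] == {package_metadata['__version__']}",
}

print(extras_require["datahub-kafka"])
# → acryl-datahub[datahub-kafka] == 0.10.0
```

Pinning to the exact plugin version keeps the plugin and the core emitter package from drifting apart when users install the extra.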
@@ -251,7 +251,6 @@ plugins: Dict[str, Set[str]] = {
     "airflow": {
         "apache-airflow >= 2.0.2",
         *rest_common,
-        *kafka_common,
     },
     "circuit-breaker": {
         "gql>=3.3.0",
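The removal above relies on the shared-requirement-set pattern used throughout this setup.py: common dependency groups are sets that get unpacked into each plugin's requirement set. A sketch with illustrative member strings (the real groups contain more entries):

```python
from typing import Dict, Set

# Illustrative stand-ins for the shared dependency groups in setup.py.
rest_common: Set[str] = {"requests", "requests_file"}
kafka_common: Set[str] = {"confluent-kafka>=1.5.0"}

plugins: Dict[str, Set[str]] = {
    "airflow": {
        "apache-airflow >= 2.0.2",
        *rest_common,
        # *kafka_common,  # removed by this commit; use the datahub-kafka extra
    },
    "circuit-breaker": {
        "gql>=3.3.0",
    },
}

# After the removal, the airflow extra no longer drags in any Kafka requirement.
assert not any("kafka" in req for req in plugins["airflow"])
```

Unpacking sets this way deduplicates shared requirements automatically, so dropping `*kafka_common` from one extra cannot affect any other extra that still includes it.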