3975 Commits

Author SHA1 Message Date
mohdsiddique
54ea8244de
feat(ingestion): PowerBI# Improve PowerBI source ingestion (#6549)
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
2023-01-03 08:08:11 -08:00
cc
4209d6f3dd
fix(ingest/metabase): use card_id in dashboard to chart lineage (#6583)
Co-authored-by: 陈城 <cheng.chen@tenclass.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-30 17:27:09 -05:00
Harshal Sheth
e9176d2cd2
docs(ingest/looker): fix typos + update lookml github action example (#6910) 2022-12-30 20:54:43 +01:00
Harshal Sheth
b9677229a1
chore(ingest): loosen pyspark and pydeequ deps (#6908) 2022-12-30 20:53:38 +01:00
Harshal Sheth
62a2aa94f6
feat: remove jq requirement + tweak modeldocgen args (#6904)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-30 14:02:57 -05:00
Stijn De Haes
b796db1caf
fix(ingest/airflow): reorder imports to avoid cyclical dependencies (#6719)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-30 13:12:25 -05:00
Harshal Sheth
092d4c808d
fix(cli): fix delete urn cli bug + stricter type annotations (#6903) 2022-12-30 11:36:00 +01:00
Pedro Silva
594fc1bf5a
fix(cli): Make datahub quickstart work with latest docker compose in M1 (#6891)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-30 11:33:18 +01:00
Tamas Nemeth
e81a3ad26d
fix(ingest): profiling (bigquery) - Address bigquery profiling query error due to timestamp vs data mismatch (#6874) 2022-12-30 11:32:43 +01:00
Harshal Sheth
1b889022f0
test(ingest/kafka-connect): make docker setup more reliable (#6902) 2022-12-30 11:31:33 +01:00
Harshal Sheth
dfc5c6bfce
chore(ingest): remove inferred args to MCPW, part 1 (#6819) 2022-12-30 01:26:47 -05:00
Tamas Nemeth
ead0074169
deprecate(ingest): bigquery - Removing bigquery-legacy source (#6851)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-29 13:19:05 -08:00
Marvin Rösch
5167ed40ef
fix(ingest): trino - fall back to default table comment method for all Trino query errors (#6873) 2022-12-29 18:11:21 +01:00
Aseem Bansal
5755d2ca9e
fix(ingest): okta undefined variable error (#6882) 2022-12-29 20:24:22 +05:30
John Joyce
218f3c3414
refactor(docs): Correctly spell elasticsearch in docs (#6880) 2022-12-29 15:21:24 +01:00
Aseem Bansal
b8664d6630
fix(lint): pin pydantic version (#6886) 2022-12-29 19:36:14 +05:30
Harshal Sheth
667ca8632d
feat(ingest): avoid embedding serialized json in metadata files (#6742) 2022-12-28 19:28:38 -05:00
Harshal Sheth
b474315e07
fix(ingest): conditionally include env in assertion guid (#6811) 2022-12-28 11:35:20 -08:00
Mayuri Nehate
2129496c98
feat(ingest/snowflake): handle failures gracefully and raise permission failures (#6748) 2022-12-28 08:20:37 -08:00
Tamas Nemeth
25b5a12b9d
feat(ingest): bigquery/snowflake - Store last profile date in state (#6832) 2022-12-28 12:09:18 +01:00
cccs-eric
ec8a4e0eab
feat(ingest): upgrade pydantic version (#6858)
This PR also removes the requirement on docker-compose v1 and makes our tests use v2 instead.

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-27 17:06:16 -05:00
Mayuri Nehate
14b48489d4
feat(ingest): pass timeout config in kafka admin client api calls (#6863) 2022-12-27 12:45:11 -08:00
Harshal Sheth
31260888fc
feat(ingest/airflow): support raw dataset urns in airflow lineage (#6854)
* feat(ingest/airflow): support dataset Urns in airflow lineage

This PR also
- resolves a reported circular import issue
- refactors the Airflow tests to reduce duplication

* fix test
2022-12-27 08:59:26 +01:00
Mayuri Nehate
69a2347db1
feat(ingest): update profiling to fetch configurable number of sample values (#6859) 2022-12-27 08:57:26 +01:00
david-leifker
ecc01b9a46
refactor(restli-mce-consumer) (#6744)
* fix(security): commons-text in frontend

* refactor(restli): set threads based on cpu cores
feat(mce-consumers): hit local restli endpoint

* testing docker build

* Add retry configuration options for entity client

* Kafka debugging

* fix(kafka-setup): parallelize topic creation

* Adjust docker build

* Docker build updates

* WIP

* fix(lint): metadata-ingestion lint

* fix(gradle-docker): fix docker frontend dep

* fix(elastic): fix race condition between gms and mae for index creation

* Revert "fix(elastic): fix race condition between gms and mae for index creation"

This reverts commit 9629d12c3bdb3c0dab87604d409ca4c642c9c6d3.

* fix(test): fix datahub frontend test for clean/test cycle

* fix(test): datahub-frontend missing assets in test

* fix(security): set protobuf lib datahub-upgrade & mce/mae-consumer

* gitignore update

* fix(docker): remove platform on docker base image, set by buildx

* refactor(kafka-producer): update kafka producer tracking/logging

* updates per PR feedback

* Add documentation around mce standalone consumer
Kafka consumer concurrency to follow thread count for restli & sql connection pool

Co-authored-by: leifker <dleifker@gmail.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>
2022-12-26 16:09:08 +00:00
Harshal Sheth
392115b4c4
feat(ingest): add pydantic helper for removed fields (#6853) 2022-12-26 15:31:49 +05:30
Harshal Sheth
ea5ee6f761
fix(ingest/looker): handle missing label fields (#6849) 2022-12-22 19:43:44 -05:00
mohdsiddique
9daa8ed56f
feat(ingestion): Business Glossary# Add domain support in GlossaryTerm ingestion (#6829)
* lint fix

* domain in term

* domain in term

* review comments

* add todo

Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-22 17:47:57 -05:00
Harshal Sheth
1d0c7852a7
feat(ingest): add db/schema properties hook to SQL common (#6847) 2022-12-22 13:38:59 -08:00
John Joyce
4cba09e97d
fix(ingest): Fixing lint (#6844) 2022-12-22 08:33:18 -08:00
wangsaisai
0f8e2d945e
fix(ingest): kafka ingest task hang up with error bootstrap server (#6820) 2022-12-22 07:39:30 -08:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties (#6783) 2022-12-22 09:34:55 +01:00
John Joyce
2e3a25123d
refactor(ingestion): Browse Paths Upgrade V2 Feast & Sagemaker (#6002) 2022-12-21 08:02:59 -08:00
Dago Romer
9cb1eed6e7
fix(ingest): fixed snowflake oauth ingestion not using role attribute from recipe (#6825) 2022-12-21 07:52:06 -08:00
Harshal Sheth
e2b4a65a8e
refactor(ingest): clean up exception types (#6818) 2022-12-21 07:28:18 -08:00
Harshal Sheth
8972ea4b04
fix(ingest): support patches in auto_status_aspect (#6827)
Patches generate a raw MCP because MCPW doesn't support patches right now, so we need to handle that correctly downstream.
2022-12-21 10:25:24 +01:00
Tamas Nemeth
a1970d2dce
feat(ingest/bigquery): add option to enable/disable legacy sharded table support (#6822)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
2022-12-20 23:29:46 -05:00
Harshal Sheth
2c911ccf7b
refactor(ingest): clean up pipeline init error handling (#6817) 2022-12-20 19:21:28 -08:00
Harshal Sheth
88e40a9069
feat(ingest): add failure/warning counts to ingest_stats (#6823) 2022-12-20 19:13:11 -08:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn (#6795) 2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
raysaka
fcb3242983
chore(ingest): bump python package dependencies to resolve vulns (#6384)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-19 18:12:56 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Monica Senapati
5c366205f5
fix(bigquery-legacy): Fix for TypeError related failures in legacy plugin (#6806)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-19 13:28:25 -08:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes (#6791) 2022-12-19 13:40:48 -05:00
Harshal Sheth
14a00f4098
chore(ingest): pin black version (#6807) 2022-12-19 19:35:49 +01:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Harshal Sheth
54e04ba436
fix(ingest/dbt): remove unsupported usage indicator (#6805) 2022-12-19 09:34:49 -08:00
Mayuri Nehate
9716a49067
fix(ingest): correct external url for account identifier with account name (#6715) 2022-12-16 14:00:42 -05:00
Harshal Sheth
22081f5ecc
feat(ingest): lookml - add unreachable views to report (#6779) 2022-12-15 20:26:30 -08:00