Mayuri Nehate
14b48489d4
feat(ingest): pass timeout config in kafka admin client api calls ( #6863 )
2022-12-27 12:45:11 -08:00
Harshal Sheth
31260888fc
feat(ingest/airflow): support raw dataset urns in airflow lineage ( #6854 )
...
* feat(ingest/airflow): support dataset Urns in airflow lineage
This PR also
- resolves a reported circular import issue
- refactors the Airflow tests to reduce duplication
* fix test
2022-12-27 08:59:26 +01:00
Mayuri Nehate
69a2347db1
feat(ingest): update profiling to fetch configurable number of sample values ( #6859 )
2022-12-27 08:57:26 +01:00
david-leifker
ecc01b9a46
refactor(restli-mce-consumer) ( #6744 )
...
* fix(security): commons-text in frontend
* refactor(restli): set threads based on cpu cores
feat(mce-consumers): hit local restli endpoint
* testing docker build
* Add retry configuration options for entity client
* Kafka debugging
* fix(kafka-setup): parallelize topic creation
* Adjust docker build
* Docker build updates
* WIP
* fix(lint): metadata-ingestion lint
* fix(gradle-docker): fix docker frontend dep
* fix(elastic): fix race condition between gms and mae for index creation
* Revert "fix(elastic): fix race condition between gms and mae for index creation"
This reverts commit 9629d12c3bdb3c0dab87604d409ca4c642c9c6d3.
* fix(test): fix datahub frontend test for clean/test cycle
* fix(test): datahub-frontend missing assets in test
* fix(security): set protobuf lib datahub-upgrade & mce/mae-consumer
* gitignore update
* fix(docker): remove platform on docker base image, set by buildx
* refactor(kafka-producer): update kafka producer tracking/logging
* updates per PR feedback
* Add documentation around mce standalone consumer
Kafka consumer concurrency to follow thread count for restli & sql connection pool
Co-authored-by: leifker <dleifker@gmail.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>
2022-12-26 16:09:08 +00:00
Harshal Sheth
392115b4c4
feat(ingest): add pydantic helper for removed fields ( #6853 )
2022-12-26 15:31:49 +05:30
Harshal Sheth
ea5ee6f761
fix(ingest/looker): handle missing label fields ( #6849 )
2022-12-22 19:43:44 -05:00
mohdsiddique
9daa8ed56f
feat(ingestion): Business Glossary# Add domain support in GlossaryTerm ingestion ( #6829 )
...
* lint fix
* domain in term
* domain in term
* review comments
* add todo
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-22 17:47:57 -05:00
Harshal Sheth
1d0c7852a7
feat(ingest): add db/schema properties hook to SQL common ( #6847 )
2022-12-22 13:38:59 -08:00
John Joyce
4cba09e97d
fix(ingest): Fixing lint ( #6844 )
2022-12-22 08:33:18 -08:00
wangsaisai
0f8e2d945e
fix(ingest): kafka ingest task hand up with error bootstrap server ( #6820 )
2022-12-22 07:39:30 -08:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties ( #6783 )
2022-12-22 09:34:55 +01:00
John Joyce
2e3a25123d
refactor(ingestion): Browse Paths Upgrade V2 Feast & Sagemaker ( #6002 )
2022-12-21 08:02:59 -08:00
Dago Romer
9cb1eed6e7
fix(ingest): fixed snowflake oauth ingestion not using role attribute from recipe ( #6825 )
2022-12-21 07:52:06 -08:00
Harshal Sheth
e2b4a65a8e
refactor(ingest): clean up exception types ( #6818 )
2022-12-21 07:28:18 -08:00
Harshal Sheth
8972ea4b04
fix(ingest): support patches in auto_status_aspect ( #6827 )
...
Patches generate a raw MCP because MCPW doesn't support patches right now, so we need to handle that correctly downstream.
2022-12-21 10:25:24 +01:00
Tamas Nemeth
a1970d2dce
feat(ingest/bigquery): add option to enable/disable legacy sharded table support ( #6822 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
2022-12-20 23:29:46 -05:00
Harshal Sheth
2c911ccf7b
refactor(ingest): clean up pipeline init error handling ( #6817 )
2022-12-20 19:21:28 -08:00
Harshal Sheth
88e40a9069
feat(ingest): add failure/warning counts to ingest_stats ( #6823 )
2022-12-20 19:13:11 -08:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn ( #6795 )
2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method ( #6794 )
2022-12-19 20:48:22 -05:00
raysaka
fcb3242983
chore(ingest): bump python package dependencies to resolve vulns ( #6384 )
...
Co-authored-by: John Joyce <john@acryl.io>
2022-12-19 18:12:56 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider ( #6792 )
2022-12-19 17:03:54 -05:00
Monica Senapati
5c366205f5
fix(bigquery-legacy): Fix for TypeError related failures in legacy plugin ( #6806 )
...
Co-authored-by: John Joyce <john@acryl.io>
2022-12-19 13:28:25 -08:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes ( #6791 )
2022-12-19 13:40:48 -05:00
Harshal Sheth
14a00f4098
chore(ingest): pin black version ( #6807 )
2022-12-19 19:35:49 +01:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements ( #6789 )
2022-12-19 18:57:37 +01:00
Harshal Sheth
54e04ba436
fix(ingest/dbt): remove unsupported usage indicator ( #6805 )
2022-12-19 09:34:49 -08:00
Mayuri Nehate
9716a49067
fix(ingest): correct external url for account identifier with account name ( #6715 )
2022-12-16 14:00:42 -05:00
Harshal Sheth
22081f5ecc
feat(ingest): lookml - add unreachable views to report ( #6779 )
2022-12-15 20:26:30 -08:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command ( #6763 )
2022-12-15 18:55:36 -05:00
Harshal Sheth
798d82fe60
docs(ingest): fix error in custom tags transformer example ( #6767 )
2022-12-15 15:31:12 -08:00
Tamas Nemeth
b7bc1e9116
fix(ingest): bigquery - handling custom sql errors as warning ( #6777 )
2022-12-15 23:40:32 +01:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state ( #6762 )
2022-12-15 11:33:29 -05:00
Shirshanka Das
db182e4639
fix(python-sdk): DataHubGraph get_aspect should accept empty responses ( #6760 )
2022-12-14 10:40:16 -08:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint ( #6722 )
2022-12-14 12:39:21 -05:00
Patrick Franco Braz
f0a371941e
refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase ( #6739 )
2022-12-14 14:58:03 +01:00
Harshal Sheth
68fd802881
fix(ingest/lookml): fix directory handling and a github_info resolution bug ( #6751 )
2022-12-14 14:55:38 +01:00
cccs-seb
3c2982c02c
fix(ingest): support airflow mapped operators ( #6738 )
2022-12-13 22:31:53 -05:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state ( #6740 )
2022-12-13 10:05:57 +01:00
Harshal Sheth
7d63399d00
fix(ingest): fix serde for empty dicts in unions with null ( #6745 )
...
The code changes in https://github.com/acryldata/avro_gen/pull/16 , but tests are written here.
2022-12-13 08:17:24 +01:00
Dmitry Bryazgin
551ef1b335
feat(ingest): add stateful ingestion to the ldap source ( #6127 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-13 01:13:39 -05:00
Harshal Sheth
85bb1f5030
test(ingest): make hive/trino test more reliable ( #6741 )
2022-12-12 21:02:52 -05:00
Tamas Nemeth
5658fd5a54
feat(ingest): bigquery - external url support and a small profiling filter fix ( #6714 )
2022-12-12 16:25:32 -08:00
cccs-Dustin
2cc64742e0
feat(ingest/iceberg): add stateful ingestion ( #6344 )
2022-12-12 13:06:03 -05:00
Mayuri Nehate
65ba13d9aa
feat(ingest): snowflake - add separate config for include_column_lineage in snowflake ( #6712 )
2022-12-12 15:23:12 +01:00
Jan Hicken
d3fca44e16
fix(ingest): bigquery - rectify filter for BigQuery external tables ( #6691 )
2022-12-12 10:58:23 +01:00
Harshal Sheth
fd911c9820
feat(ingest): redact configs reported in ingestion_run_summary ( #6696 )
2022-12-12 10:48:26 +01:00
Mayuri Nehate
5c99f20b7d
fix(ingest): mysql - fix mysql ingestion issue with non-lowercase database ( #6713 )
2022-12-12 10:48:01 +01:00
Harshal Sheth
b7735d5b21
fix(ingest): fix bug in auto_status_aspect ( #6705 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-09 12:24:39 -05:00
Harshal Sheth
c211cfbbe6
fix(ingest/sagemaker): handle missing ProcessingInputs field ( #6697 )
...
Fixes #6360 .
2022-12-08 18:42:28 -08:00