780 Commits

Author SHA1 Message Date
Harshal Sheth
432feaa16d
feat(ingest): mark database_alias and env as deprecated (#6901) 2023-01-09 19:58:19 +05:30
VISHAL KUMAR
96ac4c431f
feat(ingest/vertica): support projections and lineage in vertica (#6785)
Co-authored-by: mraman2512 <MY_mramaan2512@gmail.com>
Co-authored-by: Aman.Kumar <64635307+mraman2512@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-01-06 16:20:19 -05:00
Harshal Sheth
f651646d3d
chore(ingest): remove inferred args to MCPW, part 2 (#6905) 2023-01-04 23:29:56 -05:00
Harshal Sheth
8b1dc4bbdf
fix(ingest): use branch info when cloning git repos (#6937) 2023-01-04 16:52:16 -08:00
Fredrik Sannholm
e0aa812621
feat(ingest): allow extracting snowflake tags (#6500) 2023-01-04 16:05:23 -05:00
mohdsiddique
54ea8244de
feat(ingestion): PowerBI# Improve PowerBI source ingestion (#6549)
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
2023-01-03 08:08:11 -08:00
Harshal Sheth
1b889022f0
test(ingest/kafka-connect): make docker setup more reliable (#6902) 2022-12-30 11:31:33 +01:00
Harshal Sheth
dfc5c6bfce
chore(ingest): remove inferred args to MCPW, part 1 (#6819) 2022-12-30 01:26:47 -05:00
Tamas Nemeth
ead0074169
deprecate(ingest): bigquery - Removing bigquery-legacy source (#6851)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-29 13:19:05 -08:00
Harshal Sheth
667ca8632d
feat(ingest): avoid embedding serialized json in metadata files (#6742) 2022-12-28 19:28:38 -05:00
Mayuri Nehate
2129496c98
feat(ingest/snowflake): handle failures gracefully and raise permission failures (#6748) 2022-12-28 08:20:37 -08:00
cccs-eric
ec8a4e0eab
feat(ingest): upgrade pydantic version (#6858)
This PR also removes the requirement on docker-compose v1 and makes our tests use v2 instead.

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-27 17:06:16 -05:00
Harshal Sheth
31260888fc
feat(ingest/airflow): support raw dataset urns in airflow lineage (#6854)
* feat(ingest/airflow): support dataset Urns in airflow lineage

This PR also
- resolves a reported circular import issue
- refactors the Airflow tests to reduce duplication

* fix test
2022-12-27 08:59:26 +01:00
Mayuri Nehate
69a2347db1
feat(ingest): update profiling to fetch configurable number of sample values (#6859) 2022-12-27 08:57:26 +01:00
mohdsiddique
9daa8ed56f
feat(ingestion): Business Glossary# Add domain support in GlossaryTerm ingestion (#6829)
* lint fix

* domain in term

* domain in term

* review comments

* add todo

Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-22 17:47:57 -05:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties (#6783) 2022-12-22 09:34:55 +01:00
John Joyce
2e3a25123d
refactor(ingestion): Browse Paths Upgrade V2 Feast & Sagemaker (#6002) 2022-12-21 08:02:59 -08:00
Harshal Sheth
e2b4a65a8e
refactor(ingest): clean up exception types (#6818) 2022-12-21 07:28:18 -08:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn (#6795) 2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes (#6791) 2022-12-19 13:40:48 -05:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Mayuri Nehate
9716a49067
fix(ingest): correct external url for account identifier with account name (#6715) 2022-12-16 14:00:42 -05:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state (#6762) 2022-12-15 11:33:29 -05:00
Shirshanka Das
db182e4639
fix(python-sdk): DataHubGraph get_aspect should accept empty responses (#6760) 2022-12-14 10:40:16 -08:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint (#6722) 2022-12-14 12:39:21 -05:00
Patrick Franco Braz
f0a371941e
refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase (#6739) 2022-12-14 14:58:03 +01:00
Harshal Sheth
68fd802881
fix(ingest/lookml): fix directory handling and a github_info resolution bug (#6751) 2022-12-14 14:55:38 +01:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state (#6740) 2022-12-13 10:05:57 +01:00
Harshal Sheth
7d63399d00
fix(ingest): fix serde for empty dicts in unions with null (#6745)
The code changes in https://github.com/acryldata/avro_gen/pull/16, but tests are written here.
2022-12-13 08:17:24 +01:00
Dmitry Bryazgin
551ef1b335
feat(ingest): add stateful ingestion to the ldap source (#6127)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-13 01:13:39 -05:00
Harshal Sheth
85bb1f5030
test(ingest): make hive/trino test more reliable (#6741) 2022-12-12 21:02:52 -05:00
cccs-Dustin
2cc64742e0
feat(ingest/iceberg): add stateful ingestion (#6344) 2022-12-12 13:06:03 -05:00
Mayuri Nehate
65ba13d9aa
feat(ingest): snowflake - add separate config for include_column_lineage in snowflake (#6712) 2022-12-12 15:23:12 +01:00
Harshal Sheth
fd911c9820
feat(ingest): redact configs reported in ingestion_run_summary (#6696) 2022-12-12 10:48:26 +01:00
Mayuri Nehate
5c99f20b7d
fix(ingest): mysql - fix mysql ingestion issue with non-lowercase database (#6713) 2022-12-12 10:48:01 +01:00
Harshal Sheth
b7735d5b21
fix(ingest): fix bug in auto_status_aspect (#6705)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-09 12:24:39 -05:00
Harshal Sheth
4dded454ff
fix(ingest): cleanup config extra usage (#6699) 2022-12-08 16:34:34 -08:00
Harshal Sheth
acc79d7d0d
fix(ingest/tableau): support ssl_verify flag properly (#6682) 2022-12-08 14:58:31 -08:00
Mayuri Nehate
9e3267a0ec
feat(ingest): add timestamps for snowflake objects (#6570)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-07 18:11:08 -05:00
mohdsiddique
c4dcd268a6
feat(ingest): support knowledge links in business glossary (#6375)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-07 18:09:50 -05:00
Harshal Sheth
bf307a4bcf
feat(ingest): run profiler in more cardinality cases (#6397) 2022-12-07 12:20:06 -05:00
Mayuri Nehate
eeb7a9dfe5
feat(ingest): snowflake - update snowflake docs, add simple validations (#6636) 2022-12-07 14:56:03 +01:00
Tamas Nemeth
9a1f78fc60
fix(ingest): profiling - Changing profiling defaults on Bigquery and Snowflake (#6640) 2022-12-07 10:33:10 +01:00
David Haglund
1a6677083e
fix(ingest/powerbi-report-server): deprecate unused graphql config (#6630) 2022-12-07 01:03:49 -05:00
Matthieu Blais
4e2dde84f6
feat(ingest/dbt): add support for latest DBT version 1.3 (#6651)
Co-authored-by: Matthieu Blais <matthieu.blais@tech.jago.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-06 19:03:24 -05:00
Harshal Sheth
fceef480a2
chore(ingest): remove feast-legacy (#6661) 2022-12-06 14:19:38 -08:00
Fredrik Sannholm
4dd66be654
feat(ingest/kafka-connect): support MongoSourceConnector (#6416)
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-05 16:09:58 -05:00