3975 Commits

Author SHA1 Message Date
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
798d82fe60
docs(ingest): fix error in custom tags transformer example (#6767) 2022-12-15 15:31:12 -08:00
Tamas Nemeth
b7bc1e9116
fix(ingest): bigquery - handling custom sql errors as warning (#6777) 2022-12-15 23:40:32 +01:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state (#6762) 2022-12-15 11:33:29 -05:00
Shirshanka Das
db182e4639
fix(python-sdk): DataHubGraph get_aspect should accept empty responses (#6760) 2022-12-14 10:40:16 -08:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint (#6722) 2022-12-14 12:39:21 -05:00
Patrick Franco Braz
f0a371941e
refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase (#6739) 2022-12-14 14:58:03 +01:00
Harshal Sheth
68fd802881
fix(ingest/lookml): fix directory handling and a github_info resolution bug (#6751) 2022-12-14 14:55:38 +01:00
cccs-seb
3c2982c02c
fix(ingest): support airflow mapped operators (#6738) 2022-12-13 22:31:53 -05:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state (#6740) 2022-12-13 10:05:57 +01:00
Harshal Sheth
7d63399d00
fix(ingest): fix serde for empty dicts in unions with null (#6745)
The code changes in https://github.com/acryldata/avro_gen/pull/16, but tests are written here.
2022-12-13 08:17:24 +01:00
Dmitry Bryazgin
551ef1b335
feat(ingest): add stateful ingestion to the ldap source (#6127)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-13 01:13:39 -05:00
Harshal Sheth
85bb1f5030
test(ingest): make hive/trino test more reliable (#6741) 2022-12-12 21:02:52 -05:00
Tamas Nemeth
5658fd5a54
feat(ingest): bigquery - external url support and a small profiling filter fix (#6714) 2022-12-12 16:25:32 -08:00
cccs-Dustin
2cc64742e0
feat(ingest/iceberg): add stateful ingestion (#6344) 2022-12-12 13:06:03 -05:00
Mayuri Nehate
65ba13d9aa
feat(ingest): snowflake - add separate config for include_column_lineage in snowflake (#6712) 2022-12-12 15:23:12 +01:00
Jan Hicken
d3fca44e16
fix(ingest): bigquery - rectify filter for BigQuery external tables (#6691) 2022-12-12 10:58:23 +01:00
Harshal Sheth
fd911c9820
feat(ingest): redact configs reported in ingestion_run_summary (#6696) 2022-12-12 10:48:26 +01:00
Mayuri Nehate
5c99f20b7d
fix(ingest): mysql - fix mysql ingestion issue with non-lowercase database (#6713) 2022-12-12 10:48:01 +01:00
Harshal Sheth
b7735d5b21
fix(ingest): fix bug in auto_status_aspect (#6705)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-09 12:24:39 -05:00
Harshal Sheth
c211cfbbe6
fix(ingest/sagemaker): handle missing ProcessingInputs field (#6697)
Fixes #6360.
2022-12-08 18:42:28 -08:00
Harshal Sheth
4dded454ff
fix(ingest): cleanup config extra usage (#6699) 2022-12-08 16:34:34 -08:00
Felix Lüdin
e7acc8ef30
fix(config): unify the handling of boolean environment variables (#6684) 2022-12-08 15:00:09 -08:00
Harshal Sheth
acc79d7d0d
fix(ingest/tableau): support ssl_verify flag properly (#6682) 2022-12-08 14:58:31 -08:00
Tamas Nemeth
729e486b62
feat(ingest): bigquery - option to set on behalf project (#6660) 2022-12-08 15:25:22 -05:00
orlandine
b219f0848a
docs(ingest/salesforce): list required permissions (#6610) 2022-12-08 14:50:15 -05:00
Felix Lüdin
05e18a0ae7
feat(ingest): use entry point for registering transformers (#6628) 2022-12-07 23:08:08 -05:00
Mayuri Nehate
9e3267a0ec
feat(ingest): add timestamps for snowflake objects (#6570)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-07 18:11:08 -05:00
İnanç Dokurel
996dabfcac
fix(ingestion/vertica): support columns with timestamp precision (#6295)
Co-authored-by: İnanç Dokurel <inancdokurel@users.noreply.github.com>
Fixes https://github.com/datahub-project/datahub/issues/5295
2022-12-07 18:10:37 -05:00
mohdsiddique
c4dcd268a6
feat(ingest): support knowledge links in business glossary (#6375)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-07 18:09:50 -05:00
Harshal Sheth
bf307a4bcf
feat(ingest): run profiler in more cardinality cases (#6397) 2022-12-07 12:20:06 -05:00
Mayuri Nehate
eeb7a9dfe5
feat(ingest): snowflake - update snowflake docs, add simple validations (#6636) 2022-12-07 14:56:03 +01:00
Tamas Nemeth
9a1f78fc60
fix(ingest): profiling - Changing profiling defaults on Bigquery and Snowflake (#6640) 2022-12-07 10:33:10 +01:00
David Haglund
1a6677083e
fix(ingest/powerbi-report-server): deprecate unused graphql config (#6630) 2022-12-07 01:03:49 -05:00
Matthieu Blais
4e2dde84f6
feat(ingest/dbt): add support for latest DBT version 1.3 (#6651)
Co-authored-by: Matthieu Blais <matthieu.blais@tech.jago.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-06 19:03:24 -05:00
Harshal Sheth
c8969d9ba8
fix(ingest/snowflake): support domains for snowflake schema containers (#6662) 2022-12-06 14:24:07 -08:00
Harshal Sheth
fceef480a2
chore(ingest): remove feast-legacy (#6661) 2022-12-06 14:19:38 -08:00
Harshal Sheth
f0206baa8b
fix(ingest): issue warning correctly (#6623) 2022-12-06 14:17:14 -08:00
Tamas Nemeth
2373c707b8
feat(ingest): bigquery - Running lineage extraction after metadata extraction (#6653)
* Running lineage extraction after metadata extraction
Adding table creation/alter time to the datasetproperties
Fixing bigquery permissions doc
* Disabling by default to run sql parser in a separate process
Fixing adding views to the global view list
2022-12-06 23:04:27 +01:00
Harshal Sheth
71bfa98f89
fix(ingest): fix lingering demo-data source issues (#6659) 2022-12-06 16:10:21 -05:00
Aseem Bansal
43c566ee4f
feat(ingest): add dummy data source for automated testing (#6550) 2022-12-06 16:57:12 +05:30
Fredrik Sannholm
4dd66be654
feat(ingest/kafka-connect): support MongoSourceConnector (#6416)
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-05 16:09:58 -05:00
Mayuri Nehate
e5a823e0d8
feat(ingest/snowflake): support filtering by fully qualified schema_pattern (#6611) 2022-12-05 14:27:25 -05:00
Mayuri Nehate
fdcb731e29
feat(ingest): snowflake - config variable for specifying a direct private key (#6609) 2022-12-05 19:09:08 +05:30
david-leifker
2de9d3d5bf
fix(logging): Remove lombok as source of slf4j-api, convert to compileOnly where possible (#6616) 2022-12-04 19:57:47 -08:00
djordje-mijatovic
99e6f3a87c
feat(ingest): print detailed GMS error messages (#6519)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-02 18:20:09 -05:00
Harshal Sheth
a1e62c723e
docs(ingest): add airflow docs that use the PythonVirtualenvOperator (#6604) 2022-12-02 19:56:17 +01:00
Harshal Sheth
71466aab36
fix(ingest): only require github_info for lookml and not looker (#6608) 2022-12-02 19:54:24 +01:00
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00
Mayuri Nehate
1689212434
feat(ingest): add external url for snowflake objects (#6580) 2022-12-02 13:38:46 -05:00