935 Commits

Author SHA1 Message Date
Alexey Kravtsov
3c3ab64954
feat(ingest): implement compression for CheckpointState (#6007) 2022-09-26 10:18:42 -07:00
Harshal Sheth
27f28019de
refactor(ingest): move common host_port validation (#6009) 2022-09-22 16:32:07 -07:00
Ravindra Lanka
b8941ab190
feat(ingestion): Add fail-safe stale entity removal via configurable 'fail_safe_threshold' param. (#6027) 2022-09-22 16:09:22 -07:00
Harshal Sheth
68db859ca1
refactor(ingest): streamline two-tier db config validation (#5986) 2022-09-21 10:45:37 -07:00
Mayuri Nehate
b195b6c123
fix(ingest): encode reserved characters when creating dataset urn (#5977)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-09-20 16:59:02 -07:00
Harshal Sheth
937ab192c0
feat(ingest): add support for aliases in plugin registry (#5958) 2022-09-16 07:19:32 -07:00
skrydal
a026c84691
feat: qualifiedName support + populating glue ARN (#5952) 2022-09-15 21:15:03 -07:00
skrydal
f61a040555
feat(ingestion) Add more info to glue entities (#5874)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-09-14 12:25:09 -07:00
Harshal Sheth
e23523a781
fix(ingest): fix type annotations on some pydantic fields (#5795) 2022-09-14 11:05:31 -07:00
Harshal Sheth
a1e1d2fd0a
feat(ingest): add ConfigEnum type (#5734) 2022-09-14 09:57:42 -07:00
Ravindra Lanka
ee68f09624
feat(ingestion): Refactor standard state-handling tasks that are common across all stateful ingestion sources into a common handler. (#5766) 2022-09-14 09:30:42 -07:00
Mayuri Nehate
aedf1522fb
feat(ingest): snowflake-beta - minor changes, tests (#5910) 2022-09-12 10:42:52 -07:00
Harshal Sheth
e556bcb306
feat(ingest): add entity type inference to mcpw (#5880) 2022-09-10 20:36:10 -07:00
Harshal Sheth
220ae0b6c9
feat(ingest): make sink use type annotations (#5899) 2022-09-10 19:46:20 -07:00
Shirshanka Das
056add128d
fix(ingest): datahub-api - move instantiation to the right config class (#5878) 2022-09-09 13:34:21 -07:00
Harshal Sheth
6063484714
fix(ingest): avrogen handling for missing fields with default values (#5844) 2022-09-08 14:05:28 -07:00
Harshal Sheth
08622f25ef
feat(ingest): add utility for converting MCEs to MCPs (#5812) 2022-09-06 15:25:48 -07:00
mohdsiddique
2f65e2f226
feat(transformers): Add semantics & transform_aspect support in transformers (#5514)
Co-authored-by: MohdSiddique Bagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-09-06 14:44:14 -07:00
Shirshanka Das
9afda47085
feat(cli): add support for sampled reporting to keep logs manageable (#5800) 2022-09-01 14:47:28 -07:00
Harshal Sheth
c05f3970fd
feat(ingest): cli - add rewrite option for metadata file check (#5763) 2022-09-01 14:30:00 -07:00
Tamas Nemeth
4572c96d60
feat(ingestion): bigquery - Bigquery beta connector - first cut (#5663) 2022-08-30 07:33:24 +02:00
Harshal Sheth
eb87db9813
fix(ingest): proper null skip logic in serialization (#5749) 2022-08-29 16:34:58 -07:00
Ravindra Lanka
b23195d3df
Fix sqllineage parser to handle special tokens with hyphens in the table and column names. (#5748) 2022-08-26 18:11:00 -07:00
liyuhui666
08f5a44df0
feat(elasticsearch): Add nested type display (#5524)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ravindra Lanka <rslanka@gmail.com>
2022-08-26 09:07:03 -07:00
David Haglund
b830247727
fix(superset): do not crash when display_uri is not set (#5711) 2022-08-24 23:26:02 -07:00
Shirshanka Das
bb788ac317
feat(ingest): file - add support for folders, large files, improve co… (#5692) 2022-08-21 14:18:22 +05:30
Ravindra Lanka
228f3b50ea
feat(ingestion): send reports of ingestion runs to datahub (#5639)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-19 09:08:17 -07:00
Kwanyoung Son
9143663f1f
fix(ingest): redash - fix redash dashboard url bug (#5500)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-16 13:39:27 -07:00
Mayuri Nehate
dc08bedd6e
feat(ingest): snowflake - add snowflake-beta connector (#5517) 2022-08-15 20:54:02 -07:00
Amanda Hernando
337087cac0
feat(ingest): glue - add stateful ingestion (#5553) 2022-08-15 20:50:45 -07:00
Harshal Sheth
355c129c7c
chore(ingest): drop python 3.6 support (#5521) 2022-08-10 15:00:31 -07:00
liyuhui666
0481075705
fix(ingest): Fix ingest Clickhouse without password (#5511)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-08-09 10:30:56 -07:00
Jordan Wolinsky
33339e2c89
Expose catalog_name in athena.py (#5548)
* expose catalog_name to the SQLAlchemy URI that is passed into pyathena

Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-08 11:40:05 -07:00
Harshal Sheth
9790f3cefa
feat(ingest): infer aspectName from aspect type in MCP (#5566) 2022-08-07 07:52:58 -07:00
Piotr Sierkin
828a711684
feat(ingest): dbt - control over emitting test_results, test_definitions, etc. (#5328)
Co-authored-by: Piotr Sierkin <piotr.sierkin@getindata.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-06 21:42:53 -07:00
Harshal Sheth
64e7da8a68
fix(ingest): use temp dir for file generated during test (#5505) 2022-07-27 14:29:11 -07:00
Mayuri Nehate
04de6c27b7
feat(ingest): snowflake - test_connection add support for capability report (#5472)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-07-26 09:29:57 -07:00
Shirshanka Das
7ed9cd2838
feat(ingest): snowflake - basic test connection capability (#5464) 2022-07-22 09:14:37 +02:00
Shirshanka Das
14d764a26f
fix(ingest): fix serialization of report to handle nesting (#5455) 2022-07-20 18:25:07 -07:00
Aseem Bansal
acb9879eb4
feat(cli,build): remove deprecated variables GMS_HOST/_PORT (#5451) 2022-07-20 20:54:43 +05:30
Mugdha Hardikar
ced6c38239
fix(ingest): bigquery-usage - fix dataset name for sharded table (#5412) 2022-07-19 20:59:02 -07:00
Mugdha Hardikar
a6dc669891
docs(bigquery): add changelog and unittest for profiling limits (#5407) 2022-07-19 09:39:09 +05:30
Pedro Silva
b2edd44b6a
Adds support for Domains in CSV source (#5372) 2022-07-15 14:20:41 +05:30
Felix Lüdin
a0303448ba
feat(dashboards): add datasets field to DashboardInfo aspect (#5188)
Co-authored-by: John Joyce <john@acryl.io>
2022-07-14 09:54:02 -07:00
Mugdha Hardikar
94dd3ad5a1
fix(ingest): bigquery-usage - dataset name for sharded tables (#5347)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-07-07 00:37:13 -07:00
Aditya Radhakrishnan
fc8e59387d
feat(ingest): update CSV source to support description and ownership type (#5346) 2022-07-06 21:29:29 +05:30
Shirshanka Das
4b3135a0f8
feat(ingest): dbt - improving dbt_meta mapping (#5237) 2022-06-24 13:43:12 +02:00
Aditya Radhakrishnan
82ca92f8f9
feat(ingest): adds csv enricher ingestion source (#5221) 2022-06-22 12:25:39 +05:30
Tamas Nemeth
393c07ee52
refactor(ingest): bigquery-usage - Adding tests for bigquery usage filters (#5195) 2022-06-20 18:28:27 -07:00
Aseem Bansal
d518b5a085
fix(cli): correct handling of env variables (#5203) 2022-06-20 20:53:47 +05:30