Kevin Hu
df514cfd6e
feat(ingest): mysql - add decimal128 custom type ( #4624 )
2022-04-21 16:43:29 -07:00
mayurinehate
9e30a9cc81
fix(glue): fix error for custom connector if ignore_unsupported_conne… ( #4667 )
2022-04-21 11:33:04 -07:00
BZ
bbfc902950
fix(ingestion): glue - delete CatalogId parameter from get_jobs api call ( #4646 )
2022-04-21 09:30:01 +02:00
Aseem Bansal
c66ef7c1fe
fix(snowflake): deprecate config, update examples ( #4644 )
* fix(snowflake): deprecate config, update examples
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-20 15:21:09 -07:00
Aditya Radhakrishnan
15e90f6dd0
feat(ingest) - update identity sources to add flags for masking sensitive work units ( #4711 )
2022-04-20 14:21:08 -07:00
Tamas Nemeth
bb2b8515ff
fix(ingest) bigquery: Moving bigquery temporary credential deletion to atexit ( #4701 )
2022-04-20 13:35:16 -07:00
Tamas Nemeth
b217c8cd4e
fix(looker): Pydantic validation error for Looker TransportOptions on python 3.8 ( #4705 )
* Fix for pydantic validation error for Looker TransportOptions on python 3.8
2022-04-20 22:27:56 +02:00
Aseem Bansal
4b7f407e26
fix(bigquery): error due to not handling date properly ( #4702 )
2022-04-20 18:14:33 +02:00
Aseem Bansal
bb0a87ae74
fix(snowflake): remove extra lineage edges in reports, change badly named config variable ( #4595 )
* fix(snowflake): remove extra lineage edges in reports
2022-04-20 07:03:54 -07:00
Aseem Bansal
98d4fd4ea9
fix(cli): rest emitter should override config and env variables ( #4622 )
* fix(cli): rest emitter should override env variables
* fix(cli): change to not update env variables, small refactor
* fix bug
2022-04-18 07:31:01 -07:00
Atul Saurav
e8e0067f23
fix(cli): Suppress printing variables to logs during ingestion failure ( #4566 )
Currently, when ingestion fails while running a pipeline, stackprinter prints all variables to the logs.
These may contain sensitive information. To prevent this, an optional `safe`
flag is added to the CLI. If this flag is set while running ingestion, no variables are logged in
case of unexpected failures.
2022-04-15 10:30:48 -07:00
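The idea behind the `safe` flag in #4566 can be sketched as follows. This is a minimal illustration that uses the standard library's `traceback` module instead of stackprinter; the names `format_failure`, `run_pipeline`, and `demo` are hypothetical and are not taken from the DataHub CLI.

```python
import traceback
from typing import Tuple


def format_failure(exc: BaseException, safe: bool) -> str:
    """Render an exception for logging.

    Hypothetical helper: with safe=True, frame-local variables are left
    out of the report, so values held in variables never reach the logs.
    """
    tb = traceback.TracebackException.from_exception(exc, capture_locals=not safe)
    return "".join(tb.format())


def run_pipeline() -> None:
    password = "hunter2"  # stands in for a sensitive value held in a local
    raise RuntimeError("ingestion failed")


def demo() -> Tuple[str, str]:
    try:
        run_pipeline()
    except RuntimeError as err:
        return format_failure(err, safe=True), format_failure(err, safe=False)
    raise AssertionError("run_pipeline() is expected to fail")


safe_report, verbose_report = demo()
```

Both reports contain the error message; only the verbose one includes the value of `password`.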
Aseem Bansal
8e2bd00059
chore(ingestion): update example recipes ( #4660 )
2022-04-14 16:09:19 -07:00
Maggie Hays
2534221d5d
remove source summary table ( #4670 )
2022-04-14 16:28:05 -05:00
Tamas Nemeth
61dc6e8723
fix(ingestion): airflow - import emitters indirectly to avoid unneeded dependency ( #4668 )
2022-04-14 10:22:16 -07:00
Fernanda de Camargo
d508f5c036
fix(ingestion): tableau - validate datasource before creating its upstream ( #4613 )
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-04-13 23:08:01 -07:00
Aseem Bansal
73d69510f8
fix(sqlparser): fix sqlparser breaking due to # sign ( #4662 )
* fix(sqlparser): fix sqlparser breaking due to # sign
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-13 17:15:38 -07:00
Arun Vasudevan
5aa3da5c9c
feat(ingestion) dbt: Fixing issue with strip_user_ids_from_email and adding owner_extraction_pattern ( #4587 )
* Fixing issue with strip_user_ids_from_email and adding owner_extraction_pattern
Co-authored-by: BZ <93607724+BoyuanZhangDE@users.noreply.github.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-13 16:58:36 -07:00
Aseem Bansal
53d357b4eb
fix(bigquery-usage): missing dependency ( #4661 )
2022-04-13 14:29:31 +02:00
Aseem Bansal
5a59d5a1dd
fix(ingestion): Adding missing init.py ( #4659 )
2022-04-13 11:02:57 +02:00
Aseem Bansal
155209f0e1
fix(ingestion): add missing workunit ids ( #4657 )
2022-04-13 10:19:37 +02:00
Tamas Nemeth
f99d27fd8c
feat(ingest): airflow - add support to capture airflow executions, add high level dataflow jobs api to python sdk ( #4615 )
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2022-04-12 23:19:39 -07:00
Kevin Hu
08c34bfe15
feat(ingest): capture MSSQL table+column descriptions ( #4579 )
* feat(ingest): capture MSSQL table+column descriptions
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 17:49:56 -07:00
David Sanchez
9a950ef231
fix(tableau): avoid duplicate schema in URNs for upstream tables ( #4645 )
* fix(tableau): avoid duplicate schema in URNs for upstream tables
* Fix(lint)
2022-04-12 16:26:52 -07:00
Meenakshi Kamalaseshan Radha
e75e2f8bbf
fix(ingest): Fix snowflake KEY_PAIR auth ( #4638 )
* fix(ingest): Fix snowflake KEY_PAIR auth to work with stateful ingestion.
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 15:58:53 -07:00
Zach Bluhm
ff685b7feb
feat: Enable the ingestion of bigquery audit logs to parse usage info… ( #4441 )
* feat: Enable the ingestion of bigquery audit logs to parse usage information
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 14:58:34 -07:00
Ravindra Lanka
9226e3e27f
Enable lower-casing of the name part of dataset urn via an environment variable. ( #4649 )
2022-04-12 12:54:22 -07:00
Dyana Rose
5b22d96e04
fix(ingestion): looker - extract explore views from join name ( #4627 )
Co-authored-by: Dyana Rose <dyanarose@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-04-12 08:20:10 -07:00
Aseem Bansal
23ece3b1a4
fix(ingestion): ensure source/sink reports are always logged ( #4592 )
2022-04-12 05:00:59 -07:00
Xu Wang
7b1487135a
feat(ingest): add Urn python library for DataJob, DataFlow, Domain and Tag ( #4618 )
* feat(ingest): add python library for DataJobUrn
* add DataFlowUrn lib and fix DataJobUrn
* fix create_from_str method
* fix lint error and unit test
* add DomainUrn and TagUrn
Co-authored-by: Xu Wang <xu.wang@grandrounds.com>
2022-04-12 09:02:28 +02:00
Marcin Szymański
e7c5eb357c
feat(ingest): add trino platform for great expectations ( #4594 )
2022-04-11 19:48:15 -07:00
jchen0824
524d183d93
feat: add presto-on-hive metadata ingestion source ( #4625 )
* feat(metadata ingestion source): add presto-on-hive metadata ingestion source
Co-authored-by: Houren Chen <houren.chen@grabtaxi.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-11 17:46:44 -07:00
Tamas Nemeth
e2a617f183
Restricting pytest docker version ( #4639 )
2022-04-11 23:35:34 +02:00
BZ
5637e73ca5
feat(glue): add CatalogId parameter for cross-account access ( #4608 )
* Update glue.py
* Update glue.md
* Update glue.py
2022-04-11 09:08:25 +02:00
Aseem Bansal
aa0fe3636a
doc(scheduling): make it easier to find ui ingestion ( #4610 )
2022-04-08 10:26:41 -07:00
Aseem Bansal
61a95f41ae
chore: fix lint and remove incorrect integration mark from unit tests ( #4621 )
* chore: fix lint and remove incorrect integration mark from unit tests
* add to test requirements
* revert athena source tests
2022-04-08 17:18:48 +02:00
Abhiram98
cd43a4a543
doc(redshift): Add grant statements ( #4559 )
2022-04-08 16:30:43 +02:00
Marcin Szymański
7c3ad3d293
feat(ingest): enable connection string for all sqlalchemy datasources ( #4508 )
* feat(ingest): enable connection string for all sqlalchemy datasources
* Update sql_common.py
* fix types
* update docs
* rename variable to sqlalchemy_uri
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-04-07 23:11:52 -04:00
Aseem Bansal
336a628c5b
fix(bigquery): fix lineage bug, improve docs, add dataset filter config ( #4607 )
* fix(bigquery): fix metadata from exported logs, doc missing permission, improve logging, add tests
Co-authored-by: Ravindra Lanka <rslanka@gmail.com>
2022-04-07 13:10:21 -07:00
David Haglund
0785ed6143
fix: urlencode slash in urns too ( #4527 )
* fix: urlencode slash in urns too + tests
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-07 13:04:57 -07:00
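What "urlencode slash in urns too" means in practice can be shown with `urllib.parse.quote`, whose default `safe="/"` leaves slashes unescaped. The sketch below is an illustration only: `encode_urn` is a hypothetical name and the URN is a made-up example, neither is taken from the DataHub code.

```python
from urllib.parse import quote, unquote


def encode_urn(urn: str) -> str:
    """Percent-encode a URN for use as a single URL path segment.

    Hypothetical helper: safe="" overrides the default safe="/" so that
    slashes inside the URN are escaped as %2F as well.
    """
    return quote(urn, safe="")


# Illustrative URN containing slashes in its name part.
urn = "urn:li:dataset:(urn:li:dataPlatform:s3,bucket/path/file,PROD)"
default_encoded = quote(urn)   # slashes survive with the default setting
encoded = encode_urn(urn)      # slashes become %2F
```

Decoding with `unquote(encoded)` returns the original URN, so the encoding is reversible.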
Gabe Lyons
112589db32
feat(tableau): add some logic to normalize table names in tableau ( #4609 )
* add some logic to normalize table names in tableau
2022-04-07 12:15:41 -07:00
Ravindra Lanka
5e25cd1e22
feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation. ( #4585 )
* feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation.
2022-04-07 11:24:26 -07:00
Aseem Bansal
5ebb37ab4c
fix(bigquery): incorrect lineage when views are present ( #4568 )
* fix(bigquery): incorrect lineage when views are present
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-06 17:29:02 -07:00
Aseem Bansal
df1d8ad07e
doc(snowflake): add example of table pattern ( #4580 )
2022-04-05 16:23:21 -07:00
Aditya Radhakrishnan
aeafa7e63f
feat(okta) - add support for filtering/searching when ingesting Okta groups and users ( #4586 )
2022-04-05 16:15:34 -07:00
mayurinehate
0a97fa22f9
fix(tableau): fix for incorrect schema returned by tableau api for snowflake connectionType ( #4577 )
2022-04-05 14:56:35 -07:00
Ravindra Lanka
fe5f24c2b3
fix(ingestion): Refactor redshift_usage source: simplify, annotate & fix bugs. ( #4572 )
2022-04-05 09:21:27 -07:00
Aseem Bansal
809d1beae9
feat(snowflake): reduce permissions provisioned by default ( #4543 )
* feat(snowflake): reduce permissions provisioned by default
Co-authored-by: John Joyce <john@acryl.io>
2022-04-05 09:03:00 -07:00
Pedro Silva
a20012fd6c
feat(docs) Improves docs around developing datahub, removes deprecated docs on building metadata service ( #4552 )
2022-04-04 19:15:21 -07:00
Kevin Hu
030d25f0a1
feat(ingest): add option for external Spark cluster ( #4571 )
* Add option for configuring spark cluster manager
Co-authored-by: Ravindra Lanka <rslanka@gmail.com>
2022-04-04 15:56:50 -07:00
David Haglund
df9e07fda2
fix: replace direct and indirect references to linkedin with datahub-project ( #4557 )
* Update GitHub-related links to use datahub-project:
- https://github.com
- https://img.shields.io/github/ ...
- https://raw.githubusercontent.com/ ...
* Also replace references for github repo linkedin/datahub with
datahub-project/datahub.
2022-04-04 14:39:30 -05:00