Gabe Lyons
a8079ca163
feat(transformers): add transformers to provide tags & terms to schema fields based on regex patterns (#4936)
* add tag & term transformers for schemas
* added documentation
* lint fixes
* add clarification that only first set of matching terms is applied
2022-05-18 16:03:34 -07:00
Gabe Lyons
7b1cf6f8b2
feat(dbt): enable data platform instance on dbt (#4926)
2022-05-17 16:53:16 -07:00
Aseem Bansal
15438f62f1
fix(doc): update doc url to generated docs (#4860)
2022-05-13 10:19:46 +05:30
BZ
367fac6066
feat(ingestion): For all usage connectors, allow exclusion of top_n_queries from ingestion via a config param. (#4839)
* feat(redshift-usage): allow users to not ingest top_n_queries
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-12 14:26:03 -07:00
Aseem Bansal
b9f78026c8
revert(bigquery-usage): dataset allow filter impl (#4901)
* Revert "fix(ingestion): bigquery-usage: Fix biquery usage table deny pattern template (#4898)"
2022-05-11 20:03:03 +02:00
Sebo Kim
f3df15d6dc
fix(ingestion): ElasticSearch when no properties from elastic_mappings, gracefully continue (#4853)
* when no properties from elastic_mappings, gracefully continue
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-11 09:07:18 -07:00
Shirshanka Das
8d281fc013
fix(ingest): lookml - add view definitions for all views (#4875)
2022-05-10 10:48:36 -07:00
Zach Bluhm
6ced69cf31
fix(bigquery-usage): dataset allow filter impl (#4776)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-06 13:45:37 -07:00
Tamas Nemeth
56ee4d9651
feat(ingest): s3 - add support for multiple pathspecs in one recipe (#4777)
2022-05-05 10:09:47 -07:00
mayurinehate
d3fb6ce026
fix(ingest): great-expectations - fix failure to serialize type Decimal (#4763)
2022-05-04 22:56:08 -07:00
Ravindra Lanka
842fb391eb
feat(ingestion): kafka - add protobuf schema support (#4819)
Co-authored-by: Luis Angel Vicente Sanchez <luis.vicentesanchez@aaqua.live>
2022-05-04 17:07:01 -07:00
Aseem Bansal
3ff53b417b
fix(snowflake): passing connect args should not cause failures (#4764)
* fix(snowflake): passing connect args should not cause failures
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-03 05:20:11 -07:00
Vladislavs Gaidass
8a24408cbf
fix(bigquery): improve handling of extracted audit log sql queries (#4735)
2022-05-03 14:43:23 +05:30
Ravindra Lanka
df75eafcfc
fix(ci): fix presto_on_hive tests. (#4802)
2022-05-02 21:09:33 -07:00
Ravindra Lanka
2b62ed5260
fix(ingest): avro - fix schema field type for avro logical types (#4801)
2022-05-02 17:43:42 -07:00
Aditya Radhakrishnan
c20a47f34c
feat(operation): display the reported time for last updated in the UI (#4800)
2022-05-02 16:00:29 -07:00
Shirshanka Das
a9ad138172
feat(ingest): docs - overhaul source connector docs to make it code driven (#4798)
Co-authored-by: MugdhaHardikar-GSLab <mugdha.hardikar@gslab.com>
2022-05-02 00:18:15 -07:00
mayurinehate
c34a1ba735
fix(s3): improved handling for corner cases (#4774)
2022-04-29 12:25:41 -07:00
vanmeete
74d6d35881
feat(ingestion): add Pulsar source (#4721)
2022-04-29 15:57:02 +05:30
Jordan Wolinsky
bbac4a7a11
feat(ingestion): glue/s3 - Ingest Tags from s3 bucket on an AWS Glue job and S3 Data Lake Ingest Job (#4689)
2022-04-29 10:09:06 +02:00
Shirshanka Das
d0eb772301
fix(ingest): fwk - datahub_api should be initialized by datahub-rest … (#4786)
2022-04-28 22:31:19 -07:00
mayurinehate
33d6842ab0
fix(tableau): miscellaneous tableau fixes for lineage, browse path, non-embedded datasets (#4724)
* fix(tableau): add config whether to emit aspects for external datasets
other changes:
- do not set browse path in absence of datasource or project name
- remove unused nodes from tableau metadata query
* fix(tableau): remove redundant (transitive) lineage edges between tables, datasource, sheet
other changes:
- update subtypes for datasource to be more specific
* fix(tableau): fix browse paths for custom sql and embedded datasource
other changes:
- do not set browse path if any intermediate folder level in browse path is empty
* docs(tableau): update tableau doc
2022-04-27 11:20:03 -07:00
Danilo Peixoto
d2a6bc06dc
feat(ingest): feast - add support for Feast 0.18, deprecate older integration (#4094)
2022-04-26 14:35:02 -07:00
cccs-eric
abf8d62cf5
fix(azure_ad): silently discard other Azure AD object types (#4693) (#4704)
2022-04-26 13:56:46 -07:00
Sebo Kim
958b52f2f4
fix(ingest): bigquery - Fix BigQuery Datetime/Timestamp type column partition table profile bug (#4658)
* fix BigQuery Datetime type column partition table profile bug
* inplace datetime replace
* extract out 'if' blocks and write a unit-test
* parse logic inside get_partition_range func
2022-04-26 16:54:19 +02:00
Tamas Nemeth
474b0ba61e
feat(ingest): dbt - add query tag mapping and match template (#4744)
2022-04-25 10:56:45 -07:00
Shirshanka Das
a518e3d13e
feat(cli): improve error reporting, make sink config optional (#4718)
2022-04-24 17:12:21 -07:00
Aseem Bansal
c66ef7c1fe
fix(snowflake): deprecate config, update examples (#4644)
* fix(snowflake): deprecate config, update examples
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-20 15:21:09 -07:00
Aseem Bansal
4b7f407e26
fix(bigquery): error due to not handling date properly (#4702)
2022-04-20 18:14:33 +02:00
Aseem Bansal
73d69510f8
fix(sqlparser): fix sqlparser breaking due to # sign (#4662)
* fix(sqlparser): fix sqlparser breaking due to # sign
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-13 17:15:38 -07:00
Arun Vasudevan
5aa3da5c9c
feat(ingestion) dbt: Fixing issue with strip_user_ids_from_email and adding owner_extraction_pattern (#4587)
* Fixing issue with strip_user_ids_from_email and adding owner_extraction_pattern
Co-authored-by: BZ <93607724+BoyuanZhangDE@users.noreply.github.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-13 16:58:36 -07:00
Tamas Nemeth
f99d27fd8c
feat(ingest): airflow - add support to capture airflow executions, add high level dataflow jobs api to python sdk (#4615)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2022-04-12 23:19:39 -07:00
Kevin Hu
08c34bfe15
feat(ingest): capture MSSQL table+column descriptions (#4579)
* feat(ingest): capture MSSQL table+column descriptions
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 17:49:56 -07:00
Zach Bluhm
ff685b7feb
feat: Enable the ingestion of bigquery audit logs to parse usage info… (#4441)
* feat: Enable the ingestion of bigquery audit logs to parse usage information
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 14:58:34 -07:00
Dyana Rose
5b22d96e04
fix(ingestion): looker - extract explore views from join name (#4627)
Co-authored-by: Dyana Rose <dyanarose@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-04-12 08:20:10 -07:00
Xu Wang
7b1487135a
feat(ingest): add Urn python library for DataJob, DataFlow, Domain and Tag (#4618)
* feat(ingest): add python library for DataJobUrn
* add DataFlowUrn lib and fix DataJobUrn
* fix create_from_str method
* fix lint error and unit test
* add DomainUrn and TagUrn
Co-authored-by: Xu Wang <xu.wang@grandrounds.com>
2022-04-12 09:02:28 +02:00
Marcin Szymański
e7c5eb357c
feat(ingest): add trino platform for great expectations (#4594)
2022-04-11 19:48:15 -07:00
jchen0824
524d183d93
feat: add presto-on-hive metadata ingestion source (#4625)
* feat(metadata ingestion source): add presto-on-hive metadata ingestion source
Co-authored-by: Houren Chen <houren.chen@grabtaxi.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-11 17:46:44 -07:00
Aseem Bansal
61a95f41ae
chore: fix lint and remove incorrect integration mark from unit tests (#4621)
* chore: fix lint and remove incorrect integration mark from unit tests
* add to test requirements
* revert athena source tests
2022-04-08 17:18:48 +02:00
Aseem Bansal
336a628c5b
fix(bigquery): fix lineage bug, improve docs, add dataset filter config (#4607)
* fix(bigquery): fix metadata from exported logs, doc missing permission, improve logging, add tests
Co-authored-by: Ravindra Lanka <rslanka@gmail.com>
2022-04-07 13:10:21 -07:00
David Haglund
0785ed6143
fix: urlencode slash in urns too (#4527)
* fix: urlencode slash in urns too + tests
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-07 13:04:57 -07:00
Gabe Lyons
112589db32
feat(tableau): add some logic to normalize table names in tableau (#4609)
* add some logic to normalize table names in tableau
2022-04-07 12:15:41 -07:00
Ravindra Lanka
5e25cd1e22
feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation. (#4585)
* feat(ingestion): Redshift Usage Source - simplify OperationalStats workunit generation.
2022-04-07 11:24:26 -07:00
mayurinehate
0a97fa22f9
fix(tableau): fix for incorrect schema returned by tableau api for snowflake connectionType (#4577)
2022-04-05 14:56:35 -07:00
Ravindra Lanka
fe5f24c2b3
fix(ingestion): Refactor redshift_usage source: simplify, annotate & fix bugs. (#4572)
2022-04-05 09:21:27 -07:00
David Haglund
df9e07fda2
fix: replace direct and indirect references to linkedin with datahub-project (#4557)
* Update links for github-related links to use datahub-project:
- https://github.com
- https://img.shields.io/github/ ...
- https://raw.githubusercontent.com/ ...
* Also replace references for github repo linkedin/datahub with
datahub-project/datahub.
2022-04-04 14:39:30 -05:00
Abhiram98
26742728a6
feat(ingestion): schema, table filtering for redshift-usage (#4396)
* Filter based on table/schema pattern + documentation
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-01 20:48:23 -07:00
darapuk
a05d798939
(fix): Update path generated when creating LookML URL (#4554)
* (fix): Update path generated when creating LookML URL
2022-04-01 11:54:36 -07:00
Corentin
2fc3a48bc5
feat(ingest): indent sql queries for usage sources (#3782)
* feat(ingest): indent sql queries for usage connectors.
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-22-140.eu-west-1.compute.internal>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-03-31 15:15:09 -07:00
mayurinehate
c09834d52b
fix(kafka-connect): add platform for default case in jdbc connector, update tests for platform instance map (#4545)
2022-03-31 08:13:09 -07:00