matthew-piatkus-cko
bfde4662c7
fix(ingest/salesforce): support JSON web token auth ( #7963 )
2023-05-05 18:17:43 +00:00
Andrew Sikowitz
8019d17aa6
fix(ingest/bigquery): Filter projects for lineage and usage ( #7954 )
2023-05-04 18:14:48 +02:00
Harshal Sheth
ca5dffa54d
refactor(ingest/biz-glossary): simplify business glossary source ( #7912 )
2023-05-03 17:01:58 -07:00
Reilman79
b6e2cc549a
fix(ldap): properly handle escaped characters in LDAP DNs ( #7928 )
2023-05-03 13:57:52 -07:00
Harshal Sheth
b12c2b8327
fix(ingest): improve error message when graph connection fails ( #7946 )
2023-05-02 16:30:58 -07:00
Harshal Sheth
6833494347
feat(airflow): respect port parameter if provided ( #7945 )
2023-05-02 16:28:22 -07:00
Harshal Sheth
bf86235e26
fix(ingest/unity): use fully qualified catalog/schema patterns ( #7900 )
2023-05-02 16:27:17 -07:00
Mayuri Nehate
3c04b1bb17
docs(ingest): add note about path_specs configuration in data lake sources ( #7941 )
2023-05-02 15:08:54 +02:00
Mayuri Nehate
a711baa131
fix(ingest/hive): fix containers generation for hive ( #7926 )
2023-05-02 15:07:51 +02:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference ( #7910 )
...
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Adds a UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
david-leifker
cd05f5b174
feat(schema-registry): replace confluent schema registry ( #7930 )
...
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ryan Holstien <ryan@acryl.io>
2023-05-01 13:18:41 -05:00
Andrew Sikowitz
ca3cab4e23
refactor(ingest): report soft deleted stale entities with LossyList ( #7907 )
2023-04-27 15:40:19 -07:00
xiphl
af09034523
[bugfix] Fix remote file ingestion for Windows ( #7888 )
...
Co-authored-by: Shirshanka Das <shirshanka+github@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-04-27 10:28:10 -07:00
Mayuri Nehate
a0c4e0dd46
feat(ingest): add GCS ingestion source ( #7903 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-27 19:03:41 +02:00
Harshal Sheth
a33153c1f6
feat(sdk): add DataHubGraph.get_entity_semityped method ( #7905 )
2023-04-26 13:44:13 -07:00
Pedro Silva
967260634c
Revert "feat(cli): Modifies ingest-sample-data command to use DataHub… ( #7899 )
2023-04-26 16:56:22 +01:00
Harshal Sheth
29e5cfd643
fix(ingest): fix minor bug + protective dep requirements ( #7861 )
2023-04-25 14:35:01 -07:00
Mayuri Nehate
031aee4298
fix(ingest/bigquery): fix handling of time decorator offset queries ( #7843 )
...
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-04-25 13:51:20 -07:00
Mayuri Nehate
ca1f1903ea
fix(ingest/snowflake): fix optimised lineage query, filter temporary tables ( #7894 )
...
This change fixes the following Snowflake query errors, which occurred with larger lineage time windows:
error 1 - 100099 (22000): Result array of ARRAYAGG is too large.
error 2 - max LOB size (16777216) exceeded, actual size of parsed column is xxxxxxxxxx
2023-04-25 13:51:04 -07:00
Harshal Sheth
19d7c392d6
feat(sdk): support entity types filter in get_urns_by_filter ( #7902 )
2023-04-25 13:31:55 -07:00
Harshal Sheth
71ecbd6060
fix(ingest/dbt): ensure dbt shows view properties ( #7872 )
2023-04-25 12:25:07 -07:00
Mayuri Nehate
28986d8081
fix(ingestion/tableau): backward compatibility with version 2021.1 and above ( #7864 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-24 11:08:56 -07:00
Mayuri Nehate
3212e74969
feat(ingest/snowflake): optionally emit all upstreams irrespective of recipe pattern ( #7842 )
2023-04-24 11:01:15 -07:00
Pedro Silva
a5fa933fb0
feat(cli): Modifies ingest-sample-data command to use DataHub url & token based on config ( #7896 )
2023-04-24 15:52:10 +01:00
Andrew Sikowitz
e9c2f9afcc
feat(ingest/unity): Ingest ownership for containers; lookup service principal display names ( #7869 )
2023-04-21 11:02:39 -07:00
mohdsiddique
f21eeed6e7
feat(ingestion): lookml refinement support ( #7781 )
...
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-21 10:55:31 -07:00
Yusuf Mahtab
fa10256c47
feat(glue): allow resource links to be ignored ( #7639 )
...
Co-authored-by: Justas Cernas <justas.cernas@fundingcircle.com>
2023-04-21 10:42:32 -07:00
Aezo
1a5c716b87
feat(ingest/powerbi): support modified_since, extract_dataset_schema and many more ( #7519 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-20 22:58:45 -07:00
Harshal Sheth
66f44945e3
docs(ingest): update dbt and aws docs ( #7870 )
2023-04-20 21:08:22 -07:00
Andrew Sikowitz
1ff6949e36
refactor(ingest): Add helper DataHubGraph methods ( #7851 )
...
Adds:
- get_urns_by_filter(), using scroll by entities
- get_latest_pipeline_checkpoint()
- soft_delete_urn()
2023-04-20 10:16:33 -07:00
Harshal Sheth
6802142f6e
fix(ingest/salesforce): use report timestamp for operations ( #7838 )
...
Co-authored-by: John Joyce <john@acryl.io>
2023-04-19 20:39:07 -07:00
Harshal Sheth
399e3333ad
feat(cli): improve quickstart stability ( #7839 )
2023-04-17 21:19:19 -07:00
Harshal Sheth
e461d03d94
feat(ingest/unity): capture create/lastModified timestamps ( #7819 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-17 12:18:21 -07:00
Mayuri Nehate
a8681dae75
fix(ingest/snowflake): fix column name in snowflake optimised lineage ( #7834 )
2023-04-17 11:44:53 -07:00
Andrew Sikowitz
1ac1ccf26e
perf(ingest/bigquery): Improve bigquery usage disk usage and speed ( #7825 )
2023-04-14 18:09:43 -07:00
Andrew Sikowitz
e839ac4c40
fix(ingest/bigquery): Handle null values from usage aggregation ( #7827 )
2023-04-14 16:54:22 -07:00
Mayuri Nehate
8ec74ce41c
fix(ingest/bigquery): update usage query, remove erroneous init ( #7811 )
2023-04-14 13:38:50 -07:00
Andrew Sikowitz
37e7485184
fix(ingest/bigquery): Do not query columns when not ingesting tables or views ( #7823 )
2023-04-14 09:08:22 -07:00
Andrew Sikowitz
408cd7db2a
fix(ingest/bigquery): Enable lineage and usage ingestion without tables ( #7820 )
2023-04-14 01:41:00 -07:00
Andrew Sikowitz
d8d8176b1a
fix(ingest/bigquery): Add to lineage, not overwrite, when using sql parser ( #7814 )
2023-04-14 08:46:10 +02:00
Tamas Nemeth
4ec280ee20
fix(ingest/redshift): Remove pg_user table from metadata queries ( #7815 )
2023-04-13 15:35:26 -07:00
Andrew Sikowitz
ce795406b9
feat(ingest): Track disk usage in report ( #7812 )
2023-04-13 14:43:25 -07:00
RyanHolstien
0d5873db2a
feat(patch): patch support for flow info and job info and refactor patchbuilders for java sdk ( #7495 )
...
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: David Leifker <david.leifker@acryl.io>
2023-04-13 15:46:35 -05:00
Harshal Sheth
4f59169566
feat(ingest/lookml): correctly handle include directives from imported projects ( #7798 )
2023-04-13 13:28:58 -07:00
Harshal Sheth
204727a6ee
feat(ingest/unity): support extracting ownership ( #7801 )
2023-04-12 19:45:41 -07:00
Harshal Sheth
3079f0a7e1
feat(sdk): support executing graphql via DataHubGraph ( #7753 )
...
Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
2023-04-12 11:30:05 -07:00
Tamas Nemeth
0cc12bcce7
feat(ingest): redshift - Redshift rework ( #6906 )
2023-04-12 19:15:43 +02:00
Andrew Sikowitz
b7feb2a671
config(ingest/bigquery): Default lineage_use_sql_parser to true; update description ( #7797 )
2023-04-11 23:00:41 -07:00
Andrew Sikowitz
156d9df6b5
fix(ingest/bigquery): Fix lineage / usage table ref checks ( #7800 )
2023-04-11 23:00:27 -07:00
Andrew Sikowitz
54f047e1a8
test(ingest/snowflake): fix tests around host_port ( #7791 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-11 16:06:35 -07:00