Andrew Sikowitz
afcf462cb1
feat(ingest/unity): Add profiling support ( #7976 )
...
- Also adds a new databricks sdk
2023-05-11 10:00:50 -07:00
Tamas Nemeth
dec54bf098
feat(ingest/s3): Inferring schema from the alphabetically last folder ( #8005 )
2023-05-10 21:55:05 +02:00
Andrew Sikowitz
44406f7adf
fix(ingest/postgres): Allow specification of initial engine database; set default database to postgres ( #7915 )
...
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-05-09 11:11:43 -07:00
Mayuri Nehate
c845c75a2d
feat(ingest/snowflake): add config option to specify deny patterns for upstreams ( #7962 )
...
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-05-08 14:13:57 -07:00
Mayuri Nehate
13b1d66170
fix(ingest/bigquery): remove incorrectly used table_pattern filter ( #7810 )
...
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-05-08 10:33:42 -07:00
Mayuri Nehate
0131aeefb1
fix(ingest/unity): improve error message if no scheme in workspace_url ( #7951 )
...
Co-authored-by: John Joyce <john@acryl.io>
2023-05-08 10:13:53 -07:00
Tamas Nemeth
0e69e5a810
fix(ingest/redshift): Enabling autocommit for Redshift connection ( #7983 )
2023-05-08 10:24:40 +02:00
Andrew Sikowitz
8019d17aa6
fix(ingest/bigquery): Filter projects for lineage and usage ( #7954 )
2023-05-04 18:14:48 +02:00
Harshal Sheth
ca5dffa54d
refactor(ingest/biz-glossary): simplify business glossary source ( #7912 )
2023-05-03 17:01:58 -07:00
Reilman79
b6e2cc549a
fix(ldap): properly handle escaped characters in LDAP DNs ( #7928 )
2023-05-03 13:57:52 -07:00
Felipe Ribeiro
d504cbd1b6
docs(ingest): update max_threads default value ( #7947 )
...
Co-authored-by: Felipe Ribeiro <fribeiro@fanatics.com>
2023-05-02 22:54:15 -07:00
Mayuri Nehate
a711baa131
fix(ingest/hive): fix containers generation for hive ( #7926 )
2023-05-02 15:07:51 +02:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference ( #7910 )
...
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Adds a UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
- Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
Mayuri Nehate
a0c4e0dd46
feat(ingest): add GCS ingestion source ( #7903 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-27 19:03:41 +02:00
Harshal Sheth
916cb21454
test(ingest/biz-glossary): add test for enable_auto_id ( #7911 )
2023-04-26 19:48:52 -07:00
Mayuri Nehate
031aee4298
fix(ingest/bigquery): fix handling of time decorator offset queries ( #7843 )
...
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-04-25 13:51:20 -07:00
Harshal Sheth
19d7c392d6
feat(sdk): support entity types filter in get_urns_by_filter ( #7902 )
2023-04-25 13:31:55 -07:00
Harshal Sheth
71ecbd6060
fix(ingest/dbt): ensure dbt shows view properties ( #7872 )
2023-04-25 12:25:07 -07:00
Mayuri Nehate
28986d8081
fix(ingestion/tableau): backward compatibility with version 2021.1 and above ( #7864 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-24 11:08:56 -07:00
Mayuri Nehate
3212e74969
feat(ingest/snowflake): optionally emit all upstreams irrespective of recipe pattern ( #7842 )
2023-04-24 11:01:15 -07:00
Andrew Sikowitz
e9c2f9afcc
feat(ingest/unity): Ingest ownership for containers; lookup service principal display names ( #7869 )
2023-04-21 11:02:39 -07:00
mohdsiddique
f21eeed6e7
feat(ingestion): lookml refinement support ( #7781 )
...
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-21 10:55:31 -07:00
Yusuf Mahtab
fa10256c47
feat(glue): allow resource links to be ignored ( #7639 )
...
Co-authored-by: Justas Cernas <justas.cernas@fundingcircle.com>
2023-04-21 10:42:32 -07:00
Aezo
1a5c716b87
feat(ingest/powerbi): support modified_since, extract_dataset_schema and many more ( #7519 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-20 22:58:45 -07:00
Harshal Sheth
6802142f6e
fix(ingest/salesforce): use report timestamp for operations ( #7838 )
...
Co-authored-by: John Joyce <john@acryl.io>
2023-04-19 20:39:07 -07:00
Harshal Sheth
e461d03d94
feat(ingest/unity): capture create/lastModified timestamps ( #7819 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-17 12:18:21 -07:00
Harshal Sheth
af566e1184
feat(model): fully populate the entity registry ( #7818 )
2023-04-15 13:33:05 -07:00
Andrew Sikowitz
1ac1ccf26e
perf(ingest/bigquery): Improve bigquery usage disk usage and speed ( #7825 )
2023-04-14 18:09:43 -07:00
Andrew Sikowitz
e839ac4c40
fix(ingest/bigquery): Handle null values from usage aggregation ( #7827 )
2023-04-14 16:54:22 -07:00
Harshal Sheth
204727a6ee
feat(ingest/unity): support extracting ownership ( #7801 )
2023-04-12 19:45:41 -07:00
Harshal Sheth
3079f0a7e1
feat(sdk): support executing graphql via DataHubGraph ( #7753 )
...
Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
2023-04-12 11:30:05 -07:00
Andrew Sikowitz
73016ebff9
test(ingest/bigquery): Add sql parser xfail test to fix later ( #7792 )
2023-04-12 10:51:29 -07:00
Tamas Nemeth
0cc12bcce7
feat(ingest): redshift - Redshift rework ( #6906 )
2023-04-12 19:15:43 +02:00
Andrew Sikowitz
54f047e1a8
test(ingest/snowflake): fix tests around host_port ( #7791 )
...
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-11 16:06:35 -07:00
David Sanchez
a50c71264d
feat(ingest/tableau): extract lineage from csql queries ( #7561 )
...
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-11 11:12:15 -07:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting ( #7763 )
2023-04-10 14:14:42 -07:00
mohdsiddique
5e145cbb2d
feat(ingestion/okta): okta stateful ingestion ( #7736 )
...
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-04-07 13:44:32 -07:00
Mayuri Nehate
5fd7981532
fix(ingest/snowflake): fix incorrect tag urn case, improve tag display name ( #7758 )
2023-04-07 13:07:08 -07:00
Andrew Sikowitz
087855f374
fix(ingest/bigquery): Support cross project usage using FileBackedDict ( #7663 )
...
Includes major refactor of bigquery usage ingestion, minor refactor of the source as a whole, and reporting cleanup.
Includes bigquery performance testing changes.
2023-04-07 12:18:26 -07:00
Mayuri Nehate
1fda92441f
feat(snowflake): improve snowflake lineage perf and memory, push down to snowflake ( #7710 )
2023-04-07 11:06:06 -07:00
Andrew Sikowitz
44663fa035
fix(ingest/bigquery): Raise report_failure threshold; add robustness around table parsing ( #7772 )
...
- Converted getting views and tables to iterators
- Catches the exception raised when a table expiration time is too far in the future to represent in Python
2023-04-06 13:24:22 -07:00
Tamas Nemeth
96bacfc5d7
fix(ingest/redshift): Fixing adding back db name in redshift urn ( #7765 )
2023-04-06 11:45:10 +02:00
Tamas Nemeth
29d2492667
fix(ingest/bigquery): Lineage edges use datetime with timezone; correctly parse last_altered ( #7762 )
2023-04-06 02:46:50 +00:00
Aseem Bansal
a11a7fa9d0
feat(snowflake): better error message on key pair authentication ( #7734 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-04-05 00:46:07 +00:00
Harshal Sheth
8d99babf75
feat(ingest/dbt): update subtypes for dbt ( #7750 )
2023-04-04 17:11:23 -07:00
Andrew Sikowitz
ce1ac7fa12
refactor(ingest): Use sqlite.Row row_factory for FileBackedCollections ( #7739 )
2023-04-04 11:53:56 -07:00
Harshal Sheth
f860ce95c0
feat(ingest): emit state payloads as soft-deleted ( #7714 )
2023-04-04 17:06:21 +00:00
Harshal Sheth
8394dcb538
chore(ingest): change kafka connect mapped ports ( #7728 )
2023-04-04 18:38:30 +05:30
Harshal Sheth
1634edaf25
feat(ingest/dbt): include dbt unique_id in properties ( #7737 )
2023-04-04 13:37:13 +05:30
Harshal Sheth
f780da4c0a
feat(ingest/lookml): support views with derived_table.explore_source ( #7704 )
2023-04-03 16:18:39 -07:00