945 Commits

Author SHA1 Message Date
Andrew Sikowitz
3a21c27f06
feat(ingest): Turn on browse path v2 creation (#8342)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-06 16:43:42 -04:00
Andrew Sikowitz
afbd52bdf0
test(ingest/mysql): Configure sql_server tests for arm64 (#8360)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-07-05 13:41:24 -04:00
Tamas Nemeth
74ab1bea06
fix(ingest/s3): Fix for flaky s3 test - uploading s3 files in consistent order (#8367)
2023-07-04 19:19:39 +02:00
Andrew Sikowitz
72a41ef9f6
test(ingest/trino): xfail test to unblock CI (#8340)
2023-06-30 17:51:50 +05:30
Mayuri Nehate
75d67b97bc
fix(ingest/postgres): fix profiling errors, skip json type column (#8291)
2023-06-28 10:59:31 -04:00
Gabe Lyons
d075bb4824
fix(embed): set embed url to false for tableau config (#8308)
2023-06-27 07:55:16 -07:00
Andrew Sikowitz
584366771d
refactor(unity): Remove databricks_cli and cleanup (#8249)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-06-23 18:01:05 +05:30
Andrew Sikowitz
aa5e02d0ec
feat(ingest): Create zero usage aspects (#8205)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-06-22 17:07:50 -04:00
Serhii Dimchenko
5b9fd977eb
fix(ingest/dbt-athena): dbt-athena types mapping for complex types (#8264)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-22 11:40:27 +02:00
Aseem Bansal
af6973ebe0
fix(ingest/okta): Set default of okta connector to match OIDC defaults (#8272)
2023-06-21 19:15:31 +05:30
Mayuri Nehate
88ceac316b
fix(ingest/tableau): split table columns query from datasources query (#8217)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-06-21 11:50:35 +02:00
Andrew Sikowitz
02bc71dd84
fix(ingest/okta): Set default of okta_profile_to_username_attr to email (#8263)
2023-06-21 13:38:59 +05:30
mohdsiddique
e7e07a73b4
feat(ingestion/powerbi): Ingest datasets not used in PowerBI visualization(tiles/pages) (#8212)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-06-15 14:04:40 -07:00
Andrew Sikowitz
66806a805e
feat(ingest/unity): Set external url for containers and datasets (#8238)
2023-06-15 09:05:49 +02:00
Andrew Sikowitz
c5cc53b99a
feat(ingest/bigquery_v2): enable platform instance using project id (#8216)
Co-authored-by: Adrián Pertíñez <khurzak92@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-06-14 09:50:21 -07:00
Mayuri Nehate
ab3fe0da81
refractor(classification): datahub classifier init (#8193)
2023-06-12 05:07:03 -07:00
Andrew Sikowitz
369a04ae30
revert(ingest/bigquery): Do not emit DataPlatformInstance; remove references to platform_instance (#8196)
2023-06-09 13:44:24 +05:30
mohdsiddique
45e592b7c6
fix(ingestion/looker): ingest looks not part of dashboard (#8140)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-06-08 12:54:14 +05:30
Mayuri Nehate
ac06cf3d3f
feat(classification): configurable minimum values threshold (#8186)
2023-06-07 21:28:13 -07:00
Andrew Sikowitz
6bad15be5c
fix(ingest): Fix modeldocgen; bump feast to relax pyarrow constraint (#8178)
2023-06-06 13:12:10 -07:00
Mayuri Nehate
983a8ca675
feat(classification): support for regex based custom infotypes (#8177)
2023-06-06 14:41:51 +02:00
Adrián Pertíñez
743439c11d
feat(ingest/bigquery_v2): enable platform instance using project id (#8142)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-06-05 15:17:40 -07:00
Andrew Sikowitz
3022c2d12e
feat(ingest/unity): Add qualified name to dataset properties (#8164)
2023-06-05 11:20:13 -07:00
mohdsiddique
e7d1b900ec
fix(ingestion/looker): set project-name for imported_projects views (#8086)
2023-06-02 17:04:34 -07:00
Tamas Nemeth
d50a99935b
fix(ingest/s3): Path spec aware folder traversal (#8095)
2023-05-30 16:20:49 +02:00
Serhii Dimchenko
6adb496581
feat: add dbt-athena adapter support for column types mapping (#8116)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-05-27 09:22:55 -05:00
Aseem Bansal
96f364802b
feat(lineage source): add fine grained lineage support (#7904)
2023-05-26 17:09:32 +05:30
Harshal Sheth
2d442161c4
ci(ingest/kafka): improve kafka integration test reliability (#8085)
2023-05-25 15:40:56 -07:00
Andrew Sikowitz
d3cd4dbb0c
feat(ingest/unity): Allow ingestion without metastore admin role (#8091)
- Adds more detailed docs and connection test
- Fixes empty username queries
2023-05-24 15:36:22 -07:00
Andrew Sikowitz
fdbc4de695
refactor(ingest): Call source_helpers via new WorkUnitProcessors in base Source (#8101)
2023-05-24 13:36:19 -07:00
Harshal Sheth
b0f8c3de1e
refactor(ingest): simplify stateful ingestion provider interface (#8104)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-05-23 12:57:57 -07:00
Andrew Sikowitz
a43903bf6d
refactor(ingest): Auto report workunits (#8061)
2023-05-22 17:06:31 -07:00
Harshal Sheth
4e9c652707
feat(ingest): add env to container properties (#8027)
2023-05-22 12:07:16 -07:00
Shubham Jagtap
e6371c8e94
fix(ingestion/powerbi): skip erroneous pages of a report (#8021)
Co-authored-by: mohdsiddique <mohdsiddiquebagwan@gmail.com>
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-05-19 18:02:55 -07:00
Tamas Nemeth
bdd4bc7b92
feat(ingest/s3) - Stateful ingestion and last-updated support (#8022)
2023-05-19 13:10:15 +02:00
Harshal Sheth
1902e7d4db
ci(ingest/clickhouse): don't use kernel ephemeral ports (#8060)
2023-05-19 11:17:41 +02:00
mohdsiddique
ae30be9c25
fix(ingestion/tableau): ingest parent project name in container properties (#8030)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-05-17 14:19:41 -07:00
Shubham Jagtap
8cc6606e68
feat(ingestion/kafka): add description in dataset properties (#7974)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: mohdsiddique <mohdsiddiquebagwan@gmail.com>
2023-05-17 11:03:08 -07:00
Harshal Sheth
e70d0b3859
fix(ingest/dbt): fix dbt subtypes for sources (#8048)
2023-05-16 11:11:26 -07:00
xiphl
d619cc6b3f
feat(ingest): Allow csv-enricher to update more types (#7932)
Co-authored-by: xiphl <xiphlerl9@gmail.com>
2023-05-15 10:38:19 -07:00
Shubham Jagtap
7483d9a4de
fix(ingestion/metabase): metabase connector bigquery lineage fix (#8042)
Co-authored-by: mohdsiddique <mohdsiddiquebagwan@gmail.com>
2023-05-15 14:30:20 +02:00
cccs-Dustin
87d32d7377
feat(ingest/superset): add stateful ingestion (#8013)
2023-05-11 21:56:05 -07:00
Tamas Nemeth
dec54bf098
feat(ingest/s3): Inferring schema from the alphabetically last folder (#8005)
2023-05-10 21:55:05 +02:00
Andrew Sikowitz
44406f7adf
fix(ingest/postgres): Allow specification of initial engine database; set default database to postgres (#7915)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-05-09 11:11:43 -07:00
Mayuri Nehate
c845c75a2d
feat(ingest/snowflake): add config option to specify deny patterns for upstreams (#7962)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-05-08 14:13:57 -07:00
Harshal Sheth
ca5dffa54d
refactor(ingest/biz-glossary): simplify business glossary source (#7912)
2023-05-03 17:01:58 -07:00
Mayuri Nehate
a711baa131
fix(ingest/hive): fix containers generation for hive (#7926)
2023-05-02 15:07:51 +02:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference (#7910)
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Creates UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
Harshal Sheth
916cb21454
test(ingest/biz-glossary): add test for enable_auto_id (#7911)
2023-04-26 19:48:52 -07:00
Harshal Sheth
71ecbd6060
fix(ingest/dbt): ensure dbt shows view properties (#7872)
2023-04-25 12:25:07 -07:00