315 Commits

Author SHA1 Message Date
Tamas Nemeth
a8f0080c08
feat(ingest/teradata): Teradata source (#8977) 2023-10-12 15:14:45 -07:00
Pedro Silva
f6e1312063
feat(ingestion): Adds support for memory profiling (#8856)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-12 18:43:14 +01:00
Jinlin Yang
6310e51eb0
feat(ingestion/dynamodb): implement pagination for list_tables (#8910) 2023-10-05 09:33:31 +05:30
Mayuri Nehate
2fcced6db9
docs(ingest): add permissions required for athena ingestion (#8948) 2023-10-05 09:31:57 +05:30
ethan-cartwright
e2afd44bfe
feat(dbt-ingestion): add documentation link from dbt source to institutionalMemory (#8686)
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-10-04 20:38:58 +00:00
Andrew Sikowitz
d3346a04e4
feat(ingest/unity): Ingest notebooks and their lineage (#8940) 2023-10-04 10:22:45 -04:00
siddiquebagwan-gslab
c415d63dda
feat(ingestion/powerbi): column level lineage extraction for M-Query (#8796) 2023-10-04 16:22:51 +05:30
Aseem Bansal
ad313ad282
feat(transfomer): add transformer to get ownership from tags (#8748) 2023-10-04 14:06:03 +05:30
hariishaa
622816dcb8
feat(metadata-ingestion): implement mlflow source (#7971)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-09-26 13:51:30 -04:00
Shubham Jagtap
501522d891
feat(ingest/kafka-connect): Lineage for Kafka Connect > Snowflake (#8811)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:12:48 -07:00
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Harshal Sheth
5bb9f30895
docs(ingest/lookml): add guide on debugging lkml parse errors (#8890) 2023-09-22 16:55:15 -07:00
Tony Ouyang
f4da93988e
feat(ingestion/dynamodb): Add DynamoDB as new metadata ingestion source (#8768)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-09-15 13:26:17 -07:00
Hyejin Yoon
31abf383d1
ci: add markdown-link-check (#8771) 2023-09-14 11:34:21 +09:00
Adriano Vega Llobell
3cc0f76d17
docs(transformer): fix names in sample code of 'pattern_add_dataset_domain' (#8755) 2023-09-12 14:34:24 -07:00
cccs-eric
6fe60a274e
feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 (#8357)
Co-authored-by: cccs-Dustin <96579982+cccs-Dustin@users.noreply.github.com>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-31 13:01:05 -04:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Tamas Nemeth
d86b336e70
chore(ingest/s3) Bump Deequ and Pyspark version (#8638)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-08-29 18:11:37 +02:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Hyejin Yoon
04ecf4f75a
docs(docs): add native versioning (#8714) 2023-08-25 14:10:13 -07:00
Mayuri Nehate
e285da3e75
feat(ingest/snowflake): tables from snowflake shares as siblings (#8531) 2023-08-24 10:23:07 -04:00
RChygir
43d48ddde4
feat(ingest/mssql): load jobs and stored procedures (#5363) 2023-08-24 14:48:03 +05:30
Mayuri Nehate
3681e1a128
docs(ingest/kafka-connect): add details on platform instance mapping (#8654) 2023-08-18 18:51:14 +05:30
alplatonov
11fdfcf956
Fix(ingestion/clickhouse) move to two tier sqlalchemy (#8300)
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2023-08-11 16:11:40 -04:00
Mayuri Nehate
7bb4e7b90d
docs(ingest): update s3 and gcs doc with concept mapping (#8575) 2023-08-11 11:01:15 -07:00
Jarod Smilkstein
f51bd01a70
feat(ingest): add ability to read other method types than GET for OAS ingest recipes (#8303) 2023-08-02 09:54:09 +05:30
Kirill Popov
eec89a884a
feat(ingest): Add metabase database id to platform instance mapping (#8359) 2023-08-02 09:53:48 +05:30
Mayuri Nehate
e67f811034
feat(classification): allow parallelisation to reduce time (#8368) 2023-08-02 09:53:39 +05:30
VISHAL KUMAR
ef3b9489aa
feat(ingest/vertica): performance improvement and bug fixes (#8328)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-01 19:34:35 +05:30
Harshal Sheth
99f1624ce7
docs(ingest/lookml): clarify connection map config (#8508) 2023-07-27 17:06:04 +05:30
Mayuri Nehate
5a3f91de53
docs(ingest/bigquery): add permissions to profile google drive backed… (#8490) 2023-07-27 17:01:39 +05:30
Mayuri Nehate
f4fde21168
feat(ingest/nifi): add support for basic auth in nifi (#8457) 2023-07-20 14:12:33 -04:00
Aseem Bansal
9df70d7355
ingest(elasticsearch): add basic profiling (#8351) 2023-07-20 08:25:30 +05:30
Andrew Sikowitz
8a198cd615
fix(ingest/unity): Pin databricks-sdk and update docs (#8293) 2023-06-27 13:38:55 -04:00
Ellie O'Neil
8880b47ca1
docs(business glossary) Update business glossary docs (#8287) 2023-06-26 11:00:09 -07:00
Ellie O'Neil
795b185f59
docs(ingest/lineage): Update fine grained file lineage docs (#8283) 2023-06-23 12:29:54 +05:30
Mayuri Nehate
ac06cf3d3f
feat(classification): configurable minimum values threshold (#8186) 2023-06-07 21:28:13 -07:00
Andrew Sikowitz
9fa8489cb8
feat(ingest/snowflake): Okta OAuth support; update docs (#8157) 2023-06-07 01:09:05 -07:00
Harshal Sheth
690ed083d9
feat(ingest): add more fail-safes to stateful ingestion (#8111) 2023-05-31 18:49:48 -07:00
Mayuri Nehate
abc2f85331
docs(ingest/nifi): fix broken links (#8143) 2023-05-30 11:04:15 -07:00
Gabe Lyons
ada6ea5a45
docs(csv-enricher): add example csv file & recipe (#8141) 2023-05-29 19:02:26 +05:30
Mayuri Nehate
3e727c5e9c
docs(glue): fix broken link (#8135)
Co-authored-by: Pedro Silva <pedro@acryl.io>
2023-05-26 09:25:59 -05:00
Mayuri Nehate
f2c53a3660
feat(ingest/glue): report glue job lineage failures, update doc (#8126) 2023-05-26 10:30:03 +02:00
Andrew Sikowitz
d3cd4dbb0c
feat(ingest/unity): Allow ingestion without metastore admin role (#8091)
- Adds more detailed docs and connection test
- Fixes empty username queries
2023-05-24 15:36:22 -07:00
Mayuri Nehate
84270bcac8
feat(ingest/nifi): kerberos authentication (#8097)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Co-authored-by: Indy Prentice <iprentic@users.noreply.github.com>
2023-05-24 15:09:01 -07:00
Mayuri Nehate
798ce3d6c8
feat(classification): configurable sample size (#8096)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
2023-05-24 00:07:01 -07:00
matthew-piatkus-cko
bfde4662c7
fix(ingest/salesforce): support JSON web token auth (#7963) 2023-05-05 18:17:43 +00:00
Harshal Sheth
a9e0038199
docs(ingest/postgres): add example with ssl configuration (#7916)
Co-authored-by: John Joyce <john@acryl.io>
2023-05-03 15:22:24 -07:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference (#7910)
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Creates UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
Mayuri Nehate
a0c4e0dd46
feat(ingest): add GCS ingestion source (#7903)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-27 19:03:41 +02:00