362 Commits

Author SHA1 Message Date
Harshal Sheth
3428bcaaad
fix(ingest): add tableau sqlglot dep (#8552) 2023-08-02 15:18:06 -03:00
Pedro Silva
a4a8182001
feat(cli): Adds ability to upload recipes to DataHub's UI (#8317)
Co-authored-by: Indy Prentice <iprentic@users.noreply.github.com>
2023-08-01 17:35:42 -03:00
VISHAL KUMAR
ef3b9489aa
feat(ingest/vertica): performance improvement and bug fixes (#8328)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-01 19:34:35 +05:30
Harshal Sheth
d8b2397b93
fix(ingest): pin boto3-stubs in CI (#8527)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-07-31 19:48:05 -07:00
Harshal Sheth
89f23d3c36
chore(ingest): bump sqllineage and sqlparse (#8481) 2023-07-28 13:10:19 -07:00
Harshal Sheth
d733363bed
chore(ingest): drop bigquery-beta and snowflake-beta aliases (#8451) 2023-07-20 14:05:25 -04:00
Aseem Bansal
9df70d7355
ingest(elasticsearch): add basic profiling (#8351) 2023-07-20 08:25:30 +05:30
Andrew Sikowitz
48c1dc820e
build(ingest/boto3): Update boto3-stubs to fix CI (#8452) 2023-07-18 21:29:50 +00:00
Andrew Sikowitz
20b3adb7b1
fix(ingest/snowflake): Add sqlglot as snowflake dependency (#8427) 2023-07-14 21:31:24 -04:00
Andrew Sikowitz
f41f642eaf
build(ingest/boto3): Update boto3-stubs to fix CI (#8425) 2023-07-14 15:48:04 -07:00
mohdsiddique
cbbe083731
fix(ingestion/powerbi): increment msal version (#8385)
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
2023-07-13 17:33:19 +05:30
Tamas Nemeth
54c7aef1bc
feat(ingest/presto-on-hive): Extracting all the table properties from Hive Metastore (#8348)
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-07-12 15:56:13 -03:00
Andrew Sikowitz
2261531e31
test(ingest): Aspect level golden file comparison (#8310) 2023-07-11 10:39:47 -04:00
Harshal Sheth
3e47b3d228
feat(ingest): schema-aware SQL parsing for column-level lineage (#8334) 2023-07-07 16:24:35 -07:00
Andrew Sikowitz
8617e072fa
build(ingest): Pin pydeequ to unblock CI (#8381) 2023-07-07 16:18:52 -04:00
Andrew Sikowitz
8a198cd615
fix(ingest/unity): Pin databricks-sdk and update docs (#8293) 2023-06-27 13:38:55 -04:00
Andrew Sikowitz
584366771d
refactor(unity): Remove databricks_cli and cleanup (#8249)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-06-23 18:01:05 +05:30
Tamas Nemeth
d3aed62778
feat(cli): Initial support for sending exceptions to Sentry (#7172) 2023-06-22 10:24:58 +02:00
Mayuri Nehate
ac06cf3d3f
feat(classification): configurable minimum values threshold (#8186) 2023-06-07 21:28:13 -07:00
Andrew Sikowitz
7041281bbe
build(ingest/feast): Pin feast to minor version (#8180) 2023-06-07 10:04:42 +02:00
Andrew Sikowitz
6bad15be5c
fix(ingest): Fix modeldocgen; bump feast to relax pyarrow constraint (#8178) 2023-06-06 13:12:10 -07:00
Mayuri Nehate
983a8ca675
feat(classification): support for regex based custom infotypes (#8177) 2023-06-06 14:41:51 +02:00
Vinícius Mello
7059874dec
feat(ingest/bigquery): Add BigQuery Views lineage extraction from Google Data Catalog API (#8100) 2023-05-25 08:37:46 -07:00
Mayuri Nehate
84270bcac8
feat(ingest/nifi): kerberos authentication (#8097)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Co-authored-by: Indy Prentice <iprentic@users.noreply.github.com>
2023-05-24 15:09:01 -07:00
Tamas Nemeth
4ca7a9b50e
fix(ingest/build): setting typing extension <4.6.0 because it breaks tests (#8108) 2023-05-23 18:55:28 +05:30
Shirshanka Das
b3c790aab6
feat: Add support for Data Products (#8039)
Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
2023-05-17 07:17:25 +00:00
Tamas Nemeth
c0d50d0b2c
fix(ingest/s3) Adding missing more-itertools dependency (#8023) 2023-05-11 12:14:25 -07:00
Andrew Sikowitz
9c7742b1d7
fix(ingest/unity): Update databricks-cli pin (#8024) 2023-05-11 12:14:10 -07:00
Andrew Sikowitz
a68833769e
refactor(ingest/unity): Use databricks-sdk over databricks-cli for usage query (#7981) 2023-05-09 13:30:11 -07:00
Andrew Sikowitz
4e9c398e1d
fix(ingest/unity): Add sqllineage dependency (#7938) 2023-05-01 23:26:49 -04:00
Andrew Sikowitz
eb1674ffdb
fix(ingest/unity-catalog): Add usage_common dependency to unity catalog plugin (#7935) 2023-05-01 14:47:44 -07:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference (#7910)
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Creates UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
- Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
Mayuri Nehate
a0c4e0dd46
feat(ingest): add GCS ingestion source (#7903)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-27 19:03:41 +02:00
Harshal Sheth
29e5cfd643
fix(ingest): fix minor bug + protective dep requirements (#7861) 2023-04-25 14:35:01 -07:00
Harshal Sheth
f0ea79060b
chore(ingest): bug fix in sqlparse pin (#7848) 2023-04-18 16:05:23 -07:00
Harshal Sheth
cf7eb570a0
fix(ingest): pin sqlparse version (#7847) 2023-04-18 14:25:42 -07:00
Andrew Sikowitz
1ac1ccf26e
perf(ingest/bigquery): Improve bigquery usage disk usage and speed (#7825) 2023-04-14 18:09:43 -07:00
Tamas Nemeth
0cc12bcce7
feat(ingest): redshift - Redshift rework (#6906) 2023-04-12 19:15:43 +02:00
Mayuri Nehate
ec1228f67d
fix(dep): add sqllineage dependency for tableau (#7803) 2023-04-12 15:33:31 +02:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
Harshal Sheth
89734587f7
feat(ingest): add urn modification helper (#7440)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-16 13:27:08 -07:00
Harshal Sheth
b87e4c312a
fix(ingest): pin typeguard version for feast (#7591) 2023-03-15 13:27:20 +05:30
Mayuri Nehate
dac3938077
chore(ingest): snowflake - bump up classification library version to 0.0.6 (#7542) 2023-03-12 10:20:32 -07:00
J Feldman
aa4228734c
feat(ingest/looker): upgrade to Looker API from 3.1 to 4.0 (#7411)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-03-07 16:49:37 -08:00
Mayuri Nehate
dc2a7d8a46
chore(ingest): remove unused dependency for bigquery (#7510)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 16:43:14 -08:00
cburroughs
cc0772f8d8
feat(ingest): unbundle airflow plugin emitter dependencies (#7493)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 09:07:42 -08:00
Harshal Sheth
b4dd1d7d82
chore(ingest): pin acryl-datahub-classify (#7485) 2023-03-03 11:36:20 -05:00
Aseem Bansal
1adbc2cab0
chore(ci): upgrade GE version (#7290) 2023-03-02 10:47:38 -08:00
Aseem Bansal
f8a73005d4
chore(ci): relax bigquery dependency (#7309) 2023-02-21 08:33:00 +01:00
Tamas Nemeth
097d4e6bbd
fix(dep/json-schema): Fixing json-schema dependencies (#7383) 2023-02-20 14:02:08 -08:00