595 Commits

Author SHA1 Message Date
Tamas Nemeth
d3aed62778
feat(cli): Initial support for sending exceptions to Sentry (#7172) 2023-06-22 10:24:58 +02:00
Mayuri Nehate
ac06cf3d3f
feat(classification): configurable minimum values threshold (#8186) 2023-06-07 21:28:13 -07:00
Andrew Sikowitz
7041281bbe
build(ingest/feast): Pin feast to minor version (#8180) 2023-06-07 10:04:42 +02:00
Andrew Sikowitz
6bad15be5c
fix(ingest): Fix modeldocgen; bump feast to relax pyarrow constraint (#8178) 2023-06-06 13:12:10 -07:00
Mayuri Nehate
983a8ca675
feat(classification): support for regex based custom infotypes (#8177) 2023-06-06 14:41:51 +02:00
Vinícius Mello
7059874dec
feat(ingest/bigquery): Add BigQuery Views lineage extraction from Google Data Catalog API (#8100) 2023-05-25 08:37:46 -07:00
Mayuri Nehate
84270bcac8
feat(ingest/nifi): kerberos authentication (#8097)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Co-authored-by: Indy Prentice <iprentic@users.noreply.github.com>
2023-05-24 15:09:01 -07:00
Tamas Nemeth
4ca7a9b50e
fix(ingest/build): setting typing extension <4.6.0 because it breaks tests (#8108) 2023-05-23 18:55:28 +05:30
Shirshanka Das
b3c790aab6
feat: Add support for Data Products (#8039)
Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
2023-05-17 07:17:25 +00:00
Tamas Nemeth
c0d50d0b2c
fix(ingest/s3) Adding missing more-itertools dependency (#8023) 2023-05-11 12:14:25 -07:00
Andrew Sikowitz
9c7742b1d7
fix(ingest/unity): Update databricks-cli pin (#8024) 2023-05-11 12:14:10 -07:00
Andrew Sikowitz
a68833769e
refactor(ingest/unity): Use databricks-sdk over databricks-cli for usage query (#7981) 2023-05-09 13:30:11 -07:00
Andrew Sikowitz
4e9c398e1d
fix(ingest/unity): Add sqllineage dependency (#7938) 2023-05-01 23:26:49 -04:00
Andrew Sikowitz
eb1674ffdb
fix(ingest/unity-catalog): Add usage_common dependency to unity catalog plugin (#7935) 2023-05-01 14:47:44 -07:00
Andrew Sikowitz
5b290c9bc5
feat(ingest/unity): Add usage extraction; add TableReference (#7910)
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Creates UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
Breaks up proxy.py into implementation and types
2023-05-01 11:30:09 -07:00
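A minimal sketch of two ideas described in the commit message above (#7910): a structured table reference in place of raw strings, and a pluggable user URN builder. This is not the code from that commit. The field names, `from_string`, `qualified_name`, both builder functions, and the `urn:li:corpuser:` prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TableReference:
    """Hypothetical structured reference to a table (field names are assumptions)."""

    catalog: str
    schema: str
    table: str

    @classmethod
    def from_string(cls, ref: str) -> Optional["TableReference"]:
        # Accept only fully qualified "catalog.schema.table" strings.
        parts = ref.split(".")
        if len(parts) != 3 or not all(parts):
            return None
        return cls(*parts)

    @property
    def qualified_name(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.table}"


def default_user_urn_builder(user: str) -> str:
    # Use the identifier as-is; not every user identifier is an email address.
    return f"urn:li:corpuser:{user}"


def email_user_urn_builder(email_domain: str) -> Callable[[str], str]:
    # Alternative builder that appends a default domain when one is missing.
    def build(user: str) -> str:
        name = user if "@" in user else f"{user}@{email_domain}"
        return f"urn:li:corpuser:{name}"

    return build
```

Passing the builder in as a callable lets a source choose between raw identifiers and email-style identifiers without changing the aggregation logic that consumes the URNs.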
Mayuri Nehate
a0c4e0dd46
feat(ingest): add GCS ingestion source (#7903)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-04-27 19:03:41 +02:00
Harshal Sheth
29e5cfd643
fix(ingest): fix minor bug + protective dep requirements (#7861) 2023-04-25 14:35:01 -07:00
Harshal Sheth
f0ea79060b
chore(ingest): bug fix in sqlparse pin (#7848) 2023-04-18 16:05:23 -07:00
Harshal Sheth
cf7eb570a0
fix(ingest): pin sqlparse version (#7847) 2023-04-18 14:25:42 -07:00
Andrew Sikowitz
1ac1ccf26e
perf(ingest/bigquery): Improve bigquery usage disk usage and speed (#7825) 2023-04-14 18:09:43 -07:00
Tamas Nemeth
0cc12bcce7
feat(ingest): redshift - Redshift rework (#6906) 2023-04-12 19:15:43 +02:00
Mayuri Nehate
ec1228f67d
fix(dep): add sqllineage dependency for tableau (#7803) 2023-04-12 15:33:31 +02:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
Harshal Sheth
89734587f7
feat(ingest): add urn modification helper (#7440)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-16 13:27:08 -07:00
Harshal Sheth
b87e4c312a
fix(ingest): pin typeguard version for feast (#7591) 2023-03-15 13:27:20 +05:30
Mayuri Nehate
dac3938077
chore(ingest): snowflake - bump up classification library version to 0.0.6 (#7542) 2023-03-12 10:20:32 -07:00
J Feldman
aa4228734c
feat(ingest/looker): upgrade to Looker API from 3.1 to 4.0 (#7411)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2023-03-07 16:49:37 -08:00
Mayuri Nehate
dc2a7d8a46
chore(ingest): remove unused dependency for bigquery (#7510)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 16:43:14 -08:00
cburroughs
cc0772f8d8
feat(ingest): unbundle airflow plugin emitter dependencies (#7493)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-03-07 09:07:42 -08:00
Harshal Sheth
b4dd1d7d82
chore(ingest): pin acryl-datahub-classify (#7485) 2023-03-03 11:36:20 -05:00
Aseem Bansal
1adbc2cab0
chore(ci): upgrade GE version (#7290) 2023-03-02 10:47:38 -08:00
Aseem Bansal
f8a73005d4
chore(ci): relax bigquery dependency (#7309) 2023-02-21 08:33:00 +01:00
Tamas Nemeth
097d4e6bbd
fix(dep/json-schema): Fixing json-schema dependencies (#7383) 2023-02-20 14:02:08 -08:00
Shirshanka Das
07e4d0696f
feat(ingest): json-schema - add json schema support for files and kaf… (#7361) 2023-02-19 08:43:13 -08:00
Andrew Sikowitz
a605f0752f
fix(deps): pin snowflake-connector-python (#7365)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-02-18 10:44:55 +01:00
Harshal Sheth
582fdf95cd
chore(ingest): upgrade to mypy 1.0.0 (#7313) 2023-02-10 13:24:05 -08:00
Tamas Nemeth
793f303a79
fix(ingest/bigquery): Lowering significantly the memory usage of the BigQuery connector (#7315) 2023-02-10 13:12:02 -08:00
Harshal Sheth
55442042ff
feat(cli): improve startup time (#7292) 2023-02-10 21:36:01 +05:30
Harshal Sheth
e3af6168d3
fix(ingest): upgrade feast to avoid build issues (#7218) 2023-02-02 15:24:28 +01:00
david-leifker
39920bb00f
feat(elasticsearch): Elasticsearch improvements (#6894) 2023-01-31 18:44:37 -06:00
Patrick Franco Braz
8ee9fa1930
feat(ingest): bigquery - extracts lineage metadata from catalog api (#7137) 2023-01-31 15:02:30 +01:00
Harshal Sheth
927d45dda9
feat(ingest): add --log-file option and show CLI logs in UI report (#7118)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-01-26 09:25:02 -08:00
Harshal Sheth
54c5017efd
feat(ingest): move datahub-lite to optional dep and add shim when missing (#7097) 2023-01-20 17:24:43 -08:00
Harshal Sheth
13cc16fbc2
fix(cli/lite): fix datahub lite serve command (#7089) 2023-01-20 10:21:24 +01:00
Shirshanka Das
bdcc356cc5
feat(datahub-lite): introduces a new experimental lightweight impleme… (#7052) 2023-01-18 19:18:56 -08:00
Harshal Sheth
890dae0199
fix(ingest): temporarily disable vertica tests (#7059) 2023-01-17 12:37:16 -08:00
Rajasekhar-Vuppala
cd9fc26a25
feat(ingest/vertica): Adding Vertica as source in Datahub UI (#7010)
Co-authored-by: Vishal <vishal.k@simplify3x.com>
Co-authored-by: VISHAL KUMAR <110387730+vishalkSimplify@users.noreply.github.com>
Co-authored-by: John Joyce <john@acryl.io>
2023-01-13 13:23:32 -08:00
Harshal Sheth
211c30fe30
fix(ingest): add missing dep for powerbi (#6969) 2023-01-06 18:16:32 -05:00
VISHAL KUMAR
96ac4c431f
feat(ingest/vertica): support projections and lineage in vertica (#6785)
Co-authored-by: mraman2512 <MY_mramaan2512@gmail.com>
Co-authored-by: Aman.Kumar <64635307+mraman2512@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-01-06 16:20:19 -05:00
Aseem Bansal
d55ad6ca14
fix(ci): restrict GE to fix build issues (#6967) 2023-01-06 18:25:36 +05:30