905 Commits

Author SHA1 Message Date
Andrew Sikowitz
e82e284982
fix(ingest/kafka): Remove topic from kafka browse path (#7398) 2023-02-22 18:38:08 -05:00
Andrew Sikowitz
2764c44977
fix(ingest): Do not require platform_instance for stateful ingestion (#7397) 2023-02-21 21:27:44 -05:00
Aseem Bansal
986086ae00
test(cli): add check for missing init files (#7378) 2023-02-20 18:41:12 +05:30
Shirshanka Das
07e4d0696f
feat(ingest): json-schema - add json schema support for files and kaf… (#7361) 2023-02-19 08:43:13 -08:00
Mayuri Nehate
2cffec9452
fix(check upgrade): update logic to compare server and client version (#7238)
Co-authored-by: John Joyce <john@acryl.io>
2023-02-13 13:09:38 -08:00
Andrew Sikowitz
8901498582
fix(transformers): pattern add domain transformer - enable replace_existing (#7317) 2023-02-13 12:52:44 -08:00
Tamas Nemeth
f10d622e47
fix(ingest/bigquery): Improve memory usage of lineage extraction (#7326) 2023-02-13 19:59:11 +01:00
Tamas Nemeth
b34e4fe1f1
fix(ingest/bigquery): Fix for table cache was not cleared (#7323) 2023-02-13 19:04:19 +01:00
Harshal Sheth
55442042ff
feat(cli): improve startup time (#7292) 2023-02-10 21:36:01 +05:30
Tamas Nemeth
1402071e48
fix(ingest/bigquery) - Fix for Bigquery parser quoted semicolon in the FROM table name as well (#7277) 2023-02-08 10:18:55 +01:00
Daniel Messias
0d67e188ef
feat(glue): Use table name as human-readable name for Glue ingestion (#7213)
Co-authored-by: John Joyce <john@acryl.io>
2023-02-02 18:04:35 +01:00
Harshal Sheth
db1a0f13f3
fix(ingest): fix issue in glue tests (#7185) 2023-01-30 21:51:21 -08:00
Harshal Sheth
927d45dda9
feat(ingest): add --log-file option and show CLI logs in UI report (#7118)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-01-26 09:25:02 -08:00
Harshal Sheth
45f50d2614
test(ingest): fix kafka admin client mocking (#7098)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-01-23 16:22:20 +01:00
Tamas Nemeth
0cdb5e4b4b
refactor(ingest/containers): Refactoring container creation to common place (#6877) 2023-01-21 00:14:31 +01:00
Harshal Sheth
e23eb7108f
feat(ingest): reporting revamp, part 1 (#7031) 2023-01-18 13:34:32 -08:00
Harshal Sheth
35bd73a28b
feat(ingest): fix handling of unions with aliases in post restli conversion (#7058) 2023-01-18 09:29:46 -08:00
Tim
e2ad881d79
refactor(ingest/athena): Replace s3_staging_dir parameter in Athena source with query_result_location (#7044)
Co-authored-by: John Joyce <john@acryl.io>
2023-01-18 09:25:37 -08:00
Harshal Sheth
cb12910b6b
feat(ingest): add entity registry in codegen (#6984)
Co-authored-by: Pedro Silva <pedro@acryl.io>
2023-01-17 19:41:43 -08:00
Harshal Sheth
432feaa16d
feat(ingest): mark database_alias and env as deprecated (#6901) 2023-01-09 19:58:19 +05:30
Harshal Sheth
f651646d3d
chore(ingest): remove inferred args to MCPW, part 2 (#6905) 2023-01-04 23:29:56 -05:00
Harshal Sheth
dfc5c6bfce
chore(ingest): remove inferred args to MCPW, part 1 (#6819) 2022-12-30 01:26:47 -05:00
Tamas Nemeth
ead0074169
deprecate(ingest): bigquery - Removing bigquery-legacy source (#6851)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-29 13:19:05 -08:00
Harshal Sheth
667ca8632d
feat(ingest): avoid embedding serialized json in metadata files (#6742) 2022-12-28 19:28:38 -05:00
Mayuri Nehate
2129496c98
feat(ingest/snowflake): handle failures gracefully and raise permission failures (#6748) 2022-12-28 08:20:37 -08:00
Harshal Sheth
31260888fc
feat(ingest/airflow): support raw dataset urns in airflow lineage (#6854)
* feat(ingest/airflow): support dataset Urns in airflow lineage

This PR also
- resolves a reported circular import issue
- refactors the Airflow tests to reduce duplication

* fix test
2022-12-27 08:59:26 +01:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties (#6783) 2022-12-22 09:34:55 +01:00
John Joyce
2e3a25123d
refactor(ingestion): Browse Paths Upgrade V2 Feast & Sagemaker (#6002) 2022-12-21 08:02:59 -08:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn (#6795) 2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes (#6791) 2022-12-19 13:40:48 -05:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Mayuri Nehate
9716a49067
fix(ingest): correct external url for account identifier with account name (#6715) 2022-12-16 14:00:42 -05:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state (#6762) 2022-12-15 11:33:29 -05:00
Shirshanka Das
db182e4639
fix(python-sdk): DataHubGraph get_aspect should accept empty responses (#6760) 2022-12-14 10:40:16 -08:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint (#6722) 2022-12-14 12:39:21 -05:00
Patrick Franco Braz
f0a371941e
refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase (#6739) 2022-12-14 14:58:03 +01:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state (#6740) 2022-12-13 10:05:57 +01:00
Harshal Sheth
7d63399d00
fix(ingest): fix serde for empty dicts in unions with null (#6745)
The code changes in https://github.com/acryldata/avro_gen/pull/16, but tests are written here.
2022-12-13 08:17:24 +01:00
Dmitry Bryazgin
551ef1b335
feat(ingest): add stateful ingestion to the ldap source (#6127)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-13 01:13:39 -05:00
Mayuri Nehate
65ba13d9aa
feat(ingest): snowflake - add separate config for include_column_lineage in snowflake (#6712) 2022-12-12 15:23:12 +01:00
Harshal Sheth
fd911c9820
feat(ingest): redact configs reported in ingestion_run_summary (#6696) 2022-12-12 10:48:26 +01:00
Harshal Sheth
b7735d5b21
fix(ingest): fix bug in auto_status_aspect (#6705)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-09 12:24:39 -05:00
Harshal Sheth
4dded454ff
fix(ingest): cleanup config extra usage (#6699) 2022-12-08 16:34:34 -08:00
Mayuri Nehate
eeb7a9dfe5
feat(ingest): snowflake - update snowflake docs, add simple validations (#6636) 2022-12-07 14:56:03 +01:00
Tamas Nemeth
9a1f78fc60
fix(ingest): profiling - Changing profiling defaults on Bigquery and Snowflake (#6640) 2022-12-07 10:33:10 +01:00
Harshal Sheth
a1e62c723e
docs(ingest): add airflow docs that use the PythonVirtualenvOperator (#6604) 2022-12-02 19:56:17 +01:00
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00