935 Commits

Author SHA1 Message Date
Harshal Sheth
f651646d3d
chore(ingest): remove inferred args to MCPW, part 2 (#6905) 2023-01-04 23:29:56 -05:00
Harshal Sheth
dfc5c6bfce
chore(ingest): remove inferred args to MCPW, part 1 (#6819) 2022-12-30 01:26:47 -05:00
Tamas Nemeth
ead0074169
deprecate(ingest): bigquery - Removing bigquery-legacy source (#6851)
Co-authored-by: John Joyce <john@acryl.io>
2022-12-29 13:19:05 -08:00
Harshal Sheth
667ca8632d
feat(ingest): avoid embedding serialized json in metadata files (#6742) 2022-12-28 19:28:38 -05:00
Mayuri Nehate
2129496c98
feat(ingest/snowflake): handle failures gracefully and raise permission failures (#6748) 2022-12-28 08:20:37 -08:00
Harshal Sheth
31260888fc
feat(ingest/airflow): support raw dataset urns in airflow lineage (#6854)
* feat(ingest/airflow): support dataset Urns in airflow lineage

This PR also
- resolves a reported circular import issue
- refactors the Airflow tests to reduce duplication

* fix test
2022-12-27 08:59:26 +01:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties (#6783) 2022-12-22 09:34:55 +01:00
John Joyce
2e3a25123d
refactor(ingestion): Browse Paths Upgrade V2 Feast & Sagemaker (#6002) 2022-12-21 08:02:59 -08:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn (#6795) 2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes (#6791) 2022-12-19 13:40:48 -05:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Mayuri Nehate
9716a49067
fix(ingest): correct external url for account identifier with account name (#6715) 2022-12-16 14:00:42 -05:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state (#6762) 2022-12-15 11:33:29 -05:00
Shirshanka Das
db182e4639
fix(python-sdk): DataHubGraph get_aspect should accept empty responses (#6760) 2022-12-14 10:40:16 -08:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint (#6722) 2022-12-14 12:39:21 -05:00
Patrick Franco Braz
f0a371941e
refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase (#6739) 2022-12-14 14:58:03 +01:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state (#6740) 2022-12-13 10:05:57 +01:00
Harshal Sheth
7d63399d00
fix(ingest): fix serde for empty dicts in unions with null (#6745)
The code changes in https://github.com/acryldata/avro_gen/pull/16, but tests are written here.
2022-12-13 08:17:24 +01:00
Dmitry Bryazgin
551ef1b335
feat(ingest): add stateful ingestion to the ldap source (#6127)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-13 01:13:39 -05:00
Mayuri Nehate
65ba13d9aa
feat(ingest): snowflake - add separate config for include_column_lineage in snowflake (#6712) 2022-12-12 15:23:12 +01:00
Harshal Sheth
fd911c9820
feat(ingest): redact configs reported in ingestion_run_summary (#6696) 2022-12-12 10:48:26 +01:00
Harshal Sheth
b7735d5b21
fix(ingest): fix bug in auto_status_aspect (#6705)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-12-09 12:24:39 -05:00
Harshal Sheth
4dded454ff
fix(ingest): cleanup config extra usage (#6699) 2022-12-08 16:34:34 -08:00
Mayuri Nehate
eeb7a9dfe5
feat(ingest): snowflake - update snowflake docs, add simple validations (#6636) 2022-12-07 14:56:03 +01:00
Tamas Nemeth
9a1f78fc60
fix(ingest): profiling - Changing profiling defaults on Bigquery and Snowflake (#6640) 2022-12-07 10:33:10 +01:00
Harshal Sheth
a1e62c723e
docs(ingest): add airflow docs that use the PythonVirtualenvOperator (#6604) 2022-12-02 19:56:17 +01:00
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00
Tamas Nemeth
8d525d67a9
fix(ingest): kafka - properly picking doc from union type (#6472) 2022-11-23 20:42:21 +01:00
Mayuri Nehate
22847a987a
feat(ingest): automated term classification for snowflake (#6376) 2022-11-23 00:43:30 -05:00
Harshal Sheth
74cc88f2df
fix(ingest): correctly handle transformer patch semantics (#6505) 2022-11-22 09:29:57 -08:00
Harshal Sheth
05a0f3e2a6
feat(ingest): dbt cloud integration (#6323) 2022-11-21 14:14:33 -05:00
Harshal Sheth
3e907ab0d1
feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ (#6204)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-11 15:04:36 -05:00
Harshal Sheth
1a81d8de6a
feat(ingest): supports MCEs in domain transformer (#6364) 2022-11-05 11:41:43 -07:00
Harshal Sheth
0ca3383d30
feat(ingest): support reserved keywords in model codegen (#6351) 2022-11-02 22:26:56 -07:00
Harshal Sheth
b4687ffceb
feat(ingest): drop plugin support for airflow 1.x (#6331) 2022-11-01 21:12:34 -07:00
Harshal Sheth
ef824bd082
feat(ingest): add fallthrough support to KeyValuePattern (#6302) 2022-10-28 11:07:47 +02:00
Tamas Nemeth
9015a43f25
fix(ingest): bigquery-beta - Adding python 3.8 fix for memory footprint util (#6228) 2022-10-18 17:59:31 -07:00
Harshal Sheth
d08f5f7cdd
feat(ingest): replace base85's pickle with json (#6178) 2022-10-14 14:48:44 -07:00
Harshal Sheth
09616ee2b3
feat(ingest): include instance in container dataPlatform when provided (#6083) 2022-10-13 11:29:54 -07:00
Tamas Nemeth
6e34cd6001
feat(ingest): bigquery-beta - Parsing view ddl definition for lineage (#6187)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-10-12 18:24:04 -07:00
Harshal Sheth
95d1e01195
feat(ingest): infer aspect name from type in get_aspect (#6033) 2022-10-11 13:35:41 -07:00
Mayuri Nehate
7b88de89d5
fix(ingest): snowflake - allow profiling to work with geography type (#6162) 2022-10-10 08:05:09 -07:00
Shirshanka Das
e9c4c823d8
fix(ingest): bigquery-beta - ensure that status aspect is emitted for… (#6154)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-10-08 16:00:45 -07:00
Tamas Nemeth
2f79b50c24
fix(ingest): presto-on-hive - not failing on Hive type parsing error (#6118)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-10-04 20:54:38 -07:00
Ravindra Lanka
055e4082da
fix(ingestion): fix percent change computation in stale_entity_removal (#6121) 2022-10-04 20:40:59 -07:00
Tamas Nemeth
3b9e9793a7
fix(ingest): bigquery-beta - handling complex types properly (#6062) 2022-09-27 21:31:24 +02:00
Harshal Sheth
3f1d47c069
feat(ingest): list referenced env variables in recipe (#6043) 2022-09-26 23:16:18 -07:00