61 Commits

Author SHA1 Message Date
Mayuri Nehate
9568a4254d
feat: separate great-expectations action package (#11096)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2024-08-21 12:13:36 -04:00
Gabe Lyons
423af83ef1
feat(rest-emitter): adding async flag to rest emitter (#10902)
Co-authored-by: Gabe Lyons <gabe.lyons@acryl.io>
2024-07-12 13:30:21 -07:00
Harshal Sheth
f4be88d0a9
feat(ingest): set pipeline name in system metadata (#10190)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
2024-06-27 15:00:35 -07:00
Harshal Sheth
3d5735cbc5
chore(ingest): run pyupgrade for python 3.8 (#10513) 2024-05-15 22:31:05 -07:00
dushayntAW
a164b70e1d
chore(ingest/presto-on-hive) Set enable_properties_merge to True by default (#10469) 2024-05-15 18:57:13 +05:30
Harshal Sheth
0d780e5f8f
feat(ingest): sql parsing aggregator (#9786) 2024-02-09 16:27:45 -05:00
Harshal Sheth
0e418b527e
fix(ingest): upgrade pytest-docker (#9765) 2024-02-01 16:33:15 -08:00
Shubham Jagtap
1741c07d76
feat(ingestion): Add test_connection methods for important sources (#9334) 2023-12-14 12:31:51 -05:00
Harshal Sheth
f9fd9467ef
feat(ingest): clean up DataHubRestEmitter return type (#9286)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-11-30 21:00:43 -05:00
Harshal Sheth
73514ad9c5
fix(ingest): cleanup large images in CI (#9153) 2023-10-31 21:28:38 -07:00
Harshal Sheth
9deb7be3fc
fix(ingest): refactor test markers + fix disk space issues in CI (#8938) 2023-10-03 20:17:49 -07:00
Andrew Sikowitz
2261531e31
test(ingest): Aspect level golden file comparison (#8310) 2023-07-11 10:39:47 -04:00
Harshal Sheth
2d442161c4
ci(ingest/kafka): improve kafka integration test reliability (#8085) 2023-05-25 15:40:56 -07:00
Andrew Sikowitz
fdbc4de695
refactor(ingest): Call source_helpers via new WorkUnitProcessors in base Source (#8101) 2023-05-24 13:36:19 -07:00
Harshal Sheth
b0f8c3de1e
refactor(ingest): simplify stateful ingestion provider interface (#8104)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-05-23 12:57:57 -07:00
Shirshanka Das
151eab3628
fix(build): fix lint issue (#8066) 2023-05-17 08:17:52 -07:00
Shirshanka Das
b3c790aab6
feat: Add support for Data Products (#8039)
Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
2023-05-17 07:17:25 +00:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
Harshal Sheth
667ca8632d
feat(ingest): avoid embedding serialized json in metadata files (#6742) 2022-12-28 19:28:38 -05:00
cccs-eric
ec8a4e0eab
feat(ingest): upgrade pydantic version (#6858)
This PR also removes the requirement on docker-compose v1 and makes our tests use v2 instead.

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-12-27 17:06:16 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
85bb1f5030
test(ingest): make hive/trino test more reliable (#6741) 2022-12-12 21:02:52 -05:00
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00
Harshal Sheth
3e907ab0d1
feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ (#6204)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-11 15:04:36 -05:00
Shirshanka Das
c5fc2ea798
fix(ingest): looker - deps, column level lineage fixes (#6271) 2022-10-24 08:31:48 +02:00
Harshal Sheth
73fd35888b
build(ingest): remove markupsafe dep and bump pytest-docker (#6201) 2022-10-14 18:59:40 -07:00
Harshal Sheth
220ae0b6c9
feat(ingest): make sink use type annotations (#5899) 2022-09-10 19:46:20 -07:00
Piotr Sierkin
828a711684
feat(ingest): dbt - control over emitting test_results, test_definitions, etc. (#5328)
Co-authored-by: Piotr Sierkin <piotr.sierkin@getindata.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-06 21:42:53 -07:00
Shirshanka Das
558a65a3c3
fix(ci): fix mysql test and attempt kafka-connect ingestion (#5352) 2022-07-07 08:28:34 -07:00
Shirshanka Das
e93e4691fb
feat(ingest): lookml - adding support for only emitting reachable views from explores (#5333) 2022-07-05 10:14:12 -07:00
buggythepirate
92338c7912
feat(ingest): Added new ingestion source SAP HANA (#4376)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-26 03:42:50 -07:00
Swaroop Jagadish
35b187a8d4
feat(ingest): transformers - add support for processing MCP-s (#4337) 2022-03-07 13:14:29 -08:00
Claudio Benfatto
aeefde4fa1
feat(ingestion): Kafka stateful ingestion (#4028)
* test: test stateful ingestion for kafka

test: some more advancement

test: some improvements

refactoring

* refactor: remove some linter modifications

* tests: add unit tests for kafka state

* refactor: minor changes

* tests: improve test coverage

* fix: fix naming

* style: fix format with black

* fix: fix broken test

* revert: revert smoke tests to master

* feat: add reporting to kafka source

* tests: add smoke tests for kafka reporting

* revert: revert changes to the smoke tests

* test: add kafka integration test for stateful ingestion

* docs: update documentation on kafka source

* fix: return empty string when no platform instance

* revert: remove unwanted file

* fix: solve problem with platform instance

* chore: use console sink instead of file

* fix: disable complexity check for _extract_record

* fix: remove if condition in get_platform_instance_id

* chore: remove unneeded integration test

* test: test platform instance in kafka source unit tests
2022-02-15 07:18:36 -08:00
Tamas Nemeth
63bc830cfe
Data domain containers ingestion (#4051) 2022-02-07 09:51:49 -08:00
Swaroop Jagadish
ded16809da
feat(ingest): add tests for platform instance (#4047) 2022-02-02 22:52:50 -08:00
Kevin Hu
5f701ebb3c
fix(cli): disable telemetry in CLI tests (#3877) 2022-01-12 00:25:42 -08:00
Harshal Sheth
22cef5f897
refactor(test): replace CliRunner with run_datahub_cmd method (#3746) 2021-12-16 20:07:38 -08:00
Swaroop Jagadish
ebdd30bb73
feat(model): adding a field to capture unmodeled field level properties (#3593) 2021-11-17 17:34:20 -08:00
mayurinehate
192f0d33a2
feat(ingest): kafka connect source improvements (#3481) 2021-11-03 15:03:05 -07:00
varunbharill
73bd7657aa
fix(test): Fixing lookml integration test. (#3405) 2021-10-14 18:09:32 -07:00
Swaroop Jagadish
0cf157e991
fix(ingest): lookml view file resolution and looker spurious aspect issues (#3397) 2021-10-14 01:16:35 -07:00
rslanka
8844240328
feat: Adding support for nested schemas in ingestion and visualization (#3079) 2021-08-11 15:47:18 -07:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Kevin Hu
6abd5e191a
feat(ingest): lineage for SageMaker model endpoints and groups (#2894) 2021-07-19 11:30:43 -07:00
Kevin Hu
904d4410fe
feat(ingest): update golden files only when diff fails (#2869) 2021-07-13 14:59:22 -07:00
Harshal Sheth
d66381451a
feat(ingest): refactor mce comparison and add pytest update golden files option (#2812) 2021-06-30 16:53:20 -07:00
Harshal Sheth
5e69a4355e
refactor(ingest): use common get_sys_time method (#2782) 2021-06-28 20:40:10 -07:00