43 Commits

Author SHA1 Message Date
Aseem Bansal
262dd76518
dev: remove black in favor of ruff for formatting (#12378) 2025-01-18 15:06:20 +05:30
sagar-salvi-apptware
2e544614f1
feat(ingest): add looker meta extractor support in sql parsing (#12062)
Co-authored-by: Mayuri N <mayuri.nehate@gslab.com>
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
2024-12-19 12:41:40 +05:30
Harshal Sheth
5519a330e2
chore(ingest): bump black (#11898) 2024-11-20 13:33:54 -08:00
Harshal Sheth
7dbb3e60cb
chore(ingest): start using explicit exports (#11899) 2024-11-20 13:33:30 -08:00
Harshal Sheth
b8144699fd
chore(ingest): reorganize unit tests (#11636) 2024-10-16 19:18:32 -07:00
Harshal Sheth
d34717fd82
fix(ingest): remove default value from DatahubClientConfig.server (#11570) 2024-10-16 13:50:33 -07:00
Mayuri Nehate
d0d09a09f8
fix(ingest): ignore irrelevant urns from % change computation (#11583) 2024-10-11 16:55:27 +05:30
Mayuri Nehate
26bbe02e44
feat(ingest/stateful): omit irrelevant urns for deletion (#11558) 2024-10-09 08:46:58 -07:00
Harshal Sheth
a4bce6af1c
feat(ingest): add snowflake-queries source (#10835) 2024-07-12 15:08:51 -07:00
Harshal Sheth
fa2ab1bcee
fix(ingest): add status aspect to dataProcessInstance (#10757) 2024-06-27 12:07:28 -07:00
Shubham Jagtap
05aee03f3f
perf(ingestion/fivetran): Connector performance optimization (#10556) 2024-06-11 20:19:57 -07:00
Harshal Sheth
3d5735cbc5
chore(ingest): run pyupgrade for python 3.8 (#10513) 2024-05-15 22:31:05 -07:00
Shubham Jagtap
ae3f0fd5ee
feat(ingestion): Copy urns from previous checkpoint state on ingestion failure (#10347) 2024-05-07 17:36:40 +05:30
Shubham Jagtap
fda5eb89f7
feat(ingest): enable stateful_ingestion by default for DataHub rest sink (#9934) 2024-03-05 11:18:03 -08:00
Harshal Sheth
a7dc9c9d22
feat(sdk): autogenerate urn types (#9257) 2023-11-30 18:11:36 -05:00
Shubham Jagtap
a187127ac5
feat(ingestion): file-based state checkpoint provider (#9029) 2023-11-10 14:36:00 -08:00
Andrew Sikowitz
40d17f00ea
feat(ingest/datahub): Improvements, bug fixes, and docs (#8735) 2023-08-29 14:33:40 -04:00
Mayuri Nehate
cc94ffbf6c
fix(ingest): stateful redundant run skip handler (#8467) 2023-08-28 15:03:31 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Harshal Sheth
9718505fc7
fix(ingest): respect max_threads for ingestion reporter (#8521) 2023-07-28 13:09:32 -07:00
Harshal Sheth
690ed083d9
feat(ingest): add more fail-safes to stateful ingestion (#8111) 2023-05-31 18:49:48 -07:00
Harshal Sheth
b0f8c3de1e
refactor(ingest): simplify stateful ingestion provider interface (#8104)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-05-23 12:57:57 -07:00
Harshal Sheth
4873a32e4a
fix(ingest): emitter bug fixes (#8093)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-05-23 12:04:16 -07:00
Felipe Ribeiro
d504cbd1b6
docs(ingest): update max_threads default value (#7947)
Co-authored-by: Felipe Ribeiro <fribeiro@fanatics.com>
2023-05-02 22:54:15 -07:00
Harshal Sheth
3079f0a7e1
feat(sdk): support executing graphql via DataHubGraph (#7753)
Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
2023-04-12 11:30:05 -07:00
Harshal Sheth
f860ce95c0
feat(ingest): emit state payloads as soft-deleted (#7714) 2023-04-04 17:06:21 +00:00
Harshal Sheth
137f4500b6
feat(ingest/stateful): remove platform_instance_id from state urn (#6795) 2022-12-20 12:12:19 -05:00
Harshal Sheth
5584bfb469
refactor(ingest/stateful): remove get_last_state method (#6794) 2022-12-19 20:48:22 -05:00
Harshal Sheth
e9d50ed992
refactor(ingest/stateful): remove IngestionJobStateProvider (#6792) 2022-12-19 17:03:54 -05:00
Harshal Sheth
47be95689e
refactor(ingest/stateful): remove most remaining state classes (#6791) 2022-12-19 13:40:48 -05:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Harshal Sheth
8a537b0559
feat(ingest): add datahub state inspect command (#6763) 2022-12-15 18:55:36 -05:00
Harshal Sheth
6152b5e9f7
feat(ingest): simplify more stateful ingestion state (#6762) 2022-12-15 11:33:29 -05:00
Harshal Sheth
2f95719dba
feat(ingest): remove source config from DatahubIngestionCheckpoint (#6722) 2022-12-14 12:39:21 -05:00
Harshal Sheth
cf3db168ac
feat(ingest): start simplifying stateful ingestion state (#6740) 2022-12-13 10:05:57 +01:00
Harshal Sheth
d08f5f7cdd
feat(ingest): replace base85's pickle with json (#6178) 2022-10-14 14:48:44 -07:00
Ravindra Lanka
055e4082da
fix(ingestion): fix percent change computation in stale_entity_removal (#6121) 2022-10-04 20:40:59 -07:00
Alexey Kravtsov
3c3ab64954
feat(ingest): implement compression for CheckpointState (#6007) 2022-09-26 10:18:42 -07:00
Harshal Sheth
68db859ca1
refactor(ingest): streamline two-tier db config validation (#5986) 2022-09-21 10:45:37 -07:00
Ravindra Lanka
ee68f09624
feat(ingestion): Refactor standard state-handling tasks into a common handler that are common across all stateful ingestion sources. (#5766) 2022-09-14 09:30:42 -07:00
Claudio Benfatto
aeefde4fa1
feat(ingestion): Kafka stateful ingestion (#4028)
* test: test stateful ingestion for kafka
  test: some more advancement
  test: some improvements
  refactoring
* refactor: remove some linter modifications
* tests: add unit tests for kafka state
* refactor: minor changes
* tests: improve test coverage
* fix: fix naming
* style: fix format with black
* fix: fix broken test
* revert: revert smoke tests to master
* feat: add reporting to kafka source
* tests: add smoke tests for kafka reporting
* revert: revert changes to the smoke tests
* test: add kafka integration test for stateful ingestion
* docs: update documentation on kafka source
* fix: return empty string when no platform instance
* revert: remove unwanted file
* fix: solve problem with platform instance
* chore: use console sink instead of file
* fix: disable complexity check for _extract_record
* fix: remove if condition in get_platform_instance_id
* chore: remove unneeded integration test
* test: test platform instance in kafka source unit tests
2022-02-15 07:18:36 -08:00
Tamas Nemeth
63bc830cfe
Data domain containers ingestion (#4051) 2022-02-07 09:51:49 -08:00
Ravindra Lanka
f20382f956
feat(ingest): framework - client side changes for monitoring and reporting (#3807) 2022-02-02 13:19:15 -08:00