39 Commits

Author SHA1 Message Date
Sergio Gómez Villamor
1563b0e9fb
fix(ingestion): use default generate_browse_path_v2 even if no pipeline_config (#13117) 2025-04-23 13:25:58 +02:00
Aseem Bansal
9c3bd34995
dev: enable ruff rule (#12749) 2025-02-28 17:49:52 +05:30
skrydal
b091e4615d
feat(ingest/kafka): Flag for optional schemas ingestion (#12077) 2024-12-11 16:02:31 +00:00
Mayuri Nehate
b5fb691f0d
feat(ingest/kafka): improve error handling of oauth_cb config (#11929) 2024-11-25 10:31:35 +05:30
sid-acryl
86b8175627
fix(ingestion/kafka): OAuth callback execution (#11900) 2024-11-22 13:08:23 +05:30
aabharti-visa
8a905774f7
feat(ingestion/kafka)-Add support for ingesting schemas from schema registry (#10612) 2024-06-11 14:00:12 +02:00
Harshal Sheth
7d31420b69
feat(ingest): materialize terms produced by ingestion (#10249) 2024-04-18 10:48:16 -07:00
Mayuri Nehate
5c40390a92
feat(ingest/kafka): support metadata mapping from kafka avro schemas (#8825)
Co-authored-by: Daniel Messias <danielcmessias@gmail.com>
Co-authored-by: Deepankarkr <deepankar.kumar@gslab.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-09-22 17:11:42 -07:00
Tamas Nemeth
1a47a51f1b
fix(ingest/build): Fix sagemaker mypy and flake8 issues (#8530) 2023-07-31 16:13:07 +02:00
Shubham Jagtap
8cc6606e68
feat(ingestion/kafka): add description in dataset properties (#7974)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: MohdSiddiqueBagwan <mohdsiddique.bagwan@gslab.com>
Co-authored-by: mohdsiddique <mohdsiddiquebagwan@gmail.com>
2023-05-17 11:03:08 -07:00
Harshal Sheth
ca5dffa54d
refactor(ingest/biz-glossary): simplify business glossary source (#7912) 2023-05-03 17:01:58 -07:00
Andrew Sikowitz
e82e284982
fix(ingest/kafka): Remove topic from kafka browse path (#7398) 2023-02-22 18:38:08 -05:00
Andrew Sikowitz
2764c44977
fix(ingest): Do not require platform_instance for stateful ingestion (#7397) 2023-02-21 21:27:44 -05:00
Shirshanka Das
07e4d0696f
feat(ingest): json-schema - add json schema support for files and kaf… (#7361) 2023-02-19 08:43:13 -08:00
Harshal Sheth
45f50d2614
test(ingest): fix kafka admin client mocking (#7098)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-01-23 16:22:20 +01:00
Mayuri Nehate
a05c5c4069
feat(ingest): extract kafka topic config properties as customProperties (#6783) 2022-12-22 09:34:55 +01:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Ravindra Lanka
b8941ab190
feat(ingestion): Add fail-safe stale entity removal via configurable 'fail_safe_threshold' param. (#6027) 2022-09-22 16:09:22 -07:00
Claudio Benfatto
bbd0ab823d
feat(ingestion): optionally disable some kafka schema warnings (#4169)
Co-authored-by: Claudio Benfatto <claudio.benfatto@adevinta.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-24 14:27:02 -07:00
Shirshanka Das
a9ad138172
feat(ingest): docs - overhaul source connector docs to make it code driven (#4798)
Co-authored-by: MugdhaHardikar-GSLab <mugdha.hardikar@gslab.com>
2022-05-02 00:18:15 -07:00
Sunil Patil
36e9552d61
feat(ingestion): Support pluggable Schema Registry for Kafka Source (#4535)
* Support for pluggable schema registry for the Kafka source.
Co-authored-by: Sunil Patil <spatil@twilio.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-03-30 13:20:23 -07:00
Shirshanka Das
a69eac8247
feat(ingest): dbt,looker,sql_common,kafka - moving sources to produce display names and subtypes more consistently (#4496) 2022-03-27 18:49:26 -05:00
Ravindra Lanka
84005d3848
feat(ingest): kafka - add support for non-default schema registry subject name strategies (#4215) 2022-02-22 16:05:46 -08:00
Claudio Benfatto
aeefde4fa1
feat(ingestion): Kafka stateful ingestion (#4028)
* test: test stateful ingestion for kafka

test: some more advancement

test: some improvements

refactoring

* refactor: remove some linter modifications

* tests: add unit tests for kafka state

* refactor: minor changes

* tests: improve test coverage

* fix: fix naming

* style: fix format with black

* fix: fix broken test

* revert: revert smoke tests to master

* feat: add reporting to kafka source

* tests: add smoke tests for kafka reporting

* revert: revert changes to the smoke tests

* test: add kafka integration test for stateful ingestion

* docs: update documentation on kafka source

* fix: return empty string when no platform instance

* revert: remove unwanted file

* fix: solve problem with platform instance

* chore: use console sink instead of file

* fix: disable complexity check for _extract_record

* fix: remove if condition in get_platform_instance_id

* chore: remove unneeded integration test

* test: test platform instance in kafka source unit tests
2022-02-15 07:18:36 -08:00
Ravindra Lanka
f4209504f1
feat(ingest): support Kafka confluent external schema resolution by name or subject (#4035) 2022-02-02 07:44:56 -08:00
Aseem Bansal
400e0fe838
feat(ingest): kafka - support schema references (#3862) 2022-01-17 14:29:54 -08:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Harshal Sheth
937f02c6bc
feat: usage stats (part 1) (#2750)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 17:11:00 -07:00
Harshal Sheth
bd78b84bd3
feat(ingest): start airflow integration + metadata builders (#2331) 2021-04-05 19:11:28 -07:00
Harshal Sheth
ac064584ae
refactor(ingest): cleanup configuration models (#2134) 2021-02-23 15:55:31 -08:00
Harshal Sheth
38f75be8ad
gometa -> datahub 2021-02-15 18:29:27 -08:00
Harshal Sheth
9332e6b878
Add isort to CI 2021-02-15 18:29:27 -08:00
Harshal Sheth
d483d23fd7
Allow/deny patterns for kafka source 2021-02-15 18:29:27 -08:00
Harshal Sheth
43d5fac494
Black 2021-02-15 18:29:27 -08:00
Harshal Sheth
a87161cad7
Run black formatting on tests 2021-02-15 18:29:27 -08:00
Harshal Sheth
2307c59296
Add support for rich kafka config 2021-02-15 18:29:27 -08:00
Harshal Sheth
8ca8ef2d23
Fix kafka tests 2021-02-15 18:29:27 -08:00
Harshal Sheth
c7892ada4c
Codegen avro + datahub kafka sink (#3)
* Add codegen

* New architecture + setup file -> console pipeline

* Cleanup source loader

* Basic Kafka metadata source

* Kafka source and extractor

* Add kwargs construct interface

* Fix kafka source unit test

* start working on pipeline test

* kafka datahub sink

* Make myself a profile

* Ingest to datahub from kafka

* Update codegen

* Add restli transport

* Fix bug in restli conversion
2021-02-15 18:29:27 -08:00
Shirshanka Das
9e61220132
checking in testing fixtures. docker still not working 2021-02-15 18:29:27 -08:00