45 Commits

Author SHA1 Message Date
david-leifker
1194f910bd
bump(): upgrade Kafka 7.9.1 (#13667) 2025-05-30 21:51:36 -05:00
david-leifker
463803e2d1
feat(restore-indices): createDefaultAspects argument (#12859) 2025-03-13 10:17:14 -05:00
david-leifker
412600a163
feat(telemetry): cross-component async write tracing (#12405) 2025-01-29 11:30:44 -06:00
Chakru
85b42e3ea5
build(coverage): enable code coverage for java and python (#11992)
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
2024-12-02 19:27:43 -06:00
david-leifker
738eaed6f1
feat(throttle): extend throttling to API requests (#11325) 2024-09-12 09:52:20 -05:00
david-leifker
dfa9bd2779
feat(consumers): mce-consumer throttling based on mae-consumer lag (#10626) 2024-05-31 15:53:02 -05:00
Aseem Bansal
e14474176f
feat(lint): add spotless for java lint (#9373) 2023-12-06 11:02:42 +05:30
RyanHolstien
1b737243b2
feat(avro): upgrade avro to 1.11 (#9031) 2023-10-18 13:45:46 -05:00
david-leifker
1b79142d9e
feat(EntityService): batched transactions and ebean updates (#8456) 2023-09-02 19:25:44 -05:00
david-leifker
7dd6e09ac5
refactor(build): upgrade to gradle 7 & guava update (#8745) 2023-09-01 19:36:01 +05:30
david-leifker
749c3e85cb
chore(snappy): fix snappy version constraint (#8629) 2023-08-17 10:56:28 +05:30
david-leifker
cd05f5b174
feat(schema-registry): replace confluent schema registry (#7930)
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Ryan Holstien <ryan@acryl.io>
2023-05-01 13:18:41 -05:00
Pedro Silva
4732694780
fix(gms): Corrects MCP generation in async mode (#7214)
Co-authored-by: John Joyce <john@acryl.io>
2023-02-02 11:45:44 -08:00
david-leifker
39920bb00f
feat(elasticsearch): Elasticsearch improvements (#6894) 2023-01-31 18:44:37 -06:00
david-leifker
ecc01b9a46
refactor(restli-mce-consumer) (#6744)
* fix(security): commons-text in frontend

* refactor(restli): set threads based on cpu cores
feat(mce-consumers): hit local restli endpoint

* testing docker build

* Add retry configuration options for entity client

* Kafka debugging

* fix(kafka-setup): parallelize topic creation

* Adjust docker build

* Docker build updates

* WIP

* fix(lint): metadata-ingestion lint

* fix(gradle-docker): fix docker frontend dep

* fix(elastic): fix race condition between gms and mae for index creation

* Revert "fix(elastic): fix race condition between gms and mae for index creation"

This reverts commit 9629d12c3bdb3c0dab87604d409ca4c642c9c6d3.

* fix(test): fix datahub frontend test for clean/test cycle

* fix(test): datahub-frontend missing assets in test

* fix(security): set protobuf lib datahub-upgrade & mce/mae-consumer

* gitignore update

* fix(docker): remove platform on docker base image, set by buildx

* refactor(kafka-producer): update kafka producer tracking/logging

* updates per PR feedback

* Add documentation around mce standalone consumer
Kafka consumer concurrency to follow thread count for restli & sql connection pool

Co-authored-by: leifker <dleifker@gmail.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>
2022-12-26 16:09:08 +00:00
djordje-mijatovic
e6c48e5f19
feat(kafka): expose default kafka producer mechanism (#6381)
* Expose Kafka Sender Retry Parameters

* Implement KafkaHealthChecker

* feat(kafka): expose default kafka producer mechanism
2022-12-20 14:41:24 -06:00
david-leifker
2de9d3d5bf
fix(logging): Remove lombok as source of slf4j-api, convert to compileOnly where possible (#6616) 2022-12-04 19:57:47 -08:00
david-leifker
4ca3327d89
fix(security): update ranger commons & dependencies for security vulns (#6577)
* fix(security): update ranger commons & dependencies for security vulns
2022-11-30 17:05:01 -06:00
RyanHolstien
bfb903cfb8
feat(ingest): add async option to ingest proposal endpoint (#6097)
* feat(ingest): add async option to ingest proposal endpoint

* small tweak to validate before write to K, also keep existing path for timeseries aspects

* avoid double convert

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-10-03 19:56:19 -05:00
John Joyce
c69310522b
feat(metadata service): Introducing Platform Events (#4477) 2022-03-29 18:32:04 -07:00
RyanHolstien
34c27f076b
feat(removeGMA): remove all dependencies on gma libraries (#3835) 2022-01-05 17:32:31 -08:00
xiphl
8cd1e91072
Upgrade to 3rd Apache patch for log4j (#3772) 2021-12-20 06:55:22 -08:00
John Joyce
5b5135be0b
fix(vuln): log4j vulnerability - bumping to 2.16.0 (#3755) 2021-12-15 11:07:45 -08:00
Fredrik Sannholm
d651040c85
Fix vulnerability (#3716) 2021-12-10 10:07:55 -08:00
Claudio Benfatto
f9bc3b32c4
fix(metadata-service): fix debug logging in MAE producer (#3626)
closes: https://github.com/linkedin/datahub/issues/3625
2021-11-28 21:07:42 -08:00
Dexter Lee
8747fbe43c
feat(perf): Add perf testing and monitoring framework (#3195) 2021-09-07 23:06:15 -07:00
John Joyce
f3fc0970f3
refactor(build): Remove unnecessary ext modules. (#3074) 2021-08-10 22:48:06 -07:00
John Joyce
20b1685de2
fix(gms): better logging on failed MCL / MAE (#3007) 2021-08-02 17:53:56 -07:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Gabe Lyons
aa253f5b3b
feat(deletes): add run commands (list, show, rollback) to datahub ingest (#2960) 2021-07-29 20:04:40 -07:00
John Joyce
09cbc548a4
feat(logs): improve logging in GMS and datahub-frontend (#2761) 2021-06-25 10:56:45 -07:00
John Joyce
97e9660037
feat: No Code Metadata Modeling (#2629)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-03 13:24:33 -07:00
Dexter Lee
fa015c5aaa
fix(kafka-topic-convention): Fix DAOs that do not refer to TopicConvention (#2387) 2021-04-13 07:58:31 -07:00
John Plaisted
25b663cc18
refactor: move code to linkedin/datahub-gma. (#1955)
Move code to linkedin/datahub-gma.

"GMA" (Generalized Metadata Architecture) is the backend of DataHub, and has been moved to its own repository.

This deletes the code that was moved and uses jars that GMA publishes to bintray to load it.

Note that not all of GMA was moved, but most of it. We may still move more things to the other repository in the future.
2020-10-23 15:14:57 -07:00
John Plaisted
542ae67cb1 Add support for customizing topic names via a convention.
Requested by a few people in OS. See https://github.com/linkedin/datahub/issues/1840.

Companies need full customization over the topic name. This new class should be easily customizable by using a spring factory.

TODO to finish the implementation for v5. For right now v5 is LI only and unfinished. Getting this in for v4 so it is useful to other companies now.

TODO AFTER OPEN SOURCE PUSH - make configurable via spring
TODO AFTER SUBMIT - see where else we can use this (jobs, where else?)
2020-09-24 16:02:12 -07:00
John Plaisted
6b9a053f6e ROLL FORWARD: Add new style checks and fix issues.
- Upgrade to checkstyle 8
- Copy javadoc checks from Google
- Disable missing class and method checks for now, too many warnings. I'll have to figure out how to suppress them instead.
- Fix other issues, which are mostly missing periods at the end of sentences and lack of paragraph tags.

Revert "Reverting the commit range: 8dfdb73ac6c73581ef56c0d81c21a2a92e8a1a02..194bd6f57f4a4d075d2ea1f442397d1139080f7a."

This reverts commit ab178ec1469fa72c0c339f0b842e7ff0850e7c74.
2020-09-11 09:15:56 -07:00
Chris Lee
679069e16f Made the log to WARN for the v5 producer early termination. 2020-09-11 09:15:56 -07:00
John Plaisted
6ac7622af6 Reverting the commit range: 8dfdb73ac6c73581ef56c0d81c21a2a92e8a1a02..194bd6f57f4a4d075d2ea1f442397d1139080f7a.
REVERTED RB=99999 PCVALIDATIONOVERRIDE I18NOVERRIDE CIOVERRIDE TRUNKBLOCKERFIX

See https://crt.prod.linkedin.com/#/testing/executions/77e10182-d60f-4c8d-9e55-599bdc4384e0/execution for more details.
2020-09-11 09:15:56 -07:00
John Plaisted
b9f11ae21b Add new style checks and fix issues.
- Upgrade to checkstyle 8
- Copy javadoc checks from Google
- Disable missing class and method checks for now, too many warnings. I'll have to figure out how to suppress them instead.
- Fix other issues, which are mostly missing periods at the end of sentences and lack of paragraph tags.
2020-09-11 09:15:56 -07:00
John Plaisted
d9b86d1f05
Update metadata-models to head! (#1811)
metadata-models 80.0.0 -> 90.0.13:

   90.0.13: Roll forward: Fix the open source build by avoiding URN method that isn't part of the open source URN.
    90.0.2: Refactor listUrnsFromIndex method
    90.0.0: Start distinguishing between [] aspects vs null aspects input param
    89.0.4: Fix the open source build by avoiding URN method that isn't part of the open source URN.
    89.0.2: fix some test case name
    89.0.0: META-12686: Made the MXE_v5 topics become strictly ACL'ed to avoid the wildcard write ACL as "MetadataXEvent.+"
    88.0.6: change DAO to take Storage Config as input
    88.0.3: Add a comment on lack of avro generation for MXEv5 + add MXEv5 to the pegasus validation task.
   87.0.15: META-12651: Integrate the metadata-models-ext with metadata-models
   87.0.13: add StorageConfig to Local DAO
    87.0.3: Treat empty aspect vs optional aspect same until all clients are migrated
    87.0.2: Treat empty aspect vs optional aspect differently
    87.0.1: META-12533: Skip processing unregistered aspect specific MAE.
    83.0.6: action method to return list of urns from strong consistent index
    83.0.4: Change input param type for batch backfill
    83.0.3: Implement batch backfill
    83.0.1: Implement support for OR filter in browse query
   82.0.10: Throw UnsupportedOperationException for unsupported condition types in search filter
    82.0.6: Implement local secondary backfilling index as part of backfill method
    82.0.5: [strongly consistent index] implement getUrns method
    82.0.4: Add indexing urn fields to the local secondary index
    82.0.0: Render Delta fields in the MCE_v5.
    81.0.1: Add pegasus to avro conversion for FMCE
    80.0.4: add get all support for BaseSingleAspectEntitySimpleKeyResource
    80.0.2: Add a BaseSearchWriterDAO with an ESBulkWriterDAO implementation.
    80.0.1: META-12254: Produce aspect specific MAE with always emit option
    80.0.0: Convert getNodesInTraversedPath to getSubgraph to return complete view of the subgraph (nodes+edges)
2020-08-19 16:06:29 -07:00
Jyoti Wadhwani
779eaeed70
metadata-models 72.0.8 -> 80.0.0 (#1756) 2020-07-29 11:42:35 -07:00
Mars Lan
f932437742
build: start enforcing checkstyle and fix all violations (#1670) 2020-05-11 08:41:02 -07:00
Kerem Sahin
1168501083 Enable tests for all modules by using global gradle config 2020-02-21 11:53:45 -08:00
Kerem Sahin
b17b91f24a Bump gradle to 5.6.4 and pegasus to 27.7.18 2020-02-12 17:10:49 -08:00
Kerem Sahin
23339df23a Initial commit for Data Hub 2019-08-31 20:51:14 -07:00