mirror of https://github.com/datahub-project/datahub.git (synced 2025-12-27 09:58:14 +00:00)
docs(): remove Kafka Streams from documentation (#6596)
parent 83b21b021c
commit 10deee7333
@@ -25,7 +25,7 @@ As long as you can emit a [Metadata Change Proposal (MCP)] event to Kafka or mak
|
||||
|
||||
### Applying Metadata Change Proposals to DataHub Metadata Service (mce-consumer-job)
|
||||
|
||||
- DataHub comes with a Kafka Streams based job, [mce-consumer-job], which consumes the Metadata Change Proposals and writes them into the DataHub Metadata Service (datahub-gms) using the `/ingest` endpoint.
+ DataHub comes with a Spring job, [mce-consumer-job], which consumes the Metadata Change Proposals and writes them into the DataHub Metadata Service (datahub-gms) using the `/ingest` endpoint.
|
||||
|
||||
[Metadata Change Proposal (MCP)]: ../what/mxe.md#metadata-change-proposal-mcp
|
||||
[Metadata Change Log (MCL)]: ../what/mxe.md#metadata-change-log-mcl
|
||||
|
||||
@@ -25,11 +25,11 @@ Note that not all MCP-s will result in an MCL, because the DataHub serving tier
|
||||
|
||||
### Metadata Index Applier (mae-consumer-job)
|
||||
|
||||
- [Metadata Change Logs]s are consumed by another Kafka Streams job, [mae-consumer-job], which applies the changes to the [graph] and [search index] accordingly.
+ [Metadata Change Logs]s are consumed by another Spring job, [mae-consumer-job], which applies the changes to the [graph] and [search index] accordingly.
|
||||
The job is entity-agnostic and will execute corresponding graph & search index builders, which will be invoked by the job when a specific metadata aspect is changed.
|
||||
- The builder should instruct the job how to update the graph and search index based on the metadata change.
+ The builder should instruct the job how to update the graph and search index based on the metadata change.
|
||||
|
||||
- To ensure that metadata changes are processed in the correct chronological order, MCLs are keyed by the entity [URN] — meaning all MAEs for a particular entity will be processed sequentially by a single Kafka streams thread.
+ To ensure that metadata changes are processed in the correct chronological order, MCLs are keyed by the entity [URN] — meaning all MAEs for a particular entity will be processed sequentially by a single thread.
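The ordering guarantee described in this hunk follows from how keyed messages map to partitions: events that share a key land on the same partition and are read in order by one consumer. The following is only a minimal sketch of that idea. The `partition_for` and `route` helpers, the partition count, and the example URNs are all hypothetical, and CRC32 stands in for Kafka's real key hashing.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_for(urn: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an entity URN to a partition (CRC32 stands in for Kafka's key hashing)."""
    return zlib.crc32(urn.encode("utf-8")) % num_partitions


def route(events):
    """Group (urn, change) events by partition, preserving arrival order."""
    partitions = defaultdict(list)
    for urn, change in events:
        partitions[partition_for(urn)].append((urn, change))
    return partitions


# Hypothetical change events: two for dataset "a", one for dataset "b".
events = [
    ("urn:li:dataset:a", "v1"),
    ("urn:li:dataset:b", "v1"),
    ("urn:li:dataset:a", "v2"),
]
routed = route(events)
```

Because the partition depends only on the URN, both changes to dataset `a` end up in the same list and keep their original order, whichever partition that happens to be.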
|
||||
|
||||
### Metadata Query Serving
|
||||
|
||||
|
||||
@@ -1,10 +1,10 @@
|
||||
# MXE Processing Jobs
|
||||
- DataHub uses Kafka as the pub-sub message queue in the backend. There are 2 Kafka topics used by DataHub which are
+ DataHub uses Kafka as the pub-sub message queue in the backend. There are 2 Kafka topics used by DataHub which are
|
||||
`MetadataChangeEvent` and `MetadataAuditEvent`.
|
||||
* `MetadataChangeEvent:` This message is emitted by any data platform or crawler in which there is a change in the metadata.
|
||||
* `MetadataAuditEvent:` This message is emitted by [DataHub GMS](../gms) to notify that metadata change is registered.
|
||||
|
||||
- To be able to consume from these two topics, there are two [Kafka Streams](https://kafka.apache.org/documentation/streams/)
+ To be able to consume from these two topics, there are two Spring
|
||||
jobs DataHub uses:
|
||||
* [MCE Consumer Job](mce-consumer-job): Writes to [DataHub GMS](../gms)
|
||||
* [MAE Consumer Job](mae-consumer-job): Writes to [Elasticsearch](../docker/elasticsearch) & [Neo4j](../docker/neo4j)
|
||||
|
||||
@@ -4,7 +4,7 @@ title: "metadata-jobs:mae-consumer-job"
|
||||
|
||||
# Metadata Audit Event Consumer Job
|
||||
|
||||
- The Metadata Audit Event Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job which can be deployed by itself, or as part of the Metadata Service.
+ The Metadata Audit Event Consumer is a Spring job which can be deployed by itself, or as part of the Metadata Service.
|
||||
|
||||
Its main function is to listen to change log events emitted as a result of changes made to the Metadata Graph, converting changes in the metadata model into updates
|
||||
against secondary search & graph indexes (among other things)
|
||||
@@ -15,10 +15,10 @@ Today the job consumes from two important Kafka topics:
|
||||
2. `MetadataChangeLog_Timeseries_v1`
|
||||
|
||||
> Where does the name **Metadata Audit Event** come from? Well, history. Previously, this job consumed
|
||||
- > a single `MetadataAuditEvent` topic which has been deprecated and removed from the critical path. Hence, the name!
+ > a single `MetadataAuditEvent` topic which has been deprecated and removed from the critical path. Hence, the name!
|
||||
|
||||
## Pre-requisites
|
||||
- * You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
+ * You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
|
||||
installed on your machine to be able to build `DataHub Metadata Service`.
|
||||
|
||||
## Build
|
||||
@@ -46,7 +46,7 @@ the application directly from command line after a successful [build](#build):
|
||||
```
|
||||
|
||||
## Endpoints
|
||||
- Spring boot actuator has been enabled for MAE Application.
+ Spring boot actuator has been enabled for MAE Application.
|
||||
`healthcheck`, `metrics` and `info` web endpoints are enabled by default.
|
||||
|
||||
`healthcheck` - http://localhost:9091/actuator/health
|
||||
|
||||
@@ -4,10 +4,10 @@ title: "metadata-jobs:mce-consumer-job"
|
||||
|
||||
# Metadata Change Event Consumer Job
|
||||
|
||||
- The Metadata Change Event Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job which can be deployed by itself, or as part of the Metadata Service.
+ The Metadata Change Event Consumer is a Spring job which can be deployed by itself, or as part of the Metadata Service.
|
||||
|
||||
Its main function is to listen to change proposal events emitted by clients of DataHub which request changes to the Metadata Graph. It then applies
|
||||
- these requests against DataHub's storage layer: the Metadata Service.
+ these requests against DataHub's storage layer: the Metadata Service.
|
||||
|
||||
Today the job consumes from two topics:
|
||||
|
||||
@@ -62,7 +62,7 @@ listen on port 5005 for a remote debugger.
|
||||
```
|
||||
|
||||
## Endpoints
|
||||
- Spring boot actuator has been enabled for MCE Application.
+ Spring boot actuator has been enabled for MCE Application.
|
||||
`healthcheck`, `metrics` and `info` web endpoints are enabled by default.
|
||||
|
||||
`healthcheck` - http://localhost:9090/actuator/health
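The health endpoints listed for the two consumer jobs can be probed from a script. This is a rough sketch only: the two URLs are the ones given in the docs above, while the `fetch_body` and `is_up` helpers are hypothetical names, and the check assumes the usual Spring Boot Actuator response shape of a JSON object with a `status` field set to `UP`.

```python
import json
import urllib.request

# URLs as documented for the two jobs; host and ports may differ per deployment.
MCE_HEALTH_URL = "http://localhost:9090/actuator/health"
MAE_HEALTH_URL = "http://localhost:9091/actuator/health"


def fetch_body(url: str, timeout: float = 5.0) -> str:
    """Fetch the response body of a URL as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_up(url: str, fetch=fetch_body) -> bool:
    """Return True if the endpoint answers with a JSON object whose status is 'UP'."""
    try:
        payload = json.loads(fetch(url))
    except (OSError, ValueError):
        # Connection failures, HTTP errors, and malformed JSON all count as not healthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"
```

Calling `is_up(MCE_HEALTH_URL)` or `is_up(MAE_HEALTH_URL)` against a running job would perform a real HTTP request; the `fetch` parameter exists so the parsing logic can be exercised without a live service.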
|
||||
|
||||