---
title: "Configuring Kafka"
hide_title: true
---

# How to configure Kafka?

With the exception of `KAFKA_BOOTSTRAP_SERVER` and `KAFKA_SCHEMAREGISTRY_URL`, Kafka is configured via [spring-boot](https://spring.io/projects/spring-boot), specifically with [KafkaProperties](https://docs.spring.io/spring-boot/docs/current/api/org/springframework/boot/autoconfigure/kafka/KafkaProperties.html). See [Integration Properties](https://docs.spring.io/spring-boot/docs/current/reference/html/appendix-application-properties.html#integration-properties) prefixed with `spring.kafka`.

Below is an example of how SASL/GSSAPI properties can be configured via environment variables:

```bash
export KAFKA_BOOTSTRAP_SERVER=broker:29092
export KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
export SPRING_KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME=kafka
export SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT
export SPRING_KAFKA_PROPERTIES_SASL_JAAS_CONFIG="com.sun.security.auth.module.Krb5LoginModule required principal='principal@REALM' useKeyTab=true storeKey=true keyTab='/keytab';"
```

These properties can be specified via `application.properties` or `application.yml` files, as command-line switches, or as environment variables. See Spring's [Externalized Configuration](https://docs.spring.io/spring-boot/docs/current/reference/html/spring-boot-features.html#boot-features-external-config) for how this works.

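For example, the `security.protocol` setting from the snippet above can be supplied in any of these forms; this is just a minimal sketch of Spring Boot's relaxed binding, not DataHub-specific behavior:

```bash
# Three equivalent ways to set the same Spring Kafka property (pick one):

# 1. As an environment variable (relaxed binding maps
#    SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL to spring.kafka.properties.security.protocol):
export SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT

# 2. As a line in application.properties (or the equivalent application.yml entry):
#    spring.kafka.properties.security.protocol=SASL_PLAINTEXT

# 3. As a command-line switch when launching the Spring Boot application:
#    --spring.kafka.properties.security.protocol=SASL_PLAINTEXT
```
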
See [Kafka Connect Security](https://docs.confluent.io/current/connect/security.html) for more ways to connect.

DataHub components that connect to Kafka are currently:

- mce-consumer-job
- mae-consumer-job
- gms
- Various ingestion example apps

## Configuring Topic Names

By default, ingestion relies on the `MetadataChangeEvent_v4`, `MetadataAuditEvent_v4`, and `FailedMetadataChangeEvent` Kafka topics for [metadata events](../what/mxe.md).

We've included environment variables to customize the name of each of these topics, in case your company or organization has naming rules for your topics.

### datahub-gms

- `METADATA_CHANGE_EVENT_NAME`: The name of the metadata change event topic.
- `METADATA_AUDIT_EVENT_NAME`: The name of the metadata audit event topic.
- `FAILED_METADATA_CHANGE_EVENT_NAME`: The name of the failed metadata change event topic.

### datahub-mce-consumer

- `KAFKA_MCE_TOPIC_NAME`: The name of the metadata change event topic.
- `KAFKA_FMCE_TOPIC_NAME`: The name of the failed metadata change event topic.

### datahub-mae-consumer

- `KAFKA_TOPIC_NAME`: The name of the metadata audit event topic.

Please ensure that these environment variables are set consistently throughout your ecosystem. DataHub has a few different applications running which communicate with Kafka (see above).

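For example (the topic names below are purely illustrative), a custom topic name has to be spelled out identically in each component that uses it, since the different variables refer to the same underlying topic:

```bash
# Illustrative only: the same custom topic names, set once per component.

# datahub-gms
export METADATA_CHANGE_EVENT_NAME=MyOrg.MetadataChangeEvent_v4
export METADATA_AUDIT_EVENT_NAME=MyOrg.MetadataAuditEvent_v4
export FAILED_METADATA_CHANGE_EVENT_NAME=MyOrg.FailedMetadataChangeEvent

# datahub-mce-consumer (must match the gms values above)
export KAFKA_MCE_TOPIC_NAME=MyOrg.MetadataChangeEvent_v4
export KAFKA_FMCE_TOPIC_NAME=MyOrg.FailedMetadataChangeEvent

# datahub-mae-consumer (must match the gms value above)
export KAFKA_TOPIC_NAME=MyOrg.MetadataAuditEvent_v4
```
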
## Configuring Consumer Group Id

Kafka consumers in Spring are configured using Kafka listeners. By default, the consumer group id is the same as the listener id.

We've included an environment variable to customize the consumer group id, if your company or organization has specific naming rules.

### datahub-mce-consumer and datahub-mae-consumer

- `KAFKA_CONSUMER_GROUP_ID`: The name of the Kafka consumer's group id.

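For instance, a minimal sketch with an illustrative value, set in the consumer application's own environment:

```bash
# Illustrative value only; exported for datahub-mae-consumer in this example
export KAFKA_CONSUMER_GROUP_ID=my-apps-mae-consumer
```
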
## How to apply configuration?

- For quickstart, add these environment variables to the corresponding application's docker.env; see the docker.env sketch after the helm example below.
- For helm charts, add these environment variables as extraEnvs to the corresponding application's chart.

For example,

```
extraEnvs:
  - name: METADATA_CHANGE_EVENT_NAME
    value: "MetadataChangeEvent"
  - name: METADATA_AUDIT_EVENT_NAME
    value: "MetadataAuditEvent"
  - name: FAILED_METADATA_CHANGE_EVENT_NAME
    value: "FailedMetadataChangeEvent"
  - name: KAFKA_CONSUMER_GROUP_ID
    value: "my-apps-mae-consumer"
```
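
For the quickstart path, the equivalent docker.env entries could look like the sketch below (illustrative values; include only the variables relevant to that application):

```
# Sketch of docker.env entries (illustrative values only)
METADATA_CHANGE_EVENT_NAME=MetadataChangeEvent
METADATA_AUDIT_EVENT_NAME=MetadataAuditEvent
FAILED_METADATA_CHANGE_EVENT_NAME=FailedMetadataChangeEvent
KAFKA_CONSUMER_GROUP_ID=my-apps-mae-consumer
```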

## SSL

We are using the Spring Boot framework to start our apps, including setting up Kafka. You can [use environment variables to set system properties](https://docs.spring.io/spring-boot/docs/current/reference/html/spring-boot-features.html#boot-features-external-config-relaxed-binding-from-environment-variables), including [Kafka properties](https://docs.spring.io/spring-boot/docs/current/reference/html/appendix-application-properties.html#integration-properties). From there you can set your SSL configuration for Kafka.

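As a rough sketch (the protocol choice, store paths, and passwords are placeholders for your environment, not DataHub defaults), standard Kafka client SSL settings can be passed through the same `SPRING_KAFKA_PROPERTIES_*` pattern used above:

```bash
# Placeholder values: standard Kafka client SSL settings routed through
# spring.kafka.properties.* via environment variables.
export SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SSL
export SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION=/path/to/truststore.jks
export SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_PASSWORD=truststore-password

# Only needed if the brokers require client (mutual TLS) authentication:
export SPRING_KAFKA_PROPERTIES_SSL_KEYSTORE_LOCATION=/path/to/keystore.jks
export SPRING_KAFKA_PROPERTIES_SSL_KEYSTORE_PASSWORD=keystore-password
export SPRING_KAFKA_PROPERTIES_SSL_KEY_PASSWORD=key-password
```
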
If Schema Registry is configured to use security (SSL), then you also need to set [this config](https://docs.confluent.io/current/kafka/encryption.html#encryption-ssl-schema-registry).

> **Note** In the logs you might see something like
> `The configuration 'kafkastore.ssl.truststore.password' was supplied but isn't a known config.` This configuration is
> not required by the producer, and these WARN messages can be safely ignored. Each DataHub service is passed the full
> set of configuration but may not need all of it; the warning simply indicates that the service received a
> configuration entry that is not relevant to it.