										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								title: "Configuring Kafka"
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								hide_title: true
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								---
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								# Configuring Kafka in DataHub
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
DataHub uses Kafka as the pub-sub message queue in the backend.
The [official Confluent Kafka Docker images](https://hub.docker.com/u/confluentinc) from Docker Hub are used without
any modification. Kafka serves as a durable log, either storing inbound
requests to update the Metadata Graph (Metadata Change Proposals) or recording the updates
that have been made to the Metadata Graph (the Metadata Change Log).
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								## Environment Variables
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
The environment variables described in this section customize DataHub's connection to Kafka for the following DataHub components,
each of which requires a connection to Kafka:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `metadata-service`  (datahub-gms container) 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
- `system-update` (datahub-system-update container, if setting up topics via DataHub)
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  (Advanced - if standalone consumers are deployed) `mce-consumer-job`  (datahub-mce-consumer container) 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Advanced - if standalone consumers are deployed) `mae-consumer-job`  (datahub-mae-consumer container) 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								### Connection Configuration
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
With the exception of `KAFKA_BOOTSTRAP_SERVER` and `KAFKA_SCHEMAREGISTRY_URL`, Kafka is configured via [Spring Boot](https://spring.io/projects/spring-boot), specifically with [KafkaProperties](https://docs.spring.io/spring-boot/docs/current/api/org/springframework/boot/autoconfigure/kafka/KafkaProperties.html). See the [Integration Properties](https://docs.spring.io/spring-boot/docs/current/reference/html/appendix-application-properties.html#integration-properties) prefixed with `spring.kafka`.
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Below is an example of how SASL/GSSAPI properties can be configured via environment variables:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								```bash
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								export KAFKA_BOOTSTRAP_SERVER=broker:29092
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								export KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								export SPRING_KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME=kafka
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								export SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
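# Quote the value: it contains spaces and a trailing semicolon that the shell would otherwise split on.
export SPRING_KAFKA_PROPERTIES_SASL_JAAS_CONFIG="com.sun.security.auth.module.Krb5LoginModule required principal='principal@REALM' useKeyTab=true storeKey=true keyTab='/keytab';"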
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								#### Example: Connecting using AWS IAM (MSK)
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
The following example shows how to configure SASL_SSL with the AWS_MSK_IAM mechanism, via environment variables, when connecting to MSK using IAM:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								```bash
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_SSL
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION=/tmp/kafka.client.truststore.jks
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								SPRING_KAFKA_PROPERTIES_SASL_MECHANISM=AWS_MSK_IAM
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
SPRING_KAFKA_PROPERTIES_SASL_JAAS_CONFIG="software.amazon.msk.auth.iam.IAMLoginModule required;"
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								SPRING_KAFKA_PROPERTIES_SASL_CLIENT_CALLBACK_HANDLER_CLASS=software.amazon.msk.auth.iam.IAMClientCallbackHandler
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
For more information about how these variables are resolved, see Spring's [Externalized Configuration](https://docs.spring.io/spring-boot/docs/current/reference/html/spring-boot-features.html#boot-features-external-config).

Also see [Kafka Connect Security](https://docs.confluent.io/current/connect/security.html) for more ways to connect.
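
The variable names in the examples above follow Spring Boot's relaxed binding convention for environment variables: periods become underscores, dashes are removed, and the name is upper-cased. The helper below is a minimal sketch of that mapping. The function name `to_env_var` is hypothetical and is not part of DataHub or Spring.

```bash
# Hypothetical helper: derive the environment variable name for a Spring property.
# Rule (Spring Boot relaxed binding): drop dashes, turn periods into underscores, upper-case.
to_env_var() {
  printf '%s\n' "$1" | sed 's/-//g' | tr '.' '_' | tr '[:lower:]' '[:upper:]'
}

to_env_var spring.kafka.properties.security.protocol   # SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL
to_env_var spring.kafka.properties.sasl.jaas.config    # SPRING_KAFKA_PROPERTIES_SASL_JAAS_CONFIG
```

This can be handy when translating a property from the Spring or Kafka client documentation into the variable to set on a DataHub container.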
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								### Topic Configuration
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
DataHub relies on a set of Kafka topics to operate. By default, they have the following names:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								1.  **MetadataChangeProposal_v1**  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								2.  **FailedMetadataChangeProposal_v1**  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								3.  **MetadataChangeLog_Versioned_v1**  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								4.  **MetadataChangeLog_Timeseries_v1**  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								5.  **DataHubUsageEvent_v1** : User behavior tracking event for UI 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								6.  (Deprecated) **MetadataChangeEvent_v4** : Metadata change proposal messages 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								7.  (Deprecated) **MetadataAuditEvent_v4** : Metadata change log messages 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
8. (Deprecated) **FailedMetadataChangeEvent_v4**: MetadataChangeEvent_v4 events that failed processing
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
9. **MetadataGraphEvent_v4**
10. **PlatformEvent_v1**: Platform Events (high-level semantic events)
11. **DataHubUpgradeHistory_v1**: Notifies the end of the DataHub Upgrade job so dependants can act accordingly (e.g., startup).
    Note that this topic requires special configuration: **infinite retention**. A single partition is enough for its occasional traffic.
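
As a sketch of the special configuration noted above, the snippet below assembles a `kafka-topics` command that creates the upgrade-history topic with one partition and no time-based retention (`retention.ms=-1`). It is only relevant if you create topics yourself rather than through the System Update container. The broker address reuses the example value from this page, and the replication factor of 1 is an assumption suited to a single-broker development setup. The snippet prints the command instead of running it.

```bash
# Assumed values: adjust the broker address and replication factor for your cluster.
BOOTSTRAP_SERVER="${KAFKA_BOOTSTRAP_SERVER:-broker:29092}"
TOPIC_NAME="DataHubUpgradeHistory_v1"

# Build the command as an array so each argument stays intact.
# retention.ms=-1 disables time-based retention for the topic.
create_topic_cmd=(
  kafka-topics --bootstrap-server "$BOOTSTRAP_SERVER"
  --create --if-not-exists
  --topic "$TOPIC_NAME"
  --partitions 1
  --replication-factor 1
  --config retention.ms=-1
)

# Dry run: print the command. To execute it, run: "${create_topic_cmd[@]}"
echo "${create_topic_cmd[*]}"
```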
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
How Metadata Events relate to these topics is discussed at greater length in [Metadata Events](../what/mxe.md).
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
We've included environment variables to customize the name of each of these topics, for cases where your organization has naming rules for its topics.
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### Metadata Service (datahub-gms) and System Update (datahub-system-update)
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								The following are environment variables you can use to configure topic names used in the Metadata Service container and
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								the System Update container for topic setup:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_PROPOSAL_TOPIC_NAME` : The name of the topic for Metadata Change Proposals emitted by the ingestion framework. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME` : The name of the topic for Metadata Change Proposals emitted when MCPs fail processing. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME` : The name of the topic for Metadata Change Logs that are produced for Versioned Aspects. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME` : The name of the topic for Metadata Change Logs that are produced for Timeseries Aspects. 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `PLATFORM_EVENT_TOPIC_NAME` : The name of the topic for Platform Events (high-level semantic events). 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `DATAHUB_USAGE_EVENT_NAME` : The name of the topic for product analytics events. 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  (Deprecated) `METADATA_CHANGE_EVENT_NAME` : The name of the metadata change event topic. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Deprecated) `METADATA_AUDIT_EVENT_NAME` : The name of the metadata audit event topic. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME` : The name of the failed metadata change event topic. 
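
For instance, an organization that prefixes every topic with a team name might export the variables as follows before starting `datahub-gms` and `datahub-system-update`. The `acme-` prefix is a hypothetical naming rule chosen for illustration; the variable names are the ones listed above.

```bash
# Hypothetical naming rule: every topic carries an "acme-" prefix.
TOPIC_PREFIX="acme-"

export METADATA_CHANGE_PROPOSAL_TOPIC_NAME="${TOPIC_PREFIX}MetadataChangeProposal_v1"
export FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME="${TOPIC_PREFIX}FailedMetadataChangeProposal_v1"
export METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME="${TOPIC_PREFIX}MetadataChangeLog_Versioned_v1"
export METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME="${TOPIC_PREFIX}MetadataChangeLog_Timeseries_v1"
export PLATFORM_EVENT_TOPIC_NAME="${TOPIC_PREFIX}PlatformEvent_v1"
export DATAHUB_USAGE_EVENT_NAME="${TOPIC_PREFIX}DataHubUsageEvent_v1"
```

Any standalone consumer that reads one of these topics needs the same value for the corresponding variable.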
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### MCE Consumer (datahub-mce-consumer)
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_PROPOSAL_TOPIC_NAME` : The name of the topic for Metadata Change Proposals emitted by the ingestion framework. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME` : The name of the topic for Metadata Change Proposals emitted when MCPs fail processing. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Deprecated) `METADATA_CHANGE_EVENT_NAME` : The name of the deprecated topic that an embedded MCE consumer will consume from. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Deprecated) `FAILED_METADATA_CHANGE_EVENT_NAME` : The name of the deprecated topic that failed MCEs will be written to. 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### MAE Consumer (datahub-mae-consumer)
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME` : The name of the topic for Metadata Change Logs that are produced for Versioned Aspects. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  `METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME` : The name of the topic for Metadata Change Logs that are produced for Timeseries Aspects. 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `PLATFORM_EVENT_TOPIC_NAME` : The name of the topic for Platform Events (high-level semantic events). 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								-  `DATAHUB_USAGE_EVENT_NAME` : The name of the topic for product analytics events. 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								-  (Deprecated) `METADATA_AUDIT_EVENT_NAME` : The name of the deprecated metadata audit event topic. 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								Please ensure that these environment variables are set consistently throughout your ecosystem. DataHub has a few different applications running which communicate with Kafka (see above).
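
When you override topic names, one way to catch gaps early is a small preflight check run in each container's environment before startup. The sketch below is not part of DataHub. The function name `check_topic_vars` is hypothetical, and it only reports which of the named variables are unset or empty, so comparing the actual values across components is still up to you.

```bash
# Hypothetical preflight check: report any of the named variables that are unset or empty.
# Only exported variables are visible to printenv.
check_topic_vars() {
  missing=0
  for name in "$@"; do
    # printenv prints nothing for an unset variable.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: variables shared by datahub-gms and a standalone MCE consumer.
check_topic_vars METADATA_CHANGE_PROPOSAL_TOPIC_NAME FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME \
  || echo "set the variables listed above before starting the containers"
```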
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								## Configuring Consumer Group Id
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
Kafka consumers in Spring are configured using Kafka listeners. By default, the consumer group id is the same as the listener id.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								We've included an environment variable to customize the consumer group id, if your company or organization has specific naming rules.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								### datahub-mce-consumer and datahub-mae-consumer
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
- `KAFKA_CONSUMER_GROUP_ID`: The group id used by the Kafka consumer.
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								#### datahub-mae-consumer MCL Hooks
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
By default, all MetadataChangeLog processing hooks execute as part of the same Kafka consumer group, based on the
previously mentioned `KAFKA_CONSUMER_GROUP_ID`.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
The various MCL hooks can also be split into separate consumer groups, which allows for controlling the parallelization and
prioritization of the hooks.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
For example, the `UpdateIndicesHook` and `SiblingsHook` processing can be delayed by other hooks. Separating these
hooks into their own group can reduce the latency introduced by those other hooks. The `application.yaml` configuration
includes options for assigning a suffix to the consumer group; see `consumerGroupSuffix`.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
| Environment Variable                           | Default | Description                                                                                  |
| ---------------------------------------------- | ------- | -------------------------------------------------------------------------------------------- |
| SIBLINGS_HOOK_CONSUMER_GROUP_SUFFIX            | ''      | Siblings processing hook. Considered one of the primary hooks in the `datahub-mae-consumer`. |
| UPDATE_INDICES_CONSUMER_GROUP_SUFFIX           | ''      | Primary processing hook.                                                                     |
| INGESTION_SCHEDULER_HOOK_CONSUMER_GROUP_SUFFIX | ''      | Scheduled ingestion hook.                                                                    |
| INCIDENTS_HOOK_CONSUMER_GROUP_SUFFIX           | ''      | Incidents hook.                                                                              |
| ECE_CONSUMER_GROUP_SUFFIX                      | ''      | Entity Change Event hook which publishes to the Platform Events topic.                       |
| FORMS_HOOK_CONSUMER_GROUP_SUFFIX               | ''      | Forms processing.                                                                            |
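
As an illustration, the two primary hooks could be given their own consumer groups with an env file fragment like the one below. The suffix values `-update-indices` and `-siblings` are made up for this sketch, and it assumes that each suffix is appended to the base consumer group ID, so that hooks with different suffixes are consumed independently.

```bash
# Hypothetical suffix values; any string that yields a distinct consumer group name should work.
UPDATE_INDICES_CONSUMER_GROUP_SUFFIX=-update-indices
SIBLINGS_HOOK_CONSUMER_GROUP_SUFFIX=-siblings
```

Hooks left at the default empty suffix would presumably keep sharing the original consumer group.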
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								## Applying Configurations
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								### Docker
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
Add the above environment variables to the `docker.env` files of the relevant containers. These files can be found inside the `docker` folder of the repository.
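
For instance, a fragment added to a container's `docker.env` file might look like the sketch below. The topic names are illustrative custom values rather than defaults; substitute the names that exist in your Kafka cluster.

```bash
# Illustrative custom topic names; replace with the names used in your cluster.
METADATA_CHANGE_PROPOSAL_TOPIC_NAME=CustomMetadataChangeProposal_v1
METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=CustomMetadataChangeLogVersioned_v1
FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=CustomFailedMetadataChangeProposal_v1
```

Every container that reads from or writes to one of these topics would need the same value, otherwise producers and consumers end up on different topics.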
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### Helm
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
On Helm, configure these environment variables in the `extraEnvs` section of each container's configuration inside your `values.yaml` file.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
```yaml
datahub-gms:
  # ...
  extraEnvs:
    - name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
      value: "CustomMetadataChangeProposal_v1"
    - name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
      value: "CustomMetadataChangeLogVersioned_v1"
    - name: FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME
      value: "CustomFailedMetadataChangeProposal_v1"
    - name: KAFKA_CONSUMER_GROUP_ID
      value: "my-apps-mae-consumer"
    # ...
datahub-system-update:
  # ...
  extraEnvs:
    - name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
      value: "CustomMetadataChangeProposal_v1"
    - name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
      value: "CustomMetadataChangeLogVersioned_v1"
    - name: FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME
      value: "CustomFailedMetadataChangeProposal_v1"
    - name: KAFKA_CONSUMER_GROUP_ID
      value: "my-apps-mae-consumer"
    # ...

datahub-frontend:
  # ...
  extraEnvs:
    - name: DATAHUB_TRACKING_TOPIC
      value: "MyCustomTrackingEvent"

# If standalone consumers are enabled
datahub-mae-consumer:
  extraEnvs:
    - name: METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME
      value: "CustomMetadataChangeLogVersioned_v1"
    # ...
    - name: METADATA_AUDIT_EVENT_NAME
      value: "MetadataAuditEvent"
datahub-mce-consumer:
  extraEnvs:
    # Must match the proposal topic name given to datahub-gms above
    - name: METADATA_CHANGE_PROPOSAL_TOPIC_NAME
      value: "CustomMetadataChangeProposal_v1"
    # ...
    - name: METADATA_CHANGE_EVENT_NAME
      value: "MetadataChangeEvent"
    # ...
```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
## Other Components that use Kafka can be configured using environment variables:

- schema-registry
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								## SASL/GSSAPI properties for system-update and datahub-frontend via environment variables
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
```bash
KAFKA_BOOTSTRAP_SERVER=broker:29092
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME=kafka
KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT
KAFKA_PROPERTIES_SASL_JAAS_CONFIG=com.sun.security.auth.module.Krb5LoginModule required principal='principal@REALM' useKeyTab=true storeKey=true keyTab='/keytab';
```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								## SASL/GSSAPI properties for schema-registry via environment variables
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
```bash
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=broker:29092
SCHEMA_REGISTRY_KAFKASTORE_SASL_KERBEROS_SERVICE_NAME=kafka
SCHEMA_REGISTRY_KAFKASTORE_SECURITY_PROTOCOL=SASL_PLAINTEXT
SCHEMA_REGISTRY_KAFKASTORE_SASL_JAAS_CONFIG=com.sun.security.auth.module.Krb5LoginModule required principal='principal@REALM' useKeyTab=true storeKey=true keyTab='/keytab';
```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								## SSL
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### Kafka
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
DataHub's apps are started with the Spring Boot framework, which also sets up Kafka. You can
[use environment variables to set system properties](https://docs.spring.io/spring-boot/docs/current/reference/html/spring-boot-features.html#boot-features-external-config-relaxed-binding-from-environment-variables),
including [Kafka properties](https://docs.spring.io/spring-boot/docs/current/reference/html/appendix-application-properties.html#integration-properties).
From there you can set your SSL configuration for Kafka.
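
As a rough sketch, Kafka SSL settings might be passed as shown below. It assumes the Spring relaxed-binding mapping from `spring.kafka.properties.*` keys to `SPRING_KAFKA_PROPERTIES_*` variables. The keystore and truststore paths and the `changeit` passwords are placeholders, not values taken from DataHub.

```bash
# Placeholder paths and passwords; substitute the stores mounted into your container.
SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SSL
SPRING_KAFKA_PROPERTIES_SSL_KEYSTORE_LOCATION=/mnt/certs/keystore.jks
SPRING_KAFKA_PROPERTIES_SSL_KEYSTORE_PASSWORD=changeit
SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION=/mnt/certs/truststore.jks
SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_PASSWORD=changeit
```

In a real deployment the passwords would normally be injected from a secret rather than written into a plain env file.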
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								### Schema Registry
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								If Schema Registry is configured to use security (SSL), then you also need to set additional values.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
The [MCE](../../metadata-jobs/mce-consumer-job) and [MAE](../../metadata-jobs/mae-consumer-job) consumers can set
default Spring Kafka environment values, for example:
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
- `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_SECURITY_PROTOCOL`
- `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_SSL_KEYSTORE_LOCATION`
- `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_SSL_KEYSTORE_PASSWORD`
- `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_SSL_TRUSTSTORE_LOCATION`
- `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_SSL_TRUSTSTORE_PASSWORD`
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
[GMS](../what/gms.md) can set the following environment variables, which are passed as properties when creating the Schema Registry client:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
- `KAFKA_SCHEMA_REGISTRY_SECURITY_PROTOCOL`
- `KAFKA_SCHEMA_REGISTRY_SSL_KEYSTORE_LOCATION`
- `KAFKA_SCHEMA_REGISTRY_SSL_KEYSTORE_PASSWORD`
- `KAFKA_SCHEMA_REGISTRY_SSL_TRUSTSTORE_LOCATION`
- `KAFKA_SCHEMA_REGISTRY_SSL_TRUSTSTORE_PASSWORD`
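
A GMS env fragment for an SSL-secured Schema Registry could look roughly like this. The variable names are the ones listed above; the `SSL` protocol value, the file paths and the `changeit` passwords are placeholders chosen for the sketch.

```bash
# Placeholder values for a Schema Registry secured with SSL.
KAFKA_SCHEMA_REGISTRY_SECURITY_PROTOCOL=SSL
KAFKA_SCHEMA_REGISTRY_SSL_KEYSTORE_LOCATION=/mnt/certs/keystore.jks
KAFKA_SCHEMA_REGISTRY_SSL_KEYSTORE_PASSWORD=changeit
KAFKA_SCHEMA_REGISTRY_SSL_TRUSTSTORE_LOCATION=/mnt/certs/truststore.jks
KAFKA_SCHEMA_REGISTRY_SSL_TRUSTSTORE_PASSWORD=changeit
```

The MCE and MAE consumers would take the equivalent values through the `SPRING_KAFKA_PROPERTIES_SCHEMA_REGISTRY_`-prefixed variables instead.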
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
> **Note** In the logs you might see something like
> `The configuration 'kafkastore.ssl.truststore.password' was supplied but isn't a known config.` The configuration is
> not one that the producer requires. These WARN messages can be safely ignored. Each DataHub service is
> passed a full set of configuration values but may not need all of them. The WARN
> messages only indicate that the service was passed a configuration that is not relevant to it.
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
> Other errors: `Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [DataHubUsageEvent_v1]`. Please check the Ranger permissions or the Kafka broker logs.
  
						 
					
						
							
								
									
										
										
										
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								### Additional Kafka Topic level configuration
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
For additional [Kafka topic level config properties](https://kafka.apache.org/documentation/#topicconfigs), either add them to `application.yaml` under `kafka.topics.<topicName>.configProperties` or `kafka.topicDefaults.configProperties`, or define env vars in the following form (standard Spring conventions applied to `application.yaml`).
These env vars are required in the datahub-system-update container.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Examples:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
1. To configure `max.message.bytes` on the topic used for `metadataChangeLogVersioned`, set
   `KAFKA_TOPICS_metadataChangeLogVersioned_CONFIGPROPERTIES_max_message_bytes=10000`
2. To configure `max.message.bytes` for all topics that don't explicitly define one, set the `topicDefaults` via
   `KAFKA_TOPICDEFAULTS_CONFIGPROPERTIES_max_message_bytes=10000`
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
Configurations specified in `topicDefaults` are applied to all topics by merging them with any configs defined per topic. Where both define the same property, the per-topic value takes precedence over the one in `topicDefaults`.
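The precedence rule can be pictured as a dictionary merge. The sketch below only models the behaviour described here and is not DataHub's implementation; the function name `effective_topic_config` and the sample values are made up for illustration.

```python
# Illustrative model of the merge: defaults first, per-topic values on top.
# The function name and the sample values are hypothetical.


def effective_topic_config(topic_defaults: dict, per_topic: dict) -> dict:
    """Return the config a topic ends up with after merging."""
    merged = dict(topic_defaults)  # start from a copy of the shared defaults
    merged.update(per_topic)       # per-topic entries override matching keys
    return merged


if __name__ == "__main__":
    defaults = {"max.message.bytes": "10000", "retention.ms": "604800000"}
    versioned_log = {"max.message.bytes": "5000000"}

    # max.message.bytes comes from the per-topic config,
    # retention.ms is inherited from topicDefaults.
    print(effective_topic_config(defaults, versioned_log))

    # A topic with no config of its own receives the defaults unchanged.
    print(effective_topic_config(defaults, {}))
```

A property that appears only in `topicDefaults` is inherited, and one that appears in both places keeps its per-topic value.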
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
If you intend to create and configure the topics yourself rather than have DataHub create them, you can turn off the Kafka setup process of
`datahub-system-update` by setting the environment variable `DATAHUB_PRECREATE_TOPICS` to `false`.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								## Debugging Kafka
  
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
You can install [kafkacat](https://github.com/edenhill/kafkacat) to consume messages from and produce messages to Kafka topics.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
For example, to consume messages on the `MetadataAuditEvent` topic, run the command below.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
kafkacat -C -b localhost:9092 -t MetadataAuditEvent
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								```
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
However, `kafkacat` doesn't support Avro deserialization at this point,
but there is ongoing [work](https://github.com/edenhill/kafkacat/pull/151) to add it.