
Kafka, Zookeeper and Schema Registry

DataHub uses Kafka as the pub-sub message queue in the backend. The official Confluent Kafka Docker images from Docker Hub are used without any modification.

Run Docker container

The command below will start all Kafka-related containers.

cd docker/kafka && docker-compose pull && docker-compose up
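
If you prefer to keep your terminal free, docker-compose can also run the containers in the background with the -d (detached) flag:

cd docker/kafka && docker-compose pull && docker-compose up -d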

As part of docker-compose, we also run a container called kafka-setup to create the MetadataAuditEvent and MetadataChangeEvent topics. The only thing this container does is create the Kafka topics once the Kafka broker is ready.
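
To verify that the topics were created, you can list them from inside the broker container. This is a minimal sketch that assumes the broker container is named broker and that Zookeeper is reachable at zookeeper:2181 on the Docker network; adjust the names to match your docker-compose.yml:

docker exec broker kafka-topics --list --zookeeper zookeeper:2181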

There is also a container that provides a visual interface for the schema registry, through which you can register and unregister schemas. You can open schema-registry-ui in your web browser to monitor the Kafka Schema Registry via the link below:

http://localhost:8000

Container configuration

External Port

If you need to change the default configuration for a container, such as its exposed port, you can do so in the docker-compose.yml file. Refer to the Docker Compose documentation on port mapping to understand how to change your exposed port settings.

ports:
  - "9092:9092"

Docker Network

All DataHub Docker containers are expected to be on the same Docker network, which is datahub_network. If you change this, you will need to change it for all the other Docker containers as well.

networks:
  default:
    name: datahub_network
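
To check which containers have actually joined the network, Docker's built-in network inspection is handy:

docker network inspect datahub_network

The output lists every attached container along with its address on the network.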

Debugging Kafka

You can install kafkacat to consume and produce messages to Kafka topics. For example, to consume messages on the MetadataAuditEvent topic, you can run the command below:

kafkacat -b localhost:9092 -t MetadataAuditEvent
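
kafkacat can also dump cluster metadata, which is a quick way to confirm the broker is reachable and to see which topics and partitions exist:

kafkacat -b localhost:9092 -L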

However, kafkacat does not currently support Avro deserialization; there is ongoing work to add it.